Last modified: Jun 13, 2026
Install Neo4j Graph Data Science in Python
Neo4j is a graph database. The Graph Data Science (GDS) library adds graph analytics on top of it. This guide shows how to install GDS and call it from Python.
Each step includes code and the output to expect. No prior GDS experience is assumed.
What is Neo4j Graph Data Science?
Neo4j Graph Data Science is a library for graph algorithms. It provides functions for community detection, pathfinding, and centrality. You can run these algorithms directly on your graph data.
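You call GDS from Python through the Neo4j driver. The driver sends Cypher queries that invoke GDS procedures and returns the results. You therefore need two things: the GDS plugin installed on the database server and the Python driver installed on the client.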
Prerequisites
Before you start, ensure you have these tools:
- Python 3.7 or higher installed
- Neo4j Database (version 4.x or 5.x) running
- Basic knowledge of Cypher query language
The Neo4j instance can be local (Neo4j Desktop) or hosted (Neo4j Aura). For local development, Neo4j Desktop is the easiest option.
Step 1: Install the Neo4j Python Driver
The official driver is neo4j. Install it using pip:
pip install neo4j
This installs the driver that connects Python to your database. Verify the installation:
pip show neo4j
The output should include the package name and version (your version number may differ):
Name: neo4j
Version: 5.20.0
Summary: Neo4j Bolt driver for Python
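pip show reports on the environment that pip itself belongs to, which is not always the interpreter that runs your script. As a cross-check, the sketch below asks the running interpreter directly through the standard library's importlib.metadata (available from Python 3.8). The helper name installed_version is made up for this example.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("neo4j")
if version is None:
    print("neo4j is not installed in this interpreter")
else:
    print(f"neo4j {version} is installed")
```

Run it with the same python command you use for your scripts. If it reports that the package is missing while pip show finds it, pip and python point at different environments.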
Step 2: Install the Graph Data Science Plugin
The GDS plugin runs inside Neo4j. You must install it on your database server. For Neo4j Desktop, follow these steps:
- Open Neo4j Desktop.
- Select your database project.
- Click "Manage" then "Plugins".
- Find "Graph Data Science" and click "Install".
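On Neo4j Aura you cannot install plugins yourself. GDS comes preinstalled on AuraDS, the data science offering; a standard AuraDB instance may not include it. Check your instance type in the Aura console. If GDS is missing, you may need a different instance type or plan.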
After installation, restart your database. The GDS library is now available.
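One way to confirm that the plugin loaded is to ask the server for its GDS version with the gds.version() function. The sketch below assumes a driver object like the one created in Step 3. The helper name gds_version is illustrative and not part of the driver API.

```python
def gds_version(driver):
    """Return the GDS version string reported by the server.

    `driver` is assumed to be a neo4j driver instance (see Step 3).
    If the plugin is missing, the server rejects the query and the
    driver raises an error instead of returning a version.
    """
    with driver.session() as session:
        record = session.run("RETURN gds.version() AS version").single()
        return record["version"]


# Usage, assuming a running database with GDS installed:
# print(gds_version(driver))
```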
Step 3: Connect Python to Neo4j
Create a Python script to connect. Use the GraphDatabase class from the driver:
from neo4j import GraphDatabase

# Connection details
uri = "bolt://localhost:7687"
username = "neo4j"
password = "your_password"

# Create driver instance
driver = GraphDatabase.driver(uri, auth=(username, password))

# Test connection
with driver.session() as session:
    result = session.run("RETURN 'Hello, Neo4j!' AS message")
    for record in result:
        print(record["message"])

# Close the driver when finished (skip this line if you continue
# with the next steps in the same script)
driver.close()
Output:
Hello, Neo4j!
Replace your_password with your actual password. If the script prints the message, the connection works.
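Hard-coding the password is fine for a first test, but it is easy to commit by accident. A minimal sketch of one alternative is to read the connection details from environment variables. The variable names NEO4J_URI, NEO4J_USERNAME and NEO4J_PASSWORD and the helper connection_settings are assumptions chosen for this example; the driver does not read them on its own.

```python
import os


def connection_settings(env=None):
    """Read connection details from environment variables.

    NEO4J_URI, NEO4J_USERNAME and NEO4J_PASSWORD are names assumed for
    this example. The URI and username fall back to the local defaults
    used in this guide; the password has no default.
    """
    env = os.environ if env is None else env
    password = env.get("NEO4J_PASSWORD")
    if not password:
        raise ValueError("Set NEO4J_PASSWORD before connecting")
    return {
        "uri": env.get("NEO4J_URI", "bolt://localhost:7687"),
        "auth": (env.get("NEO4J_USERNAME", "neo4j"), password),
    }


# Usage with the driver from this step:
# from neo4j import GraphDatabase
# settings = connection_settings()
# driver = GraphDatabase.driver(settings["uri"], auth=settings["auth"])
# driver.verify_connectivity()  # raises if the server cannot be reached
```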
Step 4: Load Data for GDS
GDS algorithms run on in-memory graph projections. First, project your stored data into a named in-memory graph with the gds.graph.project procedure. The example below creates sample data and then projects it:
# Create the sample nodes and the relationship in a single statement.
# The variables a and b exist only within one query, so separate
# CREATE calls would not connect Alice and Bob.
with driver.session() as session:
    session.run(
        "CREATE (a:Person {name: 'Alice'}), "
        "(b:Person {name: 'Bob'}), "
        "(a)-[:KNOWS]->(b)"
    ).consume()

# Project the graph for GDS
with driver.session() as session:
    query = """
    CALL gds.graph.project(
        'myGraph',
        'Person',
        'KNOWS'
    )
    """
    session.run(query).consume()
    print("Graph 'myGraph' created successfully.")
Output:
Graph 'myGraph' created successfully.
Now your graph is ready for algorithms.
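Calling gds.graph.project a second time with the same name fails because the projection already exists. When you rerun a script, one option is to drop the old projection first. The sketch below does that with gds.graph.drop (its second argument, failIfMissing, is set to false) and returns the counts reported by gds.graph.project. The helper name reproject is hypothetical, and session is assumed to be an open driver session.

```python
def reproject(session, graph_name, node_label, relationship_type):
    """Drop the named projection if it exists, then project it again.

    `session` is assumed to be an open neo4j session. Returns the node
    and relationship counts reported by gds.graph.project.
    """
    # failIfMissing=false turns the drop into a no-op when no
    # projection with this name exists.
    session.run(
        "CALL gds.graph.drop($name, false) YIELD graphName RETURN graphName",
        {"name": graph_name},
    ).consume()

    record = session.run(
        "CALL gds.graph.project($name, $label, $rel) "
        "YIELD graphName, nodeCount, relationshipCount "
        "RETURN graphName, nodeCount, relationshipCount",
        {"name": graph_name, "label": node_label, "rel": relationship_type},
    ).single()
    return record["nodeCount"], record["relationshipCount"]


# Usage, assuming the driver from Step 3:
# with driver.session() as session:
#     nodes, relationships = reproject(session, "myGraph", "Person", "KNOWS")
#     print(nodes, relationships)
```

The counts are a quick sanity check. A relationship count of 0 means that no KNOWS relationship connects two Person nodes in the stored data.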
Step 5: Run a GDS Algorithm
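Let's run PageRank. It scores each node by the relationships pointing to it, so it highlights influential nodes. The stream mode returns the scores without writing anything back to the database: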
# Run PageRank
with driver.session() as session:
    query = """
    CALL gds.pageRank.stream('myGraph')
    YIELD nodeId, score
    RETURN gds.util.asNode(nodeId).name AS name, score
    ORDER BY score DESC
    """
    result = session.run(query)
    for record in result:
        print(f"{record['name']}: {record['score']:.4f}")
Output:
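Bob: 0.2775
Alice: 0.1500
Alice has no incoming relationships, so she keeps the baseline score of 0.15 (1 minus the default damping factor of 0.85). Bob receives Alice's KNOWS relationship and scores higher. If both nodes show 0.1500, the projection contains no relationship between them; check that the two nodes and the relationship were created in the same CREATE statement.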
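To see where these numbers come from, here is a simplified pure-Python sketch of the unnormalized PageRank formula, score = (1 - d) + d * sum(score(source) / out_degree(source)). It is not the GDS implementation. It only uses the same defaults (damping factor 0.85, 20 iterations) to reproduce the arithmetic on the two-node sample graph.

```python
def pagerank(nodes, edges, damping=0.85, iterations=20):
    """Simplified unnormalized PageRank over a list of (source, target) edges."""
    out_degree = {node: 0 for node in nodes}
    for source, _ in edges:
        out_degree[source] += 1

    # Every node starts at the baseline score of 1 - damping.
    scores = {node: 1.0 - damping for node in nodes}
    for _ in range(iterations):
        incoming = {node: 0.0 for node in nodes}
        for source, target in edges:
            incoming[target] += scores[source] / out_degree[source]
        scores = {
            node: (1.0 - damping) + damping * incoming[node] for node in nodes
        }
    return scores


connected = pagerank(["Alice", "Bob"], [("Alice", "Bob")])
isolated = pagerank(["Alice", "Bob"], [])

print({name: round(score, 4) for name, score in connected.items()})
# {'Alice': 0.15, 'Bob': 0.2775}
print({name: round(score, 4) for name, score in isolated.items()})
# {'Alice': 0.15, 'Bob': 0.15}
```

With the relationship present, Bob ends at 0.15 + 0.85 * 0.15 = 0.2775. Without any relationship, both nodes stay at the 0.15 baseline, so equal scores on this sample point to a missing relationship rather than to the size of the graph.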
Step 6: Handle Common Errors
You may run into errors during installation or when connecting. These are the most common ones:
Error 1: "ModuleNotFoundError: No module named 'neo4j'"
Solution: Install the driver into the interpreter that actually runs your script, for example with python -m pip install neo4j, and make sure the correct virtual environment is active.
Error 2: "Unable to connect to Neo4j"
Solution: Check if your database is running. Verify the URI and port. For Neo4j Desktop, the default port is 7687.
Error 3: "There is no procedure with the name gds.graph.project"
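Solution: The GDS plugin is not installed or not loaded. Install it from the plugin manager and restart the database. If the plugin is installed but is an older 1.x release, the procedure has a different name there (gds.graph.create); upgrade to GDS 2.x to use gds.graph.project.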
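The three cases above can also be told apart in code. The sketch below maps an exception to a hint using only the exception's class name and its code attribute, so it runs without importing the driver. ServiceUnavailable and AuthError are exception classes in neo4j.exceptions, and Neo.ClientError.Procedure.ProcedureNotFound is the server status code for an unknown procedure. The helper hint_for and the wording of the hints are made up for this example.

```python
def hint_for(error):
    """Map a driver exception to a short troubleshooting hint."""
    name = type(error).__name__
    code = getattr(error, "code", "") or ""

    if name == "ServiceUnavailable":
        return "Database unreachable: check that it is running and verify the URI and port."
    if name == "AuthError" or code.startswith("Neo.ClientError.Security"):
        return "Authentication failed: check the username and password."
    if code == "Neo.ClientError.Procedure.ProcedureNotFound":
        return "Procedure not found: install the GDS plugin and restart the database."
    return f"Unrecognized error: {error}"


# Usage, assuming the driver from Step 3:
# try:
#     with driver.session() as session:
#         session.run("CALL gds.graph.project('myGraph', 'Person', 'KNOWS')").consume()
# except Exception as exc:
#     print(hint_for(exc))
```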
Step 7: Use GDS with Pandas
You can combine GDS results with Pandas for analysis. Install Pandas first:
pip install pandas
Then convert results to a DataFrame:
import pandas as pd

with driver.session() as session:
    query = """
    CALL gds.pageRank.stream('myGraph')
    YIELD nodeId, score
    RETURN gds.util.asNode(nodeId).name AS name, score
    """
    result = session.run(query)
    df = pd.DataFrame([record.data() for record in result])

print(df)
Output:
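    name   score
0  Alice  0.1500
1    Bob  0.2775
With Alice connected to Bob by a KNOWS relationship, Bob has the higher score. From here you can sort, filter, plot, or export the scores with the usual Pandas tools.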
Best Practices
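Always close the driver when you are done, and open sessions in with statements so they are released even if a query fails. This prevents leaked connections.
For large graphs, estimate the memory a projection needs before you create it, for example with gds.graph.project.estimate, and drop projections you no longer need with gds.graph.drop. A projection that does not fit into the available heap can bring the database down.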
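A minimal sketch of such a memory check is shown below. It calls gds.graph.project.estimate, which reports the expected memory footprint without creating the projection, and compares the upper bound to a budget you choose. The helper names estimate_projection and fits_budget and the 1 GiB budget are assumptions for illustration; session is assumed to be an open driver session.

```python
def estimate_projection(session, node_label, relationship_type):
    """Return the memory estimate for a native projection as a dict."""
    record = session.run(
        "CALL gds.graph.project.estimate($label, $rel) "
        "YIELD requiredMemory, bytesMax, nodeCount, relationshipCount "
        "RETURN requiredMemory, bytesMax, nodeCount, relationshipCount",
        {"label": node_label, "rel": relationship_type},
    ).single()
    return {
        "requiredMemory": record["requiredMemory"],  # human-readable range
        "bytesMax": record["bytesMax"],              # upper bound in bytes
        "nodeCount": record["nodeCount"],
        "relationshipCount": record["relationshipCount"],
    }


def fits_budget(estimate, budget_bytes):
    """True if the estimated upper bound stays within the given budget."""
    return estimate["bytesMax"] <= budget_bytes


# Usage, assuming the driver from Step 3 and a 1 GiB budget:
# with driver.session() as session:
#     estimate = estimate_projection(session, "Person", "KNOWS")
#     if fits_budget(estimate, 1024 ** 3):
#         session.run("CALL gds.graph.project('myGraph', 'Person', 'KNOWS')").consume()
#     else:
#         print("Projection too large:", estimate["requiredMemory"])
```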
Test your installation with a small dataset first. Then scale up gradually.
Conclusion
Installing Neo4j Graph Data Science in Python is straightforward. You need the Python driver and the GDS plugin. Connect, load data, and run algorithms like PageRank.
If something fails, check the connection and the plugin installation first. Pandas is a convenient way to inspect and export the results.
From here, explore the other GDS algorithm families: community detection, pathfinding, and centrality.