Last modified: Jun 13, 2026

Install Neo4j Graph Data Science in Python

Neo4j is a powerful graph database. The Graph Data Science (GDS) library adds advanced analytics. This guide shows you how to install and use it with Python.

The guide proceeds step by step, with code examples and their expected output. It assumes no prior experience with GDS.

What is Neo4j Graph Data Science?

Neo4j Graph Data Science is a library for graph algorithms. It provides functions for community detection, pathfinding, and centrality. You can run these algorithms directly on your graph data.

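From Python, you reach GDS through the Neo4j driver. The driver sends Cypher queries, including calls to GDS procedures, and returns the results. You therefore need two pieces: the GDS plugin on the database server and the driver in your Python environment.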

Prerequisites

Before you start, ensure you have these tools:

  • Python 3.7 or higher installed
  • Neo4j Database (version 4.x or 5.x) running
  • Basic knowledge of Cypher query language

You also need a Neo4j instance where GDS can run. You can use Neo4j Desktop or a cloud instance such as Neo4j AuraDS. For local development, Neo4j Desktop is easiest.

Step 1: Install the Neo4j Python Driver

The official driver is neo4j. Install it using pip:


pip install neo4j

This installs the driver that connects Python to your database. Verify the installation:


pip show neo4j

The output should include the package name and version (your version number may differ):


Name: neo4j
Version: 5.20.0
Summary: Neo4j Bolt driver for Python
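
The same check can be done from inside Python, which also confirms that the package is installed in the interpreter you are actually running. This is a minimal sketch, assuming Python 3.8 or later (for importlib.metadata); the helper name installed_version is hypothetical and not part of the driver:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the driver version, or None if neo4j is not installed
# in the environment that runs this script.
print(installed_version("neo4j"))
```

If this prints None while pip show neo4j reports a version, pip and python most likely point at different environments.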

Step 2: Install the Graph Data Science Plugin

The GDS plugin runs inside Neo4j. You must install it on your database server. For Neo4j Desktop, follow these steps:

  1. Open Neo4j Desktop.
  2. Select your database project.
  3. Click "Manage" then "Plugins".
  4. Find "Graph Data Science" and click "Install".

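For cloud instances, GDS availability depends on the product: Neo4j AuraDS includes GDS, while a standard AuraDB instance does not let you install plugins. Check your instance type in the Aura console. If GDS is missing, you may need a different plan.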

After installation, restart your database. The GDS library is now available.
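
To confirm that the plugin is actually loaded, you can ask the server for the GDS version with the gds.version() function. The sketch below assumes an open driver session like the one created in Step 3; the helper name gds_version is hypothetical:

```python
def gds_version(session):
    """Return the GDS version string reported by the server.

    `session` is assumed to be an open neo4j driver session.
    The query fails with an "unknown function" error if the
    GDS plugin is not installed.
    """
    record = session.run("RETURN gds.version() AS version").single()
    return record["version"]


# Usage, once a driver exists (see Step 3):
# with driver.session() as session:
#     print("GDS version:", gds_version(session))
```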

Step 3: Connect Python to Neo4j

Create a Python script to connect. Use the GraphDatabase class from the driver:


from neo4j import GraphDatabase

# Connection details
uri = "bolt://localhost:7687"
username = "neo4j"
password = "your_password"

# Create driver instance
driver = GraphDatabase.driver(uri, auth=(username, password))

# Test connection
with driver.session() as session:
    result = session.run("RETURN 'Hello, Neo4j!' AS message")
    for record in result:
        print(record["message"])

driver.close()

Output:


Hello, Neo4j!

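Replace your_password with your actual password. If the script prints the message, the connection works. If you continue with the next steps in the same script, keep the driver open and call driver.close() only at the very end.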
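
Hard-coding the password is fine for a first test, but reading it from the environment keeps it out of your source files. The following is a sketch under stated assumptions: the variable names NEO4J_URI, NEO4J_USERNAME, and NEO4J_PASSWORD and the helper connection_settings are illustrative choices, not something the driver requires:

```python
import os


def connection_settings(env=None):
    """Build (uri, auth) from environment variables.

    The variable names are assumptions made for this example.
    Falls back to the local defaults used in this guide.
    """
    env = os.environ if env is None else env
    uri = env.get("NEO4J_URI", "bolt://localhost:7687")
    user = env.get("NEO4J_USERNAME", "neo4j")
    password = env.get("NEO4J_PASSWORD")
    if not password:
        raise RuntimeError("Set NEO4J_PASSWORD before connecting")
    return uri, (user, password)


# Usage (needs the neo4j package and a running database):
# from neo4j import GraphDatabase
# uri, auth = connection_settings()
# with GraphDatabase.driver(uri, auth=auth) as driver:
#     driver.verify_connectivity()  # raises if the server is unreachable
```

Calling verify_connectivity() up front surfaces connection and authentication problems before the first query runs.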

Step 4: Load Data for GDS

GDS runs its algorithms on in-memory projections of your data, not on the stored graph directly. First, project the nodes and relationships you need into a named graph with the gds.graph.project procedure. Here is an example with sample data:


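# Create sample nodes and their relationship in a single statement.
# Cypher variables (a, b) only live within one query, so separate
# session.run() calls would link two new, unnamed nodes instead.
with driver.session() as session:
    session.run("""
        CREATE (a:Person {name: 'Alice'})
        CREATE (b:Person {name: 'Bob'})
        CREATE (a)-[:KNOWS]->(b)
    """)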

# Project the graph for GDS
with driver.session() as session:
    query = """
    CALL gds.graph.project(
        'myGraph',
        'Person',
        'KNOWS'
    )
    """
    session.run(query)
    print("Graph 'myGraph' created successfully.")

Output:


Graph 'myGraph' created successfully.

Now your graph is ready for algorithms.
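
Running the projection a second time fails because a graph named 'myGraph' already exists in the catalog. One way to make the step repeatable is to drop any existing projection first. This sketch assumes GDS 2.x procedure names and an open driver session; the helper name reproject is hypothetical:

```python
def reproject(session, graph_name, node_label, rel_type):
    """Drop an existing projection (if any) and project it again.

    Returns (nodeCount, relationshipCount) as reported by GDS.
    `session` is assumed to be an open neo4j driver session.
    """
    # The second argument (failIfMissing = false) makes the drop
    # a no-op when no graph with that name exists.
    session.run(
        "CALL gds.graph.drop($name, false)",
        {"name": graph_name},
    ).consume()

    record = session.run(
        """
        CALL gds.graph.project($name, $label, $rel)
        YIELD graphName, nodeCount, relationshipCount
        RETURN graphName, nodeCount, relationshipCount
        """,
        {"name": graph_name, "label": node_label, "rel": rel_type},
    ).single()
    return record["nodeCount"], record["relationshipCount"]


# Usage with the sample data from this step:
# with driver.session() as session:
#     nodes, rels = reproject(session, "myGraph", "Person", "KNOWS")
#     print(f"Projected {nodes} nodes and {rels} relationships")
```

Printing the counts is also a quick sanity check that the projection contains what you expect.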

Step 5: Run a GDS Algorithm

Let's run a simple PageRank algorithm. This finds important nodes in the graph:


# Run PageRank
with driver.session() as session:
    query = """
    CALL gds.pageRank.stream('myGraph')
    YIELD nodeId, score
    RETURN gds.util.asNode(nodeId).name AS name, score
    ORDER BY score DESC
    """
    result = session.run(query)
    for record in result:
        print(f"{record['name']}: {record['score']:.4f}")

Output:


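Bob: 0.2775
Alice: 0.1500

Bob scores higher because he receives rank from Alice through the KNOWS relationship. Alice has no incoming relationships, so she keeps the base score of 0.15 (1 minus the default damping factor of 0.85). Exact values can vary slightly with GDS version and configuration.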
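
As a cross-check on the scores, the PageRank update can be worked through by hand for the directed pair Alice -> Bob. The code below is a simplified, non-normalized PageRank written for illustration only. It is not the GDS implementation, and the function name and defaults are assumptions:

```python
def pagerank(nodes, edges, damping=0.85, iterations=20):
    """Simplified PageRank: score(v) = (1 - d) + d * sum(score(u) / out_degree(u))."""
    out_degree = {node: 0 for node in nodes}
    for source, _ in edges:
        out_degree[source] += 1

    # Every node starts at the base score (1 - damping).
    scores = {node: 1 - damping for node in nodes}
    for _ in range(iterations):
        updated = {node: 1 - damping for node in nodes}
        for source, target in edges:
            # Each node passes a damped share of its score along its outgoing edges.
            updated[target] += damping * scores[source] / out_degree[source]
        scores = updated
    return scores


scores = pagerank(["Alice", "Bob"], [("Alice", "Bob")])
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.4f}")
# Bob: 0.2775
# Alice: 0.1500
```

In this model, Alice has no incoming relationships and stays at 0.15, while Bob gets 0.15 + 0.85 * 0.15 = 0.2775 from the single KNOWS relationship.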

Step 6: Handle Common Errors

You may encounter errors during installation. Here are common ones:

Error 1: "ModuleNotFoundError: No module named 'neo4j'"
Solution: Run pip install neo4j again. Ensure you are in the correct Python environment.

Error 2: "Unable to connect to Neo4j"
Solution: Check if your database is running. Verify the URI and port. For Neo4j Desktop, the default port is 7687.

Error 3: "There is no procedure with the name gds.graph.project"
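Solution: The GDS plugin is missing or too old. Reinstall it from the plugin manager and restart the database after installation. Note that gds.graph.project exists only in GDS 2.x; on GDS 1.x the equivalent procedure was named gds.graph.create.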
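
The three cases above can be turned into a small lookup that prints a hint next to the raw error. This is only a sketch: the helper diagnose and its hint texts are hypothetical, and it matches on message text for simplicity. In a real script you would more likely catch the driver's exception classes (for example neo4j.exceptions.ServiceUnavailable or neo4j.exceptions.ClientError) directly.

```python
# (substring to look for, hint to show), in the order of the errors above
HINTS = [
    ("no module named 'neo4j'",
     "Run pip install neo4j in the active Python environment."),
    ("unable to connect",
     "Check that the database is running and that the URI and port are correct."),
    ("no procedure with the name",
     "Install or update the GDS plugin, then restart the database."),
]


def diagnose(message):
    """Return a troubleshooting hint for a known error message."""
    lowered = message.lower()
    for needle, hint in HINTS:
        if needle in lowered:
            return hint
    return "No known hint; read the full traceback."


print(diagnose("ModuleNotFoundError: No module named 'neo4j'"))
```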

Step 7: Use GDS with Pandas

You can combine GDS results with Pandas for analysis. Install Pandas first:


pip install pandas

Then convert results to a DataFrame:


import pandas as pd

with driver.session() as session:
    query = """
    CALL gds.pageRank.stream('myGraph')
    YIELD nodeId, score
    RETURN gds.util.asNode(nodeId).name AS name, score
    """
    result = session.run(query)
    df = pd.DataFrame([record.data() for record in result])
    print(df)

Output:


    name   score
0  Alice  0.1500
1    Bob  0.2775

This makes it easy to visualize or export data.
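
With a DataFrame, df.to_csv() handles export. If you only need a CSV file and want to avoid the extra dependency, the standard library is enough. The sketch below assumes rows shaped like the output of record.data(); the sample values are illustrative, not real query output, and the helper name records_to_csv is hypothetical:

```python
import csv
import io


def records_to_csv(rows, fieldnames=("name", "score")):
    """Serialize a list of dict rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(fieldnames), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative rows, shaped like [record.data() for record in result]
rows = [
    {"name": "Alice", "score": 0.15},
    {"name": "Bob", "score": 0.2775},
]
print(records_to_csv(rows))
```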

Best Practices

Always close the driver after use. Use with statements for sessions. This prevents resource leaks.

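For large graphs, estimate the memory a projection needs before you create it, for example with the gds.graph.project.estimate procedure, and drop projections you no longer need with gds.graph.drop. A projection that exceeds the available heap can crash your database.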
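
One way to check memory needs up front is the gds.graph.project.estimate procedure, which reports the expected footprint without building the projection. This sketch assumes GDS 2.x and an open driver session; the helper name estimate_projection is hypothetical:

```python
def estimate_projection(session, node_label, rel_type):
    """Ask GDS how much memory a native projection would need.

    Returns a dict with requiredMemory (a human-readable range),
    nodeCount, and relationshipCount. Nothing is projected.
    """
    record = session.run(
        """
        CALL gds.graph.project.estimate($label, $rel)
        YIELD requiredMemory, nodeCount, relationshipCount
        RETURN requiredMemory, nodeCount, relationshipCount
        """,
        {"label": node_label, "rel": rel_type},
    ).single()
    return {
        "requiredMemory": record["requiredMemory"],
        "nodeCount": record["nodeCount"],
        "relationshipCount": record["relationshipCount"],
    }


# Usage:
# with driver.session() as session:
#     print(estimate_projection(session, "Person", "KNOWS"))
```

Comparing requiredMemory with the heap configured for the database tells you whether the projection is safe to run.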

Test your installation with a small dataset first. Then scale up gradually.

Conclusion

Installing Neo4j Graph Data Science in Python is straightforward. You need the Python driver and the GDS plugin. Connect, load data, and run algorithms like PageRank.

Remember to handle errors by checking your connection and plugin installation. Use Pandas for data manipulation. With these steps, you can start analyzing graph data today.

From here, explore the other algorithm families GDS offers: community detection, shortest paths, and centrality. Your Python setup is ready for them.