Last modified: Jan 02, 2025 By Alexander Williams
Python Bokeh ColumnDataSource Guide - Share Plot Data
The ColumnDataSource in Bokeh is a fundamental data structure that efficiently manages and shares data between multiple plots. It provides a convenient way to handle tabular data and create interactive visualizations.
Understanding ColumnDataSource
A ColumnDataSource serves as a container for named columns of data. Each column can contain arrays or lists of values, and all columns must have the same length to maintain data consistency.
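As a minimal sketch of that structure, the snippet below builds a source from two made-up columns (`label` and `value` are illustrative names only) and inspects it. The `data` attribute behaves like a dictionary keyed by column name, and `column_names` lists those keys.

```python
from bokeh.models import ColumnDataSource

# Hypothetical columns, used only to illustrate the container
source = ColumnDataSource(data={
    'label': ['a', 'b', 'c'],
    'value': [10, 20, 30],
})

# Each column is stored under its name in the `data` mapping
column_lengths = {name: len(values) for name, values in source.data.items()}

print(sorted(source.column_names))
print(column_lengths)
```

Both columns report the same length, which is the consistency requirement described above.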
Basic Usage
Let's start with a simple example of creating a ColumnDataSource:
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource
# Create data
data = {
    'x': [1, 2, 3, 4, 5],
    'y': [6, 7, 8, 9, 10],
    'colors': ['red', 'blue', 'green', 'yellow', 'orange']
}
# Create ColumnDataSource
source = ColumnDataSource(data)
# Create plot
p = figure(title='Basic ColumnDataSource Example')
# scatter() draws circle markers by default; circle(size=...) is deprecated in Bokeh 3.4 and later
p.scatter('x', 'y', size=10, color='colors', source=source)
show(p)
Sharing Data Between Multiple Plots
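One of the most powerful features of ColumnDataSource is its ability to share data between multiple plots. This is particularly useful for linked visualizations: plots that read from the same source also share selections, and they can be arranged with layout helpers such as row or gridplot.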
from bokeh.layouts import row
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure, show
# Create data source
source = ColumnDataSource({
    'x': [1, 2, 3, 4, 5],
    'y1': [2, 5, 8, 2, 7],
    'y2': [3, 6, 9, 4, 8]
})
# Create first plot (scatter() replaces circle(), which needs a radius in Bokeh 3.4 and later)
p1 = figure(title='Plot 1')
p1.scatter('x', 'y1', source=source, color='blue')
# Create second plot from the same source
p2 = figure(title='Plot 2')
p2.scatter('x', 'y2', source=source, color='red')
# Show plots side by side
show(row(p1, p2))
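The same idea can be sketched with gridplot and selection tools. This is an illustrative variant rather than the only way to link plots: because both renderers read from one source, rows selected in one plot are highlighted in the other. The tool list and variable names here are choices made for the example.

```python
from bokeh.layouts import gridplot
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

source = ColumnDataSource({
    'x': [1, 2, 3, 4, 5],
    'y1': [2, 5, 8, 2, 7],
    'y2': [3, 6, 9, 4, 8],
})

# Selection tools make the shared selection visible in both plots
tools = 'box_select,lasso_select,reset'

left = figure(title='Plot 1', tools=tools)
left_renderer = left.scatter('x', 'y1', size=10, source=source, color='blue')

right = figure(title='Plot 2', tools=tools)
right_renderer = right.scatter('x', 'y2', size=10, source=source, color='red')

# The selection lives on the source, so it can also be set from Python
source.selected.indices = [0, 2]

grid = gridplot([[left, right]])
# Call show(grid) (imported from bokeh.plotting) to render the layout
```

Selecting points with the box or lasso tool in either plot updates `source.selected.indices`, which is why the highlight follows in the other plot.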
Updating Data Dynamically
Dynamic data updates are essential for creating interactive visualizations. ColumnDataSource lets you replace the whole data dictionary by assigning to its data property, or change individual values in place with patch:
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure, show
# Initial data
source = ColumnDataSource({
    'x': [1, 2, 3],
    'y': [4, 5, 6]
})
# Create plot
p = figure(title='Dynamic Data Update')
p.line('x', 'y', source=source)
# Update data
new_data = {
    'x': [1, 2, 3, 4],
    'y': [2, 4, 6, 8]
}
source.data = new_data  # Complete replacement: all columns are swapped at once
# OR
source.patch({'y': [(2, 10)]})  # Partial update: each patch is an (index, new_value) pair
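Besides replacing and patching, a source can also grow by appending rows with `stream`. The sketch below assumes list-backed columns and a `rollover` of 4, a value chosen only for illustration, to cap how many rows are kept.

```python
from bokeh.models import ColumnDataSource

source = ColumnDataSource({
    'x': [1, 2, 3],
    'y': [4, 5, 6],
})

# Append one row; the streamed dict must contain every column of the source
source.stream({'x': [4], 'y': [7]})

# Append two more rows, then keep only the 4 most recent rows
source.stream({'x': [5, 6], 'y': [8, 9]}, rollover=4)

print(list(source.data['x']))
print(list(source.data['y']))
```

After both calls the source holds only the four newest rows, which is useful for plots that show a moving window of recent values.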
Working with Pandas DataFrames
ColumnDataSource works seamlessly with pandas DataFrames, making it easy to visualize data from various sources. Here's an example using a DataFrame:
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure, show
# Create DataFrame
df = pd.DataFrame({
    'date': pd.date_range('2023-01-01', periods=5),
    'values': [10, 15, 13, 17, 20]
})
# Convert DataFrame to ColumnDataSource
source = ColumnDataSource(df)
# Create plot
p = figure(title='DataFrame Visualization', x_axis_type='datetime')
p.line('date', 'values', source=source)
show(p)
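One detail worth checking when converting a DataFrame is which columns end up in the source. The sketch below reuses the DataFrame from the example above and shows that the DataFrame index is stored as a column too (named `index` for a default, unnamed index). `to_df` converts the source back for inspection.

```python
import pandas as pd
from bokeh.models import ColumnDataSource

df = pd.DataFrame({
    'date': pd.date_range('2023-01-01', periods=5),
    'values': [10, 15, 13, 17, 20],
})
source = ColumnDataSource(df)

# The default integer index is kept as an extra column alongside the data columns
print(source.column_names)

# Convert back to a DataFrame to inspect what the source holds
round_trip = source.to_df()
print(round_trip.head())
```

If the index carries meaning, such as dates, it can be referenced in glyph calls by its column name just like any other column.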
Advanced Features and Best Practices
When working with ColumnDataSource, consider these important practices:
- Always ensure all columns have the same length
- Use meaningful column names that reflect your data
- Consider using patch for small updates instead of replacing entire datasets
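The first and last practices can be sketched together. The helper `columns_have_equal_length` below is hypothetical and not part of Bokeh. It checks a plain dictionary before it is handed to ColumnDataSource, and the single-value change afterwards uses patch rather than a full replacement.

```python
from bokeh.models import ColumnDataSource


def columns_have_equal_length(data):
    """Return True when every column in the dict has the same number of items."""
    lengths = {len(values) for values in data.values()}
    return len(lengths) <= 1


good = {'x': [1, 2, 3], 'y': [4, 5, 6]}
bad = {'x': [1, 2, 3], 'y': [4, 5]}  # 'y' is one item short

print(columns_have_equal_length(good))
print(columns_have_equal_length(bad))

# Only build the source once the columns are known to be consistent
source = ColumnDataSource(good)

# Change one value in place instead of reassigning source.data
source.patch({'y': [(0, 40)]})
```

Checking lengths up front keeps mismatched columns from reaching the plot, where they are harder to diagnose.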
Integration with Interactive Features
ColumnDataSource integrates well with Bokeh's interactive features. For example, a hover tool can display any column of the source, including columns that are not plotted:
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure, show
# Create source with additional data for hover
source = ColumnDataSource({
    'x': [1, 2, 3, 4, 5],
    'y': [2, 5, 8, 2, 7],
    'desc': ['A', 'B', 'C', 'D', 'E']
})
# Create plot with hover tool
p = figure(title='Interactive Plot')
# scatter() replaces circle(), which needs a radius in Bokeh 3.4 and later
p.scatter('x', 'y', size=10, source=source)
# Add hover tool; '@name' looks up the value of that column for the hovered point
hover = HoverTool(tooltips=[
    ('x', '@x'),
    ('y', '@y'),
    ('Description', '@desc')
])
p.add_tools(hover)
show(p)
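A shorter variant, sketched here as an alternative rather than a replacement, passes the tooltips straight to `figure`, which then adds a hover tool to the plot's toolbar. The data matches the example above.

```python
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

source = ColumnDataSource({
    'x': [1, 2, 3, 4, 5],
    'y': [2, 5, 8, 2, 7],
    'desc': ['A', 'B', 'C', 'D', 'E'],
})

# Passing tooltips to figure() configures a hover tool without creating one by hand
p = figure(
    title='Interactive Plot',
    tooltips=[('x', '@x'), ('y', '@y'), ('Description', '@desc')],
)
p.scatter('x', 'y', size=10, source=source)

# Collect the hover tools that figure() attached to the toolbar
hover_tools = [tool for tool in p.toolbar.tools if isinstance(tool, HoverTool)]
print(len(hover_tools))
# Call show(p) (imported from bokeh.plotting) to render the plot
```

Creating the HoverTool explicitly, as in the longer example, remains the better fit when the tool needs extra configuration beyond its tooltips.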
Conclusion
ColumnDataSource is a powerful tool in Bokeh that simplifies data management and enables the creation of sophisticated interactive visualizations. Its ability to share data between plots and handle dynamic updates makes it essential for modern data visualization tasks.
Understanding how to effectively use ColumnDataSource is crucial for creating complex, interactive Bokeh visualizations. Whether you're working with simple arrays or complex DataFrames, it provides a flexible and efficient way to manage your data.