Last modified: Jun 04, 2025 By Alexander Williams

Install imbalanced-learn in Python

Handling imbalanced datasets is a common challenge in machine learning. The imbalanced-learn library helps solve this problem.

This guide will show you how to install and use imbalanced-learn in Python.

What is imbalanced-learn?

Imbalanced-learn (imported as imblearn) is a Python library for handling imbalanced datasets, that is, datasets in which some classes have far fewer samples than others. It provides resampling techniques to balance the class distribution.

The library integrates well with scikit-learn. It offers methods like oversampling, undersampling, and hybrid approaches.
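
To make the terms concrete, here is a minimal sketch of random oversampling and random undersampling written with the standard library only. It illustrates the idea and is not how imbalanced-learn implements its samplers; the function names random_oversample and random_undersample are made up for this example.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of smaller classes until every class matches the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, count in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == cls]
        extra = rng.choices(pool, k=target - count)  # sampling with replacement
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels


def random_undersample(samples, labels, seed=0):
    """Keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_samples, out_labels = [], []
    for cls in counts:
        pool = [s for s, l in zip(samples, labels) if l == cls]
        kept = rng.sample(pool, k=target)  # sampling without replacement
        out_samples.extend(kept)
        out_labels.extend([cls] * target)
    return out_samples, out_labels


# Toy data: 8 samples of class 0 and 2 samples of class 1
samples = [[float(i)] for i in range(10)]
labels = [0] * 8 + [1] * 2

_, over_labels = random_oversample(samples, labels)
_, under_labels = random_undersample(samples, labels)
print(Counter(over_labels))   # both classes now have 8 samples
print(Counter(under_labels))  # both classes now have 2 samples
```

Oversampling keeps all the original data but repeats minority samples, while undersampling discards majority samples. Hybrid approaches combine the two ideas.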

Prerequisites

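Before installing imbalanced-learn, make sure Python is installed. Use a currently supported Python 3 release: Python 3.6 has reached end of life, and recent imbalanced-learn releases require a newer interpreter, so check the project's installation page for the exact minimum.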

You also need pip, Python's package installer. Check your Python version from inside the interpreter, and your Python and pip versions from the command line:


import sys
print(sys.version)


python --version
pip --version
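
If several Python installations live on the same machine, the python and pip commands may point at different interpreters. The sketch below runs both checks from one script, so the pip it reports belongs to the interpreter that runs it. The MIN_PYTHON value is an assumption chosen for illustration; adjust it to the requirement of the release you plan to install.

```python
import subprocess
import sys

# Assumption: an illustrative minimum, not the library's official requirement
MIN_PYTHON = (3, 8)

python_ok = sys.version_info >= MIN_PYTHON
status = "OK" if python_ok else "too old"
print(f"Python {sys.version_info.major}.{sys.version_info.minor} ({status})")

# Ask the running interpreter for its own pip, which avoids PATH mix-ups
result = subprocess.run(
    [sys.executable, "-m", "pip", "--version"],
    capture_output=True,
    text=True,
)
pip_ok = result.returncode == 0
print(result.stdout.strip() if pip_ok else "pip is not available for this interpreter")
```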

Install imbalanced-learn

The easiest way to install imbalanced-learn is using pip. Run this command:


pip install -U imbalanced-learn

The -U (--upgrade) flag upgrades imbalanced-learn to the latest available release if an older version is already installed. To install a specific version instead, pin it:


pip install imbalanced-learn==0.9.0

Verify Installation

After installation, verify it works. Try importing the library in Python:


from imblearn import __version__
print(__version__)

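This should print the installed version without errors. A ModuleNotFoundError usually means the package was installed into a different Python environment than the one you are running.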
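
You can also check the installation without importing the library, which is handy in setup scripts. This is a small sketch using the standard library's importlib.metadata; the helper name installed_version is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# "imbalanced-learn" is the name used with pip; "imblearn" is the import name
found = installed_version("imbalanced-learn")
if found is None:
    print("imbalanced-learn is not installed in this environment")
else:
    print(f"imbalanced-learn {found} is installed")
```

Note that the lookup uses the distribution name (imbalanced-learn), not the import name (imblearn).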

Install with Conda

If you use Anaconda, install imbalanced-learn with conda:


conda install -c conda-forge imbalanced-learn

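Conda resolves and installs the required dependencies, such as scikit-learn and NumPy, within the active environment. It is a good option if you already manage your data science environment with Anaconda.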

Common Installation Issues

Some users face installation problems. Here are common issues and fixes:

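1. Dependency conflicts: Each imbalanced-learn release requires minimum versions of scikit-learn and NumPy. Upgrade them together with pip install -U scikit-learn imbalanced-learn, or pin an imbalanced-learn version that matches your scikit-learn version.

2. Permission errors: Use the --user flag (pip install --user imbalanced-learn) or, preferably, install into a virtual environment.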

3. Outdated pip: Update pip first with pip install --upgrade pip.
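
When diagnosing these issues, it helps to see at a glance which environment you are in and which versions are installed. The following sketch prints that information using only the standard library. The package list is an assumption for this example: it contains the packages named above plus scipy, which imbalanced-learn also depends on.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# A venv-style virtual environment has a prefix different from the base interpreter.
# Conda environments are not detected by this check.
in_venv = sys.prefix != sys.base_prefix
print(f"Running inside a virtual environment: {in_venv}")

# Collect the installed version of each package, or None if it is missing
report = {}
for name in ("pip", "numpy", "scipy", "scikit-learn", "imbalanced-learn"):
    try:
        report[name] = version(name)
    except PackageNotFoundError:
        report[name] = None

for name, ver in report.items():
    print(f"{name:<18}{ver or 'not installed'}")
```

Compare the printed versions with the requirements listed for your imbalanced-learn release before upgrading or pinning anything.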

Basic Usage Example

Here's a simple example using SMOTE (Synthetic Minority Over-sampling Technique):


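import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

# Create an imbalanced dataset: about 10% class 0 and 90% class 1
X, y = make_classification(n_classes=2, weights=[0.1, 0.9], n_samples=1000, random_state=42)

# Apply SMOTE to synthesize new minority-class samples
smote = SMOTE(random_state=42)
X_res, y_res = smote.fit_resample(X, y)

# np.bincount returns the number of samples per class label
print(f"Original class counts: {np.bincount(y)}")
print(f"Resampled class counts: {np.bincount(y_res)}")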

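This code balances the dataset: SMOTE creates synthetic minority-class samples by interpolating between existing minority samples and their nearest neighbors until both classes have the same number of samples.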

Alternative Installation Methods

Advanced users can also install imbalanced-learn from source:


git clone https://github.com/scikit-learn-contrib/imbalanced-learn.git
cd imbalanced-learn
pip install .

This method is useful if you need the latest development version.

Integrating with Other Libraries

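Because its samplers follow scikit-learn conventions and operate on array-like data, imbalanced-learn can be used alongside other Python libraries. For example, you can combine it with UMAP for dimensionality reduction.

It can also be used alongside HDBSCAN in clustering workflows. For gradient boosting, consider CatBoost, which can compensate for class imbalance through class weighting, either instead of or in addition to resampling.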

Conclusion

Installing imbalanced-learn is straightforward with pip or conda. The library provides essential tools for handling imbalanced datasets.

Remember to check dependencies and use virtual environments to avoid conflicts. Now you're ready to tackle class imbalance in your machine learning projects.