Last modified: Apr 16, 2026

Separate Audio in Python: Vocals, Instruments

Audio source separation splits a single mixed recording into its individual components, so you can isolate vocals, drums, bass, and other instruments.

This guide will show you how to do it with Python. We will use popular libraries and provide clear code examples. You will learn to create your own audio separation tool.

What is Audio Source Separation?

Audio source separation takes a mixed audio signal and splits it into distinct sources. Think of a song: the final mix contains vocals, guitar, drums, and more, all summed into one signal.

Modern separation algorithms use machine learning. The models are trained on large collections of songs for which the isolated stems are known, so they learn to identify and isolate different sound types. This technology powers many music editing applications.

For a broader look at handling sound files, see our Python Audio Processing Guide for Beginners.

Key Python Libraries for Audio Separation

Several libraries make audio separation accessible. They provide pre-trained models and simple APIs. You do not need deep machine learning expertise to start.

1. Spleeter by Deezer

Spleeter is a very popular library. It is developed by Deezer's research team. It uses TensorFlow and offers high-quality separation.

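It comes with pre-trained models. The 2-stem model separates vocals and accompaniment. The 4-stem model separates vocals, drums, bass, and other instruments. A 5-stem model adds piano as a separate stem.

Installation is straightforward using pip. Spleeter relies on FFmpeg to read formats such as MP3, so make sure FFmpeg is installed as well.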


pip install spleeter

2. Demucs

Demucs is another excellent library. It is known for its high separation quality. It can separate audio into four stems: vocals, drums, bass, and other.

It uses PyTorch. It often produces cleaner results than Spleeter for complex tracks. Its author has stated that the project now receives only important bug fixes, so check the repository for its current status.


pip install demucs

3. Librosa for Basic Processing

Librosa is a fundamental library for audio analysis. While it does not perform deep learning separation itself, it is crucial for loading, visualizing, and preprocessing audio before using other tools. To understand its full role, explore our guide on Python Audio Libraries: Play, Record, Process.
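As a minimal sketch of that loading step, the snippet below wraps librosa.load in a small helper. The helper names load_audio and samples_to_seconds are hypothetical, and the 44100 Hz target rate is an assumption chosen for illustration; pick the rate your separation model expects.

```python
def samples_to_seconds(num_samples: int, sample_rate: int) -> float:
    """Convert a sample count to a duration in seconds."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


def load_audio(path: str, sample_rate: int = 44100):
    """Load a file as a mono float waveform resampled to sample_rate.

    librosa is imported lazily so the helper above works without it.
    """
    import librosa

    # sr=... resamples on load; mono=True averages the channels.
    waveform, sr = librosa.load(path, sr=sample_rate, mono=True)
    return waveform, sr


# Example usage (requires librosa and a real file):
#   y, sr = load_audio("song.mp3")
#   print(f"{samples_to_seconds(len(y), sr):.1f} s at {sr} Hz")
```

Loading as mono is convenient for analysis and plotting. Spleeter and Demucs read the file themselves, so this step is for inspection rather than a required input stage.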

Separating Audio with Spleeter: A Step-by-Step Example

Let's separate a song using Spleeter. We will use the 2-stem model to get vocals and accompaniment.

First, ensure you have an audio file. We will use a file named "song.mp3".


# Import the necessary function from Spleeter
from spleeter.separator import Separator

# Initialize the separator with the '2stems' model.
# This model separates vocals and accompaniment.
separator = Separator('spleeter:2stems')

# Define the path to your input audio file.
audio_file = "song.mp3"

# Define the output directory where separated tracks will be saved.
output_dir = "output/"

# Perform the separation.
# The separate_to_file function handles the entire process.
separator.separate_to_file(audio_file, output_dir)

print("Separation complete! Check the 'output' folder.")

After running this script, check your "output" folder. You will find a new folder named after your song. Inside, you will find two files:

vocals.wav: The isolated vocal track.
accompaniment.wav: The music without the vocals.

The separate_to_file method is the core of the process. It loads the model (downloading the pre-trained weights on first use), processes the audio, and saves the results.
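To check a run programmatically, you can compute where the files should land. The sketch below assumes Spleeter's default layout described above (a folder named after the song, one WAV per stem). The names SPLEETER_STEMS, expected_outputs, and missing_outputs are hypothetical helpers, not part of Spleeter.

```python
from pathlib import Path

# Stem names produced by two of the pre-trained Spleeter models.
SPLEETER_STEMS = {
    "spleeter:2stems": ["vocals", "accompaniment"],
    "spleeter:4stems": ["vocals", "drums", "bass", "other"],
}


def expected_outputs(audio_file, output_dir, model="spleeter:2stems"):
    """Return the WAV paths we expect separate_to_file to write."""
    # Spleeter names the subfolder after the input file, minus its extension.
    song_folder = Path(output_dir) / Path(audio_file).stem
    return [song_folder / f"{stem}.wav" for stem in SPLEETER_STEMS[model]]


def missing_outputs(audio_file, output_dir, model="spleeter:2stems"):
    """Return the expected stem files that do not exist on disk yet."""
    return [p for p in expected_outputs(audio_file, output_dir, model)
            if not p.exists()]


# Example usage after running the separation script:
#   if missing_outputs("song.mp3", "output/"):
#       print("Some stems were not written.")
```

The same helper covers the 4-stem model if you create the separator with 'spleeter:4stems' instead.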

Separating Audio with Demucs

Demucs can be used from the command line or within a Python script. Here is a Python example using the htdemucs model.


# demucs.separate exposes the same entry point as the command-line tool.
from demucs import separate

# Define paths
input_song = "song.mp3"
out_folder = "./demucs_output/"

# Run the separation. This may take a few minutes depending on the song length.
# main() accepts the command-line arguments as a list of strings.
separate.main(["--mp3", "--two-stems", "vocals", "-n", "htdemucs", input_song, "-o", out_folder])

print("Demucs separation finished.")

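This script calls separate.main with the same arguments the demucs command-line tool accepts. The --two-stems vocals argument tells it to produce only two tracks, the vocals and everything else, and --mp3 saves them as MP3 files instead of WAV. The results are written inside the "demucs_output" folder, in a subfolder named after the model and then the song, as vocals.mp3 and no_vocals.mp3.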
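If you call Demucs with different options often, building the argument list in one place keeps the script readable. The following is a sketch only: build_demucs_args and expected_demucs_outputs are hypothetical helpers, and the output layout (output folder, then model name, then song name) and the four default stem names are assumptions based on Demucs's default settings for four-stem models such as htdemucs.

```python
from pathlib import Path


def build_demucs_args(input_song, out_folder, model="htdemucs",
                      two_stems=None, mp3=False):
    """Build the argument list to pass to demucs.separate.main."""
    args = []
    if mp3:
        args.append("--mp3")  # write MP3 files instead of WAV
    if two_stems:
        args += ["--two-stems", two_stems]  # e.g. "vocals"
    args += ["-n", model, str(input_song), "-o", str(out_folder)]
    return args


def expected_demucs_outputs(input_song, out_folder, model="htdemucs",
                            two_stems=None, mp3=False):
    """Return the paths Demucs is expected to write (assumed default layout)."""
    ext = "mp3" if mp3 else "wav"
    if two_stems:
        stems = [two_stems, f"no_{two_stems}"]
    else:
        stems = ["vocals", "drums", "bass", "other"]
    folder = Path(out_folder) / model / Path(input_song).stem
    return [folder / f"{stem}.{ext}" for stem in stems]


# Example usage (requires demucs to be installed):
#   from demucs import separate
#   separate.main(build_demucs_args("song.mp3", "./demucs_output/",
#                                   two_stems="vocals", mp3=True))
```

Dropping two_stems gives the full four-stem separation, and leaving mp3 as False keeps the output as WAV.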

Practical Applications of Audio Separation

Why would you want to separate audio? The uses are creative and practical.

Music Remixing and Production: Isolate a vocal acapella or a drum loop to create a new track.

Karaoke Creation: Remove vocals from any song to make instant karaoke backing tracks.

Educational Analysis: Study the bassline or guitar part of a song in isolation to learn it.

Audio Restoration: Reduce noise or rebalance elements in an old recording.
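Several of these uses come down to summing the separated stems back together with different volumes. Here is a minimal sketch of that idea in plain Python, assuming each stem is an equal-length list of float samples. The function name remix is hypothetical, and a real project would more likely use NumPy arrays for speed.

```python
def remix(stems, gains):
    """Sum equal-length stems sample by sample, scaling each by its gain.

    stems: dict mapping stem name -> list of float samples
    gains: dict mapping stem name -> gain (stems not listed use 1.0)
    """
    if not stems:
        raise ValueError("at least one stem is required")
    lengths = {len(samples) for samples in stems.values()}
    if len(lengths) != 1:
        raise ValueError("all stems must have the same number of samples")

    mix = [0.0] * lengths.pop()
    for name, samples in stems.items():
        gain = gains.get(name, 1.0)
        for i, sample in enumerate(samples):
            mix[i] += gain * sample
    return mix


# Tiny illustrative signals, not real audio.
stems = {"vocals": [0.5, -0.5], "drums": [0.25, 0.25]}
karaoke = remix(stems, {"vocals": 0.0})   # mute the vocals
quieter = remix(stems, {"vocals": 0.5})   # halve the vocal level
```

A gain of 0.0 mutes a stem, which is the karaoke case, while values between 0 and 1 rebalance it against the rest of the mix.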

Tips for Better Results

Audio separation is impressive but not perfect. Here are tips to improve your outcomes.

Use high-quality source files. A 320 kbps MP3 or a lossless WAV file preserves more of the original signal for the model to work with than a low-bitrate file.

Experiment with different libraries. Try both Spleeter and Demucs on the same song. One may perform better for your specific audio.

Post-process the separated tracks. Use audio editing software to clean up artifacts or adjust levels.
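Before comparing results or post-processing, it can help to confirm what a WAV file actually contains. The sketch below uses Python's standard wave module. The helper name describe_wav is hypothetical, and it assumes an uncompressed PCM WAV file; other encodings may not be readable this way.

```python
import wave


def describe_wav(path):
    """Return basic properties of an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "bit_depth": wav.getsampwidth() * 8,  # bytes per sample -> bits
            "duration_s": frames / rate,
        }


# Example usage on a separated stem:
#   print(describe_wav("output/song/vocals.wav"))
```

Running it on the stems from two libraries shows whether they were written with the same sample rate and channel count, which matters if you plan to mix or compare them.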

Conclusion

Separating audio in Python is an accessible and powerful skill. Libraries like Spleeter and Demucs put advanced machine learning models at your fingertips.

You can isolate vocals, extract instrumentals, and unlock new creative possibilities. Start with the simple code examples provided. Experiment with your own music files.

The field of audio AI is moving fast. New models and techniques are emerging regularly. By mastering these tools today, you prepare for even more amazing audio applications tomorrow.