Last modified: Apr 12, 2025 By Alexander Williams

Python OCR: Extract Text from Images Easily

OCR (Optical Character Recognition) converts images of text into machine-readable text. In Python, the pytesseract library wraps the Tesseract engine, so a basic extraction takes only a few lines of code.

This guide will show you how to extract text from images using Python. We'll cover installation, basic usage, and practical examples.

Table Of Contents

Why Use Python for OCR?
Installing Required Libraries
Basic Image to Text Conversion
Improving OCR Accuracy
Advanced OCR Techniques
Handling Multiple Languages
Processing Multiple Images
Conclusion

Why Use Python for OCR?

Python is a good fit for OCR tasks. Its syntax is simple, and libraries such as pytesseract and Pillow handle the recognition and image loading for you, so a working script stays short.

Common uses include digitizing documents, automating data entry, and processing receipts. It's also great for text extraction from PDFs when combined with conversion tools.

Installing Required Libraries

First, install pytesseract and Pillow. pytesseract is a Python wrapper around the Tesseract engine, and Pillow opens and manipulates the images.


# Install required packages
pip install pytesseract Pillow

You'll also need the Tesseract OCR engine itself, because pytesseract only calls it and does not bundle it. Install it for your operating system by following the instructions in the official GitHub repository.
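Before running any OCR, it can help to confirm that the engine is actually reachable from Python. The sketch below is one way to check, assuming the executable is named tesseract. The helper names find_tesseract and report_tesseract are made up for this example and are not part of pytesseract.

```python
import shutil


def find_tesseract(binary_name="tesseract"):
    """Return the full path to the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(binary_name)


def report_tesseract():
    """Print the detected Tesseract version, or a hint if the engine is missing."""
    path = find_tesseract()
    if path is None:
        print("Tesseract was not found on PATH. Install it, or point pytesseract at it.")
        return
    # Imported here so find_tesseract() works even before pytesseract is installed.
    import pytesseract

    # Tell pytesseract exactly which executable to call.
    pytesseract.pytesseract.tesseract_cmd = path
    print("Tesseract version:", pytesseract.get_tesseract_version())
```

Call report_tesseract() once after installing. If the engine is installed but not on PATH, which can happen on Windows, set pytesseract.pytesseract.tesseract_cmd to the full path of the executable yourself.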

Basic Image to Text Conversion

Here's a simple script to extract text from an image. We'll use Pillow to open the image and pytesseract for OCR.


from PIL import Image
import pytesseract

# Open the image file
image = Image.open('sample.jpg')

# Perform OCR
text = pytesseract.image_to_string(image)

print(text)


Example output (the actual text depends on your image):

This is sample text extracted from an image.
Line two of the sample text.

The image_to_string function does all the hard work. It returns the extracted text as a string.
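The returned string usually needs a little tidying before you store or compare it, since OCR output often carries blank lines and stray whitespace. Here is a minimal sketch of that cleanup. The helper name clean_ocr_text is made up for this example and is not part of pytesseract.

```python
def clean_ocr_text(text):
    """Strip surrounding whitespace from each line and drop lines that end up empty."""
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)
```

You would call it on the result of image_to_string, for example print(clean_ocr_text(text)).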

Improving OCR Accuracy

OCR accuracy depends on image quality. Here are ways to improve results:

1. Use high-resolution images
2. Ensure proper lighting
3. Pre-process images before OCR: segment or crop the text region, resize small text, or enhance contrast
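The pre-processing steps mentioned above can be sketched with Pillow alone. This is one possible pipeline, not a recommended recipe: the function name preprocess_for_ocr, the default scale of 2, and the optional threshold are all assumptions you should tune for your own images.

```python
from PIL import Image, ImageOps


def preprocess_for_ocr(image, scale=2, threshold=None):
    """Return a grayscale, enlarged, contrast-stretched copy of a Pillow image.

    scale: integer factor used to enlarge small text.
    threshold: optional 0-255 cutoff; if given, the result is pure black and white.
    """
    gray = ImageOps.grayscale(image)
    width, height = gray.size
    enlarged = gray.resize((width * scale, height * scale), Image.LANCZOS)
    result = ImageOps.autocontrast(enlarged)
    if threshold is not None:
        # Pixels brighter than the cutoff become white, the rest black.
        result = result.point(lambda p: 255 if p > threshold else 0)
    return result
```

Pass the returned image to pytesseract.image_to_string in place of the original, and compare the output with and without pre-processing to see whether it helps for your images.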

Advanced OCR Techniques

For more control, specify OCR parameters. You can set language, page segmentation mode, and more.


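# Reuses the image object opened in the basic example
text = pytesseract.image_to_string(
    image,
    lang='eng',                # language data to use
    config='--psm 6 --oem 3'   # page segmentation mode 6, engine mode 3
)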

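PSM (Page Segmentation Mode) tells Tesseract what layout to expect; mode 6 assumes a single uniform block of text. OEM (OCR Engine Mode) selects the recognition engine; mode 3 uses the default for whatever is available in your installation.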
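Because the config value is just a string, a small helper can keep the mode numbers readable and catch typos early. The sketch below lists a few page segmentation modes and builds the string. The names PSM_MODES and tesseract_config are made up for this example. The full list of modes is available from tesseract --help-psm.

```python
# A few commonly used page segmentation modes (not the complete list).
PSM_MODES = {
    3: "fully automatic page segmentation, no orientation detection (default)",
    6: "a single uniform block of text",
    7: "a single text line",
    8: "a single word",
    11: "sparse text, in no particular order",
}


def tesseract_config(psm=6, oem=3):
    """Build a config string such as '--psm 6 --oem 3' after range-checking both modes."""
    if not 0 <= psm <= 13:
        raise ValueError(f"psm must be between 0 and 13, got {psm}")
    if not 0 <= oem <= 3:
        raise ValueError(f"oem must be between 0 and 3, got {oem}")
    return f"--psm {psm} --oem {oem}"
```

For a cropped image containing one line of text, you might try pytesseract.image_to_string(image, config=tesseract_config(psm=7)).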

Handling Multiple Languages

Tesseract supports many languages, but each one needs its own language data file (for example, spa.traineddata for Spanish). Download the files for the languages you need.


# Extract text in Spanish
text = pytesseract.image_to_string(image, lang='spa')

For multilingual documents, pass several language codes joined with a plus sign, such as lang='eng+spa'.
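A missing language file is a common cause of errors, so it can be worth checking what is installed before building the lang value. The sketch below is one way to do it. The helper name lang_string is made up for this example, and the list of installed codes is passed in so the helper itself does not depend on Tesseract.

```python
def lang_string(wanted, installed):
    """Join language codes with '+', after checking that each one is installed."""
    missing = [code for code in wanted if code not in installed]
    if missing:
        raise ValueError(f"Missing Tesseract language data: {', '.join(missing)}")
    return "+".join(wanted)


# Typical use with pytesseract (requires Tesseract to be installed):
#   installed = pytesseract.get_languages(config='')
#   text = pytesseract.image_to_string(image, lang=lang_string(['eng', 'spa'], installed))
```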

Processing Multiple Images

You can batch process multiple images. This is useful for digitizing documents or receipts.


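import os

from PIL import Image
import pytesseract

# Process every JPEG and PNG in the images/ folder, in name order
for filename in sorted(os.listdir('images/')):
    # Compare extensions case-insensitively so 'SCAN.JPG' is not skipped
    if filename.lower().endswith(('.jpg', '.jpeg', '.png')):
        image = Image.open(os.path.join('images', filename))
        text = pytesseract.image_to_string(image)
        print(f'{filename}:\n{text}\n')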
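Printing is fine for a quick look, but for a real digitizing job you will probably want the text saved. The sketch below is one way to do that with pathlib. The names find_images and ocr_folder are made up for this example, and writing a .txt file beside each image is an assumption about the output you want.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_images(folder):
    """Return the image files directly inside folder, sorted by name."""
    return sorted(
        path for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


def ocr_folder(folder):
    """Write a .txt file next to each image containing its recognized text."""
    # Imported here so find_images() can be used without the OCR stack installed.
    from PIL import Image
    import pytesseract

    for image_path in find_images(folder):
        with Image.open(image_path) as image:
            text = pytesseract.image_to_string(image)
        image_path.with_suffix(".txt").write_text(text, encoding="utf-8")
```

Calling ocr_folder('images') would produce images/receipt1.txt for images/receipt1.jpg, and so on. Two images that share a base name (a.jpg and a.png) would write to the same text file, so rename them first if that applies to your folder.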

Conclusion

Python OCR is powerful for extracting text from images. With pytesseract and proper image pre-processing, you can achieve great results.

Remember to check image quality and experiment with settings. For more advanced tasks, combine OCR with other techniques like Python image recognition.

Start with simple, clean images and move on to more complex documents as you tune your pre-processing and Tesseract settings.