Last modified: Nov 11, 2025 By Alexander Williams
Replace DOCX Placeholder Text with Python
Document automation saves time and reduces errors. Many businesses use template documents. These templates contain placeholder text. Python can automate replacing these placeholders.
This guide shows how to use python-docx. You will learn to find and replace text in Word documents. This method works for contracts, reports, and forms.
Installing python-docx Library
First, install the python-docx library. Use pip, the Python package manager. Run this command in your terminal.
pip install python-docx
The library provides tools to work with DOCX files. It can read, create, and modify documents. You do not need Microsoft Word installed.
Understanding DOCX Document Structure
DOCX files have a complex structure. They contain paragraphs, runs, and tables. Paragraphs are blocks of text. Runs are styled segments within paragraphs.
Placeholder text can be in any part of the document. You must search through all paragraphs. Also check tables and other elements.
Basic Text Replacement Function
Create a function to replace placeholder text. This function searches all paragraphs. It replaces old text with new text.
from docx import Document
def replace_placeholder(doc_path, old_text, new_text, output_path):
# Load the document
doc = Document(doc_path)
# Loop through all paragraphs
for paragraph in doc.paragraphs:
if old_text in paragraph.text:
# Replace the text
paragraph.text = paragraph.text.replace(old_text, new_text)
# Save the modified document
doc.save(output_path)
This function takes four parameters. The document path, old text, new text, and output path. It loads the document and processes each paragraph.
Using the Replacement Function
Call the function with your specific values. Replace the placeholders with actual data. Here is a practical example.
# Example usage
replace_placeholder(
'template.docx',
'{{client_name}}',
'ABC Corporation',
'completed_document.docx'
)
This replaces {{client_name}} with ABC Corporation. The result saves as a new file. The original template remains unchanged.
Handling Text in Tables
Documents often contain tables. Placeholders in tables need special handling. You must loop through all table cells.
def replace_in_tables(doc, old_text, new_text):
# Loop through all tables
for table in doc.tables:
for row in table.rows:
for cell in row.cells:
for paragraph in cell.paragraphs:
if old_text in paragraph.text:
paragraph.text = paragraph.text.replace(old_text, new_text)
This function processes tables within a document. It checks every cell in every row. This ensures no placeholders are missed.
Complete Replacement Solution
Combine paragraph and table replacement. This creates a comprehensive solution. The function handles all document sections.
def comprehensive_replace(doc_path, replacements, output_path):
"""
Replace multiple placeholders in a DOCX document
replacements: dictionary with old_text: new_text pairs
"""
doc = Document(doc_path)
# Replace in paragraphs
for paragraph in doc.paragraphs:
for old_text, new_text in replacements.items():
if old_text in paragraph.text:
paragraph.text = paragraph.text.replace(old_text, new_text)
# Replace in tables
for table in doc.tables:
for row in table.rows:
for cell in row.cells:
for paragraph in cell.paragraphs:
for old_text, new_text in replacements.items():
if old_text in paragraph.text:
paragraph.text = paragraph.text.replace(old_text, new_text)
doc.save(output_path)
This function accepts a dictionary of replacements. It can process multiple placeholders at once. This is efficient for complex documents.
Practical Example: Invoice Generation
Create invoices from templates. Use placeholders for variable data. This automates the invoice creation process.
# Define replacement values
invoice_data = {
'{{invoice_number}}': 'INV-2023-045',
'{{date}}': '2023-10-15',
'{{client}}': 'Global Solutions Inc.',
'{{amount}}': '$2,450.00',
'{{due_date}}': '2023-11-14'
}
# Generate the invoice
comprehensive_replace('invoice_template.docx', invoice_data, 'invoice_045.docx')
The template contains all these placeholders. The function replaces them with actual values. This creates a professional invoice quickly.
For more advanced document generation, see our guide on Generate Invoices in DOCX Using Python.
Preserving Text Formatting
Simple text replacement may lose formatting. The paragraph.text approach replaces entire paragraphs. This removes bold, italics, and other styles.
To preserve formatting, work with runs. Runs are styled text segments within paragraphs. This method maintains original formatting.
def replace_preserve_formatting(doc_path, old_text, new_text, output_path):
doc = Document(doc_path)
for paragraph in doc.paragraphs:
if old_text in paragraph.text:
# Clear the paragraph
paragraph.clear()
# Add new text with similar formatting
paragraph.add_run(new_text)
doc.save(output_path)
This approach is simpler but may not handle complex formatting. For precise formatting control, additional logic is needed.
Advanced Formatting Preservation
For complex documents, use advanced replacement. This maintains exact formatting of replaced text. It requires more sophisticated code.
def advanced_replace(doc_path, old_text, new_text, output_path):
doc = Document(doc_path)
for paragraph in doc.paragraphs:
if old_text in paragraph.text:
# Store the first run's formatting
if paragraph.runs:
first_run = paragraph.runs[0]
font = first_run.font
# Clear and rebuild paragraph
paragraph.clear()
new_run = paragraph.add_run(new_text)
# Apply original formatting
new_run.font.name = font.name
new_run.font.size = font.size
new_run.font.bold = font.bold
new_run.font.italic = font.italic
doc.save(output_path)
This preserves basic formatting attributes. It copies font name, size, and style from the original text.
Working with Document Properties
DOCX documents contain metadata properties. These include title, author, and keywords. Python-docx can modify these properties.
For comprehensive metadata handling, check our Python docx Document Properties Metadata Guide.
Error Handling and Validation
Add error handling to your replacement functions. This makes the code more robust. It handles missing files and other issues.
import os
def safe_replace(doc_path, replacements, output_path):
# Check if input file exists
if not os.path.exists(doc_path):
raise FileNotFoundError(f"Template file {doc_path} not found")
# Check if output directory exists
output_dir = os.path.dirname(output_path)
if output_dir and not os.path.exists(output_dir):
os.makedirs(output_dir)
try:
comprehensive_replace(doc_path, replacements, output_path)
print(f"Successfully created {output_path}")
except Exception as e:
print(f"Error processing document: {e}")
This function validates inputs before processing. It creates output directories if needed. It also provides helpful error messages.
Batch Processing Multiple Documents
Process multiple templates at once. This is useful for bulk operations. Loop through a directory of template files.
import glob
def batch_process(template_dir, output_dir, replacements):
# Create output directory if it doesn't exist
if not os.path.exists(output_dir):
os.makedirs(output_dir)
# Process all DOCX files in template directory
for template_path in glob.glob(os.path.join(template_dir, "*.docx")):
filename = os.path.basename(template_path)
output_path = os.path.join(output_dir, f"processed_{filename}")
safe_replace(template_path, replacements, output_path)
This function processes all DOCX files in a directory. It applies the same replacements to each file. The results save with new filenames.
Integration with Other Systems
Combine DOCX processing with other Python libraries. Read data from CSV files or databases. Use this data to populate templates.
import csv
def generate_from_csv(template_path, csv_path, output_dir):
with open(csv_path, 'r') as csvfile:
reader = csv.DictReader(csvfile)
for row in reader:
# Use row data for replacements
output_path = os.path.join(output_dir, f"document_{row['id']}.docx")
safe_replace(template_path, row, output_path)
This reads data from a CSV file. It generates one document per row. Each document gets unique data from the CSV.
Legal Document Applications
Legal documents often use templates. They contain client-specific information. Automated replacement ensures accuracy and consistency.
For legal document specifics, see Python docx Legal Document Formatting Tips.
Conclusion
Python-docx provides powerful DOCX manipulation capabilities. Text replacement automates document generation. This saves time and reduces errors.
Start with simple paragraph replacement. Progress to handling tables and preserving formatting. Add error handling for production use.
Document automation streamlines business processes. It ensures consistency across generated documents. Python makes this accessible to developers of all levels.
Remember to always keep original templates unchanged. Generate new files for each automation run. Test your replacement logic thoroughly before deployment.