Combine hundreds of Excel files into one with Python. Fast, automated solution for data merging and Excel file processing — no manual copy-paste need.
Let’s be honest — manually copying data from dozens (or hundreds) of Excel files into a single sheet is soul-crushing. If you’ve ever sat there thinking “There has to be a better way” — good news: there is! In this post, I’ll walk you through how to automatically combine multiple Excel files into one using Python. Whether you’re managing HR records, monthly reports, or survey data, this method will save you hours — maybe even days — of work.
🔍 Why Automate Excel File Processing with Python?
Here are a few real-world use cases where this script is a game-changer:
- You receive monthly Excel reports from multiple departments and want to compile them into a master sheet.
- You’ve collected form submissions or survey responses saved as separate .xlsx files.
- Your company stores employee records or sales data as individual files.
If any of that sounds familiar, read on.
🚀 How to Merge Excel Spreadsheets Automatically with Python
🛠️ What You’ll Need
To follow along, you’ll need:
- Basic knowledge of Python (or willingness to copy & paste 😄)
- The pandas and openpyxl libraries
- Don’t have Python installed yet? Here’s a step-by-step guide to install Python on Windows before you begin. 👈
pip install pandas openpyxl
📂 Step 1: Organize Your Excel Files
Put all your .xlsx files into one folder. For example:
employee_excels/
├── employees_001.xlsx
├── employees_002.xlsx
├── ...
This folder will be your source for merging.
🧠 Step 2: Use This Python Script to Combine All Files
import pandas as pd
import os
input_dir = 'employee_excels' # Folder with your Excel files
output_file = 'all_employees_combined.xlsx'
# Get all Excel file paths
all_files = [os.path.join(input_dir, f) for f in os.listdir(input_dir) if f.endswith('.xlsx')]
df_list = []
for file in all_files:
df = pd.read_excel(file)
df['Source File'] = os.path.basename(file) # Optional: track original file
df_list.append(df)
# Combine all DataFrames
combined_df = pd.concat(df_list, ignore_index=True)
# Save to a single Excel file
combined_df.to_excel(output_file, index=False)
print(f"✅ Successfully combined {len(all_files)} files into '{output_file}'")
📦 Step 3: Check the Output
You’ll now have a brand new file:
all_employees_combined.xlsx
🔄 Bonus: Customizing the Script
Want to level up this script? Here are a few tweaks you can make:
- Save each original file as a separate sheet in one workbook using ExcelWriter.
- Filter or clean data before combining it (e.g., skip empty rows or filter by date).
- Format cells or add styles using openpyxl or xlsxwriter.
👩💻 Final Thoughts
This Python trick is one of those small automations that pays you back every single time you use it. Instead of wasting time copying and pasting between spreadsheets, let your code handle the heavy lifting.
It’s fast. It’s reliable. And once you’ve got it set up, you’ll never look at Excel the same way again.
COMMENTS