Extract CSV Files from PDF Reports

A Python utility to extract embedded CSV files from PDF reports in bulk.

This project scans a folder of PDF reports, unpacks file attachments from each PDF using pdftk, and saves all extracted .csv files into an output directory with cleaned, prefixed filenames.

Why this project is useful

Many enterprise and reporting workflows generate PDF files that contain embedded CSV attachments. Manually opening each report and exporting attachments is slow and repetitive.

This script automates that process by:

scanning a directory for PDF files
extracting embedded attachments from each PDF
filtering for CSV files only
renaming the extracted CSVs using the source PDF filename
saving everything into one output folder

Features

Bulk processing of PDF files
Extracts embedded file attachments from PDFs
Saves only CSV attachments
Automatically renames output files to keep them organized
Works well for report-processing and automation workflows

How it works

The script:

Walks through a directory of PDF files
Uses pdftk to unpack embedded files from each PDF
Checks the extracted files for .csv attachments
Renames each CSV using the original PDF filename as a prefix
Moves the final files into the output directory
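
The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of extractcsv.py: the function name `extract_csvs`, its parameters, and the underscore-joined naming are assumptions made for the example. It relies on pdftk's `unpack_files` operation and expects `pdftk` to be on the PATH.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def extract_csvs(input_dir: str, output_dir: str, pdftk: str = "pdftk") -> list[Path]:
    """Unpack attachments from every PDF under input_dir and keep only the CSVs.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    if shutil.which(pdftk) is None:
        raise RuntimeError(f"{pdftk!r} was not found; install pdftk first")

    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []

    # Walk the input directory (including subfolders) for PDF files.
    for pdf in sorted(Path(input_dir).rglob("*.pdf")):
        # Unpack into a scratch directory so non-CSV attachments are discarded.
        with tempfile.TemporaryDirectory() as scratch:
            result = subprocess.run(
                [pdftk, str(pdf), "unpack_files", "output", scratch],
                capture_output=True,
                text=True,
            )
            if result.returncode != 0:
                print(f"skipping {pdf.name}: {result.stderr.strip()}")
                continue

            for attachment in sorted(Path(scratch).iterdir()):
                # Keep only .csv attachments (case-insensitive).
                if attachment.suffix.lower() != ".csv":
                    continue
                # Prefix with the source PDF's name (without extension).
                target = out / f"{pdf.stem}_{attachment.name}"
                shutil.move(str(attachment), str(target))
                saved.append(target)

    return saved
```

Unpacking into a temporary directory keeps the output folder free of non-CSV attachments, and a PDF that pdftk cannot read is skipped rather than stopping the whole batch.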

Example

If the input folder contains:

report_january.pdf
report_february.pdf

and those PDFs contain embedded CSV files such as:

data.csv
summary.csv

the output may look like this, assuming data.csv is embedded in report_january.pdf and summary.csv in report_february.pdf:

report_january_data.csv
report_february_summary.csv
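
The naming in this example could be produced by a small helper like the one below. It is a sketch under assumptions: the function name `prefixed_csv_name` is hypothetical, and the cleaning rule (replacing anything other than ASCII letters, digits, dots, dashes, and underscores with an underscore) is a guess at what "cleaned" means here, not the script's confirmed behavior.

```python
import re
from pathlib import Path


def prefixed_csv_name(pdf_path: str, attachment_name: str) -> str:
    """Build '<pdf stem>_<attachment name>' with awkward characters replaced."""
    stem = Path(pdf_path).stem
    raw = f"{stem}_{Path(attachment_name).name}"
    # Assumed cleaning rule: collapse runs of other characters into one underscore.
    return re.sub(r"[^A-Za-z0-9._-]+", "_", raw)


print(prefixed_csv_name("reports/report_january.pdf", "data.csv"))
# report_january_data.csv
print(prefixed_csv_name("report_february.pdf", "Q1 summary.csv"))
# report_february_Q1_summary.csv
```

Prefixing with the PDF name keeps attachments that share a name, such as a data.csv embedded in several reports, from overwriting each other in the output folder.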

Project structure

.
├── extractcsv.py
├── requirements.txt
└── README.md
