Automated PDF Retrieval from Financial Accounting System — Saving 92 Hours per Year

Quick update on today's work.

I built a tool to automate the retrieval of payment voucher PDFs from a financial accounting system (GLOVIAUX).

The Problem

As part of budget execution tracking, payment voucher PDFs need to be periodically retrieved and archived.

This was an entirely manual process. The workflow for each voucher looked like this:

Log in to the financial accounting system
Enter a budget detail code and search
Click a voucher number to open the detail screen
Click "Print Voucher" → "Preview"
Save the displayed PDF

Each item took about 2 minutes. With roughly 3,000 items per year, that adds up to about 100 hours annually spent on this repetitive task.

The Solution

I automated the entire workflow using Python and Playwright as an RPA (Robotic Process Automation) solution.
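The manual steps listed earlier map fairly directly onto Playwright calls. What follows is a minimal sketch, not the tool's actual code. It assumes Edge was started with `--remote-debugging-port=9222`, that the user has already logged in within the first open tab, and that clicking "Preview" triggers a file download. Every selector, the `CDP_URL` value, and the helper names `pdf_path_for`, `fetch_voucher`, and `main` are hypothetical placeholders.

```python
from pathlib import Path

# Assumption: Edge was launched with --remote-debugging-port=9222.
CDP_URL = "http://localhost:9222"


def pdf_path_for(voucher_no: str, out_dir: Path) -> Path:
    """Build the archive path for one voucher (the naming scheme is hypothetical)."""
    cleaned = voucher_no.strip()
    safe = "".join(ch if ch.isalnum() or ch in "-_" else "_" for ch in cleaned)
    return out_dir / f"{safe}.pdf"


def fetch_voucher(page, budget_code: str, voucher_no: str, out_dir: Path) -> Path:
    """Replay the manual steps for one voucher. All selectors are placeholders."""
    page.fill("#budget-code", budget_code)   # hypothetical input field
    page.click("text=Search")                # hypothetical button label
    page.click(f"text={voucher_no}")         # open the voucher detail screen
    page.click("text=Print Voucher")
    # Assumption: "Preview" starts a download. A real system may instead open
    # a viewer tab, which would need different handling.
    with page.expect_download() as download_info:
        page.click("text=Preview")
    target = pdf_path_for(voucher_no, out_dir)
    download_info.value.save_as(target)
    return target


def main(budget_code: str, voucher_numbers: list[str], out_dir: Path) -> None:
    # Imported here so the helpers above can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    out_dir.mkdir(parents=True, exist_ok=True)
    with sync_playwright() as p:
        browser = p.chromium.connect_over_cdp(CDP_URL)
        # Reuse the tab in which the user is already logged in.
        page = browser.contexts[0].pages[0]
        for voucher_no in voucher_numbers:
            fetch_voucher(page, budget_code, voucher_no, out_dir)
```

Keeping `fetch_voucher` free of any browser setup means the per-voucher logic can be exercised with a stub page object before it is pointed at the real system.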

Key Design Decisions

CDP connection to existing browser: Connects to an already-running Edge browser via CDP (Chrome DevTools Protocol) and reuses its logged-in session, so the tool never has to handle the login itself
Non-engineer friendly: Voucher number lists are managed in Excel, making it easy for non-technical staff to operate
Automatic PDF extraction: Uses PyMuPDF to extract only the pages matching the target budget code from the downloaded PDFs
Simple deployment: Distributed as an Edge launcher batch file plus an exe, so it takes just two clicks to run
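The page-extraction step can be illustrated with a short PyMuPDF sketch. It assumes that "matching" means the budget code appears somewhere in a page's extracted text, which may differ from how the real tool decides. The function names `matching_pages` and `extract_pages` are hypothetical.

```python
from pathlib import Path
from typing import Iterable


def matching_pages(page_texts: Iterable[str], budget_code: str) -> list[int]:
    """Return 0-based indices of pages whose text contains the budget code."""
    return [i for i, text in enumerate(page_texts) if budget_code in text]


def extract_pages(src: Path, dst: Path, budget_code: str) -> int:
    """Save a copy of src holding only the matching pages; return how many were kept."""
    import fitz  # PyMuPDF; imported here so matching_pages works without it

    with fitz.open(str(src)) as doc:
        keep = matching_pages((page.get_text() for page in doc), budget_code)
        if keep:
            doc.select(keep)    # keep only the listed pages, in this order
            doc.save(str(dst))  # nothing is written when no page matches
    return len(keep)
```

A plain substring test is the simplest possible rule: a code such as `A-10` would also match a page containing `A-100`, so a stricter comparison may be needed if codes share prefixes.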

Results

	Before	After
Time per item	~2 min	~10 sec
Annual processing time (3,000 items)	~100 hours	~8 hours
Time saved		92 hours/year
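The figures in the table follow from simple arithmetic, which can be checked in a few lines. The constants come from the numbers quoted in this post; the function name `annual_hours` is just for illustration.

```python
ITEMS_PER_YEAR = 3_000
MANUAL_SECONDS = 2 * 60   # about 2 minutes per item by hand
AUTO_SECONDS = 10         # about 10 seconds per item when automated


def annual_hours(items: int, seconds_per_item: float) -> float:
    """Convert a per-item duration into total hours per year."""
    return items * seconds_per_item / 3600


before = annual_hours(ITEMS_PER_YEAR, MANUAL_SECONDS)
after = annual_hours(ITEMS_PER_YEAR, AUTO_SECONDS)
saved = before - after

print(f"{before:.1f} h -> {after:.1f} h, saved {saved:.1f} h/year")
# prints: 100.0 h -> 8.3 h, saved 91.7 h/year
```

The unrounded saving is about 91.7 hours, which rounds to the 92 hours shown in the table.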

Tech Stack

Python, Playwright, PyMuPDF, openpyxl, PyInstaller

RPA and workflow automation are areas I'll continue exploring alongside SheetToolBox. If you have repetitive tasks you'd like to automate, feel free to reach out.