Tags: RPA, Python, Playwright, automation, productivity

Automated PDF Retrieval from Financial Accounting System — Saving 92 Hours per Year


Quick update on today's work.

I built a tool to automate the retrieval of payment voucher PDFs from a financial accounting system (GLOVIAUX).

The Problem

As part of budget execution tracking, payment voucher PDFs need to be periodically retrieved and archived.

This was an entirely manual process. The workflow for each voucher looked like this:

  1. Log in to the financial accounting system
  2. Enter a budget detail code and search
  3. Click a voucher number to open the detail screen
  4. Click "Print Voucher" → "Preview"
  5. Save the displayed PDF

Each voucher took about 2 minutes. With roughly 3,000 vouchers per year, that added up to about 100 hours annually spent on this repetitive task.

The Solution

I automated the entire workflow using Python and Playwright as an RPA (Robotic Process Automation) solution.
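As a rough illustration, the manual steps could map onto Playwright calls like the sketch below. It is not the actual tool: the CDP URL, the output folder, and every label passed to `get_by_label` / `get_by_role` are placeholders, and it assumes that Edge was started with `--remote-debugging-port=9222` and that clicking "Preview" triggers a file download.

```python
from pathlib import Path

CDP_URL = "http://localhost:9222"  # assumption: Edge runs with --remote-debugging-port=9222
OUT_DIR = Path("vouchers")         # hypothetical archive folder


def pdf_path_for(voucher_no: str, out_dir: Path = OUT_DIR) -> Path:
    """Build the file path under which a voucher PDF is archived."""
    return out_dir / f"{voucher_no.strip()}.pdf"


def fetch_voucher_pdf(page, budget_code: str, voucher_no: str) -> Path:
    """Steps 2-5 of the manual workflow. All labels below are placeholders."""
    page.get_by_label("Budget detail code").fill(budget_code)
    page.get_by_role("button", name="Search").click()
    page.get_by_role("link", name=voucher_no).click()
    page.get_by_role("button", name="Print Voucher").click()
    # Assumption: "Preview" starts a download rather than opening a viewer tab.
    with page.expect_download() as download_info:
        page.get_by_role("button", name="Preview").click()
    target = pdf_path_for(voucher_no)
    target.parent.mkdir(parents=True, exist_ok=True)
    download_info.value.save_as(target)
    return target


def main(budget_code: str, voucher_numbers: list[str]) -> None:
    # Imported here so the helpers above can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # Attach to the already-running, already-logged-in browser.
        browser = p.chromium.connect_over_cdp(CDP_URL)
        page = browser.contexts[0].pages[0]
        for voucher_no in voucher_numbers:
            saved = fetch_voucher_pdf(page, budget_code, voucher_no)
            print(f"saved {saved}")
```

The sketch leaves out returning to the search screen between vouchers, as well as retries and error handling, all of which a real run would need.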

Key Design Decisions

  • CDP connection to existing browser: Connects to an already-running Edge browser via CDP (Chrome DevTools Protocol) and reuses its logged-in session, so the script skips the login step entirely
  • Non-engineer friendly: Voucher number lists are managed in Excel, making it easy for non-technical staff to operate
  • Automatic PDF extraction: Uses PyMuPDF to extract only the pages matching the target budget code from the downloaded PDFs
  • Simple deployment: Distributed as an Edge launcher batch file + exe — just two clicks to run
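To make the Excel and PDF-extraction points concrete, here is a minimal sketch of how they might look with openpyxl and PyMuPDF. The sheet layout (voucher numbers in column A under a header row) and the plain substring match on the budget code are assumptions for illustration, and all function names are hypothetical.

```python
from typing import Iterable


def clean_voucher_numbers(values: Iterable[object]) -> list[str]:
    """Drop blank cells and duplicates, keeping the original order."""
    seen: dict[str, None] = {}
    for value in values:
        # Excel often stores whole numbers as floats (1002.0); normalise them.
        if isinstance(value, float) and value.is_integer():
            value = int(value)
        text = "" if value is None else str(value).strip()
        if text:
            seen.setdefault(text, None)
    return list(seen)


def read_voucher_numbers(xlsx_path: str) -> list[str]:
    """Read column A of the active sheet, skipping a header row (assumed layout)."""
    from openpyxl import load_workbook  # third-party: openpyxl

    wb = load_workbook(xlsx_path, read_only=True, data_only=True)
    try:
        rows = wb.active.iter_rows(min_row=2, max_col=1, values_only=True)
        return clean_voucher_numbers(row[0] if row else None for row in rows)
    finally:
        wb.close()


def matching_page_indexes(page_texts: Iterable[str], budget_code: str) -> list[int]:
    """Return 0-based indexes of pages whose text contains the budget code."""
    return [i for i, text in enumerate(page_texts) if budget_code in text]


def extract_budget_pages(src_pdf: str, dst_pdf: str, budget_code: str) -> int:
    """Write only the matching pages to dst_pdf; return how many were kept."""
    import fitz  # third-party: PyMuPDF

    with fitz.open(src_pdf) as doc:
        keep = matching_page_indexes((page.get_text() for page in doc), budget_code)
        if keep:
            doc.select(keep)   # keep only the matching pages, in order
            doc.save(dst_pdf)
    return len(keep)
```

A substring match also hits longer codes that contain the target (for example, "110" inside "1100"), so a production version would probably match on a delimited field instead.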

Results

                                       Before        After
Time per item                          ~2 min        ~10 sec
Annual processing time (3,000 items)   ~100 hours    ~8 hours

Time saved: 92 hours/year
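The figures above follow from simple arithmetic. A quick check using the rounded per-item times quoted in this post (the constant and function names are just for illustration):

```python
ITEMS_PER_YEAR = 3_000
MANUAL_SECONDS = 2 * 60    # ~2 min per item
AUTOMATED_SECONDS = 10     # ~10 sec per item


def annual_hours(seconds_per_item: float, items: int = ITEMS_PER_YEAR) -> float:
    """Convert a per-item time in seconds into hours per year."""
    return seconds_per_item * items / 3600


before = annual_hours(MANUAL_SECONDS)     # 100.0 hours
after = annual_hours(AUTOMATED_SECONDS)   # about 8.3 hours
saved = before - after                    # about 91.7 hours, i.e. roughly 92

print(f"before={before:.1f} h, after={after:.1f} h, saved={saved:.1f} h")
```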

Tech Stack

Python, Playwright, PyMuPDF, openpyxl, PyInstaller


RPA and workflow automation are areas I'll continue exploring alongside SheetToolBox. If you have repetitive tasks you'd like to automate, feel free to reach out.
