
Consider a typical finance manager at a Melbourne manufacturing company. Every Friday afternoon, she spends three hours pulling numbers from four different systems, copying them into Excel, making charts, converting to PDF, and emailing to the leadership team.
Fifteen hours a month. On a single report.
Here's the thing—according to research from McKinsey, knowledge workers spend 19% of their time searching for and gathering information, and another 14% communicating and collaborating internally. A significant chunk of that is report generation and distribution.
The good news? This is exactly the kind of repetitive, structured task that Python can automate completely. Automated reporting systems work well for accounting firms, logistics companies, and manufacturers across Australia. The pattern is always the same: what takes humans hours takes Python minutes.
This tutorial walks you through building a complete report automation pipeline—from database to inbox—with real code you can adapt to your needs.
Before we dive into code, let's understand what we're building in plain English.
Think of it like a robot assistant that does exactly what you do when creating a report—but without getting tired, making typos, or forgetting steps:
Real-world analogy: Imagine you hired a junior assistant who's incredibly reliable but only knows how to follow exact instructions. You'd write them a detailed checklist: "Every Monday at 7am, open this spreadsheet, calculate these totals, make this chart, save as PDF, email to these people." That's essentially what we're doing—but the "assistant" is a Python script that never calls in sick.
This tutorial is for you if:
You might want professional help if:
By the end of this tutorial, you'll have a system that:
Let's get into it.
If you've never written code before, start here.
Python is a programming language—a way to give instructions to your computer. Think of it like writing a very detailed recipe that your computer follows exactly.
Why Python for reports?
Before you can run any code, you need to set up your computer. This takes about 15-20 minutes, and you only do it once.
What is Python? The software that reads and runs your code.
Download links:
Important during installation (Windows):
To verify it worked:
python --versionPython 3.12.1What is a code editor? Like Microsoft Word, but for writing code. It highlights syntax and catches errors.
Recommended: Visual Studio Code (Free)
Alternative options:
What is a terminal? A text-based way to give commands to your computer. You'll use it to install libraries and run your scripts.
How to open it:
Windows key + R, type cmd, press EnterCmd + Space, type Terminal, press EnterCreate a folder for your report automation project:
# Windows
mkdir C:\reports-automation
cd C:\reports-automation
# Mac/Linux
mkdir ~/reports-automation
cd ~/reports-automation
What you need to get started:
How code works (simplified):
# Lines starting with # are comments (notes to yourself, computer ignores them)
# This tells Python to open a file called "sales.csv"
data = open_file("sales.csv")
# This tells Python to add up all the values in the "revenue" column
total = add_up(data, "revenue")
# This tells Python to show you the result
print(total) # Output: 125000
That's it. You write instructions, Python follows them. The code in this tutorial is more complex, but the principle is the same.
Before we write any code, here's what tools we're using and why. Think of these as the "ingredients" in our recipe:
| Component | Library | Why This Choice |
|---|---|---|
| Data handling | pandas | Industry standard, handles any data source |
| Visualisation | matplotlib | Reliable, extensive customisation |
| PDF generation | FPDF2 | Lightweight, fast, no external dependencies |
| Templating | Jinja2 | Clean separation of data and presentation |
| Scheduling | schedule | Simple syntax, good for single-machine deployments |
| smtplib | Built into Python, works with any SMTP server |
In plain English:
Step 1: Open your terminal (Command Prompt on Windows, Terminal on Mac)
Step 2: Navigate to your project folder:
# Windows
cd C:\reports-automation
# Mac
cd ~/reports-automation
Step 3: Install all the required libraries with one command:
pip install pandas matplotlib fpdf2 jinja2 schedule python-dotenv
What this command does: It tells Python's package manager (pip) to download and install all these libraries from the internet. It's like installing apps on your phone—you only need to do this once.
You should see: A bunch of text scrolling past, ending with "Successfully installed..." - that means it worked!
If you get an error: Try pip3 instead of pip:
pip3 install pandas matplotlib fpdf2 jinja2 schedule python-dotenv
Want to learn more about each library? Here are the official docs:
| Library | What It Does | Documentation |
|---|---|---|
| pandas | Data manipulation | pandas.pydata.org |
| matplotlib | Chart creation | matplotlib.org |
| FPDF2 | PDF generation | py-pdf.github.io/fpdf2 |
| Jinja2 | Templates | jinja.palletsprojects.com |
| schedule | Task scheduling | schedule.readthedocs.io |
| python-dotenv | Environment variables | pypi.org/project/python-dotenv |
FPDF2 hits the sweet spot between simplicity and capability. ReportLab is more powerful but overkill for most business reports. WeasyPrint is great for HTML-to-PDF but requires system dependencies that can be painful on Windows servers.
Let's start with the foundation—pulling and preparing your data.
Imagine you're creating a sales report. First, you'd:
That's exactly what this code does—automatically. Instead of you clicking through menus and typing formulas, the script handles it in milliseconds.
Here's the pattern:
import pandas as pd
from datetime import datetime, timedelta
import logging
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s - %(levelname)s - %(message)s'
)
class ReportDataPipeline:
"""Handles data extraction and transformation for reports."""
def __init__(self, data_source: str):
self.data_source = data_source
self.raw_data = None
self.processed_data = None
def load_data(self) -> pd.DataFrame:
"""Load data from CSV, Excel, or database."""
try:
if self.data_source.endswith('.csv'):
self.raw_data = pd.read_csv(self.data_source)
elif self.data_source.endswith(('.xlsx', '.xls')):
self.raw_data = pd.read_excel(self.data_source)
else:
# For database connections, you'd use sqlalchemy here
raise ValueError(f"Unsupported data source: {self.data_source}")
logging.info(f"Loaded {len(self.raw_data)} rows from {self.data_source}")
return self.raw_data
except Exception as e:
logging.error(f"Failed to load data: {e}")
raise
def filter_date_range(self, date_column: str, days: int = 30) -> pd.DataFrame:
"""Filter data to a specific date range."""
end_date = datetime.now()
start_date = end_date - timedelta(days=days)
self.raw_data[date_column] = pd.to_datetime(self.raw_data[date_column])
mask = (self.raw_data[date_column] >= start_date) & \
(self.raw_data[date_column] <= end_date)
self.processed_data = self.raw_data[mask].copy()
logging.info(f"Filtered to {len(self.processed_data)} rows in date range")
return self.processed_data
def calculate_summary_metrics(self, value_column: str,
group_column: str = None) -> dict:
"""Calculate key business metrics."""
data = self.processed_data if self.processed_data is not None \
else self.raw_data
metrics = {
'total': data[value_column].sum(),
'average': data[value_column].mean(),
'count': len(data),
'max': data[value_column].max(),
'min': data[value_column].min(),
'period_start': data.index.min() if hasattr(data.index, 'min') else None,
'period_end': data.index.max() if hasattr(data.index, 'max') else None,
}
if group_column:
metrics['by_group'] = data.groupby(group_column)[value_column].sum().to_dict()
return metrics
Don't let the code intimidate you. Here's what each part does:
class ReportDataPipeline - Think of this as creating a "report-making machine". We're defining all the steps this machine can perform.
load_data() - Opens your data file. It's smart enough to know if you're giving it an Excel file (.xlsx) or a CSV file, and handles each appropriately.
filter_date_range() - Keeps only the data from a specific time period. For example, "give me only the last 30 days of sales."
calculate_summary_metrics() - Does the maths:
Example in action:
# This is how you'd actually use the code above
pipeline = ReportDataPipeline("sales_data.csv")
pipeline.load_data() # Opens the file
pipeline.filter_date_range("order_date", days=30) # Last 30 days only
metrics = pipeline.calculate_summary_metrics("revenue", "region") # Calculate!
# Now 'metrics' contains:
# - total revenue for the month
# - average order value
# - number of orders
# - revenue broken down by region
Notice the logging throughout? That's not optional. When your scheduled report fails at 3am on a Sunday, you need to know exactly where and why. I once spent four hours debugging a report failure that turned out to be a single row with a malformed date. Add logging from the start—you'll thank yourself later.
Charts make reports scannable. A wall of numbers is hard to digest, but a well-designed chart tells the story instantly.
You know how in Excel you select data and click "Insert Chart"? This code does that automatically—but with more control over colours, formatting, and style. It creates three types of charts:
Here's a reusable pattern for generating consistent visualisations:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from io import BytesIO
class ReportChartGenerator:
"""Generates charts for PDF reports."""
def __init__(self, style: str = 'seaborn-v0_8-whitegrid'):
plt.style.use(style)
# Corporate colours - adjust to match your branding
self.colors = ['#2563eb', '#10b981', '#f59e0b', '#ef4444', '#8b5cf6']
def create_line_chart(self, data: pd.DataFrame,
x_column: str,
y_column: str,
title: str) -> BytesIO:
"""Create a line chart and return as bytes for PDF embedding."""
fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(data[x_column], data[y_column],
color=self.colors[0], linewidth=2)
ax.fill_between(data[x_column], data[y_column],
alpha=0.3, color=self.colors[0])
ax.set_title(title, fontsize=14, fontweight='bold', pad=20)
ax.set_xlabel(x_column.replace('_', ' ').title())
ax.set_ylabel(y_column.replace('_', ' ').title())
# Format dates nicely if x-axis is datetime
if pd.api.types.is_datetime64_any_dtype(data[x_column]):
ax.xaxis.set_major_formatter(mdates.DateFormatter('%d %b'))
plt.xticks(rotation=45)
ax.grid(True, alpha=0.3)
plt.tight_layout()
# Save to bytes buffer instead of file
buffer = BytesIO()
plt.savefig(buffer, format='png', dpi=150, bbox_inches='tight')
buffer.seek(0)
plt.close()
return buffer
def create_bar_chart(self, data: dict,
title: str,
xlabel: str = '',
ylabel: str = '') -> BytesIO:
"""Create a bar chart from a dictionary."""
fig, ax = plt.subplots(figsize=(10, 5))
categories = list(data.keys())
values = list(data.values())
bars = ax.bar(categories, values, color=self.colors[:len(categories)])
# Add value labels on bars
for bar, value in zip(bars, values):
height = bar.get_height()
ax.annotate(f'${value:,.0f}' if value > 1000 else f'{value:,.0f}',
xy=(bar.get_x() + bar.get_width() / 2, height),
xytext=(0, 3),
textcoords="offset points",
ha='center', va='bottom', fontsize=10)
ax.set_title(title, fontsize=14, fontweight='bold', pad=20)
ax.set_xlabel(xlabel)
ax.set_ylabel(ylabel)
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
buffer = BytesIO()
plt.savefig(buffer, format='png', dpi=150, bbox_inches='tight')
buffer.seek(0)
plt.close()
return buffer
def create_pie_chart(self, data: dict, title: str) -> BytesIO:
"""Create a pie chart for distribution analysis."""
fig, ax = plt.subplots(figsize=(8, 8))
labels = list(data.keys())
sizes = list(data.values())
wedges, texts, autotexts = ax.pie(
sizes,
labels=labels,
autopct='%1.1f%%',
colors=self.colors[:len(labels)],
startangle=90
)
ax.set_title(title, fontsize=14, fontweight='bold', pad=20)
buffer = BytesIO()
plt.savefig(buffer, format='png', dpi=150, bbox_inches='tight')
buffer.seek(0)
plt.close()
return buffer
What each function does:
create_line_chart() - Draws a line connecting data points over time. Perfect for showing "sales this month vs last month" or "website visitors per day"
create_bar_chart() - Creates vertical bars where height = value. Great for "revenue by region" or "orders by product category"
create_pie_chart() - Divides a circle into slices showing percentages. Use for "market share" or "expense breakdown"
Key settings explained:
figsize=(10, 5) - The size of your chart (width x height in inches)self.colors - Your brand colours. The code uses professional blues, greens, and purples by defaultdpi=150 - Image quality. Higher = crisper but larger file sizeExample: Creating a sales trend chart
charts = ReportChartGenerator()
# Your data might look like this (simplified):
# Date: Jan 1, Jan 2, Jan 3...
# Revenue: $5000, $6200, $4800...
trend_chart = charts.create_line_chart(
data=my_sales_data,
x_column='date',
y_column='revenue',
title='Daily Revenue - January 2025'
)
# Result: A professional line chart saved in memory, ready for your PDF
Notice that I'm using BytesIO buffers and calling plt.close() after each chart? This prevents memory leaks when generating multiple charts. Report jobs can crash after generating 50+ charts when matplotlib figures aren't closed properly. In production environments, this is critical.
This is where we bring it together—turning your data and charts into a professional PDF document.
Imagine you're laying out a document in Word or InDesign:
This code does all of that automatically. Every time the script runs, it creates a fresh PDF with the latest data, perfectly formatted, without you touching anything.
Real-world example: Instead of copying numbers from Excel, making charts, pasting into Word, adjusting formatting, exporting to PDF... the script produces a polished 10-page report in under 5 seconds.
FPDF2 lets you build PDFs programmatically:
from fpdf import FPDF
from datetime import datetime
class BusinessReportPDF(FPDF):
"""Custom PDF class for business reports."""
def __init__(self, company_name: str = "Your Company"):
super().__init__()
self.company_name = company_name
self.set_auto_page_break(auto=True, margin=15)
def header(self):
"""Add header to each page."""
self.set_font('Helvetica', 'B', 12)
self.cell(0, 10, self.company_name, align='L')
self.set_font('Helvetica', '', 10)
self.cell(0, 10, datetime.now().strftime('%d %B %Y'), align='R', ln=True)
self.line(10, 25, 200, 25)
self.ln(10)
def footer(self):
"""Add footer with page numbers."""
self.set_y(-15)
self.set_font('Helvetica', 'I', 8)
self.cell(0, 10, f'Page {self.page_no()}/{{nb}}', align='C')
def add_title(self, title: str):
"""Add report title."""
self.set_font('Helvetica', 'B', 24)
self.cell(0, 20, title, ln=True, align='C')
self.ln(10)
def add_subtitle(self, subtitle: str):
"""Add section subtitle."""
self.set_font('Helvetica', 'B', 14)
self.set_fill_color(240, 240, 240)
self.cell(0, 10, subtitle, ln=True, fill=True)
self.ln(5)
def add_metric_box(self, label: str, value: str,
change: str = None, is_positive: bool = True):
"""Add a highlighted metric box."""
self.set_font('Helvetica', '', 10)
self.cell(50, 8, label, ln=False)
self.set_font('Helvetica', 'B', 14)
self.cell(40, 8, value, ln=False)
if change:
if is_positive:
self.set_text_color(0, 150, 0)
else:
self.set_text_color(200, 0, 0)
self.set_font('Helvetica', '', 10)
self.cell(0, 8, change, ln=True)
self.set_text_color(0, 0, 0)
else:
self.ln()
def add_chart(self, chart_buffer: BytesIO, width: int = 180):
"""Add a chart image from a BytesIO buffer."""
# Save buffer to temp file (FPDF2 requires file path)
import tempfile
with tempfile.NamedTemporaryFile(suffix='.png', delete=False) as tmp:
tmp.write(chart_buffer.getvalue())
tmp_path = tmp.name
self.image(tmp_path, x=15, w=width)
self.ln(10)
# Clean up temp file
import os
os.unlink(tmp_path)
def add_table(self, headers: list, data: list, col_widths: list = None):
"""Add a data table."""
if col_widths is None:
col_widths = [190 / len(headers)] * len(headers)
# Header row
self.set_font('Helvetica', 'B', 10)
self.set_fill_color(50, 50, 50)
self.set_text_color(255, 255, 255)
for header, width in zip(headers, col_widths):
self.cell(width, 8, header, border=1, fill=True, align='C')
self.ln()
# Data rows
self.set_font('Helvetica', '', 9)
self.set_text_color(0, 0, 0)
for row_idx, row in enumerate(data):
if row_idx % 2 == 0:
self.set_fill_color(245, 245, 245)
else:
self.set_fill_color(255, 255, 255)
for cell, width in zip(row, col_widths):
self.cell(width, 7, str(cell), border=1, fill=True, align='C')
self.ln()
self.ln(5)
def add_paragraph(self, text: str):
"""Add a text paragraph."""
self.set_font('Helvetica', '', 10)
self.multi_cell(0, 6, text)
self.ln(5)
This might look intimidating, but each function does one simple thing:
| Function | What It Does | Like This in Word/PDF |
|---|---|---|
header() | Adds company name + date at top | Your letterhead |
footer() | Adds page numbers at bottom | "Page 1 of 5" |
add_title() | Big, bold heading | Report title |
add_subtitle() | Section heading with grey background | Chapter headings |
add_metric_box() | Shows a key number with label | "Total Revenue: $125,000 +12%" |
add_chart() | Inserts a chart image | Your graphs |
add_table() | Creates a formatted data table | A nice table with headers |
add_paragraph() | Adds regular text | Body text |
The magic of PDF coordinates:
self.set_font('Helvetica', 'B', 14) means "use Helvetica font, Bold, 14 point size"self.cell(0, 10, text) means "create a text box full width, 10mm tall"self.ln(5) means "move down 5mm" (like pressing Enter)Customising for your brand:
# Change these colours to match your company:
self.set_fill_color(50, 50, 50) # Dark grey for table headers
self.set_text_color(0, 150, 0) # Green for positive numbers
self.set_text_color(200, 0, 0) # Red for negative numbers
Now we combine all the pieces into one script that does everything from start to finish.
This is the "master script" that coordinates everything:
Think of it like a conductor leading an orchestra—each musician (function) knows their part, and the conductor (this code) tells them when to play.
Here's the complete pipeline that ties everything together:
from datetime import datetime
import os
def generate_sales_report(data_file: str, output_path: str) -> str:
"""Generate a complete sales report PDF."""
# Step 1: Load and process data
pipeline = ReportDataPipeline(data_file)
pipeline.load_data()
pipeline.filter_date_range('order_date', days=30)
metrics = pipeline.calculate_summary_metrics(
value_column='revenue',
group_column='region'
)
# Step 2: Generate charts
charts = ReportChartGenerator()
# Daily revenue trend
daily_data = pipeline.processed_data.groupby(
pipeline.processed_data['order_date'].dt.date
)['revenue'].sum().reset_index()
daily_data.columns = ['date', 'revenue']
trend_chart = charts.create_line_chart(
daily_data, 'date', 'revenue',
'Daily Revenue Trend (Last 30 Days)'
)
# Revenue by region
region_chart = charts.create_bar_chart(
metrics['by_group'],
'Revenue by Region',
ylabel='Revenue ($)'
)
# Step 3: Build PDF
pdf = BusinessReportPDF(company_name="Acme Manufacturing")
pdf.alias_nb_pages()
pdf.add_page()
# Title
pdf.add_title("Monthly Sales Report")
pdf.add_paragraph(
f"Report generated: {datetime.now().strftime('%d %B %Y at %H:%M')}\n"
f"Period: Last 30 days"
)
# Key metrics section
pdf.add_subtitle("Key Metrics")
pdf.add_metric_box("Total Revenue", f"${metrics['total']:,.2f}", "+12.3%", True)
pdf.add_metric_box("Total Orders", f"{metrics['count']:,}", "+8.1%", True)
pdf.add_metric_box("Average Order", f"${metrics['average']:,.2f}", "-2.1%", False)
pdf.ln(10)
# Charts
pdf.add_subtitle("Revenue Trend")
pdf.add_chart(trend_chart)
pdf.add_page()
pdf.add_subtitle("Regional Performance")
pdf.add_chart(region_chart)
# Data table
pdf.add_subtitle("Top 10 Orders")
top_orders = pipeline.processed_data.nlargest(10, 'revenue')[
['order_id', 'customer', 'region', 'revenue']
].values.tolist()
pdf.add_table(
headers=['Order ID', 'Customer', 'Region', 'Revenue'],
data=[[row[0], row[1], row[2], f"${row[3]:,.2f}"] for row in top_orders],
col_widths=[30, 70, 40, 40]
)
# Save
output_file = os.path.join(output_path,
f"sales_report_{datetime.now().strftime('%Y%m%d')}.pdf")
pdf.output(output_file)
logging.info(f"Report generated: {output_file}")
return output_file
Here's what each section of that code does:
Loading and processing data:
pipeline = ReportDataPipeline(data_file) # Create our data handler
pipeline.load_data() # Open the file
pipeline.filter_date_range('order_date', days=30) # Keep only last 30 days
metrics = pipeline.calculate_summary_metrics(...) # Do the maths
Creating the charts:
charts = ReportChartGenerator() # Create our chart maker
trend_chart = charts.create_line_chart(...) # Make the trend line
region_chart = charts.create_bar_chart(...) # Make the bar chart
Building the PDF:
pdf = BusinessReportPDF(company_name="Acme Manufacturing") # Start new PDF
pdf.add_page() # Create first page
pdf.add_title(...) # Add the big title
pdf.add_metric_box(...) # Add key numbers
pdf.add_chart(...) # Insert the charts
pdf.add_table(...) # Add data tables
pdf.output(output_file) # Save the PDF file
How to run it:
report_generator.pyorder_date, revenue, region, order_id, customerpython report_generator.pyReports are useless if they sit on a server. Let's make them land in the right inboxes automatically.
This code does exactly what you'd do manually:
The difference? It does this in milliseconds, at 3am if you want, without you being awake.
Here's how to email them automatically:
import smtplib
import ssl
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.application import MIMEApplication
from dotenv import load_dotenv
load_dotenv()
def send_report_email(report_path: str, recipients: list,
subject: str = "Your Automated Report"):
"""Send the generated report via email."""
smtp_server = os.getenv('SMTP_SERVER', 'smtp.gmail.com')
smtp_port = int(os.getenv('SMTP_PORT', '587'))
sender_email = os.getenv('SENDER_EMAIL')
sender_password = os.getenv('SENDER_PASSWORD')
# Create message
msg = MIMEMultipart()
msg['From'] = sender_email
msg['To'] = ', '.join(recipients)
msg['Subject'] = subject
# Email body
body = """
Hi,
Please find attached your automated report.
This report was generated automatically. If you have questions about
the data, please contact the analytics team.
Regards,
Automated Reporting System
"""
msg.attach(MIMEText(body, 'plain'))
# Attach PDF
with open(report_path, 'rb') as f:
attachment = MIMEApplication(f.read(), _subtype='pdf')
attachment.add_header(
'Content-Disposition',
'attachment',
filename=os.path.basename(report_path)
)
msg.attach(attachment)
# Send
try:
context = ssl.create_default_context()
with smtplib.SMTP(smtp_server, smtp_port) as server:
server.starttls(context=context)
server.login(sender_email, sender_password)
server.sendmail(sender_email, recipients, msg.as_string())
logging.info(f"Report emailed to {', '.join(recipients)}")
return True
except Exception as e:
logging.error(f"Failed to send email: {e}")
return False
Here's what each part does:
# These lines read your email settings from a secure file
smtp_server = os.getenv('SMTP_SERVER') # Your email provider's server
sender_email = os.getenv('SENDER_EMAIL') # Your email address
sender_password = os.getenv('SENDER_PASSWORD') # Your app password
# This creates the email message
msg = MIMEMultipart() # Start a new email
msg['From'] = sender_email # Who it's from
msg['To'] = ', '.join(recipients) # Who it's going to
msg['Subject'] = subject # Email subject line
msg.attach(MIMEText(body, 'plain')) # The message body
# This attaches your PDF
with open(report_path, 'rb') as f: # Open the PDF file
attachment = MIMEApplication(f.read()) # Read it into the email
msg.attach(attachment) # Attach it
Option 1: Gmail (Most Common)
Gmail requires a special "App Password" for scripts—you can't use your regular password.
How to create a Gmail App Password:
Create a .env file in your project folder with your settings:
SMTP_SERVER=smtp.gmail.com
SMTP_PORT=587
SENDER_EMAIL=yourname@gmail.com
SENDER_PASSWORD=your-16-character-app-password
Security warning: Never share your .env file or commit it to version control. Add .env to your .gitignore file.
Option 2: Microsoft 365 / Outlook
SMTP_SERVER=smtp.office365.com
SMTP_PORT=587
SENDER_EMAIL=yourname@company.com.au
SENDER_PASSWORD=your-password
Option 3: Your Company's Email Server
Ask your IT department for:
The final piece—making it run automatically, even when you're asleep.
Right now, you have to manually run your script. Scheduling makes it run automatically at specific times:
It's like setting an alarm clock for your computer.
Here's the code:
import schedule
import time
def run_weekly_report():
"""Generate and distribute the weekly sales report."""
try:
report_file = generate_sales_report(
data_file='/path/to/sales_data.csv',
output_path='/path/to/reports/'
)
send_report_email(
report_path=report_file,
recipients=['manager@company.com', 'finance@company.com'],
subject=f"Weekly Sales Report - {datetime.now().strftime('%d %b %Y')}"
)
except Exception as e:
logging.error(f"Weekly report failed: {e}")
# Send alert email to IT
send_report_email(
report_path=None,
recipients=['it@company.com'],
subject="ALERT: Weekly Report Generation Failed"
)
# Schedule for Monday mornings at 7:00 AM
schedule.every().monday.at("07:00").do(run_weekly_report)
# Keep the script running
if __name__ == "__main__":
logging.info("Report scheduler started")
while True:
schedule.run_pending()
time.sleep(60) # Check every minute
For production deployments, I'd recommend using cron (Linux/Mac) or Task Scheduler (Windows) rather than keeping a Python script running. Here's the cron entry for Monday 7am:
0 7 * * 1 /usr/bin/python3 /path/to/report_generator.py
After implementing automated reporting for dozens of Australian businesses, here are the issues that come up repeatedly:
Your report will only be as good as your data. Build in validation checks:
def validate_data(df: pd.DataFrame) -> tuple[bool, list]:
"""Validate data before report generation."""
issues = []
if df.empty:
issues.append("Dataset is empty")
if df.isnull().sum().sum() > len(df) * 0.1:
issues.append("More than 10% missing values")
# Check for duplicates
if df.duplicated().sum() > 0:
issues.append(f"{df.duplicated().sum()} duplicate rows found")
return len(issues) == 0, issues
Australian businesses often have offices across multiple timezones. Always store datetimes in UTC and convert for display:
from zoneinfo import ZoneInfo
# Store in UTC
utc_time = datetime.now(ZoneInfo('UTC'))
# Display in local timezone
local_time = utc_time.astimezone(ZoneInfo('Australia/Sydney'))
Windows and Mac/Linux handle paths differently. Use pathlib for cross-platform compatibility:
from pathlib import Path
output_dir = Path('/path/to/reports')
output_dir.mkdir(parents=True, exist_ok=True)
output_file = output_dir / f"report_{datetime.now():%Y%m%d}.pdf"
Let's do the maths honestly. According to Business Dasher's 2024 research, businesses using automation report 10-50% cost reduction through decreased manual processing.
Here's a realistic breakdown for a typical weekly report:
| Metric | Manual Process | Automated | Improvement |
|---|---|---|---|
| Time per report | 3 hours | 0 minutes | 100% |
| Annual staff time | 156 hours | 0 hours | 156 hrs saved |
| Error rate | 2-5% | <0.1% | ~98% |
| Delivery time | Variable | Always on time | Consistent |
The multiplier effect: Most businesses have 5-10 regular reports. Once you've built the infrastructure for one, adding more reports costs a fraction of the original build. The same code patterns work for:
New to coding? Here are the issues you'll likely hit:
Problem: Your computer doesn't know where Python is installed.
Fix (Windows): Reinstall Python and make sure to check "Add Python to PATH"
Fix (Mac): Try using python3 instead of python
Problem: The library isn't installed.
Fix: Run pip install pandas (replace pandas with whatever module is missing)
Problem: You're trying to save somewhere you don't have access. Fix: Save to a folder in your Documents or Desktop instead
Problem: Your email credentials are wrong or you need an App Password. Fix: For Gmail, create an App Password (see email setup section above)
Problem: Page layout issues.
Fix: Adjust figsize for charts and use pdf.add_page() before sections that don't fit
Problem: Python can't find your data file.
Fix: Use the full path: C:\Users\YourName\Documents\sales.csv not just sales.csv
If you're ready to automate your reporting:
Resources for learning more:
Want help implementing automated reporting for your business? We've deployed these systems across accounting firms, manufacturers, and logistics companies throughout Australia.
Solve8 helps Australian businesses automate repetitive processes. Based in Brisbane, working nationally. No buzzwords, just systems that work.
Sources: