Automating Image Compression: A Parallel Python Script for Faster, Smarter Optimization
Introduction: The Problem
When managing large web projects or image-heavy applications, storage and performance can quickly become a concern. On our Linux-based server, we faced a common but critical problem: hundreds of megabytes—sometimes gigabytes—of unoptimized image files accumulating over time. These files, often uploaded by users or generated by internal systems, were primarily high-resolution JPEG and PNG images.
Over time, these oversized assets slowed down page loads, consumed unnecessary disk space, and even impacted backup and sync jobs.
Manual compression was not an option, so we needed:
- A reliable, automated solution
- Support for both JPEG and PNG
- And ideally, the ability to process images in parallel for speed
The Solution: A Parallel Image Compression Script in Python
To solve this, I wrote a Python script that scans a folder (recursively), sorts the image files by size and recency, and then compresses them using jpegoptim (for JPEGs) and pngquant (for PNGs). The script uses multithreading to compress multiple images in parallel, dramatically speeding up the process on modern multi-core servers.
How it works
Image Collection
The script recursively searches a target folder for .jpg, .jpeg, and .png files. For each file, it gathers metadata like file size and modification time.
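As a minimal sketch of the collection step, the snippet below walks a throwaway directory and records path, size, and modification time for each image. It uses pathlib instead of os.walk, and the names find_images and IMAGE_SUFFIXES are hypothetical and chosen for illustration only.

```python
import tempfile
from pathlib import Path

# Assumed set of extensions, mirroring the ones named in the article.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def find_images(folder):
    """Return (path, size, mtime) tuples for image files under folder."""
    found = []
    for path in Path(folder).rglob("*"):
        # Compare the suffix case-insensitively so "A.JPG" is matched too.
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            stat = path.stat()
            found.append((str(path), stat.st_size, stat.st_mtime))
    return found


# Demo on a temporary directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "sub").mkdir()
    (Path(tmp) / "a.JPG").write_bytes(b"x" * 10)
    (Path(tmp) / "sub" / "b.png").write_bytes(b"x" * 20)
    (Path(tmp) / "notes.txt").write_text("not an image")
    demo_names = sorted(Path(p).name for p, _, _ in find_images(tmp))
    print(demo_names)
```

Files are matched purely by extension here, so a mislabeled file would still be picked up and handed to the compressor.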
Smart Sorting
Images are sorted by file size (largest first); files of equal size are then ordered by modification date (newest first). This ensures the script tackles the "biggest problems" first.
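To illustrate the ordering, here is a small sketch with made-up file records. It uses the same tuple sort key as the script; the file names, sizes, and timestamps are invented for the example.

```python
# Hypothetical records in the same shape the script collects.
images = [
    {"path": "old_large.jpg", "size": 5_000_000, "mtime": 1_700_000_000},
    {"path": "small.png", "size": 200_000, "mtime": 1_720_000_000},
    {"path": "new_large.jpg", "size": 5_000_000, "mtime": 1_710_000_000},
]

# Negating both values sorts descending: largest first, then newest first.
images.sort(key=lambda img: (-img["size"], -img["mtime"]))

order = [img["path"] for img in images]
print(order)
```

Note that the modification time only acts as a tie-breaker: small.png is the newest file but still comes last because it is the smallest.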
Parallel Processing
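Using Python's ThreadPoolExecutor, up to 10 images are compressed simultaneously. Each worker thread only launches an external jpegoptim or pngquant process and waits for it, so the actual compression work is spread across CPU cores and total runtime drops.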
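The pattern can be sketched in isolation as follows. The function fake_compress is a hypothetical stand-in for one external compressor call, and capping the worker count at the number of CPUs is an assumption on my part, not something the script does (it always uses 10).

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_compress(name):
    """Stand-in for one external compressor call (hypothetical)."""
    time.sleep(0.05)
    return name.upper()


def run_parallel(names, max_workers=None):
    # Assumption: never use more workers than CPUs, with 10 as the upper limit.
    workers = max_workers or min(10, os.cpu_count() or 1)
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # Map each future back to its input so results can be labeled.
        futures = {executor.submit(fake_compress, n): n for n in names}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results


results = run_parallel(["a.jpg", "b.png", "c.jpeg"])
print(results)
```

Because as_completed yields futures in completion order, results arrive as soon as each task finishes rather than in submission order.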
JPEG Compression
jpegoptim is used to compress JPEG images with the following options:
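- --max=70: Caps JPEG quality at 70; images saved at a higher quality are recompressed lossily
- --strip-all: Removes all metadata markers (EXIF, comments, and similar)
- --all-progressive: Saves every output file as a progressive JPEG, which suits web delivery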
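Since jpegoptim overwrites files in place by default, it can be worth previewing the result first. The sketch below only builds the argument list; the helper name jpegoptim_command and its dry_run parameter are hypothetical, and it assumes your jpegoptim version supports the --noaction flag, which reports results without modifying files.

```python
def jpegoptim_command(path, max_quality=70, dry_run=False):
    """Build the jpegoptim argument list used for one JPEG file."""
    cmd = [
        "jpegoptim",
        f"--max={max_quality}",
        "--strip-all",
        "--all-progressive",
    ]
    if dry_run:
        # Assumed flag: print what would happen, leave the file untouched.
        cmd.append("--noaction")
    cmd.append(path)
    return cmd


normal_cmd = jpegoptim_command("photo.jpg")
preview_cmd = jpegoptim_command("photo.jpg", dry_run=True)
print(" ".join(preview_cmd))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting problems with unusual file names.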
PNG Compression
pngquant handles PNG images with:
- --quality=50-80: Sets the acceptable lossy quality range (minimum 50, target 80)
- --speed=1: Uses the slowest setting, which gives the best compression result
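Each compressed PNG is saved as a new _compressed.png file next to the original. The original PNG is left in place, so disk space is only reclaimed once you replace or delete it.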
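Deriving the output name is easy to get subtly wrong, for example with uppercase extensions or a ".png" elsewhere in the path. A minimal sketch using os.path.splitext is shown below; the helper names compressed_png_path and is_already_compressed are hypothetical.

```python
import os


def compressed_png_path(input_path):
    """Insert "_compressed" before the extension, keeping its original case."""
    root, ext = os.path.splitext(input_path)
    return f"{root}_compressed{ext}"


def is_already_compressed(path):
    """True if the file name already carries the _compressed suffix."""
    root, _ = os.path.splitext(path)
    return root.endswith("_compressed")


print(compressed_png_path("shots.png/Logo.PNG"))
print(is_already_compressed("shots.png/Logo_compressed.PNG"))
```

A check like is_already_compressed helps avoid recompressing earlier output when the script is run repeatedly over the same folder.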
Error Handling & Output
If an image fails to compress, the script catches the error and prints a clear message—without stopping the whole process. Output from jpegoptim and pngquant is shown for transparency.
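The error-handling idea can be sketched with a small wrapper that reports failures instead of raising. The helper run_tool is hypothetical, and the Python interpreter is used here as a stand-in command so the example runs without jpegoptim or pngquant installed. It also covers the case where the tool is not installed at all, which surfaces as FileNotFoundError rather than CalledProcessError.

```python
import subprocess
import sys


def run_tool(cmd):
    """Run cmd and return (ok, message) instead of raising."""
    try:
        result = subprocess.run(cmd, check=True, text=True, capture_output=True)
        return True, result.stdout.strip()
    except subprocess.CalledProcessError as e:
        # Non-zero exit code: prefer the tool's stderr, else report the code.
        return False, (e.stderr or "").strip() or f"exit code {e.returncode}"
    except FileNotFoundError:
        # The executable itself could not be found on PATH.
        return False, f"command not found: {cmd[0]}"


ok_result = run_tool([sys.executable, "-c", "print('done')"])
fail_result = run_tool([sys.executable, "-c", "import sys; sys.exit(3)"])
missing_result = run_tool(["definitely-not-a-real-tool-xyz"])
print(ok_result, fail_result, missing_result)
```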
Conclusion
This simple yet powerful script helped us reduce storage usage by over 70% in some cases, and it runs efficiently on our Ubuntu 24.04 server with minimal resources. It’s a great way to automate a task that most developers overlook—image compression at scale.
If you’re running into similar problems with image bloat, I highly recommend adapting or expanding this script to fit your own environment.
Requirements (Installation)
Before running the script, make sure the following packages are installed on your Ubuntu/Debian server:
sudo apt update && sudo apt install -y pngquant jpegoptim python3-pil python3
Explanation:
- pngquant – For lossy PNG compression
- jpegoptim – For JPEG compression with quality and metadata control
- python3-pil – Pillow (Python Imaging Library); optional, since the script below identifies images by file extension and does not import Pillow
- python3 – Ensures Python 3 is available
If you also need pip3 for installing Python packages:
sudo apt install -y python3-pip
Or install Pillow manually via pip:
pip3 install pillow
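Before a long run, it may help to verify that both command-line tools are actually on the PATH. The following is a minimal sketch using shutil.which; the names REQUIRED_TOOLS and missing_tools are hypothetical.

```python
import shutil

# The two external tools the script shells out to.
REQUIRED_TOOLS = ("jpegoptim", "pngquant")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing tools: " + ", ".join(missing))
else:
    print("All required tools are installed.")
```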
The Script
#!/usr/bin/env python3
import os
import sys
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed

def compress_png(input_path, output_path):
    """
    Compresses a PNG file using pngquant with lossy but high compression.
    """
    try:
        result = subprocess.run(
            ["pngquant", "--quality=50-80", "--speed=1", "--force",
             "--output", output_path, input_path],
            check=True,
            text=True,
            capture_output=True
        )
        print(f"[PNG] {input_path} → {output_path}")
        if result.stdout.strip():
            print(result.stdout.strip())
    except subprocess.CalledProcessError as e:
        print(f"[ERROR] PNG not compressed: {input_path}")
        print(e.stderr.strip())

def compress_jpeg(input_path):
    """
    Compresses a JPEG file in place using jpegoptim with maximum quality 70.
    Strips metadata and makes it progressive.
    """
    try:
        result = subprocess.run(
            ["jpegoptim", "--max=70", "--strip-all", "--all-progressive", input_path],
            check=True,
            text=True,
            capture_output=True
        )
        print(f"[JPEG] {input_path}")
        print(result.stdout.strip())
    except subprocess.CalledProcessError as e:
        print(f"[ERROR] JPEG not compressed: {input_path}")
        print(e.stderr.strip())

def process_file(file_path):
    """
    Detects the file type and compresses it accordingly.
    Supports PNG and JPEG/JPG.
    """
    if not os.path.exists(file_path):
        print(f"[WARNING] File not found: {file_path}")
        return
    # splitext handles uppercase extensions and dots elsewhere in the path
    root, ext = os.path.splitext(file_path)
    file_ext = ext.lower().lstrip(".")
    if file_ext == "png":
        output_path = f"{root}_compressed{ext}"
        compress_png(file_path, output_path)
    elif file_ext in ["jpg", "jpeg"]:
        compress_jpeg(file_path)
    else:
        print(f"[SKIPPED] Unsupported file format: {file_path}")

def collect_images(folder_path):
    """
    Walks through a folder and collects all PNG/JPEG files,
    along with their size and modification time for sorting.
    """
    image_files = []
    for root, _, files in os.walk(folder_path):
        for file in files:
            name = file.lower()
            # Skip output of earlier runs so it is not compressed again
            if name.endswith("_compressed.png"):
                continue
            if name.endswith((".png", ".jpg", ".jpeg")):
                full_path = os.path.join(root, file)
                try:
                    stat = os.stat(full_path)
                    image_files.append({
                        "path": full_path,
                        "size": stat.st_size,
                        "mtime": stat.st_mtime
                    })
                except FileNotFoundError:
                    # File vanished between listing and stat; ignore it
                    continue
    return image_files

def compress_sorted_images(folder_path, max_workers=10):
    """
    Compresses images in parallel, starting with the largest and newest first.
    Uses ThreadPoolExecutor with the specified number of workers.
    """
    images = collect_images(folder_path)
    # Sort: largest files first, then newest
    images.sort(key=lambda x: (-x["size"], -x["mtime"]))
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(process_file, img["path"]) for img in images]
        for future in as_completed(futures):
            try:
                future.result()
            except Exception as e:
                print(f"[THREAD ERROR] {e}")

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python3 compress_images.py <file-or-folder-path>")
        sys.exit(1)
    path = sys.argv[1]
    if os.path.isdir(path):
        compress_sorted_images(path, max_workers=10)
    elif os.path.isfile(path):
        process_file(path)
    else:
        print("Error: Invalid file or folder path.")
        sys.exit(1)