Introduction: The Problem
When managing large web projects or image-heavy applications, storage and performance can quickly become a concern. On our Linux-based server, we faced a common but critical problem: hundreds of megabytes—sometimes gigabytes—of unoptimized image files accumulating over time. These files, often uploaded by users or generated by internal systems, were primarily high-resolution JPEG and PNG images.
Over time, these oversized assets slowed down page loads, consumed unnecessary disk space, and even impacted backup and sync jobs.
Manual compression was not an option, so we needed:
- A reliable, automated solution
- Support for both JPEG and PNG
- And ideally, the ability to process images in parallel for speed
The Solution: A Parallel Image Compression Script in Python
How it works
Image Collection
The script recursively searches a target folder for .jpg, .jpeg, and .png files. For each file, it gathers metadata like file size and modification time.
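Smart Sorting
The collected images are sorted so that the largest files are processed first; files of equal size are ordered newest first.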
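The ordering can be checked on a few made-up records. The sketch below assumes the same dictionary shape that collect_images builds in the script (path, size, mtime); the file names and numbers are purely illustrative.

```python
# Hypothetical records in the shape produced by collect_images().
images = [
    {"path": "a.jpg", "size": 2_000, "mtime": 100.0},
    {"path": "b.png", "size": 9_000, "mtime": 50.0},
    {"path": "c.jpg", "size": 9_000, "mtime": 75.0},
]

def sort_key(img):
    # Negating both fields gives descending order: biggest first,
    # and among equal sizes the most recently modified first.
    return (-img["size"], -img["mtime"])

ordered = sorted(images, key=sort_key)
order = [img["path"] for img in ordered]
print(order)  # ['c.jpg', 'b.png', 'a.jpg']
```

Processing the largest files first means the slowest jobs start early instead of holding up the end of the run.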
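Parallel Processing
Using ThreadPoolExecutor, up to 10 images are compressed simultaneously. Each worker thread launches an external jpegoptim or pngquant process and waits for it, so the compression itself runs in parallel across CPU cores and total runtime drops.
JPEG Compression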
jpegoptim is used to compress JPEG images in place with the following options:
- --max=70: Caps quality at 70; images saved at a higher quality are recompressed
- --strip-all: Removes metadata
- --all-progressive: Converts every image to progressive JPEG, a web-friendly format
PNG Compression
pngquant handles PNG images with:
- --quality=50-80: Sets the acceptable lossy quality range; if the result would fall below 50, pngquant does not save the file
- --speed=1: Uses the slowest, most thorough setting (slow but effective)
Each compressed PNG is saved as a new _compressed.png file.
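The output name can be derived by splitting off the extension, which also covers upper-case extensions such as .PNG. This is a minimal sketch; compressed_name is a hypothetical helper used only for illustration.

```python
import os

def compressed_name(input_path, suffix="_compressed"):
    """Return input_path with `suffix` inserted before the extension."""
    # splitext only splits at the final dot, so directory names that
    # contain ".png" and extensions like ".PNG" are left intact.
    root, ext = os.path.splitext(input_path)
    return f"{root}{suffix}{ext}"

print(compressed_name("uploads/logo.png"))     # uploads/logo_compressed.png
print(compressed_name("uploads/BANNER.PNG"))   # uploads/BANNER_compressed.PNG
print(compressed_name("site.png.d/icon.png"))  # site.png.d/icon_compressed.png
```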
Error Handling & Output
Failures are caught per file and reported with the tool's error message, and the output of jpegoptim and pngquant is shown for transparency.
Conclusion
This simple yet powerful script helped us reduce storage usage by over 70% in some cases, and it runs efficiently on our Ubuntu 24.04 server with minimal resources. It’s a great way to automate a task that most developers overlook—image compression at scale.
If you’re running into similar problems with image bloat, I highly recommend adapting or expanding this script to fit your own environment.
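To see what the script saves on your own data, one option is to total the image sizes before and after a run and compare them. The sketch below is an assumption about how such a check might look; total_bytes and percent_saved are hypothetical helpers and are not part of the script.

```python
import os

def total_bytes(folder, extensions=(".jpg", ".jpeg")):
    """Sum the sizes of files under `folder` whose names end in `extensions`."""
    total = 0
    for root, _, files in os.walk(folder):
        for name in files:
            if name.lower().endswith(extensions):
                total += os.path.getsize(os.path.join(root, name))
    return total

def percent_saved(before, after):
    """Percentage reduction from `before` bytes to `after` bytes."""
    if before == 0:
        return 0.0
    return (before - after) / before * 100

# Example with made-up totals: 500 MB before, 140 MB after.
print(f"{percent_saved(500_000_000, 140_000_000):.1f}% saved")  # 72.0% saved
```

Call total_bytes on the folder before and after a run. The default covers JPEGs only, since those are rewritten in place; PNG results are written next to the originals, so compare each PNG with its _compressed.png counterpart instead.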
Requirements (Installation)
Before running the script, make sure the following packages are installed on your Ubuntu/Debian server:
sudo apt update && sudo apt install -y pngquant jpegoptim python3-pil python3
Explanation:
- pngquant – For lossy PNG compression
- jpegoptim – For JPEG compression with quality and metadata control
- python3-pil – Pillow (Python Imaging Library); the script below does not import it, so it is only needed if you extend the script to inspect image contents
- python3 – Ensures Python 3 is available
If you also need pip3 for installing Python packages:
sudo apt install -y python3-pip
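Or install Pillow manually via pip (on Ubuntu 24.04, run this inside a virtual environment, because system-wide pip installs are refused by default):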
pip3 install pillow
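Before a long run, it can help to confirm that both command-line tools are actually on the PATH. This is a minimal sketch using shutil.which from the standard library; missing_tools is a hypothetical helper name, not part of the script.

```python
import shutil

def missing_tools(tools=("jpegoptim", "pngquant")):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

missing = missing_tools()
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("All required tools found.")
```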
The Script
#!/usr/bin/env python3
import os
import sys
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed

def compress_png(input_path, output_path):
    """
    Compresses a PNG file using pngquant with lossy but high compression.
    """
    try:
        result = subprocess.run(
            ["pngquant", "--quality=50-80", "--speed=1", "--force", "--output", output_path, input_path],
            check=True,
            text=True,
            capture_output=True
        )
        print(f"[PNG] {input_path} → {output_path}")
        if result.stdout.strip():
            print(result.stdout.strip())
    except subprocess.CalledProcessError as e:
        print(f"[ERROR] PNG not compressed: {input_path}")
        print(e.stderr.strip())

def compress_jpeg(input_path):
    """
    Compresses a JPEG file using jpegoptim with maximum quality 70.
    Strips metadata and makes it progressive.
    """
    try:
        result = subprocess.run(
            ["jpegoptim", "--max=70", "--strip-all", "--all-progressive", input_path],
            check=True,
            text=True,
            capture_output=True
        )
        print(f"[JPEG] {input_path}")
        print(result.stdout.strip())
    except subprocess.CalledProcessError as e:
        print(f"[ERROR] JPEG not compressed: {input_path}")
        print(e.stderr.strip())

def process_file(file_path):
    """
    Detects the file type and compresses it accordingly.
    Supports PNG and JPEG/JPG.
    """
    if not os.path.exists(file_path):
        print(f"[WARNING] File not found: {file_path}")
        return
    # splitext handles upper-case extensions (.PNG) and paths that
    # contain ".png" elsewhere, which str.replace() would mangle.
    base, ext = os.path.splitext(file_path)
    file_ext = ext.lower().lstrip(".")
    if file_ext == "png":
        output_path = f"{base}_compressed{ext}"
        compress_png(file_path, output_path)
    elif file_ext in ["jpg", "jpeg"]:
        compress_jpeg(file_path)
    else:
        print(f"[SKIPPED] Unsupported file format: {file_path}")

def collect_images(folder_path):
    """
    Walks through a folder and collects all PNG/JPEG files,
    along with their size and modification time for sorting.
    """
    image_files = []
    for root, _, files in os.walk(folder_path):
        for file in files:
            name = file.lower()
            # Skip output from earlier runs so it is not compressed again.
            if name.endswith("_compressed.png"):
                continue
            if name.endswith((".png", ".jpg", ".jpeg")):
                full_path = os.path.join(root, file)
                try:
                    stat = os.stat(full_path)
                    image_files.append({
                        "path": full_path,
                        "size": stat.st_size,
                        "mtime": stat.st_mtime
                    })
                except FileNotFoundError:
                    # The file vanished between listing and stat; ignore it.
                    continue
    return image_files

def compress_sorted_images(folder_path, max_workers=10):
    """
    Compresses images in parallel, starting with the largest and newest first.
    Uses ThreadPoolExecutor with the specified number of workers.
    """
    images = collect_images(folder_path)
    # Sort: largest files first, then newest
    images.sort(key=lambda x: (-x["size"], -x["mtime"]))
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(process_file, img["path"]) for img in images]
        for future in as_completed(futures):
            try:
                future.result()
            except Exception as e:
                print(f"[THREAD ERROR] {e}")

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python3 compress_images.py <file_or_folder>")
        sys.exit(1)
    path = sys.argv[1]
    if os.path.isdir(path):
        compress_sorted_images(path, max_workers=10)
    elif os.path.isfile(path):
        process_file(path)
    else:
        print("Error: Invalid file or folder path.")
        sys.exit(1)
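
The threading pattern in compress_sorted_images works because each worker thread spends its time waiting on an external process, so the actual compression runs in separate processes in parallel. The sketch below isolates that pattern. It is an illustration rather than part of the script, and it uses the Python interpreter itself as a stand-in command so that it runs without jpegoptim or pngquant installed; run_job and run_all are hypothetical helpers.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_job(n):
    """Run a stand-in external command and return its output as an int."""
    # Stand-in for jpegoptim/pngquant: a child Python process that
    # prints n squared. The worker thread blocks here until it exits.
    result = subprocess.run(
        [sys.executable, "-c", f"print({n} * {n})"],
        check=True, text=True, capture_output=True,
    )
    return int(result.stdout.strip())

def run_all(numbers, max_workers=4):
    """Run one job per number in a thread pool; return {number: result}."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(run_job, n): n for n in numbers}
        # as_completed yields futures in finish order, not submit order.
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results

squares = run_all([1, 2, 3, 4])
print(sorted(squares.items()))  # [(1, 1), (2, 4), (3, 9), (4, 16)]
```

Mapping each future back to its input keeps results attributable even though they arrive out of order, which is also useful if you want per-file reporting in the real script.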