Automating Image Compression: A Parallel Python Script for Faster, Smarter Optimization

Introduction: The Problem

When managing large web projects or image-heavy applications, storage and performance can quickly become a concern. On our Linux-based server, we faced a common but critical problem: hundreds of megabytes, sometimes gigabytes, of unoptimized image files had accumulated. These files, often uploaded by users or generated by internal systems, were primarily high-resolution JPEG and PNG images.
These oversized assets slowed down page loads, consumed unnecessary disk space, and even slowed down backup and sync jobs.
Compressing them by hand was not practical at this volume, so we needed:

  • A reliable, automated solution
  • Support for both JPEG and PNG
  • And ideally, the ability to process images in parallel for speed

The Solution: A Parallel Image Compression Script in Python

To solve this, I wrote a Python script that scans a folder (recursively), sorts the image files by size and recency, and then compresses them using jpegoptim (for JPEGs) and pngquant (for PNGs). The script uses multithreading to compress multiple images in parallel, dramatically speeding up the process on modern multi-core servers.

How it works

Image Collection

The script recursively searches a target folder for .jpg, .jpeg, and .png files. For each file, it gathers metadata like file size and modification time.
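As a rough illustration of this step, here is a minimal sketch that does the same kind of collection with pathlib instead of os.walk. The names find_images and total_size_mb are hypothetical helpers for this example and are not part of the script further down.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}

def find_images(folder):
    """Return (path, size_in_bytes, mtime) tuples for every PNG/JPEG under folder."""
    found = []
    for path in Path(folder).rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            try:
                stat = path.stat()
            except FileNotFoundError:
                # The file vanished between listing and stat; skip it.
                continue
            found.append((path, stat.st_size, stat.st_mtime))
    return found

def total_size_mb(images):
    """Sum the sizes of the collected images, in megabytes."""
    return sum(size for _, size, _ in images) / (1024 * 1024)
```

Calling total_size_mb(find_images(folder)) before and after a run is one simple way to see how much space was actually saved.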

Smart Sorting

Images are sorted by file size (largest first); files of equal size are then ordered by modification date (newest first). This ensures the script tackles the "biggest problems" first.
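To make the ordering concrete, here is a small example using the same sort key as the script. The three entries are invented sample data.

```python
# Invented sample data: two files share the same size, so mtime breaks the tie.
images = [
    {"path": "a.jpg", "size": 500, "mtime": 100.0},
    {"path": "b.png", "size": 900, "mtime": 50.0},
    {"path": "c.jpg", "size": 500, "mtime": 200.0},
]

# Negating both values turns Python's ascending sort into "largest, then newest".
images.sort(key=lambda x: (-x["size"], -x["mtime"]))

order = [img["path"] for img in images]
print(order)  # ['b.png', 'c.jpg', 'a.jpg']
```

The modification time only matters when two files have exactly the same size; otherwise size alone decides the order.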

Parallel Processing

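Using Python's ThreadPoolExecutor, up to 10 images are compressed simultaneously. Each worker thread simply waits on an external jpegoptim or pngquant process, so the actual compression work is spread across the server's CPU cores, which reduces total runtime.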
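The script hard-codes 10 workers. On a machine with fewer cores, it may be worth capping that number; the following is a minimal sketch under that assumption. pick_worker_count and run_parallel are hypothetical names, not functions from the script.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def pick_worker_count(limit=10):
    """Use at most `limit` workers, but never more than the number of CPU cores."""
    cores = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(limit, cores))

def run_parallel(func, items, limit=10):
    """Apply func to every item using a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=pick_worker_count(limit)) as executor:
        return list(executor.map(func, items))
```

Note that executor.map re-raises the first exception it encounters, whereas the script below uses as_completed and reports each failure individually.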

JPEG Compression

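jpegoptim rewrites JPEG images in place with the following options:

  • --max=70: Caps the quality factor at 70; images saved at a higher quality are recompressed (lossy)
  • --strip-all: Removes metadata such as EXIF data and comments
  • --all-progressive: Forces progressive JPEG output, which renders gradually while loading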
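If the quality cap needs to vary per project, the argument list can be built by a small helper. This is only a sketch: jpegoptim_command and its parameters are hypothetical, and it uses only the three flags listed above.

```python
def jpegoptim_command(path, max_quality=70, strip_metadata=True, progressive=True):
    """Build the jpegoptim argument list for subprocess.run()."""
    if not 0 <= max_quality <= 100:
        raise ValueError("max_quality must be between 0 and 100")
    cmd = ["jpegoptim", f"--max={max_quality}"]
    if strip_metadata:
        cmd.append("--strip-all")
    if progressive:
        cmd.append("--all-progressive")
    cmd.append(path)
    return cmd
```

With the defaults, this produces the same command the script passes to subprocess.run for JPEG files.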

PNG Compression

pngquant handles PNG images with:

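  • --quality=50-80: Uses the fewest colors needed to reach quality 80; if the result would fall below 50, pngquant does not save the file and exits with an error
  • --speed=1: Slowest setting, which gives the best quality and smallest files

Each compressed PNG is saved as a new _compressed.png file next to the original. The original is left in place, so disk usage only drops once the originals are removed or replaced.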
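Because the originals stay on disk, a follow-up step is needed before any space is freed. Here is one possible sketch, assuming you are willing to overwrite the original when the compressed copy is smaller. compressed_name and replace_if_smaller are hypothetical helpers and not part of the script.

```python
import os
from pathlib import Path

def compressed_name(png_path):
    """Derive the '<stem>_compressed.png' name, regardless of extension case."""
    p = Path(png_path)
    return p.with_name(f"{p.stem}_compressed.png")

def replace_if_smaller(original, compressed):
    """Overwrite original with compressed if that saves space; otherwise discard it.

    Returns True when the original was replaced. This is destructive:
    the uncompressed original is gone afterwards.
    """
    original, compressed = Path(original), Path(compressed)
    if compressed.stat().st_size < original.stat().st_size:
        os.replace(compressed, original)
        return True
    compressed.unlink()
    return False
```

Keeping a backup until you have checked the visual quality is a sensible precaution, since pngquant's compression is lossy.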

Error Handling & Output

If an image fails to compress, the script catches the error and prints a clear message—without stopping the whole process. Output from jpegoptim and pngquant is shown for transparency.
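One failure mode is worth checking up front: if jpegoptim or pngquant is not installed, subprocess.run raises FileNotFoundError for every single file, and the script reports each one as a [THREAD ERROR]. A minimal pre-flight check could look like this; missing_tools is a hypothetical helper, not part of the script.

```python
import shutil

def missing_tools(tools=("jpegoptim", "pngquant")):
    """Return the names of required command-line tools that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

# Example use at the start of a run:
#     missing = missing_tools()
#     if missing:
#         sys.exit(f"Missing tools: {', '.join(missing)}")
```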

Conclusion

This simple yet powerful script helped us reduce storage usage by over 70% in some cases, and it runs efficiently on our Ubuntu 24.04 server with minimal resources. It’s a great way to automate a task that most developers overlook—image compression at scale.

If you’re running into similar problems with image bloat, I highly recommend adapting or expanding this script to fit your own environment.

Requirements (Installation)

Before running the script, make sure the following packages are installed on your Ubuntu/Debian server:

sudo apt update && sudo apt install -y pngquant jpegoptim python3-pil python3

Explanation:

  • pngquant – For lossy PNG compression
  • jpegoptim – For JPEG compression with quality and metadata control
  • python3-pil – Pillow (a fork of the Python Imaging Library); the script shown here uses only the standard library and never imports it, so this package is optional
  • python3 – Ensures Python 3 is available


If you also need pip3 for installing Python packages:

sudo apt install -y python3-pip

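Pillow can also be installed via pip. On Ubuntu 24.04 the system Python is marked as externally managed, so run this inside a virtual environment: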

pip3 install pillow

The Script

#!/usr/bin/env python3

import os
import sys
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed

def compress_png(input_path, output_path):
    """
    Compresses a PNG file using pngquant with lossy but high compression.
    """
    try:
        result = subprocess.run(
            ["pngquant", "--quality=50-80", "--speed=1", "--force", "--output", output_path, input_path],
            check=True,
            text=True,
            capture_output=True
        )
        print(f"[PNG] {input_path} → {output_path}")
        if result.stdout.strip():
            print(result.stdout.strip())
    except subprocess.CalledProcessError as e:
        print(f"[ERROR] PNG not compressed: {input_path}")
        print(e.stderr.strip())

def compress_jpeg(input_path):
    """
    Compresses a JPEG file using jpegoptim with maximum quality 70.
    Strips metadata and makes it progressive.
    """
    try:
        result = subprocess.run(
            ["jpegoptim", "--max=70", "--strip-all", "--all-progressive", input_path],
            check=True,
            text=True,
            capture_output=True
        )
        print(f"[JPEG] {input_path}")
        print(result.stdout.strip())
    except subprocess.CalledProcessError as e:
        print(f"[ERROR] JPEG not compressed: {input_path}")
        print(e.stderr.strip())

def process_file(file_path):
    """
    Detects the file type and compresses it accordingly.
    Supports PNG and JPEG/JPG.
    """
    if not os.path.exists(file_path):
        print(f"[WARNING] File not found: {file_path}")
        return

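    root, ext = os.path.splitext(file_path)
    file_ext = ext.lower().lstrip(".")
    if file_ext == "png":
        # Skip outputs from an earlier run so they are not compressed again.
        if root.endswith("_compressed"):
            print(f"[SKIPPED] Already compressed: {file_path}")
            return
        # Build the output name from the stem: str.replace() would miss ".PNG"
        # and would also rewrite ".png" occurring elsewhere in the path.
        output_path = f"{root}_compressed.png"
        compress_png(file_path, output_path)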
    elif file_ext in ["jpg", "jpeg"]:
        compress_jpeg(file_path)
    else:
        print(f"[SKIPPED] Unsupported file format: {file_path}")

def collect_images(folder_path):
    """
    Walks through a folder and collects all PNG/JPEG files,
    along with their size and modification time for sorting.
    """
    image_files = []
    for root, _, files in os.walk(folder_path):
        for file in files:
            if file.lower().endswith((".png", ".jpg", ".jpeg")):
                full_path = os.path.join(root, file)
                try:
                    stat = os.stat(full_path)
                    image_files.append({
                        "path": full_path,
                        "size": stat.st_size,
                        "mtime": stat.st_mtime
                    })
                except FileNotFoundError:
                    continue
    return image_files

def compress_sorted_images(folder_path, max_workers=10):
    """
    Compresses images in parallel, starting with the largest and newest first.
    Uses ThreadPoolExecutor with the specified number of workers.
    """
    images = collect_images(folder_path)
    # Sort: largest files first, then newest
    images.sort(key=lambda x: (-x["size"], -x["mtime"]))

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(process_file, img["path"]) for img in images]

        for future in as_completed(futures):
            try:
                future.result()
            except Exception as e:
                print(f"[THREAD ERROR] {e}")

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python compress_images.py <file-or-folder-path>")
        sys.exit(1)

    path = sys.argv[1]
    if os.path.isdir(path):
        compress_sorted_images(path, max_workers=10)
    elif os.path.isfile(path):
        process_file(path)
    else:
        print("Error: Invalid file or folder path.")
        sys.exit(1)