How to compress Python objects before saving to cache?

Sometimes we need to compress Python objects (lists, dictionaries, strings, etc.) before saving them to cache and decompress them after reading from cache. This is particularly useful when dealing with large data structures that consume significant memory.

Before implementing compression, evaluate whether it is actually needed: check whether the data structures are too large to fit uncompressed in the cache. Compression and decompression add CPU overhead on every cache write and read, which must be weighed against the memory saved.
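One way to make that evaluation concrete is to time the compression of a representative payload and compare the space saved. This is a minimal sketch; the payload here is a hypothetical stand-in for whatever object you intend to cache:

```python
import pickle
import time
import zlib

# Hypothetical payload standing in for a real cached object
payload = {'rows': list(range(10_000))}

serialized = pickle.dumps(payload)

start = time.perf_counter()
compressed = zlib.compress(serialized)
elapsed = time.perf_counter() - start

savings = 1 - len(compressed) / len(serialized)
print(f"Compression took {elapsed * 1000:.2f} ms, "
      f"saved {savings:.1%} of {len(serialized)} bytes")
```

If the measured savings are small or the payload is tiny, the CPU cost per cache access may not be worth it.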

Using zlib for Compression

If compression is necessary, zlib is the most commonly used library. It provides efficient compression with adjustable compression levels.

Syntax

zlib.compress(data, level=-1)

Parameters:

  • data − The bytes object to compress (strings must be encoded to bytes first)
  • level − Compression level from 0 to 9, or -1: 0 means no compression, 1 is fastest with the least compression, 9 is slowest with the most compression. The default of -1 selects a balanced trade-off equivalent to level 6.
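The parameters above can be exercised directly. Note that compress() accepts only bytes, and that the level changes the output size for the same input:

```python
import zlib

data = b"hello world " * 1000  # must be a bytes object, not a str

fast = zlib.compress(data, 1)  # level 1: fastest, least compression
best = zlib.compress(data, 9)  # level 9: slowest, most compression

print(f"original={len(data)}  level 1={len(fast)}  level 9={len(best)}")
```

For highly repetitive input like this, both levels shrink the data dramatically; the gap between levels widens on larger, less regular payloads.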

Example: Compressing a Dictionary

Here's how to compress and decompress a Python dictionary using pickle and zlib −

import pickle
import zlib

# Sample data to compress
user_data = {
    'users': ['Alice', 'Bob', 'Charlie'] * 100,  # Large dataset
    'scores': [95, 87, 92] * 100,
    'metadata': {'version': '1.0', 'created': '2024-01-01'}
}

# Serialize and compress
serialized_data = pickle.dumps(user_data)
compressed_data = zlib.compress(serialized_data, level=6)

print(f"Original size: {len(serialized_data)} bytes")
print(f"Compressed size: {len(compressed_data)} bytes")
print(f"Compression ratio: {len(compressed_data)/len(serialized_data):.2%}")
Original size: 2418 bytes
Compressed size: 174 bytes
Compression ratio: 7.20%

Example: Decompression

To retrieve the original data from the cache −

import pickle
import zlib

# Recreate the compressed data from the previous example
compressed_data = zlib.compress(pickle.dumps({
    'users': ['Alice', 'Bob', 'Charlie'] * 100,
    'scores': [95, 87, 92] * 100,
    'metadata': {'version': '1.0', 'created': '2024-01-01'}
}), level=6)

# Decompress and deserialize
decompressed_data = zlib.decompress(compressed_data)
restored_data = pickle.loads(decompressed_data)

print(f"Restored users count: {len(restored_data['users'])}")
print(f"First three users: {restored_data['users'][:3]}")
print(f"Metadata: {restored_data['metadata']}")
Restored users count: 300
First three users: ['Alice', 'Bob', 'Charlie']
Metadata: {'version': '1.0', 'created': '2024-01-01'}
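If cached bytes might be corrupted in transit or storage, you can store a checksum alongside the compressed payload and verify it before unpickling. This sketch uses zlib.crc32 for that purpose (zlib's own decompression already performs an internal Adler-32 check; the explicit CRC shown here is an extra, illustrative safeguard):

```python
import pickle
import zlib

original = {'users': ['Alice', 'Bob'], 'scores': [95, 87]}

serialized = pickle.dumps(original)
checksum = zlib.crc32(serialized)      # store this alongside the compressed bytes
compressed = zlib.compress(serialized)

# On retrieval: decompress, then verify before trusting the payload
decompressed = zlib.decompress(compressed)
assert zlib.crc32(decompressed) == checksum  # detect corruption

restored = pickle.loads(decompressed)
print(f"Round trip verified: {restored == original}")
```

Verifying before calling pickle.loads matters because unpickling untrusted or damaged bytes can fail in unpredictable ways.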

Compression Level Comparison

Different compression levels offer trade-offs between speed and compression ratio −

Level         Speed      Compression Ratio   Best For
1             Fastest    Lowest              Real-time applications
6 (default)   Balanced   Good                General purpose
9             Slowest    Highest             Long-term storage
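The trade-offs in the table can be measured for your own data rather than taken on faith. This sketch times each level against the same serialized payload; absolute numbers will vary by machine and input:

```python
import pickle
import time
import zlib

# A moderately large, somewhat compressible payload for comparison
payload = pickle.dumps({'values': list(range(50_000))})

for level in (1, 6, 9):
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    print(f"level={level}: {len(compressed):>8} bytes "
          f"in {elapsed * 1000:.2f} ms")
```

Running a loop like this on a sample of your real cache values is the most reliable way to pick a level.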

Cache Implementation Example

Here's a practical cache class with compression −

import pickle
import zlib

class CompressedCache:
    def __init__(self, compression_level=6):
        self.cache = {}
        self.compression_level = compression_level
    
    def set(self, key, value):
        serialized = pickle.dumps(value)
        compressed = zlib.compress(serialized, self.compression_level)
        self.cache[key] = compressed
    
    def get(self, key):
        if key not in self.cache:
            return None
        compressed = self.cache[key]
        decompressed = zlib.decompress(compressed)
        return pickle.loads(decompressed)

# Usage example
cache = CompressedCache()
cache.set('user_data', {'name': 'Alice', 'scores': [95, 87, 92] * 50})

retrieved_data = cache.get('user_data')
print(f"Retrieved name: {retrieved_data['name']}")
print(f"Scores count: {len(retrieved_data['scores'])}")
Retrieved name: Alice
Scores count: 150
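To confirm the cache is actually saving memory, the class can be extended with a size inspection method. The stored_size helper below is a hypothetical addition for illustration, not part of the class shown above:

```python
import pickle
import zlib

class CompressedCache:
    def __init__(self, compression_level=6):
        self.cache = {}
        self.compression_level = compression_level

    def set(self, key, value):
        self.cache[key] = zlib.compress(
            pickle.dumps(value), self.compression_level)

    def get(self, key):
        if key not in self.cache:
            return None
        return pickle.loads(zlib.decompress(self.cache[key]))

    def stored_size(self, key):
        # Hypothetical helper: number of bytes actually held for a key
        return len(self.cache.get(key, b""))

cache = CompressedCache()
value = {'scores': [95, 87, 92] * 500}
cache.set('user_data', value)

uncompressed_size = len(pickle.dumps(value))
print(f"Stored: {cache.stored_size('user_data')} bytes "
      f"vs {uncompressed_size} bytes uncompressed")
```

Comparing stored bytes against the uncompressed pickle size shows the savings directly, which is useful when deciding whether compression is paying off for a given key.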

Conclusion

Use zlib compression when caching large Python objects to save memory. Choose compression level 6 for balanced performance, or experiment with levels 1-9 based on your speed vs. size requirements.

Updated on: 2026-03-24T19:55:15+05:30
