How to compress Python objects before saving to cache?
Sometimes we need to compress Python objects (lists, dictionaries, strings, etc.) before saving them to cache and decompress them after reading from cache. This is particularly useful when dealing with large data structures that consume significant memory.
Before implementing compression, we should evaluate whether it is actually needed: check whether the data structures are really too large to keep uncompressed in the cache. Compression and decompression add CPU overhead on every read and write, which must be weighed against the memory saved.
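Before reaching for zlib, it can help to measure both the size savings and the round-trip cost on a representative payload. A minimal sketch of such a check (the payload below is a made-up placeholder; substitute a real cache value):
import pickle
import time
import zlib
# Hypothetical stand-in for a real cache value
payload = {'rows': list(range(10000))}
serialized = pickle.dumps(payload)
# Time one compression/decompression round trip
start = time.perf_counter()
compressed = zlib.compress(serialized)
compress_ms = (time.perf_counter() - start) * 1000
start = time.perf_counter()
zlib.decompress(compressed)
decompress_ms = (time.perf_counter() - start) * 1000
print(f"Size: {len(serialized)} -> {len(compressed)} bytes")
print(f"Compress: {compress_ms:.2f} ms, decompress: {decompress_ms:.2f} ms")
If the size reduction is marginal, or the round trip is slow relative to cache access times, skipping compression may be the better choice.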
Using zlib for Compression
If compression is necessary, zlib is the most commonly used library. It provides efficient compression with adjustable levels.
Syntax
zlib.compress(data, level=-1)
Parameters:
- data: The bytes object to compress
- level: Compression level from 0 (no compression) to 9 (slowest, most compression); 1 is fastest with the least compression. The default of -1 is currently equivalent to level 6.
Example: Compressing a Dictionary
Here's how to compress and decompress a Python dictionary using pickle and zlib:
import pickle
import zlib
# Sample data to compress
user_data = {
    'users': ['Alice', 'Bob', 'Charlie'] * 100,  # Large dataset
    'scores': [95, 87, 92] * 100,
    'metadata': {'version': '1.0', 'created': '2024-01-01'}
}
# Serialize and compress
serialized_data = pickle.dumps(user_data)
compressed_data = zlib.compress(serialized_data, level=6)
print(f"Original size: {len(serialized_data)} bytes")
print(f"Compressed size: {len(compressed_data)} bytes")
print(f"Compression ratio: {len(compressed_data)/len(serialized_data):.2%}")
Original size: 2418 bytes
Compressed size: 174 bytes
Compression ratio: 7.20%
Example: Decompression
To retrieve the original data from the cache:
import pickle
import zlib
# Recreate compressed_data from the previous example
compressed_data = zlib.compress(pickle.dumps({
    'users': ['Alice', 'Bob', 'Charlie'] * 100,
    'scores': [95, 87, 92] * 100,
    'metadata': {'version': '1.0', 'created': '2024-01-01'}
}), level=6)
# Decompress and deserialize
decompressed_data = zlib.decompress(compressed_data)
restored_data = pickle.loads(decompressed_data)
print(f"Restored users count: {len(restored_data['users'])}")
print(f"First three users: {restored_data['users'][:3]}")
print(f"Metadata: {restored_data['metadata']}")
Restored users count: 300
First three users: ['Alice', 'Bob', 'Charlie']
Metadata: {'version': '1.0', 'created': '2024-01-01'}
Compression Level Comparison
Different compression levels offer trade-offs between speed and compression ratio:
| Level | Speed | Compression Ratio | Best For |
|---|---|---|---|
| 1 | Fastest | Lowest | Real-time applications |
| 6 (default) | Balanced | Good | General purpose |
| 9 | Slowest | Highest | Long-term storage |
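Actual speeds and ratios depend heavily on the data, so it is worth benchmarking the levels on your own payloads. A small sketch (the sample data is illustrative):
import pickle
import time
import zlib
data = pickle.dumps({'values': list(range(50000))})
# Compare size and timing across representative levels
for level in (1, 6, 9):
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"Level {level}: {len(compressed)} bytes in {elapsed_ms:.2f} ms")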
Cache Implementation Example
Here's a practical cache class with compression:
import pickle
import zlib
class CompressedCache:
    """In-memory cache that pickles and zlib-compresses every stored value."""

    def __init__(self, compression_level=6):
        self.cache = {}
        self.compression_level = compression_level

    def set(self, key, value):
        # Serialize the value, compress it, and store the compressed bytes
        serialized = pickle.dumps(value)
        compressed = zlib.compress(serialized, self.compression_level)
        self.cache[key] = compressed

    def get(self, key):
        # Return None on a cache miss, mirroring dict.get
        if key not in self.cache:
            return None
        compressed = self.cache[key]
        decompressed = zlib.decompress(compressed)
        return pickle.loads(decompressed)

# Usage example
cache = CompressedCache()
cache.set('user_data', {'name': 'Alice', 'scores': [95, 87, 92] * 50})
retrieved_data = cache.get('user_data')
print(f"Retrieved name: {retrieved_data['name']}")
print(f"Scores count: {len(retrieved_data['scores'])}")
Retrieved name: Alice
Scores count: 150
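As noted earlier, very small values may not repay the compression overhead. One possible refinement, sketched here with illustrative names (ThresholdCache and min_size are not from any library), is to compress only payloads above a size threshold and record which entries were compressed:
import pickle
import zlib

class ThresholdCache:
    """Sketch: compress only values whose pickled size is at least min_size bytes."""

    def __init__(self, min_size=1024, compression_level=6):
        self.cache = {}
        self.min_size = min_size
        self.compression_level = compression_level

    def set(self, key, value):
        serialized = pickle.dumps(value)
        if len(serialized) >= self.min_size:
            # Store a flag with the payload so get() knows how to decode it
            self.cache[key] = (True, zlib.compress(serialized, self.compression_level))
        else:
            self.cache[key] = (False, serialized)

    def get(self, key):
        if key not in self.cache:
            return None
        is_compressed, payload = self.cache[key]
        if is_compressed:
            payload = zlib.decompress(payload)
        return pickle.loads(payload)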
Conclusion
Use zlib compression when caching large Python objects to save memory. Choose compression level 6 for balanced performance, or experiment with levels 1-9 based on your speed vs. size requirements.
