English Documentation
A blazingly fast disk cache implementation in Rust with Python bindings, designed to be compatible with python-diskcache while providing superior performance and bulletproof network filesystem support.
diskcache_rs consistently outperforms python-diskcache across all operations:
| Operation | diskcache_rs | python-diskcache | Speedup |
|---|---|---|---|
| Single SET | 8,958 ops/s | 7,444 ops/s | 1.2x faster ⚡ |
| Batch SET (10) | 13,968 ops/s | 1,889 ops/s | 7.4x faster 🚀 |
| Batch SET (100) | 14,699 ops/s | 7,270 ops/s | 2.0x faster ⚡ |
| Cold Start | 806 μs | 14,558 μs | 18x faster 🚀 |
| DELETE | 122k ops/s | 7.7k ops/s | 16x faster 🚀 |
Benchmarks run on Windows 11, Python 3.13, identical test conditions.
- ⚡ Superior Performance: 1.2x to 18x faster than python-diskcache
- 🌐 Network Filesystem Mastery: Bulletproof operation on NFS, SMB, CIFS
- 🔄 Drop-in Replacement: Compatible API with python-diskcache
- 🚀 Ultra-Fast Startup: 18x faster cold start times
- 🧵 True Concurrency: Built with Rust's fearless concurrency
- UltraFast: Memory-only storage for maximum speed
- Hybrid: Smart memory + disk storage with automatic optimization
- File: Traditional file-based storage with network compatibility
- No SQLite Dependencies: Eliminates database corruption on network drives
- Atomic Operations: Ensures data consistency even on unreliable connections
- Thread Safe: Safe for concurrent access from multiple threads and processes (see the sketch after this list)
- Compression Support: Built-in LZ4 compression for space efficiency
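The thread-safety bullet above means a single cache instance can be shared across worker threads. Here is a minimal sketch, assuming the Cache class documented later in this README; the worker pattern itself is only an illustration, not a library API:

```python
from concurrent.futures import ThreadPoolExecutor
from diskcache_rs import Cache

cache = Cache('/tmp/shared_cache')

def worker(n):
    # Each thread writes and then reads its own key in the shared cache
    cache[f'key_{n}'] = f'value_{n}'
    return cache[f'key_{n}']

with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(worker, range(100)))

assert results[0] == 'value_0'
```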
The original python-diskcache can suffer from SQLite corruption on network file systems, as documented in issue #345. This implementation uses a file-based storage engine specifically designed for network filesystems, avoiding the "database disk image is malformed" errors.
pip install diskcache-rs

from diskcache_rs import Cache
# Create a cache
cache = Cache('/tmp/mycache')
# Basic operations
cache['key'] = 'value'
print(cache['key']) # 'value'
# Check if key exists
if 'key' in cache:
print("Key exists!")
# Get with default
value = cache.get('missing_key', 'default')
# Delete
del cache['key']

# Standard installation (Python version-specific wheels)
pip install diskcache-rs
# ABI3 installation (compatible with Python 3.8+)
pip install diskcache-rs --prefer-binary --extra-index-url https://pypi.org/simple/

diskcache_rs provides two types of wheels:
- Standard Wheels (default)
  - Optimized for specific Python versions (3.8, 3.9, 3.10, 3.11, 3.12, 3.13)
  - Smaller download size
  - Maximum performance for your Python version
- ABI3 Wheels (universal)
  - Single wheel compatible with Python 3.8+
  - Larger download size but works across Python versions
  - Ideal for deployment scenarios with multiple Python versions

Building from source requires:

- Rust 1.87+
- Python 3.8+
- maturin (for building Python bindings)
# Clone the repository
git clone https://github.com/loonghao/diskcache_rs.git
cd diskcache_rs
# Install dependencies
uv add diskcache # Optional: for comparison testing
# Standard build (Python version-specific)
uvx maturin develop
# ABI3 build (compatible with Python 3.8+)
uvx maturin develop --features abi3

# Setup development environment
just dev
# Build standard wheels
just release
# Build ABI3 wheels
just release-abi3
# Available commands
just --list

This project uses Commitizen for automated version management and releases.
- Use Conventional Commits: All commits should follow the Conventional Commits specification:
  # Use commitizen for guided commit creation
  just commit
  # Or manually follow the format:
  # feat: add new feature
  # fix: resolve bug
  # docs: update documentation
  # chore: maintenance tasks
- Automatic Releases: When you push to main, the CI will:
  - Analyze commit messages since the last release
  - Automatically bump version in both Cargo.toml and pyproject.toml
  - Generate changelog
  - Create GitHub release
  - Build and publish wheels to PyPI
# Check what version would be bumped (dry run)
just bump --dry-run
# Manually bump version and create changelog
just bump
# Generate changelog only
just changelog

from diskcache_rs import Cache
# Create a cache with size limits
cache = Cache('/tmp/mycache', size_limit=1e9) # 1GB limit
# Dictionary-like interface
cache['key'] = 'value'
print(cache['key']) # 'value'
# Method interface
cache.set('number', 42)
cache.set('data', {'nested': 'dict'})
# Get with default values
value = cache.get('missing', 'default_value')
# Check membership
if 'key' in cache:
print("Found key!")
# Iterate over keys
for key in cache:
print(f"{key}: {cache[key]}")
# Delete items
del cache['key']
cache.pop('number', None) # Safe deletion
# Clear everything
cache.clear()

from diskcache_rs import Cache, FanoutCache
# FanoutCache for better concurrent performance
cache = FanoutCache('/tmp/fanout', shards=8, size_limit=1e9)
# Set with expiration (TTL)
cache.set('temp_key', 'temp_value', expire=3600) # 1 hour
# Touch to update access time
cache.touch('temp_key')
# Atomic operations
with cache.transact():
cache['key1'] = 'value1'
cache['key2'] = 'value2'
# Both operations succeed or fail together
# Statistics and monitoring
stats = cache.stats()
print(f"Hits: {stats.hits}, Misses: {stats.misses}")
print(f"Size: {cache.volume()} bytes")
# Eviction and cleanup
cache.cull() # Manual eviction
cache.expire()  # Remove expired items

from diskcache_rs import FastCache
# Ultra-fast memory-only cache
fast_cache = FastCache(max_size=1000)
# Batch operations for maximum throughput
items = [(f'key_{i}', f'value_{i}') for i in range(1000)]
for key, value in items:
fast_cache[key] = value
# Efficient bulk retrieval
keys = [f'key_{i}' for i in range(100)]
values = [fast_cache.get(key) for key in keys]

from diskcache_rs import Cache
# Works reliably on network drives
network_cache = Cache('//server/share/cache')
# Atomic writes prevent corruption
network_cache['important_data'] = large_dataset
# Built-in retry logic for network issues
try:
value = network_cache['important_data']
except Exception as e:
print(f"Network error handled: {e}")

# settings.py
CACHES = {
'default': {
'BACKEND': 'diskcache_rs.DjangoCache',
'LOCATION': '/tmp/django_cache',
'OPTIONS': {
'size_limit': 1e9, # 1GB
'cull_limit': 0.1, # Remove 10% when full
}
}
}
# In your views
from django.core.cache import cache
cache.set('user_data', user_profile, timeout=3600)
user_data = cache.get('user_data')

import time
import diskcache
from diskcache_rs import Cache
# Setup
data = b'x' * 1024 # 1KB test data
# Original diskcache
dc_cache = diskcache.Cache('/tmp/diskcache_test')
start = time.perf_counter()
for i in range(1000):
dc_cache.set(f'key_{i}', data)
dc_time = time.perf_counter() - start
# diskcache_rs
rs_cache = Cache('/tmp/diskcache_rs_test')
start = time.perf_counter()
for i in range(1000):
rs_cache[f'key_{i}'] = data
rs_time = time.perf_counter() - start
print(f"diskcache: {dc_time:.3f}s ({1000/dc_time:.0f} ops/sec)")
print(f"diskcache_rs: {rs_time:.3f}s ({1000/rs_time:.0f} ops/sec)")
print(f"Speedup: {dc_time/rs_time:.1f}x faster")

For drop-in compatibility with python-diskcache:
# Add the python wrapper to your path
import sys
sys.path.insert(0, 'python')
from diskcache_rs import Cache, FanoutCache
# Use like original diskcache
cache = Cache('/path/to/cache')
cache['key'] = 'value'
print(cache['key']) # 'value'
# FanoutCache for better performance
fanout = FanoutCache('/path/to/cache', shards=8)
fanout.set('key', 'value')

Perfect for cloud drives and network storage:

import diskcache_rs
# Works great on network drives
cache = diskcache_rs.PyCache("Z:\\_thm\\temp\\.pkg\\db")
# Or UNC paths
cache = diskcache_rs.PyCache("\\\\server\\share\\cache")
# Handles network interruptions gracefully
cache.set("important_data", b"critical_value")

- Storage Engine: File-based storage optimized for network filesystems
- Serialization: Multiple formats (JSON, Bincode) with compression
- Eviction Policies: LRU, LFU, TTL, and combined strategies
- Concurrency: Thread-safe operations with minimal locking
- Network Optimization: Atomic writes, retry logic, corruption detection
- No SQLite: Avoids database corruption issues
- Atomic Writes: Uses temporary files and atomic renames (see the sketch after this list)
- File Locking: Optional file locking for coordination
- Retry Logic: Handles temporary network failures
- Corruption Detection: Validates data integrity
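The atomic-writes bullet above refers to the classic write-to-a-temporary-file-then-rename pattern. The Rust engine's internals are not shown here; the sketch below is a hypothetical Python illustration of the idea, in which a reader only ever sees the old file or the complete new file:

```python
import os
import tempfile

def atomic_write(path, data):
    """Illustrative only: write bytes so readers never observe a partial file."""
    directory = os.path.dirname(path) or '.'
    # Stage the data in a temporary file on the same filesystem as the target
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # os.replace is an atomic rename on both POSIX and Windows
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```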
| Feature | diskcache_rs | python-diskcache | Notes |
|---|---|---|---|
| Performance | 1.2x - 18x faster | Baseline | Rust implementation advantage |
| Network FS | ✅ Optimized | ⚠️ SQLite corruption risk | File-based vs SQLite |
| Thread Safety | ✅ Yes | ✅ Yes | Both support concurrent access |
| Process Safety | ✅ Yes | ✅ Yes | Multi-process coordination |
| API Compatibility | ✅ Drop-in | ✅ Native | Same interface |
| Memory Usage | 🔥 Lower | Baseline | Rust memory efficiency |
| Startup Time | 🚀 18x faster | Baseline | Minimal initialization |
| Compression | ✅ LZ4 | ✅ Multiple | Built-in compression |
| Eviction Policies | ✅ LRU/LFU/TTL | ✅ LRU/LFU/TTL | Same strategies |
| Serialization | ✅ Multiple | ✅ Pickle | JSON, Bincode, Pickle |
| Type Hints | ✅ Full | ⚠️ Partial | Complete .pyi files |
| Cross Platform | ✅ Yes | ✅ Yes | Windows, macOS, Linux |
| ABI3 Support | ✅ Optional | ❌ No | Single wheel for Python 3.8+ |
| Wheel Types | 🎯 Standard + ABI3 | Standard only | Flexible deployment options |
| Dependencies | 🔥 Minimal | More | Fewer runtime dependencies |
| Installation | 📦 pip install | 📦 pip install | Both available on PyPI |
Benchmarks on cloud drive (Z: drive):
| Operation | diskcache_rs | python-diskcache | Notes |
|---|---|---|---|
| Set (1KB) | ~20ms | ~190ms | 9.5x faster |
| Get (1KB) | ~25ms | ~2ms | Optimization needed |
| Concurrent | ✅ Stable | ✅ Stable* | Both work in this setup |
| Network FS | ✅ Optimized | ⚠️ May fail | Key advantage |

*Note: python-diskcache works on this particular cloud drive but may fail on other network filesystems.
The project includes comprehensive tests for network filesystem compatibility:
# Basic functionality test
uv run python simple_test.py
# Network filesystem specific tests
uv run python test_network_fs.py
# Comparison with original diskcache
uv run python test_detailed_comparison.py
# Extreme conditions testing
uv run python test_extreme_conditions.py

✅ All tests pass on Z: drive (cloud storage):
- Basic operations: ✅
- Concurrent access: ✅
- Large files (1MB+): ✅
- Persistence: ✅
- Edge cases: ✅
import diskcache_rs

cache = diskcache_rs.PyCache(
directory="/path/to/cache",
max_size=1024*1024*1024, # 1GB
max_entries=100000, # 100K entries
)

use std::path::PathBuf;
use diskcache_rs::{Cache, CacheConfig, EvictionStrategy, SerializationFormat, CompressionType};
let config = CacheConfig {
directory: PathBuf::from("/path/to/cache"),
max_size: Some(1024 * 1024 * 1024),
max_entries: Some(100_000),
eviction_strategy: EvictionStrategy::LruTtl,
serialization_format: SerializationFormat::Bincode,
compression: CompressionType::Lz4,
use_atomic_writes: true,
use_file_locking: false, // Disable for network drives
auto_vacuum: true,
vacuum_interval: 3600,
};
let cache = Cache::new(config)?;

Cache is the main cache interface, compatible with python-diskcache:
from diskcache_rs import Cache
cache = Cache(directory, size_limit=None, cull_limit=0.1)

Methods:
- `cache[key] = value` - Set a value
- `value = cache[key]` - Get a value (raises KeyError if missing)
- `value = cache.get(key, default=None)` - Get with default
- `cache.set(key, value, expire=None, tag=None)` - Set with options
- `del cache[key]` - Delete a key
- `key in cache` - Check membership
- `len(cache)` - Number of items
- `cache.clear()` - Remove all items
- `cache.stats()` - Get statistics
- `cache.volume()` - Get total size in bytes
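A short sketch combining several of the methods listed above (the expire and tag arguments follow the signature shown in this list; bulk eviction by tag is not covered here):

```python
from diskcache_rs import Cache

cache = Cache('/tmp/api_demo', size_limit=1e8)  # 100MB limit

# Set with an expiration time and a tag
cache.set('report:2024', b'rendered bytes', expire=600, tag='reports')

# Read patterns
value = cache.get('report:2024', default=None)
if 'report:2024' in cache:
    print(len(cache), cache.volume())

# Remove a key without raising if it is already gone
cache.pop('report:2024', None)
```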
FanoutCache is a sharded cache for better concurrent performance:
from diskcache_rs import FanoutCache
cache = FanoutCache(directory, shards=8, size_limit=None)

Same API as Cache, but with better concurrent performance.
FastCache is a memory-only cache for maximum speed:
from diskcache_rs import FastCache
cache = FastCache(max_size=1000)

Methods:
- `cache[key] = value` - Set a value
- `value = cache[key]` - Get a value
- `value = cache.get(key, default=None)` - Get with default
- `del cache[key]` - Delete a key
- `cache.clear()` - Remove all items
# Run all tests
uv run --group test pytest
# Run specific test categories
uv run --group test pytest -m "not docker" # Skip Docker tests
uv run --group test pytest -m "docker" # Only Docker tests
uv run --group test pytest -m "network" # Network filesystem tests
# Run compatibility tests
uv run --group test pytest tests/test_compatibility.py -v

For comprehensive network filesystem testing, we provide Docker-based simulation:
# Run Docker network tests (requires Docker)
./scripts/test-docker-network.sh
# Or manually with Docker Compose
docker-compose -f docker-compose.test.yml up --build

The Docker tests simulate:
- NFS server environments
- SMB/CIFS server environments
- Network latency conditions
- Concurrent access scenarios
The test suite automatically detects and tests available network paths:
- Windows: UNC paths, mapped drives, cloud sync folders
- Linux/macOS: NFS mounts, SMB mounts, cloud sync folders
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
- python-diskcache for the original inspiration
- PyO3 for excellent Python-Rust bindings
- maturin for seamless Python package building
- python-diskcache - Original Python implementation
- sled - Embedded database in Rust
- rocksdb - High-performance key-value store
We welcome contributions! Here's how to get started:
- Fork the repository
- Create a feature branch: `git checkout -b feature/amazing-feature`
- Install development dependencies: `just dev`
- Make your changes and add tests
- Run the test suite: `just test`
- Format your code: `just format`
- Submit a pull request
# Clone and setup
git clone https://github.com/loonghao/diskcache_rs.git
cd diskcache_rs
# One-command setup
just dev
# Available commands
just --list

just test # Run all tests
just test-cov # Run with coverage
just bench # Run benchmarks
just format # Format code
just lint # Run linting

Licensed under the Apache License, Version 2.0. See LICENSE for details.
- Grant Jenks for the original python-diskcache
- PyO3 team for excellent Python-Rust bindings
- maturin for seamless Python package building
- Rust community for the amazing ecosystem
Note: This project specifically addresses network filesystem issues encountered with SQLite-based caches. For local storage scenarios, both diskcache_rs and python-diskcache are excellent choices, with diskcache_rs offering superior performance.