What Are We Solving?
We want to improve application performance and scalability by reducing expensive database reads through caching, especially in read-heavy systems. The goal is to deliver fast responses while keeping data accurate and fresh.
Challenges
- Cache staleness: Cached data can become outdated when the underlying database changes.
- Cache invalidation: Knowing when and how to update or remove stale cache entries is difficult.
- Consistency: Ensuring the cache and database stay in sync, especially during concurrent writes or failures.
- Balancing freshness vs. performance: Updating the cache too often erodes the performance gain; updating it too rarely serves stale reads.
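One common way to strike that freshness/performance balance is a TTL (time-to-live) on each cache entry: stale data can survive only until the entry expires. As a minimal, self-contained sketch (a plain in-memory dict standing in for Redis, whose `SET` command supports the same idea via its `EX` option):

```python
import time

class TTLCache:
    """Illustrative in-memory cache with per-entry expiry (stand-in for Redis SETEX)."""

    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Expired: treat as a miss so the next read refreshes from the DB
            del self._store[key]
            return None
        return value

cache = TTLCache()
cache.set("user:1", {"name": "Ada"}, ttl_seconds=0.05)
print(cache.get("user:1"))  # fresh read: {'name': 'Ada'}
time.sleep(0.06)
print(cache.get("user:1"))  # expired: None
```

A shorter TTL means fresher data but more DB reads; a longer TTL means fewer DB reads but a larger staleness window.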
Setup:
### Python
```python
import redis
import json

# db_update_user / db_get_user come from your application's data-access layer
from my_database import db_update_user, db_get_user

r = redis.Redis(host='localhost', port=6379, decode_responses=True)
```
On Write:
### Python
```python
def update_user(user_id: int, data: dict):
    # Write to the DB first...
    db_update_user(user_id, data)
    # ...then write the same value to the cache so subsequent reads are fresh
    r.set(f"user:{user_id}", json.dumps(data))
```
On Read:
### Python
```python
def get_user(user_id: int):
    # Try the cache first
    cached = r.get(f"user:{user_id}")
    if cached:
        return json.loads(cached)
    # On a cache miss, read from the DB and populate the cache
    user = db_get_user(user_id)
    if user:
        r.set(f"user:{user_id}", json.dumps(user))
    return user
```
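To see why this pattern pays off, here is the same read path as a runnable flow, with hypothetical in-memory dicts standing in for the database and Redis, and a counter tracking how many reads actually reach the DB:

```python
import json

# Hypothetical in-memory stand-ins so the flow runs without a live DB or Redis
fake_db = {1: {"name": "Ada"}}
fake_cache = {}
db_reads = 0

def db_get_user(user_id):
    global db_reads
    db_reads += 1  # count every trip to the "database"
    return fake_db.get(user_id)

def get_user(user_id):
    cached = fake_cache.get(f"user:{user_id}")
    if cached:
        return json.loads(cached)  # hit: no DB read at all
    user = db_get_user(user_id)    # miss: fall through to the DB
    if user:
        fake_cache[f"user:{user_id}"] = json.dumps(user)  # populate for next time
    return user

get_user(1)      # miss: reads the DB and fills the cache
get_user(1)      # hit: served entirely from the cache
print(db_reads)  # 1
```

Only the first read touches the DB; every repeat read for the same key is served from the cache, which is exactly the win in a read-heavy system.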
Write-Through (or Write-Back) Caching Without a TTL
- In write-through caching, the cache is updated immediately after every DB write, so the cache is always fresh.
- If every write path updates the cache, you can trust cached reads without an expiry.
- If the cache and DB updates are not atomic, the two can still be briefly out of sync, for example if the process crashes between the DB write and the cache write.
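A common mitigation for that non-atomicity is to delete the cache entry on write instead of setting it: a crash between the two steps then costs at most one cache miss, never a stale value served indefinitely. A sketch, again using hypothetical in-memory dicts in place of the real DB and Redis:

```python
import json

# Hypothetical in-memory stand-ins so the sketch runs without a live DB or Redis
fake_db = {}
fake_cache = {}

def db_update_user(user_id, data):
    fake_db[user_id] = data

def update_user_invalidate(user_id, data):
    """Write to the DB, then delete (rather than set) the cache entry.

    If the process dies between the two steps, the worst case is a cache
    miss on the next read, which repopulates from the DB."""
    db_update_user(user_id, data)
    fake_cache.pop(f"user:{user_id}", None)  # invalidate; next read refills

fake_cache["user:1"] = json.dumps({"name": "Old"})
update_user_invalidate(1, {"name": "New"})
print("user:1" in fake_cache)  # False: stale entry is gone
```

The trade-off is one extra DB read after each write; with real Redis the `pop` would be `r.delete(f"user:{user_id}")`.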
Conclusion
Caching is essential for scalable, performant systems, but it introduces complexity around data freshness and consistency. Techniques like write-through caching, TTL expiration, versioning, and careful invalidation help balance speed and accuracy. The right approach depends on your application’s read/write patterns and tolerance for stale data.