SIEVE is an eviction algorithm simpler than LRU for web caches.
- It is easy to implement and can be easily integrated into existing systems.
- It achieves state-of-the-art efficiency on skewed workloads.
- It can facilitate the design of advanced eviction algorithms.
Please check out NSDI'24 paper for more details.
- Simulator: a snapshot of libCacheSim
- Traces
- Prototypes
cd scripts && bash install_dependency.sh && bash install_libcachesim.sh;
After building and installing libCacheSim, cachesim
should be in the _build/bin/
directory. You can use cachesim
to run cache simulations.
# ./cachesim DATAPATH TRACE_FORMAT EVICTION_ALGO CACHE_SIZE [OPTION...]
# Run a single cache simulation
./_build/bin/cachesim data/trace.oracleGeneral.bin oracleGeneral sieve 1gb
# Run multiple cache simulations with different algorithms and differnt cache sizes (when CACHE_SIZE is 0)
./_build/bin/cachesim data/trace.oracleGeneral.bin oracleGeneral fifo,lru,clock,sieve 0 --ignore-obj-size 1
Results are recored at result/TRACE_NAME
. (More detailed instructions can be found at libCacheSim.)
We offer a small example trace in mydata/zipf/
, and you can test the miss ratio for varied algorithms.
python3 libCacheSim/scripts/plot_mrc_size.py --tracepath mydata/zipf/zipf_1.0 --trace-format txt --algos=fifo,lru,clock,sieve
Feel free to generate different zipfian traces and try out.
python3 libCacheSim/scripts/data_gen.py -m 10000 -n 1000000 --alpha 1.0 > mydata/zipf/zipf_1.0
The traces we used in the paper can be downloaded in the following.
Trace | Approx Time | # traces | cache type | # request (million) | # object (million) |
---|---|---|---|---|---|
Tencent Photo | 2018 | 2 | object | 5,650 | 1,038 |
Wiki CDN | 2019 | 3 | object | 2,863 | 56 |
Twitter KV | 2020 | 54 | KV | 195,441 | 10,650 |
Meta KV | 2022 | 5 | KV | 1,644 | 82 |
Meta CDN | 2023 | 3 | object | 231 | 76 |
We chose the most popular cache libraries/systems from five different languages: C++, Go, JavaScript, Python, and Rust, and replaced the LRU with SIEVE.
- groupcache (Golang): 21 lines of code change.
- mnemonist (Javascript): 12 lines of code change.
- lru-rs (Rust): 16 lines of code change.
- lru-dict (Python + C): 21 lines of code change.
- cachelib (C++): Cachelib is highly optimized for LRU-based algorithms. Many optimizations are not needed for SIEVE, making it impractical to quantify code modifications for integration with SIEVE.
@inproceedings{zhang2024-sieve,
title={SIEVE is Simpler than LRU: an Efficient Turn-Key Eviction Algorithm for Web Caches},
author={Zhang, Yazhuo and Yang, Juncheng and Yue, Yao and Vigfusson, Ymir and Rashmi, K.V.},
booktitle={USENIX Symposium on Networked Systems Design and Implementation (NSDI'24)},
year={2024}
}