stonet-research/storage-systems-wiki-reading-list

 
 


Welcome to the VU Amsterdam - storage-systems wiki!

We will collect and grow the reading list for the Storage Systems class (https://atlarge-research.com/courses/storage-systems-vu/) at VU Amsterdam.


If you find material from the class useful, please consider citing our work:

Reviving Storage Systems Education in the 21st Century — An experience report, Animesh Trivedi, Matthijs Jansen, Krijn Doekemeijer, Sacheendra Talluri, Nick Tehrany (2024 May) In 2024 IEEE/ACM 24th International Symposium on Cluster, Cloud and Internet Computing (CCGrid'24), https://www.computer.org/csdl/proceedings-article/ccgrid/2024/956600a616/20ShedG8GeQ.

@INPROCEEDINGS {2024-ccgrid-stosys-experience,
author = { Trivedi, Animesh and Jansen, Matthijs and Doekemeijer, Krijn and Talluri, Sacheendra and Tehrany, Nick },
booktitle = { 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing (CCGrid) },
title = { Reviving Storage Systems Education in the 21st Century — An experience report },
year = {2024},
volume = {},
ISSN = {},
pages = {616-625},
doi = {10.1109/CCGrid59990.2024.00074},
url = {https://doi.ieeecomputersociety.org/10.1109/CCGrid59990.2024.00074},
publisher = {IEEE Computer Society},
address = {Los Alamitos, CA, USA},
month = {May}
}

Contributions: please open a pull request with a 1-2 line description of the paper!

Table of Contents

  1. NVM storage, introduction, device-level details
  2. Host interfacing, OS and Storage I/O Stack
  3. SSD internals (FTL, GC, buffering, staging, scheduling)
  4. NVMe, Flash, PMEM, SSD File Systems
  5. Key-Value Storage and Caches
  6. Persistent Memories / disaggregation / CXL
  7. Networked/distributed Flash/NVMoF/Storage Disaggregation
  8. Programmable storage, acceleration, offloading, computational storage, workload-specific storage
  9. Storage Virtualization, Emulation, Simulation
  10. Flash I/O Scheduling and quality-of-service/multi-tenancy
  11. Reliability and failures studies
  12. Graphs Storage and Processing Systems
  13. Performance, Efficiency, Scalability
  14. NVM storage and Energy consumption
  15. Database, Timeseries, VectorDB, Lookup, Indexes on Storage
  16. Emerging systems architectures
  17. Emerging storage interfaces and features
  18. SNIA/NVMe weblinks
  19. Benchmarking, traces, profiling, monitoring, and characterization
  20. RAID, Compression, De-duplication
  21. ML and (Storage) Systems
  22. A selection of storage related surveys
  23. Company-specific stacks
  24. Unclassified

1. NVM storage, introduction, device-level details

  • Michael Cornwell. 2012. Anatomy of a Solid-state Drive: While the ubiquitous SSD shares many features with the hard-disk drive, under the surface they are completely different. Queue 10, 10 (October 2012), 30–36. https://doi.org/10.1145/2381996.2385276
  • Mihir Nanavati, Malte Schwarzkopf, Jake Wires, and Andrew Warfield. 2015. Non-volatile Storage: Implications of the Datacenter’s Shifting Center. Queue 13, 9 (November-December 2015), 33–56. https://doi.org/10.1145/2857274.2874238
  • Ethan Miller, Achilles Benetopoulos, George Neville-Neil, Pankaj Mehra, and Daniel Bittman. 2023. Pointers in Far Memory: A rethink of how data and computations should be organized. Queue 21, 3, Pages 50 (May/June 2023), 19 pages. https://doi.org/10.1145/3606029

2. Host interfacing, OS and Storage I/O Stack

3. SSD internals (FTL, GC, buffering, staging, scheduling)

4. NVMe, Flash, PMEM, SSD File Systems

5. Key-Value Storage and Caches

6. Persistent Memories / disaggregation / CXL

  • Bolong Zheng, Yongyong Gao, Jingyi Wan, Lingsen Yan, Long Hu, Bo Liu, Yunjun Gao, Xiaofang Zhou, and Christian S. Jensen. 2023. DecLog: Decentralized Logging in Non-Volatile Memory for Time Series Database Systems. Proc. VLDB Endow. 17, 1 (September 2023), 1–14. https://doi.org/10.14778/3617838.3617839
  • Ying Zheng and Kian-Lee Tan. 2024. Sorting on Byte-Addressable Storage: The Resurgence of Tree Structure. Proc. VLDB Endow. 17, 6 (February 2024), 1487–1500. https://doi.org/10.14778/3648160.3648185
  • BonsaiKV: Towards Fast, Scalable, and Persistent Key-Value Stores with Tiered, Heterogeneous Memory System. Proc. VLDB Endow. 17, 4 (December 2023), 726–739. https://doi.org/10.14778/3636218.3636228
  • FluidKV: Seamlessly Bridging the Gap between Indexing Performance and Memory-Footprint on Ultra-Fast Storage. Proc. VLDB Endow. 17, 6 (February 2024), 1377–1390. https://doi.org/10.14778/3648160.3648177
  • CXL and the Return of Scale-Up Database Engines. Proc. VLDB Endow. 17, 10 (June 2024), 2568–2575. https://doi.org/10.14778/3675034.3675047
  • Anchor: A Library for Building Secure Persistent Memory Systems. Proc. ACM Manag. Data 1, 4, Article 231 (December 2023), 31 pages. https://doi.org/10.1145/3626718
  • Scalable Distributed Inverted List Indexes in Disaggregated Memory. Proc. ACM Manag. Data 2, 3, Article 171 (June 2024), 27 pages. https://doi.org/10.1145/3654974
  • Persistent Memory Research in the Post-Optane Era. In Proceedings of the 1st Workshop on Disruptive Memory Systems (DIMES '23). Association for Computing Machinery, New York, NY, USA, 23–30. https://doi.org/10.1145/3609308.3625268
  • A Comprehensive Empirical Study of File Systems on Optane Persistent Memory, https://ieeexplore.ieee.org/abstract/document/9605448
  • TPP: Transparent Page Placement for CXL-Enabled Tiered-Memory. In Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3 (ASPLOS 2023). Association for Computing Machinery, New York, NY, USA, 742–755. https://doi.org/10.1145/3582016.3582063
  • Pond: CXL-Based Memory Pooling Systems for Cloud Platforms. In Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2 (ASPLOS 2023). Association for Computing Machinery, New York, NY, USA, 574–587. https://doi.org/10.1145/3575693.3578835
  • Cache in Hand: Expander-Driven CXL Prefetcher for Next Generation CXL-SSD. In Proceedings of the 15th ACM Workshop on Hot Topics in Storage and File Systems (HotStorage '23). Association for Computing Machinery, New York, NY, USA, 24–30. https://doi.org/10.1145/3599691.3603406
  • Hello bytes, bye blocks: PCIe storage meets compute express link for memory expansion (CXL-SSD). In Proceedings of the 14th ACM Workshop on Hot Topics in Storage and File Systems (HotStorage '22). Association for Computing Machinery, New York, NY, USA, 45–51. https://doi.org/10.1145/3538643.3539745
  • Jianguo Wang and Qizhen Zhang. 2023. Disaggregated Database Systems. In Companion of the 2023 International Conference on Management of Data (SIGMOD '23). Association for Computing Machinery, New York, NY, USA, 37–44. https://doi.org/10.1145/3555041.3589403
  • Hasan Al Maruf and Mosharaf Chowdhury. 2023. Memory Disaggregation: Advances and Open Challenges. SIGOPS Oper. Syst. Rev. 57, 1 (June 2023), 29–37. https://doi.org/10.1145/3606557.3606562
  • Marcos K. Aguilera, Emmanuel Amaro, Nadav Amit, Erika Hunhoff, Anil Yelam, and Gerd Zellweger. 2023. Memory disaggregation: why now and what are the challenges. SIGOPS Oper. Syst. Rev. 57, 1 (June 2023), 38–46. https://doi.org/10.1145/3606557.3606563
  • Direct Access, High-Performance Memory Disaggregation with DirectCXL, https://www.usenix.org/conference/atc22/presentation/gouk
  • FlatFS: Flatten Hierarchical File System Namespace on Non-volatile Memories, https://www.usenix.org/conference/atc22/presentation/cai
  • Poseidon: Safe, Fast and Scalable Persistent Memory Allocator (Middleware, 2020), https://dl.acm.org/doi/10.1145/3423211.3425671
  • Persistent State Machines for Recoverable In-memory Storage Systems with NVRam, https://www.usenix.org/conference/osdi20/presentation/zhang-wen
  • Wenda Tang, Ying Han, Tianxiang Ai, Guanghui Li, Bin Yu, Xin Yang. 2024. Yggdrasil: Reducing Network I/O Tax with (CXL-Based) Distributed Shared Memory. In Proceedings of the 53rd International Conference on Parallel Processing. https://dl.acm.org/doi/10.1145/3673038.3673138
  • Qizhen Zhang, Philip A. Bernstein, Badrish Chandramouli, Jiasheng Hu, Yiming Zheng. 2024. DDS: DPU-optimized Disaggregated Storage. PVLDB, 17(11). https://www.vldb.org/pvldb/vol17/p3304-zhang.pdf
  • Minseon Ahn, Thomas Willhalm, Norman May, Donghun Lee, Suprasad Mutalik Desai, Daniel Booss, Jungmin Kim, Navneet Singh, Daniel Ritter, Oliver Rebholz. 2024. An Examination of CXL Memory Use Cases for In-Memory Database Management Systems using SAP HANA. PVLDB, 17(12). https://www.vldb.org/pvldb/vol17/p3827-ahn.pdf
  • SPMFS: A Scalable Persistent Memory File System on Optane Persistent Memory. In Proceedings of the 50th International Conference on Parallel Processing (ICPP '21). Association for Computing Machinery, New York, NY, USA, Article 3, 1–10. https://doi.org/10.1145/3472456.3472503
  • A Survey of Non-Volatile Main Memory File Systems, https://jcst.ict.ac.cn/en/article/pdf/preview/10.1007/s11390-023-1054-3.pdf (https://link.springer.com/article/10.1007/s11390-023-1054-3)
  • Ziggurat: A Tiered File System for Non-Volatile Main Memories and Disks https://www.usenix.org/conference/fast19/presentation/zheng

    FileBench, RocksDB, SQLite, MySQL

  • Characterizing the performance of intel optane persistent memory: a close look at its on-DIMM buffering. In Proceedings of the Seventeenth European Conference on Computer Systems (EuroSys '22). Association for Computing Machinery, New York, NY, USA, 488–505. https://doi.org/10.1145/3492321.3519556
  • "CFFS: A Persistent Memory File System for Contiguous File Allocation With Fine-Grained Metadata," in IEEE Access, vol. 10, pp. 91678-91698, 2022, doi: 10.1109/ACCESS.2022.3202532 https://ieeexplore.ieee.org/abstract/document/9869672
  • ByteFS: System Support for (CXL-based) Memory-Semantic Solid-State Drives, Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’25). https://arxiv.org/abs/2501.04993
  • M5: Mastering Page Migration and Memory Management for CXL-based Tiered Memory Systems, Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’25). https://dl.acm.org/doi/abs/10.1145/3676641.3711999
  • Systematic CXL Memory Characterization and Performance Analysis at Scale, Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’25). http://web.cs.ucla.edu/~yuyue/assets/files/melody.pdf
  • EDM: An Ultra-Low Latency Ethernet Fabric for Memory Disaggregation, Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’25). https://arxiv.org/abs/2411.08300

7. Networked/distributed Flash/NVMoF/Storage Disaggregation

8. Programmable storage, acceleration, offloading, computational storage, workload-specific storage

9. Storage Virtualization, Emulation, Simulation

10. Flash I/O Scheduling and quality-of-service/multi-tenancy

11. Reliability and failures studies

12. Graphs Storage and Processing Systems

13. Performance, Efficiency, Scalability

14. NVM storage and Energy consumption

15. Database, Timeseries, VectorDB, Lookup, Indexes on Storage

  • SingleStore-V: An Integrated Vector Database System in SingleStore, VLDB 2024, https://doi.org/10.14778/3685800.3685805
  • Alexander van Renen, Dominik Horn, Pascal Pfeil, Kapil Vaidya, Wenjian Dong, Murali Narayanaswamy, Zhengchun Liu, Gaurav Saxena, Andreas Kipf, and Tim Kraska. 2024. Why TPC is Not Enough: An Analysis of the Amazon Redshift Fleet. Proc. VLDB Endow. 17, 11 (July 2024), 3694–3706. https://doi.org/10.14778/3681954.3682031
  • Apache TsFile: An IoT-native Time Series File Format, VLDB 2024.
  • Cloud-Native Database Systems and Unikernels: Reimagining OS Abstractions for Modern Hardware. Proc. VLDB Endow. 17, 8 (April 2024), 2115–2122. https://doi.org/10.14778/3659437.3659462
  • An Empirical Evaluation of Columnar Storage Formats. Proc. VLDB Endow. 17, 2 (October 2023), 148–161. https://doi.org/10.14778/3626292.3626298
  • Are There Fundamental Limitations in Supporting Vector Data Management in Relational Databases? A Case Study of PostgreSQL. Proceedings of International Conference on Data Engineering (ICDE), 2024.
  • Vector Database Management Techniques and Systems. In Companion of the 2024 International Conference on Management of Data (SIGMOD/PODS '24). Association for Computing Machinery, New York, NY, USA, 597–604. https://doi.org/10.1145/3626246.3654691
  • Milvus: A Purpose-Built Vector Data Management System. In Proceedings of the 2021 International Conference on Management of Data (SIGMOD '21). Association for Computing Machinery, New York, NY, USA, 2614–2627. https://doi.org/10.1145/3448016.3457550
  • Vexless: A Serverless Vector Data Management System Using Cloud Functions. Proc. ACM Manag. Data 2, 3, Article 187 (June 2024), 26 pages. https://doi.org/10.1145/3654990
  • Limousine: Blending Learned and Classical Indexes to Self-Design Larger-than-Memory Cloud Storage Engines. Proc. ACM Manag. Data 2, 1, Article 47 (February 2024), 28 pages. https://doi.org/10.1145/3639302
    • We present Limousine, a self-designing key-value storage engine, that can automatically morph to the near-optimal storage engine architecture shape given a workload, a cloud budget, and target performance
  • Viktor Leis. 2024. LeanStore: A High-Performance Storage Engine for NVMe SSDs. PVLDB 17(12). https://www.vldb.org/pvldb/vol17/p4536-leis.pdf
  • What Modern NVMe Storage Can Do, And How To Exploit It: High-Performance I/O for High-Performance Storage Engines https://www.vldb.org/pvldb/vol16/p2090-haas.pdf
  • AirIndex: Versatile Index Tuning Through Data and Storage, https://dl.acm.org/doi/10.1145/3617308
  • Starling: An I/O-Efficient Disk-Resident Graph Index Framework for High-Dimensional Vector Similarity Search on Data Segment. Proc. ACM Manag. Data 2, 1, Article 14 (February 2024), 27 pages. https://doi.org/10.1145/3639269
  • Chenguang Fang, Zijie Chen, Shaoxu Song, Xiangdong Huang, Chen Wang, and Jianmin Wang. 2024. On Reducing Space Amplification with Multi-Column Compaction in Apache IoTDB. Proc. VLDB Endow. 17, 11 (July 2024), 2974–2986. https://doi.org/10.14778/3681954.3681977
  • Time Series Representation for Visualization in Apache IoTDB. Proc. ACM Manag. Data 2, 1, Article 35 (February 2024), 26 pages. https://doi.org/10.1145/3639290
  • MOST: Model-Based Compression with Outlier Storage for Time Series Data, (SIGMOD 2023) https://dl.acm.org/doi/10.1145/3626737
  • Jalal Mostafa, Sara Wehbi, Suren Chilingaryan, Andreas Kopmann. 2022. SciTS: A Benchmark for Time-Series Databases in Scientific Experiments and Industrial Internet of Things. https://arxiv.org/abs/2204.09795v2
  • Peter Bailis, Camille Fournier, Joy Arulraj, and Andy Pavlo. 2016. Research for Practice: Distributed Consensus and Implications of NVM on Database Management Systems: Expert-curated Guides to the Best of CS Research. Queue 14, 3 (May-June 2016), 53–67. https://doi.org/10.1145/2956641.2967618

16. Emerging systems architectures

17. Emerging storage interfaces and features

17.1 ZNS: Explanations, research directions and ZNS extensions

17.2 Other interfaces

  • The Design and Implementation of a Capacity-Variant Storage System, Ziyang Jiao and Xiangqun Zhang, Syracuse University; Hojin Shin and Jongmoo Choi, Dankook University; Bryan S. Kim, Syracuse University, https://www.usenix.org/conference/fast24/presentation/jiao
  • LightNVM: The Linux Open-Channel SSD Subsystem, https://www.usenix.org/system/files/conference/fast17/fast17-bjorling.pdf
  • "KAML: A Flexible, High-Performance Key-Value SSD," 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA), Austin, TX, USA, 2017, pp. 373-384, doi: 10.1109/HPCA.2017.15. https://ieeexplore.ieee.org/abstract/document/7920840/
  • "Elevating Commodity Storage with the SALSA Host Translation Layer," 2018 IEEE 26th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), Milwaukee, WI, USA, 2018, pp. 277-292, doi: 10.1109/MASCOTS.2018.00035. https://ieeexplore.ieee.org/abstract/document/8526893/
  • AutoStream: automatic stream management for multi-streamed SSDs. In Proceedings of the 10th ACM International Systems and Storage Conference (SYSTOR '17). Association for Computing Machinery, New York, NY, USA, Article 3, 1–11. https://doi.org/10.1145/3078468.3078469
  • Jeong-Uk Kang, Jeeseok Hyun, Hyunjoo Maeng, and Sangyeun Cho. 2014. The multi-streamed solid-state drive. In Proceedings of the 6th USENIX conference on Hot Topics in Storage and File Systems (HotStorage'14). USENIX Association, USA, 13.

17.3 FDP

  • J. Park, H. Kim, J. Ha, H. Jung and H. Eom, "FDPVirt: Flexible Data Placement SSD Emulator," 2024 IEEE International Conference on Cluster Computing Workshops (CLUSTER Workshops), Kobe, Japan, 2024, pp. 198-199, doi: 10.1109/CLUSTERWorkshops61563.2024.00057.
  • Towards Efficient Flash Caches with Emerging NVMe Flexible Data Placement SSDs. EuroSys'25. https://arxiv.org/abs/2503.11665

18. SNIA/NVMe weblinks

19. Benchmarking, traces, profiling, monitoring, and characterization

  • LST-Bench: Benchmarking Log-Structured Tables in the Cloud. Proc. ACM Manag. Data 2, 1, Article 59 (February 2024), 26 pages. https://doi.org/10.1145/3639314
  • On the Performance Variation in Modern Storage Stacks, https://www.usenix.org/system/files/conference/fast17/fast17-cao.pdf
  • Phitchaya Mangpo Phothilimthana, Saurabh Kadekodi, Soroush Ghodrati, Selene Moon, and Martin Maas. 2024. Thesios: Synthesizing Accurate Counterfactual I/O Traces from I/O Samples. In Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3 (ASPLOS '24), Vol. 3. Association for Computing Machinery, New York, NY, USA, 1016–1032. https://doi.org/10.1145/3620666.3651337
  • Jinhong Li, Qiuping Wang, Patrick P. C. Lee, and Chao Shi. 2023. An In-depth Comparative Analysis of Cloud Block Storage Workloads: Findings and Implications. ACM Trans. Storage 19, 2, Article 16 (May 2023), 32 pages. https://doi.org/10.1145/3572779
  • Gala Yadgar, Moshe Gabel, Shehbaz Jaffer, and Bianca Schroeder. 2021. SSD-based Workload Characteristics and Their Performance Implications. ACM Trans. Storage 17, 1, Article 8 (February 2021), 26 pages. https://doi.org/10.1145/3423137
  • A. K. Paul, O. Faaland, A. Moody, E. Gonsiorowski, K. Mohror and A. R. Butt, "Understanding HPC Application I/O Behavior Using System Level Statistics," 2020 IEEE 27th International Conference on High Performance Computing, Data, and Analytics (HiPC), Pune, India, 2020, pp. 202-211, doi: 10.1109/HiPC50609.2020.00034.
  • S. Kavalanekar, B. Worthington, Qi Zhang and V. Sharda, "Characterization of storage workload traces from production Windows Servers," 2008 IEEE International Symposium on Workload Characterization, Seattle, WA, 2008, pp. 119-128, doi: 10.1109/IISWC.2008.4636097.
  • Tirthak Patel, Suren Byna, Glenn K. Lockwood, and Devesh Tiwari. 2019. Revisiting I/O behavior in large-scale storage systems: the expected and the unexpected. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '19). Association for Computing Machinery, New York, NY, USA, Article 65, 1–13. https://doi.org/10.1145/3295500.3356183
  • Omkar Desai, Seungmin Shin, Eunji Lee, and Bryan S. Kim. 2022. A principled approach for selecting block I/O traces. In Proceedings of the 14th ACM Workshop on Hot Topics in Storage and File Systems (HotStorage '22). Association for Computing Machinery, New York, NY, USA, 52–58. https://doi.org/10.1145/3538643.3539754
  • Large-Scale Analysis of Docker Images and Performance Implications for Container Storage Systems. IEEE Trans. Parallel Distributed Syst. 32(4): 918-930 (2021)
  • Marc-André Vef, Vasily Tarasov, Dean Hildebrand, and André Brinkmann. 2018. Challenges and Solutions for Tracing Storage Systems: A Case Study with Spectrum Scale. ACM Trans. Storage 14, 2, Article 18 (May 2018), 24 pages. https://doi.org/10.1145/3149376
  • Yang Liu, Raghul Gunasekaran, Xiaosong Ma, and Sudharshan S. Vazhkudai. 2014. Automatic identification of application I/O signatures from noisy server-side traces. In Proceedings of the 12th USENIX conference on File and Storage Technologies (FAST'14). USENIX Association, USA, 213–228.
  • I. Ahmad, "Easy and Efficient Disk I/O Workload Characterization in VMware ESX Server," 2007 IEEE 10th International Symposium on Workload Characterization, Boston, MA, USA, 2007, pp. 149-158, doi: 10.1109/IISWC.2007.4362191.
  • Jayanta Basak, Kushal Wadhwani, and Kaladhar Voruganti. 2016. Storage Workload Identification. ACM Trans. Storage 12, 3, Article 14 (June 2016), 30 pages. https://doi.org/10.1145/2818716
  • Ajay Gulati, Chethan Kumar, and Irfan Ahmad. Storage workload characterization and consolidation in virtualized environments. In Proc. Int'l Workshop on Virtualization Performance: Analysis, Characterization, and Tools (VPACT'09), 2009.
  • Bin Yang, Wei Xue, Tianyu Zhang, Shichao Liu, Xiaosong Ma, Xiyang Wang, and Weiguo Liu. 2023. End-to-end I/O Monitoring on Leading Supercomputers. ACM Trans. Storage 19, 1, Article 3 (February 2023), 35 pages. https://doi.org/10.1145/3568425 (NSDI: https://www.usenix.org/conference/nsdi19/presentation/yang)
  • V. Tarasov, S. Kumar, J. Ma, D. Hildebrand, A. Povzner, G. Kuenning, and E. Zadok. 2012. Extracting flexible, replayable models from large block traces. In Proceedings of the 10th USENIX conference on File and Storage Technologies (FAST'12). USENIX Association, USA, 22. https://static.usenix.org/events/fast12/tech/full_papers/Tarasov.pdf
  • COSBench: cloud object storage benchmark. In Proceedings of the 4th ACM/SPEC International Conference on Performance Engineering (ICPE '13). Association for Computing Machinery, New York, NY, USA, 199–210. https://doi.org/10.1145/2479871.2479900

20. RAID, Compression, De-duplication

21. ML and (Storage) Systems

22. A selection of storage related surveys

23. Company-specific stacks

Biased towards AI/ML (duh!)

23.1 Alibaba

  • Optimizing NVMe Storage for Large-Scale Deployment: Key Technologies and Strategies in Alibaba Cloud, in IEEE Micro, vol. 44, no. 5, pp. 47-56, Sept.-Oct. 2024, doi: 10.1109/MM.2024.3426514. https://ieeexplore.ieee.org/document/10604820

23.2 Baidu

  • CFS: Scaling Metadata Service for Distributed File System via Pruned Scope of Critical Sections. In Proceedings of the Eighteenth European Conference on Computer Systems (EuroSys '23). Association for Computing Machinery, New York, NY, USA, 331–346. https://doi.org/10.1145/3552326.3587443

23.3 Meta

24. Unclassified

  • Rohan Basu Roy and Devesh Tiwari. StarShip: Mitigating I/O Bottlenecks in Serverless Computing for Scientific Workflows. Proc. ACM Meas. Anal. Comput. Syst. 2024. https://dl.acm.org/doi/10.1145/3639028
