3 changes: 2 additions & 1 deletion docs/mkdocs.yml
@@ -12,7 +12,8 @@ nav:
- Getting Started:
- getting_started/quickstart.md
- getting_started/installation.md
- Examples:
- Examples:
- examples/reader.md
- examples/simulation.md
- examples/analysis.md
- examples/plugins.md
55 changes: 55 additions & 0 deletions docs/src/en/examples/reader.md
@@ -0,0 +1,55 @@
# Trace Reader

We support a unified trace reader to open trace files in different format and read the requests.

Review comment (Contributor, medium):

There's a small grammatical error here. It should be 'formats' instead of 'format'.

Suggested change
We support a unified trace reader to open trace files in different format and read the requests.
We support a unified trace reader to open trace files in different formats and read the requests.


## Basic usage

The `TraceReader` class is the core of this functionality. Creating a `TraceReader` instance opens a trace file so that its requests can be read.

`TraceReader` accepts three arguments:
- `trace: str | TraceReader`: A trace path or other trace instance. The trace path can be a file path on your local machine (e.g., ~/data/trace.oracleGeneral.zst) or an S3 URI (e.g., s3://cache-datasets/cache_dataset_oracleGeneral/2007_msr/msr_hm_0.oracleGeneral.zst).

Review comment (Contributor, high):

The type hint for the trace parameter seems incorrect. The TraceReader constructor expects a string path to the trace file, not another TraceReader instance. The ability to pass a Reader object is an internal implementation detail used for cloning and not part of the public API. To avoid confusion, it's better to simplify this.

Suggested change
- `trace: str | TraceReader`: A trace path or other trace instance. The trace path can be a file path on your local machine (e.g., ~/data/trace.oracleGeneral.zst) or an S3 URI (e.g., s3://cache-datasets/cache_dataset_oracleGeneral/2007_msr/msr_hm_0.oracleGeneral.zst).
- `trace: str`: A trace path. The trace path can be a file path on your local machine (e.g., ~/data/trace.oracleGeneral.zst) or an S3 URI (e.g., s3://cache-datasets/cache_dataset_oracleGeneral/2007_msr/msr_hm_0.oracleGeneral.zst).

- `trace_type: TraceType` (optional): If not given, it will be infered according to the file name.

Review comment (Contributor, medium):

There's a typo here. 'infered' should be 'inferred'.

Suggested change
- `trace_type: TraceType` (optional): If not given, it will be infered according to the file name.
- `trace_type: TraceType` (optional): If not given, it will be inferred according to the file name.

- `reader_init_params: ReaderInitParam` (optional): If not given, default parameters are used for reader initialization.
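
To make the two kinds of trace location and the file-name hint more concrete, here is a minimal sketch using only the Python standard library. It does not use libcachesim and is not how `TraceReader` infers the trace type; `describe_trace` and `COMPRESSION_SUFFIXES` are hypothetical names introduced only for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical set of compression suffixes; not taken from libcachesim.
COMPRESSION_SUFFIXES = {".zst", ".gz", ".xz"}


def describe_trace(trace: str) -> dict:
    """Split a trace location into its source kind, file name, and format hint."""
    parsed = urlparse(trace)
    source = "s3" if parsed.scheme == "s3" else "local"
    # For S3 URIs the object key is in `path`; local paths are used as given.
    name = PurePosixPath(parsed.path if source == "s3" else trace).name
    suffixes = [s for s in PurePosixPath(name).suffixes if s not in COMPRESSION_SUFFIXES]
    # The last remaining suffix is treated as the format hint, e.g. ".oracleGeneral".
    format_hint = suffixes[-1].lstrip(".") if suffixes else None
    return {"source": source, "name": name, "format_hint": format_hint}


print(describe_trace("~/data/trace.oracleGeneral.zst"))
print(describe_trace(
    "s3://cache-datasets/cache_dataset_oracleGeneral/2007_msr/msr_hm_0.oracleGeneral.zst"
))
```

Both example paths yield the hint `oracleGeneral`. Passing `trace_type` explicitly, as in the example that follows, avoids relying on any file-name convention.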

Here is an example to load one trace via an S3 URI.

```python
import libcachesim as lcs

# Open a trace hosted on S3 (find more via https://github.com/cacheMon/cache_dataset)
URI = "s3://cache-datasets/cache_dataset_oracleGeneral/2007_msr/msr_hm_0.oracleGeneral.zst"
reader = lcs.TraceReader(
    trace=URI,
    trace_type=lcs.TraceType.ORACLE_GENERAL_TRACE,
    reader_init_params=lcs.ReaderInitParam(ignore_obj_size=False),
)
```

Then we can walk through the trace.

```python
for req in reader:
    print(req.obj_id, req.obj_size)
```
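
Beyond printing, the same loop can accumulate simple statistics in one pass. The sketch below assumes only that each request exposes `obj_id` and `obj_size`, as in the loop above. `FakeRequest` and `summarize` are hypothetical stand-ins so the snippet runs without libcachesim or a trace file.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FakeRequest:
    """Stand-in for a trace request; only the two fields used above are modeled."""
    obj_id: int
    obj_size: int


def summarize(requests: Iterable) -> dict:
    """Count requests, distinct objects, and total requested bytes in one pass."""
    n_req = 0
    total_bytes = 0
    seen = set()
    for req in requests:
        n_req += 1
        total_bytes += req.obj_size
        seen.add(req.obj_id)
    return {"requests": n_req, "unique_objects": len(seen), "bytes": total_bytes}


sample = [FakeRequest(1, 100), FakeRequest(2, 50), FakeRequest(1, 100)]
print(summarize(sample))  # {'requests': 3, 'unique_objects': 2, 'bytes': 250}
```

In principle, `summarize(reader)` would work the same way on a real reader, provided the requests carry these two attributes.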

## Reader slicing

`TraceReader` support slicing and index access.

Review comment (Contributor, medium):

There's a small grammatical error here. It should be 'supports' instead of 'support'.

Suggested change
`TraceReader` support slicing and index access.
`TraceReader` supports slicing and index access.


```python
# Read the first 100 reqs
for req in reader[:100]:
    print(req.obj_id, req.obj_size)
```

```python
# Read 100 reqs after the first 100 reqs

Review comment (Contributor, medium):

There is a trailing whitespace at the end of this line which should be removed for better formatting.

Suggested change
# Read 100 reqs after the first 100 reqs
# Read 100 reqs after the first 100 reqs

for req in reader[100:200]:
    print(req.obj_id, req.obj_size)
```

```python
# Read last 100 reqs
for req in reader[-100:]:
    print(req.obj_id, req.obj_size)
```
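
To see which request indices each of these slices selects, the following sketch resolves them with Python's built-in `slice.indices`. It assumes `TraceReader` follows standard Python sequence conventions for negative and open-ended bounds, which this document does not confirm. The trace length of 1000 and the helper `resolve` are made up for illustration.

```python
def resolve(s: slice, n_req: int) -> range:
    """Translate a slice into the concrete request indices it selects."""
    return range(*s.indices(n_req))


n = 1000  # hypothetical number of requests in the trace

first = resolve(slice(None, 100), n)   # corresponds to reader[:100]
middle = resolve(slice(100, 200), n)   # corresponds to reader[100:200]
last = resolve(slice(-100, None), n)   # corresponds to reader[-100:]

print(first, middle, last)  # range(0, 100) range(100, 200) range(900, 1000)
```

Note that resolving a negative bound such as `-100` requires the total number of requests to be known.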