feat: extend data retention #326

Merged: 3 commits, Jun 20, 2024
4 changes: 4 additions & 0 deletions versioned_docs/version-2.0/how_to_guides/monitoring/rules.mdx
@@ -55,6 +55,10 @@ There are four actions you can take with an automation rule:
- **Add to annotation queue**: Add the trace to an annotation queue.
- **Run online evaluation**: Run an online evaluation on the trace. For more information on online evaluations, you can refer to [this guide](./online_evaluations).
- **Trigger webhook**: Trigger a webhook with the trace data. For more information on webhooks, you can refer to [this guide](./webhooks).
- **Extend data retention**: Extends the data retention period on matching traces that use [base retention](../../concepts/usage_and_billing/data_retention_billing).
Note that all other rules will also extend data retention on matching traces through the
[auto-upgrade](../../concepts/usage_and_billing/data_retention_billing#data-retention-auto-upgrades) mechanism,
but this rule takes no additional action.

## View logs for your automations

@@ -121,7 +121,7 @@ with the organization level setting, this will only affect retention (and pricin

![](./static/p1projectretention.png)

### [COMING SOON] Keep around a percentage of traces for extended data retention
### Keep around a percentage of traces for extended data retention

We may not want all our traces to expire after 14 days if we care about historical debugging. As such, we can take advantage
of LangSmith's built-in ability to do server-side sampling for extended data retention.
@@ -130,18 +130,23 @@ Choosing the right percentage of runs to sample depends on your use case. We wil
leave it to the user to find the right value that balances collecting rare events and cost constraints.

LangSmith automatically upgrades the data retention for any trace that matches a run rule in our automations product (see our [run rules docs](../../../how_to_guides/monitoring/rules)). On the
projects page, click `Add Rule`, and configure the rule as follows:
projects page, click `Rules -> Add Rule`, and configure the rule as follows:

![](./static/P1SampleTraces.png)
![](./static/P2SampleTraces.png)

Run rules match on runs rather than traces. Runs are single units of work within an LLM application's API handling. Traces
are end to end API calls (learn more about [tracing concepts in LangSmith](../../concepts/tracing)). This means a trace can
be thought of as a tree of runs making up an API call. When a run rule matches any run within a trace, the trace's full run tree
upgrades to be retained for 400 days.
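
As a rough illustration of the paragraph above (a sketch, not LangSmith's implementation), a trace can be modeled as a tree of runs whose retention is upgraded when a rule matches any run in it. The `Run` class, `iter_runs`, `trace_retention_days`, and the example run names are hypothetical; only the 14-day and 400-day figures come from this guide.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List

BASE_DAYS = 14       # base retention mentioned in this guide
EXTENDED_DAYS = 400  # extended retention mentioned in this guide


@dataclass
class Run:
    """Hypothetical stand-in for a single run (one unit of work)."""
    name: str
    children: List["Run"] = field(default_factory=list)


def iter_runs(root: Run) -> Iterator[Run]:
    """Yield the root run and every descendant run in the trace."""
    yield root
    for child in root.children:
        yield from iter_runs(child)


def trace_retention_days(root: Run, rule_matches: Callable[[Run], bool]) -> int:
    """Return the retention for the whole trace: extended if ANY run matches."""
    if any(rule_matches(run) for run in iter_runs(root)):
        return EXTENDED_DAYS
    return BASE_DAYS


# One trace: a root run with nested child runs.
trace = Run("chain", [Run("retriever"), Run("llm", [Run("parser")])])

# A rule that matches a deeply nested run still upgrades the full tree.
print(trace_retention_days(trace, lambda run: run.name == "parser"))
```

The point of the sketch is that the match is evaluated per run, while the retention upgrade applies per trace.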

Therefore, to make sure we have the proper sampling rate on traces, we can add a filter condition to only match the "root" run in
the run tree. This is distinct per trace, so our 10% sampling will upgrade 10% of traces, rather than 10% of runs, which could correspond to
more than 10% of traces.
Therefore, to make sure we have the proper sampling rate on traces, we take advantage of the
[filtering](../../how_to_guides/monitoring/rules#step-2-define-the-filter) functionality of run rules.

We add a filter condition to only match the "root" run in the run tree. This is distinct per trace, so our 10% sampling
will upgrade 10% of traces, rather than 10% of runs, which could correspond to more than 10% of traces. If desired, we can add
any other filtering conditions required (e.g. specific tags/metadata attached to our traces) for more pointed data retention
extension. For the sake of this tutorial, we will stick with the simplest condition, and leave more advanced filtering as an
exercise to the user.
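
To see why the root-run filter matters, here is a back-of-the-envelope calculation. It assumes, purely for illustration, that a rule without the filter samples each run independently at the configured rate; the function name and that independence assumption are ours, not a description of LangSmith internals.

```python
def fraction_of_traces_upgraded(
    sampling_rate: float, runs_per_trace: int, root_only: bool
) -> float:
    """Expected fraction of traces upgraded to extended retention.

    Assumption (illustrative only): each candidate run is sampled
    independently with probability `sampling_rate`.
    """
    if root_only:
        # Exactly one candidate run (the root) per trace.
        return sampling_rate
    # A trace is upgraded if at least one of its runs is sampled.
    return 1.0 - (1.0 - sampling_rate) ** runs_per_trace


rate = 0.10
for n in (1, 5, 20):
    with_filter = fraction_of_traces_upgraded(rate, n, root_only=True)
    without_filter = fraction_of_traces_upgraded(rate, n, root_only=False)
    print(f"{n:>2} runs/trace: root-only={with_filter:.3f}, all runs={without_filter:.3f}")
```

Under this assumption, a trace with 5 runs would be upgraded roughly 41% of the time without the root filter, versus the intended 10% with it.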

:::note
If you want to keep a subset of traces for longer than 400 days for data collection purposes, you can create another run
Binary file not shown.