Commit ab0d454

docs: langsmith-managed-clickhouse (#321)
2 parents fa40390 + 6d4edcb commit ab0d454

10 files changed

+88
-30
lines changed

versioned_docs/version-2.0/self_hosting/architectural_overview.mdx

+3-3
Original file line number | Diff line number | Diff line change
@@ -1,10 +1,10 @@
11
---
2-
sidebar_label: Architectural Overview
2+
sidebar_label: Architectural overview
33
sidebar_position: 1
44
table_of_contents: true
55
---
66

7-
# Architectural Overview
7+
# Architectural overview
88

99
:::important Enterprise License Required
1010
Self-Hosted LangSmith is an add-on to the Enterprise Plan designed for our largest, most security-conscious customers. See our [pricing page](https://www.langchain.com/pricing) for more detail, and contact us at [email protected] if you want to get a license key to trial LangSmith in your environment.
@@ -36,7 +36,7 @@ In a production setting, we **strongly recommend using external Storage Services
3636

3737
### ClickHouse
3838

39-
[ClickHouse](https://clickhouse.com/docs/en/intro) is a high-performance, column-oriented SQL database management system (DBMS) for online analytical processing (OLAP)
39+
[ClickHouse](https://clickhouse.com/docs/en/intro) is a high-performance, column-oriented SQL database management system (DBMS) for online analytical processing (OLAP).
4040

4141
LangSmith uses ClickHouse as the primary data store for traces and feedback (high-volume data).
4242

versioned_docs/version-2.0/self_hosting/configuration/external_clickhouse.mdx

+26-16
@@ -4,38 +4,48 @@ import {
44
HelmBlock,
55
} from "../../../../src/components/InstructionsWithCode";
66

7-
# Connect to an External ClickHouse Database
7+
# Connect to an external ClickHouse database
88

9-
LangSmith uses ClickHouse as the primary data store for traces and feedback. By default, LangSmith Self-Hosted will use an internal ClickHouse database that is bundled with the LangSmith instance.
9+
ClickHouse is a high-performance, column-oriented database system. It allows for fast ingestion of data and is optimized for analytical queries.
1010

11-
However, you can configure LangSmith to use an external ClickHouse database. By configuring an external ClickHouse database, you can manage backups, scaling, and other operational tasks for your database.
12-
Unfortunately, many cloud providers do not offer managed ClickHouse services at this time. Instead, you can run ClickHouse in a few ways:
11+
LangSmith uses ClickHouse as the primary data store for traces and feedback. By default, self-hosted LangSmith will use an internal ClickHouse database that is bundled with the LangSmith instance. This is run as a stateful set in the same Kubernetes cluster as the LangSmith application or as a Docker container on the same host as the LangSmith application.
1312

14-
- LangSmith Managed ClickHouse Cloud(Reach out to us at [email protected] for more information)
15-
- Provision an instance in [ClickHouse Cloud](https://clickhouse.cloud/)
16-
- Provision a ClickHouse Cloud instance via Marketplace
13+
However, you can configure LangSmith to use an external ClickHouse database for easier management and scaling.
14+
By configuring an external ClickHouse database, you can manage backups, scaling, and other operational tasks for your database.
15+
While ClickHouse is not yet a native service in Azure, AWS, or Google Cloud, you can run LangSmith with an external ClickHouse database in the following ways:
16+
17+
- [LangSmith-managed ClickHouse (beta)](/self_hosting/langsmith_managed_clickhouse)
18+
- Provision a [ClickHouse Cloud](https://clickhouse.cloud/) instance, either directly or through a cloud provider marketplace:
1719
- [Azure Marketplace](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/clickhouse.clickhouse_cloud?tab=Overview)
1820
- [Google Cloud Marketplace](https://console.cloud.google.com/marketplace/product/clickhouse-public/clickhouse-cloud)
1921
- [AWS Marketplace](https://aws.amazon.com/marketplace/pp/prodview-jettukeanwrfc)
2022
- On a VM in your cloud provider
2123

24+
:::note
25+
26+
Using either of the first two options (LangSmith-managed ClickHouse or ClickHouse Cloud) will provision a ClickHouse service OUTSIDE of your VPC.
27+
However, both options support private endpoints, meaning that you can direct traffic to the ClickHouse service without exposing it to the public internet (e.g., via AWS PrivateLink or GCP Private Service Connect).
28+
29+
Additionally, LangSmith can be configured so that sensitive information is not stored in ClickHouse. Please reach out to [email protected] for more information.
30+
:::
31+
2232
## Requirements
2333

24-
- A provisioned ClickHouse Instance that your LangSmith instance will have network access to.
25-
- A user with admin access to the ClickHouse database. This user will be used to create the necessary tables, indexes, and views
34+
- A provisioned ClickHouse instance that your LangSmith application will have network access to (see above for options).
35+
- A user with admin access to the ClickHouse database. This user will be used to create the necessary tables, indexes, and views.
2636
- We only support standalone ClickHouse (not clustered or replicated) or ClickHouse Cloud.
27-
- We only support ClickHouse versions >= 23.9. Use of ClickHouse versions >= 24.2 requires LangSmith v0.6 or later.
37+
- We only support ClickHouse versions >= 23.9. Use of ClickHouse versions >= 24.2 requires LangSmith v0.6 or later. See the [LangSmith release notes](../release_notes) for more information.
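
The version rules in the last requirement can be expressed as a small check. The following is a minimal sketch, not part of LangSmith: the function names `parse_version` and `check_compatibility` are hypothetical, and it assumes version strings of the form `major.minor[...]` with an optional leading `v`.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int]:
    """Return (major, minor) from a string such as '24.2.1.2248' or 'v0.6'."""
    parts = text.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def check_compatibility(clickhouse_version: str, langsmith_version: str) -> str:
    """Apply the two version rules stated in the requirements list."""
    ch = parse_version(clickhouse_version)
    ls = parse_version(langsmith_version)
    # Rule 1: only ClickHouse >= 23.9 is supported.
    if ch < (23, 9):
        return "unsupported: ClickHouse must be >= 23.9"
    # Rule 2: ClickHouse >= 24.2 requires LangSmith v0.6 or later.
    if ch >= (24, 2) and ls < (0, 6):
        return "unsupported: ClickHouse >= 24.2 requires LangSmith v0.6 or later"
    return "supported"
```

Comparing `(major, minor)` tuples avoids the pitfall of comparing version strings lexically, where `"23.10"` would sort before `"23.9"`.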
2838

2939
## Parameters
3040

3141
You will need to provide several parameters to your LangSmith installation to configure an external ClickHouse database. These parameters include:
3242

33-
- Host
34-
- HTTP Port
35-
- Native Port
36-
- Database
37-
- Username
38-
- Password
43+
- Host: The hostname or IP address of the ClickHouse database
44+
- HTTP Port: The port that the ClickHouse database listens on for HTTP connections
45+
- Native Port: The port that the ClickHouse database listens on for [native connections](https://clickhouse.com/docs/en/interfaces/tcp)
46+
- Database: The name of the ClickHouse database that LangSmith should use
47+
- Username: The username to use to connect to the ClickHouse database
48+
- Password: The password to use to connect to the ClickHouse database
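
Before passing these parameters to a LangSmith installation, it can help to confirm that the HTTP port is reachable. The sketch below is illustrative only: `ClickHouseParams`, `ping_url`, and `is_reachable` are hypothetical names, and the `secure` flag is an assumption for instances served over TLS. It relies on the `/ping` endpoint of the ClickHouse HTTP interface, which answers `Ok.` without requiring credentials.

```python
from dataclasses import dataclass
from urllib.request import urlopen


@dataclass(frozen=True)
class ClickHouseParams:
    """The six parameters listed above (hypothetical container type)."""
    host: str
    http_port: int
    native_port: int
    database: str
    username: str
    password: str


def ping_url(params: ClickHouseParams, secure: bool = False) -> str:
    """Build the URL of the ClickHouse HTTP health-check endpoint."""
    scheme = "https" if secure else "http"
    return f"{scheme}://{params.host}:{params.http_port}/ping"


def is_reachable(params: ClickHouseParams, secure: bool = False,
                 timeout: float = 5.0) -> bool:
    """Return True if the HTTP interface answers the ping with 'Ok.'."""
    try:
        with urlopen(ping_url(params, secure), timeout=timeout) as response:
            return response.read().decode().strip() == "Ok."
    except OSError:
        # Connection failures and timeouts are treated as "not reachable".
        return False
```

For example, `ClickHouseParams("clickhouse.example.internal", 8123, 9000, "default", "default", "secret")` uses a placeholder hostname together with the stock ClickHouse ports and default database and user; substitute the values for your own instance. This check covers only the HTTP port, not the native port.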
3949

4050
## Configuration
4151

versioned_docs/version-2.0/self_hosting/configuration/external_postgres.mdx

+1-1
@@ -4,7 +4,7 @@ import {
44
HelmBlock,
55
} from "../../../../src/components/InstructionsWithCode";
66

7-
# Connect to an External Postgres Database
7+
# Connect to an external Postgres database
88

99
LangSmith uses a Postgres database as the primary data store for transactional workloads and operational data (almost everything besides runs). By default, LangSmith Self-Hosted will use an internal Postgres database.
1010
However, you can configure LangSmith to use an external Postgres database (**strongly recommended in a production setting**). By configuring an external Postgres database, you can more easily manage backups, scaling, and other operational tasks for your database.

versioned_docs/version-2.0/self_hosting/configuration/external_redis.mdx

+1-1
@@ -4,7 +4,7 @@ import {
44
HelmBlock,
55
} from "../../../../src/components/InstructionsWithCode";
66

7-
# Connect to an External Redis Database
7+
# Connect to an external Redis database
88

99
LangSmith uses Redis to back our queuing/caching operations. By default, LangSmith Self-Hosted will use an internal Redis instance.
1010
However, you can configure LangSmith to use an external Redis instance (**strongly recommended in a production setting**). By configuring an external Redis instance, you can more easily manage backups, scaling, and other operational tasks for your Redis instance.

versioned_docs/version-2.0/self_hosting/faq.mdx

+1-1
@@ -1,5 +1,5 @@
11
---
2-
sidebar_label: Frequently Asked Questions
2+
sidebar_label: Frequently asked questions
33
sidebar_position: 8
44
description: "Frequently Asked Questions"
55
---

versioned_docs/version-2.0/self_hosting/index.md

+3-3
@@ -7,8 +7,8 @@ sidebar_position: 0
77

88
Step-by-step guides that cover the installation, configuration, and scaling of your Self-Hosted LangSmith instance.
99

10-
- [Architectural Overview](./self_hosting/architectural_overview): A high-level overview of the LangSmith architecture.
11-
- [Storage Services](./self_hosting/architectural_overview#datastores): The storage services used by LangSmith.
10+
- [Architectural overview](./self_hosting/architectural_overview): A high-level overview of the LangSmith architecture.
11+
- [Storage services](./self_hosting/architectural_overview#datastores): The storage services used by LangSmith.
1212
- [Services](./self_hosting/architectural_overview#services): The services that make up LangSmith.
1313
- [Installation](./self_hosting/installation): How to install LangSmith on your own infrastructure.
1414
- [Kubernetes](./self_hosting/installation/kubernetes): Deploy LangSmith on Kubernetes.
@@ -20,7 +20,7 @@ Step-by-step guides that cover the installation, configuration, and scaling of y
2020
- [Connect to an external Redis instance](./self_hosting/configuration/external_redis): Configure LangSmith to use an external Redis instance.
2121
- [Usage](./self_hosting/usage): How to use your self-hosted instance of LangSmith.
2222
- [Upgrades](./self_hosting/upgrades): How to upgrade your self-hosted instance of LangSmith.
23-
- [Release Notes](./self_hosting/release_notes): The latest release notes for LangSmith.
23+
- [Release notes](./self_hosting/release_notes): The latest release notes for LangSmith.
2424
- [Week of June 17, 2024 - LangSmith v0.6](./self_hosting/release_notes#week-of-june-17-2024---langsmith-v05): Release notes for version 0.6 of LangSmith.
2525
- [Week of May 13, 2024 - LangSmith v0.5](./self_hosting/release_notes#week-of-may-13-2024---langsmith-v05): Release notes for version 0.5 of LangSmith.
2626
- [Week of March 25, 2024 - LangSmith v0.4](./self_hosting/release_notes#week-of-march-25-2024---langsmith-v04): Release notes for version 0.4 of LangSmith.

versioned_docs/version-2.0/self_hosting/installation/kubernetes.mdx

+4-4
@@ -20,7 +20,7 @@ We've successfully tested LangSmith on the following Kubernetes distributions:
2020
- OpenShift
2121
- Minikube and Kind (for development purposes)
2222

23-
To review all configuration options, look at the values.yaml for the [LangSmith Helm Chart](https://github.com/langchain-ai/helm/blob/main/charts/langsmith/README.md).
23+
To review all configuration options, look at the values.yaml for the [LangSmith helm chart](https://github.com/langchain-ai/helm/blob/main/charts/langsmith/README.md).
2424

2525
## Prerequisites
2626

@@ -65,9 +65,9 @@ Ensure you have the following tools/items ready. Some items are marked optional:
6565
## Configure your Helm Charts:
6666

6767
1. Create a new file called `langsmith_config.yaml`. This should have a similar structure to the `values.yaml` file in the LangSmith Helm Chart repository. Only include the values you want to override to avoid having to update the file every time the chart is updated.
68-
2. Set the appropriate values in the `langsmith_config.yaml` file. You can find the available configuration options in the [Configuration](/self_hosting/configuration) section.
68+
2. Set the appropriate values in the `langsmith_config.yaml` file. You can find the available configuration options in the [configuration](/self_hosting/configuration) section.
6969

70-
You can also see some example configurations in the examples directory of the Helm Chart repository here: [LangSmith Helm Chart Examples](https://github.com/langchain-ai/helm/tree/main/charts/langsmith/examples).
70+
You can also see some example configurations in the examples directory of the Helm Chart repository here: [LangSmith helm chart examples](https://github.com/langchain-ai/helm/tree/main/charts/langsmith/examples).
7171

7272
## Deploying to Kubernetes:
7373

@@ -144,4 +144,4 @@ You can also see some example configurations in the examples directory of the He
144144

145145
## Using LangSmith
146146

147-
Now that LangSmith is running, you can start using it to trace your code. You can find more information on how to use self-hosted LangSmith in the [Self-Hosted Usage Guide](/self_hosting/usage).
147+
Now that LangSmith is running, you can start using it to trace your code. You can find more information on how to use self-hosted LangSmith in the [self-hosted usage guide](/self_hosting/usage).
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,48 @@
1+
---
2+
sidebar_label: LangSmith-managed ClickHouse (Beta)
3+
sidebar_position: 9
4+
---
5+
6+
# LangSmith-managed ClickHouse (Beta)
7+
8+
:::note beta
9+
This feature is currently in beta. Please reach out to our team at [email protected] if you are interested in leveraging this option.
10+
:::
11+
12+
:::tip recommended reading
13+
14+
Please read the [LangSmith architectural overview](./architectural_overview) and the [guide on connecting to an external ClickHouse database](./configuration/external_clickhouse) before proceeding with this guide.
15+
16+
:::
17+
18+
As mentioned in previous guides, LangSmith uses ClickHouse as the primary storage engine for traces and feedback.
19+
For easier management and scaling, we recommend connecting a self-hosted LangSmith instance to an external ClickHouse instance. LangSmith-managed ClickHouse is an option that allows you to use a fully managed ClickHouse instance that is monitored and maintained by the LangSmith team.
20+
21+
## Architecture Overview
22+
23+
Using LangSmith-managed ClickHouse with your self-hosted LangSmith instance is fairly simple. The overall architecture is similar to using a fully self-hosted ClickHouse instance, with a few key differences:
24+
25+
- You will need to set up a private network connection between your LangSmith instance and the LangSmith-managed ClickHouse instance. This is to ensure that your data is secure and that you can connect to the ClickHouse instance from your self-hosted LangSmith instance.
26+
- With this option, sensitive information (the inputs and outputs of your traces) will be stored in cloud object storage (S3 or GCS) within your cloud instead of in ClickHouse, to ensure that sensitive information doesn't leave your VPC.
27+
28+
:::note More on sensitive information
29+
30+
This [reference doc](../reference/data_formats/run_data_format) explains the format we use to store runs (spans), which are the building blocks of traces.
31+
32+
We define sensitive information, as it relates to application data, as the `inputs` and `outputs` of a run, since these fields can contain prompts and completions from LLMs.
33+
34+
With LangSmith-managed ClickHouse, we store the `inputs` and `outputs` in cloud object storage (S3 or GCS) within your cloud and store the rest of the run data in ClickHouse. This ensures that sensitive information doesn't leave your VPC.
35+
36+
:::
37+
38+
- The LangSmith team will monitor your ClickHouse instance and ensure that it is running smoothly. This allows us to track metrics like run-ingestion delay and query performance.
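
The storage split described in the note above can be pictured with a short sketch. This is an illustration under stated assumptions, not LangSmith's implementation: `split_run` and `SENSITIVE_FIELDS` are hypothetical names, and a run is modeled as a plain dictionary.

```python
from typing import Any, Dict, Tuple

# Fields the note above classifies as sensitive.
SENSITIVE_FIELDS = ("inputs", "outputs")


def split_run(run: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate a run into (sensitive part, remaining part).

    The sensitive part would go to object storage (S3 or GCS) in your
    cloud; the remainder would go to the managed ClickHouse instance.
    """
    sensitive = {key: run[key] for key in SENSITIVE_FIELDS if key in run}
    remainder = {key: value for key, value in run.items()
                 if key not in SENSITIVE_FIELDS}
    return sensitive, remainder
```

Given a run such as `{"id": "r1", "name": "chat", "inputs": {...}, "outputs": {...}}`, only `id` and `name` end up in the part destined for ClickHouse.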
39+
40+
The overall architecture looks like this:
41+
42+
![LangSmith Managed ClickHouse Architecture](./static/langsmith_managed_clickhouse_architecture.png)
43+
44+
## Requirements
45+
46+
- **You must be on AWS or GCP.** We do not support Azure at this time as we require S3 or GCS for blob storage.
47+
- You must have a VPC that can connect to the LangSmith-managed ClickHouse service. You will need to work with our team to set up the necessary networking.
48+
- You must have a LangSmith self-hosted instance running. You can use our managed ClickHouse service with both [Kubernetes](./installation/kubernetes) and [Docker](./installation/docker) installations.

versioned_docs/version-2.0/self_hosting/release_notes.mdx

+1-1
@@ -1,5 +1,5 @@
11
---
2-
sidebar_label: Release Notes (Self-Hosted)
2+
sidebar_label: Release notes (self-hosted)
33
sidebar_position: 7
44
---
55
