versioned_docs/version-2.0/self_hosting/architectural_overview.mdx (+3 -3)
@@ -1,10 +1,10 @@
1
1
---
2
-
sidebar_label: Architectural Overview
2
+
sidebar_label: Architectural overview
3
3
sidebar_position: 1
4
4
table_of_contents: true
5
5
---
6
6
7
-
# Architectural Overview
7
+
# Architectural overview
8
8
9
9
:::important Enterprise License Required
10
10
Self-Hosted LangSmith is an add-on to the Enterprise Plan designed for our largest, most security-conscious customers. See our [pricing page](https://www.langchain.com/pricing) for more detail, and contact us at [email protected] if you want to get a license key to trial LangSmith in your environment.
@@ -36,7 +36,7 @@ In a production setting, we **strongly recommend using external Storage Services
36
36
37
37
### ClickHouse
38
38
39
-
[ClickHouse](https://clickhouse.com/docs/en/intro) is a high-performance, column-oriented SQL database management system (DBMS) for online analytical processing (OLAP)
39
+
[ClickHouse](https://clickhouse.com/docs/en/intro) is a high-performance, column-oriented SQL database management system (DBMS) for online analytical processing (OLAP).
40
40
41
41
LangSmith uses ClickHouse as the primary data store for traces and feedback (high-volume data).
LangSmith uses ClickHouse as the primary data store for traces and feedback. By default, LangSmith Self-Hosted will use an internal ClickHouse database that is bundled with the LangSmith instance.
9
+
ClickHouse is a high-performance, column-oriented database system. It allows for fast ingestion of data and is optimized for analytical queries.
10
10
11
-
However, you can configure LangSmith to use an external ClickHouse database. By configuring an external ClickHouse database, you can manage backups, scaling, and other operational tasks for your database.
12
-
Unfortunately, many cloud providers do not offer managed ClickHouse services at this time. Instead, you can run ClickHouse in a few ways:
11
+
LangSmith uses ClickHouse as the primary data store for traces and feedback. By default, self-hosted LangSmith will use an internal ClickHouse database that is bundled with the LangSmith instance. This is run as a stateful set in the same Kubernetes cluster as the LangSmith application or as a Docker container on the same host as the LangSmith application.
13
12
14
-
- LangSmith Managed ClickHouse Cloud(Reach out to us at [email protected] for more information)
15
-
- Provision an instance in [ClickHouse Cloud](https://clickhouse.cloud/)
16
-
- Provision a ClickHouse Cloud instance via Marketplace
13
+
However, you can configure LangSmith to use an external ClickHouse database for easier management and scaling.
14
+
By configuring an external ClickHouse database, you can manage backups, scaling, and other operational tasks for your database.
15
+
While ClickHouse is not yet a native service in Azure, AWS, or Google Cloud, you can run LangSmith with an external ClickHouse database in the following ways:
Using the first two options (LangSmith-managed ClickHouse or ClickHouse Cloud) will provision a ClickHouse service OUTSIDE of your VPC.
27
+
However, both options support private endpoints, meaning that you can direct traffic to the ClickHouse service without exposing it to the public internet (e.g., via AWS PrivateLink or GCP Private Service Connect).
28
+
29
+
Additionally, you can configure LangSmith so that sensitive information is not stored in ClickHouse. Please reach out to [email protected] for more information.
30
+
:::
31
+
22
32
## Requirements
23
33
24
-
- A provisioned ClickHouse Instance that your LangSmith instance will have network access to.
25
-
- A user with admin access to the ClickHouse database. This user will be used to create the necessary tables, indexes, and views
34
+
- A provisioned ClickHouse instance that your LangSmith application will have network access to (see above for options).
35
+
- A user with admin access to the ClickHouse database. This user will be used to create the necessary tables, indexes, and views.
26
36
- We only support standalone ClickHouse (not clustered or replicated) or ClickHouse Cloud.
27
-
- We only support ClickHouse versions >= 23.9. Use of ClickHouse versions >= 24.2 requires LangSmith v0.6 or later.
37
+
- We only support ClickHouse versions >= 23.9. Use of ClickHouse versions >= 24.2 requires LangSmith v0.6 or later. See the [LangSmith release notes](../release_notes) for more information.
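As a rough illustration of the version rules in the requirements list, the following Python sketch checks a ClickHouse/LangSmith version pair. The function and constant names are hypothetical and not part of LangSmith; only the version thresholds (23.9, 24.2, and v0.6) come from the text.

```python
from typing import Tuple

# Thresholds taken from the requirements list; names are illustrative only.
MIN_CLICKHOUSE = (23, 9)           # oldest supported ClickHouse version
CLICKHOUSE_NEEDS_V06 = (24, 2)     # from this ClickHouse version on...
MIN_LANGSMITH_FOR_24_2 = (0, 6)    # ...LangSmith v0.6 or later is required


def parse_version(text: str) -> Tuple[int, int]:
    """Parse 'major.minor[.patch...]' (optionally prefixed with 'v') into (major, minor)."""
    parts = text.strip().lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def is_supported(clickhouse: str, langsmith: str) -> bool:
    """Return True if the version pair satisfies the documented requirements."""
    ch = parse_version(clickhouse)
    ls = parse_version(langsmith)
    if ch < MIN_CLICKHOUSE:
        return False
    if ch >= CLICKHOUSE_NEEDS_V06 and ls < MIN_LANGSMITH_FOR_24_2:
        return False
    return True
```

A check like this could run before an upgrade, for example `is_supported("24.3.1", "v0.6.2")`. It does not cover the other requirements (standalone deployment, admin user), which still need to be verified separately.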
28
38
29
39
## Parameters
30
40
31
41
You will need to provide several parameters to your LangSmith installation to configure an external ClickHouse database. These parameters include:
32
42
33
-
- Host
34
-
- HTTP Port
35
-
- Native Port
36
-
- Database
37
-
- Username
38
-
- Password
43
+
- Host: The hostname or IP address of the ClickHouse database
44
+
- HTTP Port: The port that the ClickHouse database listens on for HTTP connections
45
+
- Native Port: The port that the ClickHouse database listens on for [native connections](https://clickhouse.com/docs/en/interfaces/tcp)
46
+
- Database: The name of the ClickHouse database that LangSmith should use
47
+
- Username: The username to use to connect to the ClickHouse database
48
+
- Password: The password to use to connect to the ClickHouse database
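To make the parameter list concrete, here is a minimal Python sketch that groups the six values and runs basic sanity checks on them. The `ClickHouseParams` class and its `problems` method are hypothetical helpers for illustration; they are not LangSmith configuration keys, and the actual keys depend on your Helm or Docker configuration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClickHouseParams:
    """Hypothetical container for the six parameters listed above."""
    host: str
    http_port: int
    native_port: int
    database: str
    username: str
    password: str

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means the values look sane."""
        found = []
        for name in ("host", "database", "username", "password"):
            if not getattr(self, name):
                found.append(f"{name} is empty")
        for name in ("http_port", "native_port"):
            port = getattr(self, name)
            if not 1 <= port <= 65535:
                found.append(f"{name} {port} is outside 1-65535")
        if self.http_port == self.native_port:
            found.append("http_port and native_port must differ")
        return found


# Example values only: 8123 and 9000 are ClickHouse's default HTTP and native
# ports; the host, database, and credentials are placeholders.
params = ClickHouseParams(
    host="clickhouse.example.internal",
    http_port=8123,
    native_port=9000,
    database="default",
    username="langsmith_admin",
    password="change-me",
)
print(params.problems())
```

Keeping the HTTP and native ports as separate fields mirrors the list above: the two interfaces listen on different ports, so mixing them up is an easy mistake to catch early.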
LangSmith uses a Postgres database as the primary data store for transactional workloads and operational data (almost everything besides runs). By default, LangSmith Self-Hosted will use an internal Postgres database.
10
10
However, you can configure LangSmith to use an external Postgres database (**strongly recommended in a production setting**). By configuring an external Postgres database, you can more easily manage backups, scaling, and other operational tasks for your database.
LangSmith uses Redis to back our queuing/caching operations. By default, LangSmith Self-Hosted will use an internal Redis instance.
10
10
However, you can configure LangSmith to use an external Redis instance (**strongly recommended in a production setting**). By configuring an external Redis instance, you can more easily manage backups, scaling, and other operational tasks for your Redis instance.
versioned_docs/version-2.0/self_hosting/index.md (+3 -3)
@@ -7,8 +7,8 @@ sidebar_position: 0
7
7
8
8
Step-by-step guides that cover the installation, configuration, and scaling of your Self-Hosted LangSmith instance.
9
9
10
-
- [Architectural Overview](./self_hosting/architectural_overview): A high-level overview of the LangSmith architecture.
11
-
- [Storage Services](./self_hosting/architectural_overview#datastores): The storage services used by LangSmith.
10
+
- [Architectural overview](./self_hosting/architectural_overview): A high-level overview of the LangSmith architecture.
11
+
- [Storage services](./self_hosting/architectural_overview#datastores): The storage services used by LangSmith.
12
12
- [Services](./self_hosting/architectural_overview#services): The services that make up LangSmith.
13
13
- [Installation](./self_hosting/installation): How to install LangSmith on your own infrastructure.
14
14
- [Kubernetes](./self_hosting/installation/kubernetes): Deploy LangSmith on Kubernetes.
@@ -20,7 +20,7 @@ Step-by-step guides that cover the installation, configuration, and scaling of y
20
20
- [Connect to an external Redis instance](./self_hosting/configuration/external_redis): Configure LangSmith to use an external Redis instance.
21
21
- [Usage](./self_hosting/usage): How to use your self-hosted instance of LangSmith.
22
22
- [Upgrades](./self_hosting/upgrades): How to upgrade your self-hosted instance of LangSmith.
23
-
- [Release Notes](./self_hosting/release_notes): The latest release notes for LangSmith.
23
+
- [Release notes](./self_hosting/release_notes): The latest release notes for LangSmith.
24
24
- [Week of June 17, 2024 - LangSmith v0.6](./self_hosting/release_notes#week-of-june-17-2024---langsmith-v05): Release notes for version 0.6 of LangSmith.
25
25
- [Week of May 13, 2024 - LangSmith v0.5](./self_hosting/release_notes#week-of-may-13-2024---langsmith-v05): Release notes for version 0.5 of LangSmith.
26
26
- [Week of March 25, 2024 - LangSmith v0.4](./self_hosting/release_notes#week-of-march-25-2024---langsmith-v04): Release notes for version 0.4 of LangSmith.
versioned_docs/version-2.0/self_hosting/installation/kubernetes.mdx (+4 -4)
@@ -20,7 +20,7 @@ We've successfully tested LangSmith on the following Kubernetes distributions:
20
20
- OpenShift
21
21
- Minikube and Kind (for development purposes)
22
22
23
-
To review all configuration options, look at the values.yaml for the [LangSmith Helm Chart](https://github.com/langchain-ai/helm/blob/main/charts/langsmith/README.md).
23
+
To review all configuration options, look at the values.yaml for the [LangSmith helm chart](https://github.com/langchain-ai/helm/blob/main/charts/langsmith/README.md).
24
24
25
25
## Prerequisites
26
26
@@ -65,9 +65,9 @@ Ensure you have the following tools/items ready. Some items are marked optional:
65
65
## Configure your Helm Charts:
66
66
67
67
1. Create a new file called `langsmith_config.yaml`. This should have a similar structure to the `values.yaml` file in the LangSmith Helm Chart repository. Only include the values you want to override to avoid having to update the file every time the chart is updated.
68
-
2. Set the appropriate values in the `langsmith_config.yaml` file. You can find the available configuration options in the [Configuration](/self_hosting/configuration) section.
68
+
2. Set the appropriate values in the `langsmith_config.yaml` file. You can find the available configuration options in the [configuration](/self_hosting/configuration) section.
69
69
70
-
You can also see some example configurations in the examples directory of the Helm Chart repository here: [LangSmith Helm Chart Examples](https://github.com/langchain-ai/helm/tree/main/charts/langsmith/examples).
70
+
You can also see some example configurations in the examples directory of the Helm Chart repository here: [LangSmith helm chart examples](https://github.com/langchain-ai/helm/tree/main/charts/langsmith/examples).
71
71
72
72
## Deploying to Kubernetes:
73
73
@@ -144,4 +144,4 @@ You can also see some example configurations in the examples directory of the He
144
144
145
145
## Using LangSmith
146
146
147
-
Now that LangSmith is running, you can start using it to trace your code. You can find more information on how to use self-hosted LangSmith in the [Self-Hosted Usage Guide](/self_hosting/usage).
147
+
Now that LangSmith is running, you can start using it to trace your code. You can find more information on how to use self-hosted LangSmith in the [self-hosted usage guide](/self_hosting/usage).
This feature is currently in beta. Please reach out to our team at [email protected] if you are interested in leveraging this option.
10
+
:::
11
+
12
+
:::tip recommended reading
13
+
14
+
Please read the [LangSmith architectural overview](./architectural_overview) and [guide on connecting to external ClickHouse](./configuration/external_clickhouse) before proceeding with this guide.
15
+
16
+
:::
17
+
18
+
As mentioned in previous guides, LangSmith uses ClickHouse as the primary storage engine for traces and feedback.
19
+
For easier management and scaling, we recommend connecting a self-hosted LangSmith instance to an external ClickHouse instance. LangSmith-managed ClickHouse is an option that allows you to use a fully managed ClickHouse instance that is monitored and maintained by the LangSmith team.
20
+
21
+
## Architecture Overview
22
+
23
+
Using LangSmith-managed ClickHouse with your self-hosted LangSmith instance is fairly simple. The overall architecture is similar to using a fully self-hosted ClickHouse instance, with a few key differences:
24
+
25
+
- You will need to set up a private network connection between your LangSmith instance and the LangSmith-managed ClickHouse instance. This is to ensure that your data is secure and that you can connect to the ClickHouse instance from your self-hosted LangSmith instance.
26
+
- With this option, the sensitive information in your traces (inputs and outputs) is stored in cloud object storage (S3 or GCS) within your cloud instead of in ClickHouse, so that it doesn't leave your VPC.
27
+
28
+
:::note More on sensitive information
29
+
30
+
This [reference doc](../reference/data_formats/run_data_format) explains the format we use to store runs (spans), which are the building blocks of traces.
31
+
32
+
We define sensitive information, as it relates to application data, as the `inputs` and `outputs` of a run, since these fields can contain prompts and completions from LLMs.
33
+
34
+
With LangSmith-managed ClickHouse, we store the `inputs` and `outputs` in cloud object storage (S3 or GCS) within your cloud and store the rest of the run data in ClickHouse. This ensures that sensitive information doesn't leave your VPC.
35
+
36
+
:::
37
+
38
+
- The LangSmith team will monitor your ClickHouse instance and ensure that it is running smoothly. This allows us to track metrics like run-ingestion delay and query performance.
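The split described above (`inputs` and `outputs` in object storage, everything else in ClickHouse) can be sketched in Python as follows. This is only an illustration of the idea, not LangSmith's implementation: `split_run` is a hypothetical helper, and the run fields other than `inputs` and `outputs` are made-up examples.

```python
from typing import Any, Dict, Tuple

# Per the note above, these two run fields are treated as sensitive.
SENSITIVE_FIELDS = ("inputs", "outputs")


def split_run(run: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split a run into (sensitive, rest).

    'sensitive' would go to object storage in your cloud (S3 or GCS);
    'rest' would go to the managed ClickHouse instance.
    """
    sensitive = {k: v for k, v in run.items() if k in SENSITIVE_FIELDS}
    rest = {k: v for k, v in run.items() if k not in SENSITIVE_FIELDS}
    return sensitive, rest


# Illustrative run; the 'id' and 'name' values are placeholders.
run = {
    "id": "r1",
    "name": "chain",
    "inputs": {"question": "hi"},
    "outputs": {"answer": "hello"},
}
sensitive, rest = split_run(run)
```

Here `sensitive` holds only the prompt and completion payloads, while `rest` keeps the metadata that monitoring and queries rely on.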
- **You must be on AWS or GCP.** We do not support Azure at this time as we require S3 or GCS for blob storage.
47
+
- You must have a VPC that can connect to the LangSmith-managed ClickHouse service. You will need to work with our team to set up the necessary networking.
48
+
- You must have a LangSmith self-hosted instance running. You can use our managed ClickHouse service with both [Kubernetes](./installation/kubernetes) and [Docker](./installation/docker) installations.