-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Basic langsmith-nodejs Rust bindings #1293
base: main
Are you sure you want to change the base?
Conversation
12f8b84
to
dab5a29
Compare
f125b51
to
ee51176
Compare
ee51176
to
ba76254
Compare
@@ -103,6 +103,7 @@ | |||
"dependencies": { | |||
"@types/uuid": "^10.0.0", | |||
"commander": "^10.0.1", | |||
"langsmith-nodejs": "file:../rust/crates/langsmith-nodejs", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before merging, we'll need to replace this path dependency with a dependency on a real, published JS library.
js/src/client.ts
Outdated
const response = await this.caller.call( | ||
_getFetchImplementation(), | ||
`${this.apiUrl}/runs`, | ||
{ | ||
method: "POST", | ||
headers, | ||
body: stringifyForTracing(mergedRunCreateParam), | ||
signal: AbortSignal.timeout(this.timeout_ms), | ||
...this.fetchOptions, | ||
} | ||
); | ||
await raiseForStatus(response, "create run", true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the old code as-is, just moved into the else
branch.
@@ -3769,7 +3844,7 @@ export class Client implements LangSmithTracingClientInterface { | |||
if (options?.isPublic && !settings.tenant_handle) { | |||
throw new Error( | |||
`Cannot create a public prompt without first\n | |||
creating a LangChain Hub handle. | |||
creating a LangChain Hub handle. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about this noise — my editor is stripping trailing whitespace on save by default. If this is a concern, I can undo it.
* Clone a public dataset to your own langsmith tenant. | ||
* This operation is idempotent. If you already have a dataset with the given name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about this noise — my editor is stripping trailing whitespace on save by default. If this is a concern, I can undo it.
@@ -2,7 +2,7 @@ use serde::{Deserialize, Serialize}; | |||
use serde_json::Value; | |||
|
|||
// Map attachment ref to tuple of filename, optional bytes | |||
#[derive(Debug)] | |||
#[derive(Debug, Serialize, Deserialize)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made all of these Serialize, Deserialize
to make it easier to create them from JS.
This is a temporary hack to get something working quickly. We can figure out a more appropriate long-term solution once we have an end-to-end working implementation.
…h-sdk into pg/nodejs-bindings
@@ -1,5 +1,7 @@ | |||
import * as uuid from "uuid"; | |||
|
|||
import * as bindings from "langsmith-nodejs"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any compatibility issues here?
Will this install in environments that don't have Rust?
We can do some refactoring around entrypoints to make sure this works if so
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, Rust shouldn't be required and users shouldn't notice we use Rust.
The usual approach (e.g. used by Sentry and esbuild) is to publish a cluster of packages like langsmith-nodejs-darwin-arm64, langsmith-nodejs-linux-x64-gnu
etc. (see the npm
directory in this PR) that wrap pre-compiled binaries for that specific platform: macOS on ARM, x64 linux with gnu libc, etc. We should look to make such packages available for all platforms on which users might use langsmith. Adding a new one is relatively straightforward.
When users install langsmith
, it pulls in the dependency on langsmith-nodejs
, which in turn has an optionalDependencies
section with all the per-platform packages. Their package manager then only installs the "optional dependency" that matches their platform.
If people don't install optional dependencies, this obviously wouldn't work. There are a variety of options to give users actionable advice or fix the problem here:
- Using a
postinstall
script to determine the platform and download the required binary. Given that LangChain is used in enterprise environments, this is probably not a good idea — it would likely set off all kinds of security alerts. - Instead of downloading the binary, telling the user to either use optional dependencies or manually add a dependency on the correct platform-specific package based on the platform we've determined in
postinstall
. - Checking at
langsmith
initialization time that the platform-specific dependency is satisfied, and then notifying the user in the same way as above.
I think we probably want to stay away from 1., and probably do 2. with optionally adding 3.
P.S.: The package names are placeholders. langsmith-nodejs
makes sense from the perspective of a Rust crate, but is probably not the right name for an npm package. Once we have something working end-to-end, we can and probably should rename.
@@ -521,6 +526,27 @@ export class Client implements LangSmithTracingClientInterface { | |||
config.blockOnRootRunFinalization ?? this.blockOnRootRunFinalization; | |||
this.batchSizeBytesLimit = config.batchSizeBytesLimit; | |||
this.fetchOptions = config.fetchOptions || {}; | |||
|
|||
// TODO: Do we care about syncing up the env var names between the JS and Python bindings? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we want to do LANGSMITH_USE
if (this._rustClient) { | ||
const mergedRunCreateParam = mergeRuntimeEnvIntoRunCreate(runCreate); | ||
|
||
// We need to massage the data shape into what Rust expects and will accept, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any way we can remove this shim in the JS client?
Can we add something to the Rust binding that converts JS data shapes to their Rust counterparts in a more general way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I plan to move most of this into Rust. This was just an MVP just to show the approach working end-to-end.
Some shim parts might need to continue to exist here though, for the sake of performance. It's relatively expensive to jump back and forth between JS and native code many times, so it's quite likely that blocks like this one are faster in JS than in Rust:
const io = {
inputs: mergedRunCreateParam.inputs
? stringifyForTracing(mergedRunCreateParam.inputs)
: null,
outputs: mergedRunCreateParam.outputs
? stringifyForTracing(mergedRunCreateParam.outputs)
: null,
};
This way, Rust sees inputs
and outputs
as just binary buffers instead of having to move through and dereference lots of JS objects. The serialization is using a highly optimized code path within the JS engine, instead of whatever Rust can access "from the outside."
It's probably an 80-20 kind of thing: 80% can easily move into Rust, so it should. Moving the remaining 20% to Rust would be more trouble than it's worth, and may require extra performance engineering work we could have just avoided by keeping it in JS.
console.error(e); | ||
} | ||
return; | ||
} else if (this.autoBatchTracing) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assumption is that anyone using the new bindings will have this enabled right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right. I can document that explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @obi1kenobi!
Main question on is if we can generalize serialization from JS to Rust and move it into the binding to avoid overhead on other methods
Rust-based client bindings for submitting data to LangSmith.
Current state: MVP, aiming to get minimal end-to-end validation of the approach. Only implements
createRun
at the moment.updateRun
does not have a corresponding Rust implementation.Don't mind the failing tests at the moment — it's because there aren't any test files, so the test runner is exiting non-zero. If we have a good way to make an integration test for this, we can easily add it to the test suite. In one of the earlier commits, I tested the bindings with a simple
fibonacci()
function to make sure the Node.js -> Rust calling works, and all the tests passed.Testing this code
To build the native code bindings:
This should produce a file with a name like
langsmith-nodejs.linux-x64-gnu.node
(on libc x64 Linux), orlangsmith-nodejs.darwin-arm64.node
(on macOS arm64). That file is the native Rust code, compiled for your platform.Then, go to
js/
and runyarn install
to install the native library (as a path dependency).To have the JS code use the Rust bindings to submit data, run it with the
LANGCHAIN_USE_RUST_CLIENT
environment variable set to any value.