-
Notifications
You must be signed in to change notification settings - Fork 232
Add nats-jetstream package #734
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: add-nats-client-package
Are you sure you want to change the base?
Conversation
cc17acd to
b383f0b
Compare
aricart
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite sure why we are going to look at the schemas and validate them... But LGTM
caspervonb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nits
caspervonb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More nits
| self, | ||
| stream_name: str, | ||
| name: str, | ||
| durable_name: str | None = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pack into kwargs
| durable_name: str | None = None, |
| self, | ||
| stream_name: str, | ||
| name: str, | ||
| durable_name: str | None = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pack into kwargs
| durable_name: str | None = None, |
| Returns: | ||
| A unique inbox subject string in the format "_INBOX.{uuid}" | ||
| """ | ||
| return f"_INBOX.{uuid.uuid4().hex}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happened to the nuid?
wallyqs
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have:
- Clustering based test cases
- Cluster reconnect test cases resuming fetch
?
philpennock
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed through end of the src/nats/jetstream/api/ folder, need to resume from the consumer/ sibling to that when I next look at this again. Sending what I've looked at so far though. (Will be around on Thursday to resume).
| while True: | ||
| await asyncio.sleep(0.5) | ||
| if nc.status != ClientStatus.CONNECTED: | ||
| continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is example code for people to start from, and people do Interesting Things when copy/pasting without full understanding, I think it's worth a small comment:
if nc.status != ClientStatus.CONNECTED:
# NB: this is only safe because of the sleep above; beware busy loops
continueThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah example got stale, need new ones.
| try: | ||
| await js.publish("FOO.TEST1", f"msg {i}".encode()) | ||
| except Exception as err: | ||
| print("pub error: ", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
grumble (unix stdio pedant) : , file=sys.stderr, flush=True) (and import sys above)
| @@ -0,0 +1,24 @@ | |||
| { | |||
| "$schema": "http://json-schema.org/draft-07/schema#", | |||
| "$id": "https://nats.io/schemas/jetstream/api/v1/account_info_response.json", | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh, are we expecting every client library for each language to hold a copy of these? That seems ... sub-optimal. Shouldn't there be one primary authoritative location, designed to be pulled in via git submodules or whichever other repo-linking poison folks prefer?
(I'm assuming no point reviewing the schemas in this PR so skipping past them.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, those schemas live and are maintained in jsm.go, it'd be hard to keep it up to date across many repos.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are pulled from there, just with a little bit of filtering unwanted files. I'd do a submodule if we had nats-io/schema or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess there is a task to fetch them from jsm.go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah tools/fetch_schemas.py
| StreamState, | ||
| SubjectTransform, | ||
| Tier, | ||
| LostStreamData, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sort
| def check_response( | ||
| data: Any, expected_type: type[ResponseT] | ||
| ) -> tuple[bool, set[str] | None, set[str] | None]: | ||
| # TODO check for missing and unknown keys |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For clarity: are you expecting this TODO to be done before this PR moves out of draft, or for some later milestone?
|
|
||
| consumer_name = consumer_config.get("name") | ||
| if not consumer_name: | ||
| raise ValueError("name is required") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be more friendly if all 3 were collected, and then if not (a and b and c) used to return a ValueError listing all the missing fields at once.
(nb: this is an opinion and not a blocking issue, happy to be ignored on this)
| mirror = kwargs.get("mirror") | ||
| subjects = kwargs.get("subjects") | ||
| if mirror is not None and subjects: | ||
| raise ValueError("Cannot specify both mirror and subjects") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I look at logic like this and can't help but think "Isn't this schema enforcement? If we have all these .json schema files, why aren't the client functions (at least for infrequent ops like stream create) just validating the config against the schema file instead of implementing policy checks manually?"
| response_type: type[ResponseT], | ||
| raise_on_error: Literal[True] = True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This, and the same in the next @overload, is missing the timeout parameter; the implementation is assignable to these function signatures so it should work, but it looks like this means that type-checkers should reject attempts to actually provide the timeout parameter. Have I misunderstood? Is this intentional? Or is it just a forgotten parameter for the signature?
| raise_on_error: bool = True, | ||
| timeout: float = 5.0, | ||
| ) -> ResponseT | ErrorResponse: | ||
| request_id = str(uuid.uuid4())[:8] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is reading 16 bytes from /dev/urandom every time, just to return the representation of the first 4 bytes. Aside from making me twitch (which is probably amusing 😉), that's also going to be a device read on every JSON request.
The secrets module was introduced in Python 3.6; secrets.token_hex(4) will be more efficient, but still hit /dev/urandom. But the only goal is a distinct ID, there's no defense against other clients knowing the ID, right? Presumably not using an incrementing integer to avoid threading issues, but should then be using a client-side cache to prevent re-use, so just rolling the dice here (but the die has a lot of sides). I'd be tempted to use an in-process PRNG seeded once and just roll with that.
For now, just use secrets.token_hex(4) please and remember to benchmark to see if this ever matters with lots of small JSON requests.
philpennock
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed the rest, except I skimmed the test files just enough to confirm they were limited to behavior tests, not reflective integrity tests.
| """Protocol for a batch of messages retrieved from a JetStream consumer.""" | ||
|
|
||
| def __aiter__(self) -> AsyncIterator[Message]: | ||
| """Return self as an async iterator.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and in MessageStream below, is this really self?
| raise StopAsyncIteration | ||
| case _: | ||
| self._error = Exception( | ||
| f"Status {raw_msg.status.code}: {raw_msg.status.description or ''}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor nit: can we use or '(missing description)' to make it clearer if this exception is ever seen that there was something expected that's missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking we shouldn't swallow unexpected errors.
| try: | ||
| headers_bytes = base64.b64decode(message["hdrs"]) | ||
| headers_str = headers_bytes.decode("utf-8") | ||
| if headers_str.startswith("NATS/1.0\r\n"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be an exception raised if the headers don't start with this?
| def main(): | ||
| """Main entry point.""" | ||
| # Get the root directory of our project | ||
| root_dir = Path(__file__).parent.parent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fragile to current location in the git repo. If you're going to invoke git anyway for this rare maintenance operation, then git rev-parse --show-toplevel is your friend here.
7acfd1f to
2e13fe5
Compare
1da8646 to
b6b8a1a
Compare
3cd0ce1 to
171ad83
Compare
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
- Add validation to reject both max_messages and max_bytes simultaneously - Set batch size to 1,000,000 when using max_bytes for better throughput - Add heartbeat timer pause/resume on disconnect/reconnect - All three fixes address the non-compliance issues identified in ADR-37
- Remove test that used both max_messages and max_bytes together - Add test validating rejection of both parameters - Fix threshold_bytes test to use only max_bytes per ADR-37
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
Signed-off-by: Casper Beyer <[email protected]>
1274414 to
185c9f9
Compare
a3fc895 to
71eb800
Compare
This adds a new modern (following the newer simplified interface) JetStream package under the name
nats-jetstreamwhich implements the following:Depends on #732