Skip to content
This repository has been archived by the owner on May 21, 2020. It is now read-only.

Upgrade to KCL 2.x #13

Closed
julianhowarth opened this issue Dec 7, 2018 · 5 comments
Closed

Upgrade to KCL 2.x #13

julianhowarth opened this issue Dec 7, 2018 · 5 comments

Comments

@julianhowarth
Copy link
Contributor

Are there any plans to upgrade the version of the KCL used? 2.x adds some additional features that we'd be interested in using.

We'll probably try to build a version that works anyway, but don't want to repeat work that's already being done.

@aserrallerios
Copy link
Owner

I have not worked on it yet. It'd be great if you can open a PR!

@julianhowarth
Copy link
Contributor Author

It looks like the api has changed a fair amount: https://docs.aws.amazon.com/streams/latest/dev/kcl-migration.html, so I don't think it will be possible to support both versions. Is that likely to be a problem?

@aserrallerios
Copy link
Owner

I don't think so, I'd rather just support the latest SDK version.

@julianhowarth
Copy link
Contributor Author

The problems described in #9 are exacerbated when using the 2.0 client lib. There are now separate callbacks for shard ended and shutdown requested and the issue is still that we can't checkpoint until we know we have completed the records we are processing.

There are a couple of options I can think of:

  • use the terminateStreamGracePeriod and sleep, then checkpoint which assumes that the messages have all made it through the flow within the grace period
  • alter the primary flow so that we can carry a checkpointing control signal which is executed after the last of the messages (I haven't investigated how best to do this yet, but it is similar to solutions I've used in the past)

Clearly the first option is the simpler, the second should be the more robust.

@aserrallerios
Copy link
Owner

I'd say that we should keep the same behavior in the scope of this change (2.X upgrade).

I agree that it'd be nice to implement a more robust termination of streams. We can add more mutable shared state to keep track of pending to commit records and terminate the stream if:

  • There are no records to commit and we got the stream termination signal

or:

  • We're commiting the last uncommited record
  • We've received the stream termination signal

Anyway as I said, I'd do it later, as this change needs to be carefully implemented to avoid race conditions & potential problems.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants