frankmcsherry
Member

This PR generalizes the arrange method so that the user can interpose between batch formation and acceptance of the batch into the output trace. In particular, the input and output trace formats may differ, which lets the user perform a non-standard translation, for example playing a state machine forward and recording the transitions in the output.

The vanilla arrange_core operator is now implemented using "logic" that just passes through the input batch. It looks like so:

        self.arrange_general::<P, Tr, Tr, _, _>(pact, name, |_capability, _trace_agent| {
            |batch, _capability| (batch, Vec::new())
        })
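To make the shape of this concrete without pulling in the real types, here is a toy model in plain Rust. Everything in it is hypothetical: `arrange_general_toy`, `State`, `Event`, and `step` are illustration-only names, a "batch" is just a `Vec`, the "trace" is the list of accepted batches, and the capability and trace-agent arguments (as well as the second element of the returned tuple) are left out. It is only meant to show the constructor-returns-logic pattern, once as a pass-through and once as a state machine that records its transitions.

```rust
// Toy model only: none of these types are differential dataflow's.

/// Hypothetical stand-in for the generalized operator: it walks the input
/// batches, hands each one to the user logic, and accepts whatever the logic
/// returns into the output "trace" (here, a Vec of output batches).
fn arrange_general_toy<I, O, C, L>(input_batches: Vec<Vec<I>>, constructor: C) -> Vec<Vec<O>>
where
    C: FnOnce() -> L,
    L: FnMut(Vec<I>) -> Vec<O>,
{
    // The constructor runs once and may set up state owned by the logic.
    let mut logic = constructor();
    let mut output_trace = Vec::new();
    for batch in input_batches {
        output_trace.push(logic(batch));
    }
    output_trace
}

/// Pass-through logic: input and output formats coincide.
fn passthrough(batches: Vec<Vec<u32>>) -> Vec<Vec<u32>> {
    arrange_general_toy(batches, || |batch: Vec<u32>| batch)
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum State {
    Idle,
    Running,
    Done,
}

#[derive(Clone, Copy, Debug)]
enum Event {
    Start,
    Finish,
    Reset,
}

/// Transition function of the illustrative state machine.
fn step(state: State, event: Event) -> State {
    match (state, event) {
        (State::Idle, Event::Start) => State::Running,
        (State::Running, Event::Finish) => State::Done,
        (_, Event::Reset) => State::Idle,
        (s, _) => s, // events that do not apply leave the state unchanged
    }
}

/// Translating logic: the input batches hold events, the output batches hold
/// the (from, to) transitions those events caused. The state lives in the
/// closure, so it carries over from one batch to the next.
fn record_transitions(batches: Vec<Vec<Event>>) -> Vec<Vec<(State, State)>> {
    arrange_general_toy(batches, || {
        let mut state = State::Idle;
        move |batch: Vec<Event>| {
            let mut transitions = Vec::new();
            for event in batch {
                let next = step(state, event);
                if next != state {
                    transitions.push((state, next));
                }
                state = next;
            }
            transitions
        }
    })
}
```

In this toy, feeding `record_transitions` the batches `[Start, Start]` and `[Finish, Reset]` yields one output batch with `(Idle, Running)` and a second with `(Running, Done)` and `(Done, Idle)`. The redundant `Start` produces nothing, which is the sense in which the output format can differ from the input.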

Other operators like upsert and reduce could in principle be ported to this framework, but I wanted to float this first before doing a massive re-write.

cc: @ruchirK @petrosagg

Contributor

@ruchirK ruchirK left a comment

LGTM

/// Arranges a differential dataflow collection with custom user logic.
///
/// This method generalizes `arrange` in that the output type may differ
/// from the input type, and the user is allow to perform logic as the
Contributor

nit: is allowed

@frankmcsherry
Member Author

Brief thoughts on reduce at least: this probably isn't a great place to port that to without some more thinking, as one of its goals is to accept pre-arranged input, and this operator is doing the arranging. But I could imagine, with a bit more thinking, finding a way to blend the two, where you get a stream of arranged data and the input trace, which instances of this operator would then immediately drop (just because that is what it does) but which others (like reduce) could hold on to.

It's a bit weird, because this operator would generalize to "something that takes arrangements as input", which ... well, at the moment it is what makes arrangements, so clearly there is some unpicking to do there.

The goal, though, is to avoid having so many copy/paste instances of "operator that forms batches and maintains traces" as we have in arrange, reduce, and upsert.

