builtin merge-tool: fix edge cases found by property-based testing #6448

jgreitemann · 2025-05-04T17:14:16Z

Fixes #5189. As discussed in the issue, I went ahead with property-based testing and followed the state machine approach mentioned by @arxanas. Thanks for the pointers you gave in #5189 (comment); they were quite helpful to get my head wrapped around it.

You mentioned that it would be nice to create working copy states for other tests. AFAIU the split been the ReferenceStateMachine and StateMachineTest traits in proptest-state-machine is intended to facilitate this kind of reuse of the reference state machine while keeping the test's state separate, so I've created a dedicated proptest module in testutils containing the data structure implementing ReferenceStateMachine the and associated strategies. The StateMachineTest implementation in merge_tools::builtin then applies the same transitions to its domain by manipulating the "right" MergedTree; another implementation could pick a different architectual flight level.

Regarding the state machine modeling:

The transitions CreateFile/DeleteFile are implemented in a way that they don't require the path to be vacant/extant. CreateFile will overwrite existing files or even directories (as in the case of the issue) and DeleteFile is a no-op if the file does not exist. Of course, the strategies will generate the transitions such that CreateFile references a random file in an existing directory and DeleteFile references an existing file, but collisions might happen and shrinking can make existing files disappear (or even new ones appear, if a prior DeleteFile transition has been shrunk away). It is possible to impose preconditions for transitions and I did so at first but I've found that this really impedes the exploration of the state space and leads into dead ends when shrinking, presumably because the initial state can no longer be shrunk independently from the transitions and vice versa. This is a little hard to explain without going too deep; feel free to hit me up on Discord.
I've not found a great way to test for other properties than what is done in the existing manual tests: selecting none of the changes produces the left tree, selecting them all produces the right tree. As mentioned in the discussion of the issue, it might make sense to apply only some changes, produce an intermediate tree, then apply the remaining changes to go all the way to the right tree. Usually, this sort of thing should be a perfect fit for proptesting but I've struggled to reconcile this with the state machine approach.
The present incarnation is pretty limited to the issue at hand. There are several ways to expand on that (some of which you already mentioned). One thing that I did include is file mode (well, at least the executable flag) and this promptly found another issue that is also fixed in this PR (in the last commit). (File mode was not covered by the manual tests so far because testutils::create_tree didn't account for it.)

The fix for the original issue is in the penultimate tree_builder: ignore tombstone overrides on nested trees. I've also added a regular regression test for this particular scenario. (Proptest docs recommend adding regular tests for known regressions as changes in the strategies effectively invalidate the checked-in seeds).

I'm not super happy with the fixes for either of the bugs. In both cases, it is the best thing I could do given the information that scm-record produces. I've highlighted the pain points in the ~~commit messages~~ change descriptions; perhaps you can judge for yourself whether to take any of that upstream.

Checklist

If applicable:

I have updated CHANGELOG.md
I have updated the documentation (README.md, docs/, demos/)
I have updated the config schema (cli/src/config-schema.json)
I have added tests to cover my changes

This is a manual regression test for issue #5189. When this change is replaced onto `v0.28.2`, the test produces a panic, as described in the issue. As of #6411, the panic no longer happens. However, the test still fails and rightfully so: When following the reproduction and not selecting any change in `jj split`, the split-off (first) commit will still record the deletion of `folder/.keep` but not the creation of the file `folder`.

This adds the proptest crate for property-based testing as well as the proptest-state-machine crate as direct dev dependencies of jj-cli and as dependencies of the internal testutils crate. Within testutils, a `proptest` module provides a reference state machine which models a repository workspace as a tree-like data structure whose leaves represent `File`s. The possible transitions of this state machine are for now limited to the creation of new files (including replacements of existing files or directories) and deletions of files (pruning the tree of empty directory nodes). Additional transitions (moving files, modifying file contents incrementally, ...) and states (symlinks, submodules, conflicts, ...) may be added in the future. The `ReferenceStateMachine` trait implementation provides proptest with strategies for the generation and shrinking of both the initial state and the transitions that are replayed on it; by shrinking the transitions rather than another independent reference state, proptest can search of a failing test input with a minimal diff. This makes this approach quite suited to VCS problems. This reference state machine is then applied to the builtin merge-tool's test suite: - The initial state is used to build a corresponding `MergedTree`. Its ID is used for the fixed "left" tree and serves as the starting point for the right tree. - As transitions are applied, the right tree is updated accordingly. - Each step of the way, the same test logic as in the manual `test_edit_diff_builtin*` tests is run to check that splitting off none or all of the changes results in the left or right tree, respectively. Aside from the bug already captured by `*_replace_directory_with_file`, the property-based test found an independent problem related to file mode changes of empty files. Regression test seeds for both of these issues are also checked in. This ensures that others / CI will reproduce known edge cases deterministically before randomly exploring additional onwards.

scm-record yields a selected change with `FileMode::Absent` in one of two cases: - when an existing file is deleted and this change is selected, - when a new file is created and this change is not selected. From the information provided by `File::get_selected_contents`, it is not possible to distinguish these two cases. In the first case, the tree definitely needs to change to reflect the deletion, so a "tombstone" override is set on the tree builder. In the second case, that tombstone is usually ignored when `TreeBuilder::write_tree()` processes the overrides because the associated file does not exist. However, when the tombstone happens to coindice with a deleted directory and a new file has been created in its place, but neither change is selected, then the tombstone had the effect of deleting the directory from the tree. This is fixed now by ignoring tombstones on trees. When a directory is actually intentionally deleted, this will result in a bunch of tombstones on the files therein; after processing the file overrides, `write_tree()` will remove any now-empty directory trees anyway. Perhaps, a better solution would see scm-record provide information to distinguish a selected deletion from an unselected creation to avoid placing the tombstone altogether. For now, this fixes the test case `test_edit_diff_builtin_replace_directory_with_file` and one failure source of the property-based tests.

The property-based test exposed another bug; this can be reproduced in the following way: $ touch empty-file $ jj commit -m "add non-executable file" $ jj split ...then select the file mode change interactively and give the split-off change a name. Since all the changes have been selected, one would expect the first part to contain the file mode change and the second part to be empty, but in fact: First part: nrkmknyr 785bfe6e (empty) bla Second part: qmwwknqx b5a92967 (no description set) Doing the equivalent thing non-interactively has the expected result: $ jj undo $ jj split empty-file Warning: All changes have been selected, so the second commit will be empty First part: nrkmknyr 3f156f9e first Second part: zlurmpxm 06e24b2d (empty) (no description set) This problem only occurs when the file is empty. When the file mode of an empty text file changes, scm-record produces `SelectedContents::Unchanged`, even though there is a change that needs to be addressed. AFAICT, empty files are the only situation where `Unchanged` is produced. If the file mode of a non-empty file is changed, `SelectedContent::Text` is produced, even though the file itself has not changed and does not have to be written to the store. With the current semantics, IMHO `Unchanged` is a little misleading and acts more like an `Empty`. Knowing this, it is possible to fix the bug by just writing an empty file. However, this is a little wasteful to have to write the file blob anew when it hasn't changed (both when the file is empty or not). It would be nicer to have scm-record report an actual `Unchanged` (also when a non-empty file hasn't changed) and to have some way to update the executable flag in `MergedTreeBuilder` rather then having to provide a new value from scratch.

yuja · 2025-05-05T01:43:53Z

lib/src/tree_builder.rs

+                        tree.remove(basename);
+                    }
+                    None => {}
+                },


Can you move this to caller? As you said, I think this is the problem of scm_record. It's better to insert a workaround in the integration layer, not in the lower-level API.

BTW, since this patch fixes the problem, there should be a test change within this patch.

arxanas

Exciting! Thanks for doing this.

For the benefit of others, I don't think any of the commit messages/comments include an example of what the output looks like, so here's what the failing test looks like (before the fix):

thread 'merge_tools::builtin::tests::test_edit_diff_builtin_proptest' panicked at cli/src/merge_tools/builtin.rs:1425:5:
Test failed: assertion `left == right` failed: all-changes tree was different
  left: Merge(Resolved(TreeId("cce6abbe6e5c05234075")))
 right: Merge(Resolved(TreeId("1131d42426fbb8e2ad12"))).
minimal failing input: (initial_state, transitions, seen_counter) = (
    RepoRefState {
        root: Directory {
            gamma: RegularFile {
                contents: "",
                executable: false,
            },
        },
    },
    [
        CreateFile {
            path: "gamma",
            contents: "",
            executable: true,
        },
    ],
    None,
)

arxanas · 2025-05-06T16:31:02Z

cli/proptest-regressions/merge_tools/builtin.txt

+cc 50055483105a0234d9fb9d2eaf623033f583b31856f280a75f4d3877ea68046f # shrinks to (initial_state, transitions, seen_counter) = (RepoRefState { root: Directory { gamma: RegularFile { contents: "", executable: false } } }, [CreateFile { path: "gamma", contents: "", executable: true }], None)
+cc 7f8b4d5e36149675c133884be076c13317673a2fe0484986a2faf667924af673 # shrinks to (initial_state, transitions, seen_counter) = (RepoRefState { root: Directory { gamma: Directory { alpha: RegularFile { contents: "", executable: false } } } }, [CreateFile { path: "gamma", contents: "", executable: false }], None)


suggestion (non-blocking): My theoretical preference is to copy these tests into "real" test cases anyways, because they may not shrink to the same thing when the code itself changes. But it's a hassle, so I'm fine leaving this as-is for now.

I think Hypothesis makes it easier by 1) having an explicit example feature for embedding test cases right next to the property-based test definition and 2) having some ergonomic features to make it easier to copy/paste (or maybe apply a Git patch?) the test into the code.

arxanas · 2025-05-06T16:33:19Z

cli/src/merge_tools/builtin.rs

@@ -437,7 +437,13 @@ fn apply_changes(

        match contents {
            scm_record::SelectedContents::Unchanged => {
-                // Do nothing.
+                // scm_record currently only reports `Unchanged` if the file is empty.


question: Is this comment accurate? Does it really only report Unchanged if the file is empty? It was supposed to report it if the file was non-empty but the contents were still unchanged.

arxanas · 2025-05-06T16:35:52Z

cli/src/merge_tools/builtin.rs

+                // TODO: It would be sufficient to only update the executable flag on the tree
+                // value, but there's no way to retrieve the FileId or only flip the executable
+                // flag. We resort to re-writing the file, knowing that it's empty.
+                let executable = file_mode == mode::EXECUTABLE;
+                let value = write_file(&path, &[], executable)?;


comment (non-blocking): Hmm, it does seem like a problem that Store has no (?) interface to set or unset whether the file is executable.

Possibly what we should logically be doing is extending the above check for Absent to set the file mode on disk when present. (It's also an issue that we'd have to unconditionally set the file mode because we can't check the current file mode, but that's separate.)

question: Shouldn't the Binary case below have the same issue? If the file has binary contents, and we change the executable bit, then shouldn't it also fail to update?

arxanas · 2025-05-06T16:47:33Z

lib/testutils/src/proptest.rs

+    /// directories are replaced.
+    CreateFile {
+        path: RepoPathBuf,
+        contents: String,


suggestion: We should try to generate specific contents according to some rules:

Should generate newlines reasonably often, because the diff editor is newline-oriented.

Should generate non-UTF-8 in at least some cases.

comment: Allowing binary data here seems to expose an issue, something like this:

thread 'merge_tools::builtin::tests::test_edit_diff_builtin_proptest' panicked at cli/src/merge_tools/builtin.rs:1431:5: Test failed: assertion `left == right` failed: no-changes tree was different: expected tree: tree 9d99fd20a1bcb70eb010 file "a" (e6af4ee4458c7a51a060): "\0" actual tree: tree 4d94566e775adba288d3 file "a" (482ae5a29fbe856c7272): "" left: Merge(Resolved(TreeId("9d99fd20a1bcb70eb010"))) right: Merge(Resolved(TreeId("4d94566e775adba288d3"))). minimal failing input: (initial_state, transitions, seen_counter) = ( WorkingCopyReferenceStateMachine { entries: { "": Directory, }, }, [ SetDirEntry { path: "a", dir_entry: Some( File { contents: "\0", executable: false, }, ), }, Commit, SetDirEntry { path: "a", dir_entry: None, }, ], None, )

arxanas · 2025-05-10T22:40:24Z

lib/testutils/src/proptest.rs

+    }
+}
+
+fn arb_tree() -> impl Strategy<Value = Tree> {


suggestion: By default, I recommend implementing Arbitrary where possible and relying on type-directed strategy lookup, rather than writing explicit strategies (see also https://proptest-rs.github.io/proptest/proptest-derive/modifiers.html). I find it tends to be more ergonomic for writing new property-based tests, since you can just reach for some of the types that you already have on hand.

arxanas · 2025-05-10T22:41:04Z

lib/testutils/src/proptest.rs

+fn arb_path_component() -> impl Strategy<Value = RepoPathComponentBuf> {
+    // biased towards naming collisions (alpha-delta) but with the option to
+    // generate arbitrary UTF-8
+    "(alpha|beta|gamma|delta|[\\PC&&[^/]]+)".prop_map(|s| RepoPathComponentBuf::new(s).unwrap())


nit: I was surprised when I saw alpha/beta/gamma/delta in the output, despite having read this code beforehand. I would recommend that you keep it simple and just use a/b/c.

arxanas · 2025-05-10T22:44:30Z

lib/testutils/src/proptest.rs

+    "(alpha|beta|gamma|delta|[\\PC&&[^/]]+)".prop_map(|s| RepoPathComponentBuf::new(s).unwrap())
+}
+
+fn arb_node() -> impl Strategy<Value = Node> {


suggestion: Unfortunately, I don't think I communicated the design I was proposing well. In this case, I think you should need only one of

recursive generation

the state machine or a hand-rolled operation log approach

The search space for the recursive generation should already be handled by the state machine approach (assuming that your transitions are rich enough). Furthermore, it'll be difficult to get the recursive to "look back" and construct values with inter-references between unrelated trees. Therefore I recommend dropping the recursive generation and relying only on the state machine to explore the state space.

I've opened a PR against this one showing one way it could be done: #6502.

arxanas · 2025-05-10T22:47:01Z

lib/testutils/src/proptest.rs

+    btree_map(arb_path_component(), arb_node(), 1..8).prop_map(|nodes| Tree { nodes })
+}
+
+fn arb_path_component() -> impl Strategy<Value = RepoPathComponentBuf> {


suggestion (non-blocking): I'd recommend just generating Strings or PathBufs, and only converting them to RepoPathComponentBufs right before feeding the values into the jj API, just as a matter of convenience, given that RepoPathComponentBuf isn't something we're specifically testing.

arxanas

It is possible to impose preconditions for transitions and I did so at first but I've found that this really impedes the exploration of the state space and leads into dead ends when shrinking, presumably because the initial state can no longer be shrunk independently from the transitions and vice versa

I've also noticed this, but the shrinking hasn't been too bad for me so far. (I suspect that's partially because I always start from an empty state, instead of a populated tree.)

I think a principled approach might involve allocating a fixed number of optional slots up-front, and then address elements by their index, ignoring operations that would address unoccupied slots. Then you could e.g. prevent a file from being generated without necessarily invalidating all the operations that happen afterwards. I haven't thought too much about it, though.

I've not found a great way to test for other properties than what is done in the existing manual tests: selecting none of the changes produces the left tree, selecting them all produces the right tree. As mentioned in the discussion of the issue, it might make sense to apply only some changes, produce an intermediate tree, then apply the remaining changes to go all the way to the right tree. Usually, this sort of thing should be a perfect fit for proptesting but I've struggled to reconcile this with the state machine approach.

The state machine approach indeed seems a little limiting. I think you'd have to represent the desire to check only some of the checkboxes as a state transition, which is kind of clunky because it represents an entirely different system.

We could also drop the state machine approach and just switch to generating an operation (transition) list as the test case. Then we would have a helper function which applies the operation to a tree/repo. For the left vs right sides, the left side would start generating from the empty operation list, and the right side would start generating from the left operation list. Then we should also be able to use both sides as input to another strategy which selects some number of checkboxes in the resulting diff editor.

arxanas · 2025-05-10T23:05:12Z

lib/testutils/src/proptest.rs

+    fn arb_extant_file_recursive(
+        path: &RepoPath,
+        tree: Tree,
+    ) -> impl Strategy<Value = RepoPathBuf> {
+        let (files, subdirs): (Vec<_>, Vec<_>) =
+            tree.nodes
+                .into_iter()
+                .partition_map(|(name, node)| match node {
+                    Node::File(_) => Either::Left(path.join(&name)),
+                    Node::Directory(tree) => Either::Right((path.join(&name), tree)),
+                });
+
+        match (&files[..], &subdirs[..]) {
+            ([], []) => unreachable!("directory must not be empty"),
+            ([], _) => select(subdirs)
+                .prop_flat_map(|(subdir, subtree)| arb_extant_file_recursive(&subdir, subtree))
+                .boxed(),
+            (_, []) => select(files).boxed(),
+            (_, _) => prop_oneof![
+                select(files),
+                select(subdirs)
+                    .prop_flat_map(|(subdir, subtree)| arb_extant_file_recursive(&subdir, subtree)),
+            ]
+            .boxed(),
+        }
+    }


suggestion: To make this easier, I recommend you store a flat list/map of files and manipulate that (which is the approach I used in #6502). Besides implementation difficulty, this approach will make it difficult to avoid biasing the selection, since it'll be strongly affected by directory structure.

jgreitemann added 5 commits May 4, 2025 16:14

repo_path: impl Extend for RepoPathBuf

9b3433e

jgreitemann requested a review from a team as a code owner May 4, 2025 17:14

yuja reviewed May 5, 2025

View reviewed changes

arxanas mentioned this pull request May 10, 2025

wip: adjust state machine #6502

Draft

arxanas requested changes May 10, 2025

View reviewed changes

arxanas reviewed May 10, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

builtin merge-tool: fix edge cases found by property-based testing #6448

builtin merge-tool: fix edge cases found by property-based testing #6448

jgreitemann commented May 4, 2025

yuja May 5, 2025

arxanas left a comment

arxanas May 6, 2025

arxanas May 6, 2025

arxanas May 6, 2025

arxanas May 6, 2025

arxanas May 10, 2025

arxanas May 10, 2025

arxanas May 10, 2025

arxanas May 10, 2025

arxanas left a comment

arxanas May 10, 2025

		cc 50055483105a0234d9fb9d2eaf623033f583b31856f280a75f4d3877ea68046f # shrinks to (initial_state, transitions, seen_counter) = (RepoRefState { root: Directory { gamma: RegularFile { contents: "", executable: false } } }, [CreateFile { path: "gamma", contents: "", executable: true }], None)
		cc 7f8b4d5e36149675c133884be076c13317673a2fe0484986a2faf667924af673 # shrinks to (initial_state, transitions, seen_counter) = (RepoRefState { root: Directory { gamma: Directory { alpha: RegularFile { contents: "", executable: false } } } }, [CreateFile { path: "gamma", contents: "", executable: false }], None)

builtin merge-tool: fix edge cases found by property-based testing #6448

Are you sure you want to change the base?

builtin merge-tool: fix edge cases found by property-based testing #6448

Conversation

jgreitemann commented May 4, 2025

Checklist

Choose a reason for hiding this comment

arxanas left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

arxanas left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment