Skip to content

mongo-tools evergreen for mongodump passthrough tests #792

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 18, 2025

Conversation

rcownie
Copy link
Collaborator

@rcownie rcownie commented Apr 14, 2025

  1. Evergreen tasks and functions are ported from mongosync/evergreen into mongo-tools/mongodump_passthrough
  2. Various changes are needed to match the different directory structure and evergreen environment
  3. Slight changes to mongodump/mongorestore to help with debugging passthrough failures
  4. New buildvariant "rhel80" used for mongodump passthrough testing only

@rcownie rcownie force-pushed the rcownie/passthru3 branch from a189cba to 3706d62 Compare April 15, 2025 00:27
@rcownie rcownie marked this pull request as ready for review April 15, 2025 00:43
@rcownie rcownie requested review from autarch and shyam12860 April 15, 2025 00:43
@rcownie
Copy link
Collaborator Author

rcownie commented Apr 15, 2025

The check-sbom-lite task is failing apparently because of a disagreement between "go1.23.7" and "go1.23.8". I don't know where these versions are coming from, so I'd welcome any suggestions for fixing this. But it seems unrelated to the new functionality.

@@ -0,0 +1,9 @@
mongo-tools/mongodump_passthrough contains evergreen .yml files to support resmoke passthrough
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering this is for both mongodump and mongorestore, would it be better to call this directory just passthrough rather than mongodump_passthrough?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tricky one. In the scope document it was called "mongodump/mongorestore passthrough, in the technical design I shortened that to "mongodump passthrough". I'm inclined to stick with that because the mongo-tools repo contains other tools, e.g. mongoimport/mongoexport.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reasonable. I was going to suggest putting them in mongodump/passthrough instead, but these tests are for dump + restore both, so this is fine in that case.

# or: "Changed for mongodump_passthrough"
#
# The tests are run by loading the mongosync repo as an evergreen module under src/mongosync.
# The sources for mongodump-suite-gen, mongodump-task-gen, js tests (with slight modifications
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you tell me why the code for these lives in the mongosync directory? I don't have the context

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. The mongosync repo has a fairly elaborate multi-language stack of JS code for generating traffic on the source db and verifying consistency between src and dst clusters; python code for "fixtures" which set up the various components of the test infrastructure - the two clusters, the driver for cluster-to-cluster-replication, and the particular method of replication (mongosync, multi-mongosync, and now mongodump+mongorestore); the suite.yml files which configure the fixtures to run a suite of tests; the suite generator which generates the suites; and the task-generator which tells evergreen about the suites.

And that code has to evolve to cope with changes in the server resmoke infrastructure.

So it's good to share as much of that code as possible, andonly have one copy to be maintained.

It also makes it easier to eyeball the code and see which parts (not very many) have different behavior for mongodump/mongorestore vs mongosync.

So the design choice was to leave that code in the mongosync repo and make minimal change to it, but
fetch it from the mongo-tools repo as an evergreen module.

@@ -673,6 +695,11 @@ post:
set -v
rm -rf /data/db/*
exit 0
# Extra post steps needed for mongodump passthrough.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering all the commands added to pre / post / timeout are passthrough specific, is there a way to run them conditionally, only for passthrough?

Can running it for every task cause errors or slowdowns?

Copy link
Collaborator Author

@rcownie rcownie Apr 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if we could make them conditional - perhaps conditional on the buildvariant. But that's hard with the way they're written - within a shell_exec script, you could use ${build_variant} to select conditional behavior, but this is written as several evergreen function calls, and the whole "not-quite-a-programming-language" flavor of evergreen makes it hard.

I was definitely worried that it might cause errors on the existing tests, thought I'd just try it and see, and it seems the tests pass.

My wild-ass-guess was that it wouldn't cause significant slowdown because it's mostly to do with trying to pick up files produced by the tests, and if it tries to pick up files that weren't created, that doesn't waste a noticeable amount of time. Could be wrong though.

The meta-problem here is that evergreen configurations don't have the usual composability properties of a real programming language. So when you try to glue together two separate chunks of evergreen stuff, there are rough edges in dealing with the global namespace of tasks and functions, and the global behavior of pre and post.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. I don't see a way to use tags to do this either because you need them to run in a specific order.

I'm okay with leaving them here if they're quick enough and don't cause failures.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there's no way to do this. It's a bit annoying.

@rcownie rcownie requested review from FGasper and autarch and removed request for autarch and FGasper April 17, 2025 01:43
Copy link
Collaborator

@autarch autarch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good % some questions I had about specific changes. I also have a few general comments:

  • It'd be nice to separate out the Go version changes into a separate PR, since it's totally unrelated to this work.
  • It seems like you copied a lot of tasks & functions that aren't actually used. It'd be good to remove those. Maybe just leave a comment in where you deleted it if you're concerned about keeping up to date with future changes in the mongosync config.

@@ -673,6 +695,11 @@ post:
set -v
rm -rf /data/db/*
exit 0
# Extra post steps needed for mongodump passthrough.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there's no way to do this. It's a bit annoying.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just add an exclude = "mongodump_passthrough/**" to the golangci-lint config in this repo's precious.toml` file.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mongo-tools/common.yml has a "vet" task which doesn't go through precious.

  • name: vet
    commands:
    • func: "fetch source"
    • command: shell.exec
      type: test
      params:
      working_dir: src/github.com/mongodb/mongo-tools
      script: |
      ${_set_shell_env}
      set -x
      set -v
      set -e
      go vet -composites=false ./bsondump ./mongo*

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. I forgot about that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you could just move that to precious too. It'd be fairly trivial.

${PREPARE_SHELL}
PATH=$PATH:$HOME
echo $PATH
MISE_INSTALL_PATH='${workdir}/.local/bin/mise' sh ${workdir}/src/mongosync/etc/mise.run.sh
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
MISE_INSTALL_PATH='${workdir}/.local/bin/mise' sh ${workdir}/src/mongosync/etc/mise.run.sh
MISE_INSTALL_PATH='${workdir}/.local/bin/mise' sh ${workdir}/src/mongosync/etc/mise.run.sh

Michael recently made a change to add retrying when this fails. It'd be good to get that into this as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +305 to +312
cd mongodump-suite-config
tar cvf ../suite-config/a.tar .
cd ../suite-config
tar xvf a.tar
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use tar like this instead of cp -r?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm more confident that I understand the behavior of tar.

"Historic versions of the cp utility had a -r option. This implementation supports that option, however,
its behavior is different from historical FreeBSD behavior. Use of this option is strongly discouraged as
the behavior is implementation-dependent."

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only run this stuff on Linux, which has a cp -r that works as you'd expect. Using tar seems unnecessarily baroque.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Baroque is my era. I'm easily confused by the way cp puts things inside a destination directory if it already exists.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be changed. This is a very idiosyncratic thing that we don't do elsewhere, AFAIK. It's time to move onto the Classical era, if not Romantic.

@rcownie rcownie force-pushed the rcownie/passthru3 branch from d5c99bc to fc073cc Compare April 17, 2025 19:06
@rcownie
Copy link
Collaborator Author

rcownie commented Apr 17, 2025

This looks good % some questions I had about specific changes. I also have a few general comments:

  • It'd be nice to separate out the Go version changes into a separate PR, since it's totally unrelated to this work.

Done

  • It seems like you copied a lot of tasks & functions that aren't actually used. It'd be good to remove those. Maybe just leave a comment in where you deleted it if you're concerned about keeping up to date with future changes in the mongosync config.

I pruned it fairly heavily: the tasks+functions here are 892 lines, compared to 1826 lines in mongosync. Might have missed one or two unused functions, IDK, but "evergreen validate common.yml" seems mostly happy.

@autarch
Copy link
Collaborator

autarch commented Apr 17, 2025

  • It seems like you copied a lot of tasks & functions that aren't actually used. It'd be good to remove those. Maybe just leave a comment in where you deleted it if you're concerned about keeping up to date with future changes in the mongosync config.

I pruned it fairly heavily: the tasks+functions here are 892 lines, compared to 1826 lines in mongosync. Might have missed one or two unused functions, IDK, but "evergreen validate common.yml" seems mostly happy.

I'm pretty sure validate won't complain about unused functions and tasks. I think it'd be good to audit for those and not include them. Otherwise, it's a burden on whoever has to maintain this in the future. They're forced to try figure out why some seemingly unused function is included.

@rcownie
Copy link
Collaborator Author

rcownie commented Apr 17, 2025

  • It seems like you copied a lot of tasks & functions that aren't actually used. It'd be good to remove those. Maybe just leave a comment in where you deleted it if you're concerned about keeping up to date with future changes in the mongosync config.

I pruned it fairly heavily: the tasks+functions here are 892 lines, compared to 1826 lines in mongosync. Might have missed one or two unused functions, IDK, but "evergreen validate common.yml" seems mostly happy.

I'm pretty sure validate won't complain about unused functions and tasks. I think it'd be good to audit for those and not include them. Otherwise, it's a burden on whoever has to maintain this in the future. They're forced to try figure out why some seemingly unused function is included.

It does complain about seemingly-unused tasks (except it can't tell if dynamically-generated tasks will have them as dependencies).

I made another pass through to find unused functions, that reduces the size to 775 lines.

@rcownie rcownie requested a review from autarch April 18, 2025 10:51
Copy link
Collaborator

@autarch autarch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % please do change the tar thing. I think it is likely to confuse basically most future developers, who have not ever seen this idiom. Keep in mind that nowadays new devs often have much less Unix CLI experience than people from our era.

Comment on lines +305 to +312
cd mongodump-suite-config
tar cvf ../suite-config/a.tar .
cd ../suite-config
tar xvf a.tar
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be changed. This is a very idiosyncratic thing that we don't do elsewhere, AFAIK. It's time to move onto the Classical era, if not Romantic.

@rcownie rcownie merged commit 7214f6a into master Apr 18, 2025
38 checks passed
@autarch
Copy link
Collaborator

autarch commented Apr 18, 2025

@rcownie I think you missed my comment asking you to change the tar thing before you merged this. It's a common practice on our team to do an "LGTM but before you merge please do this one thing" approval.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants