mongo-tools evergreen for mongodump passthrough tests #792
base: master
rcownie
commented
Apr 14, 2025
- Evergreen tasks and functions are ported from mongosync/evergreen into mongo-tools/mongodump_passthrough
- Various changes are needed to match the different directory structure and evergreen environment
- Slight changes to mongodump/mongorestore to help with debugging passthrough failures
- New buildvariant "rhel80" used for mongodump passthrough testing only
a189cba
to
3706d62
Compare
The check-sbom-lite task is failing, apparently because of a disagreement between "go1.23.7" and "go1.23.8". I don't know where these versions are coming from, so I'd welcome any suggestions for fixing this. But it seems unrelated to the new functionality.
@@ -0,0 +1,9 @@
mongo-tools/mongodump_passthrough contains evergreen .yml files to support resmoke passthrough
Considering this is for both mongodump and mongorestore, would it be better to call this directory just passthrough rather than mongodump_passthrough?
Tricky one. In the scope document it was called "mongodump/mongorestore passthrough"; in the technical design I shortened that to "mongodump passthrough". I'm inclined to stick with that because the mongo-tools repo contains other tools, e.g. mongoimport/mongoexport.
Reasonable. I was going to suggest putting them in mongodump/passthrough instead, but these tests are for dump + restore both, so this is fine in that case.
# or: "Changed for mongodump_passthrough"
#
# The tests are run by loading the mongosync repo as an evergreen module under src/mongosync.
# The sources for mongodump-suite-gen, mongodump-task-gen, js tests (with slight modifications
Could you tell me why the code for these lives in the mongosync directory? I don't have the context
Sure. The mongosync repo has a fairly elaborate multi-language stack of JS code for generating traffic on the source db and verifying consistency between src and dst clusters; python code for "fixtures" which set up the various components of the test infrastructure - the two clusters, the driver for cluster-to-cluster-replication, and the particular method of replication (mongosync, multi-mongosync, and now mongodump+mongorestore); the suite.yml files which configure the fixtures to run a suite of tests; the suite generator which generates the suites; and the task-generator which tells evergreen about the suites.
And that code has to evolve to cope with changes in the server resmoke infrastructure.
So it's good to share as much of that code as possible, and only have one copy to be maintained.
It also makes it easier to eyeball the code and see which parts (not very many) have different behavior for mongodump/mongorestore vs mongosync.
So the design choice was to leave that code in the mongosync repo and make minimal changes to it, but fetch it from the mongo-tools repo as an evergreen module.
@@ -673,6 +695,11 @@ post:
set -v
rm -rf /data/db/*
exit 0
# Extra post steps needed for mongodump passthrough.
Considering all the commands added to pre / post / timeout are passthrough specific, is there a way to run them conditionally, only for passthrough?
Can running it for every task cause errors or slowdowns?
It would be nice if we could make them conditional - perhaps conditional on the buildvariant. But that's hard with the way they're written - within a shell_exec script, you could use ${build_variant} to select conditional behavior, but this is written as several evergreen function calls, and the whole "not-quite-a-programming-language" flavor of evergreen makes it hard.
I was definitely worried that it might cause errors on the existing tests, but thought I'd just try it and see, and it seems the tests pass.
My wild-ass-guess was that it wouldn't cause significant slowdown because it's mostly to do with trying to pick up files produced by the tests, and if it tries to pick up files that weren't created, that doesn't waste a noticeable amount of time. Could be wrong though.
The meta-problem here is that evergreen configurations don't have the usual composability properties of a real programming language. So when you try to glue together two separate chunks of evergreen stuff, there are rough edges in dealing with the global namespace of tasks and functions, and the global behavior of pre and post.
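To illustrate the shell_exec case mentioned above: inside a single script, the build variant can gate the passthrough-only steps. This is a minimal sketch in plain POSIX shell, not something taken from common.yml. The shell variable build_variant stands in for Evergreen's ${build_variant} expansion, the variant name rhel80 comes from this PR's description, and is_passthrough_variant is a hypothetical helper.

```shell
# Stand-in for Evergreen's ${build_variant} expansion (assumption: Evergreen
# would substitute the real value before the script runs).
build_variant="${build_variant:-rhel80}"

# Hypothetical helper: succeeds only for variants that run the passthrough tests.
is_passthrough_variant() {
  case "$1" in
    rhel80) return 0 ;;
    *) return 1 ;;
  esac
}

if is_passthrough_variant "$build_variant"; then
  echo "running passthrough-only post steps"
else
  echo "skipping passthrough-only post steps"
fi
```

This only helps within one script. It does not make separate func entries in pre, post, or timeout conditional, which is the limitation being discussed here.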
Fair enough. I don't see a way to use tags to do this either because you need them to run in a specific order.
I'm okay with leaving them here if they're quick enough and don't cause failures.
Yeah, there's no way to do this. It's a bit annoying.
This looks good % some questions I had about specific changes. I also have a few general comments:
- It'd be nice to separate out the Go version changes into a separate PR, since it's totally unrelated to this work.
- It seems like you copied a lot of tasks & functions that aren't actually used. It'd be good to remove those. Maybe just leave a comment in where you deleted it if you're concerned about keeping up to date with future changes in the mongosync config.
common.yml
Outdated
@@ -673,6 +695,11 @@ post:
set -v
rm -rf /data/db/*
exit 0
# Extra post steps needed for mongodump passthrough.
- func: f_resmoke_report_attach
- func: f_gotest_report
Is this necessary for the resmoke stuff? Did you just copy this from the mongosync config?
It's copied from the mongosync config without any deeper analysis.
I'll try removing the f_gotest_report because that seems unnecessary.
common.yml
Outdated
# Extra timeout steps needed for mongodump passthrough.
- func: f_expansions_write
# Hang analyzer is used to upload data files after a timed-out resmoke test.
- func: "run hang analyzer"
I'm not sure we ever actually run this for Mongosync, because for some reason we can't. So I'm not sure if this is useful here either.
I'm slightly inclined to match what mongosync evergreen has as closely as possible. But if you're 100% sure it's useless, I could take it out.
Eh, it's fine either way, I think.
I removed it.
@@ -158,7 +158,7 @@ func oplogDocumentValidator(in []byte) error {
 }

 if ok && nsStr == "admin.system.version" {
-	return fmt.Errorf("cannot dump with oplog if admin.system.version is modified")
+	return fmt.Errorf("cannot dump with oplog if admin.system.version is modified by %v", raw)
Doesn't printing a bson.Raw just print a []byte slice in hex? I think that might be confusing for any users who see this error.
It comes out as text, somehow. I've looked at the logs. bson.Raw has a String() method so that will get it formatted.
"5. If an operand implements method String() string, that method will be invoked to convert the object to a string, which will then be formatted as required by the verb (if any)."
Ah, interesting. I think in that case you should use %s instead.
mongodump_passthrough/empty.go
Outdated
You can just add an exclude = "mongodump_passthrough/**" to the golangci-lint config in this repo's precious.toml file.
The mongo-tools/common.yml has a "vet" task which doesn't go through precious.
- name: vet
  commands:
    - func: "fetch source"
    - command: shell.exec
      type: test
      params:
        working_dir: src/github.com/mongodb/mongo-tools
        script: |
          ${_set_shell_env}
          set -x
          set -v
          set -e
          go vet -composites=false ./bsondump ./mongo*
Ah, right. I forgot about that.
I guess you could just move that to precious too. It'd be fairly trivial.
f_mongosync_binary_fetch: &f_mongosync_binary_fetch
  command: s3.get
  params:
    aws_key: ${aws_key}
    aws_secret: ${aws_secret}
    # Changed for mongodump_passthrough
    remote_file: mongo-tools/mongodump_passthrough/${mongosync_compile_build_variant}/${revision}/${version_id}/${mongosync_binary_folder}/${version_id}.tgz
    bucket: mciuploads
    extract_to: "src/mongosync"
Do you need to do this at all? Do you actually run mongosync for tools passthroughs?
I don't build or run any mongosync binaries. But I copy the mongodump and mongorestore binaries into the mongosync/dist directory, they get packed up and sent to s3 from there, and then they get downloaded back into mongosync/dist.
So we have the same task names and same dependencies as for mongosync passthrough tests, we're just gluing mongodump and mongorestore into that same framework.
I think a comment explaining this would be good.
Added a comment
mongodump_passthrough/functions.yml
Outdated
${PREPARE_SHELL}
PATH=$PATH:$HOME
echo $PATH
MISE_INSTALL_PATH='${workdir}/.local/bin/mise' sh ${workdir}/src/mongosync/etc/mise.run.sh
MISE_INSTALL_PATH='${workdir}/.local/bin/mise' sh ${workdir}/src/mongosync/etc/mise.run.sh
Michael recently made a change to add retrying when this fails. It'd be good to get that into this as well.
Done
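For readers without the mongosync change at hand, retrying a flaky install step in shell generally looks something like the sketch below. This is not the change referred to above; retry is a hypothetical helper written for illustration, and the attempt count and delay in the usage comment are arbitrary.

```shell
# Hypothetical helper: run a command up to max_attempts times, sleeping
# delay seconds between failures. Usage: retry MAX_ATTEMPTS DELAY COMMAND [ARGS...]
# Note: POSIX sh has no "local", so these variables are global.
retry() {
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example use with the installer from the diff above (not executed here):
#   retry 5 10 sh "${workdir}/src/mongosync/etc/mise.run.sh"
```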
cd mongodump-suite-config
tar cvf ../suite-config/a.tar .
cd ../suite-config
tar xvf a.tar
Why use tar like this instead of cp -r?
I'm more confident that I understand the behavior of tar.
"Historic versions of the cp utility had a -r option. This implementation supports that option, however,
its behavior is different from historical FreeBSD behavior. Use of this option is strongly discouraged as
the behavior is implementation-dependent."
We only run this stuff on Linux, which has a cp -r that works as you'd expect. Using tar seems unnecessarily baroque.
Baroque is my era. I'm easily confused by the way cp puts things inside a destination directory if it already exists.
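The cp behavior being described can be seen in a small sketch like the one below, which uses throwaway directories under mktemp. All the directory names are made up for illustration and have nothing to do with the real suite-config layout.

```shell
work="$(mktemp -d)"
mkdir -p "$work/src/sub" "$work/dst_cp" "$work/dst_dot" "$work/dst_tar"
echo hello > "$work/src/sub/file.txt"

# cp -r into an existing directory nests the source directory inside it:
# the file ends up at dst_cp/src/sub/file.txt.
cp -r "$work/src" "$work/dst_cp"

# Copying "src/." instead copies the contents of src into the destination:
# the file ends up at dst_dot/sub/file.txt.
cp -r "$work/src/." "$work/dst_dot"

# The tar approach also copies the contents, whether or not the destination
# already had files in it: the file ends up at dst_tar/sub/file.txt.
(cd "$work/src" && tar cf - .) | (cd "$work/dst_tar" && tar xf -)

find "$work" -name file.txt | sort
```

Piping tar to tar avoids the intermediate a.tar file used in the diff, but is otherwise the same idea.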
#
# This build the mongo-tools executables, then copies mongogump and mongorestore
# into src/mongosync/dist.
"build mongodump and mongorestore":
I don't understand why you're using mise to build the tools binaries. The mongo-tools repo is not using mise (yet?). So when we do an actual release of the tools, we'll just use whatever go is installed on the Evergreen host. I think we should use that same go for these tests.
Changed so that we use the same setup as mongo-tools/common.yml "run make target" for code that runs in mongo-tools, but use mise for code that lives in the mongosync repo (the mongodump-suite-gen).
I did try to get everything to run without mise, but the shell setup (sourcing ./set_goenv.sh) gets confusing so I backed off to this.
mongodump_passthrough/functions.yml
Outdated
- *f_make_migration_verifier_binary_executable

# This uploads everything in src/mongosync/dist/
# For mongodump, we don't need the mongosync bi9naries, but we have
-# For mongodump, we don't need the mongosync bi9naries, but we have
+# For mongodump, we don't need the mongosync binaries, but we have
Done
d5c99bc
to
fc073cc
Compare
Done
I pruned it fairly heavily: the tasks+functions here are 892 lines, compared to 1826 lines in mongosync. Might have missed one or two unused functions, IDK, but "evergreen validate common.yml" seems mostly happy.
I'm pretty sure validate won't complain about unused functions and tasks. I think it'd be good to audit for those and not include them. Otherwise, it's a burden on whoever has to maintain this in the future. They're forced to try to figure out why some seemingly unused function is included.
It does complain about seemingly-unused tasks (except it can't tell if dynamically-generated tasks will have them as dependencies). I don't know if that extends to functions as well.