Ideally, we should run the tests as part of a CI/CD process for staging: a merged PR would trigger an SSH call to stage.verifierplus.org that pulls the latest from main and rebuilds the Docker container, after which the Playwright tests would run against stage. We can do this with a GitHub workflow.
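
A rough sketch of what such a workflow might look like, to make the steps concrete. Several details here are assumptions rather than facts about this project: the secret names (`STAGE_SSH_KEY`, `STAGE_SSH_USER`), the placeholder checkout path on the server, the use of `docker compose` for the rebuild, an npm-based test setup, HTTPS on stage, and a Playwright config that reads a `BASE_URL` environment variable.

```yaml
# Sketch only. Secret names, the remote path, and the rebuild command are
# assumptions and need to be replaced with the project's real values.
name: Deploy to stage and run Playwright tests

on:
  push:
    branches: [main]   # a merged PR arrives as a push to main

jobs:
  deploy-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Pull latest main and rebuild the container on stage
        env:
          STAGE_SSH_KEY: ${{ secrets.STAGE_SSH_KEY }}     # hypothetical secret
          STAGE_SSH_USER: ${{ secrets.STAGE_SSH_USER }}   # hypothetical secret
        run: |
          mkdir -p ~/.ssh
          echo "$STAGE_SSH_KEY" > ~/.ssh/id_stage
          chmod 600 ~/.ssh/id_stage
          # Trusts whatever host key is presented; pinning the key is safer.
          ssh-keyscan stage.verifierplus.org >> ~/.ssh/known_hosts
          # /path/to/app is a placeholder for the checkout location on stage.
          ssh -i ~/.ssh/id_stage "$STAGE_SSH_USER@stage.verifierplus.org" \
            "cd /path/to/app && git pull origin main && docker compose up -d --build"

      - uses: actions/setup-node@v4
        with:
          node-version: 20

      - name: Install dependencies and browsers
        run: |
          npm ci
          npx playwright install --with-deps

      - name: Run Playwright tests against stage
        env:
          # Assumes the Playwright config takes its base URL from BASE_URL.
          BASE_URL: https://stage.verifierplus.org
        run: npx playwright test
```

Because the test step comes after the SSH step in the same job, the tests only run if the deploy command exits successfully. If the container takes a while to come up after the rebuild, a short health-check wait before the test step would probably be needed.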