Neurofinder Monday! #10

freeman-lab opened this issue Nov 4, 2015 · 0 comments

A bunch of us met on Monday, November 4th, during a workshop on large-scale imaging at Janelia Research Campus, to discuss the state of Neurofinder and where to take it next. Here are notes on what we covered and where we landed.

The following people were present: Darcy Peterka, Andrew Osheroff (@andrewosh), Jason Wittenbach (@jwittenbach), Tim Holy (@timholy), Nicholas Sofroniew (@sofroniewn), Konrad Kording, Adam Packer (@apacker83), Ferran Diego, Eftychios Pnevmatikakis (@epnev), Johannes Friedrich (@j-friedrich), Jeremy Freeman (@freeman-lab)

First we summarized the current state. We agreed that we've assembled a nice initial collection of datasets and evaluation metrics, with the help of many contributors, and we've made the data available in a variety of useful formats (including web access via notebooks, and download via these links).

But we also agreed that the current system for automated submission and running of algorithms, which requires that algorithms be written in Python for a standardized environment and submitted via pull requests, has proven a barrier for algorithm developers: many are working in other languages (including Matlab and Julia) and/or find the process too disconnected from their existing workflows.

We discussed two alternatives for moving the project forward:

  1. Continue to provide only the training data for download, but allow people to submit algorithms in their language of choice, which we would still run automatically on the test data. This would hopefully broaden the community while ensuring that algorithms can actually be run and reproduced. But we'd need to modify the testing framework to support multiple languages and handle complex environment specifications. Of course, we could require that people submit Docker images that we would use to run their algorithms, but for most computational neuroscientists, writing and building Docker images may be a significant barrier in its own right, as much as or more than the current system.
  2. Provide the training data and the (unlabeled) test data for download, and allow people to submit algorithm results on the test data. It was noted that this is more similar to the benchmarking setups used, e.g., for object recognition. People would still need to include a link to a GitHub repo with their submission, but we wouldn't run their code. In this version we can't guarantee reproducibility, but it would open Neurofinder to the broadest possible community and eliminate nearly all barriers to entry: presumably people can run their own code, so to submit they just need to run it on the test data and upload the results (see the sketch after this list).
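
Purely as an illustration of how lightweight option 2 could be, here is a minimal sketch of what "run your code on the test data and submit the results" might look like. It assumes a results format of JSON regions given by pixel coordinates and a single `.npy` movie per test dataset; the file names, the `segment_volume` function, and the JSON layout are all placeholders, since the actual submission format is still to be decided (see below).

```python
# Hypothetical sketch of an "option 2" submission: run your own algorithm on
# the test data locally, then upload only the resulting regions as JSON,
# alongside a link to your code's repo. File names and layout are assumptions.
import json

import numpy as np


def segment_volume(movie):
    # Trivial stand-in for a real segmentation algorithm: threshold the
    # time-averaged image and report each bright pixel as a one-pixel "region".
    mean_img = movie.mean(axis=0)
    rows, cols = np.where(mean_img > mean_img.mean() + 2 * mean_img.std())
    return [[(r, c)] for r, c in zip(rows, cols)]


# Load one test dataset (assumed here to be a (time, y, x) movie in .npy form).
movie = np.load("neurofinder.test.00.00.npy")

# Run the algorithm and convert its output to plain lists for JSON.
regions = [
    {"coordinates": [[int(r), int(c)] for r, c in region]}
    for region in segment_volume(movie)
]

# Write the results file that would be submitted along with the repo link.
with open("results.00.00.json", "w") as f:
    json.dump({"dataset": "00.00", "regions": regions}, f, indent=2)
```

Under this scheme, the only thing a submitter actually uploads is the JSON results plus the repo link; whether and how the linked code gets verified is the reproducibility question discussed next.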

After a lively debate, we all favored option 2. But to encourage reproducibility, we can ask that users submit Docker images posted to DockerHub, or Binders built from Jupyter notebooks, that reproduce their results. Reproducible submissions could get a 👍 next to their entry on the metrics page, and for those submissions we could also run the code to report stats like run time.

Feel free to add comments / ideas / anything I forgot here. Assuming we move ahead with this plan, the next step will be nailing down the format for submissions. We'll make another issue or PR to discuss that.

CC @broxtronix @poolio @mathisonian @logang
