For many benchmarks at the patch level, the patches are taken from whole slides.
This task has a few parts.
The first is tracing the patches back to the slides that they came from.
The second is converting a patch dataset into a slide dataset, so that we evaluate everything at the full level, the way it would be in reality.