I have reminded myself and then forgotten again, and @duckduckgrayduck had to remind me again, that the .db file is the single source of truth if the number of files you thought you were uploading doesn’t match the number on DocumentCloud. I think there are probably two related problems here that might be helpful to think through:
- Syncing the .db and .csv in a way that produces a column in the .db, similar to the error message column, that records any disagreement between the db and the csv
- Making sure that the above solution lets the user re-run the script without having to wipe the db first. Solving the first part should actually take care of the second part, but I think it's worth explicitly checking that the two work in tandem.
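
To make the two points above concrete, here is a rough sketch of what a reconcile step could look like. Everything schema-related is an assumption on my part, not what the script actually uses: a table named `uploads`, a key column named `path` that also appears as a CSV header, and a new `sync_status` column. It also assumes the table allows inserting a row with only the key filled in.

```python
import csv
import sqlite3


def reconcile(db_path, csv_path, table="uploads", key="path"):
    """Flag rows where the db and the csv disagree about which files exist.

    Table, key, and column names are hypothetical placeholders.
    Returns {key_value: status} for every row that is not 'ok'.
    """
    with open(csv_path, newline="") as f:
        csv_keys = {row[key] for row in csv.DictReader(f)}

    conn = sqlite3.connect(db_path)
    try:
        # Add the status column only if it is missing, so re-runs are safe.
        cols = {r[1] for r in conn.execute(f"PRAGMA table_info({table})")}
        if "sync_status" not in cols:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN sync_status TEXT")

        db_keys = {r[0] for r in conn.execute(f"SELECT {key} FROM {table}")}

        # Default everything to 'ok', but keep flags from earlier runs for
        # rows that only exist because the csv mentioned them.
        conn.execute(
            f"UPDATE {table} SET sync_status = 'ok' "
            "WHERE sync_status IS NOT 'missing_from_db'"
        )
        # Rows the db knows about that the csv does not list.
        conn.executemany(
            f"UPDATE {table} SET sync_status = 'missing_from_csv' WHERE {key} = ?",
            [(k,) for k in db_keys - csv_keys],
        )
        # Rows the csv lists that the db has never seen: insert them flagged.
        conn.executemany(
            f"INSERT INTO {table} ({key}, sync_status) VALUES (?, 'missing_from_db')",
            [(k,) for k in csv_keys - db_keys],
        )
        conn.commit()
        return dict(
            conn.execute(
                f"SELECT {key}, sync_status FROM {table} WHERE sync_status != 'ok'"
            )
        )
    finally:
        conn.close()
```

Because the column is only added when missing and the flags are recomputed from the current state, running this twice against the same db and csv should give the same result without wiping the db. Whether csv-only rows should be inserted into the db at all, or just reported, is a design choice worth discussing.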