Optimize project activity storage #1220
…wngrade manager -> contributor)
@loic911: I didn't collect this domain because I didn't like the "postgresql cache of mongo data" solution, and since we didn't have the performance problem on our server, it was not a priority for me. I'm pinging you because it could feed into your current thinking about commands, triggers and optimization.
This is not a cache of MongoDB data. The commands &
Yes! Sorry, I mixed up the concepts.
But it's true that the command history could be a good candidate for storage in a NoSQL database (though I don't have the whole command system in mind, so maybe that is not the case). However, that would then raise the question of how to efficiently join data from the SQL DB with data from the NoSQL DB :/
It seems to be the best solution, but this will indeed probably slow down add/edit/delete. I see a potential problem here (to be confirmed/tested):
A possible way to test this is to run the annotation benchmark written by Ba Thien, as it inserts lots of annotations on multiple threads.
I don't think so; I think a view is only a way to "encapsulate" a request. So you would simply replace this by something like that:
Another possibility is to avoid the use of triggers and to keep in memory a structure that maps each project to its last modification date, syncing it frequently (let's say every minute) to the database.
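To make the in-memory alternative more concrete, here is a minimal sketch of such a structure. It is only an illustration under assumptions: the class name `LastActivityBuffer`, its methods, and the `persist` callback are hypothetical and not part of Cytomine, and the periodic scheduling (a timer calling `flush()` every minute, for example) is left out.

```python
import threading
from datetime import datetime, timezone


class LastActivityBuffer:
    """Hypothetical in-memory map: project id -> last modification date."""

    def __init__(self, persist):
        # `persist` is an assumed callback that receives {project_id: datetime}
        # and writes it to the database (e.g. one UPDATE per project).
        self._persist = persist
        self._dirty = {}
        self._lock = threading.Lock()

    def touch(self, project_id, when=None):
        """Record activity on a project; only the newest date is kept."""
        when = when or datetime.now(timezone.utc)
        with self._lock:
            current = self._dirty.get(project_id)
            if current is None or when > current:
                self._dirty[project_id] = when

    def flush(self):
        """Hand all pending dates to `persist`; return how many were synced."""
        with self._lock:
            pending, self._dirty = self._dirty, {}
        if pending:
            self._persist(pending)
        return len(pending)
```

With this approach the add/edit/delete path only pays for a dictionary update, but any activity recorded since the last flush is lost if the server stops before the next sync.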
Needs to be migrated to Spring (or already has been; needs to be checked).
Currently the endpoint api/project.json with the parameter withLastActivity needs to make a join with a group by on the command_history table. On Cytomine instances with large data sets and a long command history, this join becomes very slow.
However, this request is used very frequently in the web UI to list projects by last activity (the default behavior). This PR proposes another solution, with an intermediary table that stores the last project activity. It is much faster than before.
The drawback of this solution is that it uses a SQL trigger on command_history, which probably has a negative effect on all add/edit/delete commands, but this has not been measured.
In any case, we need to find a solution to this performance issue, because the current implementation makes the web UI very unresponsive and unusable.
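The idea of an intermediary table maintained by a trigger can be sketched as follows. This is not the code of this PR: it uses SQLite (via Python's `sqlite3`) as a stand-in for PostgreSQL, whose trigger syntax differs, and the schema is a simplified assumption. Only `command_history` comes from the description above; `project_last_activity` and the column names are hypothetical.

```python
import sqlite3

# Simplified, assumed schema; the real tables have more columns.
SCHEMA = """
CREATE TABLE command_history (
    id INTEGER PRIMARY KEY,
    project_id INTEGER NOT NULL,
    created TEXT NOT NULL            -- ISO 8601 date, compares as text
);
CREATE TABLE project_last_activity (
    project_id INTEGER PRIMARY KEY,
    last_activity TEXT NOT NULL
);
-- On every insert into command_history, keep the newest date per project.
CREATE TRIGGER track_last_activity AFTER INSERT ON command_history
BEGIN
    INSERT OR IGNORE INTO project_last_activity (project_id, last_activity)
    VALUES (NEW.project_id, NEW.created);
    UPDATE project_last_activity
    SET last_activity = NEW.created
    WHERE project_id = NEW.project_id AND last_activity < NEW.created;
END;
"""


def last_activity_group_by(conn):
    """Aggregate over the whole history (the approach described as slow)."""
    rows = conn.execute(
        "SELECT project_id, MAX(created) FROM command_history GROUP BY project_id"
    )
    return dict(rows)


def last_activity_table(conn):
    """Read the small trigger-maintained table instead of aggregating."""
    rows = conn.execute(
        "SELECT project_id, last_activity FROM project_last_activity"
    )
    return dict(rows)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO command_history (project_id, created) VALUES (?, ?)",
    [
        (1, "2020-01-01T10:00:00"),
        (2, "2020-01-02T09:00:00"),
        (1, "2020-01-03T08:00:00"),
    ],
)
# Both approaches must agree; only their read cost differs.
assert last_activity_table(conn) == last_activity_group_by(conn)
```

The read side no longer scans the history, at the price of two extra statements on every insert into `command_history`, which is the write overhead mentioned above.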
A possible alternative to study could be the use of a PostgreSQL view? I don't know how this is managed internally, or whether it's more efficient than a trigger.