Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow performance with multiple rows to insert #175

Open
bomb-on opened this issue Jun 5, 2018 · 4 comments
Open

Slow performance with multiple rows to insert #175

bomb-on opened this issue Jun 5, 2018 · 4 comments

Comments

@bomb-on
Copy link
Contributor

bomb-on commented Jun 5, 2018

I have noticed that sending multiple "documents" for inserting (tried with 300-400 documents) takes very long time, in my case, it was around 2.5 minutes. All of the data is sent in a single request with an array of documents in request body.

Trying to find a bottleneck in my app, I have noticed that commenting out a line responsible for committing the data made everything blazingly fast, so I left it that way for the time being even though I'm not sure about possible consequences yet.

I wonder if that line really is necessary? If so, why is it so? And what could possibly go wrong if I leave it commented out?

Thanks!

@dkellner
Copy link
Collaborator

dkellner commented Jun 5, 2018

Just from reading our code and the SQLAlchemy docs on commit (http://docs.sqlalchemy.org/en/latest/orm/session_basics.html#committing) I'd say it's necessary to save the changes you make in the session to be actually saved to the database. That may mean that as long as your service is running, you may not experience problems (as the objects are stored in the session) - but are you sure they are actually written to the database?

Removing the line causes almost all of our test case to fail - so I bet at least something will be seriously broken. We could try to move it outside that loop though...

@bomb-on
Copy link
Contributor Author

bomb-on commented Jun 5, 2018

Thanks for the quick answer! Pardon my ignorance here, but I just wonder could flush() be used here instead of commit()?

The idea about moving the commit out of the loop could be helpful for my use case I suppose, but not sure if everything would still work as intended then?

@dkellner
Copy link
Collaborator

Thanks for the quick answer! Pardon my ignorance here, but I just wonder could flush() be used here instead of commit()?

I think that's still not really persisting any data (possibly depending on the database). For the difference between flush() and commit() I think this answer on StackOverflow explains it quite well: https://stackoverflow.com/questions/4201455/sqlalchemy-whats-the-difference-between-flush-and-commit

The idea about moving the commit out of the loop could be helpful for my use case I suppose, but not sure if everything would still work as intended then?

At least it breaks less test cases than removing the line entirely ;). If you have time to investigate further, I'd appreciate it!

@bomb-on
Copy link
Contributor Author

bomb-on commented Jun 12, 2018

Ok, thanks for that link, it explains it perfectly! :)

I might "investigate" a bit, the first thing which comes to my mind would be to be able to pass a parameter to that function which would skip committing if user wants to (e.g. def insert(self, resource, doc_or_docs, skip_commit=False)), but I'm not sure if that's something desirable in the first place? Maybe there's something wrong on my end which makes everything slow during commit phase, not sure if that makes any sense at all :)

However, I'll have to fix a ton of other things first since I've been a bit inactive and after updating Eve-SQLAlchemy my tests are screaming with tons of errors... :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants