-
Notifications
You must be signed in to change notification settings - Fork 175
GH-5291 Add asynchronous fsync to LuceneSail #5446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
This requires this PR to be merged: eclipse-rdf4j/rdf4j#5446
I don't think that the close() method on the directory in the LuceneIndex class is ever called. |
b6a649b
to
8f2d4b7
Compare
@hmottestad oops, you are right! Fixed that and added a test to make sure it happens. |
try { | ||
super.syncMetaData(); | ||
} catch (IOException e) { | ||
logger.error("IO error during a periodic sync of Lucene index metadata", e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit worried that if for some reason there is a persistent issue, then we may end up logging continuously but never actually throwing an exception.
What would usually happen if an IO exception was thrown (with the original code)? Would it bring down the entire application or just a particular transaction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would result in a transaction rollback:
rdf4j/core/sail/lucene-api/src/main/java/org/eclipse/rdf4j/sail/lucene/LuceneSailConnection.java
Lines 266 to 269 in 18f9a56
luceneIndex.commit(); | |
} catch (IOException | SailException e) { | |
logger.error("Rolling back", e); | |
luceneIndex.rollback(); |
We cannot do the same thing 1:1 with asynchronous fsyncs, because we don't wait for the result of the fsync. The next best thing we can do is to throw an exception on the next transaction.
I've added a bit of code for that, along with a test.
...ail/lucene/src/main/java/org/eclipse/rdf4j/sail/lucene/impl/DelayedSyncDirectoryWrapper.java
Show resolved
Hide resolved
core/sail/lucene/src/main/java/org/eclipse/rdf4j/sail/lucene/impl/LuceneIndex.java
Outdated
Show resolved
Hide resolved
Thanks for the good fix. I think this will be a good solution overall, just some small things I want to be sure are robust. |
@Override | ||
public void sync(Collection<String> names) throws IOException { | ||
synchronized (pendingSyncs) { | ||
pendingSyncs.addAll(names); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How big is this likely to grow? Should we have a hard limit (possibly configurable) so that we don't run out of memory before we sync?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I can tell, there is no limit on this, it depends on Lucene index size. I added a configurable limit for this, set to 5000 files by default – should be good enough. There is also a test for this.
8f2d4b7
to
ed3e249
Compare
GitHub issue resolved: #5291
Briefly describe the changes proposed in this PR:
As described in the issue, the current setup with an fsync after each transaction is very safe, but also a huge bottleneck when dealing with many small transactions. This PR introduces an option that allows for asynchronous fsyncs in the background, on a fixed interval. If there is nothing to sync, it does nothing.
I tested this with the original workload with which I found the issue. When I set
fsyncInterval
to 5000 ms, it went from ~10–12 TX/s to ~100–150 TX/s, over an HTTP connection. That's basically the same as when I tried removing the fsync entirely. Great :)PR Author Checklist (see the contributor guidelines for more details):
mvn process-resources
to format from the command line)