Commit 5ed0654

Use fdatasync for commits
We can use fdatasync to save one extra write per call, for a total of two writes saved per commit, since each commit syncs twice: once for the data blocks up to the header, then again after the header.

As of OTP 25 (our oldest supported version), file:datasync/1 maps to:

* On Linux/BSDs: fdatasync()
* On Windows: FlushFileBuffers(), i.e. the same as for file:sync/1
* On macOS: fcntl(fd,F_FULLFSYNC/F_BARRIERFSYNC)

According to https://linux.die.net/man/2/fdatasync

> fdatasync() is similar to fsync(), but does not flush modified metadata unless that metadata is needed in order to allow a subsequent data retrieval to be correctly handled. For example, changes to st_atime or st_mtime (respectively, time of last access and time of last modification; see stat(2)) do not require flushing because they are not necessary for a subsequent data read to be handled correctly. On the other hand, a change to the file size (st_size, as made by say ftruncate(2)), would require a metadata flush.

The key things for us are:

* It updates the size (positions) correctly
* We do not rely on or care about atime/mtime for safety or correctness
* The Erlang VM does the right thing on all the supported OSes
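The two-sync commit pattern described above can be sketched roughly as follows. This is a minimal illustration, not CouchDB's actual code path: the module name commit_sketch and the functions commit/3 and demo/1 are hypothetical, and the header is assumed to be a plain binary appended after the data. Only file:open/2, file:write/2, file:datasync/1 and file:close/1 are real OTP calls.

```erlang
-module(commit_sketch).
-export([commit/3, demo/1]).

%% Hypothetical two-phase commit. Names are illustrative only and do not
%% come from couch_file.erl.
commit(Fd, Data, Header) ->
    ok = file:write(Fd, Data),
    %% First sync: the data blocks must be durable before a header that
    %% refers to them is written.
    ok = file:datasync(Fd),
    ok = file:write(Fd, Header),
    %% Second sync: make the header itself durable.
    ok = file:datasync(Fd).

%% Opens Path in append mode, runs one commit, and always closes the file.
demo(Path) ->
    {ok, Fd} = file:open(Path, [append, raw, binary]),
    try
        commit(Fd, <<"data">>, <<"header">>)
    after
        file:close(Fd)
    end.
```

Because both writes grow the file, the size change is metadata that fdatasync() still has to flush, which is why the sketch can rely on file:datasync/1 for both syncs without depending on atime/mtime.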
1 parent 3d7a019 commit 5ed0654

1 file changed

+6
-1
lines changed

src/couch/src/couch_file.erl

Lines changed: 6 additions & 1 deletion
@@ -599,7 +599,12 @@ format_status(_Opt, [PDict, #file{} = File]) ->
 
 fsync(Fd) ->
     T0 = erlang:monotonic_time(),
-    Res = file:sync(Fd),
+    % We do not rely on mtime/atime for our safety/consistency so we can use
+    % fdatasync. As of version 25 OTP will use:
+    %  - On Linux/BSDs: fdatasync()
+    %  - On Windows: FlushFileBuffers() i.e. the same as for file:sync/1
+    %  - On MacOS: fcntl(fd,F_FULLFSYNC/F_BARRIERFSYNC)
+    Res = file:datasync(Fd),
     T1 = erlang:monotonic_time(),
     % Since histograms can consume floating point values we can measure in
     % nanoseconds, then turn it into floating point milliseconds
