[BUG] flock unlocking issues #13821
Comments
Pinging the flock implementation author @crafcat7, any ideas?
Summary: Fixed the bucket being released prematurely in multi-threaded flock scenarios. Thread A calls setlk; thread B calls setlk_wait; thread A releases the lock but fails to check nwaiter, so the bucket is released prematurely; the post to thread B then causes a crash due to heap use-after-free. apache#13821 Signed-off-by: chenrun1 <[email protected]>
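The sequence in the summary above can be illustrated with a minimal sketch. It is not the NuttX code: struct lock_bucket, its fields, and both helper functions are hypothetical names chosen for illustration, and only the nwaiter counter is taken from the summary. It assumes the fix amounts to checking for waiters before the bucket is freed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-file lock bucket (not the NuttX struct). */

struct lock_bucket
{
  size_t nlocks;   /* Locks currently held on the file */
  size_t nwaiter;  /* Threads blocked waiting for a conflicting lock */
};

/* Buggy check, shown for comparison: it ignores waiters, so the bucket
 * can be freed while a woken waiter is about to rescan its lock list.
 */

bool bucket_can_free_buggy(const struct lock_bucket *b)
{
  return b->nlocks == 0;
}

/* Assumed corrected check: the bucket may only be freed once nobody
 * holds a lock and nobody is waiting on it.
 */

bool bucket_can_free(const struct lock_bucket *b)
{
  return b->nlocks == 0 && b->nwaiter == 0;
}
```

With one lock held and one waiter, releasing the lock leaves nlocks == 0 and nwaiter == 1. The buggy check would allow the bucket to be freed while thread B is still about to walk its list, whereas the second check keeps it alive until the waiter is gone.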
Hi, thanks for providing the steps to reproduce the issue, which has been fixed in PR #13826.
Regarding whether gettid or getpid should be used in file locks, I expect the implementation to be consistent with Linux. If I understand correctly, the Linux file lock implementation uses the thread group ID rather than the thread ID (refer to https://github.com/torvalds/linux/blob/27cc6fdf720183dce1dbd293483ec5a9cb6b595e/fs/locks.c#L528-L533).
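For reference, the difference between the two identifiers on Linux can be observed with a short program. This is a sketch assuming a Linux host with glibc; struct ids, record_ids and compare_thread_ids are illustrative names, and SYS_gettid is invoked through syscall() so the code does not depend on a gettid() wrapper being available.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Identifiers observed from one thread (illustrative type). */

struct ids
{
  pid_t pid;  /* Result of getpid(): the thread group ID on Linux */
  pid_t tid;  /* Result of the gettid system call: per-thread ID */
};

/* Thread entry point: store the caller's pid and tid in *p. */

static void *record_ids(void *p)
{
  struct ids *out = p;

  out->pid = getpid();
  out->tid = (pid_t)syscall(SYS_gettid);
  return NULL;
}

/* Record the IDs of the calling thread and of one newly created thread.
 * Returns 0 on success, -1 if the thread could not be created.
 */

int compare_thread_ids(struct ids *main_ids, struct ids *thread_ids)
{
  pthread_t thread;

  record_ids(main_ids);
  if (pthread_create(&thread, NULL, record_ids, thread_ids) != 0)
    {
      return -1;
    }

  pthread_join(thread, NULL);
  return 0;
}
```

On Linux both threads report the same pid while their tid values differ, so a lock owner recorded by pid cannot distinguish threads of one process.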
Thanks for the quick fix!
Hmm, I have tested the same app against Linux (kernel version 6.8.0-45) and thread locking works there. But yes, from the code you sent it seems they are using the thread group ID, which should be the same as what we get from
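A test of the kind described in this thread might look like the following sketch. It is not the reporter's application: the function and struct names and the 10 ms hold time are assumptions. Each thread opens the file itself and then takes an exclusive flock, so the second thread has to wait until the first one unlocks.

```c
#include <fcntl.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/file.h>
#include <unistd.h>

/* Per-thread input and result (illustrative type). */

struct worker_arg
{
  const char *path;  /* File to lock */
  int ok;            /* Set to 1 if lock and unlock both succeeded */
};

/* Open the file, take an exclusive lock, hold it briefly, release it. */

static void *lock_worker(void *p)
{
  struct worker_arg *arg = p;
  int fd = open(arg->path, O_RDWR | O_CREAT, 0644);

  if (fd < 0)
    {
      return NULL;
    }

  if (flock(fd, LOCK_EX) == 0)
    {
      usleep(10000);  /* Hold the lock so the other thread has to wait */
      if (flock(fd, LOCK_UN) == 0)
        {
          arg->ok = 1;
        }
    }

  close(fd);
  return NULL;
}

/* Run two competing threads; return how many completed lock + unlock. */

int run_two_thread_flock(const char *path)
{
  pthread_t threads[2];
  struct worker_arg args[2] = { { path, 0 }, { path, 0 } };
  int started[2] = { 0, 0 };
  int i;

  for (i = 0; i < 2; i++)
    {
      started[i] = pthread_create(&threads[i], NULL,
                                  lock_worker, &args[i]) == 0;
    }

  for (i = 0; i < 2; i++)
    {
      if (started[i])
        {
          pthread_join(threads[i], NULL);
        }
    }

  return args[0].ok + args[1].ok;
}
```

On a system where unlocking works correctly, the function is expected to return 2, meaning both threads acquired and released the lock in turn.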
Description / Steps to reproduce the issue
The file lock flock interface (and generally the fs_lock.c implementation; I suppose the issue is valid for direct fcntl locks as well) does not seem to handle file unlocking correctly. A simple scenario where two threads open and access the file and both attempt to lock it with flock(fd, LOCK_EX); ends with the following result:

So far I have figured out two issues. One is that the lock pid is obtained incorrectly if the lock is taken from a POSIX thread created from the main process instead of a completely separate process. Imho the task id (gettid()) should be used instead of the pid (getpid()) to ensure every thread is treated independently and locking between threads is also possible. The following diff solves this.

But the main issue I am facing is with the unlocking as described above. From my tests it seems the lock from thread 1 is unlocked successfully, deleted from the list, and the semaphore is posted, which releases the second thread. At this moment, the second thread jumps to the retry label (see this line) and runs list_for_every_entry(), which searches through the list of active locks. This should be empty now as we released the only held lock. But for some reason the list returns an existing lock and goes to file_lock_is_conflict(). But the data in the returned lock are not valid! Or at least it seems to be that way: I get a pid value of something like 541213036, so it seems we are accessing a bad part of the memory.

I have tested this while using the SmartFS file system and NOR flash, but I suppose this is reproducible on other file systems as well, as the issue seems to be in the common part of the locking infrastructure or list implementation. One more change is required for SmartFS, I have not committed it yet to the mainline:

I have CONFIG_FS_LOCK_BUCKET_SIZE=8, I suppose this is the only configuration needed.

On which OS does this issue occur?
[OS: Linux]
What is the version of your OS?
Ubuntu 22.04.5, 6.8.0-45-generic
NuttX Version
master
Issue Architecture
[Arch: arm]
Issue Area
[Area: File System]
Verification