fs: Deferred inode reclaim by vfsci-bot[bot] · Pull Request #1300 · linux-fsdevel/vfs

vfsci-bot · 2026-04-29T19:26:21Z

Series: https://patchwork.kernel.org/project/linux-fsdevel/list/?series=1087657
Submitter: Jan Kara
Version: 1
Patches: 4/4
Message-ID: <20260429174850.18223-1-jack@suse.cz>
Base: vfs.base.ci
Lore: https://lore.kernel.org/linux-fsdevel/20260429174850.18223-1-jack@suse.cz

Automated by ml2pr

When an inode has dirtied timestamps, we currently call sync_lazytime() on the last iput. This is done because an inode with any dirty bit set is not inserted into the LRU, and dirty timestamps expire only after many (12 by default) hours, so these inodes would sit outside of LRU aging for a really long time. However, this can result in doing IO, and consequently GFP_NOFAIL allocations, from dentry reclaim, making MM complain. Sample trace for ext4 is:

  prune_dcache_sb
  shrink_dentry_list
  __dentry_kill
  iput
  sync_lazytime
  __mark_inode_dirty
  ext4_dirty_inode
  __ext4_mark_inode_dirty
  ext4_reserve_inode_write
  ext4_get_inode_loc
  bdev_getblk
  __filemap_get_folio_mpol

Avoid this dirtying on the last iput by reshuffling unused inodes to the beginning of the b_dirty_time list and clobbering dirtied_time_when instead, so that they get written during the next periodic writeback.

Signed-off-by: Jan Kara <jack@suse.cz>
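The effect of clobbering dirtied_time_when can be illustrated with a small userspace model. This is only a sketch, not the patch itself: the struct and the toy_* helpers are hypothetical names, time is a plain seconds counter, and the reshuffling of the b_dirty_time list is not modelled. It shows only that backdating the timestamp makes the inode look expired to the next periodic pass, without doing any IO at iput time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 12 hours, the default lazytime expiry mentioned in the commit message. */
#define TOY_DIRTYTIME_EXPIRE_SECS ((int64_t)12 * 60 * 60)

/* Hypothetical stand-in for the one inode field the commit message names. */
struct toy_lazy_inode {
    int64_t dirtied_time_when; /* seconds; signed so it can be backdated */
};

/* Would periodic writeback consider these timestamps old enough to write? */
static inline bool toy_timestamps_expired(const struct toy_lazy_inode *inode,
                                          int64_t now)
{
    return now - inode->dirtied_time_when >= TOY_DIRTYTIME_EXPIRE_SECS;
}

/*
 * Model of the last iput: instead of writing the timestamps right away,
 * backdate dirtied_time_when so the next periodic pass picks the inode up.
 */
static inline void toy_last_iput(struct toy_lazy_inode *inode, int64_t now)
{
    assert(now >= 0);
    inode->dirtied_time_when = now - TOY_DIRTYTIME_EXPIRE_SECS;
}
```

In this model an inode dirtied a minute ago is not expired before toy_last_iput() and is expired immediately afterwards, which is the property the commit message relies on.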

Reclaim of some inodes is rather complex, requiring running transactions or doing other IO. Consequently, filesystems end up doing GFP_NOFAIL allocations from kswapd or even direct reclaim, which is problematic because forward progress of these allocations isn't guaranteed. Add infrastructure for marking inodes whose reclaim is difficult, and offload reclaim of such inodes to a workqueue so that kswapd is not blocked by difficult inode reclaim.

Signed-off-by: Jan Kara <jack@suse.cz>
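The split between inline reclaim and offloaded reclaim can be sketched in userspace C. Everything here is an assumption made for illustration: the toy_* types, the reclaim_deferred flag, and the singly linked deferred list are hypothetical and do not reflect the data structures in the actual patch. The shrinker-like pass evicts easy inodes immediately and only queues the marked ones; a separate worker function, standing in for the workqueue item, drains the queue later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inode model: only the state this sketch needs. */
struct toy_dinode {
    bool reclaim_deferred;            /* reclaim needs IO or a transaction */
    bool queued;                      /* already on the deferred list */
    bool evicted;
    struct toy_dinode *next_deferred;
};

struct toy_dsb {
    struct toy_dinode *deferred_head; /* inodes waiting for the worker */
    size_t nr_deferred;
};

/* Shrinker-like pass: evict easy inodes inline, queue difficult ones. */
static inline size_t toy_prune_icache(struct toy_dsb *sb,
                                      struct toy_dinode *lru, size_t n)
{
    size_t freed = 0;

    for (size_t i = 0; i < n; i++) {
        struct toy_dinode *inode = &lru[i];

        if (inode->evicted || inode->queued)
            continue;
        if (inode->reclaim_deferred) {
            /* Do not block the reclaimer: hand the inode to the worker. */
            inode->queued = true;
            inode->next_deferred = sb->deferred_head;
            sb->deferred_head = inode;
            sb->nr_deferred++;
            continue;
        }
        inode->evicted = true;
        freed++;
    }
    return freed;
}

/* Worker body: runs outside the reclaim context and may block freely. */
static inline size_t toy_deferred_reclaim_work(struct toy_dsb *sb)
{
    size_t freed = 0;

    while (sb->deferred_head) {
        struct toy_dinode *inode = sb->deferred_head;

        assert(sb->nr_deferred > 0);
        sb->deferred_head = inode->next_deferred;
        inode->next_deferred = NULL;
        inode->queued = false;
        sb->nr_deferred--;
        inode->evicted = true;
        freed++;
    }
    return freed;
}
```

The design point the sketch tries to capture is that the reclaim path never performs the expensive eviction itself; it only pays the cost of a list insertion.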

Deferring difficult inode reclaim from prune_icache_sb() to a workqueue removes the natural feedback loop of blocking tasks in direct reclaim until they make space for new allocations. This can let the list of deferred inodes grow without bound and possibly push the machine into a reclaim storm or OOM. Add a throttling mechanism that slows down tasks in mark_inode_reclaim_deferred() if the list of deferred inodes to reclaim grows over the limit. We measure the average time it takes to reclaim an inode on the deferred list and block tasks proportionally to that.

Signed-off-by: Jan Kara <jack@suse.cz>
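One way to picture the throttling is the following userspace sketch. The exact formula used by the patch is not stated in the commit message, so the choices here are assumptions: the average is kept as a simple exponentially weighted moving average with weight 1/8, and a task that pushes the list over the limit is delayed by one average reclaim time. All toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical throttle state for one deferred-reclaim list. */
struct toy_throttle {
    uint64_t avg_reclaim_ns; /* running average cost of one deferred reclaim */
    uint64_t nr_deferred;    /* current length of the deferred list */
    uint64_t limit;          /* list length above which tasks are slowed */
};

/* Worker side: fold one measured reclaim duration into the average. */
static inline void toy_record_reclaim(struct toy_throttle *t,
                                      uint64_t sample_ns)
{
    assert(t != NULL);
    if (t->avg_reclaim_ns == 0)
        t->avg_reclaim_ns = sample_ns;  /* first sample seeds the average */
    else
        t->avg_reclaim_ns = (7 * t->avg_reclaim_ns + sample_ns) / 8;
}

/*
 * Marking side: how long should a task adding to the list wait?
 * Below the limit nothing is charged; above it, the task pays roughly
 * the time the worker needs to reclaim one inode (an assumed policy).
 */
static inline uint64_t toy_throttle_delay_ns(const struct toy_throttle *t)
{
    assert(t != NULL);
    if (t->nr_deferred <= t->limit)
        return 0;
    return t->avg_reclaim_ns;
}
```

Charging about one reclaim's worth of delay per queued inode restores a feedback loop: producers cannot add to the list much faster than the worker drains it.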

When we have to free preallocations during inode eviction, we need to load block bitmaps and run a transaction to modify them. This takes time and also requires GFP_NOFAIL allocations. Mark inodes with preallocated blocks as needing their reclaim offloaded to a workqueue so that we don't block reclaim for long and potentially deadlock the MM subsystem.

Signed-off-by: Jan Kara <jack@suse.cz>

jankara added 4 commits April 29, 2026 19:26

vfsci-bot[bot] wants to merge 4 commits into vfs.base.ci from pw/1087657/vfs.base.ci