Skip to content

use notify prune from upstream#137

Open
hbirth wants to merge 2 commits intoDDNStorage:redfs-ubuntu-hwe-6.17.0-16.16-24.04.1from
hbirth:redfs-ubuntu-hwe-6.17.0-16.16-24.04.1
Open

use notify prune from upstream#137
hbirth wants to merge 2 commits intoDDNStorage:redfs-ubuntu-hwe-6.17.0-16.16-24.04.1from
hbirth:redfs-ubuntu-hwe-6.17.0-16.16-24.04.1

Conversation

@hbirth
Copy link
Copy Markdown
Collaborator

@hbirth hbirth commented Mar 25, 2026

No description provided.

@hbirth hbirth requested a review from bsbernd March 25, 2026 12:39
Some fuse servers need to prune their caches, which can only be done if the
kernel's own dentry/inode caches are pruned first to avoid dangling
references.

Add FUSE_NOTIFY_PRUNE, which takes an array of node ID's to try and get rid
of.  Inodes with active references are skipped.

A similar functionality is already provided by FUSE_NOTIFY_INVAL_ENTRY with
the FUSE_EXPIRE_ONLY flag.  Differences in the interface are

FUSE_NOTIFY_INVAL_ENTRY:

  - can only prune one dentry

  - dentry is determined by parent ID and name

  - if inode has multiple aliases (cached hard links), then they would have
    to be invalidated individually to be able to get rid of the inode

FUSE_NOTIFY_PRUNE:

  - can prune multiple inodes

  - inodes determined by their node ID

  - aliases are taken care of automatically

Reviewed-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit 3f29d59)
@hbirth hbirth force-pushed the redfs-ubuntu-hwe-6.17.0-16.16-24.04.1 branch from 4748957 to d25cddd Compare March 25, 2026 12:46
bsbernd
bsbernd previously approved these changes Mar 25, 2026
@bsbernd
Copy link
Copy Markdown
Collaborator

bsbernd commented Mar 25, 2026

@yongzech Please also take a look.

@hbirth
Copy link
Copy Markdown
Collaborator Author

hbirth commented Mar 25, 2026

@yongzech your patch contained the explicit invalidation of the fuse cache ... Miklos left that out. He probably relies on some mechanism that the cache will be cleaned anyway. Was there a special reason why you invalidated the fuse cache explicitly? It actually just sets the time to 0.

When the inode that has to be pruned is a directory
call shrink_dcache_parent() before trying to prune.

This will try to free all related dcache entries.

Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
@hbirth hbirth force-pushed the redfs-ubuntu-hwe-6.17.0-16.16-24.04.1 branch from d25cddd to f297a15 Compare March 25, 2026 14:32
@hbirth hbirth requested a review from bsbernd March 25, 2026 14:33
@yongzech
Copy link
Copy Markdown
Collaborator

@yongzech your patch contained the explicit invalidation of the fuse cache ... Miklos left that out. He probably relies on some mechanism that the cache will be cleaned anyway. Was there a special reason why you invalidated the fuse cache explicitly? It actually just sets the time to 0.

d_invalidate basically calls shrink_dcache_parent, but also handle submounts

case FUSE_NOTIFY_INC_EPOCH:
return fuse_notify_inc_epoch(fc);

case FUSE_NOTIFY_PRUNE:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have libfuse interface for FUSE_NOTIFY_PRUNE ?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no ... but we have to start at the kernel ... the libfuse code is in a PR

if (S_ISDIR(inode->i_mode)) {
dentry = d_find_alias(inode);
if (dentry) {
shrink_dcache_parent(dentry);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this only remove the children which with ref count is zero?

}
}

d_prune_aliases(inode);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, only when d_lockref.count is zero. But when lost a lock, we need to invalidate dcache entry

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do at the moment ... that's why I left that code in

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think issue with d_invalidate is if it is currently used for for something, which is why Miklos is suggesting shrink_dcache_parent()

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the problem with shrink_dcache_parent and `d_prune_aliases is that, if another node delete a file whose dcache already cached in this node, then don't invalidate dcache.

@hbirth
Copy link
Copy Markdown
Collaborator Author

hbirth commented Mar 30, 2026

@yongzech your patch contained the explicit invalidation of the fuse cache ... Miklos left that out. He probably relies on some mechanism that the cache will be cleaned anyway. Was there a special reason why you invalidated the fuse cache explicitly? It actually just sets the time to 0.

d_invalidate basically calls shrink_dcache_parent, but also handle submounts

Miklos said that he does not want to dehash the dentry ... but I think WE want exactly that ... that's why I think at the moment we need both

@bsbernd
Copy link
Copy Markdown
Collaborator

bsbernd commented Mar 30, 2026

Well, let's say something is currently actively using the dentry while it gets invalidated - the racing process would then work with an unhashed dentry. I think the critical call is

void __d_drop(struct dentry *dentry)
{
	if (!d_unhashed(dentry)) {
		___d_drop(dentry);
		dentry->d_hash.pprev = NULL;
		write_seqcount_invalidate(&dentry->d_seq);
	}
}

And there the question is if this couldn't cause a null pointer deref at some point.

@hbirth
Copy link
Copy Markdown
Collaborator Author

hbirth commented Apr 7, 2026

@yongzech do we want FUSE_NOTIFY_PRUNE?
AFAICT we need both functionalities ... the 'nice' FUSE_NOTIFY_PRUNE, that will remove all unneeded entries, if possible and the one that dehashes the dentry and invalidates it reliably, because we lost a lock.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants