Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Support F_SETFL and F_GETFL #4119

Open
2 tasks
tiif opened this issue Jan 3, 2025 · 18 comments
Open
2 tasks

Support F_SETFL and F_GETFL #4119

tiif opened this issue Jan 3, 2025 · 18 comments
Assignees
Labels
A-concurrency Area: affects our concurrency (multi-thread) support A-shims Area: This affects the external function shims C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement

Comments

@tiif
Copy link
Contributor

tiif commented Jan 3, 2025

This is proposed to support some tests in tokio.

Ideally we could support F_SETFL and F_GETFL for all kind of file descriptions, but getting them to work on socketpair should be sufficient to unblock some of the tests there (here is the relevant codepath).

F_GETFL is quite straight forward as we only support SOCK_NONBLOCK and SOCK_CLOEXEC flags for socketpair.

For F_SETFL, as it has the ability to change O_NONBLOCK flag, we need to decide what should happen if a previously blocking fd is marked as non-blocking, and there are waiting threads. Should they wake up immediately, later at some point or never? It might make this previously unreachable case, reachable.

Ideally support this on:

  • socketpair
  • pipe

@rustbot label +A-shims +C-enhancement +A-concurrency

@rustbot rustbot added A-concurrency Area: affects our concurrency (multi-thread) support A-shims Area: This affects the external function shims C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement labels Jan 3, 2025
@RalfJung
Copy link
Member

RalfJung commented Jan 3, 2025

F_GETFL is quite straight forward as we only support SOCK_NONBLOCK and SOCK_CLOEXEC flags for socketpair.

Note that CLOEXEC is a file descriptor flag and not exposed via F_SETFL/F_GETFL. Those commands only handle file description flags. See the man page:

       Each open file description has certain associated status flags,
       initialized by open(2) and possibly modified by fcntl().
       Duplicated file descriptors (made with dup(2), fcntl(F_DUPFD),
       fork(2), etc.) refer to the same open file description, and thus
       share the same file status flags.

       The file status flags and their semantics are described in
       open(2).

       F_GETFL (void)
              Return (as the function result) the file access mode and
              the file status flags; arg is ignored.

       F_SETFL (int)
              Set the file status flags to the value specified by arg.
              File access mode (O_RDONLY, O_WRONLY, O_RDWR) and file
              creation flags (i.e., O_CREAT, O_EXCL, O_NOCTTY, O_TRUNC)
              in arg are ignored.  On Linux, this operation can change
              only the O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME, and
              O_NONBLOCK flags.  It is not possible to change the
              O_DSYNC and O_SYNC flags; see BUGS, below.

@Darksonn
Copy link

Darksonn commented Jan 3, 2025

Looking at the kernel source code, it seems that it does not notify existing blocking operations in any way, so I would assume that they just continue to block.

@RalfJung
Copy link
Member

RalfJung commented Jan 3, 2025

And then they get unblocked on the next read/write as usual? And at that point they might actually return EWOULDBLOCK if it turns out that not all threads could really be unblocked? I could imagine the kernel doing something smarter, but OTOH changing the blocking nature of a socket that has ongoing operations seems sufficiently odd that maybe we don't need to be bug-for-bug compatible with the kernel.

@Darksonn
Copy link

Darksonn commented Jan 3, 2025

Yeah, not sure. I guess you'll have to try it if you want to find out.

@bjorn3
Copy link
Member

bjorn3 commented Jan 24, 2025

For jobserver it would also be nice to support them on pipes.

@tiif
Copy link
Contributor Author

tiif commented Feb 6, 2025

I'd like to push this forward, looking into this soon.

@rustbot claim

@tiif
Copy link
Contributor Author

tiif commented Feb 13, 2025

I might be wrong, but I suspect that the kernel might race in reading / modifying the O_NONBLOCK flag.

In setfl, when the kernel is modifying a flag, a spinlock is used to protect the the flag.
https://github.com/torvalds/linux/blob/4dc1d1bec89864d8076e5ab314f86f46442bfb02/fs/fcntl.c#L82-L85

The comment here also says f_lock is used to protect f_flags. https://github.com/torvalds/linux/blob/4dc1d1bec89864d8076e5ab314f86f46442bfb02/include/linux/fs.h#L1071

But in pipe_read / pipe_write, the reading of flags is not protected by the same lock.
https://github.com/torvalds/linux/blob/4dc1d1bec89864d8076e5ab314f86f46442bfb02/fs/pipe.c#L357-L361

There could be other mechanisms that can prevent this from happening, but up until now, I still haven't found any.

If it'd be reasonable for us to define the behaviour, I would prefer to make fcntl throws EAGAIN when there is 1 or more threads blocking on the target fd (for example, socketpair / pipe).

@RalfJung
Copy link
Member

RalfJung commented Feb 13, 2025 via email

@tiif
Copy link
Contributor Author

tiif commented Feb 13, 2025

Does fcntl ever return that error on real systems?

Yes, these are documented in https://man7.org/linux/man-pages/man2/fcntl.2.html#ERRORS

       EACCES or EAGAIN
              Operation is prohibited by locks held by other processes.

       EAGAIN The operation is prohibited because the file has been
              memory-mapped by another process.

The reason why I chose EAGAIN is because the description is the closest to the scenario we have here.

       EAGAIN Resource temporarily unavailable (may be the same value as
              EWOULDBLOCK) (POSIX.1-2001).

the above is from taken from https://man7.org/linux/man-pages/man3/errno.3.html

@Darksonn
Copy link

But does it happen in practice? Tokio certainly does not handle it.

@tiif
Copy link
Contributor Author

tiif commented Feb 13, 2025

I tried to write a test that triggers fcntl to change the flag to O_NONBLOCK while a socketpair is blocking, but I could not find a way to make it happen because I could not control thread interleaving in real system.

@Darksonn
Copy link

What is the codepath you are attempting to hit? I don't see a codepath in the kernel that returns -EINVAL when setting the O_NONBLOCK flag.

@tiif
Copy link
Contributor Author

tiif commented Feb 13, 2025

I was trying to reason what would happen when we have this interleaving:

  thread1         | thread 2
pipe read blocks  |
                  |  fcntl add O_NONBLOCK flag to pipe fd

I am not too sure which codepath in the kernel would this behaviour hit.

my another question is would it be possible to have read of flag in pipe_read and this modify of flag in setfl happen without synchronisation?

@Darksonn
Copy link

Most likely the pipe read continues blocking like normal until something wakes it up, at which point it either returns like normal, or notices that the flag has been set and then it returns wouldblock. I don't think changing the flag causes existing read calls to immediately return wouldblock.

And yes they probably read the flag without synchronization that case. They probably should take the lock, but yolo 🤷 You can assume it behaves like they locked and immediately unlocked it, i.e. it gets whatever the value is at that time, because that's probably indeed what does happen in practice.

@RalfJung
Copy link
Member

RalfJung commented Feb 13, 2025 via email

@tiif
Copy link
Contributor Author

tiif commented Feb 13, 2025

Oh yes, I managed to call fcntl(setfl) when pipe is blocking, and have verified these scenarios with strace.

  1. If O_NONBLOCK flag is set while pipe is blocking, the read will still block and never return.
  2. If O_NONBLOCK flag is set while pipe is blocking, then another thread writes to the other side of the pipe, the previously blocking pipe will read successfully.
  3. After executing everything in 2., if we do read again on an empty pipe, it will return EAGAIN, which is the normal behaviour of pipe with O_NONBLOCK flag.

@RalfJung
Copy link
Member

Okay, so basically on a write should just always unblock waiting writers -- even if the pipe is currently non-block.

@tiif
Copy link
Contributor Author

tiif commented Feb 14, 2025

I tested a new case, if we call setfl and getfl while the pipe is blocking, for example:

1. thread 1: pipe read blocking
2. thread 2 : call setfl to set the pipe as non-blocking
3. thread 2: call getfl to retrieve the flag value of that fd

getfl will be able to observe O_NONBLOCK flag that is set in step 2.

So my current mental model if very close to what @Darksonn stated above, if setfl is called while a fd is blocking, that fd will not be affected by the new flag value and will continue blocking like normal until a read/write wakes it up (it acts like it never knew the setfl has happened). But any operation after setfl will see the new flag value.

This shouldn't be hard to replicate in miri. If this model is ok, I'll start working on an implementation.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
A-concurrency Area: affects our concurrency (multi-thread) support A-shims Area: This affects the external function shims C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement
Projects
None yet
Development

No branches or pull requests

5 participants