SELECT(2) Linux Programmer's Manual SELECT(2)
NAME
select, pselect, FD_CLR, FD_ISSET, FD_SET, FD_ZERO - synchronous I/O multiplexing
SYNOPSIS
/* According to POSIX.1-2001, POSIX.1-2008 */
#include <sys/select.h>
/* According to earlier standards */
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>
int select(int nfds, fd_set *readfds, fd_set *writefds,
           fd_set *exceptfds, struct timeval *timeout);
void FD_CLR(int fd, fd_set *set);
int FD_ISSET(int fd, fd_set *set);
void FD_SET(int fd, fd_set *set);
void FD_ZERO(fd_set *set);
#include <sys/select.h>
int pselect(int nfds, fd_set *readfds, fd_set *writefds,
            fd_set *exceptfds, const struct timespec *timeout,
            const sigset_t *sigmask);
Feature Test Macro Requirements for glibc (see feature_test_macros(7)):
pselect(): _POSIX_C_SOURCE >= 200112L || _XOPEN_SOURCE >= 600
DESCRIPTION
select() and pselect() allow a program to monitor multiple file descriptors, waiting until one or more of the file descriptors become "ready" for some class of I/O operation (e.g., input possible). A file descriptor is considered ready if it is possible to perform a corresponding I/O operation (e.g., read(2) without blocking, or a sufficiently small write(2)).
The operation of select() and pselect() is identical, other than these three differences:
(i) select() uses a timeout that is a struct timeval (with seconds and microseconds), while pselect() uses a struct timespec (with seconds and nanoseconds).
(ii) select() may update the timeout argument to indicate how much time was left. pselect() does not change this argument.
(iii) select() has no sigmask argument, and behaves as pselect() called with NULL sigmask.
Three independent sets of file descriptors are watched. Those listed in readfds will be watched to see if characters become available for reading (more precisely, to see if a read will not block; in particular, a file descriptor is also ready on end-of-file), those in writefds will be watched to see if space is available for write (though a large write may still block), and those in exceptfds will be watched for exceptions. On exit, the sets are modified in place to indicate which file descriptors actually changed status. Each of the three file descriptor sets may be specified as NULL if no file descriptors are to be watched for the corresponding class of events.
Four macros are provided to manipulate the sets. FD_ZERO() clears a set. FD_SET() and FD_CLR() respectively add and remove a given file descriptor from a set. FD_ISSET() tests to see if a file descriptor is part of the set; this is useful after select() returns.
nfds is the highest-numbered file descriptor in any of the three sets, plus 1.
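The set macros and the nfds rule can be combined roughly as follows. This is a minimal sketch, not a reference implementation; the helper name wait_any_readable() is hypothetical and does not appear elsewhere on this page.

```c
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper: block until one of fds[0..count-1] is readable.
 * Returns the first ready descriptor, or -1 on error or bad input. */
int wait_any_readable(const int *fds, size_t count)
{
    fd_set rfds;
    int maxfd = -1;

    if (count == 0)
        return -1;

    FD_ZERO(&rfds);                     /* start from an empty set */
    for (size_t i = 0; i < count; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -1;                  /* cannot be stored in an fd_set */
        FD_SET(fds[i], &rfds);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    /* nfds is the highest-numbered descriptor plus 1; a NULL timeout
     * means the call may block indefinitely. */
    if (select(maxfd + 1, &rfds, NULL, NULL, NULL) <= 0)
        return -1;

    /* The set was modified in place: only ready descriptors remain. */
    for (size_t i = 0; i < count; i++)
        if (FD_ISSET(fds[i], &rfds))
            return fds[i];
    return -1;
}
```

A caller would pass an array of open descriptors and then read from the one that is returned.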
The timeout argument specifies the interval that select() should block waiting for a file descriptor to become ready. The call will block until either:
* a file descriptor becomes ready;
* the call is interrupted by a signal handler; or
* the timeout expires.
Note that the timeout interval will be rounded up to the system clock granularity, and kernel scheduling delays mean that the blocking interval may overrun by a small amount. If both fields of the timeval structure are zero, then select() returns immediately. (This is useful for polling.) If timeout is NULL (no timeout), select() can block indefinitely.
sigmask is a pointer to a signal mask (see sigprocmask(2)); if it is not NULL, then pselect() first replaces the current signal mask by the one pointed to by sigmask, then does the "select" function, and then restores the original signal mask.
Other than the difference in the precision of the timeout argument, the following pselect() call:
    ready = pselect(nfds, &readfds, &writefds, &exceptfds,
                    timeout, &sigmask);
is equivalent to atomically executing the following calls:
    sigset_t origmask;
    pthread_sigmask(SIG_SETMASK, &sigmask, &origmask);
    ready = select(nfds, &readfds, &writefds, &exceptfds, timeout);
    pthread_sigmask(SIG_SETMASK, &origmask, NULL);
The reason that pselect() is needed is that if one wants to wait for either a signal or for a file descriptor to become ready, then an atomic test is needed to prevent race conditions. (Suppose the signal handler sets a global flag and returns. Then a test of this global flag followed by a call of select() could hang indefinitely if the signal arrived just after the test but just before the call. By contrast, pselect() allows one to first block signals, handle the signals that have come in, then call pselect() with the desired sigmask, avoiding the race.)
The timeout
The time structures involved are defined in <sys/time.h> and look like
    struct timeval {
        long tv_sec;   /* seconds */
        long tv_usec;  /* microseconds */
    };
and
    struct timespec {
        long tv_sec;   /* seconds */
        long tv_nsec;  /* nanoseconds */
    };
(However, see below on the POSIX.1 versions.)
Some code calls select() with all three sets empty, nfds zero, and a non-NULL timeout as a fairly portable way to sleep with subsecond precision.
On Linux, select() modifies timeout to reflect the amount of time not slept; most other implementations do not do this. (POSIX.1 permits either behavior.) This causes problems both when Linux code which reads timeout is ported to other operating systems, and when code is ported to Linux that reuses a struct timeval for multiple select()s in a loop without reinitializing it. Consider timeout to be undefined after select() returns.
RETURN VALUE
On success, select() and pselect() return the number of file descriptors contained in the three returned descriptor sets (that is, the total number of bits that are set in readfds, writefds, exceptfds) which may be zero if the timeout expires before anything interesting happens. On error, -1 is returned, and errno is set to indicate the error; the file descriptor sets are unmodified, and timeout becomes undefined.
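Since the return value counts bits rather than distinct descriptors, one descriptor that is both readable and writable contributes 2. The sketch below illustrates this on Linux with a UNIX domain socket pair; the function name ready_bits_demo() is hypothetical.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: watch one end of a socket pair in both the read and
 * the write set and return what select() reports. If send_byte is
 * nonzero, one byte is queued for that end first. */
int ready_bits_demo(int send_byte)
{
    int sv[2];
    fd_set rfds, wfds;
    struct timeval tv;
    int n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    if (send_byte && write(sv[1], "x", 1) != 1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    FD_ZERO(&rfds);
    FD_ZERO(&wfds);
    FD_SET(sv[0], &rfds);
    FD_SET(sv[0], &wfds);
    tv.tv_sec = 0;                      /* poll, do not wait */
    tv.tv_usec = 0;

    /* Counts one bit per set in which sv[0] is reported ready. */
    n = select(sv[0] + 1, &rfds, &wfds, NULL, &tv);

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

With no data queued, only the write bit is set; after one byte is sent, the same descriptor is counted in both sets.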
ERRORS
EBADF  An invalid file descriptor was given in one of the sets. (Perhaps a file descriptor that was already closed, or one on which an error has occurred.)
EINTR  A signal was caught; see signal(7).
EINVAL nfds is negative or exceeds the RLIMIT_NOFILE resource limit (see getrlimit(2)).
EINVAL the value contained within timeout is invalid.
ENOMEM unable to allocate memory for internal tables.
VERSIONS
pselect() was added to Linux in kernel 2.6.16. Prior to this, pselect() was emulated in glibc (but see BUGS).
CONFORMING TO
select() conforms to POSIX.1-2001, POSIX.1-2008, and 4.4BSD (select() first appeared in 4.2BSD). Generally portable to/from non-BSD systems supporting clones of the BSD socket layer (including System V variants). However, note that the System V variant typically sets the timeout variable before exit, but the BSD variant does not.
pselect() is defined in POSIX.1g, and in POSIX.1-2001 and POSIX.1-2008.
NOTES
An fd_set is a fixed size buffer. Executing FD_CLR() or FD_SET() with a value of fd that is negative or is equal to or larger than FD_SETSIZE will result in undefined behavior. Moreover, POSIX requires fd to be a valid file descriptor.
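A range check before FD_SET() avoids the undefined behavior described above. The wrapper below is a small sketch; the name checked_fd_set() is hypothetical.

```c
#include <sys/select.h>

/* Hypothetical wrapper: add fd to set only if an fd_set can hold it.
 * Returns 0 on success, -1 if fd is out of range. */
int checked_fd_set(int fd, fd_set *set)
{
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;
    FD_SET(fd, set);
    return 0;
}
```

Programs that may hold descriptors numbered FD_SETSIZE or higher may prefer poll(2), which does not have this limit.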
Concerning the types involved, the classical situation is that the two fields of a timeval structure are typed as long (as shown above), and the structure is defined in <sys/time.h>. The POSIX.1 situation is
    struct timeval {
        time_t      tv_sec;   /* seconds */
        suseconds_t tv_usec;  /* microseconds */
    };
where the structure is defined in <sys/select.h> and the data types time_t and suseconds_t are defined in <sys/types.h>.
Concerning prototypes, the classical situation is that one should include <time.h> for select(). The POSIX.1 situation is that one should include <sys/select.h> for select() and pselect().
Under glibc 2.0, <sys/select.h> gives the wrong prototype for pselect(). Under glibc 2.1 to 2.2.1, it gives pselect() when _GNU_SOURCE is defined. Since glibc 2.2.2, the requirements are as shown in the SYNOPSIS.
Multithreaded applications
If a file descriptor being monitored by select() is closed in another thread, the result is unspecified. On some UNIX systems, select() unblocks and returns, with an indication that the file descriptor is ready (a subsequent I/O operation will likely fail with an error, unless the file descriptor is reopened between the time select() returned and the time the I/O operation is performed). On Linux (and some other systems), closing the file descriptor in another thread has no effect on select(). In summary, any application that relies on a particular behavior in this scenario must be considered buggy.
C library/kernel differences
The pselect() interface described in this page is implemented by glibc. The underlying Linux system call is named pselect6(). This system call has somewhat different behavior from the glibc wrapper function.
The Linux pselect6() system call modifies its timeout argument. However, the glibc wrapper function hides this behavior by using a local variable for the timeout argument that is passed to the system call. Thus, the glibc pselect() function does not modify its timeout argument; this is the behavior required by POSIX.1-2001.
The final argument of the pselect6() system call is not a sigset_t * pointer, but is instead a structure of the form:
    struct {
        const sigset_t *ss;     /* Pointer to signal set */
        size_t          ss_len; /* Size (in bytes) of object pointed
                                   to by 'ss' */
    };
This allows the system call to obtain both a pointer to the signal set and its size, while allowing for the fact that most architectures support a maximum of 6 arguments to a system call.
BUGS
Glibc 2.0 provided a version of pselect() that did not take a sigmask argument.
Starting with version 2.1, glibc provided an emulation of pselect() that was implemented using sigprocmask(2) and select(). This implementation remained vulnerable to the very race condition that pselect() was designed to prevent. Modern versions of glibc use the (race-free) pselect() system call on kernels where it is provided.
On systems that lack pselect(), reliable (and more portable) signal trapping can be achieved using the self-pipe trick. In this technique, a signal handler writes a byte to a pipe whose other end is monitored by select() in the main program. (To avoid possibly blocking when writing to a pipe that may be full or reading from a pipe that may be empty, nonblocking I/O is used when reading from and writing to the pipe.)
Under Linux, select() may report a socket file descriptor as "ready for reading", while nevertheless a subsequent read blocks. This could, for example, happen when data has arrived but upon examination has a wrong checksum and is discarded. There may be other circumstances in which a file descriptor is spuriously reported as ready. Thus it may be safer to use O_NONBLOCK on sockets that should not block.
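With O_NONBLOCK set, a spurious readiness report shows up as an EAGAIN (or EWOULDBLOCK) error from read(2) instead of a blocked call. A sketch, using the hypothetical helper names set_nonblocking() and read_after_select():

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: add O_NONBLOCK to fd's file status flags.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: read from a nonblocking fd that select() reported
 * as ready. Returns what read() returned; *spurious is set to 1 when no
 * data was actually available, so the caller can go back to select(). */
ssize_t read_after_select(int fd, void *buf, size_t len, int *spurious)
{
    ssize_t n = read(fd, buf, len);

    *spurious = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));
    return n;
}
```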
On Linux, select() also modifies timeout if the call is interrupted by a signal handler (i.e., the EINTR error return). This is not permitted by POSIX.1. The Linux pselect() system call has the same behavior, but the glibc wrapper hides this behavior by internally copying the timeout to a local variable and passing that variable to the system call.
EXAMPLE
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

int
main(void)
{
    fd_set rfds;
    struct timeval tv;
    int retval;

    /* Watch stdin (fd 0) to see when it has input. */
    FD_ZERO(&rfds);
    FD_SET(0, &rfds);

    /* Wait up to five seconds. */
    tv.tv_sec = 5;
    tv.tv_usec = 0;

    retval = select(1, &rfds, NULL, NULL, &tv);
    /* Don't rely on the value of tv now! */

    if (retval == -1)
        perror("select()");
    else if (retval)
        printf("Data is available now.\n");
        /* FD_ISSET(0, &rfds) will be true. */
    else
        printf("No data within five seconds.\n");

    exit(EXIT_SUCCESS);
}
SEE ALSO
accept(2), connect(2), poll(2), read(2), recv(2), restart_syscall(2), send(2), sigprocmask(2), write(2), epoll(7), time(7)
For a tutorial with discussion and examples, see select_tut(2).
COLOPHON
This page is part of release 4.04 of the Linux man-pages project. A description of the project, information about reporting bugs, and the latest version of this page, can be found at http://www.kernel.org/doc/man-pages/.
Linux 2015-07-23 SELECT(2)