Hi Scott,
Do you have the sigaction() set up properly? i.e. if you replaced
takedescriptor() with sleep() (just for the sake of a test, mind you)
would it successfully receive the alarm() signal? (Just to rule out a
bug in your sigaction code...)
Yes, it is all set up correctly. If I run the following code:
alarm(5);
sleep(10);
alarm(0);
my alarm handler procedure gets called after 5 seconds.
If I run this code:
alarm(5);
sd = takedescriptor(*null);
alarm(0);
my alarm handler procedure does not get called after 5 seconds.
If, after the 5 seconds have elapsed, I end the request via SysReq 2 with a breakpoint set on a line within the alarm handler, the handler DOES get called at that point. So the End Request interrupted takedescriptor, and the alarm signal (which had been raised earlier) was only actioned once takedescriptor yielded.
Assuming that the problem really is in takedescriptor() instead of
something with your signal mask or signal configuration, then would you
be open to alternatives?
It certainly looks like the problem is with takedescriptor. But I am always open to alternatives.
Alternately, you could eliminate takedescriptor (I know I haven't used
it in 10 years) and use sendmsg() and recvmsg() instead. I'm confident
that _they_ will be interrupted by a signal.
Yes, I mentioned these APIs in my original message. But it would be difficult to move over to them because of the code changes required in multiple programs. If it becomes necessary I will move over to them. It may be my only choice.
Or... depending on what you're doing... you might find spawn() to be a
better choice. It handles the submit and the descriptor passing all in
one fell swoop.
Not an option here - it doesn't fit for this application design.
But... are you sure that alarm() doesn't interrupt takedescriptor?
Yes, definitely.
I'm wondering if there is another timer-based mechanism that I could use to raise a higher priority signal. Or is it possible for alarm to raise a signal other than SIGALRM? A SIGTERM, for example? That would work for me. I want to end the job anyway; I just want to end it only if takedescriptor doesn't return within a set period of time.
Background: This code runs in a pool of jobs used as worker jobs for a socket application. Their details are retrieved from a DTAQ by the server job and the socket descriptor is passed to the worker job. This code is reasonably old and stable (mature) and works very fast and very well. I want to use it to manage stateful connections, and the changes work very well. Cleaning up the orphaned jobs after a set session timeout period should have been a case of setting an alarm when the job goes back on to the takedescriptor. Once the session times out due to user inactivity (no socket descriptor having been passed since the alarm was set), the alarm should fire and the job should be cleaned up and ended.
But takedescriptor is blocking the SIGALRM! Very frustrating.
I appreciate your time and interest, Scott.
Cheers
Larry Ducie