Hi Scott,
Is it possible to have a variable number of workers
with sendmsg/recvmsg?

The server now handles a variable number of workers.
It starts 10 workers, but when a new request arrives
and no worker is available, the server starts another
one (up to a maximum number). Each worker waits on a
data queue (dtaq) for a limited time; if no message
arrives within 30 minutes, it ends. This means that at
night there may be only 2-3 workers running, while at
peak hours there can be 20 or more. Because each
worker puts a lock on the dtaq, I can always tell how
many workers are running, and I can end them easily by
sending a message to the dtaq.

All of this runs in a dedicated subsystem; when the
crash occurred, the subsystem was still active. I
suppose it's a communication failure, but I don't know
where to look. The crash has occurred 3 times in 10
months (twice with the old 890 and once with the new
570), so it's not a big deal, but I would like to find
out what is causing this failure and where the SIGTERM
comes from.
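
As a rough illustration of the worker policy described above (not the actual server code, which is not shown here), the two decisions could be sketched in C as follows. The initial count of 10 and the 30-minute idle limit come from the description; MAX_WORKERS, struct pool_state and both function names are hypothetical placeholders, since the real maximum is not stated.

```c
#include <stdbool.h>

/* 10 workers and 30 minutes are taken from the description above.
   MAX_WORKERS is an assumed placeholder: the real cap is not stated. */
enum {
    INITIAL_WORKERS   = 10,
    MAX_WORKERS       = 30,
    IDLE_TIMEOUT_SECS = 30 * 60
};

/* Hypothetical bookkeeping kept by the server. */
struct pool_state {
    int running;  /* workers currently alive */
    int busy;     /* workers currently handling a request */
};

/* Start another worker only when every running worker is busy
   and the cap has not been reached yet. */
bool should_start_worker(const struct pool_state *p)
{
    return p->busy >= p->running && p->running < MAX_WORKERS;
}

/* A worker ends itself once it has waited IDLE_TIMEOUT_SECS
   without receiving a message. */
bool should_worker_exit(long idle_secs)
{
    return idle_secs >= IDLE_TIMEOUT_SECS;
}
```

The server would call should_start_worker() when a request arrives, and each worker would call should_worker_exit() after a timed wait returns without a message.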
Giuseppe.
--- Scott Klement <rpg400-l@xxxxxxxxxxxxxxxx> wrote:
beppecosta@xxxxxxxxxxx wrote:
I have a socket server that spawns 10 workers. The server
waits on 'accept' (on a free high port - 52054 - defined
in the Service Table Entry), while the workers wait on a
dtaq.

You might consider changing this to use sendmsg/recvmsg on a
socket pair. That would perform better, and have fewer errors
(in my experience, anyway) than the old data queue technique.
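
A minimal sketch of that suggestion, assuming the POSIX/XPG form of sendmsg()/recvmsg() with SCM_RIGHTS ancillary data on a UNIX-domain socket pair: the server passes an accepted descriptor to a worker over the pair, and the worker receives it. The names send_fd and recv_fd are hypothetical helpers, not part of any API. On IBM i the msghdr layout depends on compile-time options, so the sendmsg() documentation for the target release should be checked before adapting this.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Server side: pass fd_to_send to the worker at the other end of chan.
   Returns 0 on success, -1 on error. */
int send_fd(int chan, int fd_to_send)
{
    char byte = 'F';  /* one data byte must accompany the control message */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;

    memset(&ctl, 0, sizeof ctl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd_to_send, sizeof(int));

    return sendmsg(chan, &msg, 0) == 1 ? 0 : -1;
}

/* Worker side: block until a descriptor arrives on chan.
   Returns the new descriptor, or -1 on error. */
int recv_fd(int chan)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    int fd;

    memset(&ctl, 0, sizeof ctl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    if (recvmsg(chan, &msg, 0) != 1)
        return -1;

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    if (cm == NULL || cm->cmsg_level != SOL_SOCKET ||
        cm->cmsg_type != SCM_RIGHTS || cm->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cm), sizeof fd);
    return fd;
}
```

One way this could accommodate a variable number of workers (an assumption, not something stated in the thread) is to create one socket pair per worker with socketpair() at the moment the worker is started, so the server holds one end for each live worker.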

CPC1224 Completion 50 11/09/07 11:53:11,246936
QWTPITP2 QSYS 0601 *EXT
Message . . . . : Job ended abnormally.
Cause . . . . . : A SIGTERM signal was received for the
job. The action for the signal was to terminate the job.

Sounds like they all received the SIGTERM signal at the same
time. Was the subsystem ended, by chance? That would cause this.

It's possible to manually send SIGTERM to each process using the
kill() API or the 'kill' QShell command, but that would require a
program to be written or a user to run the command. By contrast,
ending the subsystem, or ending all of the jobs, would send this
signal to all of them automatically.

You can write code (via the sigaction() API) that will catch and
ignore SIGTERM if you like.
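
A minimal sketch of both options with the POSIX sigaction() API: one function installs a handler that records that SIGTERM arrived, and another ignores the signal outright. The function and variable names are hypothetical. Recording the sender through si_pid relies on SA_SIGINFO filling that field in, which is an assumption that should be verified on the target system.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>

/* Written only by the handler. sig_atomic_t is the type that is safe
   to write from a signal handler; storing a pid in it assumes the pid
   fits, which holds where both are plain int. */
static volatile sig_atomic_t term_requested = 0;
static volatile sig_atomic_t term_sender_pid = 0;

/* Handler: note that SIGTERM arrived and, if available, who sent it. */
static void on_sigterm(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    term_requested = 1;
    if (info != NULL)
        term_sender_pid = (sig_atomic_t)info->si_pid;
}

/* Catch SIGTERM instead of letting the default action end the job.
   Returns 0 on success, -1 on error. */
int install_sigterm_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigterm;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, NULL);
}

/* Ignore SIGTERM entirely. Returns 0 on success, -1 on error. */
int ignore_sigterm(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, NULL);
}
```

Catching the signal and logging term_requested and term_sender_pid from the worker's main loop may say more about where the SIGTERM comes from than ignoring it would.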
--
This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page.