On 3/16/2011 11:59 AM, Scott Klement wrote:
Hmmm... not sure what you're replying to, here, Jerry... is there more
to this thread? (I guess I'll go scout the archives)
Okay, after about 15 minutes of fishing through the archives, I found
that someone named "Rick Renkema" posted the start of this thread. He
posted it as a reply to a discussion about "SQL RAND": instead of
writing a new message, he used "Reply", so it showed up as part of the
SQL RAND thread in my e-mail. And since I had already deleted that
thread (I wasn't interested in SQL RAND), I didn't see his message.
http://archive.midrange.com/midrange-l/201103/msg00685.html
@Rick: This cost me time and effort! PLEASE don't reply to an existing
thread when you want to start a new topic! Just compose a new message
to midrange-l@xxxxxxxxxxxx
Here's my response in-line with Rick's words:
Sockets are open and always stay open (I am working on finding a way to
close them after a period of inactivity, but that is not my problem, I
think).
I don't understand what you are saying here.
Do you mean that your POS units keep a static connection to your server,
and by the word "open" you mean the connection remains established? If
so, this sounds like desirable behavior, yes?
Or are you saying that the POS units are disconnecting, but the
connections remain established? Or that the instances never end? (This
would be undesirable.)
Or do you mean that the TCP channels are closing, but the sockets aren't
being closed? (This would be a bug in your program, and would
eventually exhaust resources.)
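To illustrate that last case, here is a minimal sketch in Python (not the RPG or C you'd actually be running; the function name serve_until_disconnect is made up for illustration). It assumes a simple echo-style conversation. The point is only that when recv() returns no data the peer has disconnected, and the program must still close its own socket or the descriptor leaks:

```python
import socket


def serve_until_disconnect(conn: socket.socket) -> int:
    """Echo data back until the peer disconnects, then close our socket.

    Hypothetical helper for illustration; returns the number of bytes handled.
    """
    total = 0
    try:
        while True:
            data = conn.recv(4096)
            if not data:
                # An empty result means the peer closed its end of the
                # connection. Our descriptor is still open at this point.
                break
            total += len(data)
            conn.sendall(data)
    finally:
        # Always release the descriptor, even if recv()/sendall() raised.
        conn.close()
    return total
```

If the close() in the finally block were missing, every disconnected client would leave one descriptor behind in the job.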
What does happen is that all of a sudden, after working
perfectly for a period of time, the server instances (all of them) will
no longer respond.
What is meant by "no longer respond"? Do you mean you can't establish a
new connection? (Which would make sense if something happened in the
listener program.) Or do you mean that you can't send/receive data on
any of the already established ones? (I can't come up with a reason why
that would happen simultaneously on all instances... but on a
per-instance basis it can happen easily if you're not performing
timeouts and handling errors properly.)
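By "performing timeouts" I mean something along these lines. This is only a sketch in Python under my own assumptions (the name recv_with_timeout and the three-way return value are invented for the example), but select() and recv() behave the same way in the C and RPG socket APIs:

```python
import select
import socket


def recv_with_timeout(conn: socket.socket, timeout_secs: float, bufsize: int = 4096):
    """Wait up to timeout_secs for data on conn.

    Hypothetical helper for illustration. Returns the received bytes,
    b"" if the peer disconnected, or None if the wait timed out.
    """
    # select() returns an empty "readable" list when the timeout expires.
    readable, _, _ = select.select([conn], [], [], timeout_secs)
    if not readable:
        return None
    try:
        return conn.recv(bufsize)
    except ConnectionError:
        # A reset or aborted connection is treated like a disconnect.
        return b""
```

The caller can then decide what to do in each case: process the data, close the socket on a disconnect, or close it after a period of inactivity when None comes back.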
A debug on the program shows nothing hitting the
dataqueue hence no responses ... But data is sent from the POS station
(we have a simultaneous trace on it).
Assuming you're using the ancient give/take descriptor method of my
original 1997 socket tutorial (yuck) then the data queue is ONLY used
when a new connection is established. It's not used at any other time,
so you wouldn't expect to see data hit the data queue at any other time.
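For anyone not familiar with the pattern, here is a rough analogue in Python. To be clear, this is NOT the givedescriptor()/takedescriptor() API and there is no data queue in it: it assumes a Unix-like system with Python 3.9 or later, and uses socket.send_fds()/socket.recv_fds() over a Unix-domain channel as a stand-in. The function names are invented for the example. What it shows is that the hand-off happens once per connection, and after that the worker talks to the client directly:

```python
import socket


def give_descriptor(channel: socket.socket, conn: socket.socket) -> None:
    """Listener side: pass conn's descriptor to a worker over channel.

    Hypothetical stand-in for the "give" step; channel must be a
    Unix-domain socket.
    """
    socket.send_fds(channel, [b"conn"], [conn.fileno()])
    conn.close()  # the listener no longer needs its own copy


def take_descriptor(channel: socket.socket) -> socket.socket:
    """Worker side: receive one descriptor and wrap it in a socket object.

    Hypothetical stand-in for the "take" step.
    """
    _msg, fds, _flags, _addr = socket.recv_fds(channel, 16, 1)
    if not fds:
        raise RuntimeError("no descriptor received")
    return socket.socket(fileno=fds[0])
```

Once take_descriptor() has returned, nothing further travels over the channel for that connection; all subsequent traffic goes over the socket itself.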
I cannot find anything (tutorials, google or otherwise) that even
hints at this sort of problem. I want to believe it may be as simple
as a TCP/IP setting ... somewhere.
You need to take this deeper. You're only seeing surface symptoms; you
need to troubleshoot the issue further and find out the cause.
If you run debug on the listener, is it receiving the connection via
accept()? Is it writing the proper info to the data queue? Is it
calling givedescriptor() properly?
On the instance side, is it receiving the queue entry?
You might consider eliminating this clumsy process altogether (I
certainly would.) I'd either use spawn() instead of give/take, or I'd
use INETD. Or, depending on the application, I might use a handoff
server approach instead of a spawning server approach.
I wonder if keeping a static connection to your POS systems wouldn't
work better?