[Dxspider-support] Spider stall.

Mike McCarthy, W1NR lists at w1nr.net
Sun Oct 8 14:36:57 BST 2006


I have a theory...

Since Spider is a single-threaded application, any system call that blocks
will cause the entire process to appear to hang.  It is most likely a TCP
call that is hanging.  I once observed a user who appeared to be
connected but was in fact trying to reconnect with no success.  There were a
lot of TCP sockets open in the SYN state from his reconnect attempts, while
his IP address still had a socket in the connected state with a ton of
traffic queued up waiting for him.
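
To illustrate the theory, here is a minimal Python sketch (not DXSpider
code, which is Perl) of what happens when the far end stops reading.  It
uses a local socket pair as a stand-in for a TCP connection, and the
function name is made up for the example.  With the socket switched to
non-blocking mode the kernel refuses further writes once its buffers are
full; in the default blocking mode the same send() would simply not return,
which in a single-threaded program stalls everything.

```python
import socket


def bytes_queued_before_block(chunk_size=4096):
    """Write to a peer that never reads.

    Returns how many bytes the kernel accepted before a further send()
    would have blocked.  Hypothetical helper, for illustration only.
    """
    writer, reader = socket.socketpair()  # stand-in for a TCP connection
    writer.setblocking(False)             # ask for an error instead of a hang
    chunk = b"x" * chunk_size
    queued = 0
    try:
        while True:
            try:
                # send() may accept fewer bytes than offered; count what it took.
                queued += writer.send(chunk)
            except BlockingIOError:
                # Buffers are full: a blocking socket would stall right here.
                return queued
    finally:
        writer.close()
        reader.close()


if __name__ == "__main__":
    print("bytes queued before send() would block:", bytes_queued_before_block())
```

The exact byte count depends on the operating system's buffer sizes, so no
particular number should be expected.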

The only way I know of to try to "fix" this is to have each connection in a
separate process or thread from the main application.  If one does get hung,
it doesn't prevent the rest of the system from functioning.  I don't know if
that is possible in Perl.  Again, this is just a theory.  It would help to
see the output of netstat when this occurs.
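
As a rough sketch of the "one thread per connection" idea, assuming Python
rather than Perl and not reflecting how Spider is actually structured: each
client gets its own writer thread and a bounded queue, so the main loop only
ever does a non-blocking queue insert.  The class name, the queue limit and
the policy of dropping lines when a client's queue is full are all
assumptions made for the example.

```python
import queue
import socket
import threading


class ClientWriter(threading.Thread):
    """Owns one client socket; only this thread can block on that socket."""

    def __init__(self, sock, max_pending=100):
        super().__init__(daemon=True)
        self.sock = sock
        self.pending = queue.Queue(maxsize=max_pending)
        self.dropped = 0  # lines discarded because the client fell too far behind

    def offer(self, line):
        """Called from the main loop.  Never blocks."""
        try:
            self.pending.put_nowait(line)
        except queue.Full:
            self.dropped += 1  # assumed policy: drop rather than stall everyone

    def run(self):
        while True:
            line = self.pending.get()
            try:
                self.sock.sendall(line)  # may block, but only this thread
            except OSError:
                break  # connection is gone; let the thread end


def broadcast(writers, line):
    """Hand one line to every client without waiting on any of them."""
    for writer in writers:
        writer.offer(line)
```

With this arrangement a client that has stopped reading ties up only its own
thread, and broadcast() returns immediately for everyone else.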

AX25d does spawn "client" for each RF connection, and I have never seen it
hang despite numerous filled pipes from marginal links or from RF users who
switch off the radio without a proper disconnect.  Perhaps the way to fix
this is to spawn client via xinetd for each telnet connect?

I have seen this type of hang a couple of times and it always clears itself
after about 20 minutes.  Once, and only once, I had a system hang so badly
that I could not get a response from the console.  But this was a couple of
years and several versions of Linux ago and may have had nothing to do with
Spider.

Mike, W1NR

-----Original Message-----
From: dxspider-support-bounces at dxcluster.org
[mailto:dxspider-support-bounces at dxcluster.org] On Behalf Of Kjell Jarl
Sent: Sunday, October 08, 2006 8:36 AM
To: rene_olsen at post3.tele.dk; The DXSpider Support list
Subject: Re: [Dxspider-support] Spider stall.

Hi,
What OS?
I have seen this many times on Linux 2.4. I suspect that the TCP stack's
buffers fill up because of a bad TCP/IP link and Spider cannot deliver data.
73
Kjell


Rene Olsen wrote:
> Hi.
> 
> Today the spider at oz5bbs-7 decided to suddenly stall. I have seen it 
> on one other occasion, and there is absolutely no indication in the
> debug log as to what is going on.
> 
> Running spider 1.52 build 61.521 (should be latest version) and perl
> 5.6.1.
> 
> The debug log looks like this:
> 
> 1160298101^<- I SK3W-6 PC11^7060.0^IZ8DDP/P^08-Oct-2006^0904Z^DCI SA- 062^I1SCL^IR4AD^H95^~
> 1160298101^PCPROT: Duplicate Spot ignored
> 1160298102^<- I N0VD PC11^7060.0^IZ8DDP/P^08-Oct-2006^0904Z^DCI SA- 062^I1SCL^IR4AD^H94^~
> 1160298102^PCPROT: Duplicate Spot ignored
> 1160298102^<- I OZ2DXC PC11^7060.0^IZ8DDP/P^08-Oct-2006^0904Z^DCI SA- 062^I1SCL^IR4AD^H92^~
> 1160298102^PCPROT: Duplicate Spot ignored
> 1160298102^<- I DB0SUE-7 PC11^7060.0^IZ8DDP/P^08-Oct-2006^0904Z^DCI SA- 062^I1SCL^IR4AD^H4^~
> 1160298102^PCPROT: Duplicate Spot ignored
> 1160298102^-> D EI7SDX PC51^EI7SDX^OZ5BBS-7^1^
> 1160302409^DXSpider V1.52, build 61.521 started
> 
> I looked at least 200 lines back in the debug log, and don't see 
> anything that could indicate the reason.
> 
> From the log I can see that it was dead for almost 72 minutes, and
> there was nothing else to do than to kill the process and start it
> again. Couldn't log in and shut down the normal way.
> 
> Any idea as to what this can be? It is the second time that this 
> has happened within a month, I think.
> 
> Vy 73 de Rene / OZ1LQH
> 
> 
> 
> _______________________________________________
> Dxspider-support mailing list
> Dxspider-support at dxcluster.org
> http://mailman.tobit.co.uk/mailman/listinfo/dxspider-support
> 
> 







