[Dxspider-support] DXSpider stops responding to connections with 100% CPU

iz6fxs at cisarmajella.org iz6fxs at cisarmajella.org
Mon Feb 19 08:55:30 GMT 2024


Hi!

 

Please help me troubleshoot this problem. DXSpider (cluster.iz6fxs.radio)
suddenly stops responding with the "login" prompt when you connect to it. I'm
running the latest build.

When this happens the CPU is stuck at 100%:

 



 

This is the content of the log:

 

root@cluster:/spider/local_data/log# tail 2024/02.dat

1708206540^DXProt^PC92A E77AR -> 31.223.135.216 on DB0SUE-7

1708206545^DXProt^PC92A EB1FEV -> 88.10.193.136 on EA4URE-3

1708206551^DXProt^PC92A JJ3FBS -> 157.14.219.178 on JE3YEK

1708206551^DXProt^PC92A 4X6TT -> 147.235.199.58 on NX9G

1708206551^DXProt^PC92A W0FK -> 47.24.152.2 on NX9G

1708206552^DXProt^PC92A EA1AOC -> 91.196.223.124 on EA4URE-5

1708206561^DXProt^PC92A K4FTV -> 35.137.52.111 on EI7MRE

1708206562^DXProt^PC92A KI0EB -> 24.245.245.113 on W1NR

1708206567^DXProt^PC92A OK1CF -> 77.237.128.209 on EA4RCH-5

1708206568^DXProt^PC92A DF1MM -> 176.1.242.227 on S50CLX

 

And these are the last lines of the debug log:

 

root@cluster:/spider/local_data/debug# tail -20 2024/048.dat

1708206345^(*) RBN:WRITE_CACHE size: 420.774KB time to write: 21 mS

1708206368^(*) RBN: ERROR invalid prefix/callsign T9CT from WB6BEE-# on 28005.9, dumped

1708206386^(nologchan) PC61^7012.0^W4NF^17-Feb-2024^2146Z^arrl^PA2A^SR2PUT^77.171.80.188^H26^~

1708206386^(*) PCPROT: Bad Spotter PA2A, dropped

1708206386^(nologchan) PC61^7012.0^W4NF^17-Feb-2024^2146Z^arrl^PA2A^SR2PUT^77.171.80.188^H23^~

1708206386^(*) PCPROT: Bad Spotter PA2A, dropped

1708206386^(nologchan) PC61^7012.0^W4NF^17-Feb-2024^2146Z^arrl^PA2A^SR2PUT^77.171.80.188^H24^~

1708206386^(*) PCPROT: Bad Spotter PA2A, dropped

1708206386^(nologchan) PC61^7012.0^W4NF^17-Feb-2024^2146Z^arrl^PA2A^SR2PUT^77.171.80.188^H23^~

1708206386^(*) PCPROT: Bad Spotter PA2A, dropped

1708206405^(*) RBN:WRITE_CACHE size: 420.606KB time to write: 18 mS

1708206406^(*) RBN: ERROR invalid prefix/callsign 1E2OCV from JN1ILK-# on 7007.5, dumped

1708206420^(err) SK0MMR connected from 44.52.120.88

1708206420^(*) RBN: noinrush: 0, setting inrushpreventor on SK0MMR to 0

1708206465^(err) RBN: no input from SK0MMR, disconnecting

1708206465^(*) RBN:WRITE_CACHE size: 424.195KB time to write: 22 mS

1708206480^(err) SK0MMR connected from 44.52.120.88

1708206480^(*) RBN: noinrush: 0, setting inrushpreventor on SK0MMR to 0

1708206525^(err) RBN: no input from SK0MMR, disconnecting

1708206525^(*) RBN:WRITE_CACHE size: 421.602KB time to write: 20 mS
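
To make it easier to see when logging stopped, here is a minimal sketch that converts the leading field of a log line to a UTC time. It assumes that the first "^"-separated field is a Unix timestamp in seconds, which agrees with the 2146Z time carried in the PC61 lines above. The helper name parse_log_line is only illustrative and is not part of DXSpider.

```python
from datetime import datetime, timezone


def parse_log_line(line):
    """Split a '^'-separated log line into (UTC datetime, remaining text).

    Assumption: the first field is a Unix timestamp in seconds.
    """
    stamp, _, rest = line.partition("^")
    return datetime.fromtimestamp(int(stamp), tz=timezone.utc), rest


# Last entries copied from the two excerpts above.
last_log = "1708206568^DXProt^PC92A DF1MM -> 176.1.242.227 on S50CLX"
last_debug = "1708206525^(*) RBN:WRITE_CACHE size: 421.602KB time to write: 20 mS"

when_log, _ = parse_log_line(last_log)
when_debug, _ = parse_log_line(last_debug)

print(when_log.isoformat())    # 2024-02-17T21:49:28+00:00
print(when_debug.isoformat())  # 2024-02-17T21:48:45+00:00
```

Under that assumption, the last entries in both files fall within about a minute of each other on the evening of 17 February (UTC).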

 

I noticed that all the connections to the node are hung in CLOSE_WAIT
(hundreds if not more):
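
A count like this can also be taken as text rather than from a screenshot. The following is a minimal sketch, assuming a Linux host where /proc/net/tcp and /proc/net/tcp6 are readable and where the fourth column holds the hexadecimal TCP state code (08 is CLOSE_WAIT). The function names are illustrative only and have nothing to do with DXSpider itself.

```python
from collections import Counter

# TCP state codes as they appear in the "st" column of /proc/net/tcp (Linux).
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(lines):
    """Count sockets per TCP state from /proc/net/tcp-style lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip the header row and anything too short to hold a state column.
        if len(fields) < 4 or fields[0] == "sl":
            continue
        counts[TCP_STATES.get(fields[3], fields[3])] += 1
    return counts


def read_proc_tcp(paths=("/proc/net/tcp", "/proc/net/tcp6")):
    """Read the kernel's TCP socket tables, ignoring files that are missing."""
    lines = []
    for path in paths:
        try:
            with open(path) as fh:
                lines.extend(fh)
        except OSError:
            pass
    return lines


if __name__ == "__main__":
    print("CLOSE_WAIT sockets:", count_tcp_states(read_proc_tcp()).get("CLOSE_WAIT", 0))
```

This counts sockets for the whole host, not only those owned by the cluster process, so on a machine running other services the figure is an upper bound.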

 



 

I rebuilt the file users.v3j to no avail.

Restarting it with systemd does not work; you have to kill the process
manually to get it restarted.

 

Please help! Thanks,

Norm IZ6FXS

 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.jpg
Type: image/jpeg
Size: 63419 bytes
Desc: not available
URL: <https://mailman.tobit.co.uk/pipermail/dxspider-support/attachments/20240219/9290e210/attachment-0002.jpg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image003.jpg
Type: image/jpeg
Size: 17038 bytes
Desc: not available
URL: <https://mailman.tobit.co.uk/pipermail/dxspider-support/attachments/20240219/9290e210/attachment-0003.jpg>

