[Dxspider-support] New mojo version
Keith Maton
g6nhu at me.com
Sat Sep 21 17:05:43 BST 2024
Waiting for someone I know who has the problem to reconnect.
> On 21 Sep 2024, at 16:45, IZ2LSC <iz2lsc.andrea at gmail.com> wrote:
>
> Can I see a TCPDUMP of the full sequence from user login to user disconnect?
>
> The command is tcpdump host x.x.x.x -An
>
> x.x.x.x is the ip address of a user with the issue.
>
> Andrea
>
>
> On Sat 21 Sep 2024 at 16:54, Kin EA3CV <ea3cv at cronux.net> wrote:
>> Andrea, the RSTs correspond to HamClock users that are failing. That was the first thing I checked, timestamp and IP.
>> As Keith said, it started crashing after the update, but it continued to crash after the working build was restored. If going back to a backup that used to work no longer works, I don't think the problem is in dxspider.
>> The dxspider traces do not show anything abnormal except disconnections.
>> It is necessary to reproduce the behavior with a client under observation.
>>
>> Kin
>>
>>
>> From: IZ2LSC <iz2lsc.andrea at gmail.com>
>> Sent: Saturday, September 21, 2024 4:41:34 PM
>> To: Keith Maton <g6nhu at me.com>
>> CC: Kin EA3CV <ea3cv at cronux.net>; The DXSpider Support list <dxspider-support at tobit.co.uk>
>> Subject: Re: [Dxspider-support] New mojo version
>>
>> Keith,
>> from what was shared in this thread I can see the reset is received by the dxspider, so someone else has generated it.
>> Before drawing any conclusion, I want to be sure that the tcpdump that was shared is really for a user that is having the problem.
>> This is why I asked for the tcpdump for a user IP along with the dxspider debug log for the same user, captured at the same time.
>>
>> My hamclock is able to connect to your cluster without any issue and I tried several disconnect/connect sequences.
>>
>> Andrea
>>
>>
>> On Sat 21 Sep 2024 at 16:30, Keith Maton <g6nhu at me.com> wrote:
>>> So what’s the current feeling, is the disconnect coming from HamClock or the DXSpider?
>>>
>>> I don’t think we can send attachments to this list so here’s a link <https://g6nhu.co.uk/users-week.png> to the mrtg users graph.
>>>
>>> You’ll see it stops at 20:30z on Wednesday. That’s because it all went wrong when I did the update on Thursday afternoon and then a couple of hours later I restored the 536 backup that was taken the previous evening. The gap is from the time of the backup to when I restored.
>>>
>>> I’ve gone back to exactly how it was before the update. I talk to the HamClock dev daily and there are multiple different versions of HamClock all unable to connect.
>>>
>>> I simply don’t know where to go from here, especially as I built a new node on a different pi this morning and the same thing happens.
>>>
>>> 73 Keith.
>>>
>>>
>>>
>>>> On 21 Sep 2024, at 14:58, Kin EA3CV <ea3cv at cronux.net> wrote:
>>>>
>>>> Yes, there is clearly something HamClock doesn't like. I haven't looked at a HamClock user that works but the ones that fail don't terminate the socket with FIN.
>>>> I had thought about setting up a client in a container, but if you try it, let us know.
>>>>
>>>> Kin
>>>>
>>>>
>>>>
>>>> From: IZ2LSC <iz2lsc.andrea at gmail.com>
>>>> Sent: Saturday, September 21, 2024 3:23:04 PM
>>>> To: Kin <ea3cv at cronux.net>
>>>> CC: The DXSpider Support list <dxspider-support at tobit.co.uk>; Keith Maton <g6nhu at me.com>
>>>> Subject: Re: [Dxspider-support] New mojo version
>>>>
>>>> Kin,
>>>> the netstat looks fine, I can see 87 sessions established.
>>>> But from the TCP dump you attached I see a lot of RST (reset) coming from client side, not from cluster.
>>>>
>>>> Just to give you an example, this is what happens when the cluster is the one disconnecting a user (192.168.1.130 is the cluster):
>>>>
>>>> 15:15:53.503505 IP 192.168.1.130.7373 > 192.168.1.111.52076: Flags [F.], seq 4172, ack 8, win 227, options [nop,nop,TS val 2856414384 ecr 3254272443], length 0
>>>> 15:15:53.504215 IP 192.168.1.111.52076 > 192.168.1.130.7373: Flags [F.], seq 8, ack 4173, win 501, options [nop,nop,TS val 3254273942 ecr 2856414384], length 0
>>>> 15:15:53.504340 IP 192.168.1.130.7373 > 192.168.1.111.52076: Flags [.], ack 9, win 227, options [nop,nop,TS val 2856414385 ecr 3254273942], length 0
>>>>
>>>> So the cluster is the first to send the FIN.
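
The check described above (which side sends the first FIN or RST) can be sketched in Python. This is only a minimal illustration, assuming the capture was saved as text in the default tcpdump line format shown in the example. The names `LINE_RE`, `first_closer` and `SAMPLE` are hypothetical, and the sample assumes the cluster is 192.168.1.130.

```python
import re

# Default tcpdump text format for IPv4 TCP packets, e.g.
# "15:15:53.503505 IP 192.168.1.130.7373 > 192.168.1.111.52076: Flags [F.], ..."
LINE_RE = re.compile(
    r"^(?P<time>\S+) IP (?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+): Flags \[(?P<flags>[^\]]*)\]"
)


def first_closer(lines):
    """Return (time, source_ip, kind) for the first FIN or RST seen, else None."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # payload lines printed by -A do not match and are skipped
        flags = m.group("flags")
        if "R" in flags:
            return m.group("time"), m.group("src"), "RST"
        if "F" in flags:
            return m.group("time"), m.group("src"), "FIN"
    return None


# Sample shaped like the trace above (cluster assumed to be 192.168.1.130).
SAMPLE = [
    "15:15:53.503505 IP 192.168.1.130.7373 > 192.168.1.111.52076: Flags [F.], seq 4172, ack 8, win 227, length 0",
    "15:15:53.504215 IP 192.168.1.111.52076 > 192.168.1.130.7373: Flags [F.], seq 8, ack 4173, win 501, length 0",
    "15:15:53.504340 IP 192.168.1.130.7373 > 192.168.1.111.52076: Flags [.], ack 9, win 227, length 0",
]

print(first_closer(SAMPLE))  # ('15:15:53.503505', '192.168.1.130', 'FIN')
```

If the reported address is the cluster, the node closed the session. If it is the user's address, the close came from the client side or from something in between.
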
>>>>
>>>> Can you try to follow a specific flow, correlating the IP address you see in the debug log of dxspider with the ip address you find on the tcpdump?
>>>> I mean the sessions from the beginning to end.
>>>>
>>>> Meantime I'll set up a hamclock and test it with your cluster.
>>>>
>>>>
>>>> Andrea
>>>>
>>>>
>>>> On Sat 21 Sep 2024 at 14:43, Kin <ea3cv at cronux.net> wrote:
>>>>> I think it is clear that the client is being logged out:
>>>>>
>>>>>
>>>>>
>>>>> 7160 33.786132 216.189.132.128 → 192.168.1.8 TCP 72 56774 → 7300 [RST, ACK] Seq=10 Ack=8 Win=32128 Len=0 TSval=1580625328 TSecr=201034149
>>>>>
>>>>> 7168 33.896513 216.189.132.128 → 192.168.1.8 TCP 66 56774 → 7300 [RST] Seq=1 Win=0 Len=0
>>>>>
>>>>> 7169 33.896698 216.189.132.128 → 192.168.1.8 TCP 66 56774 → 7300 [RST] Seq=7 Win=0 Len=0
>>>>>
>>>>> 7170 33.906293 216.189.132.128 → 192.168.1.8 TCP 66 56774 → 7300 [RST] Seq=10 Win=0 Len=0
>>>>>
>>>>> 7178 34.340220 171.100.240.62 → 192.168.1.8 TCP 66 63285 → 7300 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
>>>>>
>>>>> 8448 39.243542 209.193.104.69 → 192.168.1.8 TCP 66 40996 → 7300 [RST] Seq=1 Win=0 Len=0
>>>>>
>>>>> 9532 50.372818 72.14.148.41 → 192.168.1.8 TCP 72 64276 → 7300 [RST, ACK] Seq=2 Ack=2 Win=251 Len=0 TSval=3068700830 TSecr=1953125033
>>>>>
>>>>> 19600 91.809075 74.132.91.47 → 192.168.1.8 TCP 72 36452 → 7300 [RST, ACK] Seq=13 Ack=8 Win=64256 Len=0 TSval=2598464619 TSecr=4079102387
>>>>>
>>>>> 19749 91.934857 74.132.91.47 → 192.168.1.8 TCP 66 36452 → 7300 [RST] Seq=10 Win=0 Len=0
>>>>>
>>>>> 19750 91.937074 74.132.91.47 → 192.168.1.8 TCP 66 36452 → 7300 [RST] Seq=13 Win=0 Len=0
>>>>>
>>>>> 21439 97.245904 67.190.210.166 → 192.168.1.8 TCP 66 57758 → 7300 [RST] Seq=1 Win=0 Len=0
>>>>>
>>>>> 23946 104.730291 86.150.197.182 → 192.168.1.8 TCP 66 49319 → 7300 [RST, ACK] Seq=1 Ack=9 Win=0 Len=0
>>>>>
>>>>> 23947 104.730291 86.150.197.182 → 192.168.1.8 TCP 66 49316 → 7300 [RST, ACK] Seq=1 Ack=2 Win=0 Len=0
>>>>>
>>>>> 24595 106.702456 74.132.91.47 → 192.168.1.8 TCP 72 58432 → 7300 [RST, ACK] Seq=13 Ack=8 Win=64256 Len=0 TSval=2598479515 TSecr=4079117277
>>>>>
>>>>> 24614 106.848106 74.132.91.47 → 192.168.1.8 TCP 66 58432 → 7300 [RST] Seq=10 Win=0 Len=0
>>>>>
>>>>> 24618 106.919363 74.132.91.47 → 192.168.1.8 TCP 66 58432 → 7300 [RST] Seq=13 Win=0 Len=0
>>>>>
>>>>> 25499 114.246740 67.190.210.166 → 192.168.1.8 TCP 66 41444 → 7300 [RST] Seq=1 Win=0 Len=0
>>>>>
>>>>> 26057 118.535648 72.14.148.41 → 192.168.1.8 TCP 72 22727 → 7300 [RST, ACK] Seq=1 Ack=9 Win=64256 Len=0 TSval=3068768993 TSecr=1953190036
>>>>>
>>>>> 27133 121.696803 74.132.91.47 → 192.168.1.8 TCP 72 33884 → 7300 [RST, ACK] Seq=13 Ack=8 Win=64256 Len=0 TSval=2598494508 TSecr=4079132270
>>>>>
>>>>> 27149 121.815184 74.132.91.47 → 192.168.1.8 TCP 66 33884 → 7300 [RST] Seq=10 Win=0 Len=0
>>>>>
>>>>> 27150 121.815249 74.132.91.47 → 192.168.1.8 TCP 66 33884 → 7300 [RST] Seq=10 Win=0 Len=0
>>>>>
>>>>> 27151 121.815250 74.132.91.47 → 192.168.1.8 TCP 66 33884 → 7300 [RST] Seq=13 Win=0 Len=0
>>>>>
>>>>> 29651 131.245565 67.190.210.166 → 192.168.1.8 TCP 66 56690 → 7300 [RST] Seq=1 Win=0 Len=0
>>>>>
>>>>> 29689 132.322988 171.100.240.62 → 192.168.1.8 TCP 66 63313 → 7300 [RST, ACK] Seq=1 Ack=9 Win=0 Len=0
>>>>>
>>>>> 29690 132.323664 171.100.240.62 → 192.168.1.8 TCP 66 63298 → 7300 [RST, ACK] Seq=1 Ack=2 Win=0 Len=0
>>>>>
>>>>> 30075 136.719069 74.132.91.47 → 192.168.1.8 TCP 72 51106 → 7300 [RST, ACK] Seq=13 Ack=8 Win=64256 Len=0 TSval=2598509531 TSecr=4079147277
>>>>>
>>>>> 30094 136.842612 74.132.91.47 → 192.168.1.8 TCP 66 51106 → 7300 [RST] Seq=10 Win=0 Len=0
>>>>>
>>>>> 30512 139.246966 74.132.91.47 → 192.168.1.8 TCP 66 46146 → 7300 [RST] Seq=1 Win=0 Len=0
>>>>>
>>>>> 30730 141.916039 72.181.212.51 → 192.168.1.8 TCP 72 52404 → 7300 [RST, ACK] Seq=10 Ack=8 Win=32128 Len=0 TSval=10881996 TSecr=4118248549
>>>>>
>>>>> 32092 148.245539 67.190.210.166 → 192.168.1.8 TCP 66 60236 → 7300 [RST] Seq=1 Win=0 Len=0
>>>>>
>>>>> 33326 151.728538 74.132.91.47 → 192.168.1.8 TCP 72 47306 → 7300 [RST, ACK] Seq=13 Ack=8 Win=64256 Len=0 TSval=2598524532 TSecr=4079162306
>>>>>
>>>>> 33340 151.867383 74.132.91.47 → 192.168.1.8 TCP 66 47306 → 7300 [RST] Seq=1 Win=0 Len=0
>>>>>
>>>>> 33341 151.867471 74.132.91.47 → 192.168.1.8 TCP 66 47306 → 7300 [RST] Seq=10 Win=0 Len=0
>>>>>
>>>>> 33342 151.868904 74.132.91.47 → 192.168.1.8 TCP 66 47306 → 7300 [RST] Seq=13 Win=0 Len=0
>>>>>
>>>>> 34141 156.245366 67.190.210.166 → 192.168.1.8 TCP 66 59968 → 7300 [RST] Seq=1 Win=0 Len=0
>>>>>
>>>>> 36145 166.704908 74.132.91.47 → 192.168.1.8 TCP 72 55558 → 7300 [RST, ACK] Seq=13 Ack=8 Win=64256 Len=0 TSval=2598539512 TSecr=4079177284
>>>>>
>>>>> 36150 166.844112 74.132.91.47 → 192.168.1.8 TCP 66 55558 → 7300 [RST] Seq=1 Win=0 Len=0
>>>>>
>>>>> 36151 166.844112 74.132.91.47 → 192.168.1.8 TCP 66 55558 → 7300 [RST] Seq=13 Win=0 Len=0
>>>>>
>>>>> 37488 173.246799 67.190.210.166 → 192.168.1.8 TCP 66 55428 → 7300 [RST] Seq=1 Win=0 Len=0
>>>>>
>>>>> 37877 176.782641 72.14.148.41 → 192.168.1.8 TCP 72 47454 → 7300 [RST, ACK] Seq=1 Ack=9 Win=64256 Len=0 TSval=3068827240 TSecr=1953255335
>>>>>
>>>>> 38468 182.245044 212.251.236.77 → 192.168.1.8 TCP 66 25610 → 7300 [RST] Seq=1 Win=0 Len=0
>>>>>
>>>>> 40367 190.261692 67.190.210.166 → 192.168.1.8 TCP 66 41508 → 7300 [RST] Seq=1 Win=0 Len=0
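
To see which clients account for the resets in a listing like this one, the rows can be tallied per source address. The sketch below is only illustrative and assumes tshark-style one-line summaries in the column layout shown above (frame number, time, source, arrow, destination, protocol, length, ports, flags). `ROW_RE`, `rst_summary` and `SAMPLE` are hypothetical names.

```python
import re
from collections import Counter

# One summary row, e.g.
# "7160 33.786132 216.189.132.128 → 192.168.1.8 TCP 72 56774 → 7300 [RST, ACK] Seq=10 ..."
# The arrow is matched as any non-space token so the exact glyph does not matter.
ROW_RE = re.compile(
    r"^\s*(?P<no>\d+)\s+(?P<time>\d+\.\d+)\s+(?P<src>\S+)\s+\S+\s+(?P<dst>\S+)"
    r"\s+TCP\s+\d+\s+(?P<sport>\d+)\s+\S+\s+(?P<dport>\d+)\s+\[(?P<flags>[^\]]+)\]"
)


def rst_summary(lines):
    """Count RST packets per source IP and collect the source ports involved."""
    per_ip = Counter()
    ports = {}
    for line in lines:
        m = ROW_RE.match(line)
        if not m:
            continue
        flags = [f.strip() for f in m.group("flags").split(",")]
        if "RST" not in flags:
            continue
        src = m.group("src")
        per_ip[src] += 1
        # Each distinct source port is a separate TCP session from that client.
        ports.setdefault(src, set()).add(int(m.group("sport")))
    return per_ip, ports


SAMPLE = [
    "7160 33.786132 216.189.132.128 → 192.168.1.8 TCP 72 56774 → 7300 [RST, ACK] Seq=10 Ack=8 Win=32128 Len=0",
    "7168 33.896513 216.189.132.128 → 192.168.1.8 TCP 66 56774 → 7300 [RST] Seq=1 Win=0 Len=0",
    "19749 91.934857 74.132.91.47 → 192.168.1.8 TCP 66 36452 → 7300 [RST] Seq=10 Win=0 Len=0",
    "24614 106.848106 74.132.91.47 → 192.168.1.8 TCP 66 58432 → 7300 [RST] Seq=10 Win=0 Len=0",
]

counts, sessions = rst_summary(SAMPLE)
for ip, n in counts.most_common():
    print(ip, n, "RSTs across", len(sessions[ip]), "session(s)")
```

A client that shows many distinct source ports at regular intervals is reconnecting and being reset each time, which makes it a good candidate for a full per-host capture.
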
>>>>>
>>>>>
>>>>>
>>>>> Kin EA3CV
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> From: IZ2LSC <iz2lsc.andrea at gmail.com>
>>>>> Sent: Saturday, 21 September 2024 13:10
>>>>> To: The DXSpider Support list <dxspider-support at tobit.co.uk>
>>>>> CC: Kin <ea3cv at cronux.net>; Keith Maton <g6nhu at me.com>
>>>>> Subject: Re: [Dxspider-support] New mojo version
>>>>>
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> Any change on the router that is doing the port forward?
>>>>>
>>>>> Maybe there is DDoS protection on it that kicks in.
>>>>>
>>>>>
>>>>>
>>>>> Are we sure that the disconnect is coming from dxspider and not from the router?
>>>>>
>>>>>
>>>>>
>>>>> I think we have to take a tcpdump and look at the TCP flow to understand where the TCP RST or FIN is coming from.
>>>>>
>>>>>
>>>>>
>>>>> If you need help taking the tcpdump we can set up a call with screen sharing and I can guide you.
>>>>>
>>>>>
>>>>>
>>>>> 73
>>>>>
>>>>> Andrea, iz2lsc
>>>>>
>>>>>
>>>>> On Sat 21 Sep 2024 at 13:01, Kin via Dxspider-support <dxspider-support at tobit.co.uk> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I have been trying to help Keith with his problem, and after analysing
>>>>> everything I can think of, I can't see the reason for the disconnection with
>>>>> the traces we have.
>>>>>
>>>>> This is basically what is happening to him:
>>>>>
>>>>> 1726911492^(connect) ExtMsg accept 165:192.168.1.208 from 68.117.200.55:58828
>>>>> 1726911492^(connect) ExtMsg connect 165: login:
>>>>> 1726911492^(connect) connect 165: timeout set to 60
>>>>> 1726911492^(connect) connect 165: AE5DW
>>>>> 1726911492^(state) AE5DW channel func state 0 -> prompt
>>>>> 1726911492^(DXCommand) AE5DW connected from 68.117.200.55 cols 80
>>>>> 1726911492^(progress) CMD: 'unset/beep ' by AE5DW ip: 68.117.200.55 0mS
>>>>> 1726911492^(progress) CMD: 'show/cluster ' by AE5DW ip: 68.117.200.55 0mS
>>>>> 1726911492^(DXCommand) AE5DW disconnected
>>>>>
>>>>> But with the rest of the users it is not failing.
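
Sessions like the one above, where a user connects and is disconnected within the same second, can be picked out of the debug log automatically. The following is a rough sketch that assumes every log line follows the `epoch^(category) message` shape seen in this excerpt. Real DXSpider debug files may contain other line shapes, and the names `short_sessions` and `SAMPLE` are hypothetical.

```python
import re

# Only the two DXCommand lines from the excerpt are needed:
#   "1726911492^(DXCommand) AE5DW connected from 68.117.200.55 cols 80"
#   "1726911492^(DXCommand) AE5DW disconnected"
CONNECTED_RE = re.compile(r"^(\d+)\^\(DXCommand\) (\S+) connected from (\S+)")
DISCONNECTED_RE = re.compile(r"^(\d+)\^\(DXCommand\) (\S+) disconnected")


def short_sessions(lines, max_seconds=1):
    """Return (call, ip, start_epoch, duration) for sessions lasting <= max_seconds."""
    open_sessions = {}  # callsign -> (start epoch, client ip)
    found = []
    for line in lines:
        line = line.strip()
        m = CONNECTED_RE.match(line)
        if m:
            open_sessions[m.group(2)] = (int(m.group(1)), m.group(3))
            continue
        m = DISCONNECTED_RE.match(line)
        if m and m.group(2) in open_sessions:
            start, ip = open_sessions.pop(m.group(2))
            duration = int(m.group(1)) - start
            if duration <= max_seconds:
                found.append((m.group(2), ip, start, duration))
    return found


SAMPLE = [
    "1726911492^(connect) connect 165: AE5DW",
    "1726911492^(state) AE5DW channel func state 0 -> prompt",
    "1726911492^(DXCommand) AE5DW connected from 68.117.200.55 cols 80",
    "1726911492^(progress) CMD: 'unset/beep ' by AE5DW ip: 68.117.200.55 0mS",
    "1726911492^(progress) CMD: 'show/cluster ' by AE5DW ip: 68.117.200.55 0mS",
    "1726911492^(DXCommand) AE5DW disconnected",
]

print(short_sessions(SAMPLE))  # [('AE5DW', '68.117.200.55', 1726911492, 0)]
```

The IP address reported for each short session is the one to use as the host filter when capturing that user's traffic, so the log and the packet trace can be lined up by time.
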
>>>>>
>>>>> Kin EA3CV
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Dxspider-support <dxspider-support-bounces at tobit.co.uk> On behalf of Keith Maton via Dxspider-support
>>>>> Sent: Saturday, 21 September 2024 12:30
>>>>> To: The DXSpider Support list <dxspider-support at tobit.co.uk>
>>>>> CC: Keith Maton <g6nhu at me.com>
>>>>> Subject: Re: [Dxspider-support] New mojo version
>>>>>
>>>>> This morning I took a fresh Pi, a new SSD and built a new node from scratch.
>>>>> I copied over the user file and imported it. I also copied the spots
>>>>> directory so no history would be lost and the filters directory so my users
>>>>> would still have their filters.
>>>>>
>>>>> I also copied my startup file, my connect scripts and my crontab.
>>>>>
>>>>> I hashed out pretty much everything in the crontab. I started the node,
>>>>> disconnected some links from the old one and manually started them on the
>>>>> new one to confirm I could connect and get data in.
>>>>>
>>>>> Then I stopped the old node and changed the port forwarding in my router to
>>>>> the new one.
>>>>>
>>>>> It’s no different. I’m still getting exactly the same thing. Some (but not
>>>>> all) HamClocks are connecting and then immediately being disconnected before
>>>>> they can send any commands. I’m 99.9% sure the disconnect is coming from
>>>>> the dxspider and not the HamClock because HamClock tracks whether the
>>>>> disconnect is coming from local or remote.
>>>>>
>>>>> There’s no pattern to this, it doesn’t seem to be HamClock version specific
>>>>> as I sent a sample to the developer who checked and saw multiple different
>>>>> versions.
>>>>>
>>>>> The HamClock connects
>>>>> I see the connection in the debug log and then immediately, after two
>>>>> commands are forced by the node (unset/beep and show/cluster), the node
>>>>> disconnects.
>>>>> This repeats ten times then the HamClock stops connecting for one hour
>>>>> because it’s reached its hard limit of ten disconnects/hour. It only tracks
>>>>> remote disconnections towards this limit.
>>>>>
>>>>> But the crazy and unexplained thing is that when I reverted back to build
>>>>> 536 by restoring a backup, the same thing is still happening. Nothing has
>>>>> changed on my network as the connections are still making it to the node.
>>>>>
>>>>> I’m really lost here. I feel bad because there are well over 200 people who
>>>>> won’t have been able to connect since Thursday afternoon. They’ve probably
>>>>> gone over to other nodes, which is fine but it doesn’t resolve the problem
>>>>> I’ve got here and what could happen to me could happen to anyone. I’ve
>>>>> gone out of my way recently to push my node as the best for HamClocks
>>>>> (because I know a lot of sysops weren’t happy with it) and now it’s utterly
>>>>> rubbish for them.
>>>>>
>>>>> I owe it to my users to try and resolve this but at the moment, I feel as
>>>>> though after eight years of running a node (which I appreciate is a lot less
>>>>> than many), I just want to switch the damn thing off. I’m not going to,
>>>>> because I don’t like things to beat me but it’s very, very frustrating.
>>>>>
>>>>> 73 Keith
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> > On 21 Sep 2024, at 04:25, Rene Olsen via Dxspider-support <dxspider-support at tobit.co.uk> wrote:
>>>>> >
>>>>> > Hi.
>>>>> >
>>>>> > Still waiting for a reply as to why G6NHU-2 lost like 75% of his
>>>>> > users before I do anything with the new version.
>>>>> >
>>>>> > So, will at least wait until next week. Like W1NR, I never update just
>>>>> > before or during a weekend.
>>>>> >
>>>>> > Vy 73 de René / OZ1LQH
>>>>> >
>>>>> > On 20 Sep 2024 at 17:44, Kin via Dxspider-support wrote:
>>>>> >
>>>>> >> Hi,
>>>>> >>
>>>>> >> The new build is working very well for me.
>>>>> >> Only 60 out of 318 dxspider have been updated.
>>>>> >> Cheer up, it's been in testing for a while and it's stable.
>>>>> >>
>>>>> >> 73 de Kin EA3CV
>>>>> >>
>>>>> >>
>>>>> >> From: Dxspider-support <dxspider-support-bounces at tobit.co.uk> On behalf of Dirk Koopman via Dxspider-support
>>>>> >> Sent: Thursday, 19 September 2024 15:24
>>>>> >> To: Dxspider-Support <dxspider-support at dxcluster.org>
>>>>> >> CC: Dirk Koopman <djk at tobit.co.uk>
>>>>> >> Subject: [Dxspider-support] New mojo version
>>>>> >>
>>>>> >> There is a new mojo version which has been under test by a few brave
>>>>> sysops and they have determined that it is stable. Please look at the
>>>>> Changes file for the list of issues dealt with.
>>>>> >>
>>>>> >> One of the issues that has become apparent is the random lock status
>>>>> (historically) granted to new nodes that appear on the network. For some
>>>>> reason they are defaulting to "unlocked". I don't understand why this has
>>>>> suddenly become a problem AGAIN, but it does seem to affect longer running
>>>>> nodes more than newer ones.
>>>>> >>
>>>>> >> This release is an attempt to fix this. It will lock all nodes that are
>>>>> not specifically unlocked via explicit unset/lock or set/spider type
>>>>> commands. Unfortunately, previous attempts to deal with this may have got
>>>>> this all confused and it *MAY* (and I stress this) mean that a (very) few of
>>>>> your older node partners *MIGHT* get locked out. If this happens then simply
>>>>> unset/lock or set/spider any of these nodes manually.
>>>>> >>
>>>>> >> There is new spot deduping code which seems to reduce the number of
>>>>> dupes, but I have not been able to reproduce the problem, so testing has
>>>>> gone no further than making sure that nodes that issue multiple dupe spots
>>>>> with the same sequence number don't cause dupes.
>>>>> >>
>>>>> >> 73 Dirk G1TLH
>>>>> >>
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > _______________________________________________
>>>>> > Dxspider-support mailing list
>>>>> > Dxspider-support at tobit.co.uk
>>>>> > https://mailman.tobit.co.uk/mailman/listinfo/dxspider-support
>>>>>
>>>>>
>>>>
>>>
>>