[Dxspider-support] New mojo version

Dirk Koopman djk at tobit.co.uk
Thu Sep 26 13:37:32 BST 2024


I have been away from computers rather a lot recently, but my 
(failing) memory suggests that this sort of thing has happened before, 
and I enclose some comments from the code that *may* shed some light on 
what is going on with HamClock. This code has not been altered since May 
2020. The point of it is to discourage software from hammering away at a 
node, filling the logs with never-ending connect requests.


$bumpexisting = 1;                  # 1 = allow new connection to disconnect old, 0 - don't allow it
our $allowmultiple = 0;             # This is used in conjunction with $bumpexisting, in a rather weird way.
our $min_reconnection_rate = 5*60;  # minimum value of seconds between connections per user to allow co-existing users
our $max_ssid = 15;                 # highest ssid to be searched for a spare one on multiple connections

# If $allowmultiple > 0 and the $reconnection_rate is some value of seconds
# based on the average connection time calculated from the $user->conntimel entries / frequency is
# less than $reconnection_rate then we assume that there is more than one device (probably HRD) trying
# to connect "at once". In which case we probe for a spare SSID for a user callsign to allow up to
# $allowmultiple connections per callsign.

and

$maxconnect_user = 3;            # the maximum no of concurrent connections a user can have at a time
$maxconnect_node = 0;            # Ditto but for nodes. In either case if a new incoming connection
                                 # takes the no of references in the routing table above these numbers
                                 # then the connection is refused. This only affects INCOMING connections.


This is to prevent people trying to break the (default 3) limit of 
connections a user can have to the (global) cluster. Unfortunately, I 
won't give DXSpider privileges to install "fail2ban" style iptables 
block commands for an IP address that does this. But, having written 
that, I could put together an actual "fail2ban" recipe and the necessary 
debugging so that it can do this for us. Alternatively, you could fiddle 
with $allowmultiple and $min_reconnection_rate to see whether that affects 
anything.
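
To make the interaction between these settings easier to reason about, here is a rough Python sketch of the policy as the code comments above describe it. It is not DXSpider's code (which is Perl): the function names, the arguments, the "accept"/"bump"/"refuse" labels and the assumption that the callsign is passed in without an SSID are all made up for illustration, and the real logic may well differ in detail.

```python
from statistics import mean

BUMP_EXISTING = 1               # mirrors $bumpexisting
ALLOW_MULTIPLE = 0              # mirrors $allowmultiple
MIN_RECONNECTION_RATE = 5 * 60  # mirrors $min_reconnection_rate (seconds)
MAX_SSID = 15                   # mirrors $max_ssid


def average_interval(conn_times):
    """Mean gap in seconds between successive connection timestamps."""
    gaps = [b - a for a, b in zip(conn_times, conn_times[1:])]
    return mean(gaps) if gaps else None


def pick_callsign(call, connected, conn_times,
                  allow_multiple=ALLOW_MULTIPLE,
                  bump_existing=BUMP_EXISTING):
    """Return (action, callsign) for a new incoming connection.

    call       -- base callsign without SSID (assumption of this sketch)
    connected  -- set of callsigns currently connected
    conn_times -- recent connection timestamps for this user, oldest first
    """
    if call not in connected:
        return ("accept", call)

    avg = average_interval(conn_times)
    rapid = avg is not None and avg < MIN_RECONNECTION_RATE
    in_use = sum(1 for c in connected
                 if c == call or c.startswith(call + "-"))

    if allow_multiple > 0 and rapid and in_use < allow_multiple:
        # Several devices seem to be connecting "at once":
        # probe for a spare SSID instead of disturbing the old session.
        for ssid in range(1, MAX_SSID + 1):
            candidate = f"{call}-{ssid}"
            if candidate not in connected:
                return ("accept", candidate)

    if bump_existing:
        return ("bump", call)    # new connection disconnects the old one
    return ("refuse", call)
```

With the defaults shown ($allowmultiple = 0), a second connection for the same callsign always takes the "bump" path in this sketch, which is the situation worth looking for in the logs.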

If this is a "bump existing" situation, then you should see evidence in 
both the debug and log files. A "show/log bumped" may show 
something. A "stat/user <affected callsign>" may also show some clues.
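
If it helps when trawling the debug file, the following Python sketch picks out sessions that end almost as soon as they start. It assumes the debug file uses the `epoch^(category) message` layout seen in the excerpt quoted further down this thread; the function name `short_sessions` and the two-second threshold are arbitrary choices for illustration, not anything that exists in DXSpider.

```python
import re

# Matches e.g. "1726911492^(DXCommand) AE5DW connected from 68.117.200.55 cols 80"
LINE = re.compile(r"^(\d+)\^\(DXCommand\) (\S+) (connected|disconnected)\b")


def short_sessions(lines, max_seconds=2):
    """Yield (callsign, start, duration) for sessions of max_seconds or less."""
    started = {}
    for line in lines:
        m = LINE.match(line.strip())
        if not m:
            continue
        stamp, call, event = int(m.group(1)), m.group(2), m.group(3)
        if event == "connected":
            started[call] = stamp
        elif call in started:
            begin = started.pop(call)
            if stamp - begin <= max_seconds:
                yield call, begin, stamp - begin


sample = [
    "1726911492^(connect) connect 165: AE5DW",
    "1726911492^(DXCommand) AE5DW connected from 68.117.200.55 cols 80",
    "1726911492^(progress) CMD: 'show/cluster ' by AE5DW ip: 68.117.200.55 0mS",
    "1726911492^(DXCommand) AE5DW disconnected",
]
print(list(short_sessions(sample)))  # [('AE5DW', 1726911492, 0)]
```

A callsign that turns up here repeatedly is a good candidate for the "stat/user" check.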

Also, Keith, if you would send me, privately, some login credentials by 
email, I will have a look over the next couple of days.

73 Dirk G1TLH

On 21/09/2024 15:41, IZ2LSC via Dxspider-support wrote:
> Keith,
> from what was shared in this thread I can see the reset is received by 
> the dxspider, so someone else has generated it.
> Before jumping to any conclusion, I want to be sure that the tcpdump 
> that was shared is really about a user that is having the problem.
> This is why I asked to get the tcpdump for a user IP along with the 
> dxspider debug log for the same user captured at the same time.
>
> My hamclock is able to connect to your cluster without any issue and I 
> tried several disconnect/connect sequences.
>
> Andrea
>
> On Sat 21 Sep 2024 at 16:30, Keith Maton <g6nhu at me.com> wrote:
>
>     So what’s the current feeling, is the disconnect coming from
>     HamClock or the DXSpider?
>
>     I don’t think we can send attachments to this list so here’s a
>     link <https://g6nhu.co.uk/users-week.png> to the mrtg users graph.
>
>     You’ll see it stops at 20:30z on Wednesday.  That’s because it all
>     went wrong when I did the update on Thursday afternoon and then a
>     couple of hours later I restored the 536 backup that was taken the
>     previous evening.  The gap is from the time of the backup to when
>     I restored.
>
>     I’ve gone back to exactly how it was before the update.   I talk
>     to the HamClock dev daily and there are multiple different
>     versions of HamClock all unable to connect.
>
>     I simply don’t know where to go from here, especially as I built a
>     new node on a different pi this morning and the same thing happens.
>
>     73 Keith.
>
>
>
>>     On 21 Sep 2024, at 14:58, Kin EA3CV <ea3cv at cronux.net> wrote:
>>
>>     Yes, there is clearly something HamClock doesn't like. I haven't
>>     looked at a HamClock user that works but the ones that fail don't
>>     terminate the socket with FIN.
>>     I had thought about setting up a client in a container, but if
>>     you try it, let us know.
>>
>>     Kin
>>
>>
>>
>>     ------------------------------------------------------------------------
>>     *From:* IZ2LSC <iz2lsc.andrea at gmail.com>
>>     *Sent:* Saturday, September 21, 2024 3:23:04 PM
>>     *To:* Kin <ea3cv at cronux.net>
>>     *Cc:* The DXSpider Support list <dxspider-support at tobit.co.uk>;
>>     Keith Maton <g6nhu at me.com>
>>     *Subject:* Re: [Dxspider-support] New mojo version
>>
>>     Kin,
>>     the netstat looks fine, I can see 87 sessions established.
>>     But from the TCP dump you attached I see a lot of RST (reset) 
>>     coming from client side, not from cluster.
>>
>>     Just to give you an example, this is what happens when the
>>     cluster disconnects a user (192.168.1.130 is the cluster):
>>
>>     15:15:53.503505 IP 192.168.1.130.7373 > 192.168.1.111.52076: Flags
>>     [F.], seq 4172, ack 8, win 227, options [nop,nop,TS val
>>     2856414384 ecr 3254272443], length 0
>>     15:15:53.504215 IP 192.168.1.111.52076 > 192.168.1.130.7373:
>>     Flags [F.], seq 8, ack 4173, win 501, options [nop,nop,TS val
>>     3254273942 ecr 2856414384], length 0
>>     15:15:53.504340 IP 192.168.1.130.7373 > 192.168.1.111.52076:
>>     Flags [.], ack 9, win 227, options [nop,nop,TS val 2856414385 ecr
>>     3254273942], length 0
>>
>>     So the cluster is the first to send the FIN.
>>
>>     Can you try to follow a specific flow, correlating the IP address
>>     you see in the debug log of dxspider with the ip address you find
>>     on the tcpdump?
>>     I mean the sessions from the beginning to end.
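
As a rough aid for following a flow this way, here is a Python sketch that reads tcpdump text lines (one packet per line, in the format shown above) and reports which endpoint sent the first FIN or RST on each flow. The function name `first_closer` and the shortened sample lines are illustrative assumptions; it only understands plain IPv4 "IP a.b.c.d.port > a.b.c.d.port: Flags [...]" lines and ignores everything else.

```python
import re

# tcpdump prints addresses as ip.port, e.g. 192.168.1.130.7373
PKT = re.compile(
    r"^(\S+) IP (\d+\.\d+\.\d+\.\d+)\.(\d+) > "
    r"(\d+\.\d+\.\d+\.\d+)\.(\d+): Flags \[([^\]]*)\]"
)


def first_closer(lines):
    """Map each TCP flow to the (ip, port, flag) that first sent FIN or RST."""
    closers = {}
    for line in lines:
        m = PKT.match(line.strip())
        if not m:
            continue
        _, src, sport, dst, dport, flags = m.groups()
        if "F" in flags:
            close = "FIN"
        elif "R" in flags:
            close = "RST"
        else:
            continue
        # Same key for both directions of the conversation.
        flow = frozenset([(src, int(sport)), (dst, int(dport))])
        closers.setdefault(flow, (src, int(sport), close))
    return closers


sample = [
    "15:15:53.503505 IP 192.168.1.130.7373 > 192.168.1.111.52076: Flags [F.], seq 4172, ack 8, win 227, length 0",
    "15:15:53.504215 IP 192.168.1.111.52076 > 192.168.1.130.7373: Flags [F.], seq 8, ack 4173, win 501, length 0",
    "15:15:53.504340 IP 192.168.1.130.7373 > 192.168.1.111.52076: Flags [.], ack 9, win 227, length 0",
]
for ip, port, flag in first_closer(sample).values():
    print(f"{ip}:{port} sent {flag} first")
```

Matching the reported IP against the one in the dxspider debug log for the same second shows which side closed that particular session.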
>>
>>     Meantime I'll set up a hamclock and test it with your cluster.
>>
>>
>>     Andrea
>>
>>     On Sat 21 Sep 2024 at 14:43, Kin <ea3cv at cronux.net> wrote:
>>
>>         I think it is clear that the client is being logged out:
>>
>>         7160 33.786132 216.189.132.128 → 192.168.1.8 TCP 72 56774 → 7300 [RST, ACK] Seq=10 Ack=8 Win=32128 Len=0 TSval=1580625328 TSecr=201034149
>>         7168 33.896513 216.189.132.128 → 192.168.1.8 TCP 66 56774 → 7300 [RST] Seq=1 Win=0 Len=0
>>         7169 33.896698 216.189.132.128 → 192.168.1.8 TCP 66 56774 → 7300 [RST] Seq=7 Win=0 Len=0
>>         7170 33.906293 216.189.132.128 → 192.168.1.8 TCP 66 56774 → 7300 [RST] Seq=10 Win=0 Len=0
>>         7178 34.340220 171.100.240.62 → 192.168.1.8 TCP 66 63285 → 7300 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
>>         8448 39.243542 209.193.104.69 → 192.168.1.8 TCP 66 40996 → 7300 [RST] Seq=1 Win=0 Len=0
>>         9532 50.372818 72.14.148.41 → 192.168.1.8 TCP 72 64276 → 7300 [RST, ACK] Seq=2 Ack=2 Win=251 Len=0 TSval=3068700830 TSecr=1953125033
>>         19600 91.809075 74.132.91.47 → 192.168.1.8 TCP 72 36452 → 7300 [RST, ACK] Seq=13 Ack=8 Win=64256 Len=0 TSval=2598464619 TSecr=4079102387
>>         19749 91.934857 74.132.91.47 → 192.168.1.8 TCP 66 36452 → 7300 [RST] Seq=10 Win=0 Len=0
>>         19750 91.937074 74.132.91.47 → 192.168.1.8 TCP 66 36452 → 7300 [RST] Seq=13 Win=0 Len=0
>>         21439 97.245904 67.190.210.166 → 192.168.1.8 TCP 66 57758 → 7300 [RST] Seq=1 Win=0 Len=0
>>         23946 104.730291 86.150.197.182 → 192.168.1.8 TCP 66 49319 → 7300 [RST, ACK] Seq=1 Ack=9 Win=0 Len=0
>>         23947 104.730291 86.150.197.182 → 192.168.1.8 TCP 66 49316 → 7300 [RST, ACK] Seq=1 Ack=2 Win=0 Len=0
>>         24595 106.702456 74.132.91.47 → 192.168.1.8 TCP 72 58432 → 7300 [RST, ACK] Seq=13 Ack=8 Win=64256 Len=0 TSval=2598479515 TSecr=4079117277
>>         24614 106.848106 74.132.91.47 → 192.168.1.8 TCP 66 58432 → 7300 [RST] Seq=10 Win=0 Len=0
>>         24618 106.919363 74.132.91.47 → 192.168.1.8 TCP 66 58432 → 7300 [RST] Seq=13 Win=0 Len=0
>>         25499 114.246740 67.190.210.166 → 192.168.1.8 TCP 66 41444 → 7300 [RST] Seq=1 Win=0 Len=0
>>         26057 118.535648 72.14.148.41 → 192.168.1.8 TCP 72 22727 → 7300 [RST, ACK] Seq=1 Ack=9 Win=64256 Len=0 TSval=3068768993 TSecr=1953190036
>>         27133 121.696803 74.132.91.47 → 192.168.1.8 TCP 72 33884 → 7300 [RST, ACK] Seq=13 Ack=8 Win=64256 Len=0 TSval=2598494508 TSecr=4079132270
>>         27149 121.815184 74.132.91.47 → 192.168.1.8 TCP 66 33884 → 7300 [RST] Seq=10 Win=0 Len=0
>>         27150 121.815249 74.132.91.47 → 192.168.1.8 TCP 66 33884 → 7300 [RST] Seq=10 Win=0 Len=0
>>         27151 121.815250 74.132.91.47 → 192.168.1.8 TCP 66 33884 → 7300 [RST] Seq=13 Win=0 Len=0
>>         29651 131.245565 67.190.210.166 → 192.168.1.8 TCP 66 56690 → 7300 [RST] Seq=1 Win=0 Len=0
>>         29689 132.322988 171.100.240.62 → 192.168.1.8 TCP 66 63313 → 7300 [RST, ACK] Seq=1 Ack=9 Win=0 Len=0
>>         29690 132.323664 171.100.240.62 → 192.168.1.8 TCP 66 63298 → 7300 [RST, ACK] Seq=1 Ack=2 Win=0 Len=0
>>         30075 136.719069 74.132.91.47 → 192.168.1.8 TCP 72 51106 → 7300 [RST, ACK] Seq=13 Ack=8 Win=64256 Len=0 TSval=2598509531 TSecr=4079147277
>>         30094 136.842612 74.132.91.47 → 192.168.1.8 TCP 66 51106 → 7300 [RST] Seq=10 Win=0 Len=0
>>         30512 139.246966 74.132.91.47 → 192.168.1.8 TCP 66 46146 → 7300 [RST] Seq=1 Win=0 Len=0
>>         30730 141.916039 72.181.212.51 → 192.168.1.8 TCP 72 52404 → 7300 [RST, ACK] Seq=10 Ack=8 Win=32128 Len=0 TSval=10881996 TSecr=4118248549
>>         32092 148.245539 67.190.210.166 → 192.168.1.8 TCP 66 60236 → 7300 [RST] Seq=1 Win=0 Len=0
>>         33326 151.728538 74.132.91.47 → 192.168.1.8 TCP 72 47306 → 7300 [RST, ACK] Seq=13 Ack=8 Win=64256 Len=0 TSval=2598524532 TSecr=4079162306
>>         33340 151.867383 74.132.91.47 → 192.168.1.8 TCP 66 47306 → 7300 [RST] Seq=1 Win=0 Len=0
>>         33341 151.867471 74.132.91.47 → 192.168.1.8 TCP 66 47306 → 7300 [RST] Seq=10 Win=0 Len=0
>>         33342 151.868904 74.132.91.47 → 192.168.1.8 TCP 66 47306 → 7300 [RST] Seq=13 Win=0 Len=0
>>         34141 156.245366 67.190.210.166 → 192.168.1.8 TCP 66 59968 → 7300 [RST] Seq=1 Win=0 Len=0
>>         36145 166.704908 74.132.91.47 → 192.168.1.8 TCP 72 55558 → 7300 [RST, ACK] Seq=13 Ack=8 Win=64256 Len=0 TSval=2598539512 TSecr=4079177284
>>         36150 166.844112 74.132.91.47 → 192.168.1.8 TCP 66 55558 → 7300 [RST] Seq=1 Win=0 Len=0
>>         36151 166.844112 74.132.91.47 → 192.168.1.8 TCP 66 55558 → 7300 [RST] Seq=13 Win=0 Len=0
>>         37488 173.246799 67.190.210.166 → 192.168.1.8 TCP 66 55428 → 7300 [RST] Seq=1 Win=0 Len=0
>>         37877 176.782641 72.14.148.41 → 192.168.1.8 TCP 72 47454 → 7300 [RST, ACK] Seq=1 Ack=9 Win=64256 Len=0 TSval=3068827240 TSecr=1953255335
>>         38468 182.245044 212.251.236.77 → 192.168.1.8 TCP 66 25610 → 7300 [RST] Seq=1 Win=0 Len=0
>>         40367 190.261692 67.190.210.166 → 192.168.1.8 TCP 66 41508 → 7300 [RST] Seq=1 Win=0 Len=0
>>
>>         Kin EA3CV
>>
>>         *From:* IZ2LSC <iz2lsc.andrea at gmail.com>
>>         *Sent:* Saturday, 21 September 2024 13:10
>>         *To:* The DXSpider Support list <dxspider-support at tobit.co.uk>
>>         *Cc:* Kin <ea3cv at cronux.net>; Keith Maton <g6nhu at me.com>
>>         *Subject:* Re: [Dxspider-support] New mojo version
>>
>>         Hi,
>>
>>         Any change on the router that is doing the port forward?
>>
>>         Maybe there is DDoS protection on it that kicks in.
>>
>>         Are we sure that the disconnect is coming from dxspider and
>>         not from the router?
>>
>>         I think we have to take a tcpdump and look at the TCP flow to
>>         understand where the TCP RST or FIN is coming from.
>>
>>         If you need help taking the tcpdump we can set up a call with
>>         screen sharing and I can guide you.
>>
>>         73
>>
>>         Andrea, iz2lsc
>>
>>         On Sat 21 Sep 2024 at 13:01, Kin via Dxspider-support
>>         <dxspider-support at tobit.co.uk> wrote:
>>
>>             Hi,
>>
>>             I have been trying to help Keith with his problem, and
>>             after analysing
>>             everything I can think of, I can't see the reason for the
>>             disconnection with
>>             the traces we have.
>>
>>             This is basically what is happening to him:
>>
>>             1726911492^(connect) ExtMsg accept 165:192.168.1.208 from 68.117.200.55:58828
>>             1726911492^(connect) ExtMsg connect 165: login:
>>             1726911492^(connect) connect 165: timeout set to 60
>>             1726911492^(connect) connect 165: AE5DW
>>             1726911492^(state) AE5DW channel func  state 0 -> prompt
>>             1726911492^(DXCommand) AE5DW connected from 68.117.200.55 cols 80
>>             1726911492^(progress) CMD: 'unset/beep ' by AE5DW ip: 68.117.200.55 0mS
>>             1726911492^(progress) CMD: 'show/cluster ' by AE5DW ip: 68.117.200.55 0mS
>>             1726911492^(DXCommand) AE5DW disconnected
>>
>>             But with the rest of the users it is not failing.
>>
>>             Kin EA3CV
>>
>>
>>             -----Original Message-----
>>             From: Dxspider-support
>>             <dxspider-support-bounces at tobit.co.uk> On behalf of
>>             Keith Maton via Dxspider-support
>>             Sent: Saturday, 21 September 2024 12:30
>>             To: The DXSpider Support list
>>             <dxspider-support at tobit.co.uk>
>>             Cc: Keith Maton <g6nhu at me.com>
>>             Subject: Re: [Dxspider-support] New mojo version
>>
>>             This morning I took a fresh Pi, a new SSD and built a new
>>             node from scratch.
>>             I copied over the user file and imported it.  I also
>>             copied the spots
>>             directory so no history would be lost and the filters
>>             directory so my users
>>             would still have their filters.
>>
>>             I also copied my startup file, my connect scripts and my
>>             crontab.
>>
>>             I hashed out pretty much everything in the crontab.  I
>>             started the node,
>>             disconnected some links from the old one and manually
>>             started them on the
>>             new one to confirm I could connect and get data in.
>>
>>             Then I stopped the old node and changed the port
>>             forwarding in my router to
>>             the new one.
>>
>>             It’s no different. I’m still getting exactly the same
>>             thing. Some (but not
>>             all) HamClocks are connecting and then immediately being
>>             disconnected before
>>             they can send any commands.  I’m 99.9% sure the
>>             disconnect is coming from
>>             the dxspider and not the HamClock because HamClock tracks
>>             whether the
>>             disconnect is coming from local or remote.
>>
>>             There’s no pattern to this, it doesn’t seem to be
>>             HamClock version specific
>>             as I sent a sample to the developer who checked and saw
>>             multiple different
>>             versions.
>>
>>             The HamClock connects.
>>             I see the connection in the debug log and then
>>             immediately, after two
>>             commands are forced by the node (unset/beep and
>>             show/cluster), the node
>>             disconnects.
>>             This repeats ten times then the HamClock stops connecting
>>             for one hour
>>             because it’s reached its hard limit of ten
>>             disconnects/hour.  It only tracks
>>             remote disconnections towards this limit.
>>
>>             But the crazy and unexplained thing is that when I
>>             reverted back to build
>>             536 by restoring a backup, the same thing is still
>>             happening. Nothing has
>>             changed on my network as the connections are still making
>>             it to the node.
>>
>>             I’m really lost here.  I feel bad because there are well
>>             over 200 people who
>>             won’t have been able to connect since Thursday
>>             afternoon.  They’ve probably
>>             gone over to other nodes, which is fine but it doesn’t
>>             resolve the problem
>>             I’ve got here and what could happen to me could happen to
>>             anyone.   I’ve
>>             gone out of my way recently to push my node as the best
>>             for HamClocks
>>             (because I know a lot of sysops weren’t happy with it)
>>             and now it’s utterly
>>             rubbish for them.
>>
>>             I owe it to my users to try and resolve this but at the
>>             moment, I feel as
>>             though after eight years of running a node (which I
>>             appreciate is a lot less
>>             than many), I just want to switch the damn thing off. 
>>             I’m not going to,
>>             because I don’t like things to beat me but it’s very,
>>             very frustrating.
>>
>>             73 Keith
>>
>>
>>
>>
>>             > On 21 Sep 2024, at 04:25, Rene Olsen via Dxspider-support
>>             <dxspider-support at tobit.co.uk> wrote:
>>             >
>>             > Hi.
>>             >
>>             > Still waiting for a reply as to why G6NHU-2 lost like 75% of his
>>             > users before I do anything with the new version.
>>             >
>>             > So, will at least wait until next week. Like W1NR, I never update just
>>             > before or during a weekend.
>>             >
>>             > Vy 73 de René / OZ1LQH
>>             >
>>             > On 20 Sep 2024 at 17:44, Kin via Dxspider-support wrote:
>>             >
>>             >> Hi,
>>             >>
>>             >> The new build is working very well for me.
>>             >> Only 60 out of 318 dxspider have been updated.
>>             >> Cheer up, it's been in testing for a while and it's stable.
>>             >>
>>             >> 73 de Kin EA3CV
>>             >>
>>             >>
>>             >> From: Dxspider-support <dxspider-support-bounces at tobit.co.uk> On behalf of
>>             >> Dirk Koopman via Dxspider-support
>>             >> Sent: Thursday, 19 September 2024 15:24
>>             >> To: Dxspider-Support <dxspider-support at dxcluster.org>
>>             >> Cc: Dirk Koopman <djk at tobit.co.uk>
>>             >> Subject: [Dxspider-support] New mojo version
>>             >>
>>             >> There is a new mojo version which has been under test by a few brave
>>             >> sysops and they have determined that it is stable. Please look at the
>>             >> Changes file for the list of issues dealt with.
>>             >>
>>             >> One of the issues that has become apparent is the random lock status
>>             >> (historically) granted to new nodes that appear on the network. For some
>>             >> reason they are defaulting to "unlocked". I don't understand why this has
>>             >> suddenly become a problem AGAIN, but it does seem to affect longer running
>>             >> nodes more than newer ones.
>>             >>
>>             >> This release is an attempt to fix this. It will lock all nodes that are
>>             >> not specifically unlocked via explicit unset/lock or set/spider type
>>             >> commands. Unfortunately, previous attempts to deal with this may have got
>>             >> this all confused and it *MAY* (and I stress this) mean that a (very) few of
>>             >> your older node partners *MIGHT* get locked out. If this happens then simply
>>             >> unset/lock or set/spider any of these nodes manually.
>>             >>
>>             >> There is new spot deduping code which seems to reduce the number of
>>             >> dupes, but I have not been able to reproduce this further than making
>>             >> sure that nodes that issue multiple dupe spots with the same sequence number
>>             >> don't cause dupes.
>>             >>
>>             >> 73 Dirk G1TLH
>>             >>
>>             >
>>             >
>>             >
>>             >
>>             > _______________________________________________
>>             > Dxspider-support mailing list
>>             > Dxspider-support at tobit.co.uk
>>             >
>>             > https://mailman.tobit.co.uk/mailman/listinfo/dxspider-support