[Dxspider-support] Cluster failure

Joaquin . joaquin at cronux.net
Mon May 18 11:33:00 CEST 2020


Paul, I ran the Mojo branch for more than a month on a VM with 2 CPUs,
8 GB RAM and one SSD, much more powerful than the current one, and had no
problems. Another cluster, on VMware with 1 CPU, 4 GB RAM and one SSD,
experienced only two crashes in over a month.

In no case have I exceeded 20 local users with 12 nodes.
Perhaps the difference from other DXSpider clusters is the amount of
traffic. Yesterday, for example, the total volume reached about 2,700,000
packets, of which about 83K were human spots and about 129K were RBN spots.
On top of that, some users were running very detailed filters and changing
bands often, and each band change requested a refresh of their bandmap
(there was a contest).
The CPU load was always very low.
But at one point CPU consumption and disk read operations skyrocketed
(up to 50 times more than average), and from there ...
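
To put the traffic figures above in perspective, here is a rough
back-of-the-envelope calculation. It assumes the 2,700,000 packets were
spread evenly over 24 hours (a contest day certainly is not that even), and
the function name traffic_summary is just something made up for this sketch:

```shell
# Rough traffic arithmetic: average packets per second over one day and the
# share of the total made up by human + RBN spots.
# Arguments: total packets, human spots, RBN spots.
traffic_summary() {
    awk -v total="$1" -v human="$2" -v rbn="$3" 'BEGIN {
        # 86400 seconds in a day; an even spread is an assumption.
        printf "avg packets/s: %.2f\n", total / 86400
        printf "spots share: %.2f%%\n", 100 * (human + rbn) / total
    }'
}

traffic_summary 2700000 83000 129000
```

The average is modest, so any trouble is more likely to come from bursts
than from the daily total.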

Dirk, this is the new scenario:

Cluster EA3CV-4                         Cluster EA3CV-2
-----------------------------           -----------------
Users:       6                          6
Nodes:       4                          13
Uptime:      0.00, 0.01, 0.00           0.05, 0.03, 0.00
Load Perl:   0.5 %                      0.7 %
Mem:         16.1 %                     7.1 %

1 CPU, 1 GB, SSD                        1 CPU, 2 GB, HDD
Ubuntu 18.04.4 LTS                      Ubuntu 18.04.4 LTS
Linux 5.3.0-1020-azure                  Linux 5.3.0-1020-azure
1.57 build 227                          1.57 build 227

Note: "Load Perl" was obtained like this:
          ps -Ao pcpu,cmd --sort=-pcpu | grep 'perl'
      The "uptime" figures (load averages) are not always significant.

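For what it is worth, the "Load Perl" figure can be boiled down to a single
number with something like the sketch below. It assumes the procps version
of ps shipped with Ubuntu; the function name sum_perl_cpu is only
illustrative and is not part of DXSpider.

```shell
# Sum the %CPU column of every process whose command line mentions perl.
# Reads "pcpu cmd" lines on stdin, so it works on live ps output or on a
# saved sample. The [p]erl pattern keeps awk from matching its own
# command line, which a plain "perl" pattern would do.
sum_perl_cpu() {
    awk '/[p]erl/ { total += $1 } END { printf "%.1f\n", total }'
}

# Live usage (assumes procps ps):
#   ps -Ao pcpu,cmd --sort=-pcpu | sum_perl_cpu
```
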
As you can see, there are no big differences between the two VMs or their
performance.
EA3CV-4: the system graphs show an average load of 1.5% and about 4 disk
write operations/s.
EA3CV-2: the system graphs show an average load of 1.7% and about 5 disk
write operations/s.

The EA3CV-4 cluster was installed today.
The EA3CV-2 cluster was updated today.

To be continued...

Kin
EA3CV

On Sun, 17 May 2020 at 22:22, Dirk Koopman via Dxspider-support (<
dxspider-support at tobit.co.uk>) wrote:

> On 17/05/2020 20:20, Paul Pescitelli via Dxspider-support wrote:
> > I am running instances of Mojo on Google Cloud in 3 different regions
> > and <knock on wood> have no issues.
> >
> > Last 7 days was less than 1% CPU utilization.
> >
>
> I'm guessing that you are not hammering the disk with the debugging from
> 800+ users and 26 nodes? THAT instance wouldn't run Mojo at snail's pace
> but (unlike anywhere else) could run on the master branch. Except that
> it would lose its users file from time to time and refused to output any
> actual user records into the weekly user_asc. It now runs very happily
> on a $6/month droplet ($5 + $1/month for a weekly backup) at Digital Ocean.
>
> top - 20:16:57 up 25 days,  1:09,  3 users,  load average: 0.05, 0.08, 0.09
>
> sh/cl: Nodes: 27/412 Clr - Users: 932/5958 Clr Max: 1011/7056 Clr -
> Uptime: 6 19:08
>
> 73 Dirk G1TLH