[Dxspider-support] Cluster hangs
Jimmy Turner
k5jtj at arrl.net
Sun Mar 6 00:25:32 GMT 2005
I had the same thing happen to our cluster, but I would get a message
like "cannot fork, try again" every time I tried to issue a command.
Running ps ax showed about 60-100 instances of a process called
"hotplug". I also found some alerts about invalid SSH logins in my
/var/log/secure.log files. I took this as an omen telling me to shut
down SSH until I really need it. Since I disabled SSH on the box (it's
local anyway), I have not had any other lockups.
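
For anyone wanting to repeat the ps ax check without counting lines by eye, here is a rough sketch in Python. It assumes the usual Linux "ps ax" column layout (PID TTY STAT TIME COMMAND). The function names and the threshold of 50 are made up for illustration and are not part of DX Spider.

```python
import subprocess
from collections import Counter


def count_commands(ps_output):
    """Count processes by command name from `ps ax` style output.

    Assumes the Linux column layout: PID TTY STAT TIME COMMAND.
    """
    counts = Counter()
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 4)
        if len(fields) < 5:
            continue  # blank or malformed line
        command = fields[4].split()[0]
        if command.startswith("["):
            name = command  # kernel thread, keep the bracketed name as-is
        else:
            name = command.rsplit("/", 1)[-1]  # drop any leading path
        counts[name] += 1
    return counts


def suspicious(counts, threshold=50):
    """Return the commands with at least `threshold` instances (hypothetical cutoff)."""
    return {name: n for name, n in counts.items() if n >= threshold}


def main():
    """Run `ps ax` and print any command that appears suspiciously often."""
    result = subprocess.run(
        ["ps", "ax"], capture_output=True, text=True, check=True
    )
    for name, n in sorted(suspicious(count_commands(result.stdout)).items()):
        print(f"{n:5d}  {name}")
```

Calling main() on the cluster box would list any command running 50 or more times, which should make a pile-up like the hotplug one easy to spot. Adjust the threshold to suit your own machine.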
I hope this helps,
73 de K5JTJ
sysop K5PLD-2 Spider DXCluster.
On Sat, 2005-03-05 at 17:29, Kelly Jones wrote:
> Hi guys,
>
> I seem to have an issue somewhere. For whatever reason, DX Spider just
> seems to 'hang' for no apparent reason. I noticed this during the ARRL CW
> contest a couple of weeks ago and it appears to be doing it again during
> the phone contest.
>
> If I try to enter any command or even a simple carriage return, I get no
> response on the screen. If I try to telnet in, I am connected, but never
> greeted with the 'welcome' message.
>
> If I let the cluster continue to run, eventually (anywhere from 10 to 30
> minutes later) it will 'break loose' and spew the last x minutes of data to
> the screen all at once. Once this happens the cluster appears to hang once
> again and we repeat the symptoms as above.
>
> If I tail the .dat file in the /spider/data/debug/2005/xxx.dat, there is
> no activity being written to this file when the cluster is
> 'hung'. However, once it 'lets go', all of the activity is written at once.
>
> The only way to 'reset' the cluster is to restart it. Once a restart
> happens, it's good for a while at which time it eventually hangs again.
>
> If I run a 'top', there appears to be almost 0 cluster activity and the box
> is running at 99% idle. Currently running v1.51 build 58.323.
>
> I never had this problem with the old cluster which was running code base
> from about two years ago, but for whatever reason this new box isn't
> performing very well.
>
> Any ideas what could be causing my 'hang'?
>
> 73
> Kelly - N0VD