[Dxspider-support] UTF-8?

Rene Olsen rene at rcolsen.dk
Mon Nov 21 19:02:30 GMT 2022


Hi Dirk.

To make the migration to the new system as easy as possible, I decided to go with the 
master branch, so I could pretty much just do an rsync from the old box to the new one.

I plan to update to mojo, but will leave the system running for a bit first.

I am using: 
DXSpider v1.55 (build 249 git: master/2fc6c64f[r]) using perl v5.32.1 on Linux

I was using watchdbg, when I saw those "p%C3%A5" things. I started up the old cluster 
again, and it actually looks exactly the same, but what I see as a user is different. So I guess 
it behaves exactly as the old system did.
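
For anyone else who runs into those "%C3%A5" sequences: they look like the UTF-8 bytes of the Danish letters written out as %XX hex escapes. A minimal Python sketch to check that, using only the standard library; this is not DXSpider code, and the variable names are just for illustration:

```python
from urllib.parse import quote, unquote

# The text as it appeared in the debug output (copied from this thread).
seen = "ind p%C3%A5 tilf%C3%A6ldige dx"

# unquote() turns each %XX escape back into a byte and decodes the bytes as UTF-8.
decoded = unquote(seen, encoding="utf-8")
print(decoded)

# Going the other way: hexify the non-ASCII bytes again, leaving spaces untouched.
reencoded = quote(decoded, safe=" ")
print(reencoded)
```

If the decoded line shows the Danish letters correctly, the bytes are valid UTF-8 and only the display form is hexified.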

And yes, UTF-8 support, or the lack of it, can be a pain in the butt, and things can be 
changed to something totally different when traversing various operating systems.
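
A small Python sketch of what I mean, assuming the same UTF-8 bytes are read by systems with different ideas about the character set. The code pages chosen here (cp1252, cp437) and the ASCII-only hop are just illustrative assumptions, not a claim about what any particular node software does:

```python
text = "p\u00e5"                      # the Danish word "på"
raw = text.encode("utf-8")            # two bytes for the last letter: C3 A5

# A UTF-8 aware system gets the original text back.
as_utf8 = raw.decode("utf-8")

# A system assuming a Windows code page reads each byte as its own character.
as_cp1252 = raw.decode("cp1252")

# An old DOS-era code page gives yet another result for the same bytes.
as_cp437 = raw.decode("cp437")

# A hop that only keeps 7-bit ASCII silently drops the letter altogether.
ascii_only = raw.decode("ascii", errors="ignore")

for label, value in [("utf-8", as_utf8), ("cp1252", as_cp1252),
                     ("cp437", as_cp437), ("ascii only", ascii_only)]:
    print(f"{label:10} {value!r}")
```

Same bytes on the wire, four different results, depending on who is reading them.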

So for now at least, the new system seems to be behaving like the old one. I just hope that 
the new system is as stable as the old one. One never knows when getting a VPS, how it will 
turn out.

Vy 73 de René / OZ1LQH


On 21 Nov 2022 at 17:51, Dirk Koopman via Dxspider-sup wrote:

> It's a good question. Well asked.
> 
> The short answer is that nothing has changed (that I know about) between 
> mojo and master. But there are a whole lot of issues. Basically, and I 
> am sorry to be explicit about this, there is a problem with Americans and 
> Microsoft (not necessarily in that order) and their computer setups. And 
> that problem is "code pages" and / or "Wide characters". Code Pages are 
> a kludge from the early IBM computer(ish) era that persists on the 
> network because Amateurs like to use old hardware and keep them going. 
> Microsoft has the problem that it thinks it creates standards and can't 
> be bothered to fall in line until it's good and ready. So if one is 
> using Windows 7+ and you have selected a UTF-8 language for your 
> computer then that's likely to be just fine. Proportionally more US 
> amateurs are using default Microsoft language setups which are not UTF-8 
> based. They are also using old, unsupported (and sadly now 
> unsupportable), Node software (yes, I am looking at you AR Cluster) 
> which doesn't understand UTF anything.
> 
> So there is a kludge in DXSpider that can produce weird things like this 
> where it tries to display (for instance) some comment in a non-US 
> English language. In theory it should pass it through and, AFAIK, it 
> does. But what comes out at the other end will depend on the route it 
> has taken. Stuff gets stripped out or added in by other node software - 
> having said that, your example looks like valid UTF-8 but hexified. I do 
> do stuff to try to cope with duplicates (and a few other things) that 
> strips out "special" characters, in an attempt to deal with duplicate 
> text that reappears when it comes out of old node software connections. 
> But this should not affect display. But I have not looked at this for 
> quite a while (read: several years).
> 
> For the avoidance of doubt: perl supports (and uses) utf-8 internally and 
> has done since at least 5.10.1. I want to make DXSpider (mojo) utf-8 
> compliant. In fact: MAKE IT STANDARD. But, sadly, that will throw up all 
> sorts of other problems which may limit sources of spots. Please don't 
> scream at me, it's not my fault.
> 
> There seems to be an increasing list of things I need to look at and fix 
> (sigh).
> 
> René, do you know which (git) version of DXSpider, or failing that, the 
> software date you were using? It would help me enormously.
> 
> 73 Dirk G1TLH
> 
> On 21/11/2022 17:08, Rene Olsen via Dxspider-support wrote:
> > Hi.
> >
> > Just started up my new install in production, but noticed that what I had as test on the old
> > system with special Danish characters are now shown as "ind p%C3%A5 tilf%C3%A6ldige
> > dx" for example.
> >
> > I have checked locale, and they are both set to the same, so not sure if it is an issue, or if it is
> > just the way it is.
> >
> > Been years since I have been fiddling with this, since the old system just ran and ran and ran.
> > Just did a spider update now and then, but otherwise the server had been running for 2100+
> > days.
> >
> > So in short. Do I need to make changes, or does spider live fine with UTF-8, or do I need to
> > install yet another perl package that I don't know about?
> >
> > Thanks in advance.
> >
> > Vy 73 de René / OZ1LQH
> >
> >
> >
> > _______________________________________________
> > Dxspider-support mailing list
> > Dxspider-support at tobit.co.uk
> > https://mailman.tobit.co.uk/mailman/listinfo/dxspider-support
> 





