International characters broken on vulnscan.org

Posted: Sat Aug 21, 2004 4:53 pm
by AngryWolf
Over the past few days I noticed that the German and Hungarian translations of unreal32docs.html no longer look the way they used to on vulnscan.org. The language-specific characters aren't displayed correctly. I guess something has changed on the web server that causes this behaviour, but I'm not sure.

Here is how the Hungarian version looks now: bad.jpg
And how it should look: good.jpg

Could someone please fix this?

Posted: Mon Oct 04, 2004 10:02 am
by AngryWolf
Well, am I asking this question in the wrong place? :)

Posted: Mon Oct 04, 2004 5:39 pm
by ToNyOmAn
Yep, it's ugly, and makes it a bit harder to read |:

Posted: Thu Oct 07, 2004 9:30 am
by AngryWolf
No reply, I'll assume it won't be fixed at all.

Posted: Thu Oct 07, 2004 9:12 pm
by codemastr
We haven't a clue why it is doing it. The only conceivable problem is that it is an Apache bug, in which case there isn't much we can do. If you download the file and save it to a .html, it displays fine. If you view it from the web, it doesn't. That means the server is screwing it up somehow. Likely through the HTTP headers which we really can't control.

Posted: Fri Oct 08, 2004 4:06 am
by Syzop
Hm, first time I see this thread :p.

Posted: Fri Oct 08, 2004 8:22 am
by Dukat
Why don't you use the correct HTML entities (e.g. &auml; instead of ä)? That would solve the problem...

Posted: Fri Oct 08, 2004 12:38 pm
by AngryWolf
Sorry, Dukat, but I definitely hate those entities. First, because they make the documentation a lot larger in size; second, because they don't support every international character (details here); and third, because, for example, the HTML representation of the Hungarian ő becomes ? in w3m (my favourite text-based browser for *NIX).

Well, from what I see with Web-Sniffer, the web server sends Content-Type: text/html; charset=UTF-8 in the HTTP header, but the document uses a different charset (iso-8859-2), as the meta tag shows. Unfortunately, if the character set is defined in both places, the header line wins.
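
To illustrate the mismatch described above, here is a minimal Python sketch, not anything taken from the server itself. The sample string is made up for the demonstration; the only assumption is that the file is stored as iso-8859-2 while the reader decodes it as UTF-8 because the header says so.

```python
# Hypothetical sample text; any string with accented Hungarian letters would do.
text = "Magyar szöveg: ő és ű"

# The file on disk is stored as ISO-8859-2, as its meta tag declares.
raw = text.encode("iso-8859-2")

# A browser that trusts the HTTP header decodes the same bytes as UTF-8.
# The accented bytes are not valid UTF-8, so they turn into replacement characters.
as_utf8 = raw.decode("utf-8", errors="replace")

# Decoding with the charset the file was actually written in restores the text.
as_latin2 = raw.decode("iso-8859-2")

print(as_utf8)
print(as_latin2)
```

The first line printed shows the kind of damage visible in bad.jpg; the second matches the original text.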

I doubt that Apache has no configurable option for defining different character sets for certain HTML files. Could a .var file perhaps help?

Posted: Sat Oct 09, 2004 12:04 am
by Syzop
Fixed.
(AddDefaultCharset off @ .htaccess)
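
For anyone wondering why that setting helps, here is a rough Python sketch of the precedence rule mentioned earlier in the thread: the header charset wins, and the meta tag is only used when the header names none. The function names and the sample page are hypothetical, and it assumes that with AddDefaultCharset off the server sends a plain text/html Content-Type without a charset parameter.

```python
import re

def header_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    match = re.search(r'charset="?([\w.:-]+)"?', content_type or "", re.IGNORECASE)
    return match.group(1).lower() if match else None

def meta_charset(html_bytes):
    """Return the charset declared in a <meta> tag near the top of the page, or None."""
    head = html_bytes[:1024].decode("ascii", errors="ignore")
    match = re.search(r'<meta[^>]*charset=["\']?([\w.:-]+)', head, re.IGNORECASE)
    return match.group(1).lower() if match else None

def effective_charset(content_type, html_bytes):
    # The HTTP header takes precedence; the meta tag is only a fallback.
    return header_charset(content_type) or meta_charset(html_bytes)

# Hypothetical page that declares iso-8859-2 in its meta tag.
page = (b'<html><head><meta http-equiv="Content-Type" '
        b'content="text/html; charset=iso-8859-2"></head></html>')

before_fix = effective_charset("text/html; charset=UTF-8", page)
after_fix = effective_charset("text/html", page)
print(before_fix, after_fix)
```

With the charset in the header the page is read as UTF-8; once the header no longer carries one, the meta tag's iso-8859-2 takes effect. Real browsers apply more rules than this (byte order marks, sniffing), so treat it as an illustration only.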