support encoding UTF-8 in UnrealIRCD 4.x

Posted: Tue Jan 19, 2016 5:38 am
by Epic
Greetings. I represent a Russian user network. We have a very acute problem: our chat is literally on the verge of dying out. Will UTF-8 encoding ever be supported? Or could somebody create a separate loadable module for switching between encodings on the fly? Commands: /codepage UTF-8, /codepage CP1251
Users of the new versions of the popular client mIRC 7+, or of mobile clients for Android, which use UTF-8 by default, cannot communicate with people who use the older, familiar clients that support only CP1251. The two groups cannot be brought together without support on the server side.

Since you have made such a powerful new program, we have a big request: please look into this issue. It is very important for us. Thank you. Waiting for a response.

Re: support encoding UTF-8 in UnrealIRCD 4.x

Posted: Thu Jan 28, 2016 1:57 am
by dboyz
Hi, this feature has been requested here:

https://bugs.unrealircd.org/view.php?id=3719

Thanks

Re: support encoding UTF-8 in UnrealIRCD 4.x

Posted: Thu Mar 24, 2016 7:40 pm
by k4be
Should work in 3.2, needs converting for 4: link. (Can't attach C file to the post).
Just made this for you, based on our module for Polish character conversion.
How it works:
The default is utf-8.
The client sets umode +k to enable windows-1251 encoding support.
Russian characters encoded in windows-1251 get autodetected and then +k is set automatically.
The client sets umode +n to force utf-8 and disable autodetection.
This version of the module is not tested thoroughly, but its original form has worked without problems for more than 5 years.
If you want to edit the source, please use windows-1251 encoding in your editor.
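
For readers who want to see the idea in code, here is a rough Python sketch of the behaviour described above. It is not the module's source (that is a C file). The detection rule is an assumption (bytes that are not valid UTF-8 are treated as windows-1251), and the names `decode_incoming` and `encode_for_recipient` are made up for illustration. Only the +k and +n modes come from the description above.

```python
def decode_incoming(raw: bytes, force_utf8: bool = False):
    """Return (text, wants_cp1251) for a line received from a client.

    force_utf8 stands in for umode +n: autodetection is disabled.
    wants_cp1251 stands in for umode +k being set automatically.
    """
    try:
        # Valid UTF-8 (including plain ASCII) is accepted as-is.
        return raw.decode("utf-8"), False
    except UnicodeDecodeError:
        if force_utf8:
            # +n: stay in UTF-8, invalid bytes become U+FFFD.
            return raw.decode("utf-8", errors="replace"), False
        # Assumed heuristic: invalid UTF-8 means windows-1251.
        return raw.decode("cp1251", errors="replace"), True


def encode_for_recipient(text: str, wants_cp1251: bool) -> bytes:
    """Encode outgoing text in the encoding the recipient expects."""
    if wants_cp1251:
        # Characters outside windows-1251 are replaced with '?'.
        return text.encode("cp1251", errors="replace")
    return text.encode("utf-8")


if __name__ == "__main__":
    legacy_line = "Привет".encode("cp1251")
    text, wants_cp1251 = decode_incoming(legacy_line)
    print(text, wants_cp1251)
    print(encode_for_recipient(text, False))
```

The point of the design is that the server keeps one internal representation and converts only at the edges, per client, so UTF-8 and CP1251 users can share a channel.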

Re: support encoding UTF-8 in UnrealIRCD 4.x

Posted: Mon Aug 21, 2017 6:10 am
by winstongel
Most software is not designed to handle 16-bit or 32-bit characters, yet a universal character set requires more than 8 bits. Therefore, a special format called UTF-8 was developed to encode these international characters in a form more easily handled by existing programs and libraries. UTF-8 is defined, among other places, in IETF RFC 3629 (which obsoletes RFC 2279), so it's a well-defined standard that can be freely read and used. UTF-8 is a variable-width encoding: characters numbered 0 to 0x7F (127) encode to themselves as a single byte, while characters with larger values are encoded into 2 to 4 bytes (originally up to 6), depending on their value.
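
To make the variable-width rule concrete, here is a small Python sketch. The function name `utf8_length` is made up for illustration. The thresholds are the standard RFC 3629 ranges, and the loop cross-checks them against Python's built-in encoder.

```python
def utf8_length(code_point: int) -> int:
    """Number of bytes UTF-8 uses for one code point (RFC 3629 ranges)."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    if code_point <= 0x7F:
        return 1  # ASCII encodes to itself
    if code_point <= 0x7FF:
        return 2  # includes Cyrillic, e.g. U+043F
    if code_point <= 0xFFFF:
        return 3  # rest of the Basic Multilingual Plane
    if code_point <= 0x10FFFF:
        return 4  # supplementary planes
    raise ValueError("beyond U+10FFFF, not encodable under RFC 3629")


if __name__ == "__main__":
    for ch in ["A", "п", "€", "😀"]:
        encoded = ch.encode("utf-8")
        # The computed length should match the real encoder's output.
        assert utf8_length(ord(ch)) == len(encoded)
        print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex(' ')}")
```

This is why a Russian message takes two bytes per letter in UTF-8 but one byte per letter in CP1251, and why the same bytes look like garbage when read with the wrong encoding.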