Case mapping is harmful

nslay
Posts: 7
Joined: Sun Oct 14, 2012 5:11 pm

Case mapping is harmful

Post by nslay »

While I think RPL_ISUPPORT is a pretty general and elegant solution, I'm finding some of it to be harmful to code complexity and ... to security. Case mapping is the most obvious. In my case, I have a general event-driven framework supporting one or more instances of an IRC bot class. I have to pass case mapping information throughout several parts of the bot to use the proper case-insensitive string comparison, string searching, and wildcard matching. On top of that, I have a user list and a s**t list that are stored to disk and contain wildcard masks. But the meaning of these masks changes depending on the server case mapping. That is definitely a problem ... even if it only affects those fringe nicknames that contain ~,^,[,],\,{,|,}.
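
For readers who have not run into this, here is a minimal sketch of what "the proper case-insensitive string comparison" means for the three mappings named later in the thread (ascii, strict-rfc1459, rfc1459). The helper names irc_fold and irc_equal are made up for illustration and are not from any particular bot or library.

```python
import string

# Lowercase folding tables, one per CASEMAPPING value.
# rfc1459 treats {|}~ as the lowercase forms of [\]^;
# strict-rfc1459 leaves ^ and ~ alone; ascii only folds A-Z.
TABLES = {
    "ascii": str.maketrans(string.ascii_uppercase, string.ascii_lowercase),
    "strict-rfc1459": str.maketrans(
        string.ascii_uppercase + "[\\]", string.ascii_lowercase + "{|}"
    ),
    "rfc1459": str.maketrans(
        string.ascii_uppercase + "[\\]^", string.ascii_lowercase + "{|}~"
    ),
}


def irc_fold(text, casemapping="rfc1459"):
    """Return text folded to lowercase under the given case mapping."""
    return text.translate(TABLES[casemapping])


def irc_equal(a, b, casemapping="rfc1459"):
    """Case-insensitive equality under the given case mapping."""
    return irc_fold(a, casemapping) == irc_fold(b, casemapping)


# The same two nicknames are equal on one network and distinct on another.
print(irc_equal("nslay^", "nslay~", "rfc1459"))  # True
print(irc_equal("nslay^", "nslay~", "ascii"))    # False
```

Every comparison, search, and wildcard match in the client would have to go through something like irc_fold with the right mapping, which is where the complexity comes from.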
Jobe
Official supporter
Posts: 1180
Joined: Wed May 03, 2006 7:09 pm
Location: United Kingdom

Re: Case mapping is harmful

Post by Jobe »

To be totally honest I'm not sure I agree with you. So could you provide examples and a full explanation of how you find this to be an issue please?
nslay
Posts: 7
Joined: Sun Oct 14, 2012 5:11 pm

Re: Case mapping is harmful

Post by nslay »

Easy.

1) Inconsistent interpretation for hostmasks.
If a client stores a list of hostmasks to disk and also honors the server case mapping, then the interpretation of those hostmasks can change depending on the server. Such lists could be relied upon to restrict access to resources on the client.

Workaround: Ignore the server case mapping. Define and stick to a case mapping.
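
To make point 1 concrete, here is a small sketch of a wildcard matcher that takes the case mapping as a parameter. The function names, the mask, and the example.org host are all hypothetical. The only point is that the same stored mask gives different answers under different mappings.

```python
import re
import string


def fold(text, casemapping):
    """Fold text to lowercase under ascii, strict-rfc1459 or rfc1459."""
    upper, lower = string.ascii_uppercase, string.ascii_lowercase
    if casemapping == "rfc1459":
        upper, lower = upper + "[\\]^", lower + "{|}~"
    elif casemapping == "strict-rfc1459":
        upper, lower = upper + "[\\]", lower + "{|}"
    elif casemapping != "ascii":
        raise ValueError("unknown case mapping: " + casemapping)
    return text.translate(str.maketrans(upper, lower))


def mask_matches(mask, hostmask, casemapping):
    """Match an IRC-style mask (* and ? wildcards) against nick!user@host."""
    # Build a regex by hand; fnmatch is avoided because it gives [ and ]
    # a special meaning, and those are ordinary nickname characters here.
    pattern = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in fold(mask, casemapping)
    )
    return re.fullmatch(pattern, fold(hostmask, casemapping)) is not None


stored_mask = "nslay^!*@*.example.org"           # hypothetical access entry
someone_else = "nslay~!user@host.example.org"    # a different nickname under ascii

print(mask_matches(stored_mask, someone_else, "rfc1459"))  # True
print(mask_matches(stored_mask, someone_else, "ascii"))    # False
```

With the workaround above, the third argument would simply be a constant chosen by the client instead of whatever the server advertises.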

2) Programming complexity for client software.
If the server case mapping is absolutely essential (which it probably isn't), then you have to construct run-time traits for the connection and propagate the traits to various distinct subsystems in the client software. For example, you may have a specialized access system, or a specialized internal channel structure where it might be necessary to emulate the server's case mapping behavior.

In all honesty, I can't imagine a server that would sometimes tolower()/toupper() nicknames/channels ... so you can probably ignore the server case mapping and use case-sensitive string operations (in, for example, an internal channel data structure). Though, I don't think the standard(s) would prevent a server from doing that.

EDIT:
Here's a specific example:

Pull yourself up by your own bootstraps.

You have an access list of host masks to read in (e.g. for a bot). But you haven't established a connection to an IRC network yet (e.g. program startup). How do you interpret the hostmasks? rfc1459, strict-rfc1459, or ascii? I mean, this is stupid ... I decided to ignore the server case mapping and use ascii (since all the string functions are hardcoded this way anyway).
katsklaw
Posts: 1124
Joined: Sun Apr 18, 2004 5:06 pm

Re: Case mapping is harmful

Post by katsklaw »

Unrealircd users have always been welcome to submit patches for bugs, features and anything they think should be changed. Please fill out a bug report at http://bugs.unrealircd.com and include your more efficient system as a patch file.

Thanks.
nslay
Posts: 7
Joined: Sun Oct 14, 2012 5:11 pm

Re: Case mapping is harmful

Post by nslay »

It's not an issue with UnrealIRCD (it's an off-topic post), it's a general IRC detail.
katsklaw
Posts: 1124
Joined: Sun Apr 18, 2004 5:06 pm

Re: Case mapping is harmful

Post by katsklaw »

nslay wrote: It's not an issue with UnrealIRCD (it's an off-topic post), it's a general IRC detail.
So it is. My bad. Carry on! ;)
nenolod
Posts: 2
Joined: Sat Oct 27, 2012 10:30 pm

Re: Case mapping is harmful

Post by nenolod »

While I agree that rfc1459 casemapping is generally harmful, changing it abruptly in IRCds which use it would cause problems with desyncs, so we have generally tried to avoid doing it in the IRCds which implement rfc1459 casemapping for this reason.

From the client perspective, I would do the following:

- don't care about the CASEMAPPING ISUPPORT token
- apply case-harmonization so that both rfc1459 and ascii mappings are allowed (since it only affects non-alphabetic characters)

IRCd needs to use irccasecmp() on channel names because it may have a different capitalization on one server than another server. Consider for example:

- user on server A joining #chatzone
- user on server B joining #ChatZone
- server A and server B link together after the above pre-conditions
- server A still has it as #chatzone
- server B still has it as #ChatZone

And how do you determine who wins for normalizing the case on the channel name? If timestamps are even, then who wins?
nslay
Posts: 7
Joined: Sun Oct 14, 2012 5:11 pm

Re: Case mapping is harmful

Post by nslay »

Thanks for the feedback nenolod.

To be cautious (particularly in cases of nickname collisions), I've determined two important places to respect server case mapping:
  • Determining whether the target of an IRC command (protocol) is you
  • Internal channel accounting
I think all the IRC servers I've seen will always use nicknames/channels as they were cased by the user (though the standard does not appear to forbid a server from changing the case). Even so, in addition to nenolod's scenario, one could imagine the following scenario:
  • Two servers are split: Server A and Server B.
  • Server A has nslay^ and a malicious user registers as nslay~ on Server B.
  • enslay shares a common channel with nslay^ on Server A.
  • Server B links and there is a nickname collision for nslay^
Does Server A report the QUIT for nslay^ or nslay~?

It's conceivable that Server A reports the QUIT for nslay~ (I would think). A precautionary measure is for internal channel accounting to take the server's case mapping into consideration. In this case, nslay^ may not be removed from the internal channel list by enslay if the QUIT was reported for nslay~, unless enslay respected the server case mapping.

Of course, this is an extreme example.
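
A rough sketch of the channel accounting side of that scenario, assuming the server uses rfc1459 case mapping. The ChannelMembers class is hypothetical and only shows why the key used for the internal list matters.

```python
import string

# Folding tables: rfc1459 treats ^ and ~ as the same character, ascii does not.
RFC1459_FOLD = str.maketrans(
    string.ascii_uppercase + "[\\]^", string.ascii_lowercase + "{|}~"
)
ASCII_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


class ChannelMembers:
    """Tracks the nicknames in one channel, keyed by their folded form."""

    def __init__(self, fold_table):
        self._fold_table = fold_table
        self._members = {}  # folded nick -> nick as last seen

    def _key(self, nick):
        return nick.translate(self._fold_table)

    def join(self, nick):
        self._members[self._key(nick)] = nick

    def quit(self, nick):
        """Remove nick; return True if an entry was actually removed."""
        return self._members.pop(self._key(nick), None) is not None

    def nicks(self):
        return sorted(self._members.values())


# Accounting that follows the server's mapping drops the entry.
server_aware = ChannelMembers(RFC1459_FOLD)
server_aware.join("nslay^")
server_aware.quit("nslay~")
print(server_aware.nicks())  # []

# Accounting that ignores it keeps a stale entry around.
ascii_only = ChannelMembers(ASCII_FOLD)
ascii_only.join("nslay^")
ascii_only.quit("nslay~")
print(ascii_only.nicks())  # ['nslay^']
```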

On the other hand, I think it's best that the client defines its own case mapping for client-client interaction. This way, you get consistent behavior no matter which network your client is using. My bot does precisely this: server case mapping for the two items listed above, and ASCII case mapping when dealing with users.

Upshot
The client should respect case mapping when maintaining internal lists of channels and when determining if the target of an IRC command (protocol) is itself. Additionally, the client should define a case mapping exclusively used for interacting with other clients for consistent behavior across networks.
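
As a sketch of that split (the class and method names are invented for illustration, and the trusted-nick set stands in for whatever access system a real client has):

```python
import string


def make_fold(casemapping):
    """Return a lowercase folding function for one CASEMAPPING value."""
    extra = {
        "ascii": ("", ""),
        "strict-rfc1459": ("[\\]", "{|}"),
        "rfc1459": ("[\\]^", "{|}~"),
    }
    upper, lower = extra[casemapping]
    table = str.maketrans(
        string.ascii_uppercase + upper, string.ascii_lowercase + lower
    )
    return lambda text: text.translate(table)


class Client:
    # Fixed mapping for client-client interaction, independent of the network.
    CLIENT_CASEMAPPING = "ascii"

    def __init__(self, nick, server_casemapping):
        self.nick = nick
        self._server_fold = make_fold(server_casemapping)
        self._client_fold = make_fold(self.CLIENT_CASEMAPPING)
        self._trusted = set()

    def is_me(self, target):
        """Protocol-level check: uses the server's case mapping."""
        return self._server_fold(target) == self._server_fold(self.nick)

    def trust(self, nick):
        self._trusted.add(self._client_fold(nick))

    def is_trusted(self, nick):
        """User-facing check: uses the client's own fixed mapping."""
        return self._client_fold(nick) in self._trusted


bot = Client("bot^", "rfc1459")
print(bot.is_me("BOT~"))         # True: the server considers these the same nick
bot.trust("nslay^")
print(bot.is_trusted("NSLAY^"))  # True
print(bot.is_trusted("nslay~"))  # False on every network
```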
Syzop
UnrealIRCd head coder
Posts: 2112
Joined: Sat Mar 06, 2004 8:57 pm
Location: .nl

Re: Case mapping is harmful

Post by Syzop »

It's obvious you have to use the same case mapping on the client as the server or you run into the trouble you mentioned.
Your point was that it's not possible to do proper case mapping when offline. Basically you're saying that when you're offline you don't know the case mapping a server uses (duh..) and thus you can't do proper case mapping. Valid point, and not something a server can do anything about (you aren't connected to it, after all). So you'll have to deal with that on your end.
My suggestion would be to save case mapping information once you've connected to the server/network. Then next time you get disconnected from it you still know its setting. This means you only have this problem once, before you ever connected to a server/network. It's rather uncommon for a network to switch case mapping after all.
And/or, make it definable in your bot config. This is exactly what eggdrop has been doing for 10? 15? years:

# [0/1] use rfc 1459 compliant string matching routines?
# All networks apart from Dalnet comply with rfc 1459, so you should only
# disable it on Dalnet or networks which use Dalnet's code.
set rfc-compliant 0
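
The first suggestion (remember the mapping once you have connected) could look roughly like the sketch below. The function names, the JSON cache file, and the "examplenet" network name are all assumptions made for illustration. The rfc1459 default for a never-seen network is also just an assumption and could equally come from the bot config.

```python
import json
import os
import tempfile

KNOWN = {"ascii", "rfc1459", "strict-rfc1459"}


def casemapping_from_isupport(line):
    """Return the CASEMAPPING value from one raw 005 line, or None.

    Assumes the line carries a server prefix, e.g. ":irc.example.net 005 ...".
    """
    head = line.split(" :", 1)[0]  # drop the trailing human-readable text
    parts = head.split()
    if len(parts) < 3 or parts[1] != "005":
        return None
    for token in parts[3:]:  # skip prefix, numeric, and our own nick
        name, _, value = token.partition("=")
        if name == "CASEMAPPING" and value in KNOWN:
            return value
    return None


def load_cache(path):
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}


def remember(path, network, casemapping):
    """Store the mapping last seen for a network."""
    cache = load_cache(path)
    cache[network] = casemapping
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(cache, fh)


def casemapping_for(path, network, default="rfc1459"):
    """Mapping to use at startup, before any connection exists."""
    return load_cache(path).get(network, default)


with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "casemapping.json")
    print(casemapping_for(cache_path, "examplenet"))  # rfc1459 (the default)
    raw = ":irc.example.net 005 bot CHANTYPES=# CASEMAPPING=ascii :are supported by this server"
    remember(cache_path, "examplenet", casemapping_from_isupport(raw))
    print(casemapping_for(cache_path, "examplenet"))  # ascii
```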
nslay
Posts: 7
Joined: Sun Oct 14, 2012 5:11 pm

Re: Case mapping is harmful

Post by nslay »

Syzop, I appreciate your prompt response.

I resolved this by honoring server case mapping when maintaining internal channel lists or determining if the source/target of an IRC command is myself. For everything else (client-client interaction), I use ASCII case mapping (since most people don't know or care that ^ and ~ are the same, for example).

I can see where a case mapping configuration could be useful ... for ircds that don't report RPL_ISUPPORT and deviate from the standard case mapping.