odd false positive hit from a correct spamfilter. One for Syzop maybe

Posted: Mon May 31, 2021 10:50 am
by Orobas
Revisiting a topic from an earlier UnrealIRCd release (3.2), with a query please.

The original post is here
viewtopic.php?t=2392&sid=9308edc55a1898 ... 7b04c518c2

Instead of using the .conf to set this, I did it directly via the /spamfilter add command.

I did
/spamfilter add -regex c soft-block - caps_or_netspammer (?-i)[A-Z ]{30}
and made sure I included the space between the Z and the ] as per Syzop's post.
Spamfilter added: '(?-i)[A-Z ]{30}' [type: regex] [target: c] [action: soft-block] [reason: caps or netspammer]
I tested it on regex101.com first to make sure it worked with no false hits, and all was good. I then ran a test on the server with both a regged and an unregged nick, and it worked as required: the unregged nick got the block and the spamfilter reported correctly.
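
For anyone who wants to repeat that kind of offline check without regex101, here is a minimal sketch in Python. It is not how the server matches: the filter runs under PCRE2, while this uses Python's re module, which does not accept a bare (?-i), so the scoped form (?-i:...) is an assumption made for the sketch. The re.IGNORECASE flag stands in for a case-insensitive default, and trips_filter and the sample messages are made-up names and data.

```python
import re

# ASSUMPTION: Python's re module rejects a bare "(?-i)", so the scoped form
# "(?-i:...)" is used instead. re.IGNORECASE imitates a case-insensitive
# default, which is the situation the (?-i) switch is meant to override.
CAPS_FILTER = re.compile(r"(?-i:[A-Z ]{30})", re.IGNORECASE)


def trips_filter(message: str) -> bool:
    """Return True if the message has 30 consecutive capitals and/or spaces."""
    return CAPS_FILTER.search(message) is not None


if __name__ == "__main__":
    # Hypothetical sample messages, not taken from any real channel log.
    samples = [
        "hello",
        "a" * 30,
        "A" * 30,
        "THIS IS A VERY LOUD MESSAGE IN ALL CAPS",
    ]
    for text in samples:
        print(f"{str(trips_filter(text)):5}  {text!r}")
```

Only the last two samples should trip the filter; the lowercase run does not, because the scoped (?-i:...) keeps the character class case sensitive.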

Now here's the weird thing.. we had a false hit on the filter about 8 hours after it was set
This is the direct snotice from the server
[Spamfilter] aisha-banner15!~Mibbit@ip.address.here matches filter '(?-i)[A-Z ]{30}': [PRIVMSG #randomchat: ' hello '] [caps or netspammer] at 06:40:58 on 31/05/2021 on
She then retyped hello in the channel, and that time it showed! I have tried multiple times to reproduce this hit on the server and have been unable to. I double-checked the filter snotice and that is all she actually said; no message was cut off that I can see!

Any thoughts? We're running UnrealIRCd 5.0.9.1 with PCRE2 10.36, not the UnrealIRCd 3.2 this filter was originally recommended on.
Since this single hit there has not been another false hit. I'm just curious whether there is a logical answer as to why it occurred.

Re: odd false positive hit from a correct spamfilter. One for Syzop maybe

Posted: Mon May 31, 2021 3:14 pm
by alhoceima
try this: [A-Z]{30}

Re: odd false positive hit from a correct spamfilter. One for Syzop maybe

Posted: Mon May 31, 2021 3:23 pm
by Orobas
The regex explainer says this about (?-i)[A-Z ]{30}:

(?-i) match the remainder of the pattern with the following effective flags:
i modifier: -insensitive. Case sensitive match
Match a single character present in the list below [A-Z ]
{30} matches the previous token exactly 30 times
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
<space> matches the character literally (case sensitive)

Syzop on the original post said:
Re: Caps block
Post by Syzop » Fri Aug 26, 2005 4:33 pm

You can use (?-i) to make it case sensitive:
regex "(?-i)[A-Z]{20}";
but probably something like this would be better (include space):
regex "(?-i)[A-Z ]{20}";
Or even include digits and other stuff... (punctuation) ?

Perhaps someone else has some great suggestions :p
This is why I'm curious as to why the version with the space was mentioned as better than the one without.

Re: odd false positive hit from a correct spamfilter. One for Syzop maybe

Posted: Mon May 31, 2021 10:11 pm
by CrazyCat
We are speaking of channel messages, so spaces are part of the message. Without the space, you'll only detect unbroken strings of 20 (or more) capital letters. With it, you detect sentences with 20 consecutive caps or spaces.

Without the space in the regex, you match "SUPERCALIFRAGILISTICEXPIALIDOCIOUS" but you don't match "MY LITTLE HORSE IS A PONEY CALLED BUTTERFLY"
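
The difference can be checked with a small Python sketch using the two example strings above. As an assumption for the sketch, Python's scoped (?-i:...) replaces the bare (?-i) of the original patterns, since Python's re module does not accept the bare form; the matches helper is a made-up name.

```python
import re

# ASSUMPTION: "(?-i:...)" is Python's scoped stand-in for the bare "(?-i)"
# used in the forum patterns; re.IGNORECASE imitates a case-insensitive default.
WITHOUT_SPACE = re.compile(r"(?-i:[A-Z]{20})", re.IGNORECASE)
WITH_SPACE = re.compile(r"(?-i:[A-Z ]{20})", re.IGNORECASE)

ONE_LONG_WORD = "SUPERCALIFRAGILISTICEXPIALIDOCIOUS"
SHOUTED_SENTENCE = "MY LITTLE HORSE IS A PONEY CALLED BUTTERFLY"


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    for label, pattern in (("[A-Z]{20} ", WITHOUT_SPACE), ("[A-Z ]{20}", WITH_SPACE)):
        for text in (ONE_LONG_WORD, SHOUTED_SENTENCE):
            print(label, matches(pattern, text), repr(text))
```

The sentence only trips the version with the space because its longest word is 9 letters, so no run of 20 letters exists unless spaces are allowed to count toward the run.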

Re: odd false positive hit from a correct spamfilter. One for Syzop maybe

Posted: Mon May 31, 2021 10:30 pm
by Orobas
@CrazyCat thanks for confirming that the filter was right in the first place. I'm still puzzled as to why I got that false hit, though. We've had nothing trip it all day and I have tried multiple times to false trip it, to no avail lol

Re: odd false positive hit from a correct spamfilter. One for Syzop maybe

Posted: Tue Jun 01, 2021 6:22 am
by CrazyCat
I can't understand it either. The only explanation I can see is that the user wrote a rainbow hello and the colours (or other non-printable codes) were seen as spaces but not shown in the spamfilter warning.
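
One thing that can be checked offline is what the pattern itself does with such input. The sketch below is Python, not the server's PCRE2 matcher, and it says nothing about how UnrealIRCd treats control codes before matching; the sample strings are invented. It only shows that, as far as the regex is concerned, a run of 30 spaces is enough to match even when the visible text is just hello, while raw mIRC colour codes on their own are not.

```python
import re

# ASSUMPTION: "(?-i:...)" is Python's scoped stand-in for the bare "(?-i)" in
# the filter; re.IGNORECASE imitates a case-insensitive default.
CAPS_FILTER = re.compile(r"(?-i:[A-Z ]{30})", re.IGNORECASE)

# Hypothetical inputs, not recovered from the actual message.
PLAIN = "hello"
COLOURED = "\x0304h\x0307e\x0308l\x0309l\x0312o\x03"  # raw mIRC colour codes
PADDED = " " * 30 + "hello"                             # 30 spaces, then hello


def trips_filter(message: str) -> bool:
    """Return True if the message has 30 consecutive capitals and/or spaces."""
    return CAPS_FILTER.search(message) is not None


if __name__ == "__main__":
    for name, text in (("plain", PLAIN), ("coloured", COLOURED), ("padded", PADDED)):
        print(f"{name:9} {trips_filter(text)}")
```

So the hit would be consistent with the pattern if, by whatever route, the text being matched contained 30 or more consecutive spaces, since the character class counts spaces the same as capitals.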