some regexp problems

entrapmen
Voice
Posts: 27
Joined: Tue Jul 08, 2003 9:08 am
Location: TR

Post by entrapmen »

Hi,

I was working on catching spammers a while ago, and now I'm back at it. In the meantime the spammers have evolved, and they use some really weird methods.

Here are some examples:

ºwºwºwº
I¤R¤C

(They use color codes, so these display as "w w w" and "I R C".)

There are lots of things like that. I think they can all be caught with "regexp", but I couldn't figure out how. I don't want to keep adding every spam word to a list, which is what I was doing until now, because the special characters (, . ? ' ( )) and the words are always changing. I want to catch a line if there is "w w w" anywhere in the sentence; it doesn't matter where the letters are (ex: "where is will wonder" <<-- that person must be caught).

I've got another problem. The channel I'm protecting has about 300 people and is very active. I'm using a greet message to catch the kind of spammers that advertise through their away message, so I have to message every user who joins. When I give that job to one bot, it gets lagged or "eflood"ed, so I want to split the work in half: one bot messages a nick only if it starts with "a-k", and the other only if it starts with "k-z" or a special character.

Sorry for my bad English, I tried my best. Thanks.
<@ll the world is about smiles and cries>
greenbear
Owner
Posts: 733
Joined: Mon Sep 24, 2001 8:00 pm
Location: Norway

Post by greenbear »

You could remove everything that's not a letter or a number from the string before doing any regexp matches on it.

Code: Select all

regsub -nocase -all {[^a-z0-9]} $line {} line
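
As a rough sketch of this approach, the line can be reduced to letters and digits and then searched for a plain "www". The proc name has_www and the sample strings are made up for illustration; they are not from the original post.

```tcl
# Hypothetical helper: reduce a line to letters and digits, then look for "www".
proc has_www {line} {
    # Drop every character that is not a-z, A-Z or 0-9.
    regsub -nocase -all {[^a-z0-9]} $line {} line
    # Case-insensitive search for three consecutive w's.
    return [regexp -nocase {www} $line]
}

# \u00ba is the "º" character used in the spam example.
puts [has_www "\u00baw\u00baw\u00baw\u00ba"]
puts [has_www "hello everyone"]
```

Because spaces are removed as well, w's that only end up adjacent across word boundaries will also match, so this is likely to need tuning before it is used for kicks or bans.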
Alchera
Revered One
Posts: 3344
Joined: Mon Aug 11, 2003 12:42 pm
Location: Ballarat Victoria, Australia

Post by Alchera »

| stripcodes <strip-flags> <string>
| Description: strips specified control characters from the string given.
| strip-flags can be any combination of the following:
| b - remove all boldface codes
| c - remove all color codes
| r - remove all reverse video codes
| u - remove all underline codes
| a - remove all ANSI codes
| g - remove all ctrl-g (bell) codes
| Returns: the stripped string.
| Module: core
NB: eggdrop v1.6.17.0
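
A minimal usage sketch, assuming the script may also be run outside eggdrop: the wrapper below (strip_codes is a hypothetical name) calls stripcodes with the six flags listed above when the command exists, and otherwise falls back to a regsub that only approximates it (the fallback does not handle ANSI codes).

```tcl
# Hypothetical wrapper: prefer eggdrop's stripcodes, fall back to a regsub
# when the command is not available (e.g. in plain tclsh).
proc strip_codes {text} {
    if {[llength [info commands stripcodes]]} {
        # b, c, r, u, a, g are the flags from the description above.
        return [stripcodes bcruag $text]
    }
    # Fallback (an approximation, not eggdrop's implementation): remove bold,
    # reset, reverse, underline and bell characters, plus color codes such as
    # ^C4 or ^C4,12. ANSI escape sequences are left untouched.
    regsub -all {[\002\017\026\037\007]|\003(?:[0-9]{1,2}(?:,[0-9]{1,2})?)?} $text {} text
    return $text
}

# Colored and bold "w w w" as a spammer might send it.
puts [strip_codes "\0034,1w\003 \002w\002 w"]
```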
demond
Revered One
Posts: 3073
Joined: Sat Jun 12, 2004 9:58 am
Location: San Francisco, CA

Post by demond »

/me makes a note to himself to check that (source code for stripping) out
De Kus
Revered One
Posts: 1361
Joined: Sun Dec 15, 2002 11:41 am
Location: Germany

Post by De Kus »

greenbear wrote:regsub -nocase -all {[^a-z0-9]} $line {} line
Just think about what will happen to "vist us on http://www.mystupid.ad and our #stupid-channel". Yes, it will become "vistusonhttpwwwmystupidadandourstupidchannel". Good luck :).
You won't be able to find advertising spam that way. You should keep at least the characters " ", "#", "/", ":" and ".", so I would prefer: [^a-z0-9\./:# ]


However, for general-purpose use, stripcodes is the best and probably the fastest solution.
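
To make the difference concrete, here is a small comparison of the two character classes on the example line. The variable names squeezed and kept are only for illustration.

```tcl
set line "vist us on http://www.mystupid.ad and our #stupid-channel"

# The narrow class: only letters and digits survive.
regsub -nocase -all {[^a-z0-9]} $line {} squeezed

# The wider class: also keep ".", "/", ":", "#" and the space.
regsub -nocase -all {[^a-z0-9\./:# ]} $line {} kept

puts $squeezed
puts $kept
```

With the wider class only the "-" is dropped from this line, so the URL and the channel name stay recognizable for a later regexp match.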
De Kus
StarZ|De_Kus, De_Kus or DeKus on IRC
Copyright © 2005-2009 by De Kus - published under The MIT License
Love hurts, love strengthens...

Post by demond »

this will strip mIRC color and control codes:

Code: Select all

regsub -all {([\002\017\026\037]|[\003]{1}[0-9]{0,2}[\,]{0,1}[0-9]{0,2})} $str {} str
and this will match a hotlink (clickable URL) or chan ad:

Code: Select all

regexp {(?i)(http://|www\.|irc\.|\s#)} $str
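
A short sketch of how the two expressions might be combined, assuming the text is stripped first and matched second. The proc name is_ad and the sample strings are hypothetical.

```tcl
# Hypothetical helper combining the two expressions above.
proc is_ad {str} {
    # Strip bold, reset, reverse, underline and color codes first.
    regsub -all {([\002\017\026\037]|[\003]{1}[0-9]{0,2}[\,]{0,1}[0-9]{0,2})} $str {} str
    # Then look for "http://", "www.", "irc." or a "#" preceded by whitespace.
    return [regexp {(?i)(http://|www\.|irc\.|\s#)} $str]
}

puts [is_ad "visit \002www\002.example.com"]   ;# bold codes hide "www."
puts [is_ad "join \0034#somechan"]             ;# color code sits before "#"
puts [is_ad "hello everyone"]
```

Stripping first matters here: in the first two samples the control codes sit between the characters the second expression looks for, so matching the raw text would miss them.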

Post by entrapmen »

That's the problem. I know those expressions and was using them, but unfortunately they no longer work, because the bots don't advertise clickable links anymore.
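
For spam like the "ºwºwºwº" example, one possible sketch (an assumption, not something posted in this thread) is to strip the color and control codes and then allow any run of non-alphanumeric filler between the three w's. The proc name has_spaced_www is hypothetical.

```tcl
# Hypothetical check: three w's separated only by non-alphanumeric filler.
proc has_spaced_www {str} {
    # Strip color and control codes first, so color digits do not get in the way.
    regsub -all {([\002\017\026\037]|[\003]{1}[0-9]{0,2}[\,]{0,1}[0-9]{0,2})} $str {} str
    # w, optional filler, w, optional filler, w (case-insensitive).
    return [regexp {(?i)w[^a-z0-9]*w[^a-z0-9]*w} $str]
}

# \u00ba is "º" and \u00a4 is "¤", as in the spam examples.
puts [has_spaced_www "\u00baw\u00baw\u00baw\u00ba"]
puts [has_spaced_www "I\u00a4R\u00a4C"]
puts [has_spaced_www "where is will wonder"]
```

This does not catch "where is will wonder", since letters sit between the w's, and it can misfire on ordinary text such as "aww wow", so it would need testing against real channel traffic.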

Post by entrapmen »

Can someone help me with the second problem?

I think a single if/else should be enough, like: if the nick starts with a letter a-q, do something, else do nothing...

Post by demond »

entrapmen wrote:
That's the problem. I know those expressions and was using them, but unfortunately they no longer work, because the bots don't advertise clickable links anymore.
really? give me just ONE clickable link that these don't match

Post by entrapmen »

demond wrote:
really? give me just ONE clickable link that these don't match
I'm saying it really does work for clickable links, but the bots (spammers) don't use that method anymore. That's what I meant. Sorry for the misunderstanding and my bad English :cry:

Post by demond »

so what do you care if they use another method with no clickable links? nobody will bother to manually strip the spam and paste it in their browser - people are lazy, spammers know that; spam without clickable links is harmless (annoying yes, but that's all about it)

Post by entrapmen »

demond wrote:so what do you care if they use another method with no clickable links? nobody will bother to manually strip the spam and paste it in their browser - people are lazy, spammers know that; spam without clickable links is harmless (annoying yes, but that's all about it)
As you say, they are annoying. I'm trying to stop them before they message the users. The users are lazy too: if they set //mode on themselves to +R, they would not see the spammers...

Anyway, demond, can you help me with the second problem? :D :D :D

Post by demond »

If you've managed to write a script that satisfies some of your needs, you should be able to grasp the [regexp] concept and filter out the nick patterns that are of no interest to you. It's not that hard.
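
As a hedged sketch of that idea, the split between two bots could be a single regexp on the first character of the nick. The proc names are hypothetical, and the a-k range is taken from the original question; in eggdrop such a check would presumably be called from the join handler before the greet message is sent.

```tcl
# Hypothetical helper for bot 1: greet only nicks starting with a-k.
proc greets_first_half {nick} {
    return [regexp -nocase {^[a-k]} $nick]
}

# Hypothetical helper for bot 2: everything bot 1 skips, i.e. l-z, digits
# and special characters such as [ ] ^ _ |
proc greets_second_half {nick} {
    return [expr {![greets_first_half $nick]}]
}

foreach nick {entrapmen Zed [away]guy} {
    puts "${nick}: [greets_first_half $nick] [greets_second_half $nick]"
}
```

Defining the second bot's test as the negation of the first avoids overlapping ranges and covers special characters without having to list them.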