This is the new home of the egghelp.org community forum.

Help about regexp :)

cerberus_gr
Halfop
Posts: 97
Joined: Fri Feb 07, 2003 8:57 am
Location: 127.0.0.1

Post by cerberus_gr »

Sorry, but I still can't understand regular expressions. :(

I have the lines below and I want to extract some text from each one (lines of the same kind always have the same format, but the text I want to extract varies).

1)
Line is:
<td class="asmall" align="center" valign="bottom">6.3%<br><img src="dir/blue-v.png" alt="817" title="817" height="100" width="15"></td>

Text to extract:
6.3%, blue-v.png, 817 (of title), 100 (of height) and 15 (of width)


2)
Line is:
<td class="rankc10">User1 (91)</td>

Text to extract:
User1 (91)


3)
Line is:
<img src="dir/blue-h.png" alt="" align="middle" border="0" height="15" width="87">

Text to extract:
blue-h.png, 15 (of height), 87 (of width)


4)
Line is:
<tbody><tr><td class="hicell"><b>Fireball</b>: text text text...

Text to extract:
Fireball

More help:
The text I want to extract is always between the <b> and </b> html tags


5)
Line is:
<td class="hicell" valign="top">crazy_4_loved, thermidis, crazy_4, XXX-7639097181847</td>

Text to extract:
crazy_4_loved, thermidis, crazy_4, XXX-7639097181847


6)
Line is:
<td class="hicell"><a href="http://www.site.com/">http://WwW.Site.CoM</a></td>

Text to extract:
http://WwW.Site.CoM


7)
Line is:
<td style="background-color: rgb(198, 198, 209);" class="male"><a href="http://www.site.com/" target="_blank" title="Ανοιγμα σε νέο παράθυρο: http://www.Site.com">_geo_</a></td><td style="background-color: rgb(198, 198, 209);">28</td><td style="background-color: rgb(198, 198, 209);" nowrap="nowrap"><img src="dir/blue-h.png" alt="" align="middle" border="0" height="15" width="17"><img src="dir/green-h.png" alt="" align="middle" border="0" height="15" width="3"> 129</td><td style="background-color: rgb(198, 198, 209);">"1 fora arkei :)"</td>

Text to extract:
http://www.Site, _geo_, 28, (blue-h.png, 15, 17), (green-h.png, 15, 3), 129, "1 fora arkei :)"

More help:
The number of <img> tags can be anywhere from 1 to 4:
1) ... <img="name" height="num" width="num2"> ...
2) ... <img="name" height="num" width="num2"><img="name2" height="num3" width="num4"> ...
3) ... <img="name" height="num" width="num2"><img="name2" height="num3" width="num4"><img="name3" height="num5" width="num6"> ...
4) ...

I want to extract name and nums



Thx :)
^DooM^
Owner
Posts: 772
Joined: Tue Aug 26, 2003 5:40 pm
Location: IronForge
Contact:

Post by ^DooM^ »

There's a link on this site to a tutorial on regular expressions

Click me
The lifecycle of a noob is complex. Fledgling noobs gestate inside biometric pods. Once a budding noob has matured thru gestation they climb out of their pod, sit down at a PC, ask a bunch of questions that are clearly in the FAQ, The Noob is born
greenbear
Owner
Posts: 733
Joined: Mon Sep 24, 2001 8:00 pm
Location: Norway

Post by greenbear »

Code:

1)
regexp {>(.*?)<br>.*src=\"dir/(.*?)\".*title=\"(.*?)\".*height=\"(.*?)\".*width=\"(.*?)\"} $line . percent src title height width

2)
regexp {>(.*)<} $line . user

3)
regexp {src=\"dir/(.*?)\".*height=\"(.*?)\".*width=\"(.*?)\"} $line . src height width

4)
regexp {<b>(.*?)</b>} $line . var

5)
regexp {>(.*?)<} $line . text

6)
regexp {.*>(.*?)</a} $line . url
/me bored ...
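
Case 7 was left unanswered above. What follows is only a minimal sketch of one way to handle the variable number of <img> tags, assuming a Tcl version whose regexp supports the -all and -inline switches. The proc name extract_imgs is made up for illustration, and the sample line is just the <img> cell from line 7 of the question with the other cells and the style attribute trimmed.

```tcl
# Sample: the <img> cell from line 7 of the question (other cells trimmed).
set line {<td nowrap="nowrap"><img src="dir/blue-h.png" alt="" align="middle" border="0" height="15" width="17"><img src="dir/green-h.png" alt="" align="middle" border="0" height="15" width="3"> 129</td>}

# extract_imgs is a hypothetical helper name, not a Tcl or eggdrop built-in.
proc extract_imgs {line} {
    set result {}
    # [^"]* cannot run past the closing quote and [^>]* cannot run past the
    # end of the tag, so every match stays inside a single <img ...> tag.
    set re {<img src="dir/([^"]*)"[^>]*height="([^"]*)"[^>]*width="([^"]*)"}
    # -all -inline returns a flat list: for each tag, the full match
    # followed by the three captured values.
    foreach {full name height width} [regexp -all -inline $re $line] {
        lappend result [list $name $height $width]
    }
    return $result
}

set imgs [extract_imgs $line]
puts $imgs   ;# {blue-h.png 15 17} {green-h.png 15 3}

# The number after the last <img> tag, just before </td>.
regexp {>\s*(\d+)</td>} $line -> count
puts $count  ;# 129
```

With three or four <img> tags the same call should simply return more name/height/width triples, so the loop does not need to know the count in advance. The remaining cells of line 7 (the link text, the 28 and the quoted text) look like they could be picked out with patterns along the lines of cases 2 and 6.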