need help with regexp and html code [solved]

De Kus


Post by De Kus »

After reading the Tcl docs and the related FAQ here I got really far, but now I am out of ideas.
Here is my current code:

Code:

regexp {Temperatur:</span></td>\r\n\t*<td><span class="Body">(.{1,5}) °C} $state(body) {} temp
regexp {Luftdruck:</span></td>\r\n\t*<td><span class="Body">(.{4,6}) hPa} $state(body) {} druck
regexp {Wind:</span></td>\r\n\t*<td><span class="Body">(.{1,3}) km/h / (.{3,12})</span>} $state(body) {} windg windr
The HTML code I am trying to parse (wetter.com, German!):

Code:

					<tr>
						<td width="100"><span class="Headline">Temperatur:</span></td>

						<td><span class="Body">-0.7 °C</span></td>
					</tr>
					<tr>
						<td><span class="Headline">Luftdruck:</span></td>
						<td><span class="Body">1015.8 hPa</span></td>
					</tr>
					<tr>

						<td><span class="Headline">Wind:</span></td>
						<td><span class="Body">26 km/h / West</span></td>
					</tr>
Then I noticed... HEY, idiot! You are reading a Unix-type HTML file, you need \n, not \r\n. But the funny thing is, if I replace \r\n with \n, I get this:
[03:48:03] * Last context: tclhash.c/684 [Tcl proc: getwetter_cmd, param: $_pub1 $_pub2 $_pub3 $_pub4 $_pub5]
[03:48:03] * Please REPORT this BUG!
[03:48:03] * Check doc/BUG-REPORT on how to do so.
[03:48:03] * Wrote DEBUG
[03:48:03] * SEGMENT VIOLATION -- CRASHING!
:)

Any idea how to get the 4 values? I could leave out the part of the pattern before the newline in the 1st and 2nd expressions, but not in the 3rd. Do I have to use something else for newlines in the expression, or does "\t*" not refer to an arbitrary number of tabs?

PS: $state(body) contains the correct HTML code (between <body> and </body>); I get 2 other values correctly.

PPS: If you believe that having the whole code would help you help me, I can send it via PM/email. Since it has 155 lines and is commented only in German, it might not help much anyway, and I'd just risk someone taking my code without credit ;).

PPPS: Ah yeah, just remembered... using another weather script won't help. Their source pages all suck when it comes to German weather, so I want to use www.wetter.com. In case you have seen another script for this page, I'd be happy, too. I could only find a remote script, but it doesn't look like I could use anything from there :(. I don't even know if that code would actually work :D.
Last edited by De Kus on Sat Feb 19, 2005 12:44 am, edited 1 time in total.
De Kus
StarZ|De_Kus, De_Kus or DeKus on IRC
Copyright © 2005-2009 by De Kus - published under The MIT License
Love hurts, love strengthens...

Post by De Kus »

After some relaxing and thinking it over again I got THE idea, so sorry for bothering.
That's the trick: just put the 3 expressions into 1 regexp and replace the \n and \t part with .* .

Code:

regexp {<td><span class="Body">(.{1,5}) °C(.*)<td><span class="Body">(.{4,6}) hPa(.*)<td><span class="Body">(.{1,3}) km/h / (.{3,12})</span>} $state(body) {} temp {} druck {} windg windr
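
For anyone finding this thread later, here is a possible variation, offered only as a sketch and not as the script's actual code. Instead of joining everything with .* (which is greedy and can also span unrelated parts of the page), it keeps three separate expressions and uses \s* between the two table cells, so any mix of \r, \n, tabs and spaces is skipped. The variable name body is a stand-in for $state(body), and the sample text is simply the HTML quoted in the first post. It does not address the crash shown in the log.

```tcl
# Sample input: the table rows quoted in the first post, including the
# blank lines and tab indentation between the cells.
set body {
	<tr>
		<td width="100"><span class="Headline">Temperatur:</span></td>

		<td><span class="Body">-0.7 °C</span></td>
	</tr>
	<tr>
		<td><span class="Headline">Luftdruck:</span></td>
		<td><span class="Body">1015.8 hPa</span></td>
	</tr>
	<tr>

		<td><span class="Headline">Wind:</span></td>
		<td><span class="Body">26 km/h / West</span></td>
	</tr>
}

# Each pattern is anchored on its label; \s* matches any whitespace
# (including line breaks) between the label cell and the value cell.
# "->" is just a throwaway variable name for the full match.
regexp {Temperatur:</span></td>\s*<td><span class="Body">([-0-9.]+) } $body -> temp
regexp {Luftdruck:</span></td>\s*<td><span class="Body">([0-9.]+) hPa} $body -> druck
regexp {Wind:</span></td>\s*<td><span class="Body">(\d+) km/h / ([^<]+)</span>} $body -> windg windr

puts "temp=$temp druck=$druck windg=$windg windr=$windr"
```

Because each expression is anchored on its own label, a missing row only makes that one regexp return 0 instead of breaking all four values; checking the return value of each regexp before using the variable would be the natural next step.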