theice wrote: but I couldn't find anywhere how to easily open a connection to a website and parse the information.
Are you kidding?
You mean, out of all the hundreds of web-parsing script topics here, and all the web-parsing scripts in the archive, you couldn't find one example of how to grab/parse a website?
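For reference, the usual starting point is Tcl's bundled http package. This is only a minimal sketch: the proc name fetch_page and the 10-second timeout are made up for illustration, and an https page would additionally need the tls package registered with ::http::register.

```tcl
package require http

# Hypothetical helper: fetch a page and return its body as a string.
proc fetch_page {url {timeout 10000}} {
    # -timeout is given in milliseconds
    set token [::http::geturl $url -timeout $timeout]
    set status [::http::status $token]
    set data [::http::data $token]
    # always release the token once the body has been copied out
    ::http::cleanup $token
    if {$status ne "ok"} {
        error "fetch failed: $status"
    }
    return $data
}
```

Calling it as `set data [fetch_page $url]` gives you the raw HTML in $data, which is what the regexp lines further down operate on.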
Sorry, I guess I didn't look around very much. I'm dumb.
I think it would be cool just to have the current week. Once I've figured out how to do that, maybe in the future I'll write something like "Last 3 Weeks Downloaded Content!"
regexp {<tr class="dlc_info_row">.*?<td>(.*?)</td>} $data - artist
Newlines are just another character to regexp, so .*? will match across them. And the Tcl parser isn't going to work properly if you put newlines in the middle of a regexp (unless perhaps you escape them, so the parser knows to continue reading the next line as part of the previous one). Keep the pattern on one line.
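A quick way to convince yourself, as a rough sketch: the sample string below is made up (the artist and title are borrowed from the credits row in the page), with real newlines between the tags.

```tcl
# Sample data with real newlines between the tags (made up for illustration)
set data "<tr class=\"dlc_info_row\">\n<td>Paramore</td>\n<td>CrushCrushCrush</td>\n</tr>"

# The pattern itself stays on one line; .*? still crosses the newlines in $data
if {[regexp {<tr class="dlc_info_row">.*?<td>(.*?)</td>} $data - artist]} {
    puts $artist
}
```

This prints Paramore, because by default "." in a Tcl regexp matches a newline like any other character.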
<tr class="dlc_label_row">
<td width="45%">Band</td>
<td width="40%">Song</td>
<td width="15%">Type</td>
</tr>
<tr class="dlc_info_row">
<td>*</td> #Artist
<td>*</td> #Title
<td>*</td> #Type
</tr>
<tr class="dlc_credits_row">
<td colspan="3"></td>
</tr>
<tr class="dlc_info_row">
<td>*</td> #Artist
<td>*</td> #Title
<td>*</td> #Type
</tr>
<tr class="dlc_credits_row">
<td colspan="3"> “CrushCrushCrush” as performed by Paramore courtesy of Warner Music Group<br />
Hayley Williams and Josh Farro<br />
2007 WB Music Corp. (Ascap), But Father, I Just Want To Sing Music (ASCAP), FBR Music (ASCAP) And Josh's Music (ASCAP) All rights reserved. Used by permission</td>
</tr>
<tr class="dlc_info_row">
<td>*</td> #Artist
<td>*</td> #Title
<td>*</td> #Type
</tr>
<tr class="dlc_credits_row">
<td colspan="3"> “Beethoven’s C***” as performed by Serjical Strike courtesy of Warner Music Group<br />Serj Tankian<br />2007 Stunning Suppository Sounds (BMI)All Rights Administered By Warner Tamerlane Publishing Corp.<br />
All rights reserved. Used by permission</td>
</tr>
</table>
The *'s are what I want. The problem is that this page is updated each week: the format will stay the same, but the number of songs won't, so I don't know what to do.
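One way to cope with a changing number of songs is to grab every info row as a list and loop over it. A minimal sketch, assuming the rows keep the layout shown above; the sample data and the songs variable are invented for the example, and puts stands in for putserv so it runs outside a bot.

```tcl
# Made-up sample with two songs; the real page can have any number of rows
set data {
<tr class="dlc_info_row">
<td>Artist One</td>
<td>Song One</td>
<td>Single</td>
</tr>
<tr class="dlc_credits_row">
<td colspan="3"></td>
</tr>
<tr class="dlc_info_row">
<td>Artist Two</td>
<td>Song Two</td>
<td>Pack</td>
</tr>
}

set songs {}
# -all -inline returns every matching row as a list, however many there are
foreach tr [regexp -all -inline {<tr class="dlc_info_row">.*?</tr>} $data] {
    # pull the three cells out of this one row only
    if {[regexp {<td>(.+?)</td>.*?<td>(.+?)</td>.*?<td>(.+?)</td>} $tr - artist title type]} {
        lappend songs "$artist - $title ($type)"
    }
}
puts [join $songs "\n"]
```

The credits rows are skipped automatically because the outer pattern only matches rows with class dlc_info_row.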
That works great, but what I was trying to figure out is how to make it paste only the first week's information. It keeps going until it runs out of songs, when I want it to stop after it reaches the first </table>.
# your header
regexp {<div class="dlc_week">(.*?)</div>} $data - week
putserv "PRIVMSG $c :$week"
# slim our html down to just that one week (everything inside the first table)
regexp {<table class="dlc_table".*?>(.*?)</table>} $data - data
# parse as usual, one info row at a time
foreach tr [regexp -all -inline {<tr class="dlc_info_row">.*?</tr>} $data] {
    # match against the current row ($tr), not the whole page
    regexp {<tr class="dlc_info_row">.*?<td>(.+?)</td>.*?<td>(.+?)</td>.*?<td>(.+?)</td>} $tr - artist title type
    putserv "PRIVMSG $c :$artist - $title ($type)"
}
edit: corrected obvious mistake... future glances can see this as correct now.
Last edited by speechles on Mon Mar 17, 2008 11:52 pm, edited 1 time in total.
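If the regexp slimming feels heavy, another option (a different technique from the one posted above, shown only as a sketch) is to cut the string at the first </table> with string first. The two-table sample data is made up, and puts is used instead of putserv so it can be tried in tclsh.

```tcl
# Made-up sample: two weekly tables back to back
set data {<table class="dlc_table"><tr class="dlc_info_row"><td>A</td><td>B</td><td>C</td></tr></table><table class="dlc_table"><tr class="dlc_info_row"><td>D</td><td>E</td><td>F</td></tr></table>}

# Find the first closing tag; string first returns -1 if it is absent
set end [string first "</table>" $data]
if {$end >= 0} {
    # keep everything before the first </table>
    set data [string range $data 0 [expr {$end - 1}]]
}

# only rows from the first week are left to match
set rows [regexp -all -inline {<tr class="dlc_info_row">.*?</tr>} $data]
puts "[llength $rows] row(s) in the first week"
```

With the sample above this reports one row, since the second table has been cut off before the rows are collected.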