Fetching data from www?

t3ch^
Voice
Posts: 16
Joined: Sat Dec 27, 2003 7:19 pm

Post by t3ch^ »

Can someone point me to some documentation on how to fetch data from the web, or maybe even explain it a little?

I want to write a Tcl script which gets data from a certain page and displays it on IRC in a channel, kind of like !google.
Papillon
Owner
Posts: 724
Joined: Fri Feb 15, 2002 8:00 pm
Location: *.no

Post by Papillon »

check out this
Elen sila lúmenn' omentielvo
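
A minimal sketch of fetching a page with Tcl's standard http package, for readers who arrive without the linked document. The helper name fetch_page and the 10-second default timeout are assumptions for illustration and do not come from this thread. A synchronous http::geturl waits until the transfer finishes, so setting a timeout is worth considering inside a bot.

```tcl
package require http

# Hypothetical helper: return the page body, or "" on any failure.
proc fetch_page {url {timeout_ms 10000}} {
    # geturl raises an error for unsupported URLs or failed connections
    if {[catch {http::geturl $url -timeout $timeout_ms} token]} {
        return ""
    }
    set body ""
    # status is "ok" when the transfer completed; ncode is the HTTP status code
    if {[http::status $token] eq "ok" && [http::ncode $token] == 200} {
        set body [http::data $token]
    }
    # always release the token, whatever the outcome
    http::cleanup $token
    return $body
}
```

It could be called as `set html [fetch_page "http://lajt.mine.nu/index.php"]`, with an empty result treated as "nothing to show".
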
t3ch^
Voice
Posts: 16
Joined: Sat Dec 27, 2003 7:19 pm

Post by t3ch^ »

Thanks, now I do get some data from the web, but it's not working the way I want it to...

My code:

Code: Select all

bind pub - !www www:www
proc www:www {nick host handle chan text} {
        package require http
        set a ""
        set b ""
        set token [http::config -useragent "Mozilla 4.0"]
        set url "http://lajt.mine.nu/index.php"
        set data [http::geturl $url]
        set data2 [http::data $data]
        http::cleanup $data

        regexp -nocase {<a href="(.*?)">(.*?)</a>} $data2 data2 a b
        puthelp "PRIVMSG $chan :$a"
        regexp -nocase {<a href="(.*?)">(.*?)</a>} $data2 data2 a b
        puthelp "PRIVMSG $chan :$a"
        regexp -nocase {<a href="(.*?)">(.*?)</a>} $data2 data2 a b
        puthelp "PRIVMSG $chan :$a"
}
The page source looks like this:

Code: Select all

<body bgcolor="gray">
Gott Nytt år 2004!
<br>
<br>
<a href="phpmyadmin/">PHP Myadmin</a>
<br>
<a href="upload/">Upload</a>
<br>
<a href="movies/">Movies</a>
<br>
<br>
<a href="mailto:t3ch@lajt.mine.nu">Kasta ett e-brev!</a>
I want output to be like:

Code: Select all

phpmyadmin/
upload/
movies/
But only this gets printed:

Code: Select all

phpmyadmin/
//
Papillon
Owner
Posts: 724
Joined: Fri Feb 15, 2002 8:00 pm
Location: *.no

Post by Papillon »

You need a loop to go through each line:

Code: Select all

# set this outside the proc, it's safer
package require http 
bind pub - !www www:www 
proc www:www {nick host handle chan text} { 
        #why setting these here? 
        set a "" 
        set b "" 

        #what you use this for? 
        set token [http::config -useragent "Mozilla 4.0"] 

        set url "http://lajt.mine.nu/index.php" 
        set data [http::geturl $url] 
        set data2 [http::data $data] 
        http::cleanup $data 
        foreach line [split $data2 \n] { 
          if {[regexp -nocase {<a href="(.*?)">(.*?)</a>} $line line a b]} { 
            puthelp "PRIVMSG $chan :$a" 
          } 
        } 
 }
umm... sorry I forgot to change the variable in the regexp, it's fixed now :)
Last edited by Papillon on Tue Dec 30, 2003 7:02 pm, edited 5 times in total.
Elen sila lúmenn' omentielvo
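
The per-line loop above can also be written without splitting the page into lines. The following is only a sketch, assuming a hypothetical helper name extract_hrefs and a sample page string copied from the source shown earlier; it uses regexp with -all -inline to collect every href in one pass.

```tcl
# Hypothetical helper: return every href value found in an HTML string.
proc extract_hrefs {html} {
    set hrefs {}
    # -all -inline returns a flat list: fullMatch group1 fullMatch group1 ...
    foreach {match href} [regexp -all -inline -nocase {<a href="(.*?)">} $html] {
        lappend hrefs $href
    }
    return $hrefs
}

# Sample input taken from the page source quoted in this thread
set page {<a href="phpmyadmin/">PHP Myadmin</a>
<br>
<a href="upload/">Upload</a>
<br>
<a href="movies/">Movies</a>}

puts [extract_hrefs $page]
```

Inside the bot, each element of the returned list could then be sent with puthelp in a foreach loop.
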
t3ch^
Voice
Posts: 16
Joined: Sat Dec 27, 2003 7:19 pm

Post by t3ch^ »

Man, I love you!
Thanks a lot.

Happy new year, all.

[edit]

Code: Select all

# set this outside the proc, it's safer
package require http
bind pub - !www www:www
proc www:www {nick host handle chan text} {
        #why setting these here?
        set a ""
        set b ""

        #what you use this for?
        set token [http::config -useragent "Mozilla 4.0"]

        set url "http://lajt.mine.nu/index.php"
        set data [http::geturl $url]
        set data2 [http::data $data]
        http::cleanup $data
        foreach line [split $data2 \n] {
          if {[regexp -nocase {<a href="(.*?)">(.*?)</a>} $line <---- data2 a b]} {
            puthelp "PRIVMSG $chan :$a"
          }
        }
 }
Ofloo
Owner
Posts: 953
Joined: Tue May 13, 2003 1:37 am
Location: Belguim
Contact:

Post by Ofloo »

I know it's fixed, but I'm trying to understand the code:
if {[regexp -nocase {<a href="(.*?)">(.*?)</a>} $line line a b]} {

This part I don't understand.

I know string map, but I never really understood regexp.

For example:

Code: Select all

[string map -nocase {\\ \\\\} $arg] 
What would this look like in regexp?
XplaiN but think of me as stupid
Papillon
Owner
Posts: 724
Joined: Fri Feb 15, 2002 8:00 pm
Location: *.no

Post by Papillon »

If you want to compare string map with re_syntax, you would have to use regsub.
regexp is used to match strings and store the matches in variables.
regsub is used to replace parts of strings. Both use re_syntax to do this.

Code: Select all

string map -nocase {\\ \\\\} $arg
regsub {(\\){2}} $txt {\\\\\\\\} ntxt
These two would do exactly the same thing.
As you can see, you need some \'s to escape the following \'s.
The {2} means that it wants exactly 2 matches of \\ to replace with {\\\\\\\\}.
You could also write it like this:

Code: Select all

regsub {(\\\\)} $txt {\\\\\\\\} ntxt
which is exactly the same as the one above :)
Elen sila lúmenn' omentielvo
Ofloo
Owner
Posts: 953
Joined: Tue May 13, 2003 1:37 am
Location: Belguim
Contact:

Post by Ofloo »

Correct me if I'm wrong, but if I understand correctly, then it would be:

Code: Select all

regsub {(\\){2}} $txt {\\\\\\\\} ntxt
=

Code: Select all

string map {\\\\ \\\\\\\\} $arg
The {2} means that it wants exactly 2 matches of \\ to replace with {\\\\\\\\},
because the other map is doing \ > \\. But if you use a bare \ it wouldn't work, so I must write \\ and replace it with \\\\, which means the output is \ => \\.

If I understand right, you are saying \\\\ => \\\\\\\\, with output \\ => \\\\.

Just verifying that I understand correctly.

Also:
if {[regexp -nocase {<a href="(.*?)">(.*?)</a>} $line <---- data2 a b]} {
So if I'm supposed to use regsub, why do you use regexp? I mean, what is it doing then? I thought you were replacing characters.
XplaiN but think of me as stupid
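
The comparison can be checked in plain tclsh. This is a small sketch that assumes an invented sample string, a\b\\c, which is not from the thread. It suggests that the single-backslash string map doubles every backslash, while the regsub only touches a pair of backslashes, and only the first pair unless -all is given.

```tcl
# Sample input: a\b\\c (one single backslash, then a pair of backslashes)
set txt {a\b\\c}

# Key is ONE backslash: every backslash in the string is doubled.
set mapped [string map {\\ \\\\} $txt]

# regsub without -all: only the FIRST pair of backslashes becomes four.
regsub {(\\){2}} $txt {\\\\\\\\} once

# Key is a PAIR of backslashes: every pair becomes four.
set mapped_pairs [string map {\\\\ \\\\\\\\} $txt]

# regsub -all gives the same result as the pair-keyed string map here.
regsub -all {(\\){2}} $txt {\\\\\\\\} all

puts $mapped        ;# a\\b\\\\c
puts $once          ;# a\b\\\\c
puts $mapped_pairs  ;# a\b\\\\c
puts $all           ;# a\b\\\\c
```
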
Papillon
Owner
Posts: 724
Joined: Fri Feb 15, 2002 8:00 pm
Location: *.no

Post by Papillon »

Sorry, my bad.
You are correct =)

Code: Select all

if {[regexp -nocase {<a href="(.*?)">(.*?)</a>} $line <---- data2 a b]} {
this would be almost like

Code: Select all

string match "<a href=\"*\">*</a>" $line
Of course the difference is that string match only returns 1 or 0, whereas regexp also puts the matches into separate variables.
So I'm not replacing characters, I'm just pulling out the needed info :)
Elen sila lúmenn' omentielvo
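
The difference between the two commands can be seen side by side. This is a sketch only; the variable names whole, url and label are chosen for illustration, and the sample line is taken from the page source quoted earlier.

```tcl
set line {<a href="upload/">Upload</a>}

# string match only answers yes (1) or no (0) for a glob pattern,
# and the pattern has to cover the entire string.
set matched [string match {<a href="*">*</a>} $line]

# regexp also returns 1 or 0, and stores the captured groups in variables:
# "whole" gets the full match, "url" the first group, "label" the second.
set found [regexp -nocase {<a href="(.*?)">(.*?)</a>} $line whole url label]

puts "$matched $found $url $label"
```
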
Ofloo
Owner
Posts: 953
Joined: Tue May 13, 2003 1:37 am
Location: Belguim
Contact:

Post by Ofloo »

Ah, thanks, nice. I see what you mean now.
(.*?)">(.*?)
sets the URL to a and the name of the link to b.
if {[regexp -nocase {<a href="(.*?)">(.*?)</a>} $line <---- data2 a b]} {
<---- What does this do? Is it just a note?
XplaiN but think of me as stupid
Papillon
Owner
Posts: 724
Joined: Fri Feb 15, 2002 8:00 pm
Location: *.no

Post by Papillon »

I have no idea.. it wasn't me that put the <---- in there ;)
Test it and see what it does... and then tell me :mrgreen:
Elen sila lúmenn' omentielvo
Ofloo
Owner
Posts: 953
Joined: Tue May 13, 2003 1:37 am
Location: Belguim
Contact:

Post by Ofloo »

Well, I'll test it and let you know once I find out, but I think it will cause more errors than do any good, hehe.
XplaiN but think of me as stupid
eggnutz

defining tags from the page in the expression

Post by eggnutz »

Tcl provides two pattern matching mechanisms: glob-style patterns and regular expressions. Of the two, regular expressions are more difficult to understand and apply correctly, but they're much more powerful.

I have noticed that regexp is both more powerful and more difficult.
I have been sitting here trying to solve it, but I have to request your assistance.
I looked at the http://tcl.activestate.com/man/tcl8.4/TclCmd/http.htm page, which helped me understand a bit,
but the parsing on my page stops at the <NOBR> tags, spitting out:
[2:55pm] <PeteyB> <NOBR>[0123] Site remodified</NOBR>
As you can see, it spits out the HTML tag along with the string I am looking for, and then stops parsing any further.
I tried replacing

Code: Select all

if {[regexp -nocase {<a href="(.*?)">(.*?)</a>} $line <---- data2 a b]}
with

Code: Select all

if {[regexp -nocase {<a href="*<NOBR>*">"*</NOBR>*"</a>} $line <---- data2 a b]}
and other variations.

I know you are probably laughing (me too), but thanks for the help.
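
The attempt above uses a bare * the way a glob pattern would, but inside a regular expression * is a quantifier that repeats the preceding item. A short sketch of the two styles follows, assuming a made-up sample line built around the output quoted in the post. If the tags show up in the output, one possible reason is that the full-match variable is being printed instead of the captured group.

```tcl
# Hypothetical sample line; only the <NOBR> part comes from the post above
set line {<TD><NOBR>[0123] Site remodified</NOBR></TD>}

# Glob style: * alone stands for "any characters".
set glob_hit [string match {*<NOBR>*</NOBR>*} $line]

# Regular expression: "any characters" is written .* (or .*? for the
# shortest match); parentheses capture the text between the tags.
set re_hit [regexp -nocase {<NOBR>(.*?)</NOBR>} $line whole text]

puts $whole   ;# full match, tags included
puts $text    ;# captured group only
```
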
t3ch^
Voice
Posts: 16
Joined: Sat Dec 27, 2003 7:19 pm

Post by t3ch^ »

The <---- was only there to point out a typo in the code.

At first it was $data when it should have been $line.

//
eggnutz

okay.. figured out how regexp syntax works

Post by eggnutz »

Okay, figured out how regexp syntax works,
but now I still have the problem of the foreach loop stopping after the first match, using this code:

Code: Select all

        set data2 [http::data $data] 
        http::cleanup $data
        foreach line [split $data2 \n] {
        if {[regexp {<NOBR>(.*?)</NOBR>} $line data2 a ]} {
            puthelp "NOTICE $chan :$a" 
          }
Why is the loop stopping?
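
The thread does not record an answer. One possible explanation, offered purely as an assumption about the page being parsed, is that several <NOBR> entries sit on the same line: a regexp without -all reports at most one match per line. The sketch below uses an invented page body to show the difference.

```tcl
# Hypothetical page body: two <NOBR> entries on one line, one on the next
set body "<NOBR>first</NOBR><NOBR>second</NOBR>\n<NOBR>third</NOBR>"

# Per-line regexp without -all: at most one capture per line.
set per_line {}
foreach line [split $body \n] {
    if {[regexp {<NOBR>(.*?)</NOBR>} $line -> a]} {
        lappend per_line $a
    }
}

# -all -inline over the whole body: every capture, regardless of line breaks.
set all_hits {}
foreach {whole a} [regexp -all -inline {<NOBR>(.*?)</NOBR>} $body] {
    lappend all_hits $a
}

puts $per_line   ;# first third
puts $all_hits   ;# first second third
```
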