Script to pubmsg webpage titles from pasted urls

spankymcfresh · Post by **spankymcfresh** » Wed Sep 20, 2006 5:23 pm

Anyone know of a simple script that will just publicly message the titles of urls pasted in a channel?

Thanks! :oops:

rosc2112 · Post by **rosc2112** » Wed Sep 20, 2006 8:04 pm

I've posted several sample scripts on how to retrieve data from webpages, like:
http://forum.egghelp.org/viewtopic.php?p=64426#64426

You'd just be regexp searching for

Code:

{<title>(.*?)</title>}
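As a rough illustration of that pattern in use, here is a minimal sketch that pulls the title out of a string of HTML. The helper name `extract_title` is made up for this example and is not part of any posted script; it only assumes that the page source is already in a variable.

```tcl
# Hypothetical helper: return the page title, or "" when none is found.
proc extract_title {html} {
    # -nocase also matches <TITLE>; (.*?) is non-greedy, so the match
    # stops at the first closing tag.
    if {![regexp -nocase {<title>(.*?)</title>} $html -> title]} {
        return ""
    }
    # Titles often span several lines; collapse every run of whitespace.
    regsub -all {\s+} [string trim $title] " " title
    return $title
}

puts [extract_title "<html><head><TITLE>\n  egghelp forum\n</TITLE></head></html>"]
# prints: egghelp forum
```

Collapsing the whitespace also keeps a multi-line title on one line when it is later sent to a channel.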

Oh, and you'd be using $text as the $url, which could be somewhat dangerous, since it lets users inject their own strings/URLs into the request.
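One way to reduce that risk is to check each word against a strict pattern before fetching anything. The sketch below is only an assumption about what "safe enough" might mean here: the helper name `is_safe_url` and the exact pattern are invented for illustration, and it does nothing about URLs that point at hosts you would rather the bot not contact.

```tcl
# Hypothetical helper: 1 if $word looks like a plain http URL, else 0.
proc is_safe_url {word} {
    # Scheme, then a host made of letters, digits, dots and hyphens,
    # an optional port, and an optional path containing no whitespace.
    return [regexp -nocase {^http://[a-z0-9.-]+(:[0-9]+)?(/\S*)?$} $word]
}

foreach word [split "see http://www.egghelp.org/ or ftp://example.com/x" " "] {
    if {[is_safe_url $word]} {
        puts "would fetch: $word"
    }
}
```

Anything that fails the check is simply ignored, so stray text from the channel never reaches the HTTP request.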

spankymcfresh · Post by **spankymcfresh** » Thu Sep 21, 2006 9:26 am

That is exactly what I wanted but wasn't searching for the right key words. Thanks!

I'm not sure how CPU-intensive this will be, but I've changed it so that it scans everything said in the channel and runs the procedure on each line. This appears to work for triggering without a trigger keyword, though it strikes me as a nasty way to go about it. Any ideas, anyone?

Code:

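putlog "loaded: url.tcl (lists url titles from channel - thanks to rosc2112)"
package require http 2.3
bind pubm - * webcheck

# Announce the <title> of every URL said in the channel.
proc webcheck {nick uhost hand chan text} {
   # split turns the raw line into a proper list; iterating over $text
   # directly throws an error on unbalanced braces or quotes.
   foreach word [split $text] {
      if {![string match "*http://*" $word] && ![string match "*www.*" $word]} {
         continue
      }
      # Strip the scheme before handing the address to geturl.
      regsub -all {http://+} [string trim $word] "" url
      # geturl blocks the bot until the page arrives or the timeout expires.
      if {[catch {set page [::http::geturl $url -timeout 90000]} err]} {
         putlog "webcheck: could not fetch $url: $err"
         continue
      }
      set html [::http::data $page]
      ::http::cleanup $page
      if {![regexp -nocase {<title>(.*?)</title>} $html match title]} {
         continue
      }
      # Decode &amp; and collapse whitespace so that a newline inside the
      # title cannot split the PRIVMSG into several server lines.
      set title [string map {&amp; &} $title]
      regsub -all {\s+} [string trim $title] " " title
      # Skip titles that mention 302, i.e. redirect placeholder pages.
      if {![string match "*302*" $title]} {
         putserv "PRIVMSG $chan :\002URL title:\002 $title"
      }
   }
}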
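On the worry about scanning every line: one regexp pass per message is cheap compared with the HTTP fetch itself, and it can also replace the word-by-word loop. The following is a minimal sketch, assuming a hypothetical helper named `find_urls` and a deliberately loose pattern; it is not taken from the tinyurl script or any other existing one.

```tcl
# Hypothetical helper: return every URL-looking token in a line of text.
proc find_urls {text} {
    # -all -inline returns the list of all matches instead of a count;
    # the (?:...) group is non-capturing, so only whole matches come back.
    return [regexp -all -inline -nocase {(?:http://|www\.)[^\s<>"]+} $text]
}

foreach url [find_urls "look at http://forum.egghelp.org/viewtopic.php?p=64426 and www.tcl.tk now"] {
    puts "found: $url"
}
```

Lines without a URL produce an empty list, so the handler can return immediately for most channel traffic.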

rosc2112 · Post by **rosc2112** » Thu Sep 21, 2006 7:53 pm

Take a look at the tinyurl script, that'll give you some ideas for using pubm for grabbing urls:

http://www.egghelp.org/cgi-bin/tcl_arch ... ad&id=1259