Script to pubmsg webpage titles from pasted urls

spankymcfresh
Voice
Posts: 2
Joined: Wed Sep 20, 2006 5:19 pm

Post by spankymcfresh »

Anyone know of a simple script that will just publicly message the titles of urls pasted in a channel?

Thanks! :oops:
rosc2112
Revered One
Posts: 1454
Joined: Sun Feb 19, 2006 8:36 pm
Location: Northeast Pennsylvania

Post by rosc2112 »

I've posted several sample scripts on how to retrieve data from webpages, like:
http://forum.egghelp.org/viewtopic.php?p=64426#64426

You'd just be regexp searching for

Code:

{<title>(.*?)</title>}
Oh, and you'd use $text as the $url (which could be slightly dangerous, since it lets users inject their own strings/URLs into that...)
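
As a rough illustration of the two points above, here is a minimal sketch in plain Tcl. The proc names extract_title and safe_url are hypothetical (they are not from the linked sample scripts), and the character whitelist in safe_url is only one assumed way of limiting what a channel user can feed into the fetch, not a complete defence.

```tcl
# Hypothetical helper: pull the <title> text out of an HTML string.
# Uses the same pattern as above; returns "" when no title is found.
proc extract_title {html} {
    if {![regexp -nocase {<title>(.*?)</title>} $html -> title]} {
        return ""
    }
    # Collapse newlines, tabs and repeated spaces into single spaces.
    regsub -all {\s+} [string trim $title] " " title
    return $title
}

# Hypothetical helper: accept only plain http:// URLs built from a
# conservative character whitelist (assumption: anything else is rejected).
proc safe_url {word} {
    return [regexp -nocase {^http://[a-z0-9.-]+(:[0-9]+)?(/[a-z0-9._~%/?&=+#-]*)?$} $word]
}
```

A script could call safe_url on each pasted word first and only fetch the page and run extract_title on the result when the check passes.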

Post by spankymcfresh »

That is exactly what I wanted but wasn't searching for the right key words. Thanks!

I'm not sure how CPU-intensive this will be, but I've changed it so that it scans everything said in the channel and runs the procedure on each line. This appears to work for triggering without a space/keyword bind, though it strikes me as a nasty way to go about it. Any ideas, anyone?

Code:

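putlog "loaded: url.tcl (lists url titles from channel - thanks to rosc2112)"
package require http 2.3
bind pubm - * webcheck

proc webcheck {nick uhost hand chan text} {
   # split the line into words; $text is a plain string, not a Tcl list
   foreach i [split $text] {
      if {([string match "*http://*" $i]) || ([string match "*www.*" $i])} {
         set url [string trim $i]
         # ensure the URL has a scheme before fetching
         if {![string match -nocase "http://*" $url]} {
            set url "http://$url"
         }
         # skip this word if the fetch fails, so $page is never used unset
         if {[catch {set page [::http::geturl $url -timeout 90000]} error]} {
            putlog "url.tcl: $url: $error"
            continue
         }
         set html [::http::data $page]
         ::http::cleanup $page
         # match once against the whole page; a title can span several lines
         if {[regexp -nocase {<title>(.*?)</title>} $html match titlematch]} {
            # decode &amp; and collapse runs of whitespace
            set titlematch [string map {&amp; &} $titlematch]
            regsub -all {\s+} [string trim $titlematch] " " titlematch
            # skip "302"-style redirect pages; \002 toggles bold
            if {![string match "*302*" $titlematch]} {
               putserv "PRIVMSG $chan :\002URL title:\002 $titlematch"
            }
         }
      }
   }
}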
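
One possible way to tidy up the "scan everything" approach is to pull the URLs out of each line with a single regexp pass instead of testing every word. The sketch below is plain Tcl and only an illustration; the proc name find_urls is hypothetical, and treating any run of non-space characters after http:// or www. as the URL is an assumption.

```tcl
# Hypothetical helper: return every http:// or www. URL found in a line.
proc find_urls {text} {
    set urls {}
    # -all -inline returns each full match; the group is non-capturing.
    foreach match [regexp -all -inline -nocase {(?:http://|www\.)[^\s<>]+} $text] {
        # Give bare www. links a scheme so they can be fetched as-is.
        if {![string match -nocase "http://*" $match]} {
            set match "http://$match"
        }
        lappend urls $match
    }
    return $urls
}
```

The pubm handler could then loop over [find_urls $text] and do nothing when the list is empty. In Eggdrop, a narrower bind mask such as "*http://*" might also keep the proc from running on lines without a link, though that is an untested suggestion here.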

Post by rosc2112 »

Take a look at the tinyurl script, that'll give you some ideas for using pubm for grabbing urls:

http://www.egghelp.org/cgi-bin/tcl_arch ... ad&id=1259