[SOLVED]help using egghttp from egghttp_example.tcl

Help for those learning Tcl or writing their own scripts.
malcsman
Voice
Posts: 8
Joined: Sat Apr 12, 2008 5:19 pm

Post by malcsman »

Hello,

I'm really confused as to how this all works. Could someone please help me?
All I want is to show the Bible verses.

If I echo $body, I get output. If I use any other variable, nothing comes out, or it comes out as "example".

Here's my code:

Code: Select all

# egghttp_example.tcl

# Config
set url "http://www.biblegateway.com/passage/?search=mark%204:1-9&version=72"
set dcctrigger "example"
# End of config

if {![info exists egghttp(version)]} {
  putlog "egghttp.tcl was NOT successfully loaded."
  putlog "biblegateway.tcl has not been loaded as a result."
} else {
  proc biblegateway {sock} {
    global url
    set headers [egghttp:headers $sock]
    set body [egghttp:data $sock]
    set boddy "example"

    regsub -all "\n" $body "" body
    regsub -all -nocase {<br>} $body " " body

    regsub -all -nocase {<p>} $body " " body
    regsub -all -nocase {<span id=\"en-TNIV-24335\" class=\"sup\">} $body "\002" body
    regsub -all -nocase {</span>} $body "\002" body
    regsub -all -nocase {<br>} $body " " body
    regsub -all -nocase {</p>} $body " " body

    regexp {<div class="result-text-style-normal">(.*?)<p/><p><br/></p></div>} $body - boddy

foreach line [split $body \n] {
  if {[string match "*<div class=\"result-text-style-normal\">*" $line]} {
    set start [string first "<div class=\"result-text-style-normal\">" $line]
    set start [expr {$start + 38}] ;# 38 is the number of characters in "<div class=\"result-text-style-normal\">"
    set end [string first "<p/><p><br/></p></div>" $line]
    set end [expr {$end - 1}];# We don't want the "<" from "<p/>"
    set boddy [string range $line $start $end]
  }
}


putlog "PRIVMSG #questions :Mark 4v1-9 '$boddy' THE END!"
}

  bind dcc n|n $dcctrigger our:dcctrigger
  proc our:dcctrigger {hand idx text} {
    global url
    set sock [egghttp:geturl $url biblegateway]
    return 1
  }

  putlog "biblegateway.tcl has been successfully loaded."
}

As you can see, I've tried everything. My last attempt was with the loop!

thanks in advance.
pilchards
Last edited by malcsman on Mon Apr 14, 2008 2:48 pm, edited 1 time in total.
strikelight
Owner
Posts: 708
Joined: Mon Oct 07, 2002 10:39 am

Post by strikelight »

You've already removed \n's with your regsub...
Your regexp already gets the portion of the text that you want... so searching for those tokens in your loop is pointless.

What is the output you get when you run your example?
speechles
Revered One
Posts: 1398
Joined: Sat Aug 26, 2006 10:19 pm
Location: emerald triangle, california (coastal redwoods)

Post by speechles »

Code: Select all

    set boddy "example" 
    regsub -all "\n" $body "" body
    regsub -all -nocase {<br>} $body " " body

    regsub -all -nocase {<p>} $body " " body
    regsub -all -nocase {<span id=\"en-TNIV-24335\" class=\"sup\">} $body "\002" body
    regsub -all -nocase {</span>} $body "\002" body
    regsub -all -nocase {<br>} $body " " body
    regsub -all -nocase {</p>} $body " " body

    regexp {<div class="result-text-style-normal">(.*?)<p/><p><br/></p></div>} $body - boddy 

This will never work: the regexp will never see a <p> tag (not to mention that <p/> is incorrect tagging). He regsubs those tags into spaces before the regexp ever gets to see $body, so the match fails and the contents of $boddy are not overwritten; $boddy remains "example". (For a previously undefined variable, the variable will not even be created if the regexp fails.) So the output when he runs his example is just that, as he stated in his plea for assistance.
I know you know this strikelight, I was just mentioning it for his benefit. ;)

@malcsman: The reason you see others use regsub before their regexps is to remove any HTML tokens not needed for parsing before parsing begins. That way, minor website changes shouldn't cause a hassle for the script if they merely involve those removed tokens. But if you remove too many HTML tokens, your regexps may no longer be able to parse anything (they will fail and leave their match variables unset). So the general rule of thumb is: only regsub out HTML tokens you perceive as clutter (cruft), not those you could base your regexps around.
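
A minimal sketch of the ordering problem described above, runnable in a plain tclsh. The sample markup and the variable names (stripped, result1, result2, fresh) are made up for illustration and are not taken from the real page.

```tcl
# Hypothetical sample standing in for the fetched page; the real markup differs.
set body {<div class="verse"><p>In the beginning</p></div>}

# Case 1: strip the <p> tags first, then try to match on them.
# The regexp finds nothing, returns 0, and leaves result1 untouched.
set result1 "example"
regsub -all -nocase {</?p>} $body " " stripped
set matched1 [regexp {<p>(.*?)</p>} $stripped - result1]
puts "stripped first: matched=$matched1 result='$result1'"

# Case 2: match on the tags while they are still there.
set result2 "example"
set matched2 [regexp {<p>(.*?)</p>} $body - result2]
puts "matched first:  matched=$matched2 result='$result2'"

# Case 3: a failed regexp does not create a match variable
# that did not exist before the call.
unset -nocomplain fresh
regexp {<h9>(.*?)</h9>} $body - fresh
puts "fresh exists: [info exists fresh]"
```

Checking the return value of regexp (as in `if {[regexp ...]} {...} else {...}`) is one way to notice a failed match instead of silently printing a stale value.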

Post by strikelight »

speechles wrote:

I know you know this strikelight, I was just mentioning it for his benefit. ;)

Ya, have to admit, I missed it... didn't pay close enough attention, I guess :oops:

Post by speechles »

@malcsman, try using this snippet for inspiration. It is enabled with .chanset #yourchan +bible. Have fun. :D

Code: Select all

see code snippet in my last post below.
Last edited by speechles on Sun Apr 13, 2008 11:58 pm, edited 2 times in total.

Post by malcsman »

wow,
thanks strikelight and speechles!!

To let you know, I have no clue how regexp and all works. My only programming experience is Turbo Pascal 7 from school, HTML, and a little PHP.
That <p/> was actually copied out of the website. I was a little confused as to why it was not </p>.

I'll most definitely use it for inspiration.

thanks again to both of you
SMiLE

Post by malcsman »

Um, OK. I did a .rehash and then it crashed, so I checked the logs and this is what they said:
[14:15] egghttp.tcl API v1.1.0 by strikelight now loaded.
[14:15] Tcl error in file 'statsbot.conf':
[14:15] wrong # args: extra words after "else" clause in "if" command
while executing
"if {![info exists egghttp(version)]} {
putlog "egghttp.tcl was NOT successfully loaded."
putlog "biblegateway.tcl has not been loaded as a result...."
(file "scripts/biblegateway.tcl" line 5)
invoked from within
"source scripts/biblegateway.tcl"
(file "statsbot.conf" line 1369)
[14:15] * CONFIG FILE NOT LOADED (NOT FOUND, OR ERROR)

I don't know Tcl, so I'm confused! I don't know how to fix it...
HeLP

Is it possibly the curly braces at the end?

Code: Select all

# pull the verses from the page in list format
    # which allows using a foreach to step through it.
    regexp -all -inline {<span id.*">(.+?)</p>} $body - text]} {

Post by malcsman »

OK, that didn't work. Trying something else: see the comments marked 'try 1' and 'try 2'.

Code: Select all

 # pull the verses from the page in list format
    # which allows using a foreach to step through it.
    # i removed The '} {' from the end of the line --FAILED and put back - try 1 unsuccessful
    regexp -all -inline {<span id.*">(.+?)</p>} $body - text]} {

    # at this point, you can figure it out more than likely...
    # $title is your main title

Code: Select all

# grab the subtitle and other relevant subtitles
    # gather them into a single line.
    # removed a '}' end of moretitle} -- try 2
    if {[regexp {<p><h.*>(.+?)<span id} $body - moretitle} {
      # change closing header tag to a period to better
      # compact all the subtitles into one line and keep
      # it looking pretty.

Post by malcsman »

OK, I opened NetBeans and got a new script file. I pasted the code and played with it until the else { had its partner } at the bottom. Not sure what I did, but I got to this:

Code: Select all

# Config
set pubtrigger "!example"
# End of config

if {![info exists egghttp(version)]} {
  putlog "egghttp.tcl was NOT successfully loaded."
  putlog "biblegateway.tcl has not been loaded as a result."
} else {
  proc biblegateway {sock} {
    global url
    global chan
    set headers [egghttp:headers $sock]
    set body [egghttp:data $sock]
    # remove all newlines, carriage returns, tabs, and vertical tabs
    regsub -all {(?:\n|\r|\t|\v)} $body "" body
 
    # grab the title and if success, change &nbsp; into space
    # if fails, set title accordingly
    if {[regexp {<h3>(.+?)</h3>} $body - title]} {
      # change &nbsp; into space
      regsub -all {&nbsp;} $title " " title
    } else {
      # if no title is found we must set one
      set title "No Title"
    }
   
    # grab the subtitle and other relevant subtitles
    # gather them into a single line.
    if {[regexp {<p><h.*>(.+?)<span id} $body - moretitle]} {
      # change closing header tag to a period to better
      # compact all the subtitles into one line and keep
      # it looking pretty.
      regsub -all {</h.*>} $moretitle ". " moretitle
      regsub -all {&nbsp;} $title " " title
      # remove all remaining html tags.
      regsub -all {<.*>} $moretitle "" moretitle
    } else {
      # some valid pages have no subtitles
      # so if none is found, we will just set it blank.
      set moretitle ""
    }

    # pull the verses from the page in list format
    # which allows using a foreach to step through it.
    regexp -all -inline {<span id.*">(.+?)</p>} $body - text 

    # at this point, you can figure it out more than likely...
    # $title is your main title
    # $moretitle is the subtitles
    # $text is the body of each verse in list format

    putserv "privmsg $chan :$title"
    # eggdrop will not message a blank line
    # so this line may not be messaged.
    putserv "privmsg $chan :$moretitle"
    # roll thru the list
    foreach line $text {
      putserv "privmsg $chan :$line"
    }
  }


  # bind pub
  bind pub -|- $::pubtrigger our:pubtrigger
  # channel flag
  setudef flag bible

  proc our:pubtrigger {nick uhand hand chan text} {
    global url
    global chan
    # if bible isn't set for the channel, return and do nothing
    if {[lsearch -exact [channel info $chan] +bible] == -1} {return}
    # set the url using urlencode function based on $text
    set url "http://www.biblegateway.com/passage/?search=[urlencode $text]&version=72"
    # call bible
    set sock [egghttp:geturl $url biblegateway]
  }

  # convert text into html approved % codes.
  proc urlencode {text} {
    set aurl ""
    foreach byte [split [encoding convertto utf-8 $text] ""] {
      scan $byte %c i
      if {$i < 65 || $i > 122} {
        append aurl [format %%%02X $i]
      } else {
        append aurl $byte
      }
    }
    return [string map {%3A : %2D - %2E . %30 0 %31 1 %32 2 %33 3 %34 4 %35 5 %36 6 %37 7 %38 8 %39 9 \[ %5B \\ %5C \] %5D \^ %5E \_ %5F \` %60} $aurl]
  }

  putlog "biblegateway.tcl has been successfully loaded."
} 

I did a .rehash and it didn't crash the bot. I'm going to go have tea and maybe leave home after that, so I don't know when I can test it.

I wish I could have had a running commentary while I was playing with the code. I don't think I have learnt anything... :(

until then, bye?!?

Post by malcsman »

OK, couldn't resist...
I got this error when I said !example mark 1:1:
[15:57] Tcl error [our:pubtrigger]: variable "chan" already exists
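
For reference, this error can be reproduced outside the bot. In the script above, chan is a parameter of our:pubtrigger, so it is already a local variable when `global chan` runs. A minimal sketch, assuming a plain tclsh; the procs broken and fixed and the global name lastchan are made up for illustration:

```tcl
# chan is a parameter here, so it already exists as a local variable.
# "global chan" then tries to link the same name to a global and fails.
proc broken {nick chan} {
  global chan
  return $chan
}

if {[catch {broken someone "#questions"} err]} {
  puts "error: $err"
}

# One way around it: keep the parameter name and copy its value into a
# global that has a different name (lastchan is a hypothetical name).
proc fixed {nick chan} {
  global lastchan
  set lastchan $chan
  return $lastchan
}

puts [fixed someone "#questions"]
```

The callback that later reads the channel would then declare `global lastchan` instead of `global chan`.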

Post by speechles »

see my very last post, the most relevant. the code here was buggy and useless. same can be said of the entire text of this post, heh.
Last edited by speechles on Mon Apr 14, 2008 12:33 am, edited 2 times in total.

Post by malcsman »

Um, OK, perhaps something else is preventing things from happening.
Here's the console output after I give the command:
[21:05] [@] frommn!~ninda@SF-609D90B5.cybersmart.co.za PRIVMSG #questions :!example genesis 4:12
[21:05] WARNING: open_telnet_raw() is about to block in gethostbyname()!
[21:05] net: connect! sock 27
[21:05] net: eof!(read) socket 27

Post by speechles »

<speechles> !bible mark 4:1-9
<sp33chy> Mark 4:1-9 (Today's New International Version)
<sp33chy> Mark 4. The Parable of the Sower.
<sp33chy> 1) Again Jesus began to teach by the lake. The crowd that gathered around him was so large that he got into a boat and sat in it out on the lake, while all the people were along the shore at the water's edge.
<sp33chy> 2) He taught them many things by parables, and in his teaching said:
<sp33chy> 3) "Listen! A farmer went out to sow his seed.
<sp33chy> 4) As he was scattering the seed, some fell along the path, and the birds came and ate it up.
<sp33chy> 5) Some fell on rocky places, where it did not have much soil. It sprang up quickly, because the soil was shallow.
<sp33chy> 6) But when the sun came up, the plants were scorched, and they withered because they had no root.
<sp33chy> 7) Other seed fell among thorns, which grew up and choked the plants, so that they did not bear grain.
<sp33chy> 8) Still other seed fell on good soil. It came up, grew and produced a crop, some multiplying thirty, some sixty, some a hundred times."
<sp33chy> 9) Then Jesus said, "Whoever has ears to hear, let them hear."

<speechles> !bible genesis 4:12
<sp33chy> Genesis 4:12 (Today's New International Version)
<sp33chy> 12) When you work the ground, it will no longer yield its crops for you. You will be a restless wanderer on the earth."

Code: Select all

# BibleGateway Example using EggHttp via speechless
# freeware - feel free to use any part of this script for any purpose.
# To enable this script requires one of 2 methods below:
# 1) some chans: on dcc partyline, type: .chanset #yourchan +bible
# 2) all bot chans: on dcc partyline, type .chanset * +bible

# NOTE: Code is purposely over commented to explain each section.

# Config
set pubtrigger "!bible"
set pubspeed 5
# End of config

# script begins with egghttp check
if {![info exists egghttp(version)]} {
  putlog "egghttp.tcl was NOT successfully loaded."
  putlog "biblegateway.tcl has not been loaded as a result."
} else {
  proc biblegateway {sock} {
    global url
    global chan2
    set headers [egghttp:headers $sock]
    set body [egghttp:data $sock]
    # remove all newlines, carriage returns, tabs, and vertical tabs
    regsub -all {(?:\n|\r|\t|\v)} $body "" body

    # grab the title and if success, change &nbsp; into space
    # if fails, set title accordingly
    if {[regexp {<h3>(.+?)</h3>} $body - title]} {
      # change &nbsp; into space
      regsub -all {&nbsp;} $title " " title
    } else {
      # if no title is found we must set one
      set title "No Title"
    }
   
    # grab the subtitle and other relevant subtitles
    # gather them into a single line.
    if {[regexp {<p><h.*?>(.+?)<span id} $body - moretitle]} {
      # these end each subtitle, changing them to . allows
      # them to fit on one line, in a sort-of sentence.
      regsub -all {</h.>} $moretitle ". " moretitle
      # replace non-breaking space tags with true spaces.
      regsub -all {&nbsp;} $moretitle " " moretitle
      # remove all remaining html tags.
      regsub -all {<.*?>} $moretitle "" moretitle
    } else {
      # some valid pages have no subtitles
      # so if none is found, we will just set it blank.
      set moretitle ""
    }
    
    # message title and subtitles.
    putserv "privmsg $chan2 :$title"
    putserv "privmsg $chan2 :$moretitle"

    # reset pubspeed counter
    set count 0
    # while we have text, let's recurse the loop
    while {[regexp -- {<span id.*?">(.+?)<span} $body - line]} {
      # increment pubspeed counter
      incr count
      # remove from the $body the regexp leader
      # so that the same exact segment can never
      # match twice.
      regsub -- {<span id.*?">} $body "" body
      # each verse number is encased within span tags
      # changing the end tag to a parenthesis cleans up text
      regsub -all {</span>} $line "\)" line
      # this combination should be dealt with first
      regsub -all {\[<.*?>\]} $line "" line
      # now remove the rest of the html tags
      regsub -all {<.*?>} $line "" line
      # this non-breaking space is for web display
      # irrelevant on irc, let's remove it
      regsub -all {&nbsp;} $line "" line
      # pubspeed check, if exceeded puthelp
      if {$count > $::pubspeed} {
        puthelp "privmsg $chan2 :$line"
      } else {
        putserv "privmsg $chan2 :$line"
      }
    }
  }


  # bind pub
  bind pub -|- $::pubtrigger our:pubtrigger
  # channel flag
  setudef flag bible

  proc our:pubtrigger {nick uhand hand chan text} {
    global url
    global chan2
    set chan2 $chan
    # if bible isn't set for the channel, return and do nothing
    if {[lsearch -exact [channel info $chan] +bible] == -1} {return}
    # set the url using urlencode function based on $text
    set url "http://www.biblegateway.com/passage/?search=[our:urlencode $text]&version=72"
    # call bible
    set sock [egghttp:geturl $url biblegateway]
  }

  # convert text into html approved % codes.
  proc our:urlencode {text} {
    set aurl ""
    foreach byte [split [encoding convertto utf-8 $text] ""] {
      scan $byte %c i
      if {$i < 65 || $i > 122} {
        append aurl [format %%%02X $i]
      } else {
        append aurl $byte
      }
    }
    return [string map {%3A : %2D - %2E . %30 0 %31 1 %32 2 %33 3 %34 4 %35 5 %36 6 %37 7 %38 8 %39 9 \[ %5B \\ %5C \] %5D \^ %5E \_ %5F \` %60} $aurl]
  }

  putlog "biblegateway.tcl has been successfully loaded."
}

Tested & works, doing the lord's work.. haw.. pubspeed is the number of lines that will be sent with putserv; the rest will be sent with puthelp. If you find the bot getting excess flooded off IRC, lower it. If you find the bot is too slow to display everything, raise it.

You will now need to develop some kind of flood protection as well as some kind of error detection mechanism, other than that here you go, enjoy.. ;)
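
One possible shape for the flood protection mentioned above is a per-channel cooldown. This is only a sketch under assumptions: floodwait, lastrequest, and our:allowed are hypothetical names that do not appear in the script, and the 30 second window is an arbitrary choice.

```tcl
# Hypothetical settings, not part of the script above.
set floodwait 30           ;# seconds a channel must wait between requests
array set lastrequest {}   ;# channel name -> time of its last accepted request

# Returns 1 if the request may proceed, 0 if the channel asked too recently.
# The current time is passed in so the proc can be tried without waiting.
proc our:allowed {chan now} {
  global floodwait lastrequest
  # IRC channel names are case-insensitive, so normalise the key.
  set key [string tolower $chan]
  if {[info exists lastrequest($key)] && $now - $lastrequest($key) < $floodwait} {
    return 0
  }
  set lastrequest($key) $now
  return 1
}

puts [our:allowed "#questions" 100]   ;# first request is accepted
puts [our:allowed "#questions" 110]   ;# 10 seconds later: rejected
puts [our:allowed "#questions" 130]   ;# 30 seconds later: accepted again
```

In the bot, a line such as `if {![our:allowed $chan [clock seconds]]} {return}` near the top of our:pubtrigger would apply the check before any HTTP request is made.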

Note: inline regexp with a foreach would've been nice to use, but it isn't possible with how the page is built. The while loop is the way to do it in this case.
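
For pages where it does fit, the inline approach looks roughly like this. The markup below is hypothetical and far simpler than the real page. Note that with -inline, regexp returns its matches as a list and does not accept match variables.

```tcl
# Hypothetical, simplified markup; the real page is messier.
set body {<span class="num">1</span>First verse.<span class="num">2</span>Second verse.}

# With -all -inline and two capture groups, the returned list holds three
# elements per match: the full match, then each captured group.
set matches [regexp -all -inline {<span class="num">(\d+)</span>([^<]*)} $body]

# Step through the list three elements at a time.
set lines {}
foreach {full num verse} $matches {
  lappend lines "$num) $verse"
}
puts [join $lines "\n"]
```

The number of variables in the foreach must match the number of capture groups plus one; otherwise the elements drift out of alignment.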

Thanks, one more question...

Post by malcsman »

thanks speechles

I'll create a new thread if I need more help. I worked out the following via trial and error:

Whereabouts do I check the text for errors before it fetches anything from the site? After

Code: Select all

# if bible isn't set for the channel, return and do nothing
    if {[lsearch -exact [channel info $chan] +bible] == -1} {return} 
?

And do I use the variable $text to check for correct spellings and to allow shortcuts like 'cor' for Corinthians?

Like

Code: Select all

 if {[lindex [split $text] 0] == "help"} {biblehelp} 

to check if someone asked for help, then go to a proc biblehelp and return? I'm not sure how to do this yet, but I'll keep looking!

thanks
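
One way to approach both questions is to normalise $text in a small helper before the URL is built. This is a rough sketch, not a tested part of the bot: the proc name our:normalize and the bookalias table are made up for illustration, and only the first word of the request is examined (so "1 cor 13" is not handled).

```tcl
# Hypothetical abbreviation table; extend as needed.
array set bookalias {
  gen genesis
  cor corinthians
  mk  mark
}

# Returns "help" when help was requested (or nothing was typed),
# otherwise the request with a known abbreviation in its first word expanded.
proc our:normalize {text} {
  global bookalias
  set words [split [string trim $text]]
  set first [string tolower [lindex $words 0]]
  if {$first eq "help" || $first eq ""} {
    return "help"
  }
  if {[info exists bookalias($first)]} {
    set words [lreplace $words 0 0 $bookalias($first)]
  }
  return [join $words]
}

puts [our:normalize "cor 13:4"]
puts [our:normalize "help"]
puts [our:normalize "mark 4:1-9"]
```

In our:pubtrigger this would sit right after the +bible check: if the result is "help", message a usage line to the channel and return; otherwise pass the result to the URL encoder in place of $text.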