Submit an HTML form

Help for those learning Tcl or writing their own scripts.
tintin
Voice
Posts: 5
Joined: Thu Feb 07, 2013 3:25 am

Submit an HTML form

Post by tintin »

Sorry if this has been answered somewhere else; if you know of an existing thread, feel free to point me in the right direction.

OK, so I'm trying to get a script to search some forums and relay the results to the channel or a PM, but these forums are login-only, so I need the bot to log into the site first. Once I can do that, I'll be able to write the rest of the code relatively easily.

So I need http::geturl to submit the login form and then make the search.

The form has two fields plus the submit button: username and password.

These are on the obviously named http://somesite.tld/login.php
and the search is submitted from http://somesite.tld/forums.php?action=s ... h=whatever

I only need help with the login and with keeping the session logged in so the search can be made.
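
Just to sketch the flow I have in mind (rough and untested; the username and password field names are from the form above, while the submit value and the exact search URL are placeholders):

Code: Select all

package require http

# untested sketch: POST the login form, keep the Set-Cookie values
set query [::http::formatQuery username "myuser" password "mypass" submit "Login"]
set tok [::http::geturl "http://somesite.tld/login.php" -query $query -timeout 10000]
upvar #0 $tok state
set cookies {}
foreach {name value} $state(meta) {
   if {[string equal -nocase Set-Cookie $name]} {
      lappend cookies [lindex [split $value ";"] 0]
   }
}
::http::cleanup $tok

# placeholder search url - the real forums.php query string goes here
set searchUrl "http://somesite.tld/forums.php?action=search&search=whatever"
# send the cookies back so the site sees a logged-in session
set tok [::http::geturl $searchUrl -timeout 10000 -headers [list Cookie [join $cookies ";"]]]
set html [::http::data $tok]
::http::cleanup $tok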

Thanks for any help
speechles
Revered One
Posts: 1398
Joined: Sat Aug 26, 2006 10:19 pm
Location: emerald triangle, california (coastal redwoods)

Re: Submit an HTML form

Post by speechles »

A wrapper is needed to simplify this entire process: one with redirect and cookie support that can handle errors gracefully, entirely recursive, and hopefully straightforward in how it works.

It would be used like so:
s:wget <url-string> <type-string> <referer-string> <cookies-list> <redirects-int> <poison-int>
The only required argument is <url-string>; the rest take defaults if omitted. The timeout and user agent are hard-coded to keep this simple as well...

You would interact with the wrapper like so...

Code: Select all

# login and get cookiejar and html to start with
set reply [s:wget "http://website.com/login?username=user&password=pass&submit=submit%20the%20url" POST]
# was there an error?
if {$reply != 0} {
   # no, separate html and cookiejar from reply
   foreach {html cookiejar_login} $reply { break }
   # get next url, using cookiejar_login obtained from above
   set reply [s:wget "http://website.com/search?text=this&somethingelse=that&submit=Go" GET "http://website.com/login" $cookiejar_login]
   # was there an error?
   if {$reply != 0} {
      # no, get html to parse and create a cookiejar for search,
      # separate from the one we use for login. we may never
      # use cookiejar_search, but if we need it, it's there.
      foreach {html cookiejar_search} $reply { break }
      # rest of script here..
      # etc...
      # etc...using cookiejar_login
      # if cookiejar_login expires, you will need
      # to re-login, you can use a timer or bind
      # to time to simply reget a new fresh
      # cookiejar and keep this in global space
      # for the search procedure to use. there
      # are many ways to go from here....
   } else {
      # yes, error type is issued in partyline...
   }
} else {
  # yes, error type is issued in partyline...
}
  
Here is the code to said wrapper:

Code: Select all

# AN EASIER WAY v1.1
# by speechles (c) 2013

# s:wget - simple http wrapper for tcl scripting to access html/cookies
# s:debug - simple debug output for s:wget wrapper
#
# v1.1 - introducing an easier way...
# v1.0 - the beginning...

package require http
# remove the two lines below if you don't care about HTTPS
package require tls
http::register https 443 [list ::tls::socket -require 0 -request 1]

# recursive wget with cookies and referer
proc s:wget { url {type GET} {refer ""} {cookies ""} {re 0} {poison 0} } {
   http::config -useragent "Mozilla/EggdropWget"
   # if we have cookies, let's use em ;)
   if {[string equal -nocase GET $type]} {
      if {![string length $cookies]} {
         catch {set token [http::geturl $url -binary 1 -timeout 10000]} error
      } else {
         catch {set token [::http::geturl $url -binary 1 -headers [list "Referer" "$refer" "Cookie" "[string trim [join $cookies {;}] {;}]" ] -timeout 10000]} error
      }
   } else {
      foreach {url query} [split $url ?] {break}
      if {![string length $cookies]} {
         catch {set token [http::geturl $url -query $query -binary 1 -timeout 10000]} error
      } else {
         catch {set token [::http::geturl $url -query $query -binary 1 -headers [list "Referer" "$refer" "Cookie" "[string trim [join $cookies {;}] {;}]" ] -timeout 10000]} error
      }
   }
   # error condition 1, invalid socket or other general error
   if {![string match -nocase "::http::*" $error]} {
      s:debug "Error: [string totitle [string map {"\n" " | "} $error]] \( $url \)"
      return 0
   }
   # error condition 2, http error
   if {![string equal -nocase [::http::status $token] "ok"]} {
      s:debug "Http error: [string totitle [::http::status $token]] \( $url \)"
      http::cleanup $token
      return 0
   }
   upvar #0 $token state
   # iterate through the meta array to grab cookies
   foreach {name value} $state(meta) {
      # do we have cookies?
      if {[regexp -nocase ^Set-Cookie$ $name]} {
         # yes, add them to cookie list
         lappend ourCookies [lindex [split $value {;}] 0]
      }
   }
   # if no cookies this iteration remember cookies from last
   if {![info exists ourCookies] && [string length $cookies]} {
      set ourCookies $cookies
   }
   # recursive redirect support, 300's
   # the full gamut of browser support, hopefully ... ;)
   if {[string match "*[http::ncode $token]*" "303|302|301" ]} {
      foreach {name value} $state(meta) {
         if {[regexp -nocase ^location$ $name]} {
            if {![string match "http*" $value]} {
               # fix our locations if needed
               if {![string match "/" [string index $value 0]]} {
                  set value "[join [lrange [split $url "/"] 0 2] "/"]/$value"
               } else {
                  set value "[join [lrange [split $url "/"] 0 2] "/"]$value"
               }
            }
            # catch redirects to self. There is one rule:
            # A url can redirect to itself a few times to attempt to
            # gain proper cookies, or referers. This is hard-coded at 2.
            # We catch the 3rd time and poison our recursion with it.
            # This will stop the madness ;)
            if {[string match [string map {" " "%20"} $value] $url]} {
               incr poison
               if {$poison > 2} {
                  s:debug "HTTP Error: Redirect error self to self \(3rd instance poisoned\) \( $url \)"
                  http::cleanup $token
                  return 0
               }
            }
            # poison any nested recursion over 10 traversals deep. no legitimate
            # site needs to do this. EVER!
            if {[incr re] > 10} {
              s:debug "HTTP Error: Redirect error (>10 too deep) \( $url \)"
              http::cleanup $token
              return 0
            }
            http::cleanup $token
            # recursive redirect by passing cookies and referer
            # this is what makes it now work! :)
            if {![info exists ourCookies]} {
               return [s:wget [string map {" " "%20"} $value] $type $url "" $re $poison]
            } else {
               return [s:wget [string map {" " "%20"} $value] $type $url $ourCookies $re $poison]
            }
         }
      }
   }
   # waaay down here, we finally check the ncode for 400 or 500 codes
   if {[string match 4* [http::ncode $token]] || [string match 5* [http::ncode $token]]} {
      s:debug "Http resource is not available: [http::ncode $token] \( $url \)"
      http::cleanup $token
      return 0
   }
   # --- return reply
   set data [http::data $token]
   http::cleanup $token
   if {[info exists ourCookies]} {
      return [list $data $ourCookies]
   } else {
      return [list $data ""]
   }

}

# debug - errors pass thru here
proc s:debug {text} {
  putlog "s:wget: $text"
}

# eof
.tcl set cookie [lindex [s:wget http://www.google.com/search?q=hi POST] 1]
> [01:32] s:wget: Http resource is not available: 405 ( http://www.google.com/search )
> Tcl:
.tcl set cookie [lindex [s:wget http://www.google.com/search?q=hi GET] 1]
> Tcl: PREF=ID=5fce19142b49a09b:FF=0:TM=1368865867:LM=1368865867:S=VkNZO3p8OIULztjJ NID=67=JZcQs5ry96mcOry62REFi1cnEuQmFWCfFSez9hifN9Raz5PWwHTcJY1HBkepwwMQgI3dyMdIfqYzjVq6rPpsvRFi61yiloXZpdtPJhVEbOAByVvHRRVKYPaFN1TEWhZS
.tcl llength $cookie
> Tcl: 2
Google doesn't like POST used for GET requests as you can tell, and it also issues "two" cookies to keep track of... Hence the "cookiejar" is always a list. ;)
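
A hypothetical follow-up from the partyline could then feed that cookie list straight back into the wrapper:

Code: Select all

# hypothetical follow-up, reusing the cookiejar captured above
set reply [s:wget "http://www.google.com/search?q=tcl" GET "http://www.google.com/" $cookie]
if {$reply != 0} {
   # reply is {html cookiejar}, same shape as before
   foreach {html cookiejar} $reply { break }
   putlog "wget returned [string length $html] bytes of html"
}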

Hopefully, with all the # comments in the code, it's clear what is going on...
tintin
Voice
Posts: 5
Joined: Thu Feb 07, 2013 3:25 am

Post by tintin »

Thanks, that worked a charm.
tintin
Voice
Posts: 5
Joined: Thu Feb 07, 2013 3:25 am

Post by tintin »

I'm just going to reuse this topic as it's still relevant. I'm using the http module to edit a post and add a picture. I can get the script to log in and post the data, but the problem is that it wipes the rest of the post. The edit page has an image field and a description field, and the description is the part that gets lost when posting the image. I think this is by design, so I figured I could just back up the description field and re-submit it along with the picture. I'm not sure that's the best way to go about it, so I'm open to suggestions. The issue is that the backed-up data does not preserve newlines, so when it's added back it all appears on one line, which is not what I want.

I'm just using a regexp to grab the description from the edit page (to keep any BBCode intact). Again, I'm not sure if that's the best approach or if my regexp is wrong. Here is the code section in question; the image and, in future, the post ID are handled in a different section.

Code: Select all

proc image:upload {image site} {
   http::config -useragent "Mozilla/5.0 (Windows NT; U; Windows NT x86; en-ES; rv:1.9.0.3) Firefox 3.0.7" -accept "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
   # log in and collect the session cookies
   set login [::http::formatQuery username user password password]
   set tok [::http::geturl http://www.site.net/login.php -query $login]
   upvar #0 $tok state
   set cookies [list]
   foreach {name value} $state(meta) {
      if {$name eq "Set-Cookie"} {
         lappend cookies [lindex [split $value {;}] 0]
      }
   }
   putlog $cookies
   ::http::cleanup $tok
   # fetch the edit page so the current description can be backed up
   set backup [http::geturl http://www.site.net/torrents.php?action=editgroup&groupid=31 -timeout 10000 -headers [list Cookie [join $cookies {;}]]]
   set data [::http::data $backup]
   ::http::cleanup $backup
   # flatten the page, then grab the description out of the textarea
   regsub -all {\n|\t} $data "" data
   regexp -nocase -all -- {<textarea name="body" cols="91" rows="20">(.*?)</textarea>} $data desc
   putlog "Desc $desc"
   # re-submit the backed-up description together with the new image
   set cover [http::formatQuery image $image body $desc auth authcode action takegroupedit groupid 31]
   putlog $cover
   set token [http::geturl http://www.site.net/torrents.php -query $cover -timeout 10000 -headers [list Cookie [join $cookies {;}]]]
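
Looking at it again, I suspect the regsub above is what kills the newlines, and with a single match variable the regexp puts the whole <textarea> match into desc rather than just the capture group. Something like this untested sketch might keep the description intact (same variable names as above; Tcl's regexp matches newlines with . by default):

Code: Select all

# strip tabs only, keep the newlines so the bbcode layout survives
regsub -all {\t} $data "" data
# "->" soaks up the full match; desc receives capture group 1 (the description)
if {[regexp -nocase -- {<textarea name="body"[^>]*>(.*?)</textarea>} $data -> desc]} {
   putlog "Desc kept [llength [split $desc \n]] line(s)"
}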