
HTTP timeout problems

Post by burfo »

Hey folks,

Eggdrop 1.6.21
TCL 8.4.13
HTTP 2.2 or 2.7.9

I've been having some crazy hair-pulling problems with TCL and its HTTP module.

I have some simple TCL code that executes whenever someone types a specific command into chat. This command simply submits an HTTP query. (It happens the query is to a Google docs form, though I'm not sure that's relevant.)

Code:

set query [::http::formatQuery "pageNumber" 0 "entry.0.single" $a "entry.1.single" $b "entry.2.single" $c "entry.4.single" $d "entry.6.single" $e "entry.7.single" $f]
set token [::http::geturl $url -timeout 45000 -query $query]
set status [::http::status $token]
if {$status != "ok"} {
  putchan $chan "Error..."
  ::http::cleanup $token
  return 1
}
::http::cleanup $token
return 1
Now normally, this works just fine. However, every so often (apparently at random), a request will time out. When it does, it always leaves the TCP connection in a CLOSE_WAIT state with 1 byte in the receive queue:

Code:

Proto Recv-Q Send-Q Local Address               Foreign Address             State
tcp        1      0 *******************:33810   nuq04s08-in-f7.1e100.n:http CLOSE_WAIT
When this happens, the http module logs (as always) that it has closed the socket by calling 'close $sock', and no exception is thrown, so I don't think the HTTP module code is doing anything incorrectly.

From this point on, every future HTTP connection from the bot will fail (timeout). It fails because the socket it obtains never becomes writable (on this line in http code):

Code:

fileevent $sock writable [list http::Connect $token]
The only way I can resolve this is to kill the bot and start it up again. This causes the OS process to die and the CLOSE_WAIT TCP connection to go away. A ".restart" doesn't work because that doesn't kill the process.

Can anyone offer me any suggestions?

I've tried high and low to reproduce this problem directly in the TCL interpreter via scripts that submit queries every X seconds, but I've never been able to reproduce it in this way.

Thank you.

Post by speechles »

Code:

set query [::http::formatQuery "pageNumber" 0 "entry.0.single" $a "entry.1.single" $b "entry.2.single" $c "entry.4.single" $d "entry.6.single" $e "entry.7.single" $f]
catch {set token [::http::geturl $url -timeout 45000 -query $query]} error
# 1) Was an ::http::* session token correctly created?
if {![string match -nocase "::http::*" $error]} {
  # no, throw error
  putchan $chan "Error... [string totitle $error]"
  # return to avoid http cleanup below that depends on
  # the token which was never set that we won't have
  # which in turn would cause a tcl error at that point.
  return 1
}
# 2) Is the http transaction ok?
if {![string equal -nocase [::http::status $token] "ok"]} {
  # no, throw error
  putchan $chan "Error... [string totitle [::http::status $token]]"
  # no cleanup here or return
  # as this is called below anyways
  # why be redundant?!
}
# 3) Are we given a 200 code?
if {![string equal [::http::ncode $token] 200]} {
  # no, throw problem report to show numeric code given...
  # 302,403,501,etc
  putchan $chan "There was a problem: [::http::ncode $token] error"
}
# Normally we would call [::http::data $token] here
# and store it into a variable to use in additional code
# if this is the case you need to add:
# ::http::cleanup $token
# return 1
# to the end of conditionals 2 and 3 above as well as below

# Call cleanup to wipe http transaction
::http::cleanup $token
# return and log
return 1
The problem is that you are not using [catch] when calling ::http::geturl. Sometimes it cannot create an HTTP connection and raises an error, so your token variable never gets set. The uncaught error aborts the procedure at that point, and no further code executes. The socket in this case does not need to be cleaned up. Call cleanup only if a valid HTTP token has been stored in your variable (the catch can tell when it hasn't ;)). Test the code above and let me know if you have any issues. This is my usual way of handling all socket errors and HTTP session errors in my scripts.

BTW: The single byte held in the receive queue is the EOF marker. This is why you always read once before you check for true EOF.
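A minimal sketch of the "read once before you check for EOF" point, using a plain file rather than a socket so it runs anywhere. The helper name eof_before_and_after_read and the scratch file name are assumptions made up for this illustration; they are not part of the http package or of the thread's scripts.

```tcl
# Hypothetical helper: report [eof] before and after the first read.
proc eof_before_and_after_read {path} {
    set fh [open $path r]
    set before [eof $fh]   ;# 0: no read has been attempted yet
    set data [read $fh]    ;# reading up to the end is what sets the EOF flag
    set after [eof $fh]    ;# 1: the previous read hit end of file
    close $fh
    return [list $before $after [string length $data]]
}

# Demo on an empty scratch file (file name is an assumption).
set path [file join [pwd] eof_demo_[pid].tmp]
close [open $path w]
set result [eof_before_and_after_read $path]
file delete $path
puts $result
```

Even for an empty file, [eof] reports 0 until a read has actually been attempted, which is why the read comes first.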

BTW2: This is Google we are talking about as well. There may be some sort of "throttle" at work here where you are rate-limited. This allows you a lease of so much traffic per hour or such. When you exceed this rate, your lease is terminated and you start to receive only timeouts. This is likely the scenario you are experiencing; after adapting my code, you will see the errors in the channel.

BTW3: As above, this is Google. Further, the EOF marker is given because there is no page body associated with a 302 redirect. The code I gave above should tell you the numeric code when there is a problem. With a 302, you are expected to follow the redirect. It is redirecting you because of BTW2: you've exceeded the rate. You would need to incorporate support for following redirects and, since this is Google, possibly even for decoding captchas. This is where reality sets in and you learn to stick below whatever rate lease you are given. ;)
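As a rough sketch of what "following the redirect" would start with, the helper below picks the Location header out of a response's header list. The name redirect_target and the set of status codes it treats as redirects are assumptions for illustration. In a real script the header list would come from the token's state array (upvar #0 $token state; then $state(meta)), and the result would be passed to another ::http::geturl call.

```tcl
# Hypothetical helper: return the redirect target, or "" if there is none.
# ncode is the numeric status code; meta is a flat name/value header list.
proc redirect_target {ncode meta} {
    # Only treat the usual redirect codes as redirects.
    if {[lsearch -exact {301 302 303 307} $ncode] < 0} {
        return ""
    }
    # Header names are case-insensitive, so compare without regard to case.
    foreach {name value} $meta {
        if {[string equal -nocase $name "Location"]} {
            return $value
        }
    }
    return ""
}

# Example with made-up headers; example.invalid is a placeholder host.
set meta [list Content-Type text/html location http://example.invalid/next]
puts [redirect_target 302 $meta]
```

A real implementation would also cap the number of redirects it follows, so that a redirect loop cannot keep the bot busy indefinitely.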

Post by burfo »

Thank you for the information. I'm going to give your code a shot.

I don't think the problem is the lack of the catch. The code I listed has always gotten far enough to output the "Error..." message to the channel.

When I initially wrote the code I actually used the "-command" callback (so it doesn't hang processing) and did have a catch around the geturl call then--but it never caught anything. I thought the problem was caused by simultaneous processing, so I removed the callback, but the problem persisted.

Although, you're using:

Code:

catch {set token [::http::geturl ...]} error
But I was using:

Code:

if {[catch {::http::geturl ...} token]} { ... }
These are the same, right?

I'll report back after I've tried your example.

By the way:
"This is why you always read once before you check for true EOF."
Are you suggesting I should have something in my code to do this?

Post by burfo »

Using speechles' new code, I am still running into the TIMEOUT issues.

Any other ideas?

Edit: Hmm, this time it's because Google is being slow and not responding quickly enough. I'll keep monitoring.

Post by nml375 »

Using catch there will not have any effect on the issue (It is, however, correct to not call ::http::cleanup without a valid token).

Using the "error" argument of the catch command, as opposed to using set within the command argument, would work just as well (and makes for cleaner code, IMHO). Checking the result of the call to "catch" would probably be the preferred way, as geturl should always return a valid token unless it throws an exception.
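To make the comparison concrete, here is a small sketch of the two catch styles side by side. fake_geturl is a hypothetical stand-in for ::http::geturl so the example runs without a network connection; the two wrapper procs and their names are likewise assumptions for illustration only.

```tcl
# Hypothetical stand-in for ::http::geturl: returns a token or raises an error.
proc fake_geturl {url} {
    if {$url eq ""} { error "no url given" }
    return "::http::1"
}

# Style 1: set inside the script. The variable given to catch receives the
# script's result (the token) on success, or the error message on failure.
proc style_set_inside {url} {
    set failed [catch {set token [fake_geturl $url]} msg]
    if {$failed} { return [list error $msg] }
    return [list ok $token]
}

# Style 2: let catch store the result directly, and test catch's return code.
proc style_result_var {url} {
    if {[catch {fake_geturl $url} token]} {
        return [list error $token]
    }
    return [list ok $token]
}

puts [style_set_inside "http://example.invalid/"]
puts [style_result_var "http://example.invalid/"]
puts [style_result_var ""]
```

Both styles report the same outcome for the same input. The second tests the return code of catch directly, so it does not need to pattern-match the result string to decide whether a token was created.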

Now, to the issue:
TCP CLOSE_WAIT state means that the remote end has closed the connection, but your application (the http package) has not yet done so.
Be advised that ::http::cleanup will not try to close any sockets; it will merely release the memory allocated by the token itself.
The timeout-routine should close the local socket by itself, and you say you've seen this being logged (I suppose by enabling the ::http::Log proc). This would suggest that there's something in eggdrop that breaks the closing of sockets from within tcl.

It would be interesting to see the output of "file channels" once the issue arises, and also of "info vars ::http::*" and "array get ::http::1234" (replace 1234 with the actual token). This would require that you do not call ::http::cleanup in the event of a timeout.
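One possible way to gather that output in a single step is sketched below. The proc name http_diag is an assumption made up for this example. It prints with puts so that it also runs in a plain tclsh; inside eggdrop, putlog could be used instead.

```tcl
# Hypothetical helper: collect the requested diagnostics as a list of lines.
proc http_diag {} {
    # Every channel currently open in this interpreter (sockets included).
    set lines [list "channels: [file channels]"]
    # Every variable in the ::http namespace; token state lives in arrays
    # named ::http::1, ::http::2, and so on.
    foreach var [lsort [info vars ::http::*]] {
        if {[array exists $var]} {
            lappend lines "$var = [array get $var]"
        } elseif {[info exists $var]} {
            lappend lines "$var = [set $var]"
        }
    }
    return $lines
}

foreach line [http_diag] {
    puts $line
}
```

Because ::http::cleanup unsets the token's state array, the dump is only useful if it runs before cleanup is called for the timed-out request.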
NML_375