
Why is it slow when I use HTTP::geturl?

Taiyaki
Voice
Posts: 15
Joined: Mon Nov 18, 2002 4:39 am

Why is it slow when I use HTTP::geturl?

Post by Taiyaki »

I think I'd better explain my problem :D

I use ::http::geturl to grab lines from a remote text file and then display them in a NOTICE.

Code:

set mydata $nick
# non-blocking fetch: mycallback is invoked when the download completes
set token [::http::geturl "http://www3.telus.net/dekan/newrlsd.txt" -command [list mycallback $mydata]]

<snip>

proc mycallback {mydata token} {
    set data [::http::data $token]
    foreach z [split $data \n] {
        putquick "NOTICE $mydata :$z"
    }
}
This method works, but I'm wondering why there is such a delay (a 10-20 second delay) between when I type the trigger and when the NOTICE arrives. And how do I get rid of this lag? :-?
ppslim
Revered One
Posts: 3914
Joined: Sun Sep 23, 2001 8:00 pm
Location: Liverpool, England

Post by ppslim »

In reality, we need to see what is under the snip to make a better judgment.

There are many factors that can affect the length of time it takes to display the output.

File size: if the file you are downloading is large, you have to wait until the machine has downloaded the whole file. Seeing as you output line by line, this may not be the cause, but it is valid nevertheless.

DNS servers: while the download itself may be fast, the NS lookup that has to occur before connecting may take a long time.

Blocking: the most likely cause.

Your code uses a method that prevents the download operation from blocking.

While preventing blocking is the best approach regardless of file size (though very small files may actually benefit more from a blocking operation), it causes more overhead, both in CPU time and in code processed.

To use non-blocking mode, there has to be a mechanism in place to download and store data between the actual call to download and the output.

The http package provides these functions for you; however, they are not transparent. They require a certain amount of background processing to work, which is done using the Tcl event loop.

In Tcl, the event loop can be entered using a couple of commands, vwait and update; it can also be done from C.
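For illustration, here is a minimal standalone sketch of how vwait drives an asynchronous ::http::geturl to completion (the URL and the ::finished variable are placeholders, not part of the script above):

Code:

package require http

# callback invoked from the event loop once the transfer completes
proc done {token} {
    set ::finished 1
}

set token [::http::geturl "http://example.com/" -command done]
vwait ::finished    ;# spin the event loop until the callback fires
puts [::http::data $token]
::http::cleanup $token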

These commands process incoming and outgoing buffered data on channels that have been declared to do so using fileevent, as well as any other background processing such as after timers.

In eggdrop's main loop, the Tcl_DoOneEvent function is called so that these events can be processed.

From what I remember, eggdrop only issues this call once per second, which results in a partial block. I.e., a socket buffer will fill, Tcl_DoOneEvent will process the buffer, and the socket will then have to wait until the next time Tcl_DoOneEvent is called.

Thus, a blocking HTTP get will process the whole file in one go while sacrificing IRC connectivity for the duration of the block. A non-blocking get will be processed slowly in eggdrop, due to the way Tcl_DoOneEvent is called.
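To make the two modes concrete, here is a sketch of both calls ($url stands in for the address being fetched, and mycallback is the callback from the script above):

Code:

# blocking: geturl returns only once the whole body has arrived;
# eggdrop is frozen (no IRC traffic) for the duration
set token [::http::geturl $url]

# non-blocking: geturl returns immediately; mycallback fires from the
# event loop, which eggdrop services only about once per second
set token [::http::geturl $url -command mycallback]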

Even more long-winded still:

There are methods to get around this in eggdrop: 1) decrease the interval between event-loop calls (this has been suggested before on this forum, though the outcome may be unpredictable); 2) process background tasks yourself more often.

The http package provides the -blocksize option, which lets you set the number of bytes read in each chunk. This defaults to 8192, which should be more than adequate, though you can change it.
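For example (the larger value here is purely illustrative):

Code:

# read the reply in 16 KB chunks instead of the default 8192 bytes, so
# each pass through the event loop drains more of the socket buffer
set token [::http::geturl $url -blocksize 16384 -command mycallback]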
stdragon
Owner
Posts: 959
Joined: Sun Sep 23, 2001 8:00 pm
Contact:

Post by stdragon »

We should make a version of http.tcl that works with eggdrop's built-in network code.
ppslim
Revered One
Posts: 3914
Joined: Sun Sep 23, 2001 8:00 pm
Location: Liverpool, England

Post by ppslim »

The single word "YUK!" comes to mind.
Taiyaki
Voice
Posts: 15
Joined: Mon Nov 18, 2002 4:39 am

Post by Taiyaki »

Hmm, actually there isn't much under <snip>, lol. This is like my first Tcl script :lol:

Let's see here, this is the whole thing:

Code:

package require http

set cmd "!updates"
bind pub -|- $cmd updates

proc updates {nick handle host chan text} {
    set cdate [clock format [clock seconds] -format "%m/%d/%y"]
    putquick "NOTICE $nick :Updates for the day of $cdate"
    set mydata $nick
    set token [::http::geturl "http://www3.telus.net/dekan/newrlsd.txt" -command [list mycallback $mydata]]
}

proc mycallback {mydata token} {
    set data [::http::data $token]
    foreach z [split $data \n] {
        putquick "NOTICE $mydata :$z"
    }
}

I don't really understand all the possible causes that you have listed, but can you please look at my code and see why there is a lag? Or maybe suggest a new method to read the text file (http://www3.telus.net/dekan/newrlsd.txt) faster? Thanks in advance :D
ppslim
Revered One
Posts: 3914
Joined: Sun Sep 23, 2001 8:00 pm
Location: Liverpool, England

Post by ppslim »

The fastest suggestion, and the least network-hungry, is to do periodic updates.

Using either a timer or the time bind, you would store the downloaded data in a variable which is made global.

When the information is requested, your bot can then output it.

You may also want to issue a ::http::cleanup. Failing to do so incurs a memory overhead each and every time the script fetches the file, and this will continue until all memory is used up or the system refuses to allocate any more.
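Applied to the callback posted above, that would look something like this:

Code:

proc mycallback {mydata token} {
    set data [::http::data $token]
    ::http::cleanup $token    ;# free the state array once the body has been copied out
    foreach z [split $data \n] {
        putquick "NOTICE $mydata :$z"
    }
}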
stdragon
Owner
Posts: 959
Joined: Sun Sep 23, 2001 8:00 pm
Contact:

Post by stdragon »

Taiyaki, the reason it is probably slow is that you are using "asynchronous" transfer. Asynchronous transfer is when Tcl transfers information in the background as it becomes available. With synchronous transfer, your script sticks at that one line until the transfer is totally complete. So asynchronous is very good, because it prevents your bot from freezing.

However, since eggdrop also uses asynchronous transfer, there is a problem when using Tcl to do the same thing. Namely, inside eggdrop, Tcl will only check whether information is ready about once every second. Normally it checks almost continuously, so it's very fast.

What this could mean is it's only downloading a few lines of the webpage per second, which makes it very slow.

Unfortunately, there isn't much you can do about it. One solution is to execute a program like wget in the background; it will run separately from eggdrop and thus be able to download at full speed.
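As a rough sketch of that approach (the proc names, file path handling, and timer wiring are my own illustration, not tested code):

Code:

# kick off wget in a separate process; the trailing "&" makes exec return
# immediately, so eggdrop never blocks while the download runs at full speed
proc fetch:wget {url file} {
    exec wget -q -O $file $url &
}

# later (e.g. from a timer or the pub trigger), read the cached copy
proc read:cache {file} {
    set fd [open $file r]
    set data [read $fd]
    close $fd
    return [split $data \n]
}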
Taiyaki
Voice
Posts: 15
Joined: Mon Nov 18, 2002 4:39 am

Post by Taiyaki »

Hmm... well, if this method I'm using is going to be ineffective (the text file shown in the script is only a sample... the actual file is pretty big), is there another way I can achieve this effect?

Like, how can I grab info from a remote website (it doesn't have to be a txt file because I can parse the output) and display it in a NOTICE efficiently?
Is this even possible with eggdrop/Tcl? :cry:
ppslim
Revered One
Posts: 3914
Joined: Sun Sep 23, 2001 8:00 pm
Location: Liverpool, England

Post by ppslim »

There are several methods of doing this, including the system you currently use.

Yes, the sample file is small, so it only reflects sample output, not the bandwidth and size issues.

Adapt your current code to something like the following.

Code:

package require http

set cmd "!updates"
bind pub -|- $cmd updates
set updatedata {}

proc updates {nick handle host chan text} {
    global updatedata
    set cdate [clock format [clock seconds] -format "%m/%d/%y"]
    putquick "NOTICE $nick :Updates for the day of $cdate"
    foreach data $updatedata {
        puthelp "NOTICE $nick :$data"
    }
}

# refresh the cache every 10 minutes (whenever the minute ends in 0)
bind time - "*0 *" update:timer
proc update:timer {args} {
    set token [::http::geturl "http://www3.telus.net/dekan/newrlsd.txt" -command [list mycallback]]
}
update:timer

proc mycallback {token} {
    global updatedata
    set updatedatadata [split [::http::data $token] \n]
    ::http::cleanup $token
}
The above will output data whose contents are no more than ten minutes old.

A bind is set up to call a proc every 10 minutes. At that time, it downloads a copy of the data using the non-blocking system you have set up.

Every time a client requests data, it replies with the cached information rather than a freshly downloaded copy.

This also prevents DoS attacks to a certain extent, as it saves making a download for every request.
Taiyaki
Voice
Posts: 15
Joined: Mon Nov 18, 2002 4:39 am

Post by Taiyaki »

Thanks for the fast reply :D

So the *.txt file will be copied to my HD every 10 minutes, and the output will be read from that file, right?
Taiyaki
Voice
Posts: 15
Joined: Mon Nov 18, 2002 4:39 am

Post by Taiyaki »

Hmm... I'm trying that code, but it doesn't seem to work...

I edited a little bit of it:

Code:

package require http

# set url of the update file
set url "http://www3.telus.net/dekan/newrlsd.txt"

# set current date
set cdate [clock format [clock seconds] -format "%m/%d/%y"]

set cmd "!updates"
bind pub -|- $cmd updates

set updatedata {}
proc updates {nick handle host chan text} {
    global updatedata url cdate
    putquick "NOTICE $nick :Updates for the day of $cdate"
    foreach data $updatedata {
        puthelp "NOTICE $nick :$data"
    }
}

bind time - "*0 *" update:timer
proc update:timer {args} {
    global updatedata url cdate
    set token [::http::geturl $url -command [list mycallback]]
}

update:timer

proc mycallback {token} {
    global updatedata url cdate
    set updatedatadata [split [::http::data $token] \n]
    ::http::cleanup $token
}
Hmmm, can anyone help? Actually, I don't really understand what is going on =/ This is my first script... it's probably too much for me, haha :-?
ppslim
Revered One
Posts: 3914
Joined: Sun Sep 23, 2001 8:00 pm
Location: Liverpool, England

Post by ppslim »

This line is the cause:

Code:

set updatedatadata [split [::http::data $token] \n]
It should read:

Code:

set updatedata [split [::http::data $token] \n]
No, it doesn't store it to the HD. It works by reading the data into memory; every 10 minutes it replaces the stored information with a fresh copy.
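Putting the fix in context, the corrected callback caches the lines in memory rather than on disk (a sketch; names are as in the script above):

Code:

proc mycallback {token} {
    global updatedata
    set updatedata [split [::http::data $token] \n]    ;# refresh the in-memory cache
    ::http::cleanup $token
}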