running randomquote.tcl on a log file of 80 mb sort of sends my computer to a bad place. i don't really know tcl, nor any programming at all, so i'm mostly wondering about ways to speed up getting a random quote. would doing something like 'wc -l filename', and then having it somehow grab a random range of, say, 100 lines with 'head' and 'tail' be a good approach to investigate, or is that completely the wrong way to think about this issue? below is the script i'm currently using--it comes from mel.sf.net, though i changed it to use 'split' instead of regsub (or at least i think i did).
thanks for any ideas.
If the log file contains only quotes, a good approach is to:
1. determine the size of the file,
2. move the read pointer to a random offset between 0 and the file size,
3. read the line at that position (the offset will usually land mid-line, so discard the partial line and take the next complete one).
Tcl has a command for each of the three steps: file size, seek, and gets.
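A minimal sketch of those three steps in plain Tcl, in case it helps. The proc name random_line is made up for illustration, and it uses expr's rand() so it runs outside eggdrop; inside a bot, eggdrop's own rand command could be used for the offset instead.

```tcl
# Hypothetical helper: return one random line from a text file
# without reading the whole file into memory.
proc random_line {path} {
    set size [file size $path]
    if {$size == 0} { return "" }
    set fh [open $path r]
    # Step 2: jump to a random byte offset in the range 0 .. size-1.
    seek $fh [expr {int(rand() * $size)}]
    # The offset almost always lands mid-line, so discard the partial line.
    gets $fh
    # Step 3: read the next complete line; if we ran off the end of the
    # file, wrap around and take the first line instead.
    if {[gets $fh line] < 0} {
        seek $fh 0
        gets $fh line
    }
    close $fh
    return $line
}
```

Note that this does not pick lines with exactly equal probability: a line that follows a long line is chosen more often than one that follows a short line. For pulling a random quote out of a chat log that is usually good enough.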
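# log size in bytes, minus one so the random offset stays inside the file
set size [expr {[file size [file join $statslogdir $rq_chan].log] - 1}]
set rq_read [open [file join $statslogdir $rq_chan].log r]
# try up to 20 random positions (rq_lines is assumed to be initialised earlier in the script)
for {set i 0} {$i < 20} {incr i} {
    seek $rq_read [rand $size]
    # discard the partial line we landed in
    gets $rq_read
    set y 0
    # check up to 100 lines from this position
    while {![eof $rq_read] && $y < 100} {
        set rq_data [split [gets $rq_read]]
        if {[string match -nocase $rq_query [string trim [lindex $rq_data 1] <>]]} {
            incr rq_lines
            set rq_userlines($rq_lines) $rq_data
        }
        incr y
    }
    # stop searching once rq_lines has reached 2
    if {$rq_lines >= 2} {break}
}
close $rq_read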
you mean something like that? that should search 100 lines at each of up to 20 random positions without parsing the whole file. depending on the speed of your system, you could try 50 positions and 200 lines, or something like that.
that was pretty much it exactly, De Kus. thank you.
i tried changing it a bit, so it would perform the part you wrote three times. i know i could get the same outcome by changing the number of lines searched and the number of random searches, but i wanted to see if i could make the while loop work. does the following look at least somewhat like a correct way to have the tcl loop through the random search three times?
set rq_lines -1
# trying to set up that while loop
set tries 0
while {$tries < 3} {
    # done according to http://forum.egghelp.org/viewtopic.php?p=56844#56844
    set size [expr {[file size [file join $statslogdir $rq_chan].log] - 1}]
    set rq_read [open [file join $statslogdir $rq_chan].log r]
    for {set i 0} {$i < 40} {incr i} {
        seek $rq_read [rand $size]
        gets $rq_read
        set y 0
        while {![eof $rq_read] && $y < 400} {
            set rq_data [split [gets $rq_read]]
            # match against $rq_query itself; splitting the pattern can add list quoting
            if {[string match -nocase $rq_query [string trim [lindex $rq_data 1] <>]]} {
                incr rq_lines
                set rq_userlines($rq_lines) $rq_data
            }
            incr y
        }
        if {$rq_lines > 1} {break}
    }
    close $rq_read
    unset i y
    # more edits by me in an attempt to get this thing to loop three times
    # while looking for a quote
    if {$rq_lines == -1} {incr tries}
    if {$tries == 3} {putserv "PRIVMSG $channel : no random quote found for $rq_query."}
    # stored indices run from 0 to $rq_lines, so pick among $rq_lines + 1 entries
    if {$rq_lines != -1} {putserv "PRIVMSG $channel : [join $rq_userlines([rand [expr {$rq_lines + 1}]])]"; set tries 3}
}
array unset rq_userlines
return 0
}
}
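For comparison, here is one possible way to structure the same retry logic, written as plain Tcl so it can be tried outside the bot. This is only a sketch: the proc names find_quotes and random_quote and their parameters are made up for illustration, expr's rand() stands in for eggdrop's rand, and it assumes log lines where the second word is the nick in angle brackets (as the code above does).

```tcl
# Hypothetical helper: scan up to $positions random windows of $window lines
# each and return every line whose second word (with <> trimmed) matches $nick.
proc find_quotes {path nick {positions 40} {window 400} {enough 3}} {
    set found {}
    set size [file size $path]
    if {$size == 0} { return $found }
    set fh [open $path r]
    for {set i 0} {$i < $positions} {incr i} {
        seek $fh [expr {int(rand() * $size)}]
        gets $fh ;# discard the partial line at the random offset
        # gets returns -1 at end of file, which ends the window early
        for {set y 0} {$y < $window && [gets $fh line] >= 0} {incr y} {
            set speaker [string trim [lindex [split $line] 1] <>]
            if {[string match -nocase $nick $speaker]} {
                lappend found $line
            }
        }
        if {[llength $found] >= $enough} { break }
    }
    close $fh
    return $found
}

# Try up to three times and return one random match, or "" if none was found.
proc random_quote {path nick} {
    for {set tries 0} {$tries < 3} {incr tries} {
        set found [find_quotes $path $nick]
        if {[llength $found] > 0} {
            return [lindex $found [expr {int(rand() * [llength $found])}]]
        }
    }
    return ""
}
```

Keeping the matches in a list rather than a numbered array means the number of matches is just llength, which avoids having to track a separate counter that starts at -1. Windows can overlap, so the same line may appear in the list more than once.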