As some of you know, I have a Tcl script that turns eggdrop into a trivia quizmaster. I'm rewriting the script and I'm having trouble choosing a method for reading the files that contain the questions. I've tried two methods. One is quite memory hungry, and a lot of people using shell accounts from a commercial provider have limits on how much memory their processes can use. The other is very disk intensive and can be a little slow, depending on the shell box's hardware.
First method:
Each line in each file is read from disk into a Tcl array. Selecting a question is very fast with this method because it is just a lookup in RAM. The problem is that with a large number of questions (60,000 questions = approx. 5 MB) it uses a lot of memory, which often leads to problems because of the limits imposed by the shell provider's admin.
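For reference, a rough sketch of what this first method looks like in plain Tcl. The proc names load_questions and pick_question are made up for illustration and are not taken from the actual script; it also uses expr's rand() rather than eggdrop's own rand command so that it runs in a plain tclsh.

```tcl
# Hypothetical helper: read every line of a question file into an array
# indexed 0..n-1 and return the number of lines read.
proc load_questions {path arrayName} {
    upvar 1 $arrayName q
    set fh [open $path r]
    set n 0
    while {[gets $fh line] >= 0} {
        set q($n) $line
        incr n
    }
    close $fh
    return $n
}

# Hypothetical helper: pick one stored line at random.
# No disk access here, the whole file is already in memory.
proc pick_question {arrayName count} {
    upvar 1 $arrayName q
    set i [expr {int(rand() * $count)}]
    return $q($i)
}
```

Usage would be something like `set total [load_questions questions.txt qs]` once at startup, then `pick_question qs $total` for each question (questions.txt is a placeholder name).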
Second method:
Each line in each file is read from disk once to obtain the total number of questions. Selecting a question is slower with this method because each time a selection needs to be made, the script picks a random number, then reads the file from disk again until it gets to the selected line. This means the questions are not all held in memory, but it is very disk intensive.
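Roughly, the second method comes down to something like the sketch below. Again, count_lines and read_line_at are made-up names for illustration, not the procs from the actual script.

```tcl
# Hypothetical helper: count the lines in a file without storing them.
proc count_lines {path} {
    set fh [open $path r]
    set n 0
    while {[gets $fh line] >= 0} {
        incr n
    }
    close $fh
    return $n
}

# Hypothetical helper: re-read the file from the top until the wanted
# (0-based) line is reached. Returns "" if the index is out of range.
proc read_line_at {path index} {
    set fh [open $path r]
    set i 0
    set result ""
    while {[gets $fh line] >= 0} {
        if {$i == $index} {
            set result $line
            break
        }
        incr i
    }
    close $fh
    return $result
}
```

The count is taken once with `set total [count_lines questions.txt]`, and each selection is then `read_line_at questions.txt [expr {int(rand() * $total)}]`, which on average reads half the file every time (questions.txt is a placeholder name).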
Can anyone suggest a better method of handling this?
I have a game script too and I ran into a similar problem for word scrambles. I wanted to read a random word out of the dictionary file, which contains over 100k entries.
Well, rather than read in the entire file and choose a random number, use "seek" and "tell" to figure out the file size, then seek to a random spot. Read in the partial line, then read in the next full line, and you have your random line.
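Of course, it's not as evenly distributed as reading in every line, because a line that comes right after a long line is more likely to be picked than one that comes after a short line. But it's still random and extremely efficient.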
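A minimal sketch of the seek/tell idea in plain Tcl. The proc name random_line is made up for illustration. The wrap-around to the first line is an assumption added here: without it, landing in the last line leaves nothing to read, and the first line could never be chosen. Offsets are in bytes, so this assumes a single-byte encoding.

```tcl
# Hypothetical helper: return a random line from a file without
# reading the whole file. Returns "" for an empty file.
proc random_line {path} {
    set fh [open $path r]

    # Seek to the end and ask where we are: that is the file size.
    seek $fh 0 end
    set size [tell $fh]
    if {$size == 0} {
        close $fh
        return ""
    }

    # Jump to a random byte offset and throw away the partial line.
    seek $fh [expr {int(rand() * $size)}] start
    gets $fh

    # Read the next full line. If we landed in the last line there is
    # no next line, so wrap around to the first one (assumption).
    if {[gets $fh line] < 0} {
        seek $fh 0 start
        gets $fh line
    }
    close $fh
    return $line
}
```

Each call costs one open, two seeks and at most a couple of line reads, no matter how many lines the file has.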