Save array to file / restore array from file

willyw
Revered One
Posts: 1209
Joined: Thu Jan 15, 2009 12:55 am

Post by willyw »

Hello,

I want to save the data that is held in an array to a file on disk.
After rehash, I want to be able to restore that array, by reading the file.

Searched this forum.
Found some other threads that looked exactly like what I needed.

http://forum.egghelp.org/viewtopic.php? ... ckup+array
see post by GodOfSuicide, Oct 8 .
and
http://forum.egghelp.org/viewtopic.php? ... ckup+array
post by stdragon

I wrote a little script, to experiment with this.
It would write the file.
It would not read back from the file.
Error was:
Tcl error [readfile]: list must have an even number of elements

Examined file on hard disk.
In it were the pairs. Element name/data
However, the first character in the file was a { , and the last character was a } .

Removed them, and it now seems to read back the file, into array.

To get script to write the file, without the leading and trailing curly braces,
changed:
puts $fp list [array get $arr_name]
to
puts $fp [array get $arr_name]

Yet in one of those threads, it seemed like the poster was specifically saying that list command was needed.

On top of this, I've gone and looked up the uplevel command. I don't *think* I need that, so I'm not using it.

I've gotten myself confused, and could use some direction here.

1.) Is that list command in those old threads an error?

2.) Would this be ok? :

Code: Select all

# basic proc to write from array to file
set fp [open ${testfilesdir}${testfile} w]
	puts $fp  [array get $arr_name]
	close $fp

Code: Select all

#basic proc to read from file, to array
set fp [open ${testfilesdir}${testfile} r]
	list array set $arr_name [read -nonewline $fp] 
	close $fp

3.) What is the idea or purpose behind that list command in there anyway? I have the feeling it was for some sort of safety, but I could use some help to understand.

Thanks
arfer
Master
Posts: 436
Joined: Fri Nov 26, 2004 8:45 pm
Location: Manchester, UK

Post by arfer »

The command 'array get' already returns a list consisting of pairs of element names/values, so at first sight the use of an additional list command seems superfluous and causes the data to be saved with the curly braces.
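
As a minimal sketch of that point in plain Tcl: the array name demo and its values are made up for illustration, and the "wrapped" form assumes the original write was effectively [list [array get ...]]. It shows that array get already returns a flat name/value list, while the extra list turns the whole thing into a single element, which array set then refuses.

```tcl
# Hypothetical sample data, used only for this illustration.
array set demo {someword,1 {any random sentence} someword,2 {}}

set plain   [array get demo]          ;# flat list: name value name value
set wrapped [list [array get demo]]   ;# the whole list becomes ONE element

puts [llength $plain]     ;# 4
puts [llength $wrapped]   ;# 1

# Restoring from the wrapped form fails: a single element is not a pair.
set failed [catch {array set copy $wrapped} msg]
puts "failed=$failed msg=$msg"
```

Note that the empty value and the value with spaces both survive in the plain form, because array get quotes each element as needed.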

The only thing I can think the author is doing is trying to preserve the data when reading it back into an array if, for example, one or more of the element values is an empty string. What you have to consider is that data saved to a file is neither an array nor a list because it is then outside of a Tcl script. It is merely a line of text in a file. Similar difficulties would occur if one or more element names contain spaces (definitely not advisable) or if one or more element values contain spaces. In those cases the saved text may not be an even number of 'words' representing name/value pairs and so would not readily read back into an array variable.

If the array element names do not contain spaces (as should normally be the case) and no array element value is an empty string (though only minor modification is needed to deal with this), I would probably choose to save the data to a file as one array element per line. The first word in the line being the array element name and the remainder being the array element value. Something like the following (untested) :-

Code: Select all

proc pReadArray {} {
  global vDataArray
  set id [open whatever.txt r]
  set data [split [read -nonewline $id] \n]
  foreach item $data {
    set name [lindex [split $item] 0]
    set value [join [lrange [split $item] 1 end]]
    set vDataArray($name) $value
  }
  close $id
  return 0
}

proc pWriteArray {} {
  global vDataArray
  set id [open whatever.txt w]
  foreach {name value} [array get vDataArray] {
    puts $id $name $value
  }
  close $id
  return 0
}
You would obviously call pReadArray to read in the data and populate the array vDataArray, and call pWriteArray to save the vDataArray name/value pairs to a text file.

Depending on exactly what the array names/values could contain, you might have to rethink things.

The code above is probably incomplete, even if it is correct for your needs. For example you might wish to test if the file exists before trying to read it etc etc.
I must have had nothing to do
willyw

Post by willyw »

Thanks for replying.
arfer wrote: ...
Similar difficulties would occur if one or more element names contain spaces (definitely not advisable)
That is even possible?? that's interesting.

But, in this case - no. The element names do not ever contain spaces.

or if one or more element value contains spaces.
They do. Random length too. Any value could be a complete sentence, for example.

someword,1 Here could be any random sentence, ... anything at all
someword,2 including any sort of punctuation
someword,3 and so on and so on

In those cases the saved text may not be an even number of 'words' representing name/value pairs and so would not readily read back into an array variable.
Exactly.
If the array element names do not contain spaces (as should normally be the case) and no array element value is an empty string (though only minor modification is needed to deal with this),
I hadn't thought of that possibility... but yes, I suppose it would be possible for a value to be empty.
I would probably choose to save the data to a file as one array element per line. The first word in the line being the array element name and the remainder being the array element value. Something like the following (untested) :-

Code: Select all

proc pReadArray {} {
  global vDataArray
  set id [open whatever.txt r]
  set data [split [read -nonewline $id] \n]
  foreach item $data {
    set name [lindex [split $item] 0]
    set value [join [lrange [split $item] 1 end]]
    set vDataArray($name) $value
  }
  close $id
  return 0
}

proc pWriteArray {} {
  global vDataArray
  set id [open whatever.txt w]
  foreach {name value} [array get vDataArray] {
    puts $id $name $value
  }
  close $id
  return 0
}
You would obviously call pReadArray to read in the data and populate the array vDataArray, and call pWriteArray to save the vDataArray name/value pairs to a text file.

Depending on exactly what the array names/values could contain, you might have to rethink things.

The code above is probably incomplete, even if it is correct for your needs. For example you might wish to test if the file exists before trying to read it etc etc.
Tried it (writing to file) quickly, and it didn't work. It had to do with $value being more than one word, I believe.

Played around with trying to solve that a little bit.
Tried puts $id [list $name $value]
and it produces a nice looking file. value is inside curly braces.
But... reading that file back, and value now is inside double curly braces.

Decided not to fiddle with it more right now. Late, and tired. Waste of time. :) Will tackle it tomorrow, after a night's sleep.

Just wanted to get back to you with this info, in case you might be able to direct me with method, - and commands I'll need to read up on.

Thanks
arfer

Post by arfer »

Sorry, my mistake. I should have grouped together the arguments in the 'puts' command by placing them inside quotes.

Code: Select all

proc pWriteArray {} {
  global vDataArray
  set id [open whatever.txt w]
  foreach {name value} [array get vDataArray] {
    puts $id "$name $value"
  }
  close $id
  return 0
}
Don't forget that I said the code is potentially incomplete. Eg. What do you want to happen if pWriteArray is called and vDataArray does not exist? The code is unfinished until you can account for every eventuality.
nml375
Revered One
Posts: 2860
Joined: Fri Aug 04, 2006 2:09 pm

Post by nml375 »

Just to pitch in;
The list command used in the first linked thread is not used to "protect" the output from array get. In fact, the list command will never see that output, instead it is used to avoid "double evaluation" when generating script code on the fly.

The most trivial form would be something as simple as this. In fact, it's a very stripped down version of the script posted by user in that thread, with no checks for the existence or type of the variable/array:

Code: Select all

proc saveArray {varname file} {
 upvar 1 $varname v

 set fp [open $file w]
 puts $fp [list array set $varname [array get v]]
 close $fp
}
This would then be restored by simply loading the generated file as a script (source thefile).
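
A minimal sketch of that round trip, assuming the array lives at global level and that the current directory is writable. The array name greetings and the file name savearray_demo.tcl are made up for this example.

```tcl
proc saveArray {varname file} {
    upvar 1 $varname v
    set fp [open $file w]
    # list quotes the data so it is taken literally when the file is sourced.
    puts $fp [list array set $varname [array get v]]
    close $fp
}

# Hypothetical data containing characters that would normally be substituted.
array set greetings {nick,1 {Hello [die] $world} nick,2 {}}
set datafile [file join [pwd] savearray_demo.tcl]
saveArray greetings $datafile

# Simulate a rehash: forget the array, then load the file as a script.
unset greetings
source $datafile

puts $greetings(nick,1)
file delete $datafile
```

Because the data file is executed as a script, this approach is only reasonable when nobody else can write to that file.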

Edit: Forgot to include the variable name in array set... fixed.
Edit: used $v instead of v... fixed.
Last edited by nml375 on Fri Oct 30, 2009 1:29 pm, edited 2 times in total.
NML_375
willyw

Post by willyw »

arfer wrote:Sorry, my mistake.
Not at all. Mine. I meant to mention ....
I should have grouped together the arguments in the 'puts' command by placing them inside quotes.
... that I'd tried that.
Didn't work.
I remember trying puts $id $name "$value"
and I would have sworn I then tried puts $id "$name $value" , just like yours below.
And it still didn't work. ... I thought.

But, I went ahead and did it again, today.

One brief trial today, and it seems to work fine.
Apparently I was half asleep....

Code: Select all

proc pWriteArray {} {
  global vDataArray
  set id [open whatever.txt w]
  foreach {name value} [array get vDataArray] {
    puts $id "$name $value"
  }
  close $id
  return 0
}
Don't forget that I said the code is potentially incomplete. Eg. What do you want to happen if pWriteArray is called and vDataArray does not exist? The code is unfinished until you can account for every eventuality.
Understand your concern/warning. Thanks. :)

Let me get back to building with what you've given me, and see what I can do. I needed this basic method to get me over this hump.

Still not sure if the style that I started with would have been ok to use, in the way I described that I'd gotten it working.
But no matter now.
I like the way the file looks, much better - with your method.
The pairs are each on their own line, not just one long line.
Not that I plan to ever want to examine the file with my own eyes, but if I ever should, it is certainly much nicer, when using your method to write it.



Thank you
nml375

Post by nml375 »

Also, may I suggest an improved version of arfer's approach:

The changes in the code above should guarantee data remaining intact even if they should contain newlines or other "awkward" characters...

Edit: doing something wrong in my logic... will get back with working code...
willyw

Post by willyw »

Thanks for replying.
nml375 wrote:Just to pitch in;
The list command used in the first linked thread is not used to "protect" the output from array get. In fact, the list command will never see that output, instead it is used to avoid "double evaluation" when generating script code on the fly.
Sorry, I don't understand what double evaluation means.

The most trivial form would be something as simple as this. In fact, it's a very stripped down version of the script posted by user in that thread, with no checks for the existence or type of the variable/array:

Code: Select all

proc saveArray {varname file} {
 upvar 1 $varname v

 set fp [open $file w]
 puts $fp [list array set [array get $v]]
 close $fp
}
This would then be restored by simply loading the generated file as a script (source thefile).
I played with this. Just to write out a file... did not try reading the file back.

I see what happens. Interesting. :)
But wouldn't one have to [list array set array_name [array get $v]]
though? I'm saying this, after looking at what was written to the file.

No offense, but I think I want to continue with the other way... reading a file, not sourcing it. Only because I started down that path.
I've no idea if one method is better than the other.

All this stuff is very interesting to learn, and I'm glad you posted that. I sort of glossed over it, when I'd read user's original posts.

Thanks
willyw

Post by willyw »

nml375 wrote:Also, may I suggest an improved version of arfer's approach:

The changes in the code above should guarantee data remaining intact even if they should contain newlines or other "awkward" characters...

Edit: doing something wrong in my logic... will get back with working code...
You have my attention. :)
I'll be checking back often.


Thank you.
nml375

Post by nml375 »

Indeed, fixed that flaw just a minute ago.

Double evaluation is the condition when a piece of code is passed to the interpreter twice. This results in variable and command substitutions being done twice. One classic example is the utimer command:

Code: Select all

proc join_greet {nick host hand chan} {
  utimer 10 "puthelp \"PRIVMSG $chan :Hello $nick\""
}
Assume the above code is linked to a join-binding; once a user joins the channel, the code is evaluated and a 10 second timer is started. The command for the timer would be this:

Code: Select all

puthelp "PRIVMSG #thechannel :Hello Thenick"
Which looks pretty safe... However, if someone uses the nick I[die] when joining, that would cause a command substitution (due to the []), executing the command "die", thus killing your eggdrop.
The proper way to do a script like that is like this:

Code: Select all

proc join_greet {nick host hand chan} {
 utimer 10 [list puthelp "PRIVMSG $chan :Hello $nick"]
}
Lists preserve their content through a further round of command and variable substitution, and they also prevent "data bleeding" (a string with spaces being treated as two or more arguments).
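
The same effect can be reproduced outside eggdrop. The following is only a sketch: eval stands in for the timer firing later, and fake_puthelp and danger are hypothetical stand-ins for puthelp and die.

```tcl
# Stand-in for puthelp: just records the text it was asked to send.
proc fake_puthelp {text} {
    set ::sent $text
}
# Stand-in for a dangerous command such as die.
proc danger {} {
    set ::danger_ran 1
    return ""
}

set nick {I[danger]}
set chan #thechannel

# Unsafe: substituted once while building the string, again when evaluated.
set danger_ran 0
set unsafe "fake_puthelp \"PRIVMSG $chan :Hello $nick\""
eval $unsafe
set unsafe_ran $danger_ran
set unsafe_sent $sent

# Safe: list quotes the already-substituted text, so it is taken literally.
set danger_ran 0
set safe [list fake_puthelp "PRIVMSG $chan :Hello $nick"]
eval $safe
set safe_ran $danger_ran
set safe_sent $sent

puts "unsafe: danger ran=$unsafe_ran, sent=$unsafe_sent"
puts "safe:   danger ran=$safe_ran, sent=$safe_sent"
```

In the unsafe case the bracketed part of the nick is executed as a command; in the safe case it arrives as plain text.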

There are many other occasions where double evaluation comes into play, in my code, the second evaluation would be when you load the datafile as a script.

One major concern with arfer's current code is that things will break badly if any array index name or value contains newlines, or if an index name contains spaces. This is an unfortunate side effect of using split.
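
A small sketch of how that goes wrong, using the same split/lindex/lrange/join steps as the read proc earlier in the thread. The record contents are invented for the example.

```tcl
# Hypothetical record in the "name value" per-line format,
# where the value itself contains a newline.
set value "two  spaces and\na second line"
set line "someword,1 $value"

# Splitting the file content on newlines cuts this one record in two.
set lines [split $line \n]
puts [llength $lines]   ;# 2

# Only the first half comes back as the value.
set first [lindex $lines 0]
set name  [lindex [split $first] 0]
set back  [join [lrange [split $first] 1 end]]
puts [string equal $back $value]   ;# 0

# An index name with a space is cut at the first space.
set line2 "some word,1 hello"
set name2 [lindex [split $line2] 0]
puts $name2   ;# some
```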
arfer

Post by arfer »

Thanks for the input nml375, I might just review some of my old scripts on this basis.

I should have read the files given in the links rather than simply trying to interpret the statement given by willyw in his original post :-

Code: Select all

puts $fp list [array get $arr_name] 
On reflection, I'm surprised the above statement worked at all. I would have thought an error would result.
nml375

Post by nml375 »

The best approach imho would be to simply keep the list returned from array get untouched, and to restore it in a similar fashion with array set. Whenever you use split, you run the risk of data being mangled unless you take precautions to make sure that the "split-char" is not part of the stored data entities.

A trivial, working, save/restore would look as follows:

Code: Select all

proc saveArray {varname file} {
 upvar 1 $varname v

 set fp [open $file w]
 puts -nonewline $fp [array get v]
 close $fp
}

proc restoreArray {varname file} {
 upvar 1 $varname v

 set data ""
 set fp [open $file r]
 while {![eof $fp]} {
  append data [read $fp]
 }
 close $fp
 array set v $data
}
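
A quick round-trip check of this pair of procs might look like the sketch below. The procs are repeated so the example runs on its own; the array names original and copy, the sample values and the file name array_roundtrip.dat are made up, and the current directory is assumed to be writable.

```tcl
proc saveArray {varname file} {
    upvar 1 $varname v
    set fp [open $file w]
    puts -nonewline $fp [array get v]
    close $fp
}

proc restoreArray {varname file} {
    upvar 1 $varname v
    set data ""
    set fp [open $file r]
    while {![eof $fp]} {
        append data [read $fp]
    }
    close $fp
    array set v $data
}

# Hypothetical data with a space in a name, a newline, an empty value,
# and characters that Tcl normally treats specially.
set original(some\ word,1) "line one\nline two"
set original(empty) ""
set original(tricky) "a \{ b \$c \[d\]"

set datafile [file join [pwd] array_roundtrip.dat]
saveArray original $datafile
restoreArray copy $datafile
file delete $datafile

# Compare every element of the restored array with the original.
set same [expr {[array size copy] == [array size original]}]
foreach name [array names original] {
    if {![info exists copy($name)] || $copy($name) ne $original($name)} {
        set same 0
    }
}
puts "round trip intact: $same"
```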
arfer

Post by arfer »

Thanks nml375. I have an additional question. Suppose I want to have my cake AND eat it. That is to say, I want to guard against spaces, special characters or even non-printable characters in the array element names/values, BUT I also want the neatness/readability of one element (record, if you like) per line in the file.

Would you please check out the following and advise :-

Code: Select all

proc saveArray {varname filename} {
    upvar 1 $varname v
    set fp [open $filename w]
    foreach {name value} [array get v] {
        puts $fp [list $name $value]
    }
    close $fp
    return 0
}

proc restoreArray {varname filename} {
    upvar 1 $varname v
    set data ""
    set fp [open $filename r]
    while {![eof $fp]} {
        set d [gets $fp]
        if {[string length $d] != 0} {
            # Keep a separator so adjacent records do not fuse into one word.
            append data $d \n
        }
    }
    close $fp
    array set v $data
    return 0
}
nml375

Post by nml375 »

Well, I'm not sure why you have this slow code reading line by line... Also, your code will drop any empty line, possibly corrupting/mangling data (or index names) containing empty lines within the data.

The problem with the idea of one record per line is really that one record may span more than one line. That is, unless you are willing to generate proper escape sequences throughout the data, and the appropriate substitutions when restoring the array, along with all the hassle and risk of introducing unexpected variable/command substitutions into the code...
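
One possible middle ground, offered only as a sketch: write each pair with list on its own line, but restore by reading the whole file and handing it to array set as a single list. This relies on newlines being ordinary element separators in a Tcl list. A record whose value contains a newline still spans several lines in the file, as described above, but it is read back intact. The proc names saveArrayLines and restoreArrayWhole, the sample data and the file name are all made up here.

```tcl
proc saveArrayLines {varname filename} {
    upvar 1 $varname v
    set fp [open $filename w]
    # One "name value" pair per puts; list adds quoting where needed.
    foreach name [lsort [array names v]] {
        puts $fp [list $name $v($name)]
    }
    close $fp
}

proc restoreArrayWhole {varname filename} {
    upvar 1 $varname v
    set fp [open $filename r]
    set data [read $fp]
    close $fp
    # The file content as a whole is one valid name/value list.
    array set v $data
}

# Hypothetical sample data.
set records(some\ word) "line one\n\nline three"
set records(empty) ""
set records(plain) "just a sentence, with punctuation!"

set datafile [file join [pwd] array_lines.dat]
saveArrayLines records $datafile
restoreArrayWhole restored $datafile
file delete $datafile

set intact [expr {[array size restored] == [array size records]}]
foreach name [array names records] {
    if {![info exists restored($name)] || $restored($name) ne $records($name)} {
        set intact 0
    }
}
puts "intact: $intact"
```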
willyw

Post by willyw »

I have been unable to spend serious time on my little project, and maybe that is a blessing in disguise. :) It might be best for me to wait, see where this very interesting conversation leads, and then try to read and grasp it.

I don't want to derail it, but I do have a couple of side note type questions:

regarding: upvar 1 $varname v
is the use of upvar necessary? I looked it up, can see that it allows you to pass into the proc the value from that variable that is not within that proc.
Why not simply:
global varname
in this proc? If it was global elsewhere, would that not let it work?
or... hmm.. maybe with upvar, you are allowing for the original var to NOT be global?... perhaps that is the point?
This is a little above me, I'll be interested in learning, from whatever you reply with. Tnx.
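
A minimal sketch of the difference, with made-up proc and array names: upvar links a local alias to whatever variable the caller named, global or not, whereas global only ever reaches variables at global level.

```tcl
proc sizeViaUpvar {varname} {
    upvar 1 $varname v      ;# v is an alias for the caller's variable
    return [array size v]
}

proc sizeViaGlobal {varname} {
    global $varname         ;# links to a GLOBAL variable of that name
    return [array size $varname]
}

# Hypothetical caller whose array is local to the proc, not global.
proc caller {} {
    array set localdata {a 1 b 2}
    return [list [sizeViaUpvar localdata] [sizeViaGlobal localdata]]
}

array set globaldata {x 1 y 2 z 3}
puts [sizeViaUpvar globaldata]    ;# 3
puts [caller]                     ;# 2 0
```

With a global array both styles see the data; with an array local to the calling proc, only the upvar version does.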

Other question
Regarding " or if the indexname contains spaces. "
Is this possible? I must not be understanding.... this is what it sounds like you are saying:
varname(some word,1 "string here")
varname(some word,2 "string here")
and so on....

instead of
varname(someword,1 "string here")
varname(someword,2 "string here")
and so on....
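
For what it is worth, a small sketch suggesting that index names with spaces are possible in Tcl, as long as the name reaches set as a single word. The array name and values below are only examples.

```tcl
# Brace the whole variable name, or keep the index in a variable.
set {varname(some word,1)} "string here"
set idx "some word,2"
set varname($idx) "another string"

puts [lsort [array names varname]]
puts $varname($idx)

# Written without quoting, the space splits the name into two words,
# so set receives too many arguments and raises an error.
set failed [catch {set varname(some word,3) "string here"} msg]
puts "failed=$failed"
```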

Thanks