Have socket read until backspace instead of newline

Post by **Ofloo** » Thu Nov 12, 2009 2:09 pm

I'm writing a script which is supposed to work with EPP, but for some reason each EPP reply ends with \b\0\0\b, so when the last </epp> tag appears the final line is dropped... or so it seems, because it shows up in the next reply. I've converted every char to hex and noticed that 0x08 appeared. So how would one solve this?

I have tried fconfigure sock -eofchar {\x08}

Since the replies are XML, I suppose it is actually better to make a read end on \b instead of \n; that way the whole XML document can be read at once.

EDIT: I've also noticed I end up with a broken pipe, probably because EOF is then triggered. So I was wondering how I can make it either ignore the \b or ignore the \n, so I can read a single char at a time and append it to a buffer or something.

Post by **nml375** » Thu Nov 12, 2009 3:56 pm

I'm not very familiar with EPP, but from what I can read from the protocol specs, especially EPP over TCP, the first 32 bits (or 4 bytes) of the transaction is an integer specifying the length of the whole transaction.
Could it be that you are actually picking up the "header" (if you like) of the next message?

Since you did not post your script, I can't say exactly what you are doing right/wrong - but it would seem you are using 'gets' to read data? Gets will read one line of data, regardless of your eof-characters.
Personally, I'd use read to get the first 4 bytes, convert it to a long integer, subtract 4 (to account for the 4 bytes we already read), and read the remaining number of bytes (again using read).

Post by **Ofloo** » Thu Nov 12, 2009 7:52 pm

So you would use read instead of gets?

Code:

proc epp_socket_callback {epp_socket}  {
  if {[string equal {} [fconfigure $epp_socket -error]]} {
    fconfigure $epp_socket -buffering line
    fconfigure $epp_socket -blocking 0
    while {![eof $epp_socket]} {
     if {[gets $epp_socket x_epp] > -1} {
...

It could be the header like you said, but the thing is, it happens with every query: you send a query in XML format and you get the reply back in XML. However, from what I understand read will block the socket, and I can't have that since I'm using multiple sockets in a single thread.

Post by **nml375** » Thu Nov 12, 2009 8:20 pm

As I said, I'm not that familiar with EPP, that was just the first thought that came to mind when I skimmed through the protocol specs.. Thinking though, \b\0\0\b would evaluate to 134,217,736. Not sure if a transaction would be that large?

Yes, I would probably use read whenever I have a set of data that's not necessarily terminated or structured around newlines. On a non-blocking channel, read should not block any more than gets would, although some caution is advisable when using read.
Telling read to read n characters will actually cause it to return at most n characters (one important thing here is characters, not bytes, as Tcl supports multibyte character sets).
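To illustrate the characters-versus-bytes point, here is a minimal sketch. UTF-8 is only an assumed example encoding; the commented-out fconfigure line shows how one would typically put the $epp_socket from this thread into binary mode, so that read counts bytes.

```tcl
# "é" is a single character, but takes two bytes when encoded as UTF-8
set text "caf\u00e9"
set bytes [encoding convertto utf-8 $text]

puts "characters: [string length $text]"   ;# 4
puts "bytes:      [string length $bytes]"  ;# 5

# For a length-prefixed binary protocol, binary mode makes one character
# equal one byte (it also disables newline translation and the EOF char):
# fconfigure $epp_socket -translation binary
```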

A good example (which is also found in the http tcl package):
When making an HTTP request, I'd start by using gets, as each header is expected to be on a single line (with some special conditions for multi-line data), and each header is terminated by a newline. The header block is terminated by the very first empty line.
At this point, I would then switch to read, mainly for performance, but also in the case that the last line of the document body lacks a newline (causing gets to return an empty string, leaving the last line in the buffer).

Actually, this could also answer the issues you are having. It's not that the \b\0\0\b erase the last </epp> line, but that the line is not newline-terminated - leading to gets NOT fetching the line, but waiting for a complete newline (remember, when using sockets, EOF will not occur until all data is read AND the socket connection has been closed).

Post by **Ofloo** » Thu Nov 12, 2009 8:33 pm

The weird thing is the next one is \x00\x00\x01\xce and the first one is \x00\x00\x08\xd2, then the next one is \x00\x00\x03\xb5. The strange thing is that every command seems to have its own weird chars.

EDIT

So basically I can read bytes until I see <epp> open, append them to a buffer, and end that buffer when I see </epp>, using read instead of gets.

Post by **nml375** » Thu Nov 12, 2009 8:45 pm

But they're all 4 bytes long? And generally start with \x00\x00? Then this would indeed be the length header; the first two bytes are zero because the lengths fit in the lower two bytes:
\x00\x00\x01\xCE => 462
\x00\x00\x03\xB5 => 949
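As a quick check of those numbers, here is a small sketch that decodes the three headers posted above with binary scan. The proc name epp_total_length is made up for illustration; it assumes the header is a 32-bit big-endian total length that includes the 4 header bytes.

```tcl
# Hypothetical helper: decode a 4-byte header into the total message length.
proc epp_total_length {hdr} {
    # "I" reads a 32-bit integer in big-endian (network) byte order
    binary scan $hdr I total
    return $total
}

foreach hdr [list "\x00\x00\x08\xd2" "\x00\x00\x01\xce" "\x00\x00\x03\xb5"] {
    set total [epp_total_length $hdr]
    puts "total=$total body=[expr {$total - 4}]"
}
# total=2258 body=2254
# total=462 body=458
# total=949 body=945
```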

I'd look at using something like the code below. A word of caution though: you should double-check that read actually returned $len characters (to make sure you got the whole transaction).

Code:

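# Read the 4-byte length header (the channel should be in binary mode)
set data [read $epp_socket 4]
scan $data "%c%c%c%c" a b c d
# Big-endian total length, minus the 4 header bytes already read
set len [expr {($a<<24) + ($b<<16) + ($c<<8) + $d - 4}]
set body [read $epp_socket $len]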
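Since the sockets here are non-blocking and a read may return fewer bytes than asked for, one possible approach is to buffer whatever arrives and only hand out complete messages. This is a rough sketch under those assumptions, not a tested EPP client: the names epp_extract_frames, epp_on_readable and epp_buf are made up for illustration, and the socket is assumed to be in binary, non-blocking mode.

```tcl
# Hypothetical helper: pull every complete frame out of a byte buffer.
# bufVar names the variable holding the bytes received so far; a trailing
# partial frame is left in that variable for the next call.
proc epp_extract_frames {bufVar} {
    upvar 1 $bufVar buf
    set frames {}
    while {[string length $buf] >= 4} {
        # 32-bit big-endian total length, header included
        binary scan $buf I total
        if {$total < 4} { error "invalid length header: $total" }
        if {[string length $buf] < $total} break
        lappend frames [string range $buf 4 [expr {$total - 1}]]
        set buf [string range $buf $total end]
    }
    return $frames
}

# Hypothetical readable-event handler: one buffer per socket.
proc epp_on_readable {sock} {
    global epp_buf
    append epp_buf($sock) [read $sock]
    foreach xml [epp_extract_frames epp_buf($sock)] {
        puts "got [string length $xml] bytes of XML"
    }
    if {[eof $sock]} {
        close $sock
        unset -nocomplain epp_buf($sock)
    }
}

# Assumed wiring for each socket:
# fconfigure $sock -blocking 0 -translation binary
# fileevent $sock readable [list epp_on_readable $sock]
```

Because the handler only runs when data is available and never waits for the rest of a message, several sockets can share one thread through the event loop.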

Post by **Ofloo** » Thu Nov 12, 2009 8:55 pm

Very observant of you.
EDIT:
This did the trick, although the snippet as posted was missing a $ in front of c (in case someone else wants to use it). You did a very nice job; I know I wouldn't have ever figured this out, so thank you very much.

This code is little/big endian dependent, right? I mean, someone once told me that if you use shifts you have to remember that it matters whether you're on a little- or big-endian system, since both would return a different result. Not sure there would be a way around that. I used to convert IPs to long IPs the same way.
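On the endianness question, a small sketch may help. Shifts operate on integer values rather than on the bytes in memory, so the shift-based decode should give the same result on either kind of host; the byte order that matters is the one on the wire, which the code fixes by treating the first byte as the most significant. The proc names below are made up for illustration.

```tcl
# Decode 4 bytes as a big-endian integer using shifts only.
proc be32_by_shift {hdr} {
    binary scan $hdr c4 bytes
    set n 0
    foreach b $bytes {
        # "c" yields signed 8-bit values, so mask each back to 0..255
        set n [expr {($n << 8) | ($b & 0xff)}]
    }
    return $n
}

# Same decode with binary scan: "I" is always big-endian, whatever the
# host is ("i" would be the little-endian counterpart).
proc be32_by_scan {hdr} {
    binary scan $hdr I n
    return [expr {$n & 0xffffffff}]
}

set hdr [binary format I 949]
puts [be32_by_shift $hdr]        ;# 949
puts [be32_by_scan $hdr]         ;# 949
puts $tcl_platform(byteOrder)    ;# host byte order; does not affect the above
```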