The overall problem back then was Fedora FC9. I have since DUMPED it like a bad habit and gone back to FreeBSD, and all those goofy problems went away. Well, except for one, and it has to do with FC9 as well: my ISP uses Fedora FC9 throughout its network (ughh!) and I still have to connect through their FC9 sockets.
Now I have to contend with "black hole sockets" on the immediate network for incoming connections OUTSIDE of the box itself. This will take some explaining...
It seems that my ISP engages in the practice of "accepting connections that don't go anywhere", and yes, Tcl has a COW over it! (It's believed they do this to "break" torrents and port scanners, at which it is VERY effective!)
The problem? An INDEFINITE "hang" on an established socket connection, even though the machine it connects to is NON-EXISTENT!
The fix: "if said socket doesn't complete/close within 'x' time, KILL it!"
I have tried -async and event timers, etc., but to no avail. The problem is that Tcl sees a socket, but it's a fake one generated by my ISP, and this fake socket is meant to confuse and HANG the connection indefinitely!
Here's a piece of some "test" code I've been working on that addresses this problem specifically:
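# Start the 1-second polling loop once, when the script is first loaded.
if {![info exists loop]} {
    utimer 1 [list RunLoop1]
    set loop 1
}

# Run one probe, then re-arm the timer for the next second.
proc RunLoop1 {} {
    RacInfo
    utimer 1 [list RunLoop1]
    return 1
}

# Try to open (and immediately close) a socket to the RAC machine.
# Note: this is a blocking [socket] call.
proc RacInfo {} {
    global RacIP RacPort
    if {[catch {set RacSock [socket $RacIP $RacPort]} sockerror]} {
        putlog {Socket BAD!}
        return 0
    } else {
        putlog {Socket GOOD!}
        close $RacSock
    }
}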

What happens here is kinda hard to explain, so I will describe the "testing situations" and what they result in, based on the code above.
At FIRST turn-on/loading of the bot, with the local machine it connects TO (called "RAC") switched OFF:
-- Socket BAD -- reports immediately (as it SHOULD)
Then RAC is started...
-- Socket GOOD -- reports immediately (again, as it should)
Then I UNPLUG RAC from the net (to simulate a broken connection):
NOTHING reports and the bot hangs INDEFINITELY, -OR- I'll receive a "Timer Spun" error, sometimes followed by a CRASH of eggdrop.
If the bot doesn't crash after a "Timer Spun" error, it's completely NON-responsive and has to be restarted anyway!
What causes this is that the ISP will send a "null" socket in RAC's place. This socket appears "valid" at set-up but, once established, becomes a BLACK HOLE, and Tcl/eggdrop will either stall and then give "Timer Spun", or HANG FOREVER trying to (re)establish. It's NOT Tcl's or eggdrop's fault, it's just my ISP playing DIRTY!
The solution, I believe, is to only allow the socket to remain open for 500 ms, then force it closed, regardless of whether it got data or not. (I have determined that, when working properly, 200 ms is MORE than enough time to open, gets, flush and close a properly working socket in this situation.)
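To make the "force it closed after 500 ms" idea concrete, here is a rough sketch of what I have in mind, NOT a tested fix. It assumes the bot's Tcl event loop is being serviced (so that after and fileevent actually fire) and that RacIP is a plain IP address, so no name lookup can block. The proc names ProbeSocket and ProbeDone are made up for illustration, and the putlog fallback is only there so the snippet also runs in a plain tclsh.

```tcl
# Hypothetical helpers (ProbeSocket / ProbeDone); not part of eggdrop or Tcl.
# Fall back to puts when run outside eggdrop, where putlog does not exist.
if {[llength [info commands putlog]] == 0} {
    proc putlog {msg} { puts $msg }
}

# Start a non-blocking connect and arm a watchdog timer.
# callback is a command prefix; it is later called with one extra
# argument: ok, fail or timeout.
proc ProbeSocket {host port timeoutMs callback} {
    if {[catch {socket -async $host $port} sock]} {
        # Immediate failure: report it from the event loop, not from here.
        after 0 [linsert $callback end fail]
        return
    }
    fconfigure $sock -blocking 0
    # Watchdog: whatever state the socket is in after timeoutMs, kill it.
    set timer [after $timeoutMs [list ProbeDone $sock "" timeout $callback]]
    # An async socket becomes writable once the connect succeeds or fails.
    fileevent $sock writable [list ProbeDone $sock $timer connected $callback]
}

# Finish a probe: cancel the watchdog, work out the result, close the
# socket (which also removes its fileevent handler), then report.
proc ProbeDone {sock timer why callback} {
    if {$timer ne ""} { after cancel $timer }
    if {$why eq "connected"} {
        # -error is empty when the connect actually succeeded.
        set err [fconfigure $sock -error]
        set why [expr {$err eq "" ? "ok" : "fail"}]
    }
    catch {close $sock}
    uplevel #0 [linsert $callback end $why]
}

# Example callback that mirrors the log lines of the test code.
proc RacResult {why} {
    if {$why eq "ok"} {
        putlog {Socket GOOD!}
    } else {
        putlog "Socket BAD! ($why)"
    }
}
```

In the test script, RacInfo would then shrink to something like ProbeSocket $RacIP $RacPort 500 RacResult. Since nothing in there blocks, the 1-second loop should keep ticking even when the far end is a black hole. For a real exchange (open, gets, flush, close), the same watchdog could presumably stay armed until the reply has been read, instead of being cancelled as soon as the connect completes.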
I tried many different things, but had no real success in finding something that works reliably and/or efficiently. (The code above is the BASE code, after I cleaned up all the "mess" I'd made in it.)
Any help/suggestions would be appreciated!
-DjZ-

