transacid wrote:apparently there is something wrong with the parser. (besides the wrong usage but this shouldn't happen at all IMHO)
It's bound to happen eventually; it's simply the nature of web scraping. HTML changes with the wind. In a perfect (read: free API) world we wouldn't need to screen/web scrape a thing, but the world we live in is far from that...

another thing:
transacid wrote:
Code:
[22:27:15] ( transacid) !w plenk
[22:27:16] ( NeXuS) Wikimedia Error: Unable to parse for: plenk @
http://en.wikipedia.org/wiki/index.php?title=Special%3ASearch&search=plenk&fulltext=Search
Note: This is actually a bug within wikipedia/wikimedia sites themselves, not anything lacking in the script. The problem has to do with problematic squid caches returning outdated, incorrect replies.
See here for more info.
This occurs because that page just so happens to be gzipped. The script expects text data to flow, not binary compressed data. It's no longer an issue though, because I've now incorporated support for the zlib package within the script. It will seamlessly decompress those pesky gzipped wikipedia/wikimedia pages.
http://wiki.tcl.tk/4610 - this url explains most of what is now required. And by required, I mean: if you don't need gzip for your wiki pleasures, then by all means it's not required. But if you experience any issue like the above, you're going to need it.
Windrop users without tcl8.6 handy (which includes myself) can simply use the url below to download a precompiled binary package; copying the entire folder to their windrop/lib/tcl folder gets you going.
http://pascal.scheffers.net/software/zl ... -win32.zip
Eggdrop users, on actual eggdrops running flavors of *nix, can simply rely on the Trf package to work the gzip magic. This lets newer versions of Tcl (zlib) work as well as the older ones (Trf).
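For the curious, the capability check inside the script boils down to something like the sketch below. This is a simplified illustration of the idea, not the script's exact code; the proc name gzip_support is mine.
Code:
# figure out which gzip-capable package, if any, the bot has available.
# tcl 8.6 ships zlib built in; older installs may have the Trf extension,
# and windrops can load the precompiled zlib binary linked above.
proc gzip_support {} {
    if {[llength [info commands ::zlib]] || ![catch {package require zlib}]} {
        return "zlib"   ;# native zlib command (tcl 8.6) or the zlib extension
    }
    if {![catch {package require Trf}]} {
        return "trf"    ;# older bots: Trf handles the decompression instead
    }
    return ""           ;# neither present: gzip support simply stays off
}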
Now for the new additions to the script:
Code:
# set this to the proxy you wish the main html fetching procedure to
# use. set both proxy_host and proxy_port. if you don't wish to use
# a proxy, set proxy_host ""
# --
variable proxy_host ""
variable proxy_port ""
# set this to the switch you want to use for youtube to filter content
# and display only high-definition videos.
# ------
variable youtube_highdef "@hd"
Those previously getting 503 Google "sorry" messages will now be much happier: there is now a proper proxy system present within the script. You can also now search YouTube and filter for high-definition videos only. This switch is customizable as well and can be placed anywhere (except before the .region switch).
!y @hd .de whatever - Bad, the high-def switch is placed before the region.
!y .de whatever @hd / !y .fr @hd whatever - both 100% acceptable.
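For those wondering what the proxy settings actually do: they simply get handed to Tcl's http package before each page fetch. A rough sketch of the idea follows; the namespace and proc names here are mine, not the script's, and only proxy_host/proxy_port come from the config above.
Code:
package require http

namespace eval example {
    variable proxy_host ""
    variable proxy_port ""

    # fetch a page, going through the configured proxy when one is set.
    # an empty proxy_host disables the proxy entirely.
    proc fetch {url} {
        variable proxy_host
        variable proxy_port
        if {$proxy_host ne ""} {
            ::http::config -proxyhost $proxy_host -proxyport $proxy_port
        } else {
            ::http::config -proxyhost {} -proxyport {}
        }
        set token [::http::geturl $url -timeout 10000]
        set html  [::http::data $token]
        ::http::cleanup $token
        return $html
    }
}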
Edit: While on the subject of gzipped webpages, why not incorporate this bandwidth-saving feature into the script itself? This has now been done. If you have the zlib or Trf package handy on your bot, the script will now use it to save bandwidth on every query it makes. In my tests this seems to speed up page loading considerably. Neither zlib nor Trf is a requirement; the script will still function without them. But those who have them will notice better response times.
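The bandwidth trick itself is nothing exotic: advertise gzip in the request headers, then inflate whatever comes back. Here is a minimal tcl 8.6-only sketch of the idea (the real script also handles the Trf route and error cases I'm glossing over; the proc name is mine):
Code:
package require http

# ask the server for gzip, then inflate the body if the reply really is
# gzip-compressed (checked via the \x1f\x8b magic bytes, in case the server
# or the http package already delivered plain text).
proc fetch_gzipped {url} {
    set token [::http::geturl $url \
                   -headers [list Accept-Encoding gzip] \
                   -binary 1 -timeout 10000]
    set body [::http::data $token]
    ::http::cleanup $token
    if {[string range $body 0 1] eq "\x1f\x8b"} {
        set body [zlib gunzip $body]   ;# tcl 8.6 built-in; Trf on older bots
    }
    return $body
}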

Download:
Incith:Google v1.9.9r
As you can see above, using webby to illustrate: simply gzipping the query arrives at 9576 bytes, while the entire html would've been 23390 bytes (roughly a 59% saving). This equates to quite a savings over time. Enjoy, and most important: have fun ;P