The way the script crafts its URL query string differs from what you get by simply browsing to
www.google.com. The query string currently in use should always reach the matching regional server.
Code:
set query "http://www.google.${country}/search?q=${input}&safe=${incith::google::safe_search}&btnG=Search&lr=lang_${incith::google::language}&num=10"
You can tell it's working correctly from the differing results and the discrepancy in totals. Try testing Google dynamically with
!google .anything <search terms>, replacing anything with each TLD Google supports.
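To see what the query looks like once the variables are filled in, here is a minimal sketch. The build_query helper and the "off" / "en" settings are stand-ins made up for illustration; only the URL template itself comes from the script.

```tcl
# Hypothetical stand-ins for the script's settings; the real values
# are configured in the incith::google namespace.
namespace eval incith::google {
  variable safe_search "off"
  variable language "en"
}

# build_query is an illustrative helper, not a proc from the script.
# It fills the same URL template with a given TLD and search input.
proc build_query {country input} {
  return "http://www.google.${country}/search?q=${input}&safe=${incith::google::safe_search}&btnG=Search&lr=lang_${incith::google::language}&num=10"
}

# Same search terms, two different regional servers.
puts [build_query "co.uk" "eggdrop+tcl"]
puts [build_query "de" "eggdrop+tcl"]
```

Only the host part changes between the two calls, which is why each TLD can come back with its own results and totals.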
...and before I forget: if you set search_results to anything over 10, you may also get no results. The maximum Google allows per page is 10, and the bot is hardcoded to always fetch that number. Pick anything 10 or less and you have no worries.
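If you would rather have the limit enforced than remembered, something along these lines would do it. This is only a sketch: clamp_results is a hypothetical helper, not part of the script, and falling back to 1 for bad input is my own assumption.

```tcl
# clamp_results is a hypothetical helper, not a proc from the script.
# It keeps a requested result count inside the 1..max range.
proc clamp_results {requested {max 10}} {
  # Assumption: anything that is not a positive integer falls back to 1.
  if {![string is integer -strict $requested] || $requested < 1} {
    return 1
  }
  # Anything above the per-page maximum is cut down to it.
  if {$requested > $max} {
    return $max
  }
  return $requested
}

puts [clamp_results 25]
puts [clamp_results 4]
```

The first call prints 10 and the second prints 4.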
Edit: [Mar 18th] 1.9.8b link updated: added preliminary HTML transcoding. This makes it child's play to strip HTML characteristics from text, bringing the output even closer to browser-like behavior.
Code:
# Description Decode
# convert html codes into characters - credit perplexa (urban dictionary)
#
proc descdecode {text} {
# code below is necessary to prevent numerous html markups
# from appearing in the output (ie, &quot;, &#5671;, etc)
# stolen (borrowed is a better term) from perplexa's urban
# dictionary script..
set escapes {
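&nbsp; \x20 &quot; \x22 &amp; \x26 &apos; \x27 &ndash; \x2D
&lt; \x3C &gt; \x3E &tilde; \x7E &euro; \x80 &iexcl; \xA1
&cent; \xA2 &pound; \xA3 &curren; \xA4 &yen; \xA5 &brvbar; \xA6
&sect; \xA7 &uml; \xA8 &copy; \xA9 &ordf; \xAA &laquo; \xAB
&not; \xAC &shy; \xAD &reg; \xAE &hibar; \xAF &deg; \xB0
&plusmn; \xB1 &sup2; \xB2 &sup3; \xB3 &acute; \xB4 &micro; \xB5
&para; \xB6 &middot; \xB7 &cedil; \xB8 &sup1; \xB9 &ordm; \xBA
&raquo; \xBB &frac14; \xBC &frac12; \xBD &frac34; \xBE &iquest; \xBF
&Agrave; \xC0 &Aacute; \xC1 &Acirc; \xC2 &Atilde; \xC3 &Auml; \xC4
&Aring; \xC5 &AElig; \xC6 &Ccedil; \xC7 &Egrave; \xC8 &Eacute; \xC9
&Ecirc; \xCA &Euml; \xCB &Igrave; \xCC &Iacute; \xCD &Icirc; \xCE
&Iuml; \xCF &ETH; \xD0 &Ntilde; \xD1 &Ograve; \xD2 &Oacute; \xD3
&Ocirc; \xD4 &Otilde; \xD5 &Ouml; \xD6 &times; \xD7 &Oslash; \xD8
&Ugrave; \xD9 &Uacute; \xDA &Ucirc; \xDB &Uuml; \xDC &Yacute; \xDD
&THORN; \xDE &szlig; \xDF &agrave; \xE0 &aacute; \xE1 &acirc; \xE2
&atilde; \xE3 &auml; \xE4 &aring; \xE5 &aelig; \xE6 &ccedil; \xE7
&egrave; \xE8 &eacute; \xE9 &ecirc; \xEA &euml; \xEB &igrave; \xEC
&iacute; \xED &icirc; \xEE &iuml; \xEF &eth; \xF0 &ntilde; \xF1
&ograve; \xF2 &oacute; \xF3 &ocirc; \xF4 &otilde; \xF5 &ouml; \xF6
&divide; \xF7 &oslash; \xF8 &ugrave; \xF9 &uacute; \xFA &ucirc; \xFB
&uuml; \xFC &yacute; \xFD &thorn; \xFE &yuml; \xFF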
};
set text [string map $escapes $text]
# tcl filter required because we are using SUBST command below
# this will escape any sequence which could potentially trigger
# the interpreter..
regsub -all -- \\\\ $text \\\\\\\\ text
regsub -all -- \\\[ $text \\\\\[ text
regsub -all -- \\\] $text \\\\\] text
regsub -all -- \\\} $text \\\\\} text
regsub -all -- \\\{ $text \\\\\{ text
regsub -all -- \\\" $text \\\\\" text
# end tcl filter
regsub -all -- {&#([[:digit:]]{1,5});} $text {[format %c [string trimleft "\1" "0"]]} text
regsub -all -- {&#x([[:xdigit:]]{1,4});} $text {[format %c [scan "\1" %x]]} text
regsub -all -- {&#?[[:alnum:]]{2,7};} $text "?" text
# -novariables keeps a literal $ in the text from being read as a variable
return [subst -novariables $text]
}
Take, for example, a description containing this:
&quot;this text&quot;
With HTML transcoding, it will now look like this:
"this text"
Which is how a real browser would display it. Credit for this addition goes entirely to perplexa; most of it is ripped (borrowed) from his urban dictionary script. I modified some of it so characters no longer have to be stripped into nulls, which means all characters can be displayed. Preliminary support for this has been added to translation, wikipedia and wikimedia. The entire script will support it shortly; this is just an update for those who follow the thread closely.
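For anyone curious how the escape-then-subst trick works in isolation, here is a cut-down sketch. It is not the script's proc: decode_numeric is a name made up for this example, it handles only decimal references like &#34;, and it escapes $ as well, which is my own addition.

```tcl
# decode_numeric is an illustrative name, not a proc from the script.
# It decodes decimal character references only.
proc decode_numeric {text} {
  # Escape the characters subst acts on (backslash, brackets, dollar),
  # so that nothing already in the text is run as Tcl.
  set text [string map {\\ \\\\ [ \\[ ] \\] \$ \\\$} $text]
  # Rewrite each decimal reference as a [format %c N] command;
  # scan with %d reads the digits as decimal even with leading zeros.
  regsub -all -- {&#([[:digit:]]{1,5});} $text {[format %c [scan "\1" %d]]} text
  # subst now runs only the commands inserted on the line above.
  return [subst $text]
}

puts [decode_numeric {&#34;this text&#34;}]
```

This prints "this text", quotes included. The escaping pass has to come before the regsub, otherwise the brackets the regsub inserts would be escaped along with everything else.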
Get it at any v1.9.8b link above and enjoy.