Code:
proc CharsetToEncoding {charset} {
Code:
proc http::CharsetToEncoding {charset} {
x0x wrote:Found a bug as well
http://screenrant.com/24-season-9-fox
<title> Fox to Bring Back ’24′ for Season 9</title>
IRC: ~ Fox to Bring Back 242 for Season 9
Patching your eggdrop for utf-8 support solves this without requiring any changes. For those with unpatched eggdrops/windrops, though, it is indeed annoying. It isn't durby's fault; the bug is inherited from "webby", unfortunately. You've discovered a legacy problem that has always annoyed me: entities (the ones around ’24′ in that title, for example) are not transcoded correctly into their proper encoding when the string around them is already encoded. I've taken some time to rewrite the problem procedure, and hopefully it now works for both unpatched and patched bots.
Code:
proc webbydescdecode {text char} {
# code below is necessary to prevent numerous html markups
# from appearing in the output (ie, &quot;, &#5671;, etc)
# stolen (borrowed is a better term) from perplexa's urban
# dictionary script..
if {![string match *&* $text]} {return $text}
if {[string match "*;*" $char]} {set char [string trim $char {;}] }
set escapes {
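&nbsp; \xa0 &iexcl; \xa1 &cent; \xa2 &pound; \xa3 &curren; \xa4
&yen; \xa5 &brvbar; \xa6 &sect; \xa7 &uml; \xa8 &copy; \xa9
&ordf; \xaa &laquo; \xab &not; \xac &shy; \xad &reg; \xae
&macr; \xaf &deg; \xb0 &plusmn; \xb1 &sup2; \xb2 &sup3; \xb3
&acute; \xb4 &micro; \xb5 &para; \xb6 &middot; \xb7 &cedil; \xb8
&sup1; \xb9 &ordm; \xba &raquo; \xbb &frac14; \xbc &frac12; \xbd
&frac34; \xbe &iquest; \xbf &Agrave; \xc0 &Aacute; \xc1 &Acirc; \xc2
&Atilde; \xc3 &Auml; \xc4 &Aring; \xc5 &AElig; \xc6 &Ccedil; \xc7
&Egrave; \xc8 &Eacute; \xc9 &Ecirc; \xca &Euml; \xcb &Igrave; \xcc
&Iacute; \xcd &Icirc; \xce &Iuml; \xcf &ETH; \xd0 &Ntilde; \xd1
&Ograve; \xd2 &Oacute; \xd3 &Ocirc; \xd4 &Otilde; \xd5 &Ouml; \xd6
&times; \xd7 &Oslash; \xd8 &Ugrave; \xd9 &Uacute; \xda &Ucirc; \xdb
&Uuml; \xdc &Yacute; \xdd &THORN; \xde &szlig; \xdf &agrave; \xe0
&aacute; \xe1 &acirc; \xe2 &atilde; \xe3 &auml; \xe4 &aring; \xe5
&aelig; \xe6 &ccedil; \xe7 &egrave; \xe8 &eacute; \xe9 &ecirc; \xea
&euml; \xeb &igrave; \xec &iacute; \xed &icirc; \xee &iuml; \xef
&eth; \xf0 &ntilde; \xf1 &ograve; \xf2 &oacute; \xf3 &ocirc; \xf4
&otilde; \xf5 &ouml; \xf6 &divide; \xf7 &oslash; \xf8 &ugrave; \xf9
&uacute; \xfa &ucirc; \xfb &uuml; \xfc &yacute; \xfd &thorn; \xfe
&yuml; \xff &fnof; \u192 &Alpha; \u391 &Beta; \u392 &Gamma; \u393 &Delta; \u394
&Epsilon; \u395 &Zeta; \u396 &Eta; \u397 &Theta; \u398 &Iota; \u399
&Kappa; \u39A &Lambda; \u39B &Mu; \u39C &Nu; \u39D &Xi; \u39E
&Omicron; \u39F &Pi; \u3A0 &Rho; \u3A1 &Sigma; \u3A3 &Tau; \u3A4
&Upsilon; \u3A5 &Phi; \u3A6 &Chi; \u3A7 &Psi; \u3A8 &Omega; \u3A9
&alpha; \u3B1 &beta; \u3B2 &gamma; \u3B3 &delta; \u3B4 &epsilon; \u3B5
&zeta; \u3B6 &eta; \u3B7 &theta; \u3B8 &iota; \u3B9 &kappa; \u3BA
&lambda; \u3BB &mu; \u3BC &nu; \u3BD &xi; \u3BE &omicron; \u3BF
&pi; \u3C0 &rho; \u3C1 &sigmaf; \u3C2 &sigma; \u3C3 &tau; \u3C4
&upsilon; \u3C5 &phi; \u3C6 &chi; \u3C7 &psi; \u3C8 &omega; \u3C9
&thetasym; \u3D1 &upsih; \u3D2 &piv; \u3D6 &bull; \u2022
&hellip; \u2026 &prime; \u2032 &Prime; \u2033 &oline; \u203E
&frasl; \u2044 &weierp; \u2118 &image; \u2111 &real; \u211C
&trade; \u2122 &alefsym; \u2135 &larr; \u2190 &uarr; \u2191
&rarr; \u2192 &darr; \u2193 &harr; \u2194 &crarr; \u21B5
&lArr; \u21D0 &uArr; \u21D1 &rArr; \u21D2 &dArr; \u21D3 &hArr; \u21D4
&forall; \u2200 &part; \u2202 &exist; \u2203 &empty; \u2205
&nabla; \u2207 &isin; \u2208 &notin; \u2209 &ni; \u220B &prod; \u220F
&sum; \u2211 &minus; \u2212 &lowast; \u2217 &radic; \u221A
&prop; \u221D &infin; \u221E &ang; \u2220 &and; \u2227 &or; \u2228
&cap; \u2229 &cup; \u222A &int; \u222B &there4; \u2234 &sim; \u223C
&cong; \u2245 &asymp; \u2248 &ne; \u2260 &equiv; \u2261 &le; \u2264
&ge; \u2265 &sub; \u2282 &sup; \u2283 &nsub; \u2284 &sube; \u2286
&supe; \u2287 &oplus; \u2295 &otimes; \u2297 &perp; \u22A5
&sdot; \u22C5 &lceil; \u2308 &rceil; \u2309 &lfloor; \u230A
&rfloor; \u230B &lang; \u2329 &rang; \u232A &loz; \u25CA
&spades; \u2660 &clubs; \u2663 &hearts; \u2665 &diams; \u2666
&quot; \x22 &amp; \x26 &lt; \x3C &gt; \x3E &OElig; \u152 &oelig; \u153
&Scaron; \u160 &scaron; \u161 &Yuml; \u178 &circ; \u2C6
&tilde; \u2DC &ensp; \u2002 &emsp; \u2003 &thinsp; \u2009
&zwnj; \u200C &zwj; \u200D &lrm; \u200E &rlm; \u200F &ndash; \u2013
&mdash; \u2014 &lsquo; \u2018 &rsquo; \u2019 &sbquo; \u201A
&ldquo; \u201C &rdquo; \u201D &bdquo; \u201E &dagger; \u2020
&Dagger; \u2021 &permil; \u2030 &lsaquo; \u2039 &rsaquo; \u203A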
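&euro; \u20AC &apos; \u0027 &lrm; "" &rlm; "" &#8236; "" &#8237; ""
&#8238; ""
};
# the keys of the five empty mappings above did not survive in this post;
# they are assumed to be the directional-mark entities, which get stripped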
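# escape the characters subst would otherwise act on: brackets, dollar signs, backslashes
set text [string map [list "\]" "\\\]" "\[" "\\\[" "\$" "\\\$" "\\" "\\\\"] [string map $escapes $text]]
# rewrite decimal (&#NNNN;) and hex (&#xHHHH;) entities as commands for subst to evaluate
regsub -all -- {&#([[:digit:]]{1,5});} $text {[encoding convertto $char [format %c [string trimleft "\1" "0"]]]} text
regsub -all -- {&#x([[:xdigit:]]{1,4});} $text {[encoding convertto $char [format %c [scan "\1" %x]]]} text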
return [subst "$text"]
}
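To show the regsub/subst trick on its own, here is a minimal sketch that handles only numeric entities. The proc name decodeNumericEntities is made up for this example and is not part of webby or durby; it also leaves out the encoding convertto step and the named-entity table, so it only illustrates the mechanism.

```tcl
# Hypothetical helper, not part of the script: decode &#NNNN; and &#xHHHH; only.
proc decodeNumericEntities {text} {
    # Nothing to do if there is no entity marker at all.
    if {![string match *&* $text]} { return $text }
    # Escape everything subst would act on, so page text cannot run commands.
    set text [string map [list "\\" "\\\\" "\[" "\\\[" "\]" "\\\]" "\$" "\\\$"] $text]
    # Rewrite each entity as a [format %c ...] command; scan copes with leading zeros.
    regsub -all -- {&#([[:digit:]]{1,5});} $text {[format %c [scan "\1" %d]]} text
    regsub -all -- {&#x([[:xdigit:]]{1,4});} $text {[format %c [scan "\1" %x]]} text
    # subst evaluates the generated commands and undoes the escaping.
    return [subst $text]
}

puts [decodeNumericEntities "Fox to Bring Back &#8217;24&#8242; for Season 9"]
```

Escaping before the regsub matters: without it, a page title containing brackets or a dollar sign would be evaluated by subst as Tcl.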
As you can see, to view gizmodo you MUST use gzip. For some sites, you must use TLS/https.
<speechles> !webby http://gizmodo.com
<sp33chy> Gizmodo - The Gadget Guide ( http://is.gd/zMhtyP )( 200; text/html; utf-8; 68504 bytes (gzip); 222576 bytes )
<sp33chy> The Gadget Guide
For your bot to fully support the future, you want to get both zlib and tls supported.
Webby: Found zlib package. Fast lane activated!
Webby: https supported: tls package found.
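If you want to check by hand whether your Tcl can do both, something like this rough sketch should tell you. It is not the detection code from the script; the variable names hasTls and hasZlib are made up here, and the zlib check assumes either the built-in zlib command (Tcl 8.6) or a loadable zlib package on older builds.

```tcl
package require http

# 1 if the tls package loads, 0 otherwise.
set hasTls [expr {![catch {package require tls}]}]
if {$hasTls} {
    # Route https:// URLs through TLS sockets.
    ::http::register https 443 ::tls::socket
}

# Tcl 8.6 has a built-in zlib command; older builds may offer a zlib package instead.
set hasZlib [expr {[llength [info commands ::zlib]] > 0 || ![catch {package require zlib}]}]

puts "https supported: $hasTls, gzip supported: $hasZlib"
```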
And here's what it looks like within the request... full headers..
## The request should be sent back gzip, it isn't..
<speechles> !webby http://gizmodo.com --gz
<sp33chy> Gizmodo - The Gadget Guide ( http://tinyurl.com/suee2 )( 200; text/html; utf-8; 221863 bytes )
<sp33chy> The Gadget Guide
## The request should be sent back plain-text, it isn't...
<speechles> !webby http://gizmodo.com
<sp33chy> Gizmodo - The Gadget Guide ( http://is.gd/zMhtyP )( 200; text/html; utf-8; 68820 bytes (gzip); 222580 bytes )
<sp33chy> The Gadget Guide
Here is the problem: X-Served-By=cache-s29-SJC2
<speechles> !webby http://gizmodo.com --header --xheader --html
<sp33chy> Gizmodo - The Gadget Guide ( http://is.gd/zMhtyP )( 200; text/html; utf-8; 68558 bytes (gzip); 222584 bytes )
<sp33chy> ntCoent-Length=221869; Via=1.1 varnish; P3P=CP="IDC DSP COR CURa ADMa OUR IND PHY ONL COM STA"; Date=Sun, 12 May 2013 01:53:56 GMT; Content-Type=text/html; charset=utf-8; Content-Length=45713; Content-Encoding=gzip; Connection=close; Cache-Control=public, max-age=100; Age=59; Accept-Ranges=bytes
<sp33chy> x-cdn-view=mantle-root; X-Timer=S1368323636.443116426,VS0,VE0; X-Served-By=cache-s29-SJC2; X-Cache-Hits=38; X-Cache=HIT
<sp33chy> Metas: viewport=width=device-width,initial-scale=1.0;
<sp33chy> Metas: kinja:meta=%7B%22hackerspaceCssMd5%22%3A%227a40dde694f259a7a27031f5a3e7a718%22%2C%22spaceCssMd5%22%3A%225705ea850434beb072d9aef9746d0a5d%22%2C%22frontCssMd5%22%3A%22e4388a3f525f1b3674e732196d7681a6%22%2C%22templatesEnUsJsMd5%22%3A%2243c22c0836fbb5caf2b5157bacefdae5%22%2C%22io9CssMd5%22%3A%222018a4600ca16abc1ab2eb739d42a1c4%22%2C%22tinymceCssMd5%22%3A%22d646c23ed4acc4aa1cf446a2d8aef8fa%22%2C%22natgeo80sCssMd5%2
<sp33chy> Metas: kinja:mode=live; kinja:page-type=frontpage; ROBOTS=INDEX, FOLLOW; og:title=Gizmodo - The Gadget Guide; og:type=blog; og:image=http://gawker.com/assets/images/logos/t ... 00x200.png; twitter:card=summary; twitter:site=@gizmodo; og:description=The Gadget Guide; description=The Gadget Guide; og:locale=en_US; og:site_name=Gizmodo; fb:app_id=44615671688;
<sp33chy> Metas: google-site-verification=FUkM9gDOR_WvsjOMuGnUUhhYv5zvRaQHCMmNeRHvvhQ
hille wrote:Is it possible to add the .uf trigger in the ignorelist?
Yes. Let's make it a proper ignore list first, though.
Code:
if {[string match -nocase "!webby*" $word] || [string match -nocase "!durby*" $word]} {
return 0
} elseif {[string match -nocase "*://*" $word] || [string match -nocase "www.*" $word]} {
Code:
set ignores [list "!webby" "!durby" ".uf"] ; foreach ignore $ignores { if {[string match -nocase $ignore* $word]} { return 0 } }
if {[string match -nocase "*://*" $word] || [string match -nocase "www.*" $word]} {
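The same ignore check can be pulled out into a small proc if you'd rather not keep it on one line. This is only a sketch; the proc name isIgnoredTrigger is made up for the example and does not exist in the script.

```tcl
# Hypothetical helper: returns 1 if $word starts with any trigger in $ignores.
proc isIgnoredTrigger {word ignores} {
    foreach ignore $ignores {
        # Case-insensitive prefix test, same as the one-liner above.
        if {[string match -nocase "${ignore}*" $word]} { return 1 }
    }
    return 0
}

set ignores [list "!webby" "!durby" ".uf"]
puts [isIgnoredTrigger ".uf" $ignores]
puts [isIgnoredTrigger "http://gizmodo.com" $ignores]
```

Keep in mind that string match treats *, ?, [ and \ in a trigger as glob characters, so a trigger containing any of those would need escaping first.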