
UNOFFICIAL incith-google 2.1x (Nov 30, 2012)

Support & discussion of released scripts, and announcements of new releases.
User avatar
speechles
Revered One
Posts: 1398
Joined: Sat Aug 26, 2006 10:19 pm
Location: emerald triangle, california (coastal redwoods)

Post by speechles »

spithash wrote: OK, adding an extra } fixed it, but it's not working at all now.

Code: Select all

      # parse the html
      while {$results < $incith::google::youtube_results} {
        # somewhat extenuated regexp due to allowing that there might be an image next to the title
        if {[regexp -nocase {<span class="video-time">(.*?)</span.*?href="/watch\?v=(.+?)".+?title=".+?">(.+?)</a>.*?id="video\-description.*?>(.*?)</p.*?class="date\-added">(.+?)</span.*?class="viewcount">(.+?)</span} $html - ded4 cid desc ded ded2 ded3]} {
          if {[string match "*</span>*" $desc]} {
            regexp -nocase {<span class="video-time">(.*?)</span.*?href="/watch\?v=(.+?)".+?title="(.*?)">.*?id="video\-description.*?">(.*?)</p.*?class="date\-added">(.+?)</span.*?class="viewcount">(.+?)</span} $html - ded4 cid desc ded ded2 ded3
          }
          regsub -nocase {<span class="img">.*?</div>      </div>} $html "" html
        }
      }
my script looks like this ^

But after I do a !youtube search, it just shows/does nothing :?: :!:

EDIT: the bot ping-timeouts and never comes back after searching.
Buyer beware: you can't guess at how to fix it. Your bot is endlessly looping in that while, forever... It is harder than one thinks to alter a script and have it function correctly, isn't it? Yes. In this case it is...

Why? Because you've merely changed the scrape, not the scrub as well. That isn't an inline regexp you see. That is your plain-jane, ordinary, regular one that will continue to match. There is a corresponding scrubber (in this case, the regsub below) that goes hand in hand with this type of scraping method. If the regsub cannot scrub away the text the regexp just matched, the regexp will keep matching the exact same part of the page. Forever. I didn't make it this way; it was made this way originally by incith. Here is how you should alter that regsub to fix the scrub and that nasty endless looping. Change the regsub below:

Code: Select all

regsub -nocase {<span class="img">.*?</div>      </div>} $html "" html
"<span class="img>" and "</div> </div>" used to encapsulate each item. It no longer does, this will also need correcting. Hopefully this weekend I'll have a correct fix for this soon, until then try changing that above regsub.. to this:

Code: Select all

regsub -nocase {<span class="video-time">.*?</span.*?href="/watch\?v=.+?".+?title=".+?">.+?</a>.*?id="video\-description.*?>.*?</p.*?class="date\-added">.+?</span.*?class="viewcount">.+?</span} $html "" html
This is a complex scrubber that wastes clock cycles, but it will have to do until I get around to fixing it properly. See if this works.
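To see why the scrub matters, here is a minimal, self-contained sketch of the scrape/scrub pattern (toy HTML and variable names, not youtube's markup or the incith-google code): the regexp scrapes one item, then a regsub with the same pattern must remove that item from the page text, or the while loop can never advance.

```tcl
# Minimal sketch of the scrape/scrub loop pattern (illustrative only).
set html {<li>alpha</li><li>beta</li><li>gamma</li>}
set items {}
while {[regexp {<li>(.*?)</li>} $html -> item]} {
    lappend items $item
    # Scrub with the SAME pattern that scraped; if this regsub matched
    # nothing, the next regexp would re-match "alpha" forever.
    regsub {<li>.*?</li>} $html "" html
}
puts $items
```

If the regsub's pattern drifts out of sync with the regexp's (as happened here when only the scrape was updated), the scrub removes nothing and the loop spins forever, which is exactly the ping-timeout behavior described above.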
neocratic
Voice
Posts: 15
Joined: Sun May 16, 2010 11:59 am

Re: About !g time country

Post by neocratic »

speechles wrote: There will be shortly. It's just that this script needs a serious slice of continuous time; it can't be short bursts of 15 minutes here and there. This weekend I will have that time to eliminate some of the problems that have resurfaced over time: google time, wikipedia, wikimedia, youtube, etc. These all have issues in one way or another. While fixing these I will likely find even more issues and correct those along the way. This is why I tend to let things stack up before releasing a fix: I want to evolve the script forward, correcting long-standing issues (like no bold in results when utf-8 patched), inconsistent encodings, etc. These are the things that in the long run create a better end product. Rushing to fix regex parsing bugs is a short-term fix with no evolution, to me.

Suffice it to say, you don't need to read any of that diatribe above if you don't want to. It's just words. But expect a new version of this script this weekend; it will most assuredly correct the "time" problem you are experiencing. :)
Thanks a lot for the reply, I now understand what you are trying to tell me. I will be waiting for the next version update :)
User avatar
spithash
Master
Posts: 248
Joined: Thu Jul 12, 2007 9:21 am
Location: Libera
Contact:

Post by spithash »

speechles: do you have the youtube fix somewhere uploaded?

I must have done something wrong, I don't know, but I tried what you said in your previous post without any luck =/

I see sp33chy is working great though.. 8)
Libera ##rtlsdr & ##re - Nick: spithash
Click here for troll.tcl
bfoos
Voice
Posts: 6
Joined: Thu Sep 30, 2010 6:17 pm

Post by bfoos »

spithash wrote:speechles: do you have the youtube fix somewhere uploaded?

I must have done something wrong, I don't know, but I tried what you said in your previous post without any luck =/

I see sp33chy is working great though.. 8)
http://forum.egghelp.org/viewtopic.php?p=95867#95867
User avatar
spithash
Master
Posts: 248
Joined: Thu Jul 12, 2007 9:21 am
Location: Libera
Contact:

Post by spithash »

bfoos wrote:!yt was more broken than that. A better temporary solution is to set...

variable youtube_results 0

Then add...

"yt:g:site:youtube.com %search%"

Under Custom Trigger Phrasing.

speechles is due to address this issue amongst others in an upcoming update.
Actually, to be honest, it was my fault for not reading that post..

It worked great after doing so.. :)
Thanks!
User avatar
pogue
Voice
Posts: 28
Joined: Sun May 17, 2009 3:56 am
Contact:

Post by pogue »

I'm seeing some problems with the wikipedia lookup now. I attempted to set up debugging to see if there was any error, but nothing was sent to me.

Here is the query, all queries produce the same result:
[12:58am] <~pogue> !wiki suez canal
[12:58am] <+BodyBuildingBot> Jump to: navigation, search
I am using 2.0.0a

Info on the bot:
I am BodyBuild, running eggdrop v1.6.19: 13 users (mem: 841k).
Online for 1 day, 00:54 (background) - CPU: 00:06 - Cache hit: 4.0%
Admin: Kelso
Config file: bbbot.conf
OS: Linux 2.6.18-194.17.1.el5
Tcl library: /usr/share/tcl8.4
Tcl version: 8.4.13 (header version 8.4.13)
Tcl is threaded.
Here is the full text of the script I'm using (the only alterations are in the options section at the beginning):
http://tcl.pastebin.com/8gd9GE3R

Help would be appreciated!

Thanks,
pogue
Helpful Tools:
  • Notepad++: Windows Text Editor with TCL Syntax Highlighting
  • Pastebin TCL: For easy script collaboration
User avatar
spithash
Master
Posts: 248
Joined: Thu Jul 12, 2007 9:21 am
Location: Libera
Contact:

Post by spithash »

They changed Wikipedia's website; that's why you get this error.
User avatar
spithash
Master
Posts: 248
Joined: Thu Jul 12, 2007 9:21 am
Location: Libera
Contact:

Post by spithash »

OK, speechles fixed the wiki and added the temporary youtube fix in a kinda-working pre-release, so I thought it would be great to share it with you.

Make sure you set the script up yourselves, because I have edited it the way I have it loaded.

NOTE: MY BOT IS UTF-8 PATCHED, SO YOU NEED TO CHANGE THOSE SETTINGS BACK TO DEFAULT (SEE A PREVIOUS RELEASE OR SOMETHING)

Code: Select all

 variable dirty_decode 1

    # enable gzip compression for bandwidth savings? Keep in mind
    # this semi-breaks some of the present utf-8 work-arounds, and
    # eggdrop may mangle encodings when gzip compression is used
    # that it doesn't mangle when uncompressed html is used
    # (default). A setting of 0 means uncompressed html, a 1 or
    # higher means gzip.
    # ------
    # NOTE: If you do not have the Trf or zlib packages, setting
    # this to 0 is recommended. Leaving it at 1 is fine as well, as
    # the script will attempt to find these commands or packages on
    # every rehash or restart. But to keep gzip from ever being
    # used, it is best to set the variable below to 0.
    # NOTE2: If you have the Trf or zlib packages present, then this
    # should always be set to 1. You save enormous bandwidth and
    # time using it. If your bot is patched and you have Trf/zlib,
    # then you should definitely leave this at 1 and you will never
    # suffer issues.
    # ------
    variable use_gzip 0

    # THIS IS TO BE USED TO DEVELOP A BETTER LIST FOR USE BELOW.
    # To work around certain encodings, it is now necessary to give
    # the public a way to troubleshoot some parts of the script on
    # their own. Using these features involves the two settings below.
    # -- DEBUG INFORMATION GOES BELOW --
    # set debug and administrator here
    # this is used for debugging purposes
    # ------
    variable debug 1
    variable debugnick spithashhh

    # AUTOMAGIC
    # with this set to 1, the encode_strings setting at the bottom
    # becomes irrelevant. This makes the script follow the charset
    # encoding the site tells the bot it is using.
    # This DOES NOT affect wiki(media/pedia); those will not encode
    # automatically and still require the encode_strings section below.
    # ------
    # NOTE: If your bot is utf-8 patched, leave this option at 1; the
    # only time to change it to 0 is if you're having rendering problems.
    # ------
    variable automagic 1

    # UTF-8 Work-Around (for eggdrop, this helps automagic)
    # If you use automagic above, you may find that utf-8 charsets are
    # being mangled. To keep the ability to use automagic, yet when
    # utf-8 is the charset automagic detects, this makes the script
    # instead follow the settings for that country in the
    # encode_strings section below.
    # ------
    # NOTE: If your bot is utf-8 patched, set this to 0. Everyone
    # else, use 1.
    # ------
    variable utf8workaround 0
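As a side note, the gzip comments above say the script probes for the Trf or zlib packages on every rehash or restart before honoring use_gzip. A hypothetical sketch of such a probe (the proc name and exact checks are illustrative, not the script's own code):

```tcl
# Illustrative probe for gzip capability, as the config comments
# describe: look for Tcl 8.6's built-in zlib command, then fall
# back to trying the Trf extension package.
proc gzip_capable {} {
    if {[llength [info commands zlib]]} { return 1 }   ;# Tcl 8.6+ core zlib
    if {![catch {package require Trf}]} { return 1 }   ;# Trf extension
    return 0                                           ;# no gzip support found
}
```

With a probe like this, use_gzip can stay at 1 safely: the script simply falls back to uncompressed html when neither facility is present.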

So anyway, speechles is way too busy to make a complete release, but soon enough he will get back to it.

Until then, play with this one, here it is:

http://bsdunix.info/spithash/nagger/inc ... EMPfix.tcl
User avatar
pogue
Voice
Posts: 28
Joined: Sun May 17, 2009 3:56 am
Contact:

Post by pogue »

spithash wrote: OK, speechles fixed the wiki and added the temporary youtube fix in a kinda-working pre-release, so I thought it would be great to share it with you.
Thanks spithash & speechles!
Mabus4444
Halfop
Posts: 51
Joined: Mon Oct 30, 2006 7:40 pm

Post by Mabus4444 »

2.0 fixes the youtube problem, but the wiki problem still isn't fixed for me. I get this error in the console:

Tcl error [incith::google::public_message]: Unknown option -urlencoding, must be: -accept, -proxyfilter, -proxyhost, -proxyport, -useragent
User avatar
Trixar_za
Op
Posts: 143
Joined: Wed Nov 18, 2009 1:44 pm
Location: South Africa
Contact:

Post by Trixar_za »

Probably because you're using an older version of http.tcl; all you need is to find a newer copy of it and load it before this script. You can grab my copy @ http://www.trixarian.za.net/downloads/http.tcl
Mabus4444
Halfop
Posts: 51
Joined: Mon Oct 30, 2006 7:40 pm

Post by Mabus4444 »

I'm using http.tcl version 2.5.2

I tried loading your copy instead, and restarted the bot. Same error message.
User avatar
spithash
Master
Posts: 248
Joined: Thu Jul 12, 2007 9:21 am
Location: Libera
Contact:

Post by spithash »

Here is a newer version/update of http.tcl

http://bsdunix.info/spithash/nagger/http.tcl
Mabus4444
Halfop
Posts: 51
Joined: Mon Oct 30, 2006 7:40 pm

Post by Mabus4444 »

Thanks for the updated version.

The problem persists, however; I tried a rehash and a full restart to no avail. I get the following message in the console:

Tcl error [incith::google::public_message]: Unknown option -urlencoding, must be: -accept, -proxyfilter, -proxyhost, -proxyport, -useragent
User avatar
speechles
Revered One
Posts: 1398
Joined: Sat Aug 26, 2006 10:19 pm
Location: emerald triangle, california (coastal redwoods)

Post by speechles »

Mabus4444 wrote:Thanks for the updated version.

The problem persists, however; I tried a rehash and a full restart to no avail. I get the following message in the console:

Tcl error [incith::google::public_message]: Unknown option -urlencoding, must be: -accept, -proxyfilter, -proxyhost, -proxyport, -useragent
My good sir, the answer is simple. The answer is clear, the answer is close, and the answer is here:

Code: Select all

set http [::http::config -useragent $ua -urlencoding "utf-8"]
Change that, to look like this...

Code: Select all

set http [::http::config -useragent $ua]
Also, there might be one or two of these to change:

Code: Select all

set http [::http::config -useragent $ua -urlencoding "utf-8"]
To look like this:

Code: Select all

set http [::http::config -useragent $ua]
Done... Ready for more?
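As an aside, an alternative to deleting the option everywhere is to guard it by package version. This is a sketch under the assumption that the stock Tcl http package only accepts -urlencoding from roughly version 2.5 onward (older versions reject it with exactly the error quoted above); the $ua value here is a made-up placeholder for the script's real user-agent string:

```tcl
package require http

# Placeholder user-agent; the actual script builds its own $ua.
set ua "Mozilla/5.0 (compatible; example-bot)"

# Only pass -urlencoding when the loaded http package is new enough
# to understand it; otherwise fall back to the plain call.
if {[package vcompare [package provide http] 2.5] >= 0} {
    ::http::config -useragent $ua -urlencoding "utf-8"
} else {
    ::http::config -useragent $ua
}
```

This way one copy of the script runs on both old and new http.tcl installs, instead of needing hand edits per bot.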

Now, before you begin applying all those changes by hand: use the version below instead, which already updates and exchanges them for you.

::::: >> @everyone, especially Mabus4444
New Version: Incith:Google v2.0.0b
The version above corrects several small bugs, enhances mediawiki/wikimedia, and now parses wikia sites 100% as well. This brings a plethora of new built-in custom trigger phrases, with literally thousands more for you to design yourself.

Code: Select all

      "fg:wm:.familyguy.wikia.com %search%"
      "ad:wm:.americandad.wikia.com %search%"
      "sp:wm:.southpark.wikia.com %search%"
      "sw:wm:.starwars.wikia.com %search%"
      "na:wm:.naruto.wikia.com %search%"
      "in:wm:.inuyasha.wikia.com %search%"
      "gr:wm:.gremlins.wikia.com %search%"
      "wow:wm:.wowwiki.com %search%"
      "smf:wm:.smurf.wikia.com %search%"
      "sm:wm:.sailormoon.wikia.com %search%"
      "pk:wm:.pokemon.wikia.com %search%"
      "ss:wm:.strawberryshortcake.wikia.com %search%"
      "mlp:wm:.mlp.wikia.com %search%"
      "lps:wm:.lps.wikia.com %search%"
      "ant:wm:.ants.wikia.com %search%"
      "gm:wm:.gaming.wikia.com %search%"
      "nt:wm:.nothing.wikia.com %search%"
      "ff:wm:.finalfantasy.wikia.com %search%"
All of the "custom trigger" phrases above provide short-cuts to long wikimedia names. If you need any explanation of how to construct custom trigger phrases, ask. Very nested and complex trigger combinations are possible which may not be overly apparent to the mere user of this script.
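Purely for illustration, the trigger phrases above all share an alias:engine:site shape followed by a %search% placeholder. The real incith-google parser may work quite differently; the proc name and decomposition below are hypothetical, just to show how such a phrase could be split apart:

```tcl
# Hypothetical decomposition of a custom trigger phrase of the form
# "alias:engine:site %search%" (NOT the script's actual parser).
proc parse_trigger {phrase query} {
    # First whitespace-separated word carries the colon-delimited spec.
    set spec [lindex [split $phrase] 0]
    lassign [split $spec ":"] alias engine site
    # Substitute the user's query for the %search% placeholder.
    set expanded [string map [list %search% $query] $phrase]
    return [list $alias $engine $site $expanded]
}

puts [parse_trigger "fg:wm:.familyguy.wikia.com %search%" "peter griffin"]
```

So "fg" would be the short trigger, "wm" selects the wikimedia engine, and the remainder names the site to search.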

If you experience issues, shout them out. Yes, youtube is still technically broken. It is merely wrapped through google, with some custom trigger-phrasing logic to give the appearance that it works. This _will_ eventually be addressed when time permits. It demonstrates the power of custom trigger phrases and their potential to do wonderful things. So remember: youtube doesn't work. It will soon; until then, investigate the !video or !v trigger, which does work. Everybody forgets about that trigger...