Buyer beware, you can't just guess at how to fix it. Your bot is looping in that while forever... It is harder than one thinks to alter a script and have it function correctly, isn't it? Yes. In this case it is...
spithash wrote:
ok, adding an extra } fixed it, but it's not working at all now.
my script looks like this ^
Code: Select all
# parse the html
while {$results < $incith::google::youtube_results} {
  # somewhat extenuated regexp due to allowing that there might be an image next to the title
  if {[regexp -nocase {<span class="video-time">(.*?)</span.*?href="/watch\?v=(.+?)".+?title=".+?">(.+?)</a>.*?id="video\-description.*?>(.*?)</p.*?class="date\-added">(.+?)</span.*?class="viewcount">(.+?)</span} $html - ded4 cid desc ded ded2 ded3]} {
    if {[string match "*</span>*" $desc]} {
      regexp -nocase {<span class="video-time">(.*?)</span.*?href="/watch\?v=(.+?)".+?title="(.*?)">.*?id="video\-description.*?">(.*?)</p.*?class="date\-added">(.+?)</span.*?class="viewcount">(.+?)</span} $html - ded4 cid desc ded ded2 ded3
    }
    regsub -nocase {<span class="img">.*?</div> </div>} $html "" html
  }
}
But after I !youtube search, it just shows/does nothing
EDIT: the bot ping timeouts and never comes back after the searching.
Why? Because you've only changed the scrape, not the scrub as well. That isn't an inline regexp you see. It's your plain-Jane, ordinary regexp, which will match the same first hit in $html every time it runs. There is a corresponding scrubber (in this case, the regsub at the end of the loop) that goes hand in hand with this type of scraping method: it deletes what was just matched so the next pass finds the next result. If the regsub cannot scrub, the regexp will keep matching the exact same text. Forever. I didn't make it this way; it was made this way originally by incith. Here is how you should likely alter that regsub to fix the scrub and that nasty endless looping. Change the regsub below:
Code: Select all
regsub -nocase {<span class="img">.*?</div> </div>} $html "" html
To this:
Code: Select all
regsub -nocase {<span class="video-time">.*?</span.*?href="/watch\?v=.+?".+?title=".+?">.+?</a>.*?id="video\-description.*?>.*?</p.*?class="date\-added">.+?</span.*?class="viewcount">.+?</span} $html "" html
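If it helps to see the scrape/scrub pairing on its own, here is a stripped-down sketch of the idea. It is not incith's code: the `<item>` markup, `max_results` and `found` are made-up stand-ins, and the `incr` and `break` are only there so the toy loop can end by itself.

```tcl
# Hypothetical stand-in for the fetched page; NOT real YouTube HTML.
set html {<item>one</item><item>two</item><item>three</item>}
set max_results 5   ;# plays the role of $incith::google::youtube_results
set results 0
set found {}

while {$results < $max_results} {
    # scrape: a plain regexp (no -all) only ever reports the FIRST match in $html
    if {![regexp -nocase {<item>(.*?)</item>} $html - title]} {
        break   ;# nothing left to match, so stop
    }
    lappend found $title
    incr results
    # scrub: delete the span that was just matched, using the same pattern,
    # so the next pass through the loop sees the next result
    regsub -nocase {<item>.*?</item>} $html "" html
}

puts $found   ;# prints: one two three
```

Change the regsub pattern so it no longer matches anything in $html (which is what happened when only the regexp was updated) and the regexp picks up "one" on every pass. Here the counter would eventually stop it, but a loop that relies on the text being consumed never gets anywhere.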