
How can I use the regsub command on text read from the web?

Regex
Voice
Posts: 19
Joined: Sat Mar 19, 2011 1:23 pm


Post by Regex »

Hi, dear egghelp.org moderators,

I'm trying to use the regsub command on text read from a web page.

Code: Select all

# Channel where the command will be used
set cmd_chan "#OperLog"

bind pub - !ingilizce msg_english
proc msg_english {nick uhost hand chan text} {
  global botnick cmd_chan
  if {![string match -nocase "$cmd_chan" $chan]} {return}
  regsub -all "ç" $text "%C3%A7" text
  regsub -all "Ç" $text "%C3%87" text
  regsub -all "ğ" $text "%C4%9F" text
  regsub -all "Ğ" $text "%C4%9E" text
  regsub -all "ı" $text "%C4%B1" text
  regsub -all "İ" $text "%C4%B0" text
  regsub -all "ö" $text "%C3%B6" text
  regsub -all "Ö" $text "%C3%96" text
  regsub -all "ş" $text "%C5%9F" text
  regsub -all "Ş" $text "%C5%9E" text
  regsub -all "ü" $text "%C3%BC" text
  regsub -all "Ü" $text "%C3%9C" text
  set (document) ""
  set ("#history-button") ""
  set ("#history") ""
  set ("#history-clear") ""
  set connect [::http::geturl http://www.sozluk.com.tr/sozluk.php?word=$text] 
  set files [::http::data $connect] 
  set list [split [subst -nocommands $files] "\n"]
  foreach i $list {
    regexp -- {<td class="col-t">(.+?)</td>} $i - translation
    if {[info exists translation]} { 
      regsub -all "%C3%A7" $text "ç" text
      regsub -all "%C3%87" $text "Ç" text
      regsub -all "%C4%9F" $text "ğ" text
      regsub -all "%C4%9E" $text "Ğ" text
      regsub -all "%C4%B1" $text "ı" text
      regsub -all "%C4%B0" $text "İ" text
      regsub -all "%C3%B6" $text "ö" text
      regsub -all "%C3%96" $text "Ö" text
      regsub -all "%C5%9F" $text "ş" text
      regsub -all "%C5%9E" $text "Ş" text
      regsub -all "%C3%BC" $text "ü" text
      regsub -all "%C3%9C" $text "Ü" text
      regsub -all "Ã§" $translation "ç" translation
      regsub -all "Ã‡" $translation "Ç" translation
      regsub -all "ÄŸ" $translation "ğ" translation
      regsub -all "Ä±" $translation "ı" translation
      regsub -all "Ä°" $translation "İ" translation
      regsub -all "Ã¶" $translation "ö" translation
      regsub -all "Ã–" $translation "Ö" translation
      regsub -all "ÅŸ" $translation "ş" translation
      regsub -all "Åž" $translation "Ş" translation
      regsub -all "Ã¼" $translation "ü" translation
      regsub -all "Ãœ" $translation "Ü" translation
      putquick "privmsg $cmd_chan \0033$text: \0031$translation"
      unset translation
    }
  }
  ::http::cleanup $connect
}
putlog "Dictionary TCL - Written By CLubber"
In short, this command in my script isn't working: regsub -all "Ã§" $translation "ç" translation

How can I correct the Turkish characters?

«00:30:47» <ZaL> !ingilizce divine
«00:30:48» <Eggdrop> divine: Tanrı gibi olan

I want to correct this: "Ä±" => "ı"

I hope I have explained my problem clearly.

Thanks for your concern.
caesar
Mint Rubber
Posts: 3778
Joined: Sun Oct 14, 2001 8:00 pm
Location: Mint Factory

Post by caesar »

I think you could use string map -nocase instead of the regsub lines, for example:

Code: Select all

set text "one TWO TeN"
set text [string map -nocase {one 1 two 2 ten 10} $text]
I tried your letters to check whether this works, but copy/paste into my shell prompt does not preserve them for some reason; for example, the ğ ends up as a . (dot). :roll:

Anyway, I think you can replace:

Code: Select all

  regsub -all "ç" $text "%C3%A7" text
  regsub -all "Ç" $text "%C3%87" text
  regsub -all "ğ" $text "%C4%9F" text
  regsub -all "Ğ" $text "%C4%9E" text
  regsub -all "ı" $text "%C4%B1" text
  regsub -all "İ" $text "%C4%B0" text
  regsub -all "ö" $text "%C3%B6" text
  regsub -all "Ö" $text "%C3%96" text
  regsub -all "ş" $text "%C5%9F" text
  regsub -all "Ş" $text "%C5%9E" text
  regsub -all "ü" $text "%C3%BC" text
  regsub -all "Ü" $text "%C3%9C" text
with:

Code: Select all

# No -nocase here: ç and Ç need different codes. string map returns the result, so assign it.
set text [string map {ç %C3%A7 Ç %C3%87 ğ %C4%9F Ğ %C4%9E ı %C4%B1 İ %C4%B0 ö %C3%B6 Ö %C3%96 ş %C5%9F Ş %C5%9E ü %C3%BC Ü %C3%9C} $text]
and:

Code: Select all

      regsub -all "%C3%A7" $text "ç" text
      regsub -all "%C3%87" $text "Ç" text
      regsub -all "%C4%9F" $text "ğ" text
      regsub -all "%C4%9E" $text "Ğ" text
      regsub -all "%C4%B1" $text "ı" text
      regsub -all "%C4%B0" $text "İ" text
      regsub -all "%C3%B6" $text "ö" text
      regsub -all "%C3%96" $text "Ö" text
      regsub -all "%C5%9F" $text "ş" text
      regsub -all "%C5%9E" $text "Ş" text
      regsub -all "%C3%BC" $text "ü" text
      regsub -all "%C3%9C" $text "Ü" text
      regsub -all "Ã§" $translation "ç" translation
      regsub -all "Ã‡" $translation "Ç" translation
      regsub -all "ÄŸ" $translation "ğ" translation
      regsub -all "Ä±" $translation "ı" translation
      regsub -all "Ä°" $translation "İ" translation
      regsub -all "Ã¶" $translation "ö" translation
      regsub -all "Ã–" $translation "Ö" translation
      regsub -all "ÅŸ" $translation "ş" translation
      regsub -all "Åž" $translation "Ş" translation
      regsub -all "Ã¼" $translation "ü" translation
      regsub -all "Ãœ" $translation "Ü" translation
with:

Code: Select all

set text [string map -nocase {%C3%A7 ç %C3%87 Ç %C4%9F ğ %C4%9E Ğ %C4%B1 ı %C4%B0 İ %C3%B6 ö %C3%96 Ö %C5%9F ş %C5%9E Ş %C3%BC ü %C3%9C Ü} $text]
set translation [string map {Ã§ ç Ã‡ Ç ÄŸ ğ Ä± ı Ä° İ Ã¶ ö Ã– Ö ÅŸ ş Åž Ş Ã¼ ü Ãœ Ü} $translation]
You will have to test this yourself, as my copy/paste fails miserably. :roll:
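One way to sidestep the copy/paste problem is to write the map with \u escapes, so the script file itself stays pure ASCII. The following is only a sketch: the proc name turkish_to_percent is made up for illustration, and it leaves out -nocase on the assumption that ç and Ç must map to different codes.

```tcl
# Hypothetical helper (not part of the original script): percent-encode the
# Turkish letters listed in this thread. The \u escapes keep the source file
# ASCII-only, so editor or terminal encoding cannot damage the map.
proc turkish_to_percent {text} {
    set map [list \
        \u00e7 %C3%A7  \u00c7 %C3%87 \
        \u011f %C4%9F  \u011e %C4%9E \
        \u0131 %C4%B1  \u0130 %C4%B0 \
        \u00f6 %C3%B6  \u00d6 %C3%96 \
        \u015f %C5%9F  \u015e %C5%9E \
        \u00fc %C3%BC  \u00dc %C3%9C]
    # string map returns the new string; it does not modify $text in place.
    return [string map $map $text]
}

# "T\u00fcrk" is "Türk" written with an escape.
puts [turkish_to_percent "T\u00fcrk"]
```

Only the twelve listed letters are handled, so any other non-ASCII character passes through unchanged.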
Once the game is over, the king and the pawn go back in the same box.
Regex
Voice
Posts: 19
Joined: Sat Mar 19, 2011 1:23 pm

Post by Regex »

@caesar, thanks for your concern, but it still isn't working.

«15:43:01» <ZaL> !ingilizce Turk
«15:43:02» <SEO> Turk: Türk soyundan kimse

Only the UTF-8 characters cause this.

How else can we correct this?
Johannes13
Halfop
Posts: 46
Joined: Sun Oct 10, 2010 11:38 am

Post by Johannes13 »

Code: Select all

	proc enc v {
		subst [regsub -all {[^a-zA-Z0-9\-_\.~]} [encoding convertto utf-8 $v] {[format %%%02X [scan {\0} %c]]}]
	}

	proc decode v {
		encoding convertfrom utf-8 [subst -nocommands -novariables [regsub -all {%([a-fA-F0-9]{2})} $v {\u00\1}]]
	}
should work.

Why make such a big string map/regexp thing that slows down everything?
(and is complex to maintain)
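For comparison, here is a loop-based sketch of the same idea that does not build a script out of the matched characters, so input containing braces or backslashes is not a concern. The names url_encode and url_decode are made up for this illustration, and the decoder assumes its input is plain ASCII with %XX escapes.

```tcl
# Hypothetical alternative to enc: percent-encode every UTF-8 byte that is
# not an unreserved URL character.
proc url_encode {str} {
    set out ""
    foreach byte [split [encoding convertto utf-8 $str] ""] {
        if {[regexp {^[a-zA-Z0-9._~-]$} $byte]} {
            append out $byte
        } else {
            # scan %c gives the byte value, format prints it as %XX
            append out [format %%%02X [scan $byte %c]]
        }
    }
    return $out
}

# Hypothetical alternative to decode: turn each %XX into one byte, then
# interpret the whole byte sequence as UTF-8.
proc url_decode {str} {
    set bytes ""
    set len [string length $str]
    for {set i 0} {$i < $len} {incr i} {
        set pair [string range $str [expr {$i + 1}] [expr {$i + 2}]]
        if {[string index $str $i] eq "%" && [regexp {^[0-9a-fA-F]{2}$} $pair]} {
            append bytes [format %c [scan $pair %x]]
            incr i 2
        } else {
            append bytes [string index $str $i]
        }
    }
    return [encoding convertfrom utf-8 $bytes]
}

puts [url_encode "T\u00fcrk"]
```

It is slower than a single regsub, but for one short IRC command per request that should not matter.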
Johannes13
Halfop
Posts: 46
Joined: Sun Oct 10, 2010 11:38 am

Post by Johannes13 »

Oh, and why do you run subst on the page contents?

It can be (ab)used to inject commands.
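A small sketch of the risk, using a made-up page fragment: the Tcl documentation for subst notes that a command substitution needed to complete a variable substitution still runs, even with -nocommands.

```tcl
# Hypothetical page content: the array index hides a command substitution.
set ::injected 0
set page {<b>$x([incr ::injected])</b>}

# -nocommands does not stop the [incr ...] inside the array index from
# running; the variable lookup itself then fails, hence the catch.
catch {subst -nocommands $page}
puts "injected = $::injected"

# Splitting the raw text performs no substitution at all.
set lines [split $page "\n"]
```

Here the injected command is a harmless incr, but a hostile page could put any command in that position.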

And do you know that there is a command called ::http::formatQuery?

It does basically the same thing as my enc proc does.

Code: Select all

bind pub - !ingilizce msg_english
proc msg_english {nick uhost hand chan text} {
  global botnick cmd_chan
  if {![string match -nocase "$cmd_chan" $chan]} {return}
  set connect [::http::geturl http://www.sozluk.com.tr/sozluk.php?[::http::formatQuery word $text]]
  set files [::http::data $connect]
  set list [split $files "\n"]
  foreach i $list {
    if {[regexp -- {<td class="col-t">(.+?)</td>} $i - translation]} {
      set translation [encoding convertfrom utf-8 [subst -nocommands -novariables [regsub -all {%([a-fA-F0-9]{2})} $translation {\u00\1}]]]
      putquick "privmsg $cmd_chan \0033$text: $translation"
      unset translation
    }
  }
  ::http::cleanup $connect
}
putlog "Dictionary TCL - Idea By CLubber" 
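The part of this script that actually repairs the Turkish letters is most likely the encoding convertfrom utf-8 call, since the HTML cells are not percent-encoded. The sketch below isolates that step so it can be tried without the bot or the website. The proc name extract_translations and the sample row are made up, and it assumes the body arrives undecoded, with each UTF-8 byte held as one Tcl character. If your http package already decodes the charset, the conversion should be dropped.

```tcl
# Hypothetical helper: collect every translation cell from a page body.
# Assumes $body holds raw UTF-8 bytes, one Tcl character per byte, which
# is the situation that produces output like "TanrÄ±".
proc extract_translations {body} {
    set result {}
    foreach {whole cell} [regexp -all -inline {<td class="col-t">(.+?)</td>} $body] {
        # Reinterpret the bytes of the cell as UTF-8 text.
        lappend result [encoding convertfrom utf-8 $cell]
    }
    return $result
}

# Made-up sample row; \u00c4\u00b1 are the bytes C4 B1 seen as two characters.
set sample "<tr><td class=\"col-t\">Tanr\u00c4\u00b1 gibi olan</td></tr>"
puts [extract_translations $sample]
```

Because the regexp runs over the whole body, this also works when several cells share one line.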
Regex
Voice
Posts: 19
Joined: Sat Mar 19, 2011 1:23 pm

Post by Regex »

@Johannes13, thanks for your concern.

My script is now working without any problems.

Thank you so much! :)
caesar
Mint Rubber
Posts: 3778
Joined: Sun Oct 14, 2001 8:00 pm
Location: Mint Factory

Post by caesar »

@Johannes13 : Interesting. Thanks for sharing. :)