regsub

caesar
Mint Rubber
Posts: 3778
Joined: Sun Oct 14, 2001 8:00 pm
Location: Mint Factory

Post by caesar »

How do I regsub something like "href=http://some.page.here/page__id/id__123456", where 123456 is a random number, out of a line like "bla bla bla href=http://some.page.here/page__id/id__123456 bla bla bla"? And how do I regsub a line like "bla bla bla 123' bla bla" to remove the random number 123 and the ' after it?
Once the game is over, the king and the pawn go back in the same box.
Papillon
Owner
Posts: 724
Joined: Fri Feb 15, 2002 8:00 pm
Location: *.no

Post by Papillon »

Code:

#to remove the http-part (as long as the number is always 6 digits long)
regsub { (href=http:).*(id__)(\d){6}} $line {} newline
#to remove the numbers and the '
regsub { (\d){3}\'} $line {} newline
Elen sila lúmenn' omentielvo

Post by caesar »

Well.. it doesn't seem to be working.. Here is the code I have so far:

Code:

bind dcc * bla my:bla

proc my:bla {idx hand arg} {
  set content [split [read [set f [open "tv.txt" r]]][close $f] \n] 
  putlog "searching..."

  regsub -all -- {<td|colspan="3"|bgcolor="#EFEFEF"><div align="center"><strong>|</strong></div></td>} $content "" content
  regsub -all -- {<table width="440"|cellpadding="0"|cellspacing="0"|align="center"|border=0>|<tr>|</tr>} $content "" content
  regsub -all -- {valign="top"|width="40"|><font size=1>|</font></td>|width="20"|>-</td>|<B>|</B>} $content "" content
  regsub -all -- {<a||bgColor='#fafafa'|bgColor='#A6A6FF'|style='text-decoration:none} $content "" content
  regsub -all -- {'</font> <img|src='http://tv.kappa.ro/images/detalii.gif'|border=0 alt='detalii'></a>|</font> <img|'} $content "" content

  set up [expr [lsearch $content "<!-- programul canalului ales -->"] + 2]
  set down [expr [lsearch $content "*<!-- template 2000 jos-->*"] - 2]
  set tosave [lrange $content $up $down]

  set i 0
  foreach line $tosave {
    if {[llength $line] == 0} {
      continue
    }
    putlog $line
    incr i
  }
  putlog "finished... $i line[expr {$i==1?"":"s"}].."
}
and here is the tv.txt file..

And from a line like "href=http://tv.kappa.ro/loc__detalii/id__469556 Roseanne (SUA serial de comedie, 1988) 30" I want to remove the URL with its random number and the last number on the line..

Post by Papillon »

try it like this

Code:

foreach line $tosave { 
    if {[llength $line] == 0} { 
      continue 
    } 
    regsub {(href=http:).*(id__)(\d){6}} $line {} line 
    regsub {(\d){2,}$} $line {} line
    putlog $line 
    incr i 
  } 
  putlog "finished... $i line[expr {$i==1?"":"s"}].." 
this is from my tclsh:
% set line "href=http://tv.kappa.ro/loc__detalii/id__469556 Roseanne (SUA serial de comedie, 1988) 30"
href=http://tv.kappa.ro/loc__detalii/id__469556 Roseanne (SUA serial de comedie, 1988) 30
% regsub {(href=http:).*(id__)(\d){6}} $line {} line ;puts $line
Roseanne (SUA serial de comedie, 1988) 30
% regsub {(\d){2,}$} $line {} line ;puts $line
Roseanne (SUA serial de comedie, 1988)
%

Post by caesar »

The first regsub seems to be working fine, but the second is not removing correctly; it removes the minutes too.. (bot): [01:23] 01: <~ there should be some minutes here, and the year in lines like (bot): [01:27] Luni, 29-12- :)

Post by Papillon »

caesar wrote: And from a line like "href=http://tv.kappa.ro/loc__detalii/id__469556 Roseanne (SUA serial de comedie, 1988) 30" I want to remove the URL with its random number and the last number on the line..
you didn't say anything about the [01:23] :)
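One way to keep the minutes and the year, sketched here under the assumption that the number to drop is always the last thing on the line and is separated from the title by whitespace, is to anchor the pattern on that whitespace. The proc name strip_trailing_number and the sample lines are made up for illustration; they are not from the original script.

```tcl
# Hypothetical helper: remove a final number only when whitespace precedes it.
proc strip_trailing_number {line} {
    # \s+\d+$ cannot match "01:23" or "29-12-2003": their last digits
    # follow ":" or "-", not whitespace, so regsub leaves them unchanged.
    regsub {\s+\d+$} $line {} line
    return $line
}

# Sample lines modelled on the ones quoted in this thread.
foreach sample {
    "Roseanne (SUA serial de comedie, 1988) 30"
    "01:23"
    "Luni, 29-12-2003"
} {
    puts [strip_trailing_number $sample]
}
```

If the real lines end in something other than a bare number (a trailing ' for example), the pattern would need to account for that.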

Post by caesar »

I did now :P :mrgreen:

Post by Papillon »

hehe... I'm bored, so I modified your script a bit... I think it's faster now ;)

Code:

bind dcc * bla my:bla 

proc my:bla {idx hand arg} { 
  set content [read [set f [open "tv.txt" r]]][close $f]
  putlog "searching..." 
  set tosave [string range $content [set f [string first "<!-- programul canalului ales -->" $content]] [string first "<!-- template 2000 jos-->" $content $f]]
  regsub -all -- {</b>|<br>|</font>} $tosave {  }  tosave
  regsub -all -- {<[^>]*>} $tosave {}  tosave
  regsub -all -- {&nbsp;} $tosave { }  tosave
  set i 0 ;set o {}
  foreach line [split $tosave \n] { 
    if {[llength $line] == 0 || [lindex $line 0] == {-}} { 
      continue 
    } 
    regsub {(\d){2,}[']} [string trim $line] {} line
    putlog $line
    incr i 
  } 
  putlog "finished... $i line[expr {$i==1?"":"s"}].." 
} 
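To see what the tag-stripping regsubs do on their own, here is a minimal sketch run on two single lines. The proc name strip_tags and the sample markup are invented for illustration, it assumes the page pads cells with the &nbsp; entity, and it simplifies the script above by replacing every tag with a space.

```tcl
# Hypothetical helper showing the tag-stripping steps on a single line.
proc strip_tags {html} {
    # Replace anything that looks like a tag: "<", any run of non-">" characters, ">".
    regsub -all -- {<[^>]*>} $html { } text
    # Assumption: the page uses the &nbsp; entity as padding.
    regsub -all -- {&nbsp;} $text { } text
    # Collapse the runs of spaces and tabs left behind by the removed markup.
    regsub -all -- {[ \t]+} $text { } text
    return [string trim $text]
}

# Made-up sample cells, loosely modelled on the markup in the earlier regsubs.
puts [strip_tags {<td width="40"><font size=1>07:00</font></td>}]
puts [strip_tags {<td><B>Minute&nbsp;de milioane</B></td>}]
```

Because <[^>]*> matches any tag, there is no need to list every attribute by hand as in the first version of the script.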
Sir_Fz
Revered One
Posts: 3794
Joined: Sun Apr 27, 2003 3:10 pm
Location: Lebanon

Post by Sir_Fz »

Alright! welcome back Papillon :P

Post by caesar »

Adding "|[<*>]" to the regsub, as in regsub -all -- {&nbsp;|[<*>]} $tosave { } tosave, and everything seems to be working smoothly. Thank you :mrgreen:

Btw, how should I make it arrange them like:
Date: Marti, 30-12-2003
07:00 Dennis, pericol public (SUA desene animate, rel.)
08:00 Minute de milioane
etc.

Post by Papillon »

something like this?

Code:

foreach line [split $tosave \n] { 
    if {[llength $line] == 0 || [lindex $line 0] == {-}} { 
      continue 
    } 
    regsub {(\d){2,}[']} [string trim $line] {} line 
    if {[regexp {^(\d){2}:(\d){2}$} $line]} { 
      set nline $line 
      continue
    } else {
      append nline " $line"
      putlog $nline 
      unset nline
    }
    incr i 
  }
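The same pairing idea can be tried outside the bot. The sketch below is only an illustration: the proc name group_schedule and the sample lines are hypothetical, and it assumes each time sits on its own line directly before its title, which may not match the real tv.txt layout.

```tcl
# Hypothetical helper: pair each "HH:MM" line with the title line that follows it.
proc group_schedule {lines} {
    set out {}
    set pending ""
    foreach line $lines {
        set line [string trim $line]
        if {$line eq ""} { continue }
        if {[regexp {^\d{2}:\d{2}$} $line]} {
            # Remember the time until its title arrives on a later line.
            set pending $line
            continue
        }
        if {$pending ne ""} {
            lappend out "$pending $line"
            set pending ""
        } else {
            # No time is pending (for example the date header): keep the line as is.
            lappend out $line
        }
    }
    return $out
}

# Sample input modelled on the layout asked for above.
set lines {
    "Marti, 30-12-2003"
    "07:00"
    "Dennis, pericol public (SUA desene animate, rel.)"
    "08:00"
    "Minute de milioane"
}
foreach row [group_schedule $lines] {
    puts $row
}
```

Returning a list instead of calling putlog directly keeps the helper testable in a plain tclsh; inside the bot each row could be passed to putlog.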
hehe, thx but I'm just home for xmas Sir_Fz :mrgreen:

Post by caesar »

As usual, I love you man! :mrgreen: Merry Xmas and Happy New Year! :P

Post by Papillon »

/me feels loved :mrgreen:
Merry Xmas and Happy New Year! 8)

Post by Sir_Fz »

Hope you're enjoying your vacation :)

Merry Christmas and happy New Year.