Instead of digging through the HTML page structure in a messy, hard-coded style with [regsub] (as virtually all eggdrop scripts do), we will use the tDOM package, which implements several important XML technologies (a DOM tree, XPath queries and a forgiving HTML parser) that make our lives much easier:
#!/bin/sh
# This line continues for Tcl, but is a single line for 'sh' \
exec tclsh8.4 "$0" ${1+"$@"}
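package require tdom
package require http

# Fetch the page, then release the HTTP token
set url "http://news.bbc.co.uk/sport"
set token [::http::geturl $url]
set page [::http::data $token]
::http::cleanup $token

# Parse the (possibly malformed) HTML into a DOM tree
set doc [dom parse -html $page]
set root [$doc documentElement]

# Pick the headline node with an XPath expression
set nodes [$root selectNodes {//table[@width=416]/tr[1]/td[3]/div[2]}]
if {[llength $nodes] == 0} {
    puts stderr "No matching node; the page layout may have changed"
    exit 1
}
set text [[lindex $nodes 0] text]
puts "Latest sport news: [string trim $text]"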
This little script's output is:
[demond@whitepine demond]$ ./tdom.sh
Latest sport news: Lord Coe flies to London on Thursday determined to make an immediate start to preparations for the 2012 Olympics.
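If you want to try the XPath part without hitting the network, something like the following should do as a rough sketch. The HTML string here is a made-up stand-in that only mimics the table layout the expression expects; it is not the real BBC markup, and the variable names are just for illustration.

```tcl
package require tdom

# Hypothetical stand-in for a downloaded page (NOT the real BBC markup)
set html {<html><body><table width="416"><tr><td>a</td><td>b</td><td><div>Header</div><div> Sample headline text. </div></td></tr></table></body></html>}

set doc [dom parse -html $html]
set root [$doc documentElement]

# selectNodes returns a Tcl list of node commands (empty if nothing matches)
set nodes [$root selectNodes {//table[@width=416]/tr[1]/td[3]/div[2]}]
if {[llength $nodes] > 0} {
    set headline [string trim [[lindex $nodes 0] text]]
} else {
    set headline ""
}
puts $headline

# Free the DOM tree once we are done with it
$doc delete
```

The point is that the XPath expression describes where the data sits in the document tree, so when the site changes its layout you only have to adjust that one expression rather than rework a pile of regular expressions.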