This is the new home of the egghelp.org community forum.
All data has been migrated (including user logins/passwords) to a new phpBB version.


For more information, see this announcement post. Click the X in the top right-corner of this box to dismiss this message.

Recursive File search (possibly via database?) [solved!]

Requests for complete scripts or modifications/fixes for scripts you didn't write. Response not guaranteed, and no thread bumping!
D
DJCharlie
Voice
Posts: 37
Joined: Wed May 06, 2009 10:45 am
Contact:

Recursive File search (possibly via database?) [solved!]

Post by DJCharlie »

Ok folks, first a bit of background. I work for a net radio station, and I'm trying to set up a !search and !fetch command so our DJs can download songs from our central archive from the DJ chat room on our IRC server.

The problem is, we have over 471,000 files in the main section (not counting the holiday music, commercials, etc), and it takes about 3-4 minutes for a search to be run with the current script I have.

As nml375 mentioned in this thread, the best bet may be using a database to search the files.

Anyone willing to give it a shot?

Thanks in advance.
Last edited by DJCharlie on Sat Jul 24, 2010 3:56 pm, edited 1 time in total.
User avatar
caesar
Mint Rubber
Posts: 3778
Joined: Sun Oct 14, 2001 8:00 pm
Location: Mint Factory

Post by caesar »

He's right, you are stuck with using a database. In your case I would go with MySQL (you would need a MySQL server to host it) cos is a lot faster and has way more commands than SQlite (the database is stored in a local file).

I would create a web app and do any uploading, editing, deleting, etc. ONLY from there, then create a very rudimentary page for the Eggdrop to do a search for artists, title of a song and so on, and after a simple parse of the results just display them where you wish.

You can also use the MySQL library or the MySQL module to connect it directly to the MySQL server, thus skipping the parsing and continuing directly with the display of the results.

It would be something like IMDB.com searching script, but without much hassle of parsing the HTML (I'd go XML) code to be parsed. :)
Once the game is over, the king and the pawn go back in the same box.
D
DJCharlie
Voice
Posts: 37
Joined: Wed May 06, 2009 10:45 am
Contact:

Post by DJCharlie »

Got the server, and an (unfortunately, empty) database set up. Trouble is initially populating it. I've tossed together a bash script to do that, but some artists INSIST on spelling their names with an apostrophe (I'm looking at you, Anita O'Day), which is causing errors. Once we get that done, the rest should be cake.
n
nml375
Revered One
Posts: 2860
Joined: Fri Aug 04, 2006 2:09 pm

Post by nml375 »

I'd rather use a tcl-script for adding the entries (or PHP, or lua, etc). This way, I get access to the mysql_real_escape function from the MySQL C-API. When properly used, this allows you to disregard any issues with characters such as '.

In tcl, using the mysqltcl package, such a script would look somewhat like this:

Code: Select all

set files [glob *pattern*]
set db [::mysql::connect parms]
foreach file $files {

  set item [split $file -]
  set artist [lindex $item 0]
  set title [join [lrange $item 1 end]]

  ::mysql::exec $db "INSERT INTO the_table (`artist`, `title`, `file`) VALUES ('[::mysql::escape $artist]', '[::mysql::escape $title]', '[::mysql::escape $file]');"
}
::mysql::close $db
NML_375
D
DJCharlie
Voice
Posts: 37
Joined: Wed May 06, 2009 10:45 am
Contact:

Post by DJCharlie »

Well, I finally cracked the error in the bash script (swapped " for ' in the sql query string), and it's doing the initial fill now. :)

Next step is to write up a quick search function for the eggdrop.

Thanks again!
D
DJCharlie
Voice
Posts: 37
Joined: Wed May 06, 2009 10:45 am
Contact:

Post by DJCharlie »

It works!!! Search time has been cut by several seconds!

Only problem I'm having now is formatting. The result comes back as:

Code: Select all

-FredGibson- 8655 {Collective Soul} {Better Now} {/mnt/music/C/Collective Soul/Collective Soul - Better Now.mp3}
And what I'd like to have is:

Code: Select all

Collective Soul - Better Now
Here's the proc:

Code: Select all

proc pub_searchtwo {nick host hand chan text} {
   global db
   putserv "NOTICE $nick :Results for $text:"
   set sqlhand [mysqlconnect -host $db(hostname) -user $db(username)
-password $db(password)]
   mysqluse $sqlhand $db(database)
   foreach query [mysqlsel $sqlhand "SELECT * FROM music WHERE
title LIKE '$text'" -list] {
      putserv "NOTICE $nick :$query"
   }
   mysqlclose $sqlhand
}
Any suggestions?
n
nml375
Revered One
Posts: 2860
Joined: Fri Aug 04, 2006 2:09 pm

Post by nml375 »

Well, as the docs suggest, mysqlsel returns a list of lists when the -list option is used. I suggest you have a look at the lindex command, which allows you to retrieve a single list item from a list:

Code: Select all

...
putserv "NOTICE $nick :[lindex $query 1] - [lindex $query 2]"
...
NML_375
D
DJCharlie
Voice
Posts: 37
Joined: Wed May 06, 2009 10:45 am
Contact:

Post by DJCharlie »

Well, that works. :)

!fetch is still slower than Christmas, but I'm getting to it in a few.
n
nml375
Revered One
Posts: 2860
Joined: Fri Aug 04, 2006 2:09 pm

Post by nml375 »

Did you setup a proper index for the title column?
NML_375
D
DJCharlie
Voice
Posts: 37
Joined: Wed May 06, 2009 10:45 am
Contact:

Post by DJCharlie »

I -think- so. What I have is 4 fields: tracknum, artist, title, path

Trouble I'm having, is when someone requests track number 8655 (as an example, with the test data I'm using, that would be Collective Soul - Better Now), instead of getting track number 8655, it's returning track number 1.

Code: Select all

proc pub_fetch {nick host hand chan text} {
   global db
   set sqlhand [mysqlconnect -host $db(hostname) -user $db(username)
 -password $db(password)]
   mysqluse $sqlhand $db(database)
   foreach query [mysqlsel $sqlhand "SELECT * FROM music WHERE tracknum = '$text'"] {
      putlog "$query"
      putserv "NOTICE $nick :Sending you Song #[lindex $query 0] [lindex $query 1] - [lindex $query 2]"
   }
   mysqlclose $sqlhand
}
I know it's probably something stupid.
User avatar
CrazyCat
Revered One
Posts: 1299
Joined: Sun Jan 13, 2002 8:00 pm
Location: France
Contact:

Post by CrazyCat »

Just some words about your query:
- if you want to use a LIKE statement, it's "LIKE title='%$text%'",
- you should try to use fulltext indexes on artist and title, and have a query looking like: "SELECT * FROM music WHERE MATCH(artist, title) AGAINST ('$text'' WITH QUERY EXPANSION)"
D
DJCharlie
Voice
Posts: 37
Joined: Wed May 06, 2009 10:45 am
Contact:

Post by DJCharlie »

I'm currently using the code phpmyadmin spat out at me, and it works for the search. :) !fetch is a whole different can of worms though.
n
nml375
Revered One
Posts: 2860
Joined: Fri Aug 04, 2006 2:09 pm

Post by nml375 »

As far as I recall, using "expression LIKE pattern" is the proper syntax for SELECT-queries in MySQL. However, the pattern should be passed through the mysql_real_escape function (::mysql::escape or mysqlescape), as to prevent SQL injection attacks.

As for your last query, I assume tracknum is of type INT? If so, then you should not use ticks or quotes around it (the integer value, that is):

Code: Select all

"SELECT * FROM `music` WHERE `tracknum` = [::mysql::escape $text]"
Edit: Edited for clarity..
NML_375
D
DJCharlie
Voice
Posts: 37
Joined: Wed May 06, 2009 10:45 am
Contact:

Post by DJCharlie »

tracknum is an int(11) field. And I just tried your line:

Code: Select all

"SELECT * FROM `music` WHERE `tracknum` = [::mysql::escape $text]"
and got:

Code: Select all

Tcl error [pub_fetch]: mysqlsel/db server: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''music' WHERE 'tracknum' = 8655' at line 1
So there's something else wrong in there.
n
nml375
Revered One
Posts: 2860
Joined: Fri Aug 04, 2006 2:09 pm

Post by nml375 »

Are you using backticks (`) or "normal" ticks (') around music and tracknum?
Names (tablename, columnname, etc) should only use backticks (`), while strings should use normal ticks ('), and numbers no delimiters.

For simple names, the backticks may be left out, as long as it is obvious we are referring to a name, and not a string.
NML_375
Post Reply