![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
|
count(*) | +----------+ 79303 | +----------+ |
#2
| |||
| |||
|
|
* I've got a leafnode directory with usenet articles named, roughly, 1-79389. After running a sh script on the directory to populate a table with various headers and the full article, I have, mysql> select count(*) from rcm; +----------+ | count(*) | +----------+ | * *79303 | +----------+ 1 row in set (0.00 sec) and an actual article count on the directory of 79347. * Obviously, there are 44 articles that for some reason or other, failed the INSERT (and an additional 42 that don't exist as filenames). *Given that server_id (mediumint) is assigned the filename of the article, how can I find which numbers between 1 and 79389 *don't* exist as server_id? |
#3
| |||
| |||
|
|
On May 5, 9:00*pm, Steve Ackman <st... (AT) SNIP-THIS (DOT) twoloonscoffee.com wrote: * I've got a leafnode directory with usenet articles named, roughly, 1-79389. After running a sh script on the directory to populate a table with various headers and the full article, I have, mysql> select count(*) from rcm; +----------+ | count(*) | +----------+ | * *79303 | +----------+ 1 row in set (0.00 sec) and an actual article count on the directory of 79347. * Obviously, there are 44 articles that for some reason or other, failed the INSERT (and an additional 42 that don't exist as filenames). *Given that server_id (mediumint) is assigned the filename of the article, how can I find which numbers between 1 and 79389 *don't* exist as server_id? I'm a bit unsure whether I correctly understand what you try to accomplish, but given you have a table with all numbers 1..79389 |
|
you could simply run a denying left join query on both tables like: select f.id as missing from fulltable f left join leafnodes l on f.id=l.nodeid where isnull(l.nodeid) |
|
but I am afraid I'm missing the point here |
#4
| |||
| |||
|
|
In <9b3a64b4-aeab-43fb-bc07-ef960a2c34b0 (AT) y38g2000hsy (DOT) googlegroups.com>, on Mon, 5 May 2008 13:57:43 -0700 (PDT), cvh@LE, christian.hansel (AT) cpi-service (DOT) com wrote: On May 5, 9:00*pm, Steve Ackman <st... (AT) SNIP-THIS (DOT) twoloonscoffee.com wrote: * I've got a leafnode directory with usenet articles named, roughly, 1-79389. After running a sh script on the directory to populate a table with various headers and the full article, I have, mysql> select count(*) from rcm; +----------+ | count(*) | +----------+ | * *79303 | +----------+ 1 row in set (0.00 sec) and an actual article count on the directory of 79347. * Obviously, there are 44 articles that for some reason or other, failed the INSERT (and an additional 42 that don't exist as filenames). *Given that server_id (mediumint) is assigned the filename of the article, how can I find which numbers between 1 and 79389 *don't* exist as server_id? I'm a bit unsure whether I correctly understand what you try to accomplish, but given you have a table with all numbers 1..79389 I'm not sure whether you think I have such a table, or whether you're suggesting I create one. |
#5
| |||
| |||
|
|
In <9b3a64b4-aeab-43fb-bc07-ef960a2c3... (AT) y38g2000hsy (DOT) googlegroups.com>, on Mon, 5 May 2008 13:57:43 -0700 (PDT), cvh@LE, christian.han... (AT) cpi-service (DOT) com wrote: On May 5, 9:00*pm, Steve Ackman <st... (AT) SNIP-THIS (DOT) twoloonscoffee.com wrote: * I've got a leafnode directory with usenet articles named, roughly, 1-79389. After running a sh script on the directory to populate a table with various headers and the full article, I have, mysql> select count(*) from rcm; +----------+ | count(*) | +----------+ | * *79303 | +----------+ 1 row in set (0.00 sec) and an actual article count on the directory of 79347. * Obviously, there are 44 articles that for some reason or other, failed the INSERT (and an additional 42 that don't exist as filenames). *Given that server_id (mediumint) is assigned the filename of the article, how can I find which numbers between 1 and 79389 *don't* exist as server_id? I'm a bit unsure whether I correctly understand what you try to accomplish, but given you have a table with all numbers 1..79389 * I'm not sure whether you think I have such a table, or whether you're suggesting I create one. you could simply run a denying left join query on both tables like: select f.id as missing from fulltable f left join leafnodes l on f.id=l.nodeid where isnull(l.nodeid) * I have one table rcm. *That's it. *I guess I can look up denying left joins and see where that leads me. Probably to understanding what you're doing there. *but I am afraid I'm missing the point here * The source is a directory containing filenames in the range of 3 to 79389. * The table "rcm" has 79303 records with server_id field in the range of those above filenames... so some files failed to be INSERTED by the script. * Very basically, given a table with record id numbers of 2,3,4,6,7,8,9, which numbers are missing? *How do I get MySQL to tell me that 1 and 5 are not there? * The ultimate point is to find out which 44 scripted INSERT statements failed and why, to fix the script. * I could find the missing articles via a couple of additional steps in the shell script, but it takes hours each time I run it, so I was hoping MySQL would have some sort of easy built-in way to do it. |
#6
| ||||
| ||||
|
|
I am abit surprised that running a shell script over a directory should take so long. |
|
Using Linux tools like awk with piped output to mysql a table (given its created) of all filenames/nodes in a directory can easily be created by something like: ls -l |awk '$1!~/^d/{print "INSERT INTO mytable set filename= \""$8"\";"}' |mysql -uxxxx -pxxxx mydatabase |
|
of course this would take longer if your putting fulltext content in the database, too. |
|
However this little snippet should suffice to fill a table of all files in a directory |
#7
| |||
| |||
|
|
In <41e20971-4fa0-41eb-9eff-b2bcddb1d6a5 (AT) d45g2000hsc (DOT) googlegroups.com>, on Tue, 6 May 2008 00:08:15 -0700 (PDT), cvh@LE, christian.hansel (AT) cpi-service (DOT) com wrote: I am abit surprised that running a shell script over a directory should take so long. It takes ~34 minutes on this machine just to do this: #! /bin/sh i=3 while [ $i -le 79389 ] do echo $i mysql news <<MARKER INSERT into count (id) VALUES ($i); MARKER i=$((i+1)) done |
![]() |
| Thread Tools | |
| Display Modes | |
| |