The main factor here is /when/ you run this routine.
If you ran it daily you would have fewer deletes each time = quicker runtime.
If you ran it weekly - and have a nice slot of time where no one is actively using the system - then who cares if it takes 9.1 minutes compared to 8.6 minutes?
So get your timing down - then do your tweaks.
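For what it's worth, here's a minimal sketch of what a batched purge like that could look like. Python's built-in sqlite3 stands in for whatever database is actually in play, and the `transactions` table and `created_at` column are made-up names; the point is that deleting in small batches keeps each transaction short, and timing the run tells you whether a daily or weekly slot suits you better:

import sqlite3
import time

def purge_old_rows(conn, cutoff, batch_size=10_000):
    """Delete rows older than `cutoff` in batches of `batch_size`."""
    total = 0
    start = time.monotonic()
    while True:
        cur = conn.execute(
            # Delete one batch; the subquery keeps each transaction small.
            "DELETE FROM transactions WHERE rowid IN ("
            "  SELECT rowid FROM transactions WHERE created_at < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()
        if cur.rowcount == 0:
            break  # nothing left to purge
        total += cur.rowcount
    elapsed = time.monotonic() - start
    print(f"purged {total} rows in {elapsed:.1f}s")

if __name__ == "__main__":
    conn = sqlite3.connect("db1.sqlite")  # hypothetical intermediate DB
    purge_old_rows(conn, cutoff="2013-01-01")  # assumes ISO-format timestamps
    conn.close()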
I hate to see people throwing away information. Define "pretty big".
If some process is slowing down as the amount of data in the table increases, then it is likely that the fault is the query in that process, not the amount of data in the table.
Like I already said, I'll figure out the best time to run the job.
And it's more like 12.7 secs per 10,000 records vs. 9.8 secs per 10,000 records, projected onto 2.5 million records, eventually more. It might not be that big a deal, but I care.
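To put those per-batch timings in perspective, the projection is plain arithmetic (using only the figures quoted above):

# Project total runtime from the per-batch timings quoted above.
records = 2_500_000
batch = 10_000
for secs_per_batch in (12.7, 9.8):
    minutes = records / batch * secs_per_batch / 60
    print(f"{secs_per_batch} s per {batch} records -> {minutes:.0f} min total")
# 12.7 s per 10000 records -> 53 min total
# 9.8 s per 10000 records -> 41 min total

So the gap over the full table is on the order of twelve minutes per run, which is real but arguably small for an off-hours job.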
No process is slowing down. It's about disk space and getting rid of lots of unnecessary data.
Like I said, how big is big? Disk is awfully cheap these days...
The data I'm cleaning goes through a first database before getting transferred to another.
The first DB acts as an intermediate, and that's the one I'm cleaning regularly.
So DB1 is 10 gigs now and DB2 is 60 gigs. It grew 25 gigs in the last year and will grow by more than that in the next year.
I understand disks are cheap but isn't it a good idea to delete useless data?
Really useless data is, of course, useless ... but is it really useless? I've had customers go back years mining information from old transaction details. This is the era of Big Data!
Right! I'll need to get used to it. Thanks for your help.
Then again, we may be deluding ourselves into the belief that all that data is actually going to be good for something someday. I bet often it turns out not to be.
In fact, one of the lessons of Big Data is that merely collecting large amounts of it is unlikely to produce meaningful results. I am only suggesting that deciding not to keep data simply because it is big is just as wrong as deciding to keep everything just in case.