Deleting large amounts of data in an OpenEdge database

  • Hi, I need to delete large amounts of data from a table on a regular basis and I was wondering what the best way of doing that would be.

    I'm using OpenEdge 10.2A.

    Thank you.

  • Drop the table.

  • Well I don't need to delete all the data from it.

  • How often will you need to delete the data?

    I have some routines I perform at low-usage times, like 4 am daily. If your job is weekly, you could likely hit some low usage on Sundays.

  • I have a scheduled task that currently runs once a day. Its run time has increased significantly as the amount of data has grown over time.

    I'm currently analyzing my system to find better down times, as you mentioned, since the task is now running into operational hours and affecting those operations.

    I'm also considering running the task more than once a day.

  • I've had the fewest complaints by running large processes at 4 am server time, and more conservative transactions at lunchtime. Most of my users are off the DBs during their lunch.

    You could also add a third process time of "end of work + 1 hour", again run conservatively.

    We are a 24-hour shop, but the bulk of our users are active from 7 am to 6 pm, so gauge your own environment accordingly. It may help to consult the dbname.lic file in your database directory to figure out low-usage times. That's only an indirect measure: the .lic file tells you how many users are logged on, not whether anyone is crunching serious CPU or disk.
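
    If it helps, here is a rough sketch of scanning that file from ABL. It assumes each entry is a whitespace-separated line starting with a date, a time, and a user count, and that the file is called mydb.lic; check a few lines of your own dbname.lic to confirm the layout before trusting the numbers.

    DEFINE VARIABLE cDate  AS CHARACTER NO-UNDO.
    DEFINE VARIABLE cTime  AS CHARACTER NO-UNDO.
    DEFINE VARIABLE iUsers AS INTEGER   NO-UNDO.

    INPUT FROM VALUE("mydb.lic").     /* assumed name: the .lic file next to the database */
    REPEAT:
        IMPORT cDate cTime iUsers.    /* assumed column order; verify against your file   */
        IF iUsers <= 5 THEN           /* arbitrary "low usage" threshold                   */
            DISPLAY cDate  FORMAT "x(10)"
                    cTime  FORMAT "x(8)"
                    iUsers LABEL "Users".
    END.
    INPUT CLOSE.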

  • lace28 wrote:

    Hi, I need to delete large amounts of data from a table on a regular basis and I was wondering what the best way of doing that would be.

    I'm using OpenEdge 10.2A.

    Thank you.

    How is the delete code structured? Deleting records one at a time is a lot slower than deleting them in groups of 100 or 1,000.

  • Thanks for the tip about the .lic file; I didn't know that. As for low-usage times, I'm confident I'll find the right windows after a little more analysis.

  • It's interesting that you mention that as I'm currently testing different code structures.

    As it is, most of the work is done inside FOR EACH loops that delete one record at a time.

    for each table1 where <date condition>:
      delete table1.
    end.

    for each table2 where <date condition>:
      delete table2.
    end.

    etc...

    How exactly would you delete them in groups?

  • Something like this does the job nicely; it groups the deletes into transactions of 100 records each:

    DEFINE VARIABLE i AS INTEGER NO-UNDO.

    DEFINE QUERY q-name
        FOR table-name.

    OPEN QUERY q-name
        FOR EACH table-name
            WHERE table-name.field-name = "something".

    GET FIRST q-name NO-LOCK.

    tx-block:
    REPEAT TRANSACTION:

        /* delete up to 100 records per transaction */
        DO i = 1 TO 100:

            IF NOT AVAILABLE table-name THEN
                LEAVE tx-block.

            GET CURRENT q-name EXCLUSIVE-LOCK.
            DELETE table-name.
            GET NEXT q-name EXCLUSIVE-LOCK.

        END.

    END.

  • Sounds good. I'll do some testing and post my results. Thanks.

  • You might want to test that assumption.

    Grouping is great for creation & updates.  My experience is that it doesn't help deletes.

    --
    Tom Bascom
    tom@wss.com

  • OK, here are some results for deleting chunks of 10,000 records from one table.

    Code1:

    define variable icnt as integer no-undo.

    for each table1:
      delete table1.
      icnt = icnt + 1.
      if icnt > 9999 then leave.
    end.

    Average time: 12.7 seconds


    Code2:

    define variable icnt  as integer no-undo.
    define variable icnt2 as integer no-undo.

    DEFINE QUERY q-name
        FOR table1.

    OPEN QUERY q-name
        FOR EACH table1.

    GET FIRST q-name NO-LOCK.

    tx-block:
    REPEAT TRANSACTION:

      DO icnt = 1 TO 100:
        IF NOT AVAILABLE table1 THEN LEAVE tx-block.
        GET CURRENT q-name EXCLUSIVE-LOCK.
        DELETE table1.
        GET NEXT q-name EXCLUSIVE-LOCK.
      END.

      icnt2 = icnt2 + 100.
      if icnt2 > 9999 then leave.

    END.

    Average time: 9.8 seconds


    Code3:

    define variable icnt as integer no-undo.

    do transaction:
      for each table1:
        delete table1.
        icnt = icnt + 1.
        if icnt > 9999 then leave.
      end.
    end.

    Average time: 10.9 seconds

    So Code 2 is faster than Code 1, and if you project the results onto a deletion of over 1,000,000 records, the difference is pretty significant.

    I guess I'll see how I can incorporate that into my daily job and see how it goes.

    Thanks again, and if anyone has more input or suggestions on this, they're welcome.
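
    For anyone who wants to reproduce these numbers, something like the following harness works; delete-chunk is just a stand-in name for whichever of the three variants is being timed:

    DEFINE VARIABLE iStart AS INTEGER NO-UNDO.
    DEFINE VARIABLE iMs    AS INTEGER NO-UNDO.

    iStart = ETIME(TRUE).   /* reset the millisecond timer                    */

    RUN delete-chunk.       /* stand-in: run one of the three variants here   */

    iMs = ETIME.            /* milliseconds elapsed since the reset           */
    DISPLAY iMs / 1000 FORMAT ">>>9.99" LABEL "Seconds".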

  • It might help if you would tell us a little about the actual problem.  In your example, it looks like you are deleting the most recent data, which would be unusual.  If we knew something about the process, we might be able to suggest different approaches.

    E.g., years and years ago, before we had temp-tables for building temporary data for reporting, I developed a technique by which we made a copy of an empty database with a generic schema, filled that DB with the data for the report, and at the end simply deleted the database.   Back in the day, this made a really dramatic improvement in the overall processing time.

    So tell us a bit about where the data comes from and why you are deleting it.

    Consulting in Model-Based Development, Transformation, and Object-Oriented Best Practice  http://www.cintegrity.com

  • I'm actually deleting the data that's older than 30 days.

    I'm deleting it because there's a lot of data in some tables and the database grows pretty big pretty quickly.
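
    Putting the pieces of this thread together, a purge along these lines is one option: delete everything older than 30 days in transactions of roughly 100 records each. The table and date field names are placeholders, so adjust them to the real schema, and make sure the date field is indexed so the WHERE clause can bracket on it:

    /* Sketch only: table1 and table1.trans-date are placeholder names. */
    DEFINE VARIABLE iCount AS INTEGER NO-UNDO.

    PurgeLoop:
    REPEAT:
        iCount = 0.

        DO TRANSACTION:
            FOR EACH table1 EXCLUSIVE-LOCK
                WHERE table1.trans-date < TODAY - 30:

                DELETE table1.
                iCount = iCount + 1.

                /* commit after every 100 deletes to keep transactions small */
                IF iCount >= 100 THEN LEAVE.
            END.
        END.

        /* fewer than 100 deleted means nothing older than 30 days is left */
        IF iCount < 100 THEN LEAVE PurgeLoop.
    END.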