OE10.2B08 idxbuild all - fastest options - Forum - OpenEdge RDBMS - Progress Community


Age old question! I need to rebuild indexes on a DB as fast as possible. 

Got 10GB memory

500GB free on DB volume 

250GB free on another volume

What are the best options to use? Need any more info? 

Windows by the way. 

All Replies
  • 4 cores on the processor by the way!

  • 165GB DB.

  • proutil <DB> -C idxbuild ALL -TB 64 -TM 32 -TF 90 -SG 64 -T K:\Temp

    More Memory - faster idxbuild
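For reference, the same command with each flag annotated. The values are the ones from the thread, not verified recommendations for this particular DB, and <DB> and K:\Temp are placeholders:

```shell
# Annotated sketch of the command above; flag values are examples, not tuned.
# -TB 64 : sort block size in KB
# -TM 32 : merge number (sort streams merged per merge pass)
# -TF 90 : use up to 90% of memory for sort buffers before spilling to -T
# -SG 64 : number of sort groups
# -T     : directory for temporary sort files (a fast disk helps here)
proutil <DB> -C idxbuild ALL -TB 64 -TM 32 -TF 90 -SG 64 -T K:\Temp
```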

  • Truncate all index areas before running idxbuild.

  • Unfortunately we've got data and indexes mixed. Does that make a difference to this recommendation?

  • Of course, truncating can't be used for the mixed areas. It's a pity we can't move inactive indexes between areas; otherwise you could easily separate data from indexes before running idxbuild.

  • Assuming that you're on 10.2B06 or later, look at -mergethreads and -datascanthreads. You can also set -threadnum <x>, but be careful: -mergethreads is per -threadnum thread. IIRC -threadnum defaults to the number of CPUs/cores, but I always set it explicitly.

    There is also a new TMB parameter, which I typically set to 256. And a large -B doesn't help you: leave it at 256 or 512. Same for -i: no benefit per my testing.

    TF was the biggest addition to idxbuild, as it can move all the I/O-intensive sorting and merging into memory. What we used to fake with RAM drives is now built in.

    There was some weird issue with TF on HPUX where it couldn't/wouldn't allocate more than one shared memory segment, so even if you had 128 GB RAM on the server, proutil only used 4 GB (or whatever shmmax was). I had reported it as a bug so it might be fixed in later versions.

    There was also an issue when you ran multiple concurrent idxbuilds, all with -TF 90: they would bump into each other, sucking up all the RAM until one or more eventually crashed. But again, that was way back in 10.2B0x, so it might be resolved in later 11.x versions.

    Paul Koufalis
    White Star Software

    pk@wss.com
    @oeDBA (https://twitter.com/oeDBA)

    ProTop: The #1 Free OpenEdge DB Monitoring Tool
    http://dbappraise.com/protop.html
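Putting those flags together for a 4-core box, a hedged sketch; the values are taken from the replies above rather than being tuned recommendations, and <DB> and K:\Temp are placeholders:

```shell
# Sketch only: combines the flags discussed above for a 4-core Windows box.
# -threadnum 4       : one thread per core (the default, set explicitly)
# -mergethreads 2    : merge threads *per* -threadnum thread (up to 4 x 2 = 8)
# -datascanthreads 4 : parallel data scan in phase 1
# -TMB 256           : merge block size
# -TF 90             : keep up to 90% of memory for sorting before spilling
proutil <DB> -C idxbuild ALL -threadnum 4 -datascanthreads 4 ^
    -mergethreads 2 -TMB 256 -TF 90 -B 512 -SG 64 -T K:\Temp -z -rusage
```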
  • Thanks Paul. Appreciated as always. This is 10.2B08. Currently running a test with the command posted above by Manfred. -threadnum has definitely defaulted to num cores.

    It's currently about 10% of the way through phase 1 - 17.5m blocks. Could be a long old night ahead. :(

  • Kill it and add -datascanthreads 4 -mergethreads 2 -TMB 256 -B 512.

    Also add "-z -rusage" for some extra geeky stats.

  • The -rusage param isn't just geeky, it's important for determining whether you're tuning correctly. It will cause (11480) and (11483) messages to be written to stdout and the db log file for each area processed, telling you how much I/O was done to temp files (-T or -SS). Example:

    DBUTIL : (11480) Temporary sort file at /u/db/11.5/pac used up 0K of disk space.

    DBUTIL : (11483) A total of 0K of temporary sort disk space was used for area 6.

    If these numbers are larger than 0 KB, then you are doing extra disk I/O: write I/O during the data scan/key build phase, read/write I/O during the sort/merge phase, and read I/O during the b-tree insertion phase.

    That's a very small amount of RAM, especially for a DB this size.  Is this a VM?  Can you bump it up temporarily?
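To pull those per-area numbers out of a long db log afterwards, a simple grep works. A sketch only: sample.lg below stands in for your real .lg file and just mimics the message format quoted above:

```shell
# Sketch: scan a db log for the (11480)/(11483) temp sort space messages.
# sample.lg is a stand-in built from the lines quoted above; point grep at
# your actual .lg file instead.
cat > sample.lg <<'EOF'
DBUTIL : (11480) Temporary sort file at /u/db/11.5/pac used up 0K of disk space.
DBUTIL : (11483) A total of 0K of temporary sort disk space was used for area 6.
EOF
grep -E '\((11480|11483)\)' sample.lg
```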

  • Sorry, the 10GB memory was free memory. They might be able to beef it up for the live run. We're testing on the replica target at the moment to get an idea of the timings involved.

  • Sorry Paul and James, I was wrong about -rusage :(.  You get the 11480 and 11483 messages by default, at least back to 10.2B08.  The -rusage param adds lines like this:

    [2017/07/18@14:10:14.546-0400] P-75487      T-139980717442816 I DBUTIL   : (-----) Resource usage: CPU user 0.002000, system 0.001000

    Resource usage: DISK reads: 80 KB at 39 KB/sec, writes: 0 KB at 0 KB/sec

    [2017/07/18@14:10:14.546-0400] P-75487      T-139980717442816 I DBUTIL   : (-----) Resource usage: CPU user 0.003000, system 0.002999

    Resource usage: DISK reads: 1640 KB at 818 KB/sec, writes: 456 KB at 227 KB/sec

    Not sure where I got the idea that it was tied to 11480/11483, but (not surprisingly) Paul is correct: -rusage provides info that is interesting but not critical.

    That said, you should still pay attention to your 11480/11483 messages: ideally you want to see 0K of temp file I/O per area.

  • [2017/07/19@03:49:18.139+0100] P-4460       T-4928  I          : (11480) Temporary sort file at f:\sort\ used up 259868928K of disk space.

    [2017/07/19@03:49:18.141+0100] P-4460       T-4928  I          : (11483) A total of 259868928K of temporary sort disk space was used for area 6.

    That's a lot of disk I/O. So adding more RAM will help with that? Not that we'll be able to add 260GB more! I presume putting the sort files on as fast a disk as possible will help?

    Full idxbuild took 14.5 hours! :(
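To put that 259868928K in perspective, a quick back-of-the-envelope check. This assumes the 10GB of free RAM quoted earlier, and that -TF 90 makes roughly 90% of it available for sorting:

```python
# Temp sort space reported by message (11483), in KB.
temp_sort_kb = 259_868_928
temp_sort_gb = temp_sort_kb / (1024 * 1024)
print(f"temp sort space used: {temp_sort_gb:.0f} GB")  # roughly 248 GB

# Free RAM quoted earlier in the thread, and what -TF 90 would leave for sorting.
free_ram_gb = 10
sort_ram_gb = free_ram_gb * 0.90
print(f"sort memory with -TF 90: {sort_ram_gb:.0f} GB")

# ~9 GB of sort memory against ~248 GB of sort data: nearly everything
# spills to the -T disk, so faster temp disks (or much more RAM) dominate.
```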

  • Bit of a side question, but we've run idxfix option 3 on some of the tables we've got index corruption on, and it makes the problem go away, but after a while it comes back. It's always the same tables, though with different RECIDs. Is it possible that their ODBC connections are causing this? Is there anything else I can do to debug where this is coming from? We haven't changed our application code in a few weeks, so it's not the code. Is it possible that idxfix isn't solving a problem that idxbuild will?

  • Hi James. We had the same problem over here. It was a code page issue: a UTF-8 session was adding Chinese characters to a database with a Polish code page.