Salesforce

How to run idxbuild the fastest way?

Information

 
TitleHow to run idxbuild the fastest way?
URL NameP72607
Article Number000156731
EnvironmentProduct: Progress
Product: OpenEdge
Version: All supported versions
OS: All Supported Operating Systems
Question/Problem Description
How to run idxbuild the fastest way?
What is the quickest way to run an index rebuild?
How to run Idxbuild the fast way?
 
Steps to Reproduce
Clarifying Information
Error Message
Defect Number
Enhancement Number
Cause
Resolution
1. Create a multi-volume sort file. This negates the need to use the -T parameter on the command line and allows sorting to use more than 2GB of space. If possible, allocate the sort directories onto hard drives other than where the database files reside.

2.  Example of  proutil command <note this does not make use of all possible parameters listed in this article>:

proutil <dbname> -C idxbuild <SOME/ALL> -TB 31 -TM 32 -B 1000

3. If you are using 9.1D07 or higher, you can use an additional parameter called Sort Groups (-SG), like this:

proutil <dbname> -C idxbuild SOME/ALL -TB 24 -TM 32 -B 1000 -SG 64

4. If you are using 10.2B06 or higher, you can use some additional parameters or some parameters with altered limits:

-TB <n>:  The sort block limit to -TB has been increased from 31 to 64
-TMB <n>:  This new parameter allows a block size specification for a blocksize for the merge block which can be separate from the sort block size.
-TF <n>: This parameter allows the idxbuild process to use system memory to hold more of the information being processed by the idxbuild.
-datascanthreads <n>:  This parameter configures the number of concurrent threads used to scan table data associated with the indexes being rebuilt.  
-mergethreads <n>: This parameter configures the number of concurrent threads to spawn during the sort/merge phase of index rebuild that merges the key entries in the sort blocks produced by the data scan phase for an individual index sort group.

5. When running through the idxbuild prompts, make sure that you say YES that you have enough disk space for index sorting. If you say NO, then it will run much slower and it will not use the multi-volume sort files either. Don't be fooled into thinking that by saying "NO" to this question that it will sort everything in memory and therefore be faster. Obviously you must ensure that you do in fact have enough disk space for index sorting, which is usually ~75% of the database size but could be more/less.

6. If a storage area is used only for indexes, consider truncating the area before running idxbuild.

7. Assuming that you're on 10.2B06 or later, look at -mergethreads and -datascanthreads. You can also set  -threadnum <x> but be careful as mergethreads are per threadnum thread.
Workaround
Notes
References to Written Documentation:
  • OpenEdge Data Management: Database Administration : Maintaining and Monitoring Your Database : Managing Performance : Index use : Rebuilding indexes : Maximizing index rebuild performance
  • OpenEdge Data Management: Database Administration : Reference : PROUTIL Utility : PROUTIL IDXBUILD qualifier

Progress Solutions:

 "How to scope and define a multi-volume srt file for idxbuild"
 "Understanding the new -SG parameter for IDXBUILD"
 What are the new Index Rebuild parameters that were introduced in 10.2B06?
 What performance gain using idxbuild multi-threads
"Windows Scripting Limit Affects Progress Proutil Batch File"

Notes from one of the database developers:

For data scan threads it is really trial and error. Balancing CPU usage and I/O throughput is the key to getting this optimal. The idea of 1.5 threads * CPUs is to eliminate any wasted CPU during this part of the index rebuild. With today's machines having so many CPUs, this number can become ridiculous since the file system will become the bottleneck. Having too many threads that are not improving I/O rates will introduce contention/concurrency issues and may actually decrease performance.

So you need enough datascan threads to use all the available CPU resources up to the point where I/O rates no longer improve.    

I know everyone wants a simple formula to just apply but there really isn't one that fits every deployment.  The experience of others with success running this in similar deployments to yours is the best resource for specific tuning suggestions.
Keyword Phrase
Last Modified Date9/6/2023 1:53 PM

Powered by