Information

Title	Performance of Binary Dump/Load versus Bulkload

URL Name	21005

Article Number	000122102

Environment	Progress 9.X, OpenEdge 10.X, OpenEdge 11.X, All Supported Operating Systems

Question/Problem Description

Performance assessment of Binary Dump/Load versus Bulkload.
Performance assessment of Binary Dump versus Bulkload
Performance assessment of Binary Load versus Bulkload

Resolution

Binary dump will always be faster than Data Dictionary dump.

As a starting point, try 4 processes per CPU to keep system utilization high.
Before starting, run proutil -C dbanalys and sort the tables by size, largest first. Start the largest tables as parallel background proutil -C dump processes, using one fewer than the chosen number of processes (4 - 1 = 3 in this example), then dump the remaining tables one at a time.

proutil dbname -C dump largetable1 . &
proutil dbname -C dump largetable2 . &

Repeat for as many large tables as there are background processes (3 in this example).

Then dump the remaining tables in the foreground:

proutil dbname -C dump smalltable1 .
proutil dbname -C dump smalltable2 .
(and so on)
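The parallel-then-sequential pattern above could be scripted roughly as follows. This is a minimal sketch, not a supported tool: the function name dump_tables and the variables PROUTIL, DB, DUMP_DIR and PROCS are hypothetical, and the input file is assumed to list one table name per line, largest table first (for example, prepared by hand from the dbanalys output).

```shell
#!/bin/sh
# Sketch only. Hypothetical names: dump_tables, PROUTIL, DB, DUMP_DIR, PROCS.
# Input: a file with one table name per line, largest table first.
dump_tables() {
    list=$1
    parallel=$(( ${PROCS:-4} - 1 ))   # background dumps: one fewer than the process count
    n=0
    while IFS= read -r table; do
        [ -n "$table" ] || continue   # skip blank lines
        n=$((n + 1))
        if [ "$n" -le "$parallel" ]; then
            # Largest tables: run in the background, in parallel.
            "${PROUTIL:-proutil}" "$DB" -C dump "$table" "${DUMP_DIR:-.}" </dev/null &
        else
            # Remaining tables: run in the foreground, one at a time.
            "${PROUTIL:-proutil}" "$DB" -C dump "$table" "${DUMP_DIR:-.}" </dev/null
        fi
    done <"$list"
    wait   # do not return until the background dumps have finished
}
```

With DB=dbname and PROCS=4, calling dump_tables tables.txt would start the first three tables in the background and dump the rest in order. Setting PROUTIL=echo prints the commands instead of running them, which is a cheap way to check the list first.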

If the database is online, you can also run a threaded binary dump:

proutil dbname -C dump <tablename> . -thread 1 -threadnum <threadnumber>
(-thread 1 turns on threaded binary dump; -threadnum sets the maximum number of threads)

A multi-threaded dump of a large table is easier to work with when -dumplist is used. The -dumplist parameter value is the name of a file in which the dump lists, in sequence, the <table name>.bd files that the multi-threaded binary dump creates.

Example:

1. Create a file named <table name>.pf. For the orderline table, this is orderline.pf.
2. Put the following in orderline.pf:

#
-thread 1
-threadnum 4
-dumplist orderline.txt

3. Run the following command at the prompt:

proutil dbname -C dump orderline . -pf orderline.pf

This command creates the file orderline.txt in the current directory. It contains the names of the dump files for the orderline table, as shown below:

<path>/orderline.bd
<path>/orderline.bd1
<path>/orderline.bd2
<path>/orderline.bd3
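When many tables are dumped this way, the parameter file from steps 1 and 2 can be generated per table instead of written by hand. The following is a minimal sketch; write_dump_pf is a hypothetical helper name, and the thread count of 4 is simply the value used in the example.

```shell
#!/bin/sh
# Sketch only. write_dump_pf is a hypothetical helper name.
# Writes <table>.pf with the threaded-dump parameters shown in step 2.
write_dump_pf() {
    table=$1
    threads=${2:-4}    # value for -threadnum; 4 matches the example
    cat >"$table.pf" <<EOF
#
-thread 1
-threadnum $threads
-dumplist $table.txt
EOF
}

# Usage (not run here):
#   write_dump_pf orderline 4
#   proutil dbname -C dump orderline . -pf orderline.pf
```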

The orderline.txt file can then be used with the binary load, using the following command:

proutil <db name> -C load orderline build indexes -dumplist orderline.txt
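Before running a load like the one above, it can be worth confirming that every file named in the dumplist is still present, for example after copying the dump files to another machine. The sketch below assumes only what the example shows, namely that the dumplist holds one file path per line; check_dumplist is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only. check_dumplist is a hypothetical helper name.
# Returns 0 if every file named in the dumplist exists and is non-empty;
# otherwise prints the offending names to stderr and returns 1.
check_dumplist() {
    list=$1
    status=0
    while IFS= read -r f; do
        [ -n "$f" ] || continue          # skip blank lines
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            status=1
        fi
    done <"$list"
    return "$status"
}

# Usage (not run here):
#   check_dumplist orderline.txt && echo "dumplist is complete"
```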

Bulkload does not write to the BI file, so it can be faster than a binary load, but a binary load can be tuned to run just as fast.

Use a 16K BI block size and a 128K BI cluster size, set -G to 0 to prevent the BI file from growing, and run with the -i parameter. If the system allows it, place the BI file on a RAM disk. Even without a RAM disk, -G 0 keeps the BI file small.

The 128K cluster size may seem too small, but the expense of switching clusters does not apply when using the -i parameter. Even if the BI file has to grow by more clusters, the overhead is small because the BI file is opened non-raw.

WHEN RUNNING WITH THE -i PARAMETER, IF THE LOAD FAILS, YOU MUST RESTORE FROM BACKUP!
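The tuning described above might be applied along these lines. This is a sketch under assumptions: tuned_load and PROUTIL are hypothetical names, the BI block and cluster sizes are set with proutil -C truncate bi (sizes in KB), and -i and -G 0 are assumed to be given directly on the load command line. Check the exact syntax for your release before relying on it, and take a backup first.

```shell
#!/bin/sh
# Sketch only. tuned_load and PROUTIL are hypothetical names.
# Assumption: -i and -G 0 are passed on the load command line.
tuned_load() {
    db=$1
    dumpfile=$2
    # 16K BI block size and 128K BI cluster size (both values are in KB).
    "${PROUTIL:-proutil}" "$db" -C truncate bi -biblocksize 16 -bi 128 || return 1
    # -i: no-integrity mode. If this load fails, restore from backup.
    "${PROUTIL:-proutil}" "$db" -C load "$dumpfile" -i -G 0
}

# Usage (not run here):
#   tuned_load dbname orderline.bd
```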

Prior to Version 9.1B, it was necessary to run idxbuild after performing a binary load. In Version 9.1B, the "build indexes" option was added to the binary load command.

Notes

References to Written Documentation:

Database Administration Guide & Reference: Binary Dumping & Loading

Progress Solution:

000020434, "Administration that can be carried out on the bi file"

Last Modified Date	9/13/2015 3:58 AM
