Buffer Hash Latch Factor (-hashLatchFactor)
Buffer Hash Table concurrency
This release provides the ability to adjust the number of latches that protect the Buffer Hash Table (BHT). Each BHT latch protects a portion of the table, restricting concurrent access to that portion. With more latches, the fraction of the table protected by a single latch is reduced. Prior to this release, the number of latches was fixed at 1024; starting in this release, you can specify the number of latches as a percentage of the number of entries in the hash table. A larger number of latches should reduce contention for random access to the buffer pool. A larger number of latches will not reduce contention created by multiple users attempting to access the same data.
Specify the number of latches with the -hashLatchFactor startup parameter on the primary broker. If not specified, the factor defaults to ten percent (10%) of the size of the BHT. By default, the BHT is sized at approximately one quarter (25%) of the value of -B. Alternatively, you can set the size of the BHT with the -hash startup parameter.
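As a rough illustration of the sizing rule described above, here is a minimal Python sketch. The function name and the rounding are assumptions made for illustration; the exact rounding the database applies (and any adjustment of the default table size) is not stated in this thread.

```python
def bht_latch_count(buffers, hash_entries=None, latch_factor_pct=10):
    """Estimate BHT size and latch count from startup parameters.

    Assumptions (not confirmed by the source): integer rounding, and a
    default table size of exactly -B / 4.
    """
    if hash_entries is None:
        hash_entries = buffers // 4          # default: about 25% of -B
    # The latch count is a percentage of the table size (-hashLatchFactor).
    latches = max(1, hash_entries * latch_factor_pct // 100)
    return hash_entries, latches


# -B 200000 with defaults for -hash and -hashLatchFactor
print(bht_latch_count(200000))               # (50000, 5000)
# -B 200000 -hash 1000 -hashLatchFactor 5
print(bht_latch_count(200000, 1000, 5))      # (1000, 50)
```

Under these assumptions, the defaults for -B 200000 give roughly 5000 latches, compared with the fixed 1024 of earlier releases.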
Has anybody already used this parameter in the readprobe test? What are the results? Is it a new performance breakthrough, as -lruskips was? What are the best values for the tests?
-hashLatchFactor 100 -hash = (-B) -lruskips 1000000 (or alternate buffer pool)?
We (Progress OpenEdge Development) did not run “readprobe” specifically. However, as part of our performance evaluation of this enhancement, we executed a very read-intensive performance test, monitoring both throughput and contention on the BHT latches, with pretty decent results.
I’ll contact the person who actually did the testing to provide a more in-depth explanation of what the test actually does as well as post the results here once they are put into a customer consumable format.
To add my personal 2 cents worth, I do not expect it to have as broad an effect as -lruskips, since its "value add" depends on the data access patterns of the application. Reasons for contention on the buffer pool hash table vary based on the specific data concurrently requested.
Thanks, Richard. Of course, when BHT latches are not a bottleneck then the -hashLatchFactor will not help. But sometimes the applications create a rather crazy read activity for tiny data.
So the follow up to that is:
In this new release, concurrent access to a single block (tiny data) will not be improved (as it was with -lruskips). An individual block is never covered by more than one buffer pool hash table latch. Increasing -hashLatchFactor will not change that fact. However, concurrent access to multiple different blocks is expected to be improved.
For example: this mechanism is not intended to improve performance of 100 users concurrently executing "repeat: find first customer" or, even worse, "repeat: current-value(seq1)" or "for each customer no-lock: end." when the customer table only contains one or two blocks.
Such data access requires additional work to improve throughput from the storage engine perspective.
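To make the single-block point concrete, here is a small Python sketch. The block-to-bucket and bucket-to-latch mappings are plain modulo operations chosen purely for illustration; they are assumptions, not the actual hash function used by the storage engine.

```python
def latches_touched(block_ids, hash_entries, latch_count):
    """Count the distinct latches a set of block requests would use.

    Hypothetical mapping: bucket = block % hash_entries, and
    latch = bucket % latch_count. The real mapping may differ.
    """
    return len({(b % hash_entries) % latch_count for b in block_ids})


hot = [42] * 100            # 100 users all reading the same block
spread = list(range(100))   # 100 users each reading a different block

for latch_count in (50, 100, 1000):
    print(latch_count,
          latches_touched(hot, 1000, latch_count),
          latches_touched(spread, 1000, latch_count))
```

In this model the hot-block workload always lands on exactly one latch, no matter how many latches exist, while the spread workload fans out over more latches as the count grows.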
These are our sample results for BHT latch (-hashLatchFactor) testing on the Linux64 platform with Progress version 11.7.3.
(Please see the attachment)
For the database setup:
1. Created an empty database and then loaded a .df file to create the schema for 2 tables: Customer and Order. Their data and indexes are located in Type II areas.
2. Created 1,050,000 records for table Order.
3. Backed up the database for testing.
For each test case, we started the database with
-B 200000 -spin 1000 -lruskips 100 -semsets 93 -hash <value1> -hashLatchFactor <value2>
We started the DB with 10 different cases, and we only changed one value for each case:
case 1: -hash 1000 -hashLatchFactor 5
case 2: -hash 1000 -hashLatchFactor 10
case 3: -hash 1000 -hashLatchFactor 100
case 4: -hash 10000 -hashLatchFactor 100
We preloaded the buffer pool and used the Customer table as a gate to ensure all clients started at the same point.
We launched 20 self-service clients simultaneously to query table Order in 2 ways:
- randomly: 20 clients read (NO-LOCK) randomly, each within its assigned range based on OrderNum:
client 1 can read randomly from OrderNum 100,000 - 149,999
client 2 can read randomly from OrderNum 200,000 - 249,999
- sequential access: 20 clients read (NO-LOCK) the same 100 records with OrderNum between 900,000 and 900,100
Each client performed 1 million reads. While the clients were reading, the test script took a promon snapshot every 3 seconds, opening the Latch Count page (R&D -> debghb -> 6 -> 11) and recording the "naps/sec" value for BHT. The average "naps/sec" was calculated for each case once that case was done.
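The averaging step can be sketched as follows in Python. The readings below are made-up placeholder values, not measurements from our tests, and the function name is hypothetical.

```python
from statistics import mean


def average_naps_per_sec(samples):
    """Average the per-snapshot "naps/sec" readings for one test case."""
    if not samples:
        raise ValueError("no snapshots recorded")
    return mean(samples)


# Hypothetical readings, one per 3-second promon snapshot (not real data).
readings = [120, 135, 128, 117]
print(average_naps_per_sec(readings))
```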
Please see the attached spreadsheet recording the "naps/sec" (the higher the number, the more contention).
For random access scenarios, with the same -B and -hash, we can see that increasing -hashLatchFactor helps to reduce the contention.
For sequential access scenarios, with the same -B and -hash, naps/sec stays similar as -hashLatchFactor increases, because the users are contending for the same blocks.
Besides -hashLatchFactor, the results show that increasing -hash also helps, because the number of latches derived from -hashLatchFactor depends on -hash.
To conclude, there are a lot of factors to consider when choosing the right value for -hashLatchFactor. It depends on:
- data allocation
- number of users accessing the BHT
- buffer pool size and BHT size
In general, users can monitor the promon screen and check the contention ("naps/sec") for the BHT to determine which value of -hashLatchFactor is right for their environment.
Are there any reasons why -hash is limited to -B / 4? The tests seem to say that the higher the -hash, the better the results (lower naps/sec). In my tests, the lock duration of a BHT latch came out at approximately 25 ns * -B / -hash. So I'd try -hash = -B.
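Plugging the -B 200000 used in the tests above into that estimate gives a feel for the numbers. This is only a Python sketch of my own empirical formula; the 25 ns constant comes from my tests and will differ on other hardware.

```python
def est_latch_hold_ns(buffers, hash_entries, ns_per_entry=25):
    """Estimated BHT latch hold time in nanoseconds.

    Uses the empirical estimate 25 ns * -B / -hash; the constant is
    hardware dependent and is an assumption, not a documented figure.
    """
    return ns_per_entry * buffers / hash_entries


for hash_entries in (1000, 10000, 50000, 200000):
    print(hash_entries, est_latch_hold_ns(200000, hash_entries))
```

With -hash 1000 the estimate is 5000 ns per hold, with the default of about -B / 4 it is 100 ns, and with -hash = -B it drops to 25 ns.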
-hash is not limited to -B / 4. That is only the default value; the user can set it below or above that.
A bigger -hash yields more latches (because -hashLatchFactor is <n> % of -hash),
so it will help to improve performance. You can increase either or both of the -hash and -hashLatchFactor values.
> case 1: -hash 1000 -hashLatchFactor 5
> case 2: -hash 1000 -hashLatchFactor 10
> case 3: -hash 1000 -hashLatchFactor 100
> case 4: -hash 10000 -hashLatchFactor 100
Isn't there a negative performance impact when -hash is not a prime number?
This test didn't focus on database tuning. It only verified the new startup parameter -hashLatchFactor.
So we kept the same scenario for the database, adjusted only the value of -hashLatchFactor, and monitored how it impacted performance. If you look at the attachment, we grouped the test cases into 3 sets:
-hash default (~ B / 4)
and we compared the "naps/sec" when changing -hashLatchFactor in each set to determine whether the new parameter was helpful.
The prime number suggestion is the result of industry research performed years ago. A prime table size is supposed to provide a better distribution of data across the hash table, resulting in shorter chains per bucket. These chains are searched sequentially while holding the BHT latch and can therefore have an adverse effect on performance when they get long. Think of it as a fast hash lookup followed by a slow sequential scan. It really all depends on the distribution of the data requested and where it hashes to in the table.
Btw, we enhanced the hash chain display as part of this work, making it easier to see the distribution.
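The effect of a prime table size on chain length can be seen in a toy Python example. It uses a plain modulo as the hash and keys with a regular stride of 8; both are assumptions for illustration and do not reflect the actual hash function or real block addresses.

```python
from collections import Counter


def max_chain_length(keys, table_size):
    """Longest bucket chain when keys are hashed with a simple modulo."""
    chains = Counter(k % table_size for k in keys)
    return max(chains.values())


# Hypothetical keys with a stride of 8, which shares a factor with 1000.
keys = [i * 8 for i in range(10000)]

print(max_chain_length(keys, 1000))  # non-prime size: 80
print(max_chain_length(keys, 997))   # prime size: 11
```

With the non-prime size, only 125 of the 1000 buckets are ever used, so each chain holds 80 entries. With the prime size 997, the same keys spread over every bucket and the longest chain is 11.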