Current-value() statement and the latches

Posted by George Potemkin on 15-Apr-2016 03:14

In Progress V10.2B the CURRENT-VALUE(sequence) statement uses the following latches:
SEQ, BHT, LRU and BUF (twice)

In V11.6 the same statement additionally uses the TXQ latch (twice).

NEXT-VALUE() has not changed between the versions. It uses the following latches:
MTX, BIB, 2*SEQ, 2*TXQ, BHT, LRU, 2*BUF

In other words, in both versions it uses the same latches as CURRENT-VALUE() in V11.6 plus (as expected) MTX, BIB and one extra SEQ lock.
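
For reference, a minimal sketch of the two calls that produce the latch locks listed above (assuming a hypothetical sequence named mySeq in the connected database; the per-latch counters can be watched on the promon latch counts screen or via the _Latch VST):

DEFINE VARIABLE curVal  AS INT64 NO-UNDO.
DEFINE VARIABLE nextVal AS INT64 NO-UNDO.

/* Read-only access to the sequence block:
   V10.2B: SEQ, BHT, LRU, 2*BUF; V11.6: the same plus 2*TXQ. */
curVal = CURRENT-VALUE(mySeq).

/* Increments the sequence, so it also needs the update-related latches:
   MTX, BIB, 2*SEQ, 2*TXQ, BHT, LRU, 2*BUF in both versions. */
nextVal = NEXT-VALUE(mySeq).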

Why is there a difference between the versions regarding the CURRENT-VALUE() statement?
New behavior? Bug in V10.2B? Or in V11.6?

Just curious (as usual ;-)

George Potemkin

All Replies

Posted by Dapeng Wu on 15-Apr-2016 07:22

It's a bug in 10.2B. The TXQ latch is required for CURRENT-VALUE() to establish proper latch ordering inside the database.
Dapeng
 

Posted by George Potemkin on 15-Apr-2016 07:46

Thanks, Dapeng!

BTW, in the Sequence Readprobe tests under V11.6 it turned out that the bottleneck is the TXQ latch rather than the BUF latch I had expected. The number of their locks is, of course, the same, but the TXQ latch had up to 250 naps/sec while the BUF latch had only 100-150 naps/sec. The BUF latch is an aggregator for the multiplexed latches, but during the tests all sessions read just one block - the sequence block - and it is not on the LRU chain (the -lruskips will not help in this case). It seems the lock duration of the TXQ latch is (a bit?) longer than that of the BUF sub-latches (latObjLatchLock), which would explain the difference despite the equal lock counts.
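
For those who have not seen the readprobe kit: each test session just hammers the same sequence with CURRENT-VALUE() reads for a fixed interval, roughly along these lines (a simplified sketch, not the actual test code; mySeq and the 10-second interval are just for illustration):

DEFINE VARIABLE vStart AS INTEGER NO-UNDO.
DEFINE VARIABLE vReads AS INT64   NO-UNDO.
DEFINE VARIABLE vValue AS INT64   NO-UNDO.

vStart = ETIME(YES).               /* reset the millisecond timer      */
DO WHILE ETIME < 10000:            /* run for roughly 10 seconds       */
    vValue = CURRENT-VALUE(mySeq). /* one read = one set of latch locks */
    vReads = vReads + 1.
END.

MESSAGE vReads "CURRENT-VALUE() reads in" ETIME "ms".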

Test environment: Old Solaris box
2 physical processors (sparcv9 1165 MHz), 112 virtual processors.
Memory size: 32544 Megabytes
Progress version: 11.6.1

Posted by George Potemkin on 17-Apr-2016 08:23

> In V11.6 the same statement additionally uses the TXQ latch (twice).

And it gives us an opportunity to test the contribution of the TXQ latch to performance.
Due to the TXQ latch, each call of the current-value() function creates 7 latch locks in V11.6 compared to 5 latch locks in V10.2B: 40% (= 2/5) more latch locks.

Comparing the results of the sequence readprobe tests, V10.2B vs V11.6 (how much higher the SEQ reads/sec are in V10.2B):

Users   Increase of SEQ reads/sec
  1     22.6%
 11     38.1%
 50     20.6%

11 users is the user count that provided the best result in the tests, IOW, when the busiest latch is almost 100% busy. In this case the change in performance (38.1%) closely matches the change in the number of latch locks (40%).

Just for the record: the tests were run with -spin 10,000 and -lruskips 100.
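
Both are database broker startup parameters; a made-up example of the kind of startup line used (the database name and the -n value are just placeholders):

proserve readprobedb -n 60 -spin 10000 -lruskips 100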


The current-value() function also locks the LRU latch (though the sequence block is not on the LRU chain).
LRU is just 1 of the 7 (~14%) latch locks requested by the function in V11.6.
We can cut the number of these locks by a factor of the -lruskips value.

V11.6, -lruskips 0 vs 100 (the increase of SEQ reads/sec with -lruskips 100):

Users   Increase of SEQ reads/sec
  1     5.3%
 11     6.9%
 50     7.8%

Eliminating the LRU latch locks does increase performance, but LRU is obviously not the busiest latch in the sequence readprobe tests.

Also, the SEQ reads at the best user count turned out to be 7-8 times higher than the reads generated when only one session was running. This factor seems to be related mainly to the number of latches used during the tests.

11.6.1 spin.10000.skips.0

Users: 1, SEQ reads: 55,053
Users: 12, SEQ reads: 388,358, Increase by 7x, Latch timeouts: 7
Users: 50, SEQ reads: 365,103, Decrease by 6%, Latch timeouts: 837

11.6.1 spin.10000.skips.100

Users: 1, SEQ reads: 57,993
Users: 11, SEQ reads: 415,406, Increase by 7x, Latch timeouts: 0
Users: 50, SEQ reads: 393,656, Decrease by 6%, Latch timeouts: 745

10.2B08 spin.10000.skips.100

Users: 1, SEQ reads: 71,112
Users: 10, SEQ reads: 573,848, Increase by 8x, Latch timeouts: 0
Users: 50, SEQ reads: 474,686, Decrease by 21%, Latch timeouts: 803

All numbers are per second.
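
The latch lock and timeout (nap) counters come from the database latch statistics; a rough sketch of dumping them via the _Latch VST (field names quoted from memory - please check them against the VST schema of your version; promon shows the same counters on its latch screens):

FOR EACH dictdb._Latch NO-LOCK:
    DISPLAY _Latch-Name   /* latch name, e.g. SEQ, TXQ, BUF          */
            _Latch-Lock   /* latch lock requests (assumed meaning)   */
            _Latch-Wait   /* naps, i.e. waits (assumed meaning)      */
        WITH WIDTH 78.
END.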

One more observation:
The user count just above the best one seems to always cause a lot of naps for the busiest latch (a jump from 0 to 200-300 naps per second). Probably it's because -spin 10,000 is not the best value for those latches.

Conclusion: The latches are real! We can feel them! ;-)
