We've got a query performance issue that we're trying to work out. None of us are particularly expert at this sort of thing. It's hard to debug this in production because there's a lot of noise from other users. Running the queries in a development db helps, but buffer hits there are at 100%, which doesn't reflect the experience in live. But I'm hoping you can help me work out what is going on.
Customer and address are related to each other 1-1. Each address record is unique even if the address is shared. So to read 500 customers you'd expect to also read 500 unique addresses. All of the queries below use the primary index which is on the key.
If we do a FOR EACH on customer we get 4793 record reads but 13501 logical reads. It takes x amount of time.
If we do a FOR EACH on customer, FIRST address OF customer, we get 9566 record reads but 33277 logical reads. It takes about 2.5x as long as customers alone. In live it can take up to 10x as long - probably due to the reduced buffer hits.
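For reference, the two loops are shaped roughly like this (a simplified sketch - I've written the NO-LOCKs explicitly here):

    /* loop 1: customers only */
    FOR EACH customer NO-LOCK:
        /* ... */
    END.

    /* loop 2: customers joined to their address */
    FOR EACH customer NO-LOCK,
        FIRST address OF customer NO-LOCK:
        /* ... */
    END.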
My initial question is, why so many more logical reads compared with record reads? Is there anything else I can look at to help debug this?
Thanks for your help
I should add - Progress 11.2.1 on Windows 2003 Server (32 bit still!!!!).
Thanks Julian. My dev DB was restored from backup yesterday...
Will dumping and loading just the two tables in question be necessary? I don't fancy doing the whole 360GB DB over a USB2.0 connection ;)
Unless you use the new-fangled TABLE-SCAN option, every record read involves the logical reading of both a data block and (at least one) index block. So the expected ratio of logical reads to record reads is a bit more than 2:1.
(CAN-FIND() might, under some circumstances, result in index reads with no record reads.)
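For example (a sketch only - TABLE-SCAN needs OpenEdge 11.x and Type II storage areas, and the join field here is just an illustration):

    /* TABLE-SCAN: walks the data blocks directly, no index blocks */
    FOR EACH customer NO-LOCK TABLE-SCAN:

        /* CAN-FIND() can sometimes be satisfied from index blocks alone,
           i.e. index reads with no corresponding record read */
        IF NOT CAN-FIND(FIRST address WHERE address.addresskey EQ customer.addresskey) THEN
            MESSAGE "customer with no address?".
    END.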
3.5:1 is a bit high -- but not scary high. It may indicate fragmentation; DBANALYS can give you some insight there. It might also mean that your index isn't as good as you think it is. Or that you aren't using the index that you think you are using.
ProTop is very helpful here -- it will show you, at runtime, which tables and indexes are actually in use. The results sometimes surprise people.
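If you want a compile-time cross-check as well, COMPILE with XREF and look at the SEARCH lines - they show which index the compiler chose for each table (though not the runtime counts that ProTop gives you). Program and index names below are just examples:

    COMPILE custloop.p XREF custloop.xref.

    /* custloop.xref will then contain lines along the lines of:
         ... SEARCH customer custkey
         ... SEARCH address addrkey WHOLE-INDEX
       WHOLE-INDEX means there was no usable bracket at all. */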
Also that "FIRST address OF customer" probably isn't doing what you think it is doing.
Thanks. I'll give it a bash tomorrow.
I wouldn't be surprised as we've not performed much by way of admin on the DB side of late. Mainly because we don't have the disk space available to do so. It's a bit of a catch-22.
Dumping and loading will only make a difference if you are actually fragmented. Check dbanalys first.
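Something along these lines (substitute your own db name for "mydb"):

    proutil mydb -C dbanalys > dbanalys.out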
Thanks Tom - agree OF is bad. I just used it as a shorthand here. I know which indexes are used as I have a tool similar to protop. It's definitely using the primary index - primarykey of the table in each case.
OF is not the problem. FIRST is.
What is a "bad" amount of fragmentation? It's not been that long since I ran dbanalys and it's showing 1.2 for Customer. 1.0 for Address.
PRIMARYness isn't necessarily good. The quality of the bracket is what matters. How well do the WHERE clause and the indexes match? PRIMARY is a tie-breaker, not a goal.
If the WHERE clause and the indexes are not a 100% equality match then the expected ratio will be higher than 2:1, because the result set has to be filtered, which means excess reads.
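To make that concrete (table, field names and types invented purely for illustration):

    /* tight bracket: equality match on the indexed field, so roughly
       one index block plus one data block per record returned */
    FOR EACH address NO-LOCK WHERE address.addresskey EQ 42:
    END.

    /* loose bracket: only the range part can use the index; the city test
       is applied after each record is read, so excess records (and blocks)
       are read and then thrown away */
    FOR EACH address NO-LOCK WHERE address.addresskey GE 42 AND address.city EQ "London":
    END.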
For each customer,
first address where address.addresskey eq customer.addresskey.
The indexes are: addresskey on address, customerkey on customer. If that makes sense.
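If it helps rule out the FIRST issue, I can run something like this to confirm that every customer really does have exactly one matching address (a rough sketch):

    DEFINE VARIABLE cnt AS INTEGER NO-UNDO.

    FOR EACH customer NO-LOCK:
        cnt = 0.
        FOR EACH address NO-LOCK WHERE address.addresskey EQ customer.addresskey:
            cnt = cnt + 1.
        END.
        IF cnt NE 1 THEN
            MESSAGE "customer" customer.addresskey "has" cnt "matching addresses".
    END.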
Re: fragmentation. Don't trust the "factor". In some releases it is calculated incorrectly. Look at "number of records" vs "number of fragments". If fragments is 110% of records then 10% of records are fragmented (roughly). That's not enough to be behind reading 50% more blocks than you expect.
Factor looks reliable enough in this case based on the check you describe.