I vaguely, and perhaps incorrectly, remember seeing an interesting alternative method of estimating the number of records in a table.
It was NOT any of the usual FOR EACH, SELECT *, OPEN QUERY, etc. approaches and I don't think it was "run tabanalys and parse the output (or use the csv)" either.
I'm thinking it had something to do with looking at index blocks. Maybe.
Does this ring any bells?
Sounds like the sort of thing [mention:ae2ea2f6412743fc8be36c522f414ef0:e9ed411860ed4f2ba0265705b8793d05] would be playing around with!
Since V11.6 viewB2 shows (instantly) the number of blocks allocated per object in type 2 storage areas.
It's also possible to quickly get the number of levels in an index tree. That can be used as a very rough estimate of a table's size.
yes. idxblockreport. there's slib/utils/fast-rec-cnt.p in the standard libraries project at oehive.org
that uses proutil idxblockreport to calculate the number of records. it's not exact but it's pretty close.
i timed it years ago at roughly 500,000,000 records per second.
<code>
define variable i as integer no-undo.

/* returns the estimated record count in i */
run slib/utils/fast-rec-cnt.p ( "<physical dbname>", "table", output i ).

message i.
</code>
idxblockreport is unsupported and has some issues.
As an alternative we can use idxfix, option 2 ("Scan indexes"), without recid validation. It will be as fast as idxblockreport and reports the key count per index:
Index 15 (PUB.Customer, Name): 83 keys.
thank you George! i'll have a look at updating the code to use idxfix.
[quote user="George Potemkin"]
Since V11.6 viewB2 shows (instantly) the number of blocks allocated per object in type 2 storage areas.
[/quote]
That does make it tempting to put everything in -B2 ;)
It would be great to be able to get that same info for B1.
[quote]
Also it's possible to get quickly the number of levels in an index tree. It can be used as a very rough estimation of table's size.
[/quote]
> It would be great to be able to get that same info for B1.
viewB2 reports the objects in B1 as well as in B2.
Example:
<code>
proutil sports -C viewB2

Area "Customer/Order Area":8 - Primary

Buffer Pool
Object Enablement     Size    Type   Object Name
----------------- -------- ------- ------------
Default                  7 Master  Area.Control-Object:0
Default                  8 Table   PUB.Customer:2
Default                 16 Table   PUB.Order:4
Default                 32 Table   PUB.Order-Line:5
Default                  8 Index   PUB.Customer.Sales-Rep:16
Default                  8 Index   PUB.Order.Cust-Order:21
                  --------
                        79
</code>
<code>
prostrct statistics sports

Database Block Usage for Area: Customer/Order Area
   Active blocks: 79
   ...
   Records/Block: 32
   Cluster size: 8
</code>
> Once you have the number of levels how do you calculate the estimated table size?
Number of records ~ 1000 ^ levels
The number of levels in the index tree (1, 2, 3, 4, 5, 6) is a kind of size gradation for the table: "very small", "small", "medium", "large", "huge".
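That rule of thumb can be sketched in a few lines. This is a hedged illustration only: the ~1000-keys-per-level fan-out and the gradation labels come from the posts above, not from any exact OpenEdge behavior.

```python
# Rough table-size estimate from the number of B-tree levels in an index,
# per the ~1000 ^ levels rule of thumb. Order-of-magnitude only.

def estimate_records(levels: int) -> int:
    """Approximate record count assuming ~1000 keys of fan-out per level."""
    return 1000 ** levels

def size_label(levels: int) -> str:
    """Map level count to the informal gradation used above (illustrative)."""
    labels = {1: "very small", 2: "small", 3: "medium", 4: "large", 5: "huge"}
    return labels.get(levels, "huge")

for levels in range(1, 6):
    print(levels, estimate_records(levels), size_label(levels))
```

So a 3-level index suggests a table on the order of a billion keys, which is why the level count alone is enough to sort tables into rough size buckets.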
Here's another approach, using the cardinality stats. It would be accurate once you have run "UPDATE TABLE STATISTICS" for each table. It is lightly tested on sports, giving the same totals as tabanalys and ABL. Caveat: it would have to be rewritten to work on a partitioned DB.
<code>
define temp-table table-stat no-undo
    field file-number as integer  format "->>>>9"          label "Tbl#"
    field file-name   as character format "x(32)"          label "Table name"
    field rec-count   as int64    format ">>>,>>>,>>>,>>>" label "Record count"
    field stats-dt    as datetime                          label "Update Stats date"
    index file-number is unique file-number
    index file-name file-name
    .

for each dictdb._file no-lock
    where _file._file-number > 0 and _file._file-number < 32768,
    each dictdb._systblstat no-lock
    where _systblstat._tblid = _file._file-number:

    find table-stat where table-stat.file-number = _file._file-number no-error.
    if not available( table-stat ) then
    do:
        create table-stat.
        assign
            table-stat.file-number = _file._file-number
            table-stat.file-name   = _file._file-name
            .
    end.

    case _systblstat._property:
        when 2 then table-stat.stats-dt  = _systblstat._val_ts.
        when 4 then table-stat.rec-count = _systblstat._value.
    end case.

end.

for each table-stat no-lock use-index file-name:
    display table-stat.
end.
</code>
Rob, it works very well on a 1.5 TB database, giving results matching tabanalys and ABL.
I would sort the temp-table by record count descending instead of by file-name. With 2000 tables that would be a bit more useful.
<code>
define temp-table table-stat no-undo
.....
    index rec-count rec-count desc.
.......
for each table-stat no-lock use-index rec-count:
</code>
this question of how many rows there are in a table seems to come up fairly often.
if you are one of those who care, could you please tell me /why/ you care. why is it important for you to know this number? and why can't you get it from a table analysis report from last week or last month?
-gus
Gus,
A couple of examples come to mind, there are probably others.
- D&L: after the fact, I want to compare record counts in source and target. I use tabanalys for this but it can be time-consuming on large DBs. And compared to others here, my "large" DBs are quite small.
- looking at CRUD stats. I was doing this today. Looked at a test DB and found a huge number of reads on a table. I tracked down the program and found an unbracketed query. I wanted to know whether, given the number of times the program had been run, this would account for the reads I saw. So I wanted to know how many records were in the table, to see the cost of each table scan. In cases like this an approximate count is good enough.
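That sanity check is simple arithmetic. As a sketch (every name and number below is hypothetical, purely for illustration):

```python
# Do the observed CRUD-stat reads match "one full table scan per run"?
# All figures here are made up for illustration only.
record_count = 2_500_000      # approximate records in the table
program_runs = 40             # times the program with the unbracketed query ran
observed_reads = 101_000_000  # record reads seen in the CRUD stats

# An unbracketed query forces a full scan, reading every record once per run.
expected_reads = program_runs * record_count
ratio = observed_reads / expected_reads

print(expected_reads, round(ratio, 2))  # a ratio near 1.0 supports the theory
```

If the ratio is far from 1.0, either the record count estimate is off or something other than the suspected program is generating the reads.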
One of the reasons I need to know the approximate record count is to estimate how long adding an index to the table will take. Of course tabanalys will work too. I run dbanalys every 5 days, parse it, and publish it in HTML on our internal web portal, so every developer can look at it before coming to me for a schema change. It is helpful.
Here Rob presented another way. As we learned in math, two solutions are better than one.
1) CRUD stats
2) Before I suggest to a customer how to check for corruption in their database, I should understand how large the objects in the database are.
I can, of course, ask them to send me a recent dbanalys, but I prefer that my script quickly gather all the information I may need. That is why viewB2 is a solution for me. Though I would like it if viewB2 (or another tool) also reported the length of each object's chains.
Hi Rob,
concerning the record counts during the D&L:
When you perform a binary dump & load, record counts are given. It is quite easy to compare the dump and load record counts by scraping the logfiles. I have some simple *nix shell scripts to handle this. They compare the number of records dumped against the number of records loaded (don't load twice ;-)). If you like, I can send them to you.
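A minimal sketch of that comparison, in Python rather than shell for illustration. The log message wording below is an assumption, not the exact text proutil writes; adjust the patterns to match the output of your OpenEdge version:

```python
import re

# Compare per-table record counts scraped from dump and load log text.
# These patterns are ASSUMED message formats -- adjust them to whatever
# your proutil dump/load actually writes to its logs.
DUMP_RE = re.compile(r"(\w+) has been dumped, (\d+) records")
LOAD_RE = re.compile(r"(\w+) has been loaded, (\d+) records")

def counts(text, pattern):
    """Map table name -> record count scraped from log text."""
    return {m.group(1): int(m.group(2)) for m in pattern.finditer(text)}

def compare(dump_log, load_log):
    """Return {table: (dumped, loaded)} for tables whose counts differ."""
    dumped = counts(dump_log, DUMP_RE)
    loaded = counts(load_log, LOAD_RE)
    return {t: (dumped.get(t, 0), loaded.get(t, 0))
            for t in set(dumped) | set(loaded)
            if dumped.get(t, 0) != loaded.get(t, 0)}

dump_log = "customer has been dumped, 83 records\norder has been dumped, 207 records\n"
load_log = "customer has been loaded, 83 records\norder has been loaded, 206 records\n"
print(compare(dump_log, load_log))  # only tables whose counts differ
```

An empty result means every table loaded the same number of records it dumped; anything else is worth investigating before going live.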
Simon
> When you perform a binary dump & load, record counts are given.
It's the number of indexed records, not the number of all records that exist in the table.
Thanks George, indeed, you are fully right. ;-)