High create / Delete tables in own extent (Type 1)

75%+ of all CREATE activity occurs in just 3 tables on our 9.1 DB (i.e. Type 1 extents).  These tables constitute about 38% of the total DB size.

Would it be worth having these in their own extent to reduce scatter?  My thinking here is that all this delete activity is peppering the DB with gaps like Swiss cheese, leading to high scatter over time.

All Replies
  • 20 is a "rule of thumb" that basically boils down the various block and record overhead into a single number.  It is not terribly scientific but, in my experience, it works well.  As George points out -- there isn't much value in calculating the value out to N decimal places when you're just going to round up to a power of 2.  There are kbases, white papers and spreadsheets floating around that will tell you how to do that if you feel really compelled.  But IMHO there is simply no value in doing so.
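
    Roughly, as a quick Python sketch (the flat 20 bytes of overhead and the round-up to a power of 2 are the approximations described above; the 90-byte average record size is just an example):

        # rule-of-thumb RPB: block size / (mean record size + ~20 bytes of
        # block and record overhead), rounded up to a power of 2, capped at 256
        def suggested_rpb(mean_record_size, block_size=8192, overhead=20):
            raw = block_size / (mean_record_size + overhead)
            rpb = 1
            while rpb < raw and rpb < 256:
                rpb *= 2
            return rpb

        print(suggested_rpb(90))   # 8192 / 110 is about 74, so round up to 128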

    --
    Tom Bascom
    tom@wss.com

  • > On May 4, 2016, at 3:02 AM, Grant Holman wrote:

    >

    > I've run proutil tabanalys to get the avg row size and then used this to split the tables into 256,128,64,32,8,4,2,1 RPB.

    This is not necessary.

    Versions 10.1B (?) and later have 64-bit rowids (which you should enable if they are not already).

    This means that you do not ever have to worry about running out of rowids. The roughly 2.1 billion rowid (and record fragment) limit is gone. You do not have to keep the maximum-rows-per-block low to avoid wasting rowids. Forget about that.

    You only have to be sure that the maximum rows per block is set high enough that you don't waste too much space. For example:

    With maximum-rows-per-block set to 64, an 8K block with 64 records has room for about 6600 bytes of data. If your records average 50 bytes long (not common), then 64 records add up to only 3200 bytes. So about half the block would be wasted (though it could be partly used if the rows in it expand during updates).
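
    Roughly, in Python (the ~6600-byte usable figure is the approximation above; the 120-byte case is just a second example for contrast):

        # fraction of an 8K block's usable data space actually filled when a
        # block holds max-rows-per-block records of a given average size
        USABLE = 6600                     # approx. usable data space in an 8K block

        def fill_fraction(max_rpb, avg_rec_size):
            return min(max_rpb * avg_rec_size, USABLE) / USABLE

        print(fill_fraction(64, 50))      # ~0.48 -> about half the block wasted
        print(fill_fraction(64, 120))     # 1.0   -> 120-byte records fill the block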

  • Hope this is OK, George:

    Table  Create  Update  Delete
    A      326K    12.2M!  465K
    B      335K    417K    333K
    C      176K    196K    177K

     

    I can't see 'RM chain' in the dbanalys output.  Should I be running something else?

  • Thanks Tom.

  • Hi Gus.  Are you saying creating extents based on RPBs is a pointless exercise?

    We are running on Windows with a 4K O/S block size (I think Windows actually calls this cluster size?) and a 4K DB block size (it's my understanding this is the most appropriate for Windows, please correct me if 8K would be better!).  BI is 8K though.

    The point of the exercise is not to address 32-bit rowid limits - our DBs only have 10s of millions of records in total, so 2.1 billion per extent is more than enough!  However we have some tables with very small records (e.g. 30 bytes) and some larger (e.g. 3,500 bytes) - it seemed to me that putting these in appropriately RPB-sized extents would make sense.  Am I wasting my time?  Should I just bung everything into 32 RPB extents and not worry about it?

  • > Table  Create  Update  Delete
    > A      326K    12.2M!  465K

    More deletes than creates?! The number of records in the table is decreasing. Not to zero? ;-)

    > B      335K    417K    333K
    > C      176K    196K    177K

    It might be that most of the updates happen right after the creates.

    For example:

    do transaction:

     create table.

     assign table.key_fields.

     assign table.other_fields.

    end.

    If this guess is correct then you need to choose the RPB for these two tables carefully (or to set the create limit, if that option is available in your Progress version).
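
    For what it's worth, here is a toy Python model of the create limit as I understand it (new records stop going into a block once another create would leave less than the create limit free, so that space stays available for existing rows to expand); the record size and limits below are made up:

        # toy model: creates per block when records start small and later expand
        BLOCK_FREE   = 8192 - 64 - 12   # 8K block minus block/record headers
        PER_ROW_COST = 2 + 17           # row directory slot + per-row overhead

        def creates_per_block(initial_rec_size, create_limit):
            free, n = BLOCK_FREE, 0
            while free - (initial_rec_size + PER_ROW_COST) >= create_limit:
                free -= initial_rec_size + PER_ROW_COST
                n += 1
            return n, free              # records created, space left for growth

        print(creates_per_block(60, 300))   # (98, 374): ~370 bytes kept for expansion
        print(creates_per_block(60, 0))     # (102, 58): block packed full at create time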

    > I can't see 'RM chain' in the dbanalys output.  Should I be running something else?

    Did you run dbanalys or tabanalys?

  • Creating Storage Areas (extents are the individual files that make up a storage area) with finely tuned rows per block has very few benefits these days.

    Personally I still do it.  1) It's a habit, 2) it's not that hard so long as you don't obsess over the decimal point, and 3) I have certain tools that I sometimes run which use Monte Carlo sampling of RECIDs.  They run a lot faster if there aren't a lot of missing recids ;)

    If you are going to take the "one size fits all" approach then you should probably choose 128 or 256.  Some people argue that 256 is pointless because nobody has records that small.  I happen to know of some tables whose records are indeed that small.  So I go with 256 when I am doing that.

    In my experience 8K blocks will give somewhat better read performance.  Even on Windows.  But it is not usually enough to argue about.

    --
    Tom Bascom
    tom@wss.com

  • > On May 4, 2016, at 1:26 PM, ChUIMonster wrote:

    >

    > If you are going to take the "one size fits all" approach then you should probably choose 128 or 256. Some people argue that 256 is pointless because nobody has records that small. I happen to know of some tables whose records are indeed that small. So I go with 256 when I am doing that.

    They have to be /very/ small.

    Here's the scoop on 256 rows per block:

    Block size                       8192 bytes
    Data block header                  64 bytes
    Record block extra header          12 bytes
    Row directory for 256 records     512 bytes
    Per-row overhead (17 x 256)      4352 bytes
    Create limit                      300 bytes
    Total overhead                   5240 bytes
    Space left for records           2952 bytes

    2952 / 256 = about 11.5 bytes for each record if they are all the same size.
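
    The same arithmetic as a quick Python sketch (block and overhead figures exactly as above):

        # overhead for an 8K block completely filled at 256 rows per block
        BLOCK, BLOCK_HDR, REC_HDR, CREATE_LIMIT = 8192, 64, 12, 300
        ROW_DIR_SLOT, ROW_OVERHEAD = 2, 17

        rows = 256
        overhead = BLOCK_HDR + REC_HDR + CREATE_LIMIT + rows * (ROW_DIR_SLOT + ROW_OVERHEAD)
        space = BLOCK - overhead
        print(overhead, space, round(space / rows, 1))   # 5240 2952 11.5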

    HOWEVER

    *******

    with 256 rows per block, if you put just one record in the block (say 7,000 bytes) it would look like this:

    Block size                       8192 bytes
    Data block header                  64 bytes
    Record block extra header          12 bytes
    Row directory for 1 record          2 bytes
    Per-row overhead (17 x 1)          17 bytes
    Create limit                      300 bytes
    Total overhead                    395 bytes
    Space left for records           7797 bytes
    One 7,000-byte record            7000 bytes
    Available space                   797 bytes

    My point here is this: you do NOT have to sort all your tables into buckets of 1, 2, 4, 8, 16, 32, 64, 128, and 256 rows per block.

    You simply have to make sure that the maximum rows per block is large enough. 7,000 byte records can be stored just fine in areas that are NOT 1 row per block or 2 rows per block. You lose nothing by putting 7,000 byte records into a block that can handle 64 or 128 rows.

    In Type 2 areas (which you should use exclusively) rows from different tables are not mixed.

    Stop doing all the extra pointless analysis you have been browbeaten into doing. The times when it is worthwhile are rare.

  • George - I ran dbanalys.  This is on 9.1D07

  • George - interested to see that multiple assign statements would increase the update count in _tablestat even if within a single transaction; I didn't realise this.

  • Gus - thanks for this detailed reply, I'll need to study it first to make sure I get the finer detail!  N.B. I am using 9.1D so only have Type 1 extents at my disposal.

  • > I ran dbanalys

    The output's header should report the line:

    Options: chanalys ixanalys tabanalys

    The "chanalys" option means that the output should have the "* CHAIN ANALYSIS" segments (msg ## 3873 & 3892). These messages exist at least since Progress V6.

    > interested to see that multiple assign statements would increase the update count in _tablestat even if within a single transaction

    Most likely the first assign statement creates a record just a bit larger than the template record, which in turn can be close to the minimum record size reported by tabanalys. The second assign statement changes the record's size to its final value, let's say to the average record size reported by tabanalys. So you can estimate the size of the record's expansion as AverageRecSize - MinRecSize. Only a record that uses the last slot in a block has a chance to become fragmented. If that is your case then the (Fragments - Records) / Records ratio should be close to 1 / RealRPB, where RealRPB ~ BlockSize / AverageRecSize. This approach will only work for those tables where the _Table-update / _Table-create ratio is approximately 1:1.
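
    A quick Python sketch of that estimate (the Records, Fragments and record sizes below are made-up numbers; substitute the figures from your own tabanalys output):

        # estimate record expansion and expected fragmentation from tabanalys figures
        block_size   = 8192
        avg_rec_size = 250.0        # average record size (hypothetical)
        min_rec_size = 180.0        # minimum record size (hypothetical)
        records      = 1_000_000    # record count (hypothetical)
        fragments    = 1_030_000    # fragment count (hypothetical)

        expansion      = avg_rec_size - min_rec_size       # growth after the create
        real_rpb       = block_size / avg_rec_size         # rows that really fit per block
        observed_ratio = (fragments - records) / records   # fraction of fragmented records
        expected_ratio = 1 / real_rpb                      # ~1 fragmented record per block

        print(expansion, round(observed_ratio, 3), round(expected_ratio, 3))
        # 70.0 0.03 0.031 -> observed fragmentation matches the 1 / RealRPB estimate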

  • Hi Tom - taking everyone's input into account I think I'll test both 4K block / 64 RPB and 8K block / 128 RPB DB versions.  I'll put the indexes in their own extent with 32 RPB (to cover me in case a table gets put in there by mistake).  Thanks for all your input.

  • More:

    Suppose you have maximum-rows-per-block at 256 and 8k data blocks.

    Now you put a single 100 byte record into a block. It will look like this:

    Block size                       8192 bytes
    Data block header                  64 bytes
    Record block extra header          12 bytes
    Row directory for 1 record          2 bytes
    Per-row overhead (17 x 1)          17 bytes
    Create limit                      300 bytes
    Total overhead                    395 bytes
    Space left for records           7797 bytes
    One 100-byte record               100 bytes
    Free space                       7697 bytes

    Note that the row directory is only /one/ slot at this point.

    If we put another 100 byte record into the same block, we expand the row directory by one more entry (2 bytes) and we add another 17 bytes of overhead for the second record. So the second record uses 119 bytes of space (100 + 2 + 17). We end up with this:

    Block size                       8192 bytes
    Data block header                  64 bytes
    Record block extra header          12 bytes
    Row directory for 2 records         4 bytes
    Per-row overhead (17 x 2)          34 bytes
    Create limit                      300 bytes
    Total overhead                    414 bytes
    Space left for records           7778 bytes
    Two 100-byte records              200 bytes
    Free space                       7578 bytes
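
    Generalising the two tables above as a Python sketch (same block and overhead figures; the record count and size are parameters):

        # free space in an 8K block after storing n records of a given size.
        # the row directory and the 17-byte per-row overhead only grow as
        # records are actually stored, whatever maximum-rows-per-block is set to.
        BLOCK, BLOCK_HDR, REC_HDR, CREATE_LIMIT = 8192, 64, 12, 300
        ROW_DIR_SLOT, ROW_OVERHEAD = 2, 17

        def free_space(n_records, record_size):
            used = BLOCK_HDR + REC_HDR + CREATE_LIMIT
            used += n_records * (ROW_DIR_SLOT + ROW_OVERHEAD + record_size)
            return BLOCK - used

        print(free_space(1, 100))   # 7697 bytes free, matching the first table above
        print(free_space(2, 100))   # 7578 bytes free, matching the second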

  • Gus - are your figures (17 byte overhead per row, 64 byte block header, 2 bytes per row etc.) specifically for Type 2 or do they apply to Type 1 as well?