A hundred million rows?

Posted by dbeavon on 12-Jan-2020 22:50

I'm having trouble with a table that has a total of about a hundred million rows.  Is that the limit of what the DBMS is capable of? 

For example, adding new indexes is extremely painful.  On top of everything else, I think I've run into a new bug where the OE database refuses to create an active, unique index via a DDL statement.  The statement runs for about 20 mins and then the disk rate slowly drops off to zero, but the CPU remains at 100% indefinitely.  I knew that adding the active index via DDL was going to take a lot longer than using proutil idxbuild...  but I didn't imagine that it would run forever!
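For reference, the statement was of this general shape (table, column, and index names changed to made-up ones here), issued against the SQL engine, e.g. via sqlexp or JDBC:

   CREATE UNIQUE INDEX order_line_custom_ix
       ON PUB.order_line (company, order_num, line_num);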

In SQL Server I can do a hundred million rows (per year) pretty easily, especially if I give it a clustered index of my own choosing (usually some type of custom surrogate key).

We are using the enterprise edition of the OE RDBMS, but haven't purchased the "table partitioning" feature yet.  I suppose that is the next step forward, right?  Can someone tell me how my table compares to their biggest table (in a single database *without* partitioning)?  I'd also like to know how you would suggest adding an index to it, if you were allowed a large outage window, e.g. 10 hours or so.

All Replies

Posted by ChUIMonster on 13-Jan-2020 16:15

No such limit.

I just took a quick look at a db that I manage which is easy for me to take a peek at.  The largest table in terms of number of rows is: 468,323,164.  There are others, more of a pain to log in to and look at, that are probably quite a bit larger.

There are indirect limits on number of rows related to storage area type, block size, and storage area rows per block.  But unless you have a type 1 storage area it is extremely unlikely that you need to worry about that. Those hypothetical limits are huge.  If you do have data in type 1 areas then that is a serious problem that should have been fixed a long time ago and which should be a very high priority for your DBA team.

When you say "via a DDL statement" are you implying that you are using SQL to add the index?

If I had a 10 hour outage window to work with I would add the index inactive and do a simple offline proutil idxbuild.  I wouldn't expect it to take more than an hour if I had decent modern hardware and a reasonably up to date release of Progress.
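In other words, something along these lines (the tuning values below are purely illustrative; adjust for your hardware and release, or name the table explicitly instead of using inactiveindexes):

   proutil mydb -C idxbuild inactiveindexes -thread 1 -threadnum 8 -TB 64 -TM 32 -B 1024 -SG 64 -T /fast/scratch

Add the index as inactive first (dictionary or SQL), shut the database down, run the idxbuild, and bring it back up.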

Posted by George Potemkin on 13-Jan-2020 16:47

> Can someone tell me how my table compares to their biggest table (in a single database *without* partitioning).

                                                           -Record Size (B)-           ---Fragments--- Scatter
Table                                    Records    Size   Min   Max  Mean                Count Factor  Factor
PUB.table                            24987814259    3.4T    44 32017   151          24988196147    7.1     1.0

Posted by gus bjorklund on 13-Jan-2020 16:56

WRT Type II data areas, Tom said:

> On Jan 13, 2020, at 11:17 AM, ChUIMonster wrote:

> Those hypothetical limits are huge.

That's what I said in the V6 days. And I remember when the first customer hit the 2 GB /database/ size limit.

Posted by Thomas Mercer-Hursh on 13-Jan-2020 17:03

When I read the original post, I thought to myself ... just wait until George responds, then the hammer will fall! :)

Posted by ChUIMonster on 13-Jan-2020 19:19

Forgot to add - I don't see a reason to bring table partitioning into the picture.  It *might* make sense but I wouldn't do that merely because of 100 million rows.  I would want to understand the usage of that data and how that usage might be improved via partitioning before I pursued something like that.

Posted by gus bjorklund on 13-Jan-2020 19:22

you win, george :)

Posted by gus bjorklund on 13-Jan-2020 19:32

Indeed. Table partitioning can be quite useful, especially for data that is historical or time-related, where older data is write-locked and purged when the retention period has expired. But for sheer table size it may be counter-productive, depending on the queries used and the types of updates performed.

Posted by dbeavon on 13-Jan-2020 20:04

>> When you say "via a DDL statement" are you implying that you are using SQL to add the index?

Yes.  For the majority of our database schema it is possible to maintain it via SQL statements.  But it seems as if there is a limit that you eventually reach based on the size of data, where you have to switch modes and start using proutil commands as part of the workflow.  It feels like "cheating"... mainly since proutil doesn't feel like a regular part of the DBMS, but more like a back-door tool for editing the internal files *without* the help of a DBMS layer.  (Perhaps it would at least feel less like "cheating" if this stuff could be accomplished with the standard data dictionary user interface).

>> If I had a 10 hour outage window to work with I would add the index inactive and do a simple offline proutil idxbuild.

Yes, I think I'll have to start factoring in the need for proutil whenever working with our larger OE tables.  I guess the SQL/DDL interface isn't cut out for that.

>> PUB.table                            24,987,814,259

Wow, I had to put the commas in there to see that this is a table roughly 250 times bigger than the one I'm dealing with.

I'm betting that is a table whose schema is rarely changed.  Maybe in its own database?  Full backup only once a year?  What happens if a user kicks off the wrong query?  It hurts to think about...

Posted by Etienne Begin on 13-Jan-2020 20:22

Dr Evil wins with one hundred billion rows!

Posted by dbeavon on 15-Jan-2020 00:25

I reported this case to Progress.  My table is under 100 million rows but it is still quite difficult to manage.  If Progress doesn't intend for large tables to be managed via SQL statements then hopefully that can be documented in the KB.  I've never heard anything to that effect.  

>> If I had a 10 hour outage window to work with I would add the index inactive and do a simple offline proutil idxbuild

Would it change the answer if you had to break replication and re-enable it again (and your database was in the ballpark of 1 TB)?  This becomes a difficult compromise because the idxbuild runs so fast, but having to re-enable replication is a pain (although that can be done outside of the outage window).

>> PUB.table    24,987,814,259

I would still love to hear more about this table!  I believe you ... but it seems very hard to envision how we could ever manage a table like this ourselves.  And I want to get the whole story in case I ever need to retell it to someone.  

The main thing I'm wondering about is how many years of data this includes, and whether you keep a window/range of dates or if the data grows indefinitely.  I'm also wondering if you would consider simply archiving or deleting much of this data in the event of a schema change (like a new index, or a new column of data that was derived from other pre-existing columns).  

Posted by ChUIMonster on 15-Jan-2020 00:50

If replication is involved then "it depends".  The db that my 468 million row example comes from is almost exactly 1TB.  I can back that up in about 90 minutes and restore it in a bit less than an hour (which is the bulk of the effort required to re-initialize replication).  So that still fits pretty easily into a 10 hour outrage window.

As it happens I do not generally have 10 hours to work with.  So in reality I would probably add it inactive, leave replication running, and build it online with a combination of idxfix and idxactivate.  See this: pugchallenge.org/.../318_IndexMaintenance.pdf
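The rough shape of the online piece, with placeholder names (the idxfix pass and the various caveats are in the PDF, and the broker needs -usernotifytime set for the online schema change):

   proutil mydb -C idxactivate mytable.ix_new useindex ix_existing recs 5000

The index gets added as inactive first (via the dictionary or SQL); idxactivate then builds the keys and activates it while the database stays online and replication keeps running.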

OTOH, as I recall, you are on an old klunky HPUX server with some sort of horrifically awful SAN (which is sort of from the department of redundant statements department).  There's an argument to be made that you could get migrated to Linux before that index will be built.

Posted by Nigel Allen on 15-Jan-2020 01:28


I'm guessing the "10 hour outrage window" is an apt description? Or was it a Freudian slip?

Asking for a friend.

Posted by George Potemkin on 15-Jan-2020 11:57

> I would still love to hear more about this table! I believe you ... but it seems very hard to envision how we could ever manage a table like this ourselves. And I want to get the whole story in case I ever need to retell it to someone.

BTW, there are tables of a similar size in the other branches of the same customer.

The customer does not add new indexes to this table.
The existing indexes:

Table                      Index  Fields Levels         Blocks    Size  % Util  Factor
PUB.table
  index1                       9       3      5       53213748  114.8G    28.4     2.4
  index2                      10       3      4       21102108   89.1G    55.6     1.9
  index3                       8       6      5      144165188  680.6G    62.1     1.8
  index4                      11       2      4       30022068  129.1G    56.6     1.9

                    ------------------------------------------------------------------
Totals:                                              248503234 1013.7G    53.6    1.9

There is a real problem with such large tables: it's not possible to check the health of their indexes in a reasonable timeframe. The fastest way is to rebuild the indexes from scratch. We can only trust that the indexes are not corrupted.

Posted by dbeavon on 15-Jan-2020 14:17

>> you are on an old klunky HPUX server ...

Is there any other kind of HPUX server?  Do you suppose anyone is buying the "new Kittson" version of the Itanium chip? (Especially now that HPUX has an end-of-life date!)

We do much of our preproduction work on an old processor (Intel(R) Itanium(R) Processor 9350, 1.73 GHz).  It has 8 logical processors, which is fairly decent for a development box... but any time our OE-related work is being done single-threaded (as SQL "create index" appears to be) we have lots of trouble.  Thankfully we have finally made a tentative plan to move off of HPUX, but it is going to take a while!  We are moving to Windows Server on modern 4 GHz x64 chips.  Both PASOE and the OE database are running pretty smoothly on Windows, based on my testing.  And the Progress tech support cases we have opened have gone much more smoothly now that they are based on the Windows platform rather than HPUX.

Since we are already on the topic of HPUX, has anyone heard of "project kiran", aka "Portable HP-UX"?  In theory it is a binary instruction translator that will allow IA64 code from HP-UX to run seamlessly on Linux x64, to help with migrations...  I don't think it is supposed to be a full VM as such, but something along those lines.  It seems a bit far-fetched to me and I can hardly find any references to this project in my searches (maybe it is supposed to be a secret?)  The only place I found something was in a YouTube video: www.youtube.com/watch

Even if HPE is working on a project like that, it is doubtful that Progress would ever support hosting their OpenEdge product within that type of container.  I don't have much hope for "project kiran"... but it would be interesting to hear what other HPUX customers think about it.  I suspect it is just giving people false hope, and a reason to drag their feet while staying on expensive HPUX hardware for a few more years.

>> The customer does not add new indexes to this table.

Makes sense.  I suppose much of the data itself is static too.  The size of "index3" alone is huge!  It is as big as you might expect for an entire ERP database.

Posted by George Potemkin on 15-Jan-2020 14:52

> I suppose much of the data itself is static too.

No. Table activity per sec:

Time     Reads  Updates Creates Deletes
04:30:02 2620.93  11.08  488.19   87.94
05:00:01 1230.22   8.00  389.70  236.52
05:30:05 1273.29   1.57  851.79  204.68
06:00:03 1239.86   2.39  761.11  190.76
06:30:02 1411.50   0.96  347.13  222.14
07:00:01 4008.59   1.06  417.46  245.09
07:30:02 7769.64   1.82  346.03  232.30
08:00:03 2848.69   1.93  357.93  230.18
08:30:02 2120.96   2.81  765.26    5.71
09:51:16 1775.03   0.04   19.70    0.01
10:00:06 5874.99  13.08  265.98    1.32
10:30:02 5062.48  13.13  397.14    0.15
11:00:03 1119.75  34.45  573.81    0.92
11:30:01 1474.63  59.05  723.82    0.62
12:00:04 2793.17  71.65  639.60    0.74

Posted by Thomas Mercer-Hursh on 15-Jan-2020 15:10

I think one has to step back and realize that while adding an index to a 100M row table might seem like just adding some reference information, one *is* actually adding 100M pieces of data.  While there might be ways to do this particularly quickly with a DB that is inactive and not being replicated, if the DB is active and being replicated, then that is a huge amount of new information being added to the DB and a huge amount of information which must be passed to the replicate.  Hence the whole thing taking a long time using SQL and/or with replication active.

Posted by ChUIMonster on 15-Jan-2020 15:53

20 years ago HP used to make some nice servers.

I cannot think of any reason why I would want to still be running on HPUX in 2020.  You may as well shoot yourself in the foot while your foot is inserted in your ear.  Every moment that you spend not migrating away from HPUX is more agony for your users.

I feel the same way about Solaris.

AIX still has some value at the extreme high end but it isn't very competitive in the mid range or lower.

Posted by Thomas Mercer-Hursh on 15-Jan-2020 16:06

Tom, I was thinking earlier along these lines: if one is trying to do things at this scale and the performance isn't what one would like, should one actually be blaming Progress, or how likely is it that the platform is really to blame?  Do you or others have experience with tasks of equal or greater scale which perform at dramatically higher levels without being on exotic hardware that would be beyond a site like this?  Perhaps it is time to point the finger at the real problem?  I know you have a rich store of examples where enterprise IT departments have dictated storage for DBs that is not good for supporting DBs, and the contrast between that and some in-box disks or even SSDs is staggering.

Posted by ctoman on 15-Jan-2020 16:51

We have been running HPUX for 20 years and the performance has been great for the last 14 years.  I personally love HPUX.  It's really all about the storage and the ABL programs.  Anyhow, we are going to migrate to RH Linux in the near future, not because of performance, but because OE 12 does not support HPUX anymore.

Posted by ChUIMonster on 15-Jan-2020 17:38

I think that unless you have hobbled your new servers by choosing lots of slow cores instead of a small number of fast cores, or by continuing to use slow SAN storage, you will very likely be amazed at just how much faster a modest Linux box is.  There is simply no comparison.  HP and Sun lost their way a long, long time ago and it hasn't been close for 10+ years.

Posted by Dmitri Levin on 20-Jan-2020 22:05

> So in reality I would probably add it inactive, leave replication running, and build it online with a combination of idxfix and idxactivate.

That is what I would do too. But the question is how much time it would take to run idxactivate on a 400 million row table. I have several tables over 400 million rows. And while that is not even close to George's case, I do not think a single-threaded idxactivate can be done in 24 or 48 hours.

idxbuild, on the other hand, is multi-threaded and runs circles around idxactivate (and idxfix)!

As to "adding 100M pieces of data", yes SQL DDL would take some time in Oracle DDL too. I am easily adding indexes to the Oracle tables of the size much greater than we discussed here ( except George's client of course ). And IMHO, adding an index to a big table is the biggest thing that defines the difference between Progress and others (Oracle) for me. Everything else I can tolerate.

I hope that the speed of idxactivate will match idxbuild soon! We really desperately need it!

Posted by Patrice Perrot on 21-Jan-2020 17:42

Hello

YMMV; I'll let you estimate and do your own tests on idxactivate.

I did some tests on idxactivate (with and without idxfix beforehand) on a sports2000 db and another test DB (each time 2 fields in the new index; more fields => more time).

 - On a 10 million row table it took around 10 minutes to complete the idxactivate (mean value 614 seconds).

 - On a 1 million row table it took around 1 minute to complete the idxactivate (mean value 60 seconds).

I did not test with a 400 million row table ...

With idxactivate there are 2 different possibilities for the useindex option (examples below):

- useindex [UNIQUE INDEX]

- useindex [NON-UNIQUE INDEX]

Progress writes that you can use IDXACTIVATE "online", but:

- with useindex on a non-unique index you cannot create, update, or delete rows of the table

- with useindex on a unique index you "can" create, update, or delete rows of the table
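For example, on my sports2000 tests the two variants looked roughly like this (OL_new is the inactive index I added; the recs value is just the one I tested, and exact name qualification may vary by release):

   proutil sports2000 -C idxactivate OrderLine.OL_new useindex orderline recs 5000

   proutil sports2000 -C idxactivate OrderLine.OL_new useindex itemnum recs 5000

Here orderline is the unique index and itemnum the non-unique one.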

In the case of using a unique index to do the idxactivate, I did some tests with create/update/delete activity on this table while my idxactivate was running.

I noticed a few things:

- At the end of the wait time (the -usernotifytime), the schema lock needed to update this index could crash my other sessions (for example message 6527, "Table <table name> is locked.  Record create failed.")

            -> So have no activity when you launch it

- During the IDXACTIVATE (creating the index keys), I could get some 2624 messages, due to the lock held by idxactivate, in the session doing the update activity

          ->  Tests and probability show that decreasing recs from 5000 to 50 has some benefit on those messages

        -> But it increases my AI file size by 3 % and the elapsed time by 1.5 %

I also noticed that the idxfix before idxactivate can be skipped in the case of activating a non-unique index: the blocks of the "inactive" index created by the idxfix are dropped at the beginning of IDXACTIVATE.

So doing it anyway only increases the IDXACTIVATE time (by 0.8 %) and the AI file size (by 3.5 to 4 %, depending on the recs value, 50 or 5000).

Remember YMMV.

Patrice

Posted by Patrice Perrot on 21-Jan-2020 18:39

Hi

Forgot to thank Paul and James for their workshop and breakout session.

Patrice

Posted by dbeavon on 22-Jan-2020 01:49

>> I did some tests on idxactivate ...

Your times are very similar to what I experienced.  I was adding two indexes of about 5 fields each to a table of 100 million rows.  The addition of each of the new indexes was taking about two hours.  For my tests I used two approaches, both done on an offline database: (1) adding the INDEX so that it is initially active, via SQL, and (2) adding the INDEX as an inactive one and then using idxactivate afterwards.

Did you happen to monitor your CPU and disk while doing this?  My observation is that CPU runs on a single core at 100% the entire time and disk is *not* really taxed at all (no issues related to the response times on disk, especially when using a fast SSD).

The bottleneck definitely appears to be CPU (or more specifically the clock speed of a *single* core).

Insofar as performance goes, it seemed that "idxactivate" was similar to adding an INDEX that is initially active via SQL.  The main difference is whether the system is available to users while the index is being built.

Posted by George Potemkin on 22-Jan-2020 12:28

If I understand correctly, the activity of idxactivate consists of:
1. Record reads using a base index (“useindex”);
2. Index key reads from the uncompleted tree of the activated index;
3. Creating in memory the recovery notes for the index blocks.

Idxactivate does not write BI blocks; that's the duty of the BIW process. Transaction size (the "recs" option of idxactivate) or the bi cluster size may affect the result, but it's an indirect influence and it can be more or less eliminated.

So the main activity is reading, methinks. Did anybody compare the idxactivate time with the time to read the table using the base index and/or the activated index?

My guess: idxactivate is faster than idxfix option 4, "Cross-reference check of multiple indexes for a table" (if you select the two indexes: the base and the activated one). Note that idxfix supports the -rusage option.
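For the record, the kind of run I mean is something like this (mydb is a placeholder; the option number is from the idxfix menu, and you pick the base index plus the activated index when prompted):

   proutil mydb -C idxfix -rusage
      -> 4. Cross-reference check of multiple indexes for a table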

And, of course, the reading tests require emptying the database and filesystem caches.

Posted by gus bjorklund on 22-Jan-2020 19:06

> On Jan 22, 2020, at 7:29 AM, George Potemkin wrote:

> If I understand correctly, the activity of idxactivate consists of:

> 1. Record reads using a base index (“useindex”);

> 2. Index key reads from the uncompleted tree of the activated index;

> 3. Creating in memory the recovery notes for the index blocks.

note on 1: mostly sequential if reference index matches storage order

note on 2: random reads, not sequential

4. creating index entry in new index

5. writing modified index blocks to disk

Posted by George Potemkin on 22-Jan-2020 19:15

> 5. writing modified index blocks to disk

Should it be done by APWs?

Posted by Patrice Perrot on 23-Jan-2020 09:11

Hi ,

Extract of my tests, on orderline with 10 000 000 rows: idxactivate of index OL_new, using itemnum (or orderline).

Table activity during idxactivate:

- Read:

   - _File          -> 10 000 410

   - OrderLine      -> 10 000 000

   - _Field         ->        111

   - _Index         ->         42

   - _Index-Field   ->          6

- Update:

   - _Index         ->          1

Index activity during idxactivate:

- Create

    - OrderLine.OL_new             -> 10 010 615

- Read

    - OrderLine.itemnum            -> 10 000 001   (same value for OrderLine.orderline)

    - _File._File-Name             ->        407

    - _Index-Field._Index/Number   ->         10

    - _Index._File/Index           ->          5

- Split

    - OrderLine.OL_new             ->     10 637

I think you can add: for each record of the table (orderline), you do 1 read of _File.

I found a huge difference between IDXACTIVATE using a unique index and using a non-unique index, in the "Record Lock" figures of promon:

- the difference is 20 000 000

 - Probably explainable by the fact that updates are allowed when we use a unique index

        =>  it cannot create the new index key while its record is locked by another user

One more point: after IDXACTIVATE, you probably need to do an IDXCOMPACT on the new index.
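For example (the target utilization value is just an illustration; idxcompact can run online):

   proutil sports2000 -C idxcompact OrderLine.OL_new 90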

Patrice

Posted by gus bjorklund on 23-Jan-2020 16:56

> On Jan 22, 2020, at 2:17 PM, George Potemkin wrote:

> > 5. writing modified index blocks to disk

> Should it be done by APWs?

Yes. But it must be done and is part of the additional load on the database and system imposed by index activate. As are the additional transaction log writes.

Posted by gus bjorklund on 23-Jan-2020 17:34

> On Jan 23, 2020, at 4:13 AM, Patrice Perrot wrote:

> - _File ->10 000 410

Should not be necessary to read _File ten million times. Smells like a bug.

Posted by Patrice Perrot on 24-Jan-2020 09:23

Hi Gus

I think so.

Case opened in November 2019. I said and wrote to Tech Support that this case is a low priority for my company…

I opened 2 other cases on the "same subject" (-usernotifytime, IDXFIX, IDXACTIVATE).

- IDXFIX does not take care of its "recs" option (the number of records to process in one transaction; if not specified, numrecs defaults to 100).  Low priority too.

    => When running IDXFIX on an inactive index of orderline (10 000 000 records in orderline) with several values of recs, the number of commits (from Promon DB Option 5) never changes.  Commits is always 10 000 001.

    => Impact on AI file size and elapsed time

- "-usernotifytime" (with OE11.7.5 + AIX 7.1 + REMC on Windows)  High priority

    => Error message 2703 is reported in the db log. (less than 1 % of the my Db are dealing with this issue)

Patrice

This thread is closed