Database slowness

Posted by mboubidi on 06-Oct-2019 14:17

Good morning all,

I have a very important database that runs on progressDB. It is very slow when I activate replication and fast as soon as I stop it. On the VMware side it is well provisioned with resources. I do not have much experience with NoSQL, so I am asking for your support: are there parameters to tune or specific troubleshooting steps I should follow?

Thank you in advance for your support.

Greetings

All Replies

Posted by James Palmer on 07-Oct-2019 13:07

It would be very helpful to understand what Progress version you are on. I presume you are talking about OpenEdge Replication? What are you noticing is slower? How are you measuring this? What is the network like between the source and the target database? Are you monitoring your pica queue? knowledgebase.progress.com/.../P136161

Posted by mboubidi on 07-Oct-2019 14:35

Hi James,

Thank you very much for your reply.

Progress version ==> 11.6.4

I presume you are talking about OpenEdge Replication? ==> Yes

What are you noticing is slower? ==> Database activity during peak hours

How are you measuring this? ==> When we stop replication the whole process takes 3h; with replication enabled it takes 7h

What is the network like between the source and the target database? ==> Master and slave are in different datacenters; all other DBs work fine, only progressDB is affected

Are you monitoring your pica queue? ==> No. How can we monitor this?

best regards.

Posted by Dapeng Wu on 07-Oct-2019 16:51

There have been a couple of performance improvements added in 11.7, which might be able to solve your problem.

1. Replication writes a lot of notes for fragmented records so that read-only clients can see consistent data. One performance improvement is to avoid writing these notes when a record is not fragmented. These notes can be seen with AI scan -verbose, and they are called "RL_LOGOP_START/END" notes.

2. A change to the algorithm by which the replication server polls the PICA queue, to reduce the overhead. The polling used to be too aggressive and could cause a performance slowdown during heavy loads.

Posted by George Potemkin on 07-Oct-2019 17:17

> Are you monitoring your pica queue ==> no, how we can monitor this ?

promon/R&D/1/16. Database Service Manager

Posted by mboubidi on 08-Oct-2019 15:35

> promon/R&D/1/16. Database Service Manager

Free message entries and total message entries are the same; none are in use.

The high-water mark is 652. What does it indicate, and can you explain what we should monitor? Please find the information below:

Enter a number, <return>, P, T, or X (? for help): 16

10/08/19        Status: Database Service Manager

12:12:03

Communication Area Size   :      32768.00 KB

 Total Message Entries   :      299592

 Free Message Entries    :      299592

 Used Message Entries    :           0

 Used HighWater Mark     :         652

 Area Filled Count       :           0

 Service Latch Holder    :          -1

 Access Count            :    29945984

 Access Collisions       :        5534

Registered Database Service Objects

Name                             Rdy Status  Messages Locked by

OpenEdge RDBMS                    Y  REG            0

OpenEdge Replication Server       Y  RUN            0

Enter <return>, R, P, T, or X (? for help): R

10/08/19        Status: Database Service Manager

12:13:03

Communication Area Size   :      32768.00 KB

 Total Message Entries   :      299592

 Free Message Entries    :      299592

 Used Message Entries    :           0

 Used HighWater Mark     :         652

 Area Filled Count       :           0

 Service Latch Holder    :          -1

 Access Count            :    29948456

 Access Collisions       :        5534

Registered Database Service Objects

Name                             Rdy Status  Messages Locked by

OpenEdge RDBMS                    Y  REG            0

OpenEdge Replication Server       Y  RUN            0

Enter <return>, R, P, T, or X (? for help): R

10/08/19        Status: Database Service Manager

12:13:05

Communication Area Size   :      32768.00 KB

 Total Message Entries   :      299592

 Free Message Entries    :      299592

 Used Message Entries    :           0

 Used HighWater Mark     :         652

 Area Filled Count       :           0

 Service Latch Holder    :          -1

 Access Count            :    29948507

 Access Collisions       :        5534

Registered Database Service Objects

Name                             Rdy Status  Messages Locked by

OpenEdge RDBMS                    Y  REG            0

OpenEdge Replication Server       Y  RUN            0

Enter <return>, R, P, T, or X (? for help): R

10/08/19        Status: Database Service Manager

12:13:08

Communication Area Size   :      32768.00 KB

 Total Message Entries   :      299592

 Free Message Entries    :      299592

 Used Message Entries    :           0

 Used HighWater Mark     :         652

 Area Filled Count       :           0

 Service Latch Holder    :          -1

 Access Count            :    29948641

 Access Collisions       :        5534

Registered Database Service Objects

Name                             Rdy Status  Messages Locked by

OpenEdge RDBMS                    Y  REG            0

OpenEdge Replication Server       Y  RUN            0

Enter <return>, R, P, T, or X (? for help): R

10/08/19        Status: Database Service Manager

12:13:09

Communication Area Size   :      32768.00 KB

 Total Message Entries   :      299592

 Free Message Entries    :      299592

 Used Message Entries    :           0

 Used HighWater Mark     :         652

 Area Filled Count       :           0

 Service Latch Holder    :          -1

 Access Count            :    29948748

 Access Collisions       :        5534

Registered Database Service Objects

Name                             Rdy Status  Messages Locked by

OpenEdge RDBMS                    Y  REG            0

OpenEdge Replication Server       Y  RUN            0

Enter <return>, R, P, T, or X (? for help): R

10/08/19        Status: Database Service Manager

12:13:15

Communication Area Size   :      32768.00 KB

 Total Message Entries   :      299592

 Free Message Entries    :      299592

 Used Message Entries    :           0

 Used HighWater Mark     :         652

 Area Filled Count       :           0

 Service Latch Holder    :          -1

 Access Count            :    29948937

 Access Collisions       :        5534

Registered Database Service Objects

Name                             Rdy Status  Messages Locked by

OpenEdge RDBMS                    Y  REG            0

OpenEdge Replication Server       Y  RUN            0

Posted by George Potemkin on 08-Oct-2019 16:04

More important:  Area Filled Count       :           0

In other words, the pica queue has not been an issue since db startup.
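As a rough illustration of the check above, here is a minimal Python sketch that reads the counters from the first promon screen posted in this thread. The function name `parse_service_manager` is hypothetical and not part of any Progress tool, and reading "Used HighWater Mark" as the peak number of entries in use at once is an assumption based on the field name.

```python
import re

# Counters copied from the first promon screen posted above (12:12:03).
SAMPLE = """\
Communication Area Size   :      32768.00 KB
 Total Message Entries   :      299592
 Free Message Entries    :      299592
 Used Message Entries    :           0
 Used HighWater Mark     :         652
 Area Filled Count       :           0
 Service Latch Holder    :          -1
 Access Count            :    29945984
 Access Collisions       :        5534
"""

def parse_service_manager(text):
    """Hypothetical helper: collect 'Label : integer' lines into a dict.

    Lines whose value is not a plain integer (e.g. '32768.00 KB') are skipped.
    """
    fields = {}
    for line in text.splitlines():
        m = re.match(r"\s*([A-Za-z][A-Za-z ]*?)\s*:\s*(-?\d+)\s*$", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields

stats = parse_service_manager(SAMPLE)

# Assumption: the high-water mark is the largest number of entries used at once.
peak_pct = 100.0 * stats["Used HighWater Mark"] / stats["Total Message Entries"]

print("Area Filled Count:", stats["Area Filled Count"])
print(f"Peak queue usage: {peak_pct:.2f}% of total entries")
```

With these numbers the queue never filled, and under the assumption above its peak usage stayed well below 1% of the available entries.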

Posted by James Palmer on 08-Oct-2019 16:07

So the question is, has the slow process been run since the DB was last started?

Posted by mboubidi on 09-Oct-2019 13:51

Hi James,

Yes, it happens every time we run the monthly periodic job on the database. It has been observed for the last 3 months.

Posted by George Potemkin on 09-Oct-2019 17:15

mboubidi, you posted the statistics for a database that most likely has been up for only a few days:

Time   Access Count

12:12:03 29945984
12:13:15 29948937
72 sec =>  2953 accesses

29948937 / 2953 * 72 = 730214 sec = 8.5 days
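The same extrapolation can be written as a small Python sketch. It assumes the access rate seen in the 72-second sample is representative of the whole time since startup, which is only a rough approximation; the function name `estimate_uptime_seconds` is made up for this example.

```python
from datetime import datetime

def estimate_uptime_seconds(t1, count1, t2, count2):
    """Extrapolate uptime from two samples of a counter that started at 0.

    Assumes the counter has grown at a constant rate since startup.
    """
    fmt = "%H:%M:%S"
    elapsed = (datetime.strptime(t2, fmt) - datetime.strptime(t1, fmt)).total_seconds()
    rate = (count2 - count1) / elapsed   # accesses per second in the sample
    return count2 / rate                 # seconds needed to reach count2 at that rate

# First and last promon samples from the output posted earlier in the thread.
uptime = estimate_uptime_seconds("12:12:03", 29945984, "12:13:15", 29948937)
print(f"about {uptime / 86400:.1f} days")
```

If the database is mostly idle outside the monthly job, the real uptime could differ a lot from this estimate.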

Posted by mboubidi on 13-Oct-2019 14:47

Hi George,

As far as I know, this database has been running for months. Is there any way to check the uptime of the database?

As you mentioned, pica is not the issue. Which parameters could we check to understand the problem?

Thanks,

Posted by gus bjorklund on 13-Oct-2019 16:32

Are the source and target machines of the same power and capability? Are they configured the same? Are the database configuration parameters the same or close? The target machine has to do the same database operations as the source.

In the past, when I worked with a customer to solve a similar problem, after much exploration and study, we finally learned that the source had RAID 10 storage and the target had RAID 5. Once the target disk array was reconfigured, the problem was solved.

Posted by mboubidi on 16-Oct-2019 09:43

Hi Gus,

Thanks for responding.

In our case, both the source and the target run on RAID 6.

I also checked the OS-level and DB-level parameters; the configuration is the same on both ends.

Thanks,

Posted by mboubidi on 13-Nov-2019 11:29

Hi,

We are still facing the same issue when we run the monthly activity. We had to disable replication to process this month's job too.

Please let us know whether we can pull any diagnostic report or other details during the activity, so we can analyse how replication or any other factor is affecting our jobs.

Thanks,

Posted by frank.meulblok on 13-Nov-2019 11:45

You may want to read this: blog.storagecraft.com/.../

And then you may want to move the replication target database (= a system expected to see a lot of I/O write activity)  to a file system that is not explicitly designed to sacrifice write performance.

Posted by George Potemkin on 13-Nov-2019 14:38

> Please let us know whether we can pull any diagnostic report or other details during the activity, so we can analyse how replication or any other factor is affecting our jobs.

If you're running on Unix then you can gather statistics using the dbmon script:

community.progress.com/.../25700

Dbmon uses promon. It's safe to use promon with V11.6.

This thread is closed