Good morning all,
I have a very important database that runs on progressDB. It is very slow when I activate replication and fast again as soon as I stop it. It runs on VMware, and on the resource side it is well equipped. I do not have much experience with NoSQL, so I am asking for your support: are there parameters to tune, or specific troubleshooting steps to follow?
Thank you in advance for your support.
Greetings
It would be very helpful to understand what Progress version you are on. I presume you are talking about OpenEdge Replication? What are you noticing is slower? How are you measuring this? What is the network like between the source and the target database? Are you monitoring your pica queue? knowledgebase.progress.com/.../P136161
Hi James,
Thank you very much for your reply,
Progress version ==> 11.6.4
I presume you are talking about OpenEdge Replication? ==> Yes
What are you noticing is slower? ==> Database activity during peak hours
How are you measuring this? ==> With replication stopped, the whole process takes 3h; with replication enabled, it takes 7h
What is the network like between the source and the target database? ==> Master and slave are in different datacenters; all DBs work fine except progressDB
Are you monitoring your pica queue? ==> No, how can we monitor this?
best regards.
There have been a couple of performance improvements added in 11.7, which might be able to solve your problem.
1. Replication writes a lot of notes for fragmented records so that read-only clients can see consistent data. One performance improvement is to avoid writing these notes when a record is not fragmented. These notes can be seen from AI scan -verbose, and they are called "RL_LOGOP_START/END" notes.
2. A change to the algorithm by which the replication server polls the PICA queue, to reduce the overhead. The polling used to be too aggressive and could cause a performance slowdown during heavy loads.
> Are you monitoring your pica queue ==> no, how we can monitor this ?
promon/R&D/1/16. Database Service Manager
> promon/R&D/1/16. Database Service Manager
Free message entries and total message entries are the same, so none are in use.
The high water mark is 652. What does it indicate, and what should we monitor? Please find the information below:
Enter a number, <return>, P, T, or X (? for help): 16
10/08/19 Status: Database Service Manager
12:12:03
Communication Area Size : 32768.00 KB
Total Message Entries : 299592
Free Message Entries : 299592
Used Message Entries : 0
Used HighWater Mark : 652
Area Filled Count : 0
Service Latch Holder : -1
Access Count : 29945984
Access Collisions : 5534
Registered Database Service Objects
Name Rdy Status Messages Locked by
OpenEdge RDBMS Y REG 0
OpenEdge Replication Server Y RUN 0
Enter <return>, R, P, T, or X (? for help): R
10/08/19 Status: Database Service Manager
12:13:03
Communication Area Size : 32768.00 KB
Total Message Entries : 299592
Free Message Entries : 299592
Used Message Entries : 0
Used HighWater Mark : 652
Area Filled Count : 0
Service Latch Holder : -1
Access Count : 29948456
Access Collisions : 5534
Registered Database Service Objects
Name Rdy Status Messages Locked by
OpenEdge RDBMS Y REG 0
OpenEdge Replication Server Y RUN 0
Enter <return>, R, P, T, or X (? for help): R
10/08/19 Status: Database Service Manager
12:13:05
Communication Area Size : 32768.00 KB
Total Message Entries : 299592
Free Message Entries : 299592
Used Message Entries : 0
Used HighWater Mark : 652
Area Filled Count : 0
Service Latch Holder : -1
Access Count : 29948507
Access Collisions : 5534
Registered Database Service Objects
Name Rdy Status Messages Locked by
OpenEdge RDBMS Y REG 0
OpenEdge Replication Server Y RUN 0
Enter <return>, R, P, T, or X (? for help): R
10/08/19 Status: Database Service Manager
12:13:08
Communication Area Size : 32768.00 KB
Total Message Entries : 299592
Free Message Entries : 299592
Used Message Entries : 0
Used HighWater Mark : 652
Area Filled Count : 0
Service Latch Holder : -1
Access Count : 29948641
Access Collisions : 5534
Registered Database Service Objects
Name Rdy Status Messages Locked by
OpenEdge RDBMS Y REG 0
OpenEdge Replication Server Y RUN 0
Enter <return>, R, P, T, or X (? for help): R
10/08/19 Status: Database Service Manager
12:13:09
Communication Area Size : 32768.00 KB
Total Message Entries : 299592
Free Message Entries : 299592
Used Message Entries : 0
Used HighWater Mark : 652
Area Filled Count : 0
Service Latch Holder : -1
Access Count : 29948748
Access Collisions : 5534
Registered Database Service Objects
Name Rdy Status Messages Locked by
OpenEdge RDBMS Y REG 0
OpenEdge Replication Server Y RUN 0
Enter <return>, R, P, T, or X (? for help): R
10/08/19 Status: Database Service Manager
12:13:15
Communication Area Size : 32768.00 KB
Total Message Entries : 299592
Free Message Entries : 299592
Used Message Entries : 0
Used HighWater Mark : 652
Area Filled Count : 0
Service Latch Holder : -1
Access Count : 29948937
Access Collisions : 5534
Registered Database Service Objects
Name Rdy Status Messages Locked by
OpenEdge RDBMS Y REG 0
OpenEdge Replication Server Y RUN 0
More important: Area Filled Count : 0
In other words, the pica queue has not been an issue since db startup.
So the question is, was the slow process run since the DB was last started?
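For anyone who wants to watch these counters over time rather than by eye, here is a minimal sketch in Python that pulls the numeric fields out of one saved copy of the screen above. The function names (`parse_pica_status`, `pica_report`) are made up for illustration, and reading "Used HighWater Mark" as the peak number of used entries is my interpretation of the label, not something confirmed in this thread. How you capture the promon screen to text is left out on purpose.

```python
import re

# One sample of the "Database Service Manager" screen, copied from the post above.
SAMPLE = """\
Communication Area Size : 32768.00 KB
Total Message Entries : 299592
Free Message Entries : 299592
Used Message Entries : 0
Used HighWater Mark : 652
Area Filled Count : 0
Service Latch Holder : -1
Access Count : 29945984
Access Collisions : 5534
"""

def parse_pica_status(text):
    """Return {label: int} for every 'Label : integer' line in the screen text."""
    stats = {}
    for line in text.splitlines():
        # Labels are letters and spaces only; lines such as timestamps or
        # 'Communication Area Size : 32768.00 KB' do not match and are skipped.
        m = re.match(r"\s*([A-Za-z][A-Za-z ]*?)\s*:\s*(-?\d+)\s*$", line)
        if m:
            stats[m.group(1)] = int(m.group(2))
    return stats

def pica_report(stats):
    """Summarize the counters discussed in the thread (hypothetical helper)."""
    total = stats["Total Message Entries"]
    return {
        # Assumed meaning: peak number of entries in use, as a share of the queue.
        "high_water_pct": 100.0 * stats["Used HighWater Mark"] / total,
        # Nonzero would mean the queue filled up at least once.
        "ever_filled": stats["Area Filled Count"] > 0,
        "used_now": stats["Used Message Entries"],
    }

report = pica_report(parse_pica_status(SAMPLE))
print(report)
```

With the numbers posted above, the high water mark is a small fraction of one percent of the queue and the filled count is zero, which matches the conclusion that the pica queue is not the bottleneck here.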
Hi James,
Yes, it happens every time we run the monthly periodic job on the database. It has been observed for the last 3 months.
bymboubidi, you posted the statistics for a database that most likely had been up for only a few days:
Time Access Count
12:12:03 29945984
12:13:15 29948937
72 sec => 2953 accesses
29948937 / 2953 * 72 = 730214 sec = 8.5 days
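The same back-of-the-envelope estimate can be written as a short Python sketch. It rests on a loud assumption: the access rate seen in the 72-second sample is taken as constant since startup, so the result is only a rough order of magnitude. The function name `estimate_uptime_seconds` is hypothetical.

```python
from datetime import datetime

def estimate_uptime_seconds(t1, count1, t2, count2):
    """Extrapolate uptime from two (time, Access Count) samples.

    Assumes the counter started at 0 at DB startup and grew at the
    same rate as between the two samples (a rough assumption).
    """
    fmt = "%H:%M:%S"
    elapsed = (datetime.strptime(t2, fmt) - datetime.strptime(t1, fmt)).total_seconds()
    rate = (count2 - count1) / elapsed   # accesses per second in the sample
    return count2 / rate                 # seconds needed to reach count2 at that rate

# First and last samples from the promon output posted above.
uptime = estimate_uptime_seconds("12:12:03", 29945984, "12:13:15", 29948937)
print(f"{uptime:.0f} s, about {uptime / 86400:.1f} days")
```

If the samples were taken during a quiet period, the real rate over the database's lifetime could be higher and the true uptime shorter, or the other way round.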
Hi George,
As far as I know, this database has been running for months. Is there any way to check the uptime of the database?
Since you mentioned that pica is not an issue, which parameters could we check to understand the issue?
Thanks,
Are the source and target machines of the same power and capability? Are they configured the same? Are the database configuration parameters the same or close? The target machine has to do the same database operations as the source.
In the past, when I worked with a customer to solve a similar problem, after much exploration and study, we finally learned that the source had RAID 10 storage and the target had RAID 5. Once the target disk array was reconfigured, the problem was solved.
Hi Gus,
Thanks for responding.
In our case, both source and target are running on RAID 6.
I also checked the OS-level and DB-level parameters; the configuration is the same on both ends.
Thanks,
Hi,
We are still facing the same issue when we run the monthly activity. We had to disable replication to process this month's job too.
Please let us know whether we can pull a diagnostic report or some details during the activity, so we can analyse how replication or any other factor is affecting our jobs.
Thanks,
You may want to read this: blog.storagecraft.com/.../
And then you may want to move the replication target database (= a system expected to see a lot of I/O write activity) to a file system that is not explicitly designed to sacrifice write performance.
> Let us know can we pull any diagnostic report or some details during the activity, so we can analyse how replication or any other factor affecting our jobs?
If you're running on Unix then you can gather statistics using the dbmon script:
community.progress.com/.../25700
Dbmon uses promon. It's safe to use promon with V11.6.