Explicitly synchronizing source and target

Posted by ChUIMonster on 03-Dec-2019 17:10

Is there a simple and straight-forward way to ensure that a replication target is completely synchronized when the source is shutdown?

All Replies

Posted by Dapeng Wu on 03-Dec-2019 21:27

A couple of "aimage new" commands on the source may do the trick.

Posted by ChUIMonster on 03-Dec-2019 22:00

That seems like it ought to work but I worry about scenarios that I might be unaware of where maybe something doesn't get transferred.  I was hoping for something more like "*will* do the trick" rather than "may do the trick" ;)

The other problem is that in order for that to work without the possibility of those pesky users slipping in more transactions I think that I need a good way to kick them all out of the application without shutting it down.  And not every application is going to make that easy.

To expand on the use case a bit...  I'm looking for a way to ensure that when I want to do something such as "flip" the source and target or pull a target out of replication and use it as a "clone" of production that I can be *certain* that there were no pending transactions that have not yet made it to the target.  I'd like to not have to run multiple commands on multiple servers to do that or to verify that.

I think that I need to shutdown the source db in order to ensure that all users have been disconnected and that no new transactions are started.  But what I am not sure of is what, if anything, happens during the shutdown to try to bring both sides into sync.  

Posted by ducity on 03-Dec-2019 22:41

Could you use DSRUTIL -C monitor , the Transaction#'s ?

   Latency Information:

       Repl Server behind Source DB by:        2553  second(s)

       Current Source Database Transaction:    2182416

       Last Transaction Applied to Target:     2177660

Posted by Dapeng Wu on 04-Dec-2019 17:30

i can add a new feature request to our backlog. Basically a new option to the dsrutil command. I'm thinking to let it freeze the database activities and ensure all AI notes sent to the targets before it unfreezes. Will this meet  your requirement? Do you also need to make sure all the AI notes applied and flushed on the targets with this operation? This will make the freeze longer because with AI streaming, there can be a latency on the target side to apply the notes.

Posted by Pieterm on 05-Dec-2019 06:02

A new dsrutil option would be great i.e terminate server graceful, whereby all AI notes for a transaction busy still send to all targets but any new transaction started, it's AI notes are send to the AI logs only. And once these busy AI notes are all send, the replication server can terminate.

Posted by ChUIMonster on 05-Dec-2019 14:02

> Could you use DSRUTIL -C monitor , the Transaction#'s

That would *confirm* that they are synchronized so it could be a useful double checking step but what I am looking for is a simple way to get to that state in a reliable and repeatable manner.  Sort of like one might use "truncate bi" to get the db state cleaned up prior to a version upgrade.

Posted by ChUIMonster on 05-Dec-2019 14:06

This might not be obvious but I am looking for an *offline* option.  The use-case is for circumstances where the source db will be shutdown.

An online option would also be nice to have but is obviously only going to be true for as long as you can prevent users from actually doing any work.

Posted by Simon L. Prinsloo on 05-Dec-2019 14:31

I think that this question highlights a short coming that I encountered myself before. You want to get the database into a state where users are kicked out and no new users are allowed in, the only reliable way is to shut the database. But if you want replication to catch up, you need to start the database.

What I need from time to time is the ablity to switch the database to a state where a) Users are disconnected as soon as they have no open transactions, b) (Almost) no new connections are allowed. (Promon is still needed to detect and disconnect users with long running processes and if the replication server crashed, it needs to be possible to restart it. Keeping the ai archiver going is also a benefit). c) Replication process normally. d) The database shuts down automatically as soon as there are no more users and replication is up to date. Idealy, if the ai archiver is configured, the very last step should be to switch to a new ai extent and to finish archiving the remaining ones.

I do suspect that [mention:dc2ff39fa15940708b78017f1194db6a:e9ed411860ed4f2ba0265705b8793d05]  has a very similar need, prompting this question. The closest I could get is in applications that requires connection to more than one database, where you force the startup procedure to access at least two databases. Then you shut one of these and wait for processing to finish on the others. Then you shut those and start the one that was shut first, wait for it to finish processing and shut it again. But this is a manual, labor intensive process that cannot be scripted (easily).

Posted by ChUIMonster on 05-Dec-2019 14:38

Simon - well said.  Yes, that describes my situation.

Posted by Peter Judge on 05-Dec-2019 15:00

Some aspects of this user lockout can be achieved via the OEAG and the connection authorization capabilities. If you disabled access to new connections to the DB, then you'd "just" have to disconnect other users.  The conenction authorization is role-based so you should be able to allow certain groups of users back in (or not kick them out) too.
 
 

Posted by Pieterm on 05-Dec-2019 15:05

We have a customer that stops replication for a short period every day for a point in time data warehouse report to run on the target, but on quite a few instances we had to restart the replication server to complete partial transaction which was 'locked' and could not be read.

So stopping the replciation server online of offline will result in this scenario whereby a target database could have in-completed transactions.

Both an online and offline enhancement would thus be very useful.

Offline: The database shutdown utility should allow for a new parameter that does the normal shutdown steps but first ensures the AI is 100% in sync (similar to what is done during the online transitioning) before shutting the replication server and archiver processes, and then the database.

Thus a shutdown on the source will first ensure all the targets are sync, and all database are in a usable state.

Online: The dsrutil utility should allow for an option to do a graceful terminate server, which ensure all AI notes for a transaction is completely send to all targets but any new transaction started, it's AI notes are send to the AI logs only. And once these busy AI notes are all send, can the replication server terminate.

This will ensure a target database will never have partial transactions due to the replication server being terminated part way during a transactions being send across.

Posted by Paul Koufalis on 05-Dec-2019 15:28

Reading this thread, I see two enhancement requests:

1. proshut DB -by -syncRepl

Shut the database down but as part of the normal shutdown/cleanup procedure, after you do all the undo/redo processing, finish sync'ing the replTarget. I do see a challenge here because the "send signal 7 to all the users" would have to leave out the rpServer.

2. Single-user mode for running database

We've been asking for this for some time: the ability to bring up the DB to different states, like Linux (or Oracle).

Posted by Dapeng Wu on 05-Dec-2019 16:43

Paul,

There was indeed some discussion about a special maintenance mode for database in the past, but it didn't actually get started. This is a bigger project and I'll bring it back to the teams for more inputs. Besides replication, do you have other use cases for this maintenance mode?

As to the offline syncRepl case, we also need to consider the time it may take to complete the sync. It can be very long before the database is fully shut down, depending on the transactions to back out.

For the online case, there are some technical challenges for what has been described. For example, "ensure all AI notes for a transaction is completely send to all targets but any new transaction started, it's AI notes are send to the AI logs only". This will require changes to our crash recovery mechanism and it will be big. What can be relatively easier to do is to cut a line that no new transactions can be started before the existing ones get replicated.

Posted by Paul Koufalis on 05-Dec-2019 16:54

Other use cases: dictionary changes, purge or mass load activities or similar activities where I want shared mem and all the helper processes but not the users.

proshut -syncRepl: I agree, and it's a similar issue now with -by vs -F.

For the online case, is your suggestion essentially a proquiet enable DB -syncRepl ?

Posted by Dapeng Wu on 05-Dec-2019 17:06

Similar idea. But the existing proquiet can install quiet point in the middle of transactions, which is not what we want.

Posted by Dapeng Wu on 06-Dec-2019 17:06

I'll add two feature requests to our backlog according to what we have discussed here:

1. a tool to explicitly synchronize source and targets with no incomplete transactions, including both online and offline.

2. a new maintenance mode for the database, from where the online sync tool can run, together with other operations mentioned by Paul.

Appreciate your valuable inputs!

This thread is closed