Status 1100 "RP STATE CONNECTING"
This status means that the RPLS is still trying to establish contact with the RPLA, first within the connect-timeout period and after that within the defer-agent time. Essentially, one needs to find out why the RPLS cannot establish a connection with the RPLA at that time, remembering that there are 5-minute intervals between each attempt.
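To get a feel for how long the RPLS may sit in the CONNECTING state, the two windows can be added up. The following is only a rough sketch: `connect_window` is a hypothetical helper name, it assumes connect-timeout is given in seconds and the defer-agent time in minutes, and it assumes one retry per 5-minute interval during the defer-agent window. Check the units against your own properties file before relying on the numbers.

```shell
#!/bin/sh
# connect_window CONNECT_TIMEOUT_SECONDS DEFER_AGENT_MINUTES
# Hypothetical helper: estimates the total time the RPLS keeps trying and
# roughly how many 5-minute retries fit in the defer-agent window.
# Assumption: connect-timeout is in seconds, defer-agent time is in minutes.
connect_window() {
  connect_timeout_s=$1
  defer_agent_min=$2
  retry_interval_min=5   # interval between attempts, as stated in this article

  total_s=$(( connect_timeout_s + defer_agent_min * 60 ))
  retries=$(( defer_agent_min / retry_interval_min ))

  echo "total=${total_s}s retries=${retries}"
}

# Example: 120 s connect-timeout followed by 60 min of defer-agent time
connect_window 120 60   # prints: total=3720s retries=12
```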
The quick and easy fix is to restart the source database when the CONNECTING status follows a network failure and does not clear.
Assuming that the agent-shutdown-action=recovery property is set in the replication.properties file, the RPLA will still be listening and a new RPLS will start with the source database.
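For orientation, the properties discussed in this article might appear in the source-side properties file roughly as follows. This is an illustrative fragment, not a complete or authoritative configuration: the section placement, the agent name `agent1`, and the values for `defer-agent-startup` and `connect-timeout` are assumptions chosen for the example, so compare them with your own file and the product documentation.

```
# Illustrative fragment only; section names and values are assumptions.
[server]
# keep the RPLA listening (recovery) when the source shuts down
agent-shutdown-action=recovery
# defer-agent window (example value)
defer-agent-startup=60

# "agent1" is a hypothetical agent name
[control-agent.agent1]
# seconds the RPLS waits for the RPLA before entering defer-agent time (example value)
connect-timeout=120
```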
It is nevertheless a good idea to first verify that this is the state, with:
$ DSRUTIL <target> -C monitor
Otherwise, prior to OpenEdge 11.6, the target database needs to be started in order to restart the RPLA.
Since OpenEdge 11.6 the agent can be restarted online with:
$ DSRUTIL <target> -C restart agent
Since DSRUTIL status 1100 means "RP STATE CONNECTING":
1. First confirm that the RPLA is running on target(s):
$ DSRUTIL <target> -C monitor
2. Use network utilities to check whether anything is preventing communication with the target's -S listening port.
3. The source RPLS cannot be terminated until the initial connect-timeout has expired and it has moved on to defer-agent timeouts.
The connect-timeout specifies how many seconds the Replication Server will wait for connection to the Replication Agent before the Replication Server enters defer-agent timeouts or simply terminates if defer-agent is not in use.
If in defer-agent time, first try to force a connection to the Replication Agent:
$ dsrutil <source> -C startAgent
Otherwise, the RPLS can be terminated during defer-agent-startup time, after first checking that the RPLS-Q is not full (see Step 4):
$ dsrutil <source> -C terminate server
4. Check the pica queue (RPLS-Q) in PROMON:
$ promon <source> -> R&D -> 1. Status Displays -> 16. Database Service Manager
If the RPLS-Q is full:
The source database needs to be stopped and restarted (see quick and easy above).
Note that closing a Command Prompt that is running a hung DSRUTIL command (dsrutil -C terminate server) causes an abnormal database shutdown.
If the RPLS-Q is not full:
If the connect-timeout has expired, the RPLS is in defer-agent time, and the RPLA is still listening but RPLS-RPLA contact cannot be completed, then terminate the RPLS and, only once it is confirmed as stopped, release waits:
$ dsrutil <source> -C terminate server
$ dsrutil <source> -C RELWAITS
If this fails, try disconnecting the RPLS process through:
$ promon <source> -> 8 -> 1 disconnect user
Then restart the replserv process:
$ dsrutil <source> -C restart server
As a last resort, kill the RPLS pid and restart the RPLS:
- Killing the RPLS when it is not connected to the RPLA is fail-safe.
- It is only when the RPLS and RPLA are communicating and the RPLS is killed that there is a remote possibility that synchronisation will fail on restart.
- Target corruption is not possible if synchronisation has completed successfully and replication is in normal processing.
$ dsrutil <source> -C restart server
5. Otherwise, shut the source down and restart it, ensuring that the RPLA is still listening (otherwise the target will also need to be started).
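The checks in the steps above can be summarised as a small triage helper. This is a sketch under stated assumptions rather than a supported tool: `check_port` and `next_action` are hypothetical names, the port test relies on bash's `/dev/tcp` and the `timeout` utility being available, and a successful TCP connect only shows that something is accepting connections on the port, not that it is the RPLA. The yes/no inputs are expected to come from your own readings of DSRUTIL monitor and the PROMON Database Service Manager display.

```shell
#!/usr/bin/env bash
# Sketch only: check_port and next_action are hypothetical helper names.

# check_port HOST PORT [TIMEOUT_SECONDS]
# Attempts a plain TCP connect via bash's /dev/tcp.
# Returns 0 if the connect succeeds, non-zero otherwise.
check_port() {
  local host=$1 port=$2 wait=${3:-5}
  timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# next_action PORT_OK QUEUE_FULL IN_DEFER_AGENT   (each "yes" or "no")
# Prints the next step suggested by this article for the given readings.
next_action() {
  local port_ok=$1 queue_full=$2 in_defer=$3
  if [ "$port_ok" != "yes" ]; then
    # Steps 1-2: the RPLA is not reachable on the target -S port
    echo "fix connectivity or restart the RPLA on the target, then re-check"
  elif [ "$queue_full" = "yes" ]; then
    # Step 4: a full RPLS-Q requires a source restart
    echo "stop and restart the source database"
  elif [ "$in_defer" != "yes" ]; then
    # Step 3: the RPLS cannot be terminated during connect-timeout
    echo "wait for connect-timeout to expire"
  else
    # Step 3: in defer-agent time, first try to force a connection
    echo "dsrutil <source> -C startAgent"
  fi
}
```

If the forced connection does not succeed, the remaining path is the one described in step 4: terminate the server, confirm that it has stopped, and only then release waits.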