Sharing my experience with System Center Configuration Manager and Microsoft.
Todays post is about the much talked about feature on ConfigMgr 2012-Data Replication Service(DRS). The replication mechanism internally uses SQL Service broker and I am discussing the three common issues with SSB that can stop the replication.
Issue 1:In the SQL error log we see the following:
Connection handshake failed. Error 15581 occurred while initializing the private key corresponding to the certificate. The SQL Server errorlog and the Windows event log may contain entries related to this error. State 88.
In thesys.transmission_queuewe see this in thetransmission_statuscolumn:
Service Broker login attempt failed with error: Connection handshake failed. An error occurred while receiving data: 10054(An existing connection was forcibly closed by the remote host.)
The history was the SQL server was installed with the system account and then later changed to a domain user account.
The problem with doing the above is that when Configuration Manager is installed it creates some internal certificates which are dependent on the master key. When the account being used to run the database server changes the new account is no longer able to unlock the master key and consequently can not read the internal certificates which then cause communication between sites to fail.
In order to address this we dropped and regenerated the master key in the SQL database – this effort was hampered because the user account being used to run the SQL Service appeared not to have sufficient rights to generate a new master key. To combat this we temporarily put the user account into the local administrators group after which a new master key could be generated. We then regenerated the primary site servers certificate using thespCreateandBackupSQLCertstored procedure in the ConfigMgr database. Having successfully regenerated the SSB certificates we copied them to the CAS (parent) and Secondary (child) site and the SQL errorlogs no longer showed SQL Service Broker login failures.
Following this we left the sites to recover but overnight it appeared that some init messages from the Primary had become lost in transit – possibly cleared from the queue as a part of the spCreateandBackupSQLCert stored proc. We set the current rows with status 3 in RCM_DRSInitilizationTracking to status = 7 to kick off the replication initialization process. This executed very quickly and replication appeared to be working as expected.
To resolve this issue, give full control permissions to the account that the SQL Service is running under to theProgramData\Microsoft\Crypto\RSA\MachineKeysdirectory.
Service Broker login attempt failed with error: Connection handshake failed. The login ConfigMgrEndpointLoginCAS does not have CONNECT permission on the endpoint. State 84.
In many cases the Endpoint itself would be missing we can create the same and give it the connect permissions:
SQL Security logins checked ConfigMGRENDPOINTLOGINCAS properites Securables and there were no permissions for the Connect for the user, Gave Grant for Connect.
Service Broker login attempt failed with error: Connection handshake failed. The certificate used by the peer is invalid due to the following reason: Certificate not found. State 89.. [CLIENT: 10.172.20.133]
This happens when the public key cert of the other SSB endpoint login for the ConfigMgr somehow goes missing on the other SSB endpoint login.
So suppose if I have the hierarchy is as follows-
When I run the SpDiagDRS on the PRI site, I see that it cannot find the cert for the SEC site.
Then I export the cert from SEC by using the query:
Then copy theSEC.CERfile onto the primary siteC:\SEC.cer.
After this we can connect to the Primary site DB and run the below query.
Support Escalation Engineer ConfigMgr Microsoft
Express could cause limitation in many scenarios but You might have to engage the SQL team for further analysis.
Exec dbo.spCreateSSBLogin @EndPointLogin=ConfigMgrEndpointLoginSEC, @DestSiteCode=SEC, @DestSiteCertFile=C:SEC.cer, @EndpointName=ConfigMgrEndpoint
The activated proc [dbo].[spRCMActivation] running on queue nfigMgrRCMQueue output the following: ERROR 50000, Level 16, State 1, Procedure spRethrowError, Line 42, Message: ERROR 50000, Level 16, State 1, Procedure spRethrowError, Line 42, Message: ERROR 50000, Level 16, State 1, Procedure spGetSSBDialogHandle, Line 58, Message: Route is not defined for target site with service name ConfigMgrRCM_SiteTV2.
Great article it solved one of my issues. Have you seen issues with SQL Express with memory errors? I get this in the sql errorlog over and over again.
spid367 There is insufficient system memory in resource pool internal to run this query.
spid293 Error: 18052, Severity: -1, State: 0. (Params:). The error is printed in terse mode because there was error during formatting. Tracing, ETW, notifications etc are skipped.
spid293 There is insufficient system memory in resource pool internal to run this query.
I see that you have used the SEC I the command and assume that it is the sitecode.
From the error it seems that the route to the site is missing. You can view the same in the SQL management studio under your DB in the Service broker tab. Compare the number of routes from a working site and replicate/create the same route with the desired sitecode.
Amazing work 🙂 Waiting for more DRS blogs from you 🙂
Thanks for your feedback on the previous DRS unleashed post. As promised, this post would
I am getting different error on my windows event log as follows. Kindly help me out how can i remove this error.
Service Broker login attempt failed with error: A previously existing connection with the same peer was detected during connection handshake. This connection lost the arbitration and it will be closed. All traffic will be redirected to the previously existing
connection. This is an informational message only. No user action is required. State 80.. [CLIENT: 10.61.208.37]
I have created a secondary site, State is showing ACTIVE, but Database Replication shown Link is being configured from last two days.
I have found error: Service Broker login attempt failed with error: Connection handshake failed. The login public does not have CONNECT permission on the endpoint. State 84.. [CLIENT: 172.16.1.169].
Dont know why it is showing public as login ID? Any Idea what needs to be done to solve this issue?
That actually means the ConfigMGrEndPoint login which generally gets created with Public access is not currently linked to it. You can delete and recreate the login again and that should resolve the issue.
great blog. We are facing the issue since Upgrade to CB 1610 that everytime while Upgrading to latest Current Branch SSB Port gets reset by ConfigMgr. We use Custom Ports. After CAS is upgraded successfully PRI says in replictrl.log SSB Port was not found for site DEC, so using 4022. We have to delete the routes manually and set them to right port. After that PRI gets updated to Current Branch.
Any clue why ConfigMgr cant find configured SSB Port?
We generally store the value in the SSBPort on site server Reg Key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\SMS\SQL Server\Site System SQL Account. Please verify if this is empty which is where it might then default.
After that we store the value in the DB which can be queries below:
select SD.SiteCode, , RSCP.Value from RcmSqlControlProperty RSCP
inner join RcmSqlControl RSC on ntrolID = RSC.ID
inner join SC_SiteDefinition SD on RSC.SiteNumber = SD.SiteNumber
where RSCP.Name = SQL Server Service Broker Port
Do we have a certificate for stand alone primary site? I am getting errors in Hman stating that the Service broker certificate is expired or missed. When i check in the database , i see certificate section stating NOT FOUND. But in registry i have the data regarding the certificate. Please share the location of the Service broker certificate.
For a standalone primary with no secondaries this should not be a cause of worry. The certificate can be viewed from master.sys.certificates view.
Upon running SPDIAGDRS ..SSB_cert, SSB_certname, SSB_login, SSB_user is missing on both the sides. Primary and on CAS too. Please suggest how shall i go ahead with this. Because in the given example one side was missing on one side and then we exported it from the other site. Whereas in my case its missing on both.
Service Broker login attempt failed with error: A previously existing connection with the same peer was detected during connection handshake. This connection lost the arbitration and it will be closed. All traffic will be redirected to the previously existing connection. This is an informational message only. No user action is required. State 80..
Log Location: Data\MSSQL11.MSSQLSERVER\MSSQL\Log\ERRORLOG