SCOM 2012 agent or gateway certificate issue

After we were stuck for several weeks, the resolution to this problem was actually found by my colleague Jens Van Hove, so all credit goes to him 😉

Special thanks to Kurt Van Hoecke for providing a wall to bounce some ideas off.

To start from the beginning: we had a problem adding a Windows Server 2012 machine to our SCOM 2012 SP1 monitoring environment when using a certificate-based trust. The problem was identical whether the machine was an agent-monitored server or a SCOM gateway, whenever the managed server was located in a different domain than the management server. Deploying the agent and installing the SCOM agent certificate goes well, but when you try to add the server to the environment to actually start monitoring, you get an error stating that the certificate is not trusted. Using a browser to verify the certificate trust reveals no issues: the chain is trusted and all root and intermediate certificates are in place. Even after trying re-installation, renewing certificate templates and temporarily bypassing the Cisco firewall between both machines, we were no closer to a solution.

But by accident, while searching on the different event IDs in the event logs, we came across a …

SCOM alert – Max concurrent API reached

EDIT (11/03/2014): 2nd possible cause found for the SCOM alert and added to the article (at the bottom).

If you have a recently patched Operations Manager environment, the current version of the basic OS management pack includes new intelligence to check for problems caused by the maximum number of NTLM or Kerberos PAC password validations a particular server can handle at a time.


Performance issues: these can be veeery hard to troubleshoot due to the large number of variables in your environment (from storage to networking to server hardware or virtualization performance, etc.). If you have had your storage engineers, your network specialists and your Hyper-V or VMware gurus run all the tests they can think of, try looking at the following as well (or better: SCOM could have already done it preventively 😉).

Besides performance issues, which are not only difficult but also often subjective, you can see some strange application behaviour.