Monday, 26 September 2011

Discover Unix / Linux Servers and Deploy Agents

This post walks through basic configuration in System Center Operations Manager 2007 R2, focusing on discovering Unix/Linux servers and deploying agents to them.
The steps are as follows:
Use an account with local administrator privileges to log on to the SCOM 2007 R2 Root Management Server (the account should be a member of the "OpsMgrAdmins" group). This account must also have system administrator privileges on the instance of SQL Server that hosts the Operations Manager 2007 R2 database.
1. Open a command prompt as a local administrator and run the following command:
winrm set winrm/config/client/auth @{Basic="true"}
2. Open the SCOM console. Select Administration and click Discovery Wizard.
3. On the What do you want to manage page, select Unix/Linux computers, then click Next.
4. On the Discovery Method page, click Add.
5. On the Define discovery criteria dialog box, do the following:
· Provide the IP address of the Unix/Linux server
· Provide the credentials
· Click OK
6. On the Discovery criteria page, do the following:
· Select the management server
· Check Enable SSH based discovery
· Click Discover
7. On the Discovery results page, select the computer and click Next.
8. On the Deployment complete page, click Done.
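Before running the wizard, it can help to confirm that the target host is reachable from the management server on the ports discovery is likely to use. The following is only a minimal sketch, not part of the original procedure: it assumes SSH on TCP 22 (for SSH based discovery) and TCP 1270 as the cross-platform agent's WS-Management port (normally open only once an agent is installed). The function name `check_ports` is hypothetical.

```python
import socket


def check_ports(host, ports, timeout=3.0):
    """Return {port: bool} saying whether a TCP connection to host:port succeeds."""
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError (refused, unreachable, timed out) on failure
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results
```

For example, `check_ports("192.0.2.10", [22, 1270])` (a placeholder address) returns a dictionary keyed by port, which shows at a glance whether SSH is reachable before you click Discover.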

Removing servers from database

SET [IsDeleted] = 1,
[LastModified] = getutcdate()
WHERE (Path = 'FQDN' OR Name = 'FQDN' OR DisplayName = 'FQDN') AND [IsDeleted] = 0

SET [IsManaged] = 0,
[LastModified] = getutcdate()
WHERE (Path = 'FQDN' OR Name = 'FQDN' OR DisplayName = 'FQDN') AND [IsDeleted] = 1 AND [IsManaged] = 1
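
Because these filters combine OR and AND, it is worth recalling that SQL evaluates AND before OR, so the IsDeleted check only guards the whole FQDN match when the OR terms are grouped in parentheses. The sketch below illustrates the difference with an in-memory SQLite database. The table `entity` and its rows are hypothetical stand-ins, not the real Operations Manager schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in table; NOT the real Operations Manager schema.
conn.execute(
    "CREATE TABLE entity (Path TEXT, Name TEXT, DisplayName TEXT, IsDeleted INTEGER)"
)
conn.executemany(
    "INSERT INTO entity VALUES (?, ?, ?, ?)",
    [
        ("FQDN", "other", "other", 1),  # already deleted, matches on Path
        ("other", "other", "FQDN", 0),  # not deleted, matches on DisplayName
    ],
)

# Without parentheses, IsDeleted = 0 applies only to the DisplayName term.
unparenthesized = conn.execute(
    "SELECT COUNT(*) FROM entity "
    "WHERE Path = 'FQDN' OR Name = 'FQDN' OR DisplayName = 'FQDN' AND IsDeleted = 0"
).fetchone()[0]

# With parentheses, IsDeleted = 0 applies to every FQDN match.
parenthesized = conn.execute(
    "SELECT COUNT(*) FROM entity "
    "WHERE (Path = 'FQDN' OR Name = 'FQDN' OR DisplayName = 'FQDN') AND IsDeleted = 0"
).fetchone()[0]

print(unparenthesized, parenthesized)  # prints: 2 1
```

The unparenthesized form also matches the row that is already marked deleted, which is usually not what a filter of this shape intends.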

Saturday, 17 September 2011

Guidance, Tuning and Known Issues for the Exchange 2010 Management Pack for System Center Operations Manager 2007

A new KB article has just been posted with some really good information.
It is especially important for larger environments where you might have struggled with this MP. The article covers the concepts of the correlation engine, the design challenges of the MP, some scalability issues, and some useful modifications you can make to your OpsMgr environment to prepare for this management pack.
The Exchange 2010 MP is unlike any other MP out there, so special care is needed when deploying this powerful management pack.

Kevin Holman

A list of all possible security events in the Windows Security Event Log

This may be old news, but it is a handy reference for OpsMgr admins, when asked to monitor for specific events from security event logs:

Windows Server 2003: http://technet.microsoft.com/en-us/library/cc163121.aspx
Windows Server 2008: http://www.microsoft.com/download/en/details.aspx?id=17871
Windows Server 2008 R2: http://www.microsoft.com/download/en/details.aspx?id=21561

How to monitor SQL Agent jobs using the SQL Management Pack and OpsMgr

When you use Operations Manager to monitor your SQL servers using the SQL Server Management Pack, there are some options that you will need to think about up front.

One of those areas is the SQL Agent, and SQL Agent jobs.

I will cover three conceptual areas:
  • Out of the box experience and defaults
  • SQL Agent only monitoring (monitoring all jobs at the agent level)
  • SQL Agent Job monitoring (optionally discovering and monitoring agent jobs on an individual level)

Out of the box, we don’t discover or monitor individual SQL Agent jobs. What we do discover is the SQL Agent object.

By default, all we monitor with regard to the SQL Agent is the SQL Agent service availability.

Keep in mind that on SQL clusters, we don’t monitor manual services by default. The assumption is that if the service is clustered and is down, the cluster MP will alert that a clustered resource is partially offline. This avoids duplicate alerts about the same service availability issue. However, if you want the SQL MP to alert when a clustered SQL Agent service is down, you will need to override this monitor.

There are also several rules that target the “SQL Agent” object; they primarily look in the Application event log for issues and errors related to the SQL Agent.
Notice that the “An SQL job failed to complete successfully” rule is disabled out of the box. This was done to reduce noise, as many customers have poorly monitored or maintained SQL environments with so many failing jobs that the alerts were almost unactionable. If you want to be alerted to any SQL Agent job failure, consider enabling this rule via an override. I would recommend this if you aren't going to discover and monitor individual SQL Agent jobs (covered later in this article).

Lastly, at the SQL Agent level, there is a monitor that alerts when SQL Agent jobs run too long. To reduce noise, this is also disabled out of the box.
If you want to monitor all jobs run under a specific SQL Agent with the same thresholds, consider enabling and adjusting this monitor. By default, the warning threshold is 60 minutes and the critical threshold is 120 minutes. If you want to control and override individual agent jobs that need longer run-times than others, leave this monitor disabled and configure the SQL MP to discover and monitor individual SQL Agent jobs (covered later in this article).

OK, that covers the out of the box defaults and SQL Agent only monitoring. Now, what if I need to go deeper and override settings for individual SQL Agent jobs? You will notice that the SQL Agent Job State view contains no objects.
This is by design, because we don’t discover SQL Agent jobs individually out of the box. Perhaps I have some SQL Agent jobs that I don’t want to monitor at all, and others with especially long run-times that need unique thresholds (such as SQL backups or maintenance on very large databases). In this case, we will want to enable the discovery so that SQL Agent jobs become discovered objects.
To enable the object discovery, go to the Authoring pane and select Object Discoveries. In the upper right corner, select Change Scope, clear everything, and select only “SQL 2008 Agent Job” and “SQL Server 2005 Agent Job”.
These discoveries are disabled by default. Create an override for each, choosing “For all objects of class”, and set Enabled = true. Save these overrides to your custom “SQL Overrides” management pack.
This discovery runs every 4 hours by default, so within 4 hours you should see your SQL Agent Job State view populated with your individual SQL Agent jobs. (Hint: to speed this up, restart the System Center Management service on your agents, as this forces a discovery to run on service startup.)

Now, let’s talk about what we monitor by default for discovered SQL Agent jobs. There are two monitors: Last Run Status and Job Duration.

The “Last Run Status” monitor checks every 10 minutes and remains yellow (warning state) for any job whose last run was not a success. This gives you a near real-time view of the unhealthy jobs that need attention.
The “Job Duration” monitor runs every 10 minutes and looks for jobs that have exceeded a run-time threshold. The defaults are 60 minutes (warning) and 120 minutes (critical). This monitor also does not generate alerts by default; if you want it to generate alerts, you must override that setting.
For example, for my SQL Agent job that performs full backups, I create an override that sets the thresholds higher and enables alerting, and I save it to my SQL Overrides management pack.


As you can see, this gives you job-by-job granularity for SQL Agent jobs.
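
The logic of the two monitors and a per-job override can be sketched roughly as follows. This is only an illustration of the behavior described above, not how the management pack is implemented: the names `Thresholds`, `duration_state`, and `last_run_state` are hypothetical, the override values are made up, and the use of a strict "greater than" comparison is an assumption.

```python
from dataclasses import dataclass

# Defaults quoted in the text: 60 minutes (warning), 120 minutes (critical).
DEFAULT_WARNING_MIN = 60
DEFAULT_CRITICAL_MIN = 120


@dataclass
class Thresholds:
    warning_min: int = DEFAULT_WARNING_MIN
    critical_min: int = DEFAULT_CRITICAL_MIN


def duration_state(job_name, running_minutes, overrides=None):
    """Return 'healthy', 'warning' or 'critical' for a job's current run-time."""
    # Use the job's own thresholds if an override exists, else the defaults.
    limits = (overrides or {}).get(job_name, Thresholds())
    # Assumption: a threshold is breached only when the run-time exceeds it.
    if running_minutes > limits.critical_min:
        return "critical"
    if running_minutes > limits.warning_min:
        return "warning"
    return "healthy"


def last_run_state(last_run_succeeded):
    """Last Run Status: warning unless the last run was a success."""
    return "healthy" if last_run_succeeded else "warning"


# Hypothetical override: a full-backup job that is allowed to run longer.
overrides = {"Full Backup": Thresholds(warning_min=180, critical_min=240)}

print(duration_state("Index Rebuild", 90, overrides))  # warning
print(duration_state("Full Backup", 90, overrides))    # healthy
print(last_run_state(False))                           # warning
```

The same 90-minute run-time is a warning for a job on default thresholds but healthy for the overridden backup job, which is the point of discovering jobs individually.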
You could also easily create groups of SQL Agent jobs based on job name if you wanted to treat all jobs with the same or similar names the same way from an override perspective. Using a dynamic group based on criteria of the job further eases the burden of maintaining overrides.
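
Conceptually, a dynamic group of this kind is just a name-based filter over the discovered jobs. The sketch below illustrates the idea with simple wildcard matching; it is not how OpsMgr evaluates group membership, and the function `group_jobs`, the job names, and the patterns are all hypothetical.

```python
from fnmatch import fnmatchcase


def group_jobs(job_names, group_patterns):
    """Map each group name to the jobs whose names match its wildcard pattern."""
    groups = {group: [] for group in group_patterns}
    for job in job_names:
        for group, pattern in group_patterns.items():
            # Compare in lower case so matching is case-insensitive.
            if fnmatchcase(job.lower(), pattern.lower()):
                groups[group].append(job)
    return groups


jobs = ["Full Backup - Sales", "Full Backup - HR", "Index Rebuild", "Log Backup - Sales"]
patterns = {"Full backups": "full backup*", "Log backups": "log backup*"}

print(group_jobs(jobs, patterns))
```

A job added later with a matching name would fall into the same group automatically, so an override targeted at the group would apply to it without further maintenance.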


Kevin Holman