Friday, 30 December 2011

Logical Disk free space alerts don’t show percent and MB free values in the alert description

I recently wrote about the new Base OS Monitoring Packs that shipped, adding many new features and fixes for monitoring the OS. You can read more about that new release HERE. While this MP update contained many fixes and new features which are VERY beneficial in making alerts more actionable by controlling “false positives”, some of these modifications left a bit of a negative side effect.
One of the areas this new MP focused on was changing a lot of the “average threshold” monitors to “consecutive sample” monitors. This helps control the noise when there are short-term fluctuations in a performance value, or when some counters spike tremendously for a very short time, skewing the average. So for the most part, changing these over to consecutive samples is a good thing. That said, one of the changes made was to the Logical Disk free space monitors, both for Windows Server 2003 and 2008 disks.
The script used to monitor logical disk free space in previous versions of the Monitoring Pack would output two additional propertybag values, for free space in MB and in percent. This was very useful, because these values could easily be added to the alert description, alert context, and health explorer, so the consumer of the alert in a notification knew precisely how much space was left for each and every alert generated. Here are some examples of how it looked previously:

image
image
image
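
For readers who have not wired property bag values into an alert before, the fragment below is a rough sketch of the general mechanism, not the actual XML from the Base OS MP. The property names `PctFree` and `MBFree` and the string resource ID are hypothetical placeholders; the real names depend on what the script writes into its property bag.

```xml
<!-- Sketch only: PctFree, MBFree and the AlertMessage ID are hypothetical names -->
<AlertSettings AlertMessage="Example.LogicalDisk.FreeSpace.AlertMessage">
  <AlertOnState>Warning</AlertOnState>
  <AutoResolve>true</AutoResolve>
  <AlertPriority>Normal</AlertPriority>
  <AlertSeverity>MatchMonitorHealth</AlertSeverity>
  <AlertParameters>
    <!-- Replaces {0} in the alert description text -->
    <AlertParameter1>$Data/Context/Property[@Name='PctFree']$</AlertParameter1>
    <!-- Replaces {1} in the alert description text -->
    <AlertParameter2>$Data/Context/Property[@Name='MBFree']$</AlertParameter2>
  </AlertParameters>
</AlertSettings>
```

The alert description in the language pack then refers to these values by position, for example "{0}% free, {1} MB free". If the script stops returning the properties, the placeholders have nothing to resolve against.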

Now – when the new MP shipped – this script was changed to support the new consecutive samples monitortype, and was completely re-written. When it was rewritten, the script no longer returned these propertybags, so they were removed from the alert description, alert context, and health explorer. The current MP (6.0.6958.0) looks like this:
image
The monitor still works perfectly as designed, and you are alerted when thresholds that you set are breached. The only negative side effect is the loss of information in the alert description.
Several customers have indicated that they would prefer to have these values back in the alert description. The only real way to handle this scenario, until the signed and sealed MP gets updated at some point in the future, is to disable the built-in monitor, and enable a new monitor with an alert description that you like.
I have written two addendum MPs, attached at the bottom of this article, which do exactly that. I created two new monitors (essentially exact copies of the monitors from the previous version of the Base OS MPs) and included two overrides which disable the existing monitors from the sealed MPs. The new monitors run once per hour and have all the default settings from the previous monitors.
With the addendum MP imported – health explorer looks like the following:
image
Note the new name for the addendum monitor, and the fact that the existing “Logical Disk Free Space” monitor is unloaded as it is disabled via override.
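
As a hedged illustration of what such a disabling override can look like in MP XML, here is a minimal sketch for the Windows Server 2008 monitor. The override ID is a made-up placeholder, and the `Win2008` and `Win2008Mon` reference aliases are assumptions borrowed from the diagnostic example later in this post; they must match whatever aliases your own MP declares in its references section.

```xml
<Overrides>
  <!-- Sketch only: the ID is hypothetical; the aliases must match your MP's <References> -->
  <MonitorPropertyOverride ID="Example.Disable.LogicalDisk.FreeSpace.Override"
                           Context="Win2008!Microsoft.Windows.Server.2008.LogicalDisk"
                           Enforced="false"
                           Monitor="Win2008Mon!Microsoft.Windows.Server.2008.LogicalDisk.FreeSpace"
                           Property="Enabled">
    <!-- Turns the sealed monitor off for every instance of the target class -->
    <Value>false</Value>
  </MonitorPropertyOverride>
</Overrides>
```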

These addendum MPs for Windows Server 2003 and Windows Server 2008 each simply include a script datasource, monitortype, and monitor to use instead of the items in the current sealed Base OS MPs. These addendum MPs are unsealed, so you have two options:
  1. Leave them unsealed, and use them as-is. This allows you to tweak the monitor names, alert descriptions, and any other settings further.
  2. Seal the MPs with your own key (recommended) after making any adjustments that you desire. This will be necessary in order to create overrides for existing groups in other MPs should you desire to use those.

One caveat to understand is that any overrides you have created on the existing Base OS free space monitors will have to be re-created on these new ones. There is no easy workaround for that.
Let me know if you have any issues using these addendum MP’s (which are provided as a sample only) and I will try to address them.

Credits – to Larry Mosley at Microsoft for doing most of the initial heavy lifting writing the workaround MP.

Kevin Holman

Microsoft.Windows.Server.LogicalDisk.Addendum.zip   

Danielle Grandini

I want to follow a different approach to achieve a comparable, though not identical, result. The goal is not to modify the original code but rather to add a diagnostic and a task to the new monitors that get the MB and % free space. The major difference from Kevin’s solution is that you won’t have the data in the alert description but in the health explorer state change context; on the other hand, you should be fairly independent of any new OS MP release.
But before digging into the diagnostic code I want to make a few points (not necessarily ordered):
  • a diagnostic is a probe that gets executed when a monitor changes its health state from healthy to warning or error. A diagnostic should not change the system state
  • the new monitors lost the ability to report on disk free space because the MP author decided to keep the old code and then chain a filter module that changes the state only if the disk stays under threshold for n (4) samples. Since there’s no generic filter module to do this in OpsMgr, the author transformed the data into performance data and then used the performance-specific filter System.Performance.ConsecutiveSamplesCondition. This highlights two annoyances:
    • the lack of generic filter modules for non-performance data
    • the need, to overcome this limitation, to implement persistence, when it’s needed, in every single script. The MP author should have chosen this way to implement the new monitor.
But let’s return to the diagnostic. We need:
  • a probe to return disk data (% free space, MB free and anything else we think can be useful)
  • a couple of diagnostics for the warning and error states
  • a task, since it comes for free once we get the probe done
The net effect is the following:
image
Once you have the probe the syntax for the diagnostic is as follows:
<Diagnostics>
      <Diagnostic ID="Progel.Windows.Server.2008.LogicalDisk.FreeSpace.Error.Diagnostic" Comment="List current disk allocation." Accessibility="Public" Enabled="true"
                  Target="Win2008!Microsoft.Windows.Server.2008.LogicalDisk" Monitor="Win2008Mon!Microsoft.Windows.Server.2008.LogicalDisk.FreeSpace" ExecuteOnState="Error" Remotable="true" Timeout="300">
        <Category>Maintenance</Category>
        <ProbeAction ID="PA" TypeID="QND.Library.DiskSpaceGet.PT">
          <ComputerName>$Target/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
          <DiskLabel>$Target/Property[Type="Windows!Microsoft.Windows.LogicalDevice"]/DeviceID$</DiskLabel>
          <ScriptTimeout>120</ScriptTimeout>
        </ProbeAction>
      </Diagnostic>
      <Diagnostic ID="Progel.Windows.Server.2008.LogicalDisk.FreeSpace.Warning.Diagnostic" Comment="List current disk allocation." Accessibility="Public" Enabled="true"
                  Target="Win2008!Microsoft.Windows.Server.2008.LogicalDisk" Monitor="Win2008Mon!Microsoft.Windows.Server.2008.LogicalDisk.FreeSpace" ExecuteOnState="Warning" Remotable="true" Timeout="300">
        <Category>Maintenance</Category>
        <ProbeAction ID="PA" TypeID="QND.Library.DiskSpaceGet.PT">
          <ComputerName>$Target/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
          <DiskLabel>$Target/Property[Type="Windows!Microsoft.Windows.LogicalDevice"]/DeviceID$</DiskLabel>
          <ScriptTimeout>120</ScriptTimeout>
        </ProbeAction>
      </Diagnostic>
    </Diagnostics>
Note that a diagnostic is state-specific, so you need two different diagnostics: one for the Error state and one for the Warning state. All the other parameters are pretty straightforward.
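
The task mentioned earlier can reuse the same probe. The fragment below is only a sketch of how that might look, not the task from the actual MP: the task ID is a hypothetical name, and the probe configuration is simply copied from the diagnostics above on the assumption that QND.Library.DiskSpaceGet.PT takes the same three parameters in a task context.

```xml
<Tasks>
  <!-- Sketch only: the task ID is hypothetical; the probe and its parameters mirror the diagnostics -->
  <Task ID="Progel.Windows.Server.2008.LogicalDisk.FreeSpace.Task" Accessibility="Public" Enabled="true"
        Target="Win2008!Microsoft.Windows.Server.2008.LogicalDisk" Timeout="300" Remotable="true">
    <Category>Maintenance</Category>
    <ProbeAction ID="PA" TypeID="QND.Library.DiskSpaceGet.PT">
      <ComputerName>$Target/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
      <DiskLabel>$Target/Property[Type="Windows!Microsoft.Windows.LogicalDevice"]/DeviceID$</DiskLabel>
      <ScriptTimeout>120</ScriptTimeout>
    </ProbeAction>
  </Task>
</Tasks>
```

Unlike a diagnostic, a task is not tied to a monitor or a health state, so a single definition covers both the warning and error cases and can be run on demand against any logical disk.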
