Sunday, July 31, 2011

DataWarehouse StandardDataSetMaintenance fails due to too many tables


Kevin Holman:

I wrote previously about the most common issues that cause the StandardDataSetMaintenance to fail:
http://blogs.technet.com/b/kevinholman/archive/2010/08/30/the-31552-event-or-why-is-my-data-warehouse-server-consuming-so-much-cpu.aspx

One of those failures can be caused by running SQL 2005 with long retention and large amounts of data being stored.  This can bring about a situation where more than 256 tables exist for a given dataset (state, perf, etc.) and SQL can no longer join that many objects in a single query.  This is a design limitation of SQL 2005.  You might see something like the following:
Log Name: Operations Manager
Source: Health Service Modules
Event Number: 31552
Level: 1
Logging Computer: RMS.OPSMGR.NET
User: N/A
Description:
Failed to store data in the Data Warehouse. Exception 'SqlException': Sql execution failed. Error 777971002, Level 15, State 1, Procedure StandardDatasetAggregate, Line 424, Message: Sql execution failed. Error 777971002, Level 15, State 1, Procedure StandardDatasetAllocateStorage, Line 364, Message: Sql execution failed. Error 777971002, Level 15, State 1, Procedure StandardDatasetBuildCoverView, Line 135, Message: Sql execution failed. Error 106, Level 15, State 1, Procedure vStateHourly, Line 2818, Message: Too many table names in the query. The maximum allowable is 256. One or more workflows were affected by this. Workflow name: Microsoft.SystemCenter.DataWarehouse.StandardDataSetMaintenance Instance name: State data set Instance ID: {4915FC67-912E-4930-4B86-6B4189B3CB73} Management group: PROD

What happens in the warehouse is that data for any given dataset (alert, perf, event, state) gets inserted into tables.  When those tables reach a predefined maximum row count, we start a new table for inserting new data.  We then update a view that joins all these tables together for use in queries.  This way we can index each table in smaller chunks instead of having one single huge table to contain all the data, and query only the tables we actually need – which makes reports run much faster.
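As an aside – if you are curious exactly which tables one of these cover views joins, you can dump its definition with the standard sp_helptext procedure.  This is just a quick read-only check (vStateHourly is the view named in the event above – qualify it with a schema name if yours is not in dbo):
-- Print the definition of the State hourly cover view
exec sp_helptext 'vStateHourly'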
However, if you have a large number of agents, or a massive amount of data coming in, and run on SQL 2005, you can get stuck at the 256-table limit.  To validate that this is your issue, you can run some SQL queries to see how many tables you have now (these are the most common large datasets – you might have to tweak for the one you are after):
select COUNT (*) as AlertTables from sys.tables
where name like 'Alert[_]%'
select COUNT(*) as AlertDetailTables from sys.tables
where name like 'AlertDetail[_]%'
select COUNT(*) as EventTables from sys.tables
where name like 'Event[_]%'
select COUNT(*) as EventDetailTables from sys.tables
where name like 'EventDetail[_]%'
select COUNT(*) as PerfRawTables from sys.tables
where name like 'PerfRaw[_]%'
select COUNT(*) as PerfHourlyTables from sys.tables
where name like 'PerfHourly[_]%'
select COUNT(*) as StateRawTables from sys.tables
where name like 'StateRaw[_]%'
select COUNT(*) as StateHourlyTables from sys.tables
where name like 'StateHourly[_]%'
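If you would rather see all of them in one pass, here is a grouped version.  This is just a convenience sketch – it keys off the BaseTableName_suffix naming convention, so any unrelated table containing an underscore will show up in the results too:
-- Count dataset tables grouped by their base name prefix
select LEFT(name, CHARINDEX('_', name) - 1) as BaseTableName,
COUNT(*) as TableCount
from sys.tables
where CHARINDEX('_', name) > 1
group by LEFT(name, CHARINDEX('_', name) - 1)
order by TableCount desc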
In the event logged above, I can see that in this case it is vStateHourly having the issue – so it is likely my StateHourly tables that are over the limit:
select COUNT(*) as StateHourlyTables from sys.tables
where name like 'StateHourly[_]%'
If this result is over 256 – there is my issue.

What are my options now???

Well, I’m glad you asked!

First and foremost – we need to get StandardDataSetMaintenance working again.  The quickest way to recover immediately is to reduce the retention for this dataset.  Then, when DS maintenance runs again, the grooming job will drop the excess tables and DS maintenance will be able to complete.  To set/reduce the retention – see my post at:  http://blogs.technet.com/b/kevinholman/archive/2010/01/05/understanding-and-modifying-data-warehouse-retention-and-grooming.aspx
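If you just want to eyeball the current retention before changing anything, the settings live in the StandardDatasetAggregation table.  Here is a read-only sketch, assuming the table and column names found in an R2 warehouse (the post above covers this in detail):
-- View current retention in days per dataset and aggregation type
-- AggregationTypeId: 0 = raw, 20 = hourly, 30 = daily
select ds.DatasetDefaultName, sda.AggregationTypeId, sda.MaxDataAgeDays
from StandardDatasetAggregation sda
join Dataset ds on ds.DatasetId = sda.DatasetId
order by ds.DatasetDefaultName, sda.AggregationTypeId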

Now that the emergency is over…. we need a better long term resolution.  Here are our options:

  1. Upgrade the SQL instance housing the Data Warehouse to SQL 2008 SP1 or later.
  2. Tune the data coming in and reduce the amount of data being written to the tables.
  3. Reduce retention significantly enough so that the issue does not recur.
  4. Increase the number of data rows per table, resulting in larger tables but fewer of them.

Now – sometimes #1 is just not an option in the near term.  And what if you have already covered #2 but are still seeing too much data (large agent counts)?  That’s where #3 and #4 come in.
However, perhaps you cannot reduce retention any further because of reporting SLAs.

This leaves us with number 4 – and the purpose of this blog post.

The amount of rows in a dataset table is controlled by a setting in the StandardDatasetAggregationStorage table:
select BaseTableName, MaxTableRowCount
from StandardDatasetAggregationStorage
where MaxTableRowCount is NOT NULL
order by BaseTableName
Here are the default results seen in a new install of R2:
BaseTableName           MaxTableRowCount
Alert                   1000000
AlertDetail             1000000
AlertParameter          1000000
AlertResolutionState    1000000
CMAemApplicationDaily   10000000
CMAemDaily              10000000
CMAemErrorGroupDaily    10000000
CMAemRaw                10000000
Event                   15000000
EventDetail             15000000
EventParameter          15000000
EventRule               15000000
PerfDaily               10000000
PerfHourly              10000000
PerfRaw                 10000000
StateDaily              5000000
StateHourly             15000000
StateRaw                5000000

This will show you the MaxTableRowCount for each dataset.  What you can do is potentially increase the setting for MaxTableRowCount to a larger number – ONLY if you have first attempted to resolve the condition using #1 through #3 above.
***Also take note – if you initially deployed your warehouse on RTM and then upgraded to SP1 or R2, the default values for some of these have changed, but they are not updated during the upgrade process.  So if your value is not the current default as seen above, you might consider resolving the issue by updating to the R2 defaults.
RTM defaults:

BaseTableName    MaxTableRowCount
Event            5000000
EventDetail      5000000
EventParameter   5000000
EventRule        5000000
StateHourly      5000000

R2 defaults:

BaseTableName    MaxTableRowCount
Event            15000000
EventDetail      15000000
EventParameter   15000000
EventRule        15000000
StateHourly      15000000

Here is an example of a SQL statement which will modify these most common tables:
***Note – Back up your warehouse before ever considering modifying anything in an OpsMgr database.  Modifying a SQL table directly is NOT SUPPORTED unless under the direct supervision of a Microsoft Support representative via a support case.
UPDATE StandardDatasetAggregationStorage
SET MaxTableRowCount=15000000
WHERE BaseTableName='StateHourly'
UPDATE StandardDatasetAggregationStorage
SET MaxTableRowCount=15000000
WHERE BaseTableName='Event'
UPDATE StandardDatasetAggregationStorage
SET MaxTableRowCount=15000000
WHERE BaseTableName='EventDetail'
UPDATE StandardDatasetAggregationStorage
SET MaxTableRowCount=15000000
WHERE BaseTableName='EventParameter'
UPDATE StandardDatasetAggregationStorage
SET MaxTableRowCount=15000000
WHERE BaseTableName='EventRule'
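Once the updates have run, re-check that the new values took effect – this simply re-uses the query from earlier against the same table:
-- Confirm the updated MaxTableRowCount values
select BaseTableName, MaxTableRowCount
from StandardDatasetAggregationStorage
where BaseTableName in ('StateHourly','Event','EventDetail','EventParameter','EventRule')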

How big is too big?

For instance – what if you are still hitting the issue with too many StateHourly tables – and you are already set at the R2 default of 15 million rows?  How big can you set it?
Well, there is no hard limit.  You can try doubling the existing value, such as setting it to 30 million rows.  The pro… this will likely resolve your problem.  The con?  Index maintenance will now take longer to complete, as each table will be MUCH larger – so you have to be prepared for more warehouse impact during index operations.
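Purely as an illustration – and with the same backup and support caveats as above – doubling StateHourly from the R2 default would look like this:
-- Example only: double StateHourly from the R2 default of 15 million rows
UPDATE StandardDatasetAggregationStorage
SET MaxTableRowCount=30000000
WHERE BaseTableName='StateHourly'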
When you can – always try to resolve this issue by attacking the other methods first:
  1. Upgrade the SQL instance housing the Data Warehouse to SQL 2008 SP1 or later.
  2. Tune the data coming in and reduce the amount of data being written to the tables.
  3. Reduce retention significantly enough so that the issue does not recur.
