|
Overview of Data Tools
Every night a copy of the production data is captured for data warehousing. Data warehousing allows reports to run against a day old copy of the data rather than competing for the production databases, which would adversely affect the performance of data entry applications.
Data Warehousing has two parts: data marts and summarized totals.
Data marts are the copies of the source data, somewhat transformed to facilitate reporting, these records are also known as "fact tables."
The pre-summed totals are stored in "data cubes" which allow reports with totals by county, or program and so forth to run within seconds instead of minutes. Data quality checks are run on "cubed" data to ensure that all the required fields are present and conform to lookup table values.
There are times when it is necessary to run queries against the production database, and therefore some query tools are provided for this purpose. However, the production query tools restrict any adverse performance impacts. Therefore, the filter options are more limited than those you will see in query tools available for the warehouse.
The most flexible query tools with the most options are those that query the warehouse data. The databases being queried are very large and the volume of data being returned may not be appropriate for the destination tools. For example, Excel has a limit of approximately 60,000 records. Pivot tables are limited by the memory capacity of your client machine. Browsers have limitations on the amount of data which can be displayed as well. It is important to use the filter tools to limit the number of records returned as much as possible.
Basic definitions of the following terms will help you understand the data available for reports.
Production database - The production database is used to enter records for PRS.
Warehouse database - The warehouse database is used to store non-NCP data and a day old copy of the NCP database data.
NCP database - The National Conservation Practices database contains information from Toolkit and PRS on the plans and practices both planned and applied that have been reported.
Grid tables - Grid tables are used to display production data, such as practice data as yet unreported to NCP, and non-NCP data, such as performance measure data and state and local measures data.
Pivot tables - Pivot tables present data in a more complex and flexible format than grids.
Summary Pivot table - Summary pivot tables present aggregate data in the pivot table format.
Additional information about each of these terms will be presented in the context of the individual data viewing options.
Three options on the Quality Assurance Tools menu allow you to view production data. These options are intended for locating small amounts of data and are restricted to minimize the impact of the query on production run times. The main function of these tools is to access unreported data and limited data.
Menu Options:
View Conservation Plans (on Quality Assurance Tools list)
View Land Units (on Quality Assurance Tools list)
View Practices options (on Quality Assurance Tools list)
Features:
Allows you to view practice records that have not been reported and therefore are not in the NCP data base
Allows you to view records that are less than one day old (i.e., not yet in the data warehouse)
View data errors reports on errors found in the last version of the data warehouse. Fix in the production NCP and corrections show up in the warehouse the next day
Restrictions:
Limits the number of plan records returned to 100
Limits the number of land unit records returned to 100
Limits the number of practice records returned to 100
If the quantity of results is too great, the system
will return an error message without displaying any results. The message
reads:
National Conservation Planning (NCP) database server has timed out
due to the heavy server load and the complexity of the query. Please try
again by restructuring or simplifying the query or try again later when
the NCP database may be less busy.
Recommendations:
Specify as much detail on the filter screens as possible to obtain less than 100 records
Use to find single or limited transactions
Warehouse Data presented in Grid Tables
The main function of these tools is to locate data that may need editing.
Menu Options:
View NCP Data Errors (on Quality Assurance Tools list)
View Prior Entries (on Quality Assurance Tools list)
Data Checker (on Quality Assurance Tools list;
provides sub-options)
Only FY2006 sub-options shown below.
View Plans
View Land Units
View Practices
View CSG Impacts
Features:
Use View NCP Data Errors to locate and identify records with NCP Data Errors
Use View Prior entries to research non-NCP data.
Data checker allows queries of NCP data that is not reported
Data checker returns data to an HTML browser screen. It allows selection of the plan for entry into the data entry application for editing current fiscal year records
Data Checker allows queries of snapshots of past fiscal year warehouses taken at the end of the fiscal year when data entry was locked for that year.
Warehouse Data presented in Pivot Tables
The main function of these tools is to report and summarize large quantities of records.
Menu Options:
Data Viewer options for both PRS non NCP practice data and also NCP data (only FY2006 options shown below; Data Viewer uses warehouse data for past years non-practice data also)
PRS Standard Measures
PRS State and Local Measures
NCP Cubes Summary Reports – pre-summed NCP data from the warehouse stored in cubes
Data Viewer Features:
Data Viewer allows you to query PRS state & local measure data in a pivot
Data viewer allows you to view the NCP cubes in a pivot
Data viewer allows you to view NCP practices records in a pivot
Queries warehoused data
Allows queries of past fiscal year warehouses.
Allows you to download data to Excel or to an XML file
Transaction data (detail data, user-selected cubes) allows customized viewing
NCP Reports
Summary (aggregated) data (summary data, pre-cubed for user) allows reporting of large quantities of records.
Summary Reports
Download options bypass displaying large quantities of records onscreen
Data Viewer Restrictions:
Only queries reported NCP records – not unreported records
Only returned records are included on Excel or XML downloads
Additional fields can be included in downloads such as FIPS code or lat/long that would be useful in offline tools such as MSAccess or ARCview
In some cases, the user may receive a timeout error, if the user's query results in an attempt to return an excessive number of records.
Download options allow large quantities of data to be passed to the user within following constraints:
Excel is limited to 65,535 rows, and 255 columns
XML files are not limited by file type
User hardware can be a limiting factor—both cache and hard drive
Excessive download times