National Environmental Satellite, Data, and Information Service. National Climatic Data Center, U.S. Department of Commerce
Global Historical Climatology Network - Daily

Methods - Quality Control

During each reprocessing cycle, the data are first passed through a format-checking program that looks for problems such as impossible months or days and invalid characters in data fields. When it finds such a problem, the routine sets the offending records to missing. The primary purpose of this program is to ensure that the data integration procedures neither introduce nor retain records that violate the documented GHCN-Daily data format. Next, a comprehensive sequence of fully automated QA procedures identifies daily values that violate one of the quality tests. Described below and in greater detail in Durre et al. (2010), these tests identify a variety of data problems, including the excessive duplication of data records; exceedance of physical, absolute, and climatological limits; excessive temporal persistence; excessively large gaps in the distributions of values; internal inconsistencies among elements; and inconsistencies with observations at neighboring stations. This system flags approximately 0.3% of nearly 2 billion data values, and it has been estimated that 98-99% of the flagged values are true data errors and only 1-2% are false positives (Durre et al. 2010). This level of performance was achieved through careful selection and evaluation of procedures and test thresholds using the techniques described by Durre et al. (2008). The tests are as follows (see the readme.txt file for a list of the flags assigned when a particular test fails):

  • Trace flag consistency check - Checks for days on which the data measurement flag indicates a trace yet the amount is nonzero. Applies to precipitation, snowfall, snow depth, evaporation, water equivalent of snow on the ground, and wind movement.
  • Naught check - Checks for days on which maximum and minimum temperature are both equal to 0°C at stations not operated by the United States or are both equal to -17.8°C (0°F) at United States stations.
  • Duplicate data check - Checks for duplication of the data between entire years, between different years in the same calendar month, between different months within the same year, and between maximum and minimum air temperatures within the same month. Applies to air, evaporation pan, and soil temperatures, precipitation, and snowfall. In the case of precipitation and snowfall, at least three nonzero values must be available during the month (zeros are ignored).
  • World record exceedance check - Identifies values that fall outside the world extremes for the highest and lowest values ever observed. Applies to all elements except weather types.
  • Streak check - Checks for unrealistic sequences of identical values in time series of nonmissing values (or of nonmissing, nonzero values in the case of precipitation). Flags sequences of
    • 20 or more consecutive identical values in time series of nonmissing daily maximum, minimum, and observation-time air temperature;
    • 20 or more consecutive identical values in time series of nonmissing and nonzero precipitation observations;
    • 10 or more consecutive identical nonzero values in time series of nonmissing snowfall totals; and
    • 90 or more consecutive identical nonzero values in time series of nonmissing snow depth values.
  • Frequent-value check (precipitation only) - Checks for clusters of 5-9 identical moderate to heavy daily totals in time series of nonzero precipitation observations.
  • Gap check - Identifies unrealistic breaks in the period-of-record distribution of elements for a particular calendar month. Flags:
    • maximum/minimum air, evaporation pan, or soil temperatures that are at least 10°C warmer or colder than all other corresponding maximum/minimum temperatures for a given station and calendar month.
    • precipitation values that are at least 300 mm larger than all other precipitation totals for a given station and calendar month.
    • snow depth values that are at least 35 cm larger than all other reported snow depths for a given station and calendar month.
  • Z-score-based climatological outlier check - Checks for daily surface air maximum and minimum temperatures that exceed the respective 15-day climatological means by at least six standard deviations.
  • Percentile-based climatological outlier check - Checks for daily precipitation totals that exceed the respective 29-day climatological 95th percentiles by at least a certain factor (9 when the day's mean temperature is above freezing, 5 when it is below freezing).
  • Internal temperature consistency check - Checks for consistency among maximum, minimum, and time of observation temperature within a three-day window. Applies to air, evaporation pan, and soil temperatures.
  • Temporal consistency check (spike or dip) - Checks whether a daily maximum (minimum) temperature exceeds the maximum (minimum) temperatures on the preceding and following days by more than 25°C.
  • Lagged temperature range check - Identifies maximum temperatures that are at least 40°C warmer than the minimum temperatures on the preceding, current, and following days as well as minimum temperatures that are at least 40°C colder than the maximum temperatures within the three-day window.
  • Consistency check between evaporation pan temperatures and surface air temperatures (flags pan temperature only) - Checks for inconsistencies between:
    • Maximum surface air temperature and minimum evaporation pan temperature;
    • Maximum evaporation pan temperature and minimum surface air temperature;
    • Maximum evaporation pan temperature and maximum surface air temperature plus 10°C;
    • Minimum evaporation pan temperature and minimum surface air temperature minus 10°C.
  • Snow-temperature consistency (warm) check - Checks for nonzero snowfall totals that occur when daily minimum temperatures at the same station are equal to or warmer than 7°C.
  • Snowfall to snow depth increase consistency check - Checks for days on which the increase in snow depth from the previous day to the current day exceeds the current+previous and current+following days' snowfall total by more than 25 mm.
  • Snowfall (or snow depth increase) to precipitation ratio check - Checks for cases in which snowfall (or snow depth increase) is excessively large compared to precipitation, that is, if the current day's snowfall (or snow depth increase) is more than 100 times larger than both the current+previous and current+following days' precipitation sums. If so, the current day's precipitation and snowfall (or snow depth increase) totals fail the check on the preceding, current, and following days.
  • Spatial consistency check (regression) - Checks for temperatures that differ greatly from a linear-regression-based estimate generated from neighboring values. A target temperature is flagged when the regression-based predicted value differs by more than 8°C from the observed value, and the standardized residual of the predicted value exceeds 4 standard deviations on the target day.
  • Spatial consistency check (corroboration of anomalies) - Checks for temperatures whose anomalies differ by more than 10°C from the anomalies at neighboring stations on the preceding, current, and following days.
  • Spatial consistency check (corroboration of precipitation amounts and percentiles) - Checks for precipitation totals that differ significantly from totals (and percentiles) reported at neighboring stations on the preceding, current, and following days.
  • Spatial consistency check (snow to minimum temperatures) - Checks for snowfall or snow depth increases when all neighboring stations reported a minimum temperature greater than 7°C on the preceding, current, and following days.
  • Megaconsistency check - Flags:
    • daily maximum surface air temperatures that are less than the lowest minimum surface air temperature for the respective station and calendar month;
    • daily minimum temperatures that are greater than the highest maximum temperature for the station and calendar month;
    • observation-time temperatures that are higher than the highest maximum temperature or lower than the lowest minimum temperature for the station and calendar month;
    • daily maximum evaporation pan temperatures that are less than the lowest minimum evaporation pan temperature for the respective station and calendar month, less than the lowest minimum surface air temperature for the respective station and calendar month, or more than 10°C above the highest surface air temperature for the respective station and calendar month;
    • daily maximum evaporation pan temperatures that are less than the lowest minimum temperature for the respective station and calendar month;
    • daily minimum evaporation pan temperatures that are greater than the highest maximum evaporation pan temperature, greater than the highest maximum surface air temperature, or more than 10°C below the lowest minimum surface air temperature for the station and calendar month;
    • daily maximum soil temperatures that are less than the lowest minimum soil temperature for the station, calendar month, groundcover, and depth;
    • daily minimum soil temperatures that are greater than the highest maximum soil temperature for the station, calendar month, groundcover, and depth;
    • nonzero snowfall and snow depth values for stations in calendar months whose lowest reported minimum temperature is 7°C or warmer (the check is applied only if there are at least 140 daily minimum temperatures for the station and calendar month);
    • warm season nonzero snowfall totals at stations where no valid cold season snowfall was ever reported; and
    • warm season nonzero snow depths at stations where no valid cold season snow depth was ever reported. (The warm season is defined as May-September in the Northern Hemisphere and April-October in the Southern Hemisphere. The remaining months of the year comprise the cold season).
  • Date-based climatological outlier check for snowfall and snow depth - Flags snowfall and snow depth values that fall outside their respective plausible seasons as determined from respective observations at the station and neighboring stations within 1° latitude of the station. This check is designed to remove nonzero observations in locations/seasons where snow is not plausible but that are not flagged by any other check. Note that this check has a higher false-positive rate (50% for snowfall and 75% for snow depth) than the GHCN-Daily standard of less than 20%. The intent is to replace this check with one that is more efficient.
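
As an illustration of the streak check listed above, the following is a minimal Python sketch, not the operational GHCN-Daily code. The function name `flag_streaks`, the `STREAK_THRESHOLDS` table, and the use of `None` for missing values are assumptions made for this example; only the run lengths (20, 20, 10, 90) come from the list above. Missing values are dropped before runs are counted, which is one reading of "time series of nonmissing values".

```python
from itertools import groupby
from typing import List, Optional

# Run lengths taken from the streak check description; the keys are
# hypothetical labels chosen for this sketch.
STREAK_THRESHOLDS = {"temperature": 20, "precipitation": 20,
                     "snowfall": 10, "snow_depth": 90}

def flag_streaks(values: List[Optional[float]], min_run: int,
                 ignore_zero: bool = False) -> List[bool]:
    """Flag every member of a run of at least min_run identical values.

    Missing values (None) are always skipped; zeros are skipped too when
    ignore_zero is True (precipitation, snowfall, snow depth).
    """
    # Keep (position, value) pairs for the observations that take part.
    kept = [(i, v) for i, v in enumerate(values)
            if v is not None and not (ignore_zero and v == 0)]
    flags = [False] * len(values)
    # groupby groups consecutive retained observations with equal values.
    for _, run in groupby(kept, key=lambda pair: pair[1]):
        members = list(run)
        if len(members) >= min_run:
            for i, _ in members:
                flags[i] = True
    return flags
```

A call such as `flag_streaks(precip, STREAK_THRESHOLDS["precipitation"], ignore_zero=True)` would apply the precipitation variant to a hypothetical list `precip`.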
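
The gap check listed above can be sketched in a similar way. The version below assumes the input is every value of one element for a single station and calendar month, expressed in the units of the description (°C, mm, or cm). It sorts the values, walks outward from the median, and flags everything beyond the first gap that reaches the threshold. The outward walk from the median and the name `gap_flags` are assumptions of this sketch; the thresholds (10°C, 300 mm, 35 cm) are those stated in the list.

```python
from typing import List

def gap_flags(values: List[float], min_gap: float,
              both_tails: bool = True) -> List[bool]:
    """Flag values separated from the rest of the distribution by min_gap.

    both_tails=True suits temperatures (too warm or too cold);
    both_tails=False checks only the upper tail, as for precipitation
    and snow depth.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    flags = [False] * len(values)
    mid = len(order) // 2
    # Walk upward from the median and stop at the first large gap.
    for k in range(mid + 1, len(order)):
        if values[order[k]] - values[order[k - 1]] >= min_gap:
            for j in order[k:]:
                flags[j] = True
            break
    if both_tails:
        # Walk downward from the median in the same way.
        for k in range(mid - 1, -1, -1):
            if values[order[k + 1]] - values[order[k]] >= min_gap:
                for j in order[:k + 1]:
                    flags[j] = True
                break
    return flags
```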
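
For the z-score-based climatological outlier check, a simplified sketch follows. It assumes the caller has already gathered `window_sample`, the station's values of the same element from the 15-day window around the target calendar day in all available years. It uses the ordinary mean and sample standard deviation from Python's `statistics` module, which may differ from the estimators used operationally, and it treats departures in either direction as outliers. The function name and both of these choices are assumptions of this example.

```python
from statistics import mean, stdev
from typing import Sequence

def is_zscore_outlier(value: float, window_sample: Sequence[float],
                      threshold: float = 6.0) -> bool:
    """Return True when value lies at least `threshold` standard
    deviations from the mean of window_sample."""
    if len(window_sample) < 2:
        return False  # too few values to estimate a spread
    mu = mean(window_sample)
    sigma = stdev(window_sample)
    if sigma == 0:
        return False  # constant sample: a z-score is undefined
    return abs(value - mu) / sigma >= threshold
```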
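
The temporal consistency (spike or dip) check compares each day with its two neighbors. The sketch below takes a daily series of one element (maximum or minimum temperature, in °C, with `None` for missing days) and flags a day when it is more than 25°C above both neighbors (spike) or more than 25°C below both (dip). The function name and the decision to skip days with a missing neighbor are assumptions of this example.

```python
from typing import List, Optional

def spike_dip_flags(temps: List[Optional[float]],
                    threshold: float = 25.0) -> List[bool]:
    """Flag days that differ from both adjacent days by more than
    threshold in the same direction."""
    flags = [False] * len(temps)
    for i in range(1, len(temps) - 1):
        prev, cur, nxt = temps[i - 1], temps[i], temps[i + 1]
        if prev is None or cur is None or nxt is None:
            continue  # all three days are needed for the comparison
        d_prev = cur - prev
        d_next = cur - nxt
        spike = d_prev > threshold and d_next > threshold
        dip = d_prev < -threshold and d_next < -threshold
        flags[i] = spike or dip
    return flags
```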
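
The lagged temperature range check can be sketched along the same lines. The version below takes aligned daily series of maximum and minimum temperature (°C, `None` for missing) and flags a maximum that is at least 40°C warmer than every available minimum in the three-day window, and a minimum that is at least 40°C colder than every available maximum in that window. How the check treats missing days inside the window is not stated above, so using whichever values are present is an assumption of this sketch, as is the function name.

```python
from typing import List, Optional, Tuple

def lagged_range_flags(tmax: List[Optional[float]],
                       tmin: List[Optional[float]],
                       threshold: float = 40.0
                       ) -> Tuple[List[bool], List[bool]]:
    """Return (max_flags, min_flags) for the three-day range test."""
    if len(tmax) != len(tmin):
        raise ValueError("tmax and tmin must cover the same days")
    n = len(tmax)
    max_flags = [False] * n
    min_flags = [False] * n
    for i in range(n):
        # Preceding, current, and following day, clipped at the ends.
        window = range(max(0, i - 1), min(n, i + 2))
        mins = [tmin[j] for j in window if tmin[j] is not None]
        maxs = [tmax[j] for j in window if tmax[j] is not None]
        if tmax[i] is not None and mins:
            max_flags[i] = all(tmax[i] - m >= threshold for m in mins)
        if tmin[i] is not None and maxs:
            min_flags[i] = all(x - tmin[i] >= threshold for x in maxs)
    return max_flags, min_flags
```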

http://www.ncdc.noaa.gov/oa/climate/ghcn-daily/index.php
Created by Jon.Burroughs@noaa.gov
Downloaded Tuesday, 30-Sep-2014 19:59:06 EDT
Last Updated Thursday, 10-Sep-2009 10:59:38 EDT by ron.ray@noaa.gov