National Overview - February 2014
Climate Dataset Transition

Effective with this (February 2014) monthly climate report, NCDC transitions to an improved instance of its climate division dataset, named nClimDiv. A more complete description of climate divisions, the history of climate division data, and the scientific and practical implications of this transition is available at the climate division reference page.

This page answers some of the questions raised by the applied climatology and service climatology communities as they previewed the new nClimDiv dataset. Many of these answers contain generalizations that don't hold for all parts of the CONUS, or for every month in the dataset. For specific information related to time of year or region, please see Vose et al., 2014 or use the Intercomparison Tool at the climate division reference page.

Q. What are climate divisions and why are they important?

A. There are 344 climate divisions in the contiguous United States. The climate division temperature and precipitation dataset, as well as derived products, have long been a staple of the applied climate and service climate communities, because they are temporally complete (no missing months) and spatially complete (no missing areas) all the way back to January 1895. Climate division data are used by the public and private sectors in climate-related analyses, such as patterns of energy demand, retail behavior, agricultural production and regional climate-change detection. Please see the climate division reference page for more complete information.

Q. Why transition to a new dataset?

A. The traditional climate division dataset was designed and built almost a quarter-century ago, using the technology and data available at the time. For example, at the time, much of the weather data from the early 20th century and before only existed on paper and not in a digital format. Since then, three major developments have made an improved dataset possible:

  1. NCDC's sustained efforts to "digitize" paper-bound data resulted in the availability of many more stations from the early 20th and late 19th centuries. nClimDiv utilizes this newly-available data.
  2. Computer processing power and communications have improved dramatically in the 21st century. nClimDiv, as a result, can make a much more detailed analysis, and has many more stations available at the time of data processing early in the month.
  3. NCDC research has developed advanced ways to identify, and correct, issues at observing stations around the country. nClimDiv incorporates this advanced science into its analysis.
These three developments result in a climate division dataset that uses many more stations than ever before, applies more advanced computational techniques, and, most importantly, is more accurate.

Q. Are the changes only to the area-wide sets of the data (such as Oklahoma Statewide, or the Oklahoma "Panhandle" climate division), or are they applied directly to station data?

A. These changes do not apply to the stations. Station data are the inputs to the gridded data and the climate division (and statewide, regional and national) values derived from the gridded data. The station data (GHCN-Daily) are managed separately and remain unchanged.

Q. In some statewide graphs, especially in the Western U.S., there is a large increase in annual precipitation added to all years. Was this due to additional stations being added? Changes in station data? Are you assuming that the relationship of elevation to precipitation and general storm tracks is constant over time?

A. Generally speaking, for divisions/states with high terrain, the climatology (each year's precip) got wetter. In most places, this is primarily due to additional (wetter) higher elevation stations (e.g., SnoTel) being incorporated into the underlying GHCN-Daily station dataset. The relationship between precipitation and elevation should be fairly uniform over time. Storm tracks were not considered explicitly in the methodology.

Q. I understand why past data may be changed due to historic sensor bias, or poor records. Why was the current data changed (last 30 years)? Was this due to problems in the current data, or additional stations and networks?

A. This is an entirely different methodology. The dataset was rebuilt from scratch. The divisional values, which used to be based upon averages of station data, are now based on averages of grid points analyzed from station data. This means that terrain is now a factor in the basic climatology, and so is station distribution. For example, for climate divisions with stations concentrated in one corner of the division or limited to low elevations, the new dataset is based on grid points analyzed across the whole division (and incorporates information from stations in adjacent areas), so even recent data will differ between the datasets.
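The difference between a station average and a grid-point average can be illustrated with a toy calculation. This is only a sketch under loud assumptions: the station values, the grid elevations, and the fixed lapse rate are all hypothetical, and the simple lapse-rate adjustment stands in for the actual nClimDiv gridding method described in Vose et al., 2014, which it does not reproduce.

```python
from statistics import mean

# Hypothetical valley stations: (elevation in feet, mean temperature in °F).
stations = [(1000, 60.0), (1200, 59.0), (1100, 59.5)]

# Hypothetical grid-point elevations (feet) covering the whole division,
# including high terrain where no stations exist.
grid_elevations = [1000, 1500, 3000, 5000, 7000]

# Assumed lapse rate, for illustration only: 3.5 °F cooler per 1000 feet.
LAPSE_F_PER_FT = 3.5 / 1000


def station_average(stations):
    """Old-style value: plain average of the station temperatures."""
    return mean(temp for _, temp in stations)


def grid_average(stations, grid_elevations):
    """Toy grid-style value: reduce stations to a sea-level equivalent,
    then re-apply the lapse rate at every grid point and average."""
    sea_level = mean(temp + LAPSE_F_PER_FT * elev for elev, temp in stations)
    return mean(sea_level - LAPSE_F_PER_FT * elev for elev in grid_elevations)


old_value = station_average(stations)
new_value = grid_average(stations, grid_elevations)
print(f"station average: {old_value:.1f} °F, grid average: {new_value:.1f} °F")
```

In this toy division the grid average comes out cooler than the station average for every year, recent or not, simply because the unsampled high terrain now counts toward the divisional value.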

Q. Temperature in some states decreased more in the early record, and did not change as much in the more recent years. And overall, the whole record was lowered by several degrees. Was this due to errors in the original data, or to additional stations or networks being added? Was the orographic nature of the state a driver? Why was the early record changed much more than the current years of data?

A. There are several factors at play here. They are listed here in rough order of importance. On a region-by-region basis, the respective order of this list may be different.

  • Yes, the orography of a place (how mountainous the place is) plays a role. Just as high-terrain divisions got wetter, high-terrain divisions are also generally cooler from beginning to end in the new dataset. This is because the higher elevations are now better represented in the divisional averages. This is especially true for divisions that had "stations in the valleys but not the mountains" in the traditional dataset.
  • Because few stations were digitally available for the early years when the old CD database was designed, the old dataset was actually built using two methodologies: one for 1895-1930, which uses regression from statewide averages to compute climate division values, and another for 1931-present, which uses stations within climate divisions. The new dataset uses a single, consistent approach for the entire period of record, so differences will be evident when comparing the new with the old for climate divisions in states with significant issues with the pre-1930s data.
  • Many more early 20th Century and late 19th Century observations are now digitally available than when the traditional climate division dataset was assembled.
  • Finally, the new dataset has the benefit of the homogeneity corrections developed for GHCN-Daily (the main dataset upon which the grids and division values are built). These corrections are for station moves, new instruments, etc.

Q. Do you know of any NCDC efforts to offer divisionally-based maximum temperature and minimum temperature time series? Here in the land of the frozen, our winter minimums offer striking examples of upward trends.

A. Yes, Tmax and Tmin are part of nClimDiv and should be available operationally soon after this transition.

Q. How will this change the ranks for states and climate divisions?

A. This raises an important point. For institutions that have relied on the NCDC climate division database to determine their CD and Statewide records (warmest, coolest, wettest, driest), we strongly advise re-checking the results. The absolute values will have almost certainly changed, and the ranks may be affected as well, especially for years prior to 1930.

For example, Oklahoma's climatology cooled relative to Texas, and Oklahoma and Texas are now tied for hottest summer (summer 2011, both 86.8°F). Oklahoma still has the hottest month on record.

Q. Do you anticipate any data updates to the old CD database?

A. The old CD database will be actively maintained until the final December 2013 data have been processed, and kept online for some period (months to years) for quick retrieval. It will also be available semi-permanently (at least 20 years) from the deeper NCDC archive.

Q. Will the State and CD 1981-2010 normals for this new dataset also be released on March 13? Should we just be doing a 30-year average of the various variables from the new dataset?

A. The Statewide and CD 1981-2010 normals will be released very soon. They will be simple, straightforward arithmetic averages of the 30 years 1981 through 2010 for the timeseries (CD, State, etc.) in question. That's what we'll be using for the 30-year normal. These should scale appropriately from climate division through state to CONUS, with the exception of rounding factors.
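As a minimal sketch of the arithmetic described above, the 30-year normal for one time series is just the mean of its 1981 through 2010 values. The function name and the year-to-value dictionary layout are assumptions made for illustration; they do not reflect an NCDC file format or tool.

```python
def normal_1981_2010(series):
    """Return the 1981-2010 normal for one time series.

    `series` is a hypothetical dict mapping year -> value for a single
    variable and month (for example, January temperature for one division).
    """
    years = range(1981, 2011)  # 1981 through 2010 inclusive, 30 years
    missing = [year for year in years if year not in series]
    if missing:
        raise ValueError(f"series is missing years: {missing}")
    return sum(series[year] for year in years) / len(years)


# Usage with made-up values: a series that is 50.0 in every year.
example = {year: 50.0 for year in range(1895, 2015)}
print(normal_1981_2010(example))
```

Years outside 1981 through 2010 are ignored, and the function refuses to average a series with gaps rather than silently using fewer than 30 values.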

Q. How often will the new CD database be refreshed?

A. The initial transition will bring changes to the entire period of record. After that, on a monthly basis, each month from the current calendar year ("this year") and the previous calendar year ("last year") will be updated. Monthly data from earlier years will remain unchanged unless there is a systemic reason to reprocess the whole history (such as new/changed quality control methods for the station data, the introduction of new historical data, etc.). So, for example, during 2014, each available month during 2013 and 2014 will be recalculated.
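The refresh window described above can be sketched as a small helper that lists which months are recalculated in a routine monthly update. The function name and its arguments are hypothetical, and the sketch covers only the routine case, not a full reprocessing of the history.

```python
def months_recalculated(current_year, latest_month):
    """Return the (year, month) pairs refreshed in a routine monthly update.

    All 12 months of the previous calendar year are recalculated, plus
    each month of the current calendar year available so far.
    """
    if not 1 <= latest_month <= 12:
        raise ValueError("latest_month must be between 1 and 12")
    previous = [(current_year - 1, month) for month in range(1, 13)]
    current = [(current_year, month) for month in range(1, latest_month + 1)]
    return previous + current


# Usage: the update that processes February 2014 touches every month
# of 2013 plus January and February 2014.
window = months_recalculated(2014, 2)
print(len(window), window[0], window[-1])
```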

Q. Why did the transition noticeably affect climate division and statewide trends, but have only a minimal effect on the national temperature trend?

A. The old climate division dataset did not use the corrections and scientific advances that were already built into the contiguous United States temperature series. The transition brings many of these advances into the climate division and statewide values. Because the CONUS temperature data series already incorporated these advances, the transition has little effect on national trends, especially on the annual scale (i.e., not for individual months, but for the series of years). Please see the National Temperature Index page for more details.

Q. How did the transition change the national values for winter 2013/14?

A. Temperature and precipitation values reported for December 2013 and January 2014 changed slightly when transitioning from the old dataset to the new dataset. This table shows how the values, departures from normal, and rankings changed for the contiguous United States.

Temperature (departures are from the 20th century average):

Month          Old Value    New Value    Old Departure  New Departure  Old Rank      New Rank
December 2013  30.8°F       31.1°F       -2.2°F         -1.6°F         21st coldest  29th coldest
January 2014   30.3°F       30.5°F       -0.1°F         +0.4°F         52nd coldest  60th coldest

Precipitation (departures are from the 20th century average):

Month          Old Value    New Value    Old Departure  New Departure  Old Rank      New Rank
December 2013  2.16 inches  2.21 inches  -0.07 inch     -0.14 inch     54th driest   45th driest
January 2014   1.32 inches  1.36 inches  -0.90 inch     -0.95 inch     5th driest    5th driest
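
The table values show that a departure can shift by more than the value itself, because the 20th century average is also recomputed in the new dataset. As a rough sketch, the average implied by each column pair is the value minus its departure. The function names are hypothetical, the results are limited by the rounding of the published figures, and the tie-handling rule in the rank helper is an assumption rather than a documented NCDC convention.

```python
def implied_average(value, departure):
    """Back out the long-term average implied by a value and its departure."""
    return round(value - departure, 2)


def coldest_rank(value, history):
    """Rank of `value` among `history`, where 1 is the coldest.

    Assumed convention: a value ranks one place behind every strictly
    colder value in the record, so tied values share a rank.
    """
    return 1 + sum(1 for other in history if other < value)


# December 2013 temperature, old versus new dataset (values from the table).
old_avg = implied_average(30.8, -2.2)
new_avg = implied_average(31.1, -1.6)
print(old_avg, new_avg)

# Ranking against a short, made-up record of earlier Decembers.
print(coldest_rank(30.0, [28.0, 29.5, 31.0, 30.0]))
```

For December 2013 the value rose by 0.3°F while the implied average fell by about 0.3°F, which together account for the 0.6°F change in the departure.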

Vose, R.S., S. Applequist, M. Squires, I. Durre, M.J. Menne, C.N. Williams, Jr., C. Fenimore, K. Gleason, and D. Arndt, 2014: Improved historical temperature and precipitation time series for U.S. climate divisions. Journal of Applied Meteorology and Climatology, in press, doi:10.1175/JAMC-D-13-0248.1.