Beta Release of the Global Land Surface Databank

Figure 1. Conceptual diagram of the land surface Databank structure and its relation to the benchmark analogs.

Since the 1980s, there have been great advances toward understanding how Earth’s temperatures have varied and changed. These advances are due in large part to efforts such as the Global Historical Climatology Network (GHCN) Monthly and Daily datasets, which make the study of decadal- and century-scale temperature changes possible. As part of a continuing focus on enhancing the observed climate record, NCDC and international partners launched the International Surface Temperature Initiative (ISTI) in 2010 to improve understanding of Earth’s climate from the global to the local scale, bringing together international scientists from many specialties.

Figure 2. Location and period of record (years) of the more than 39,000 stations in the Beta release of the Databank as of October 2, 2012.

The ISTI, through its Databank Working Group, released a beta version of an innovative data holding that brings together new and existing sources of surface air temperature. This data holding provides users a way to better track the origin of the data from its collection through its integration into a merged data holding. By providing the data in various stages that lead to the integrated product, by including data origin tracking flags with information on each observation, and by providing the software used to process all observations, the processes involved in creating the observed fundamental climate record are more open and transparent.

Figure 3. Number of stations in GHCN-M version 3 (black) and Databank (red) from 1850 through 2010.

This beta release contains more than 39,000 stations, greatly enhancing spatial coverage from the 1800s to the present. Data for these stations, and for other station records withheld from the merged product (due to source redundancy or ambiguity), are provided in Stages, as shown in Figure 1. Stage-0 consists of observations in their original form, including those recorded on paper and housed in various archives and those converted to photographic or scanned images. Stage-1 contains digital data in its native format, such as ASCII text, spreadsheets, or other electronic documents. In Stage-2, all data are converted to a common format, with an inventory containing all available metadata for each station. The inventory typically consists of a station identifier, name, latitude, longitude, elevation, and beginning and ending year of data. Stage-2 also includes data provenance tracking (DPT) flags to help users understand the history of each observation. DPT flags provide links to information such as the data source, location of the original data archive, method and source of digitization, and mode of transmission.
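To make the Stage-2 inventory and DPT flags more concrete, the following is a minimal Python sketch of how one station entry and one flagged observation might be held in memory. It is an illustration only, not the Databank's actual file format: the class names, field names, units, and flag keys are all assumptions chosen to mirror the metadata listed above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class StationInventoryEntry:
    """Hypothetical in-memory form of one Stage-2 inventory entry."""
    station_id: str
    name: str
    latitude: float                # assumed: decimal degrees, north positive
    longitude: float               # assumed: decimal degrees, east positive
    elevation_m: Optional[float]   # assumed: metres; None when unknown
    first_year: int
    last_year: int

    def period_of_record(self) -> int:
        """Number of calendar years spanned, counting both end years."""
        return self.last_year - self.first_year + 1


@dataclass
class Observation:
    """One monthly value with hypothetical provenance (DPT) flags."""
    year: int
    month: int
    value: float
    # Keys follow the kinds of information named in the text; the exact
    # key names and codes are assumptions, not the Databank's flag scheme.
    dpt: Dict[str, str] = field(default_factory=dict)


entry = StationInventoryEntry("X0001", "EXAMPLE STATION", 35.6, -82.5, 650.0, 1901, 2010)
obs = Observation(1901, 1, 2.4, dpt={
    "data_source": "example-source",
    "original_archive": "example-archive",
    "digitization": "keyed",
    "transmission": "mail",
})
```

Keeping the flags on each observation, rather than only on the station, reflects the point made above that provenance is tracked per observation.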

Figure 4. Percent of land area sampled with GHCN-M version 3 stations (black) and Databank stations (red) from 1850 through 2010.

Forty-three Stage-2 sources are merged into a single Stage-3 dataset. This is a complicated process due to the nature of weather and climate data, which are collected by hundreds of thousands of observers in hundreds of countries, often using differing languages, observing methods, averaging techniques, and documentation and archiving procedures. Because many different sources may contain records for the same station, it is necessary to identify and remove duplicate stations, merge some sources to produce a more complete station record, and incorporate new stations. Figure 2 shows the locations and record lengths of the more than 39,000 stations included in the merged dataset. Figure 3 compares the number of stations with that of the GHCN-M version 3 dataset, and Figure 4 shows the percent of land area covered.
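The duplicate identification and merging described above can be illustrated with a small Python sketch. This is not the Databank's merge algorithm, which is documented with the Stage-3 data; it is a simplified stand-in that assumes two records are candidates for the same station when they are geographically close and have similar names. The function names, the 10 km distance threshold, and the 0.8 name-similarity threshold are hypothetical.

```python
import math
from difflib import SequenceMatcher


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres on a sphere of radius 6371 km."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def name_similarity(a, b):
    """Case-insensitive similarity ratio between 0 and 1."""
    return SequenceMatcher(None, a.upper(), b.upper()).ratio()


def likely_same_station(s1, s2, max_km=10.0, min_name_ratio=0.8):
    """Assumed rule: close in space AND similar in name (thresholds hypothetical)."""
    close = great_circle_km(s1["lat"], s1["lon"], s2["lat"], s2["lon"]) <= max_km
    similar = name_similarity(s1["name"], s2["name"]) >= min_name_ratio
    return close and similar


def merge_monthly(primary, secondary):
    """Fill gaps in `primary` with `secondary`; `primary` wins where both have a value."""
    merged = dict(secondary)
    merged.update(primary)
    return merged


a = {"name": "ASHEVILLE", "lat": 35.60, "lon": -82.55}
b = {"name": "Asheville AP", "lat": 35.61, "lon": -82.56}
c = {"name": "TOKYO", "lat": 35.68, "lon": 139.77}

same_ab = likely_same_station(a, b)
same_ac = likely_same_station(a, c)
combined = merge_monthly({(1900, 1): 2.0}, {(1900, 1): 2.5, (1900, 2): 3.0})
```

A real merge would likely also compare the overlapping data values themselves, since metadata alone can be wrong or missing; the sketch omits that step for brevity.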

Details of the merging process and of the development of each Stage leading to the merged Stage-3 dataset, along with all the data and the processing code, are available at ftp://ftp.ncdc.noaa.gov/pub/data/globaldatabank/monthly/stage3/.

In the future, the ISTI team will enlist independent research groups to develop a suite of fully quality-controlled (Stage-4) and homogenized (Stage-5) datasets from the merged Stage-3 data. Any submitted data products will be rigorously assessed using processes established by the ISTI Benchmarking Working Group.

An international team of scientists, who make up the Databank Working Group, made this Databank development possible.

WMO Region I

  • Albert Mhanda (ACMAD, Niger)

WMO Region II

  • Vyacheslav Razuvaev (Russian Research Institute of Hydrometeorological Information)
  • Kenji Kamiguchi (Japan Meteorological Agency)

WMO Region III

  • Matilde Rusticucci (Univ. of Buenos Aires, Argentina)
  • Madeleine Renom (Universidad de la Republica, Montevideo, Uruguay)
  • Waldenio Gambi Almeida (CPTEC/INPE, Brazil)

WMO Region IV

  • Jay Lawrimore (NOAA’s National Climatic Data Center)
  • Matthew Menne (NOAA’s National Climatic Data Center)
  • Steve Worley (National Center for Atmospheric Research)
  • John Christy (University of Alabama Huntsville)

WMO Region V

  • Meghan Flannery (Australian Bureau of Meteorology)

WMO Region VI

  • Albert Klein-Tank (Koninklijk Nederlands Meteorologisch Instituut)
  • David Lister (Climatic Research Unit, University of East Anglia, UK)

Ex-officio Members

  • Peter Thorne (Cooperative Institute for Climate and Satellites-North Carolina / NOAA’s National Climatic Data Center)
  • Jared Rennie (Cooperative Institute for Climate and Satellites-North Carolina / NOAA’s National Climatic Data Center)

This effort provides a foundation to establish new methods of analysis, assessments of uncertainties, and services to end-users and comes at a time when the need for high-quality, traceable, and complete data is greater than ever. While the initial focus is on temperature data on the monthly timescale, the ISTI team will add other elements and timescales in the future. More information is available at www.surfacetemperatures.org/databank.

An overview of the vision for the International Surface Temperature Initiative and Databank development is provided in “Guiding the Creation of A Comprehensive Surface Temperature Resource for Twenty-First-Century Climate Science” published in the Bulletin of the American Meteorological Society.

Thorne, Peter W., and Coauthors (2011), Guiding the Creation of A Comprehensive Surface Temperature Resource for Twenty-First-Century Climate Science. Bull. Amer. Meteor. Soc., 92, ES40–ES47. doi: http://dx.doi.org/10.1175/2011BAMS3124.1 .