
Global Historical Climatology Network Monthly - Version 4

ASCII Text Files
Version 4 Archive
Additional Diagnostics

The Global Historical Climatology Network–monthly (GHCNm) dataset is a set of monthly climate summaries from thousands of weather stations around the world. The periods of record vary by station, with the earliest observations dating to the 18th century. Some station records are purely historical and are no longer updated, whereas many others come from stations that are still in operation and provide updates with a short time delay that are useful for climate monitoring.

The first release of GHCNm dates to the early 1990s (Vose et al. 1992). Subsequent releases include version 2 in 1997 (Peterson and Vose 1997), version 3 in 2011 (Lawrimore et al. 2011) and, most recently, version 4 (Menne et al. 2018). For the moment, GHCNm v4 consists of mean monthly temperature data only. Mean monthly maximum and minimum temperatures as well as monthly total precipitation will be included at a later date.

Relative to previous versions, v4 provides an expanded set of station temperature records as well as more comprehensive uncertainties for the calculation of station and regional temperature trends. The increase in station data comes primarily from the temperature observations available in the Global Historical Climatology Network–daily dataset (GHCNd; Menne et al. 2012), which have been combined with the original monthly sources used in previous versions of GHCNm. Additional station data collected under the auspices of the International Surface Temperature Initiative (ISTI; Rennie et al. 2014) are also used, and the data merging process was conducted within the ISTI project. Combining these various sources brings the total number of monthly temperature stations in v4 to approximately 26,000, compared to 7,200 in v2 and v3.

The most recent update of the GHCNm v4 temperature data is provided using the naming convention "latest" for each form of the data (unhomogenized or homogenized; see below). These files will untar into a directory that contains a version designation of the form "ghcnm.x.y.z.yyyymmdd" where:

  • x = version number associated with major methodological changes to quality control, homogenization, or data merging, or with fundamental changes to the sources of the data. A change in 'x' is accompanied by a peer-reviewed manuscript.
  • y = version number associated with subsequent and substantial modifications to a major version release, including a new set of stations or the addition of quality control checks. A change in 'y' is accompanied by a technical note.
  • z = version number associated with minor revisions to station data and/or processing software. A change in 'z' is noted in the "status and errata" document.
  • yyyy = year in which the update to the dataset occurred
  • mm = month in which the update to the dataset occurred
  • dd = day on which the update to the dataset occurred

The data files are also named according to the latest version number and time-stamp to indicate when the update was run. For example, the directory and filenames for the version 4 data produced on October 4, 2018 contain the string "v4.0.0.20181004" to indicate that the data are part of the GHCNm version 4.0.0 (x.y.z) code base and set of stations, and that the files were updated with the latest data available as of October 4, 2018 (20181004).
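The naming convention above can be unpacked programmatically. The following is a minimal sketch, not an official tool: the names `GhcnmVersion` and `parse_version` are hypothetical, and the example directory name "ghcnm.v4.0.0.20181004" is assumed from the pattern and the version string described above.

```python
import re
from datetime import date
from typing import NamedTuple, Optional


class GhcnmVersion(NamedTuple):
    """Version fields in the order x, y, z, followed by the update date."""
    major: int       # x: major methodological change
    minor: int       # y: substantial modification to a major release
    revision: int    # z: minor revision to station data or software
    updated: date    # yyyymmdd: date the update was run


# Matches strings such as "v4.0.0.20181004" anywhere in a name.
_VERSION_RE = re.compile(r"v(\d+)\.(\d+)\.(\d+)\.(\d{4})(\d{2})(\d{2})")


def parse_version(name: str) -> Optional[GhcnmVersion]:
    """Return the version fields found in a directory or file name, or None."""
    match = _VERSION_RE.search(name)
    if match is None:
        return None
    x, y, z, yyyy, mm, dd = (int(group) for group in match.groups())
    return GhcnmVersion(x, y, z, date(yyyy, mm, dd))


# Assumed example name built from the string quoted in the text.
version = parse_version("ghcnm.v4.0.0.20181004")
print(version)
```

Because the pattern is searched for rather than anchored, the same function works on both directory names and file names that embed the version string.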

Information on version 4 quality control, homogeneity corrections, and other aspects of the dataset is available in the documents library.

Data and inventory files
Directions on Uncompressing and Extracting Files (includes description of inventory file and format of data files [measurement, quality, and source flags])

Quality Assurance

GHCNm v4 uses the same set of quality control (QC) algorithms applied to v3 with some additions. The checks and associated flags are shown in Table 1. Further details regarding the quality control checks are available in the version 4 Algorithm Theoretical Basis Document.

Table 1. Quality Assurance Checks Applied to GHCNm Version 4 Temperatures

Data Problem: Description of Check

Inter-Station Duplicate Check (E Flag): Identifies a station’s monthly values when they are duplicated in any year of another station’s data (annual data must have at least 3 years of data and at least 12 values within 0.015 deg C).

Consecutive Month Duplicate Check (W Flag): Used to identify duplicate retransmission and mislabeling of the previous month's temperature as the current month's. Occurs in GTS-transmitted CLIMAT bulletins.

Series Duplication (D Flag): Identifies duplication of data between years within the same station record.

World Record Extremes Check (R Flag): Identifies temperatures that fall outside the range of the highest and lowest monthly mean maximum and minimum temperature values.

Isolated Value(s) (L Flag): Identifies single data months or small clusters of data that are isolated in time. A single datum, or a cluster of up to three consecutive months, is examined to see whether it is both preceded and followed by 18 or more consecutive months of missing data.

Streak Check (K Flag): Identifies runs of the same value (non-missing) in five or more consecutive months.

Climatological Outlier (O Flag): Identifies temperatures that exceed their respective climatological bi-weight means for the corresponding station and calendar month by at least five bi-weight standard deviations.

Spatial Inconsistency 1 (S Flag): Any value found to be between 2.5 and 5.0 bi-weight standard deviations from the bi-weight mean is more closely scrutinized by examining the 5 closest neighbors (not to exceed 500.0 km) and determining their associated distribution of respective z-scores. At least one of the neighbor stations must have a z-score with the same sign as the target, and its z-score must be greater than or equal to the z-score listed in column B (below), where column B is expressed as a function of the target z-score ranges (column A).

     Column A (target z-score)      Column B (neighbor z-score)
     4.0  - 5.0                     1.9
     3.0  - 4.0                     1.8
     2.75 - 3.0                     1.7
     2.50 - 2.75                    1.6

Spatial Inconsistency 2 (T Flag): This check uses a weighted average of neighboring stations to identify extreme temperatures that are likely erroneous. Z-scores for the month of interest for the target station are compared with other station z-scores within 500 km using an inverse distance weighting function. A value is flagged if the absolute difference is 3.0 or greater.

Erroneous value not detected through automated quality control checks (Z Flag): Datum (or data) are flagged after manual investigation determines value(s) to be erroneous.
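Two of the checks in Table 1 are simple enough to illustrate in a few lines. The sketch below is not the operational GHCNm code: the function names are hypothetical, missing months are assumed to be represented as `None`, the column A ranges are assumed to include their lower bound and exclude their upper bound, and "greater than or equal to" is assumed to apply to the magnitude of the neighbor z-score.

```python
from typing import List, Optional, Sequence

def streak_flags(values: Sequence[Optional[float]], min_run: int = 5) -> List[bool]:
    """Streak Check: mark months in runs of min_run or more identical non-missing values."""
    flags = [False] * len(values)
    start = 0
    while start < len(values):
        end = start + 1
        if values[start] is not None:
            # Extend the run while the value repeats (None never equals a number).
            while end < len(values) and values[end] == values[start]:
                end += 1
            if end - start >= min_run:
                for i in range(start, end):
                    flags[i] = True
        start = end
    return flags

# (column A lower bound, column A upper bound, column B threshold)
S_CHECK_TABLE = [(4.0, 5.0, 1.9), (3.0, 4.0, 1.8), (2.75, 3.0, 1.7), (2.5, 2.75, 1.6)]

def neighbor_threshold(target_z: float) -> Optional[float]:
    """Look up the column B z-score for a target z-score, or None if out of range."""
    magnitude = abs(target_z)
    for low, high, threshold in S_CHECK_TABLE:
        if low <= magnitude < high:  # assumed half-open ranges
            return threshold
    return None

def is_supported_by_neighbors(target_z: float, neighbor_zs: Sequence[float]) -> Optional[bool]:
    """Spatial Inconsistency 1: does any neighbor share the sign and meet column B?"""
    threshold = neighbor_threshold(target_z)
    if threshold is None:
        return None  # target is outside the 2.5 to 5.0 range this check examines
    return any(z * target_z > 0 and abs(z) >= threshold for z in neighbor_zs)

print(streak_flags([1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0]))
print(is_supported_by_neighbors(3.5, [0.2, 1.9, -2.5]))
```

In this sketch the caller is expected to pass only the z-scores of the five closest neighbors within 500 km; selecting those neighbors and computing bi-weight statistics are outside its scope.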

Temperature Data Homogenization

Nearly all weather stations undergo changes in the circumstances under which measurements are taken at some point during their history. For example, thermometers require periodic replacement or recalibration and measurement technology has evolved over time. Temperature recording protocols have also changed at many locations from recording temperatures at fixed hours during the day to once-per-day readings of the 24-hour maximum and minimum. “Fixed” land stations are sometimes relocated and even minor temperature equipment moves can change the microclimate exposure of the instruments. In other cases, the land use or land cover in the vicinity of an observing site can change over time, which can impact the local environment that instruments are sampling even when measurement practice is stable. All of these different modifications to the circumstances of recording near-surface air temperature can cause systematic shifts in temperature readings from a station that are unrelated to any real variation in local weather and climate. Moreover, the magnitude of these shifts (or “inhomogeneities”) can be large relative to true climate variability. Inhomogeneities can therefore lead to large systematic errors in the computation of climate trends and variability not only for individual station records, but also in spatial averages.

For this reason, detecting and accounting for artifacts associated with changes in observing practice is an important and necessary endeavor in building climate datasets. In GHCNm v4, shifts in monthly temperature series are detected through automated pairwise comparisons of the station series using the algorithm described in Menne and Williams (2009). This procedure, known as the Pairwise Homogenization Algorithm (PHA), systematically evaluates each time series of monthly average surface air temperature to identify cases in which there is an abrupt shift in one station’s temperature series (the “target” series) relative to many other correlated series from other stations in the region (the “reference” series). The algorithm seeks to resolve the timing of shifts for all station series before computing an adjustment factor to compensate for any one particular shift. These adjustment factors are based on the average change in the magnitude of monthly temperature differences between the target station series with the apparent shift and the reference series with no apparent concurrent shifts.
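The idea of estimating a shift from target-minus-reference differences can be illustrated with a toy example. This is only a sketch of the pairwise difference concept, not the PHA itself: it assumes the break date is already known, that the reference series have no shifts of their own, and that the estimates from individual references are combined with a simple mean. The function names and the sample values are hypothetical.

```python
from statistics import mean
from typing import List, Sequence


def difference_series(target: Sequence[float], reference: Sequence[float]) -> List[float]:
    """Month-by-month difference between the target and one reference series."""
    return [t - r for t, r in zip(target, reference)]


def estimate_shift(
    target: Sequence[float],
    references: Sequence[Sequence[float]],
    break_index: int,
) -> float:
    """Average change in (target - reference) across an assumed known break.

    For each reference, compare the mean difference after the break with the
    mean difference before it, then average those estimates over references.
    """
    if not 0 < break_index < len(target):
        raise ValueError("break_index must split the series into two non-empty parts")
    estimates = []
    for reference in references:
        diff = difference_series(target, reference)
        estimates.append(mean(diff[break_index:]) - mean(diff[:break_index]))
    return mean(estimates)


# Synthetic data: the target jumps by 1.0 at index 3, the references do not.
target = [10.0, 10.0, 10.0, 11.0, 11.0, 11.0]
references = [
    [9.0, 9.0, 9.0, 9.0, 9.0, 9.0],
    [10.5, 10.5, 10.5, 10.5, 10.5, 10.5],
]
print(estimate_shift(target, references, break_index=3))
```

Differencing against neighbors removes the regional climate signal the stations share, so a step that remains in the difference series points to a change at the target station rather than a real change in climate.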

The PHA has undergone extensive evaluation (e.g., Williams et al. 2012) and GHCNm v4 data are provided as both homogenized (adjusted) and unhomogenized (unadjusted). The homogenized data are known by the string "qcf" and the unhomogenized data are designated by the string "qcu". As described in Menne et al. (2018), the PHA is periodically run as an ensemble to quantify the uncertainty of homogenization. Other components of uncertainty are also evaluated. The combined effect of uncertainties for GHCNm v4 is shown in the figure below.

Land Surface Temperature Anomalies


Figure: Total uncertainty for GHCNm v4 mean annual Global Land Surface Air Temperature anomalies. Darker grays show homogenization uncertainties (parametric and missed breaks) and the lighter grays show anomaly and spatial coverage uncertainties. The uncertainties are displayed as cumulative, so the uncertainty bounds depicted in each lighter shade include the uncertainty of the darker shades (see Menne et al. 2018 for details).





GHCNm v4

Menne, M. J., C. N. Williams, B. E. Gleason, J. J. Rennie, and J. H. Lawrimore, 2018: The Global Historical Climatology Network Monthly Temperature Dataset, Version 4. J. Climate, in press.

GHCNm v3

Lawrimore, J. H., M. J. Menne, B. E. Gleason, C. N. Williams, D. B. Wuertz, R. S. Vose, and J. Rennie, 2011: An overview of the Global Historical Climatology Network monthly mean temperature data set, version 3, J. Geophys. Res., 116, D19121, doi:10.1029/2011JD016187.

GHCNm v2

Peterson, T. C., and R. S. Vose, 1997: An overview of the Global Historical Climatology Network temperature database. Bull. Amer. Meteor. Soc., 78, 2837–2849.

GHCNm v1

Vose, R. S., R. L. Schmoyer, P. M. Steurer, T. C. Peterson, R. Heim, T. R. Karl, and J. Eischeid, 1992: The Global Historical Climatology Network: Long‐term monthly temperature, precipitation, sea level pressure, and station pressure data, ORNL/CDIAC‐53, 325 pp., Carbon Dioxide Inf. Anal. Cent., Oak Ridge, Tenn.


GHCNd

Menne, M. J., I. Durre, B. G. Gleason, T. Houston, and R. S. Vose, 2012: An overview of the Global Historical Climatology Network Daily dataset. J. Atmos. Oceanic Technol., 29, 897–910, doi:10.1175/JTECH-D-11-00103.1.

Pairwise Homogenization Algorithm (PHA)

Menne, M. J., and C. N. Williams, 2009: Homogenization of temperature series via pairwise comparisons, J. Climate, 22, 1700–1717, doi:10.1175/2008JCLI2263.1.

Williams, C. N., M. J. Menne, and P. W. Thorne, 2012: Benchmarking the performance of pairwise homogenization of surface temperatures in the United States, J. Geophys. Res., 117, D05116, doi:10.1029/2011JD016761.

ISTI Databank

Rennie, J. J., J. H. Lawrimore, B. E. Gleason, P. W. Thorne, C. P. Morice, M. J. Menne, C. N. Williams, W. G. de Almeida, J. R. Christy, M. Flannery, M. Ishihara, K. Kamiguchi, A. M. G. Klein-Tank, A. Mhanda, D. H. Lister, V. Razuvaev, M. Renom, M. Rusticucci, J. Tandy, S. J. Worley, V. Venema, W. Angel, M. Brunet, B. Dattore, H. Diamond, M. A. Lazzara, F. Le Blancq, J. Luterbacher, H. Mächel, J. Revadekar, R. S. Vose, and X. Yin, 2014: The international surface temperature initiative global land surface databank: monthly temperature data release description and methods. Geoscience Data Journal, 1, 75–102, doi:10.1002/gdj3.8.