IGRA is a compilation of 11 source datasets (Table 1) selected based on the timely availability of the data, the existence of documentation for codes and conventions, and data quality. The core of IGRA consists of four datasets of Global Telecommunications System (GTS) reports that were preprocessed at one of three locations in the United States: NCDC (1963-1970 and 2000-present); the National Center for Atmospheric Research (NCAR; December 1970-1972); and the National Centers for Environmental Prediction (NCEP; 1973-October 1999). Since these datasets have nearly consecutive periods of record, their records were concatenated into one "core" time series per station. Depending on data availability, the resulting time series may begin as early as September 1963 and continue to the present. Many of the concatenated core records contain a 2.5-month break between the end of the NCEP/NCAR GTS in October 1999 and the beginning of the NCDC real-time GTS in January 2000. This gap is, in many cases, filled in with data from other sources.
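
The concatenation described above can be illustrated with a minimal sketch. It is not the procedure used to build IGRA; the function names, the record layout of `(date, sounding)` pairs, and the 30-day gap threshold are all assumptions made for illustration.

```python
from datetime import date


def concatenate_core(segments):
    """Merge per-source lists of (date, sounding) pairs into one series.

    Each segment is assumed (hypothetically) to hold the records of one
    station from one core GTS source. The result is sorted by date.
    """
    merged = [record for segment in segments for record in segment]
    merged.sort(key=lambda record: record[0])
    return merged


def find_gaps(series, min_days=30):
    """Return (last_date_before, first_date_after) for each long break.

    min_days is an illustrative threshold, not a value from the text.
    """
    gaps = []
    for (d0, _), (d1, _) in zip(series, series[1:]):
        if (d1 - d0).days > min_days:
            gaps.append((d0, d1))
    return gaps


# Toy records for one station; the dates are invented for the example.
ncep_segment = [(date(1999, 10, 1), "s1"), (date(1999, 10, 15), "s2")]
ncdc_segment = [(date(2000, 1, 1), "s3"), (date(2000, 1, 2), "s4")]

core_series = concatenate_core([ncdc_segment, ncep_segment])
core_gaps = find_gaps(core_series)
```

With these toy records, `core_gaps` holds a single break running from mid-October 1999 to the start of January 2000, which is the kind of gap that other sources may then fill.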

Two additional GTS data sources originate from the Australian Bureau of Meteorology (1990-1993) and the All-Russian Institute for Hydrometeorological Information (1998-2001). For a variety of reasons, including differences in decoding practices, some messages transmitted over the GTS are decoded only at certain receiving centers and not at others. Thus, even though extensive duplication generally exists among the core, Australian, and Russian GTS data, the latter two sources occasionally supply soundings that are either not present or incomplete in the core data.
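
One way to picture how a supplementary source can supply soundings that are missing or incomplete in the core data is the sketch below. It is an assumption-laden illustration rather than the merging procedure used for IGRA: soundings are keyed by a hypothetical `(station_id, time)` pair, and "incomplete" is approximated by the number of reported levels.

```python
def merge_soundings(core, supplements):
    """Combine core soundings with soundings from supplementary sources.

    core and each supplement map (station_id, time) -> list of levels.
    A supplementary sounding is used only when the core has no sounding
    for that key or has fewer levels (an assumed proxy for "incomplete").
    """
    merged = dict(core)  # copy so the core input is left unchanged
    for supplement in supplements:
        for key, levels in supplement.items():
            if key not in merged or len(levels) > len(merged[key]):
                merged[key] = levels
    return merged


# Toy data; the station number, times, and pressure levels are invented.
core = {("94120", "1991-01-01T00"): [1000, 850]}
australian = {
    ("94120", "1991-01-01T00"): [1000, 850, 700],  # more complete
    ("94120", "1991-01-01T12"): [1000],            # absent from core
}

merged = merge_soundings(core, [australian])
```

Duplicated soundings that are no more complete than the core version leave the core record in place, which mirrors the extensive duplication noted above.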

Five other datasets are also in IGRA. With a period of record of 1946-1973, a dataset compiled by the United States Air Force extends the records of many stations back in time from the 1960s to the 1950s or 1940s. The temporal completeness and vertical resolution of data at stations in the United States, Australia, Argentina, and South Korea are further enhanced by four country-specific sets of data that were archived before their transmission over the GTS and thus contain levels not found in the GTS data. (Six additional sources archived at NCDC were excluded from IGRA due to questionable data quality, undocumented quality assurance flags, or unusual and undocumented conventions for reporting pibal observations.)

In most data sources, stations are identified only by their station number and location. Consequently, information such as the name and country of the station was obtained from external sources: GTS metadata from NCEP and NCDC; the station inventory of the Global Historical Climatology Network (GHCN; Peterson and Vose 1997); WMO Publication 9 Volume A (WMO 2004); and a list of station moves affecting National Weather Service stations (Elliott et al. 2002). In those rare cases in which the various lists disagreed significantly, online searches were used to determine any necessary corrections.

