|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Blue River Case Study |
![]() |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Evaluating data characteristics
Calibrating the reconstruction model |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Very old Douglas-firs at the Dillon (DIL) site. The DIL tree-ring chronology by itself explained 48% of the variance in the Blue River gaged flow record, and thus was the primary predictor in the Blue River reconstruction model. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Introduction

The
Blue River case study provides a detailed example of the steps taken to
generate a tree-ring reconstruction of annual streamflow. The Blue River,
in addition to supplying domestic, agricultural, recreational, and wildlife
water demands on the Western Slope, is a major supply source for Denver
Water (via Dillon Reservoir and Roberts Tunnel) and is also a component
of the Colorado-Big Thompson Project (Green Mountain Reservoir). The streamflow
gage on the Blue River above Dillon Reservoir is one of a set that Denver
Water uses to characterize its historical and current water supply. The
"natural flow" record used in this reconstruction was derived
by Denver Water from the raw gage record to account for diversions and
transfers of water. The record begins in 1916, and the annual flow values
are for the standard water year, October through September.

Evaluation of data characteristics

The first step in reconstructing streamflow from tree-ring data is to assess the suitability of both the tree-ring data and the streamflow data for the reconstruction. The strength of the relationship between tree growth and streamflow is assessed, as is the shape of the relationship. The statistical characteristics of both the tree-ring and streamflow data are also evaluated.

Strength of relationship. The strength of the relationship between the available tree-ring chronologies and the streamflow data is evaluated in terms of the correlation coefficient, R; its square, R², quantifies the variance shared by the two records. In our tree-ring collection efforts, we specifically target moisture-sensitive trees, whose growth responds to the same regional climate patterns that control streamflow. Consequently, nearly all of our west slope chronologies are significantly positively correlated (lower growth = lower flow; higher growth = higher flow) with the Blue River gage record and other records in the upper Colorado River basin. Using tree-ring chronologies that have a plausible physical relationship to streamflow (as indicated by a significant correlation) helps prevent a model based on spurious relationships.

Shape of relationship. Simple
scatterplots of tree-ring chronologies versus streamflow are used to assess
the linearity of the relationship between tree growth and flow. The statistical
method used in most reconstructions, multiple linear regression, specifically
applies to linear relationships. If a linear relationship is not evident
in the plots, data can be transformed to make the relationship linear
(e.g., streamflow is sometimes transformed using a log transformation).
In this case, scatterplots of our west slope chronologies against the
Blue River gage data showed the relationships to be generally linear,
so no transformation was required.

Statistical characteristics of the data. The
multiple linear regression technique used in the reconstruction process
also requires that a number of assumptions about the data be met in order
to obtain unbiased, efficient, and consistent estimates
from the model. These
assumptions are ultimately tested by evaluating the errors (also called
residuals) in the reconstruction model--the difference
between the gaged and estimated values. Checking the input data to evaluate
the extent to which they meet these assumptions prior to generating the
model helps ensure that the resulting model errors will also meet the
assumptions (or if there are problems meeting the assumptions, may point
to a cause).
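The input checks described in this section can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names `shared_variance` and `lag1_autocorrelation` are hypothetical, and the data are synthetic values invented for the example. It computes the correlation R between a chronology and a flow record, its square (the shared variance), and the lag-1 autocorrelation of a series, one of the statistical characteristics evaluated for tree-ring data.

```python
import numpy as np

def shared_variance(chronology, flow):
    """Return (R, R squared) for two equal-length annual series."""
    x = np.asarray(chronology, dtype=float)
    y = np.asarray(flow, dtype=float)
    r = float(np.corrcoef(x, y)[0, 1])
    return r, r * r

def lag1_autocorrelation(series):
    """Correlation of a series with itself shifted by one year."""
    x = np.asarray(series, dtype=float)
    d = x - x.mean()
    return float(np.sum(d[1:] * d[:-1]) / np.sum(d * d))

# Synthetic data (assumption): a common moisture signal drives both
# a ring-width index and annual flow over an 84-year common period.
rng = np.random.default_rng(0)
moisture = rng.normal(size=84)
ring_index = 1.0 + 0.2 * moisture + 0.1 * rng.normal(size=84)
flow = 200000.0 + 50000.0 * moisture + 30000.0 * rng.normal(size=84)

r, r2 = shared_variance(ring_index, flow)
print(f"R = {r:.3f}, shared variance = {r2:.3f}")
print(f"lag-1 autocorrelation of ring index = {lag1_autocorrelation(ring_index):.3f}")
```

For reference, a chronology that explains 48% of the variance by itself corresponds to R of roughly 0.69.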
Histograms
of both the tree-ring and streamflow data showed
the data to be normally distributed. The
"standard" tree-ring chronologies, however, usually contain
statistically significant low-order autocorrelation (that is, one year's
growth is strongly related to the next). Most of this autocorrelation
is a function of the trees' physiology, and not related to climate. We
removed the low-order autocorrelation in the tree-ring chronologies using
ARMA modeling, creating time series of residuals. These residual chronologies
were then used in the reconstruction model. Finally, the streamflow data
were found to have sufficiently constant variance.

Generating (calibrating) a reconstruction model

The statistical process used to generate the Blue River flow reconstruction
model (and the other flow reconstructions in this project) is called stepwise
multiple linear regression, a form of least squares regression. The tree-ring
chronologies (the independent variables, or predictors) are calibrated
with gage data (the dependent variable, or predictand) in such
a way as to minimize the difference between estimated and true gage values
(these differences or errors are squared, thus the smallest squared errors,
or least squares, are sought). The stepwise process determines
which predictors from a pool of candidate predictor chronologies
provide the statistical model that best fits the gage data. In the simplest
terms, the process first selects the predictor/chronology that explains
the most variance in the gage record, then adds the chronology that explains
the most variance in the gage record not already explained by the first,
and so on, until the remaining unexplained variance cannot be significantly
reduced by any of the remaining chronologies. The resulting regression
equation--the weighted linear combination of chronology values--is used
to estimate the gage value for each year, in this case, 1916-1999. This
stepwise regression process requires a pool of candidate predictor variables,
which have been evaluated for suitability as described above. In this
study, the pool includes all of our chronologies from western Colorado
that are sensitive to moisture and that extend at least through 1999 (25
total at the time the reconstruction was generated). All of these chronologies
would be expected to potentially contribute to explaining the variance
in the Blue River gage record.

One consideration in the selection of chronologies for the predictor pool is the length of the chronology. The length of the final reconstruction is limited by the shortest chronology that contributes to it. If a reconstruction must go back to a certain year (such as 1600), then all chronologies starting after that year should be excluded from the pool of candidate predictors. For the Blue River reconstruction, no chronology was excluded from the predictor pool on the basis of length, and the shortest predictor chronology, Montrose (MTR), begins in 1440. Thus, the reconstruction begins in 1440. |
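The forward stepwise selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the software used for the Blue River model: the function names, the F-to-enter threshold of 4.0, and the synthetic candidate pool are all hypothetical. At each step the candidate that most increases R² is added, and the process stops when the partial F statistic for the best remaining candidate falls below the threshold or a maximum number of predictors is reached.

```python
import numpy as np

def fit_r2(X, y):
    """Ordinary least squares with an intercept; return R squared."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def forward_stepwise(X, y, names, f_to_enter=4.0, max_predictors=6):
    """Return a list of (name, change in R squared, cumulative R squared)."""
    n = len(y)
    selected, r2_current, steps = [], 0.0, []
    while len(selected) < max_predictors:
        best = None
        for j in range(X.shape[1]):
            if j in selected:
                continue
            r2_new = fit_r2(X[:, selected + [j]], y)
            if best is None or r2_new > best[1]:
                best = (j, r2_new)
        if best is None:  # candidate pool exhausted
            break
        j, r2_new = best
        k = len(selected) + 1  # predictors in the enlarged model
        # Partial F statistic for adding one predictor to the model.
        f_stat = (r2_new - r2_current) / ((1.0 - r2_new) / (n - k - 1))
        if f_stat < f_to_enter:  # hypothetical entry threshold
            break
        steps.append((names[j], r2_new - r2_current, r2_new))
        selected.append(j)
        r2_current = r2_new
    return steps

# Synthetic candidate pool (assumption): 8 chronologies, 84 common years,
# with flow depending strongly on C0 and more weakly on C3.
rng = np.random.default_rng(1)
X = rng.normal(size=(84, 8))
y = 3.0 * X[:, 0] + 1.0 * X[:, 3] + rng.normal(size=84)
names = [f"C{j}" for j in range(8)]
steps = forward_stepwise(X, y, names)
for name, delta, total in steps:
    print(f"{name}: change in R2 = {delta:.3f}, cumulative R2 = {total:.3f}")
```

The per-step output mirrors the "Change in R²" column of a stepwise summary table: the first entry is the single best predictor, and each later entry reports only the additional variance it explains.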
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
![]() |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The locations of the chronologies entered into the stepwise regression process (green and yellow), the chronologies selected by the regression as predictors in the reconstruction model (yellow), and the Blue River gage. Note that the predictor chronologies are not necessarily located in the same basin as the gage--reflecting the regional coherence of climate variability--though the chronology (DIL) explaining the most variance in the gaged record is in fact the one closest to the gage. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
In the calibration, a stepwise linear regression is run for the full set
of years common to both the tree-ring and gage data. For the Blue River
calibration, the steps in the regression process are shown in the table
below:

Summary of Stepwise Regression |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Here, the chronology that explains the most variance in the Blue River gage
record is Dillon (DIL). This chronology explains almost 48% of the variance
by itself (Change in R²). Pumphouse (PUM) contributes another
8%, and the remaining three add between 1% and 3% each; together, the five
predictors explain 62.6% of the variance in the gaged record.

It is important to limit the number of predictors in the regression model
by imposing a significance threshold for additional predictors to be entered
into the equation, ending the process at a predetermined number of steps,
or applying both of these approaches, as we did. A model with a large
number of predictors tends to be "overfitted" to the gage data;
the model will be so highly tuned to the calibration period that it is
unlikely to perform well during the reconstruction period. None of our
reconstruction models has more than 6 predictors.

Regression Summary
R = .791; R² = .626; F(5,78) = 26.138; p < .00000; standard error of estimate: 37419. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The errors (or residuals) in the regression model were then examined to make sure that the assumptions of multiple linear regression, as outlined above, were not violated. Plots of the residuals for the Blue River model showed no violations of these assumptions. In addition, the residuals were not correlated with any individual predictor variable, which is a further assumption of the method. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
After
the model is generated, the skill of the model is tested using a set of
validation statistics. There are a number of ways to go about validating
the model (or comparing several competing models to select the best).
Ideally, the model is validated using independent data, i.e. gage data
completely withheld from the calibration process. But since gaged streamflow
records in Colorado are only 50-100 years long at best, withholding enough
data from the calibration to independently validate the model (at least
30 years) significantly shortens the calibration period, and thus can
reduce the range of values upon which the model is calibrated.

Here, all available gage data are used in the calibration, and a split-sample
validation is used, which tests the reconstruction skill of the predictor
chronologies selected in the stepwise process. This approach is based
on splitting the period of time common to the tree-ring and gaged data
into two or more subsets, then calibrating the model on one part and estimating
the values for the remaining data. Two extremes of this approach are
(1) splitting the common period in half, calibrating on one half and testing
the model on the other half, and then switching the calibration/verification
periods, or (2) calibrating on all but one case, estimating that case,
then removing a different case and estimating that one, repeating until
each case has been omitted and estimated (sometimes called the "leave-one-out"
or PRESS method). The split-sample validation does not test the regression
model per se. Instead, it assesses the ability of the set of predictor
chronologies to estimate streamflow using different subsets of the data,
and then tests these estimates on the withheld portion of the data. |
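The leave-one-out variant of split-sample validation can be sketched as below. This is an illustrative sketch, not the authors' procedure: the function names and the synthetic data are hypothetical. Each year is withheld in turn, the regression is refit on the remaining years, and the withheld year is estimated. The sketch also computes three skill measures from the resulting validation series: its correlation with the observed record, the reduction of error (RE, skill relative to always predicting the calibration-period mean), and the root mean squared error (RMSE).

```python
import numpy as np

def leave_one_out(X, y):
    """Refit without year i, then estimate year i, for every year."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    estimates = np.empty(n)
    calib_means = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        estimates[i] = A[i] @ coef
        calib_means[i] = y[keep].mean()  # the "no knowledge" prediction
    return estimates, calib_means

def validation_stats(y, estimates, calib_means):
    """Return (correlation, RE, RMSE) for a validation series."""
    err = y - estimates
    r_val = float(np.corrcoef(y, estimates)[0, 1])
    re = 1.0 - float(err @ err) / float(((y - calib_means) ** 2).sum())
    rmse = float(np.sqrt(np.mean(err ** 2)))
    return r_val, re, rmse

# Synthetic data (assumption): two predictors, 84 common years.
rng = np.random.default_rng(2)
X = rng.normal(size=(84, 2))
y = 10.0 + 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(size=84)

estimates, calib_means = leave_one_out(X, y)
r_val, re, rmse = validation_stats(y, estimates, calib_means)
print(f"R(val) = {r_val:.3f}, RE = {re:.3f}, RMSE = {rmse:.3f}")
```

An RE above zero indicates that the model estimates the withheld years better than the calibration mean would; RMSE is in the same units as the flow record.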
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
![]() |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Blue River natural gaged flow record, 1916-1999, showing the major climate fluctuations affecting the west slope of Colorado and much of the intermountain West in the 20th century: generally wet conditions in the 1910s and the 1920s, generally dry conditions throughout the 1930s, severe drought in the mid-1950s, drought episodes in the mid-1960s, extreme winter-spring droughts in 1977 and 1981, and the relative absence of drought from 1982 to 1999, with very wet conditions in 1983-84 and again in 1995-97. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| In the Blue River reconstruction, the approach of splitting the common time
period into halves did not work well because the halves of the streamflow
record had notably different variance, range of values, and mean. Instead,
we used the PRESS method. At each regression run, one case was omitted and
estimated until each case had been estimated, generating a time series of
independently estimated values.

Model validation statistics compare the observed gage record to the series of
individually estimated cases, called the validation series. The statistics
reported are the correlation between the validation series and the gage
record (Rval), the reduction of error (RE), and the root mean
squared error (RMSE). The RE tests the skill of the regression model in
estimating the gage values relative to a prediction based on no knowledge
(the mean of the calibration period for the gage record is used as "no
knowledge"). The RE can be treated as the validation-series equivalent
of the explained variance in the original regression (R²cal).
The RMSE is a measure of the average size of
the prediction error for the validation series. It is given in the original
units of the gage data, and can be compared to the standard error
of the estimate in the original regression.

Another validation approach used is a Linear Neural Network (LNN), which, as in the split-sample approaches above, assesses the ability of the predictors selected in the stepwise process to estimate the gage values. In general, an LNN is numerically equivalent to a linear regression model, but uses an iterative process to generate estimates of flow. It should yield explained variance (R²) and estimated values very similar to the regression results, so the comparison of R² values for the calibration and LNN models is one check on the robustness of the predictors in estimating flow. We used an LNN program (NEVPROP) that employs a bootstrapping process to assess bias in explained variance (R²) and to generate confidence intervals. Here, the bootstrapping was done 500 times, each time drawing a random set of cases, equal in number to the original data set, with replacement. For each of the 500 runs, the entire model-fitting process is repeated, yielding a set of estimates and an R² value. The set of 500 R² values is used to generate a bias-adjusted R². |
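The bootstrap resampling described above can be illustrated with a short sketch. Two substitutions are made and should be read as assumptions: ordinary least squares stands in for the linear neural network (the text notes the two are numerically equivalent in general), and the bias adjustment shown is the common "optimism" correction, in which the average amount by which each bootstrap model's R² on its own resample exceeds its R² on the original data is subtracted from the apparent R². Whether NEVPROP computes its bias-adjusted R² in exactly this way is not confirmed here. All names and data are hypothetical.

```python
import numpy as np

def ols_fit(X, y):
    """Least squares coefficients, intercept first."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def r2_score(coef, X, y):
    """R squared of a fitted model evaluated on the data (X, y)."""
    A = np.column_stack([np.ones(len(y)), X])
    resid = y - A @ coef
    return 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())

def bootstrap_adjusted_r2(X, y, n_boot=500, seed=0):
    """Return (apparent R squared, optimism-corrected R squared)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    apparent = r2_score(ols_fit(X, y), X, y)
    optimism = np.empty(n_boot)
    for b in range(n_boot):
        # Draw n cases with replacement, as in the text.
        idx = rng.integers(0, n, size=n)
        coef_b = ols_fit(X[idx], y[idx])
        optimism[b] = r2_score(coef_b, X[idx], y[idx]) - r2_score(coef_b, X, y)
    return apparent, apparent - float(optimism.mean())

# Synthetic data (assumption): five predictors, 84 common years.
rng = np.random.default_rng(3)
X = rng.normal(size=(84, 5))
y = X @ np.array([1.0, 0.5, 0.3, 0.3, -0.3]) + rng.normal(size=84)

apparent, adjusted = bootstrap_adjusted_r2(X, y)
print(f"apparent R2 = {apparent:.3f}, bias-adjusted R2 = {adjusted:.3f}")
```

A small gap between the apparent and adjusted values suggests that the selected predictors are not heavily tuned to the particular set of calibration years.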
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
![]() |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Gaged (blue) and reconstructed (green) records for Blue River flow over the calibration/validation period (1916-1999). Note that the fit between the two records is poorest during the 1930s; the features of this decade are generally not captured well by trees on the west slope. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
One
result inherent to the least-squares regression process is that reconstructions
have reduced variance relative to the gaged record, so that wet extremes
are typically (not always) underestimated, and dry extremes, overestimated.
Wet extremes also tend to be underestimated because of tree physiology;
in years when moisture is sufficiently plentiful (such as 1983-84, above),
the trees' growth may not respond to additional inputs of moisture. But
overall, the trees reproduce both the year-to-year variability and decadal-scale
trends in streamflow very well.

Evaluating the Calibration/Validation Statistics

In evaluating the reconstruction models, the higher the explained variance
(R²) in the calibration, and the smaller the standard error,
the better, but the validation statistics are needed to demonstrate that
the regression is not overly tuned to the calibration data, and to provide
a more robust assessment of the quality of the reconstruction model. The
validation statistics are based on data not used in the calibration or,
in the case of the LNN, on an iterative method that uses randomly selected
cases. To evaluate the quality of the reconstruction, compare the calibration
statistics with their validation counterparts.

The calibration and validation statistics for the Blue River model are reported below, based on the years 1916-1999: |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| (1) |
|
(2) |
|
(3) |
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The
statistics based on the validation are lower than the calibration statistics,
showing decreasing skill--as would be expected when tested on independent
or bootstrapped data--but the decrease is relatively modest. Tree-ring
reconstructions that explain 50% or more of the variance in the instrumental
record are considered good, particularly if the validation's explained
variance is also 50% or more. Here, about 63% of the variance in the Blue
River gage is explained by the full calibration model, and the various
validation statistics indicate that at least 56% of the variance is accounted
for when the predictors are tested on validation data. The Blue River
reconstruction is considered a high-quality reconstruction.

Once the model is calibrated and validated, the predictor chronologies and
their regression coefficients are used to reconstruct estimates of streamflow
for the years covered by the tree-ring chronologies. This is done by entering
the chronologies' values into the regression equation and calculating
the estimated streamflow for each year. For the Blue River reconstruction,
the regression equation is:

Blue River gage estimate = 49642.0 + DIL (74039.9) + PUM (62346.5) + COD (27425.1) + GOU (50232.9) - MTR (40977.8)

Each of the five chronologies extends back at least to 1440, so the full reconstruction covers 1440-1999. |
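As a worked illustration of how the regression equation above is applied, the sketch below evaluates it for a single year. The coefficients are copied from the equation; the chronology values are hypothetical (every index is set to 1.0 purely for illustration), and the function name `estimate_flow` is an assumption of this example.

```python
# Coefficients copied from the Blue River regression equation.
INTERCEPT = 49642.0
COEFFICIENTS = {
    "DIL": 74039.9,
    "PUM": 62346.5,
    "COD": 27425.1,
    "GOU": 50232.9,
    "MTR": -40977.8,  # MTR enters the equation with a negative sign
}

def estimate_flow(chronology_values):
    """Estimate annual flow from one year's chronology index values."""
    missing = set(COEFFICIENTS) - set(chronology_values)
    if missing:
        raise KeyError(f"missing chronology values: {sorted(missing)}")
    return INTERCEPT + sum(
        COEFFICIENTS[site] * chronology_values[site] for site in COEFFICIENTS
    )

# Hypothetical index values for one year (not actual data).
example_year = {"DIL": 1.0, "PUM": 1.0, "COD": 1.0, "GOU": 1.0, "MTR": 1.0}
estimate = estimate_flow(example_year)
print(f"estimated flow = {estimate:.1f}")
```

Repeating this calculation for every year in which all five chronologies have values yields the full reconstruction.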
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
![]() |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The full reconstruction of the Blue River above Dillon Reservoir flow record, with annual values (green) and a 5-year weighted mean (black) |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Because the reconstruction model explains most, but not all, of the variance in the gage record, there are uncertainties in the reconstructed values. Estimates of uncertainty can be described by confidence intervals (CIs) around the reconstruction. These confidence intervals describe the range of uncertainty (usually at a 95% level) that can be expected in the estimates. Narrow confidence intervals represent a stable reconstruction model. There are several ways to estimate confidence intervals. Two of these are the use of bootstrapped series generated in the iterative model-fitting process of the linear neural network, and the use of the root mean squared error in the regression equation. In this case, we used bootstrapping to generate 95% confidence intervals, which indicate the range of possible regression equation solutions in the calibration period. The CIs for the full reconstruction are estimated by taking the standard deviation of the errors expressed by the calibration period CIs (i.e., the standard deviation of the difference between the 95% CI and each value), multiplying by two, and adding this value to or subtracting it from the mean error of the calibration period. This is essentially 95% of the 95% CI, so it is a conservative estimate. The result is then added to the reconstructed values to generate the estimated +95% CI and subtracted to generate the -95% CI.
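Of the two approaches named above, the simpler one to illustrate is the band based on the root mean squared error. The sketch below shows that alternative, not the bootstrapped procedure actually used for the Blue River reconstruction. It assumes approximately normal errors, so that a 95% band is the estimate plus or minus 1.96 times the RMSE; the RMSE and flow values are hypothetical.

```python
import numpy as np

def rmse_confidence_band(reconstruction, rmse, z=1.96):
    """Return (lower, upper) bounds: estimate -/+ z * RMSE for each year."""
    recon = np.asarray(reconstruction, dtype=float)
    half_width = z * rmse
    return recon - half_width, recon + half_width

# Hypothetical reconstructed flows and validation RMSE (not actual values).
reconstructed = [150000.0, 210000.0, 95000.0]
lower, upper = rmse_confidence_band(reconstructed, rmse=40000.0)
for est, lo, hi in zip(reconstructed, lower, upper):
    print(f"{est:.0f}  (95% band: {lo:.0f} to {hi:.0f})")
```

Because the same half-width is applied to every year, this band has constant width over the whole reconstruction.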
The extended streamflow reconstructions generated from tree rings provide a basis for many different analyses that may be useful to water resource management. Several examples of such analyses are described below. It is important to recognize that these results are for one gage (the Blue River above Dillon Reservoir) and one reconstruction, and these specific results should not be applied elsewhere. Although similar results are found for other gages in the Upper Colorado, reconstructed drought years do vary somewhat, as a consequence of local differences, quality of the gaged data, and uncertainties in the reconstruction model.

Long-term assessment of modern drought events
Many water managers have considered 2002 to be the third year of a three-year drought. When considered as a three-year event, this drought is much less rare. In the Blue River gage record (1916-1999) alone, the cumulative severity of 2000-2002 was exceeded six times, most recently in 1975-1977. The reconstruction confirms this three-year drought as being unexceptional, with 48 three-year droughts exceeding 2000-2002, even without considering the uncertainty in the reconstruction.

Changes in distribution of drought years
In the 20th century, there were only five years with flow in the lowest 10th percentile, fewer than in any other full century. In every other full century except the 18th, more than double this number occurred (19th: 11; 18th: 9; 17th: 13; 16th: 13; 1440-1499: 5). In addition, there are several instances of back-to-back extreme dry years, most notably the three-year sequence 1845-47. This figure also shows sequences of years when flow was below the 40th percentile for many consecutive years. For example, for nine years, 1453-1461, no flows were above the 40th percentile. This reconstruction also shows that many extremely dry years are preceded or followed by very wet years. The period from 1580 to 1588 contains two sets of two consecutive extremely dry years, but both sets are followed by a very wet year. This representation of the Blue River reconstruction makes it clear that there has been a great deal of variability in streamflow over the past five centuries.
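Counts of this kind can be sketched as follows. This is an illustration only: the flows are synthetic, the function names are hypothetical, and the percentile thresholds are assumed to be computed from the full reconstructed series, which the text does not state. The first function counts years at or below the 10th percentile by century (labelled by the century's starting year, so 1440-1499 falls under 1400); the second finds the longest run of consecutive years below the 40th percentile.

```python
import numpy as np

def dry_year_counts(years, flows, percentile=10.0):
    """Count years at or below the given flow percentile, by century."""
    years = np.asarray(years)
    flows = np.asarray(flows, dtype=float)
    threshold = np.percentile(flows, percentile)
    dry_years = years[flows <= threshold]
    counts = {}
    for year in dry_years:
        century = int(year) // 100 * 100  # e.g. 1845 -> 1800
        counts[century] = counts.get(century, 0) + 1
    return float(threshold), counts

def longest_run_below(flows, percentile=40.0):
    """Length of the longest run of consecutive years below the percentile."""
    flows = np.asarray(flows, dtype=float)
    threshold = np.percentile(flows, percentile)
    longest = current = 0
    for below in flows < threshold:
        current = current + 1 if below else 0
        longest = max(longest, current)
    return longest

# Synthetic reconstruction (assumption) covering 1440-1999.
rng = np.random.default_rng(4)
years = np.arange(1440, 2000)
flows = rng.normal(loc=200000.0, scale=50000.0, size=years.size)

threshold, counts = dry_year_counts(years, flows)
print(f"10th percentile threshold = {threshold:.0f}")
for century in sorted(counts):
    print(f"{century}s: {counts[century]} extremely dry years")
print(f"longest run below 40th percentile: {longest_run_below(flows)} years")
```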
For the Blue River, the years preceding extremely dry years show a slight tendency to be drier than average. In contrast, years following extremely dry years tend to be wet or moderately wet, although there is a secondary peak of dry years. Again, these tendencies are for the record of flow in the past. They may provide some guidance as to what to expect in the future, but it is important to note that the reconstructions cannot be used as predictive tools. The climate of the past is likely not an analogue to the climate of the future because of human impacts on climate in the 20th century, which will doubtless continue into the future. The tree-ring reconstructions of streamflow provide a record of natural hydroclimatic variability over which human impacts on climate will be superimposed.
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
NEVPROP: NevProp
Artificial Neural Network Software with Cross-Validation and Bootstrapped
Confidence Intervals. NevProp
is a feedforward backpropagation multilayer perceptron simulator, that
is, statistically speaking, a multivariate nonlinear regression program.
NevProp3 is distributed for free under the terms of the GNU Public License
and can be downloaded from http://brain.cs.unr.edu/publications/NevProp.zip
and http://brain.cs.unr.edu/publications/NevPropManual.pdf |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||