
MEMORANDUM

To: Anne LeHuray, Pavement Coatings Technology Council

From: Tom Gauthier, ENVIRON

Date: February 9, 2014

Subject: Initial Review of the Pavlowsky Paper

ENVIRON has performed an initial evaluation of the journal article entitled "Coal-tar pavement sealant
use and polycyclic aromatic hydrocarbon contamination in urban stream sediments," recently published
by Pavlowsky (2013). The paper is based on data collected and reported in an earlier report entitled
"Baseline Study of PAH Sources and Concentrations in Pond and Stream Sediments, Springfield,
Missouri" (Pavlowsky, 2012).

The author purports to show a strong link between the percentage of sealed parking lots in the
upstream watershed area and downstream sediment polycyclic aromatic hydrocarbon (PAH) levels,
but the analysis is flawed on a number of levels. Many of the "downstream sediment" samples are not
stream sediments at all, but rather parking lot particulate samples collected at the edge of a parking
lot. An example of the locations designated "downstream sediment" is shown in Figure 1.

Figure 1. Location of Pavlowsky Sample 41 with second highest total PAH concentration
classified as downstream sediment.

ENVIRON International Corp. 10150 Highland Manor Drive, Suite 440, Tampa, FL 33610
V +1 813.628.4325 F +1 813.628.4983
environcorp.com
LeHuray -2- February 9, 2014

The author uses the term "urban stream sediments" as a catch-all phrase to include parking lot
sediments as well as lake, pond and stream sediments. In fact, the majority of samples are classified
as parking lot sediments, which were collected from lot edges, inlet structures, and adjacent storm
water basin channel beds (Pavlowsky, 2012). The problem is that the sample collected from location
41, along with samples collected at other similar parking lot sediment sampling locations, is included
in a regression model to predict downstream sediment PAH levels. Mixed in with these parking lot
samples are a handful of lake, pond and stream sediment samples from actual downstream locations
that show no relationship between the percentage of sealed parking lots in the upstream watershed
area and sediment PAH levels.

In the paper, the author draws a number of conclusions about PAH levels in urban stream sediments
that are unsupported by the data. The author concludes that sediment PAH concentrations are
strongly correlated with the percentage of sealed parking lot area within the upstream drainage area of
the sampling site, in contrast to total parking lot area or sediment composition. Yet, the strongest
correlation in the reported analyses is observed between the sum of the 16 different PAH
concentrations measured in the study (PAH16) and the total organic carbon (TOC) content of the
sediment. This is a well-established relationship that is used to predict PAH levels in sediments and
soils in numerous other studies. Moreover, both the sealed lot area (SLA%) and the total lot area
(TLA%) are positively correlated with PAH16 (r = 0.65 and 0.39, respectively) and positively correlated
with each other (r = 0.63), which makes it difficult to discern any significant differences between the
two.

The author also concludes that parking lots with coal tar coatings contribute >80% of the total PAH
concentration in urban stream and pond sediments in Galloway Creek. This statement is based on a
flawed regression estimate that:

relies primarily on parking lot sediment samples and not Galloway Creek sediments;

uses sealed lot area as a proxy for coal tar coatings without verification that the coating
is actually coal tar-based as opposed to asphalt-based;

cannot mathematically account for parking lot areas with no sealer applied due to
inappropriate use of logarithms; and

includes independent variables that are significantly correlated with each other and not
truly independent, thus introducing multicollinearity into the model.

What the author refers to as an important element of the study (i.e., the interaction of SLA% and
TLA% in the regression model) is instead a major flaw in the statistical interpretation because the two
independent variables are correlated with each other. The author recognizes this flaw but proceeds
with the analysis anyway.

"By including both variables in the regression analysis, it was anticipated that the effects of
PAH contributions from coal-tar lots could be distinguished from PAH contributions from
unsealed lots and other urban non-point sources based on the relative significance of one
variable compared to the other. However, the two variables are not completely
independent of one another because SLA% is a subset of TLA% and is equal to or less
than the total area." (p. 402)

The correlation between sealed lot area (SLA%) and total lot area (TLA%) is shown in Figure 2. Aside
from the obvious distinction for the three unsealed parking lots (red squares) and two other mixed
sealed/unsealed lot areas, there is an obvious, strong relationship between the sealed lot area and
total lot area.

On average, the sealed lot area is 66% of the total lot area for all lake, pond and stream samples.
Only two of the stream samples deviate by more than 5% from this ratio. However, even though the
ratio of SLA% to TLA% is approximately constant, PAH16 concentrations in the
lake, pond and stream samples vary by more than two orders of magnitude. For example, at location
14, the ratio of SLA%/TLA% is 0.68 and the PAH16 concentration is 266 ppm, while at location 21, the
ratio of SLA%/TLA% is 0.71 and the PAH16 concentration is 2 ppm. Thus, the relative contribution of
the two so-called source variables (i.e., SLA% and TLA%), which is approximately the same for all the
lake, pond and stream samples, has no apparent effect on PAH16 concentrations.
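The comparison between locations 14 and 21 can be checked with a few lines of arithmetic. The sketch below uses only the two pairs of values quoted above; the variable and function names are illustrative and do not come from the paper.

```python
import math

# Values quoted in the memo for two stream sampling locations.
samples = {
    14: {"sla_tla_ratio": 0.68, "pah16_ppm": 266.0},
    21: {"sla_tla_ratio": 0.71, "pah16_ppm": 2.0},
}

def orders_of_magnitude(high, low):
    """Base-10 log of the ratio between two concentrations."""
    return math.log10(high / low)

# Difference in the SLA%/TLA% ratio between the two locations.
ratio_spread = abs(samples[21]["sla_tla_ratio"] - samples[14]["sla_tla_ratio"])

# Difference in PAH16 concentration, expressed in orders of magnitude.
pah_spread = orders_of_magnitude(samples[14]["pah16_ppm"], samples[21]["pah16_ppm"])

print(f"SLA%/TLA% ratios differ by {ratio_spread:.2f}")
print(f"PAH16 concentrations differ by {pah_spread:.2f} orders of magnitude")
```

The source ratio barely moves between the two locations, while the concentrations differ by more than a factor of 100.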

[Figure 2 image: scatter plot of Sealed Lot Area (%) against Total Lot Area (%), both axes from 0 to 100, with series for Sealed Parking Lots, Unsealed Parking Lots, and Lake, Ponds and Streams.]

Figure 2. Correlation between sealed lot area and total lot area.

The author suggests that the effects of PAH contributions from coal-tar lots could be distinguished
from PAH contributions from unsealed lots and other urban non-point sources based on the relative
significance of one variable compared to the other. To do this, the author calculates a user-defined
"lot source index" value to evaluate the relative significance of total lot area and sealed lot area,
but the concept is flawed. The lot source index value is defined by the author as the ratio of the
p-value for the TLA% regression parameter to the p-value for the SLA% regression parameter. An index
value greater than 1.2 is interpreted as representing a greater significance of SLA% in the regression
model compared to TLA%. This interpretation has no formal statistical basis and reflects a fundamental
misunderstanding of the regression modeling output.

In a regression model of the type $\log Y = \beta_0 + \beta_1 \log X_1 + \beta_2 \log X_2$, the calculated p-value indicates
whether the regression coefficient ($\beta$) is significantly different from zero; in other words,
whether the predicted effect of the independent variable is real or not distinguishable from zero.
The p-value has no bearing on the magnitude of the effect of the variable on the model output.
Rather, the magnitude of the influence of the independent variable on the predicted value is
determined by the regression coefficient itself.

If the log of both the dependent and independent variables is taken, as the author has done, then the
regression coefficients are termed elasticities, and the interpretation is that a 1% increase in the
independent variable leads to a $\beta$% increase in Y, on average, with all other
parameters being equal (see footnote 1). According to Table 4 of the paper, using the author's output, a 1% increase
in SLA% leads to a 0.79% increase in PAH16, whereas a 1% increase in TLA% leads to a 1.67%
increase in PAH16, on average, with all other parameters held equal. Thus, the total lot area has a
larger relative effect than SLA% on PAH16 with all other parameters held constant, which is opposite
to the conclusion drawn by the author based on his calculated lot source index value.
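The elasticity reading above can be illustrated with a short calculation. This is a minimal sketch that assumes the log-log model form described in the text and uses the two coefficients quoted from Table 4; the function name is hypothetical. It also reproduces the exact-versus-approximate comparison given in footnote 1.

```python
def percent_change_in_y(beta, fractional_increase_in_x):
    """Exact percent change in Y in a log-log model when X rises by the given fraction.

    Y scales as X**beta, so Y is multiplied by (1 + x)**beta.
    """
    return ((1.0 + fractional_increase_in_x) ** beta - 1.0) * 100.0

# Elasticities quoted in the memo from Table 4 of the paper.
BETA_SLA = 0.79
BETA_TLA = 1.67

# Effect of a 1% increase in each lot-area variable, all else held equal.
sla_effect = percent_change_in_y(BETA_SLA, 0.01)
tla_effect = percent_change_in_y(BETA_TLA, 0.01)

# Footnote 1 example: beta = 0.8 and a 10% increase in X.
footnote_example = percent_change_in_y(0.8, 0.10)

print(f"1% increase in SLA% -> {sla_effect:.2f}% increase in PAH16")
print(f"1% increase in TLA% -> {tla_effect:.2f}% increase in PAH16")
print(f"beta = 0.8, 10% increase in X -> {footnote_example:.2f}% increase in Y")
```

For a 1% change the exact result is very close to the coefficient itself, which is why the coefficients can be compared directly as relative effects.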

As a further demonstration of the lack of validity of this concept, the lot source index values for
nitrogen and phosphorus nutrients (N% and P%) are 10.5 and 8.7, respectively, compared to a value of
2.1 for PAH16. Total organic carbon (TOC%) also yields a high lot source index value of 11.6. Using
the author's interpretation of the lot source index value, the reader would conclude that coal tar sealed
parking lots are a much more important source of nitrogen and phosphorus than other sources such
as fertilizers, and that sealed parking lots are a significant source of TOC. This makes no sense,
particularly since TOC is present at percent levels in the soil/sediment samples and originates from the
breakdown of organic matter, not sealed parking lots.

Even the three-parameter regression model developed by the author suggests the concept is flawed.
For example, the author's three-parameter regression model, which includes TOC as a third variable,
suggests that TLA% is more significant than SLA% using his lot source index criterion.

1
This is an approximation that holds for small fractional increases in the independent variable X. The formal result is
that for a fractional increase x in the independent variable X, the dependent variable Y is multiplied by $(1 + x)^{\beta}$.
For example, if $\beta = 0.8$, a 10% (0.1) increase in X multiplies Y by $(1.1)^{0.8} = 1.079$, or an approximately 8% increase in Y.

"Both parking lot variables are highly significant in the equation. However the significance
of the TLA% parameter (p = 0.000010) is about nine times more significant than the
SLA% parameter (p = 0.000093) (Table 5). This result indicates the interaction of the
SLA% variable with TOC in the model and is not interpreted as a stronger effect of TLA%
on PAH16."

To resolve this conflict, and demonstrating an inherent bias toward concluding a priori that SLA% is an
important indicator variable, the author simply suggests that the addition of the third variable (TOC) is
masking the relationship. In effect, he suggests that the regression model be used to determine how
much masking is occurring of a relationship that he assumes a priori to be true.

"For Galloway Creek, regression analysis was used to evaluate the degree to which the
effects of sediment transport and geochemistry may be clouding the source relationship
with SLA% and TLA%."

Thus, although the lot source index in this three parameter regression model is 0.1, the author
suggests that this is due to the interaction of the SLA% variable with TOC in the model and supports
this claim with confusing reasoning.
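The lot source index for the three-parameter model can be reproduced from the two p-values quoted above. This is a minimal sketch of the index as the memo describes it (the p-value for TLA% divided by the p-value for SLA%); the function name and the constant for the 1.2 threshold are illustrative labels, not names from the paper.

```python
def lot_source_index(p_tla, p_sla):
    """Ratio of the TLA% p-value to the SLA% p-value, as defined in the paper."""
    return p_tla / p_sla

# p-values quoted from Table 5 of the paper for the three-parameter model.
P_TLA = 0.000010
P_SLA = 0.000093

# Threshold above which the author reads SLA% as the more significant variable.
SLA_DOMINANT_THRESHOLD = 1.2

index = lot_source_index(P_TLA, P_SLA)
sla_dominant = index > SLA_DOMINANT_THRESHOLD

print(f"lot source index = {index:.2f}")
print(f"SLA% p-value is {P_SLA / P_TLA:.1f} times the TLA% p-value")
print(f"index exceeds {SLA_DOMINANT_THRESHOLD}: {sla_dominant}")
```

By the author's own criterion, an index this far below 1.2 would point to TLA% rather than SLA%, which is the conflict discussed above.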

The regression model developed by the author makes no physical sense and breaks down at zero
sealed lot area. Whereas it would be logical to hypothesize that the downstream sediment PAH
concentration is the sum or linear combination of various input variables, the author elected to use the
log of the input parameters thereby assuming a power law relationship. For example, the three
parameter model included in the paper is shown in equation (1).

$$\log(\mathrm{PAH}_{16}) = 0.676 + 0.656 \log(\mathrm{SLA\%}) + 1.63 \log(\mathrm{TLA\%}) + 1.2 \log(\mathrm{TOC}) \tag{1}$$

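This equation can be rewritten in the form of the power law relationship it represents (eq. 2), taking Log as the base-10 logarithm so that the intercept becomes a multiplicative constant of $10^{0.676}$.

$$\mathrm{PAH}_{16} = 10^{0.676}\,(\mathrm{SLA\%})^{0.656}\,(\mathrm{TLA\%})^{1.63}\,(\mathrm{TOC})^{1.2} \tag{2}$$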

There is some merit in taking the log of PAH16 to normalize the concentration data, which are often
log-normally distributed; however, it is unclear why other variables are log-transformed in the model
other than to artifactually improve model fit. Doing so comes at a price because the log of zero is
undefined and thus the model cannot address unsealed lot areas (i.e., upstream areas where SLA% =
0).
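The breakdown at zero sealed lot area can be demonstrated directly. The sketch below uses the coefficients of equation (1) as quoted in this memo and assumes that Log denotes the base-10 logarithm; the TLA% and TOC inputs are hypothetical values chosen only for illustration, and the function names are not from the paper.

```python
import math

# Coefficients of equation (1) as quoted in the memo (Log assumed to be base 10).
B0, B_SLA, B_TLA, B_TOC = 0.676, 0.656, 1.63, 1.2

def pah16_log_form(sla, tla, toc):
    """Evaluate the model in its log-linear form, then undo the log."""
    log_pah = (B0 + B_SLA * math.log10(sla)
               + B_TLA * math.log10(tla) + B_TOC * math.log10(toc))
    return 10.0 ** log_pah

def pah16_power_form(sla, tla, toc):
    """Evaluate the equivalent power-law form of the same model."""
    return (10.0 ** B0) * sla ** B_SLA * tla ** B_TLA * toc ** B_TOC

# Hypothetical inputs for illustration only (not data from the study).
tla, toc = 20.0, 2.0

# The two forms agree wherever the logarithm is defined.
forms_agree = math.isclose(pah16_log_form(10.0, tla, toc),
                           pah16_power_form(10.0, tla, toc), rel_tol=1e-9)

# At SLA% = 0 the log form cannot be evaluated at all.
try:
    pah16_log_form(0.0, tla, toc)
    log_form_defined_at_zero = True
except ValueError:
    log_form_defined_at_zero = False

# The power-law form returns zero PAH16 regardless of TLA% and TOC.
power_form_at_zero = pah16_power_form(0.0, tla, toc)

print(forms_agree, log_form_defined_at_zero, power_form_at_zero)
```

Either way, the model has no meaningful prediction for a watershed with only unsealed lots: the log form is undefined, and the power-law form forces the predicted concentration to zero whatever the other inputs are.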

In order to address areas with concrete and unsealed parking lots, the author inserts an arbitrary proxy
value of 0.1% to arrive at equation (1). Curiously, the author used a different proxy value of 1% in his
earlier report to arrive at a different model, shown in equation (3). Yet another result (eq. 4) is
obtained if 0.01% is used as a proxy value.

$$\log(\mathrm{PAH}_{16}) = 0.77 + 1.05 \log(\mathrm{SLA\%}) + 1.23 \log(\mathrm{TLA\%}) + 1.19 \log(\mathrm{TOC}) \tag{3}$$

$$\log(\mathrm{PAH}_{16}) = 0.63 + 0.472 \log(\mathrm{SLA\%}) + 1.82 \log(\mathrm{TLA\%}) + 1.22 \log(\mathrm{TOC}) \tag{4}$$

Clearly, the model chosen by the author is flawed because it depends strongly on the choice of the
arbitrary proxy value used to represent unsealed lot areas. Yet the author uses this model to
purportedly predict the effect of banning the use of coal tar sealants by using the model and setting the
contribution of sealed lots to zero. In doing this, the author predicts that banning the use of coal tar
sealants has the potential to reduce sediment PAH levels by 80-99% in urban streams and ponds.
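The dependence on the proxy value can be illustrated with the SLA% exponents of equations (1), (3) and (4). The sketch below is one possible reading of "setting the contribution of sealed lots to zero", namely replacing the current SLA% with the proxy value while holding TLA% and TOC fixed; the author's actual procedure may differ. The starting SLA% of 10 is a hypothetical value, and the names are illustrative.

```python
# (proxy value substituted for SLA% = 0, fitted SLA% exponent) for each model
# quoted in the memo.
MODELS = {
    "eq. 1 (proxy 0.1%)": (0.1, 0.656),
    "eq. 3 (proxy 1%)": (1.0, 1.05),
    "eq. 4 (proxy 0.01%)": (0.01, 0.472),
}

def predicted_reduction(sla_now, proxy, sla_exponent):
    """Percent drop in modeled PAH16 when SLA% falls from sla_now to the proxy.

    TLA% and TOC are held fixed, so their terms and the intercept cancel
    in the ratio of the two predictions.
    """
    return (1.0 - (proxy / sla_now) ** sla_exponent) * 100.0

# Hypothetical current sealed lot area, for illustration only.
SLA_NOW = 10.0

reductions = {
    name: predicted_reduction(SLA_NOW, proxy, exponent)
    for name, (proxy, exponent) in MODELS.items()
}

for name, reduction in reductions.items():
    print(f"{name}: predicted PAH16 reduction of {reduction:.1f}%")
```

Under these assumptions the predicted benefit of a ban shifts by several percentage points depending only on which arbitrary stand-in for zero was used when fitting the model.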

Although the model is reported to explain 85% of the variance in PAH16 concentrations, the model
residuals are not normally distributed and show that the model tends to overpredict PAH16
concentrations, in some cases by more than 300%. This is shown in Figure 3, which presents the
distribution of model residuals as a percentage of the actual value, i.e., (predicted PAH16 - actual
PAH16)/actual PAH16.

[Figure 3 image: histogram of Frequency against Modeled PAH16 (% Difference from Actual), with the horizontal axis running from -100% to 400%.]

Figure 3. Distribution of Modeled PAH16 Levels: Percent Difference from Actual

Other Issues

The author does not treat duplicates consistently. Averaged values should be reported for all
duplicates. The author selected the higher of the duplicates for the parking lot samples (locs.
34 and 42), but the lower of the duplicates for the stream samples (locs. 14 and 33) and the
higher of the duplicates for the pond sample (loc. 15). For example, at parking lot location 42,
the duplicate results for PAH16 were 1,665,620 and 2,652,610 µg/kg and the author selected
the higher value, whereas at stream location 14, the duplicate analyses were 266,580 and
280,870 µg/kg and the author selected the lower value (see footnote 2).

As stated on page 395, the purpose of the study is to quantify and evaluate the spatial
patterns of PAH concentrations and source-sink relationships in stream and pond sediments in
an urban watershed in the City of Springfield, Missouri. Yet the analysis is based on only four
stream samples, four samples from two ponds, and two samples from a single lake.
Most of the samples (12 of 22) are classified as parking lot samples.
Although a map of the sampling locations is provided, there are no spatial coordinates or
distance measures presented to indicate how the "urban stream sediments" are spatially
related or located with respect to hypothesized urban source areas.

The author assumes that all sealed lots are sealed with a coal-tar-based sealer as opposed to
an asphalt-based sealer, yet evidence suggests that asphalt-based sealers are also used in
Springfield. According to the author, information on the type of sealant used on each lot was
unavailable; however, the primary wholesaler of coal-tar and asphalt sealers in the Springfield
area noted that 85% of its sales are coal-tar sealant and 15% are asphalt-based sealant. While
the largest applicators in town applied 95% coal-tar sealant, and two other applicators also
applied primarily coal-tar-based sealers, two other applicators replied that they apply only
asphalt-based sealers. Clearly, some fraction of the parking lots in the Springfield area are
sealed with asphalt-based sealer, yet the author has assumed that all sealed lots in this study
are sealed with coal-tar-based sealer.

The author notes that most PAHs are typically concentrated in the low density fraction of the
sediments, which typically accounts for only a small percentage of the total sediment mass,
yet he does not measure or report this important sediment characteristic or evaluate it in his
model.
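The duplicate handling raised in the first item under Other Issues can be made concrete with the two examples quoted there. The sketch below averages each duplicate pair, as this memo recommends, and shows how far the value the author selected sits from that average; the dictionary layout and function name are illustrative only.

```python
# Duplicate PAH16 results in ug/kg quoted in the memo, with the value the
# author is reported to have selected for each location.
duplicates = {
    "parking lot, location 42": {"results": (1_665_620, 2_652_610), "selected": 2_652_610},
    "stream, location 14": {"results": (266_580, 280_870), "selected": 266_580},
}

def duplicate_summary(results, selected):
    """Return the duplicate average and the selected value's percent deviation from it."""
    average = sum(results) / len(results)
    deviation_pct = (selected - average) / average * 100.0
    return average, deviation_pct

summaries = {
    name: duplicate_summary(entry["results"], entry["selected"])
    for name, entry in duplicates.items()
}

for name, (average, deviation_pct) in summaries.items():
    print(f"{name}: average {average:,.0f} ug/kg, "
          f"selected value differs by {deviation_pct:+.1f}%")
```

Selecting the higher duplicate for a parking lot sample and the lower duplicate for a stream sample moves the two values in opposite directions relative to their averages.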

2
There are slight differences between all PAH16 values reported in the paper (Pavlowsky, 2013) and PAH16
values included in the report (Pavlowsky, 2012) for the same location.
