10,356 HERNANDEZ: TIME SERIES, PERIODOGRAMS, AND SIGNIFICANCE
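and unequally weighted data are also presented. Following these, the periodogram, its statistical properties, and significance tests are described. A concrete example is used to illustrate the deduc-

$$Y(t_j) = \sum_{k=0}^{K} [a_k \cos(2\pi k T^{-1} t_j) + b_k \sin(2\pi k T^{-1} t_j)], \quad k = 0, 1, 2, \ldots \tag{1}$$

By least squares methods, the general best fit (for any of the k coefficients) is given when $\langle \epsilon^2 \rangle$ is a minimum, where

$$\frac{\partial \langle \epsilon^2 \rangle}{\partial a_k} = \frac{\partial \langle \epsilon^2 \rangle}{\partial b_k} = 0. \tag{3}$$

$$a_k = \frac{\sum_{j=1}^{N} Y_j \cos(2\pi k T^{-1} t_j)}{\sum_{j=1}^{N} \cos^2(2\pi k T^{-1} t_j)} - b_k \frac{\sum_{j=1}^{N} \sin(2\pi k T^{-1} t_j) \cos(2\pi k T^{-1} t_j)}{\sum_{j=1}^{N} \cos^2(2\pi k T^{-1} t_j)} \tag{4a}$$

$$b_k = \frac{\sum_{j=1}^{N} Y_j \sin(2\pi k T^{-1} t_j)}{\sum_{j=1}^{N} \sin^2(2\pi k T^{-1} t_j)} - a_k \frac{\sum_{j=1}^{N} \sin(2\pi k T^{-1} t_j) \cos(2\pi k T^{-1} t_j)}{\sum_{j=1}^{N} \sin^2(2\pi k T^{-1} t_j)}. \tag{5a}$$

Equations (4a) and (5a) show the lack of orthogonality between the coefficients. This is not clearly noticeable in the usual full derivation of the coefficients, such as the following for $b_k$:

$$b_k = \frac{\sum_{j=1}^{N} Y_j \sin(2\pi k T^{-1} t_j) \times \sum_{j=1}^{N} \cos^2(2\pi k T^{-1} t_j) - \sum_{j=1}^{N} Y_j \cos(2\pi k T^{-1} t_j) \times \sum_{j=1}^{N} \sin(2\pi k T^{-1} t_j) \cos(2\pi k T^{-1} t_j)}{\sum_{j=1}^{N} \sin^2(2\pi k T^{-1} t_j) \times \sum_{j=1}^{N} \cos^2(2\pi k T^{-1} t_j) - \left[\sum_{j=1}^{N} \sin(2\pi k T^{-1} t_j) \cos(2\pi k T^{-1} t_j)\right]^2}$$

$$\sum_{j=1}^{N} \cos(2\pi k T^{-1} t_j) \cos(2\pi n T^{-1} t_j) = \begin{cases} T/2, & n = k \neq 0 \\ T, & n = k = 0 \end{cases} \tag{6}$$

$$\sum_{j=1}^{N} \sin(2\pi k T^{-1} t_j) \sin(2\pi n T^{-1} t_j) = \begin{cases} T/2, & n = k \neq 0 \\ 0, & n = k = 0 \end{cases} \tag{7}$$

$$\sum_{j=1}^{N} \cos(2\pi k T^{-1} t_j) \sin(2\pi n T^{-1} t_j) = 0. \tag{8}$$

$$a_k = 2 T^{-1} \sum_{j=1}^{N} Y_j \cos(2\pi k T^{-1} t_j), \tag{9}$$

Note, however, that $a_0$ and $b_0$ are special cases, where

$$a_0 = T^{-1} \sum_{j=1}^{N} Y_j; \quad b_0 = 0.$$

Usually, the zero-frequency coefficient is of no interest; so the time series used in this study is that obtained by subtracting the mean value of the original series, i.e., redefining $Y_j = (Y_j - a_0)$. This operation is a simple redefinition of the ordinate axis and has the crucial property of preserving the variance. Implicit in this statement is that the time series is a realization of a stationary process of at least order 2 [Priestley, 1981]. That is to say, it has the same mean and variance at all time points, and the covariance between the values at any two time points depends on the interval between these time points and not the location of the points along the time axis.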
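The coupling between $a_k$ and $b_k$ in Equations (4a) and (5a) can be illustrated numerically. The following is a minimal sketch, assuming NumPy; the function name `fit_harmonic`, the test signal, and the sampling times are hypothetical choices made for illustration and are not taken from the paper. It solves the two normal equations for a single frequency and also returns the cross term $\sum \sin \cos$, which is what couples the two coefficients when the data are irregularly spaced.

```python
import numpy as np

def fit_harmonic(t, y, freq):
    """Least squares a, b for y ~ a*cos(2*pi*freq*t) + b*sin(2*pi*freq*t).

    Solves the pair of normal equations behind Equations (4a) and (5a)
    and also returns the cross term sum(sin*cos) that couples a and b.
    """
    c = np.cos(2.0 * np.pi * freq * t)
    s = np.sin(2.0 * np.pi * freq * t)
    cc, ss, cs = np.sum(c * c), np.sum(s * s), np.sum(c * s)
    yc, ys = np.sum(y * c), np.sum(y * s)
    det = cc * ss - cs * cs          # determinant of the 2x2 normal matrix
    a = (yc * ss - ys * cs) / det
    b = (ys * cc - yc * cs) / det
    return a, b, cs

# Hypothetical noise-free test signal sampled at irregular times.
rng = np.random.default_rng(0)
t_irregular = np.sort(rng.uniform(0.0, 100.0, 60))
freq = 0.07
y_irregular = (3.0 * np.cos(2.0 * np.pi * freq * t_irregular)
               - 2.0 * np.sin(2.0 * np.pi * freq * t_irregular))

a, b, cross = fit_harmonic(t_irregular, y_irregular, freq)
print("irregular spacing: a, b, cross term =", a, b, cross)

# Evenly spaced samples at a natural frequency k/N: the cross term vanishes.
n = 64
t_even = np.arange(float(n))
y_even = np.cos(2.0 * np.pi * 5.0 / n * t_even)
print("even spacing: cross term =", fit_harmonic(t_even, y_even, 5.0 / n)[2])
```

With evenly spaced samples and a natural frequency the cross term is zero to rounding error, so each coefficient can be computed on its own; with irregular sampling it is generally nonzero and the two equations must be solved together.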
Under these circumstances, the $a_k$ and $b_k$ are orthogonal up to the Nyquist limit, i.e., $K = k_{max} \leq N/2$. Because of their independence and orthogonality, these frequencies are known as the natural frequencies. As was pointed out earlier, the original data points are required to be equally spaced, and implicitly each point has the same uncertainty, $\sigma_j$ = constant. For the case where the time series is not evenly spaced, possibly because of missing data [Little and Rubin, 1987], or the data points have unequal weight, and the frequencies desired are not commensurate with the length of the data, harmonic fitting techniques must be employed. Under these circumstances, it is possible to approach the normal least squares Fourier series. Specifically, it must be mentioned that in the present context, not evenly spaced data are meant to be irregularly spaced data. This fitting treatment is due to Lomb [1976]. The weight associated with the uncertainties is defined as $w_j = \sigma_j^{-2}$, where $\sigma_j \neq \sigma_Y$. In addition, an arbitrary phase shift term $\tau_k$ is added to each of the trigonometric terms in Equation (2a). This is a simple redefinition of the axis, which does not alter the function. Thus:

$$\langle \epsilon^2 \rangle = \sum_{j=1}^{N} [Y_j - a_k \cos(2\pi k T^{-1}(t_j - \tau_k)) - b_k \sin(2\pi k T^{-1}(t_j - \tau_k))]^2 w_j. \tag{2b}$$

Treating Equation (2b) in the same manner as Equation (2), we obtain the following result:

$$a_k = \frac{\sum_{j=1}^{N} Y_j w_j \cos[2\pi k T^{-1}(t_j - \tau_k)]}{\sum_{j=1}^{N} w_j \cos^2[2\pi k T^{-1}(t_j - \tau_k)]} - \frac{b_k \sum_{j=1}^{N} w_j \sin[2\pi k T^{-1}(t_j - \tau_k)] \cos[2\pi k T^{-1}(t_j - \tau_k)]}{\sum_{j=1}^{N} w_j \cos^2[2\pi k T^{-1}(t_j - \tau_k)]} \tag{4b}$$

If we arbitrarily force the numerator of the second term of Equation (4b) to be zero, we have that:

$$b_k \sum_{j=1}^{N} w_j \sin[2\pi k T^{-1}(t_j - \tau_k)] \cos[2\pi k T^{-1}(t_j - \tau_k)] = 0. \tag{11}$$

It is then possible to solve for $\tau_k$:

$$\tau_k = T (4\pi k)^{-1} \tan^{-1}\left[\frac{\sum_{j=1}^{N} w_j \sin(4\pi k T^{-1} t_j)}{\sum_{j=1}^{N} w_j \cos(4\pi k T^{-1} t_j)}\right], \tag{12}$$

which, when replaced in Equation (4b), gives the desired answer:

$$a_k = \frac{\sum_{j=1}^{N} w_j Y_j \cos[2\pi k T^{-1}(t_j - \tau_k)]}{\sum_{j=1}^{N} w_j \cos^2[2\pi k T^{-1}(t_j - \tau_k)]}. \tag{9b}$$

Although it is possible to obtain estimates of the variance for the determined coefficients, this does not directly give a measure of the importance of a particular frequency in the spectrum relative to the other possible independent frequencies present. Also, if a fitting method is used because of the irregular spacing, unequal variance, etc., in the data, then the orthogonality properties which assured independence of the resultant coefficients are no longer applicable. These properties must be investigated independently. Therefore, it becomes necessary to use other means to obtain a measure of the true number of degrees of freedom in the data series. This topic will be discussed later in Section 5. Implicit in the following discussion is the minimizing of the effects of discontinuities, or edge effects, of real data samples having limited length. 'Tapering' the data with an appropriate window is often employed to accomplish this purpose [Blackman and Tukey, 1959; Priestley, 1981; Percival and Walden, 1993]. This topic will also be discussed.

3. Periodogram

Schuster [1898] defined the periodogram as a measure of the relative power of a time series as a function of frequency. He was searching for 'hidden periodicities', or small periodic variations hidden behind irregular fluctuations. Here our notation will change to the more common usage of $\omega = 2\pi k T^{-1}$. The periodogram is defined [Priestley, 1981]:

$$I(\omega) = \left[\sum_{j=1}^{N} Y_j \cos(\omega t_j) \times \left(\sum_{j=1}^{N} \cos^2(\omega t_j)\right)^{-1/2}\right]^2 + \left[\sum_{j=1}^{N} Y_j \sin(\omega t_j) \times \left(\sum_{j=1}^{N} \sin^2(\omega t_j)\right)^{-1/2}\right]^2, \tag{13}$$

which, in terms of the previous derivations, can be written as

$$I(\omega) = [A(\omega)]^2 + [B(\omega)]^2 = a^2(\omega) \sum_{j=1}^{N} w_j \cos^2(\omega t_j) + b^2(\omega) \sum_{j=1}^{N} w_j \sin^2(\omega t_j). \tag{14a}$$

For the case where the uncertainties of the data are the same (i.e., $w_j$ = constant) for all values and the data points are evenly spaced, it is easy to show that:

$$I(\omega) = N 2^{-1} [a^2(\omega) + b^2(\omega)] = N 2^{-1} c^2(\omega). \tag{14b}$$

This equation shows the 'multiplicative' effect of the periodogram [Priestley, 1981]. In this section it will be implicitly understood that the coefficients $a(\omega)$ and $b(\omega)$, and their associated quantities $A(\omega)$ and $B(\omega)$, are independent because of their orthogonal properties and/or their independently determined degrees of freedom of the time series.

For the null case, where the $Y_j$ consist of a sequence of independent random variables of zero mean and variance $\sigma_Y^2$, the set of $A(\omega)$ and $B(\omega)$ of Equation (13) is a linear combination of the $Y_j$ set and has a multivariate normal distribution. The set of $I(\omega)$ has a distribution which is proportional to $\chi^2$ in 2 degrees of freedom. Thus:

$$I_p(\omega) = \sigma_Y^2 \chi_2^2, \tag{15}$$

where the $\chi_2^2$ distribution is a simple exponential distribution having a mean $\nu$ and variance $2\nu$ (here $\nu = 2$). However, note that this is applicable only to the (maximum) number N/2 of independent $I(\omega)$, i.e., natural frequencies. Schuster [1898] tested the largest periodogram ordinate with the statistic $\gamma$:

$$\gamma = (I_p)_{max} \, \sigma_Y^{-2}; \quad 1 \leq p \leq N/2. \tag{16a}$$

In practical applications the variance $\sigma_Y^2$ must be estimated, preferably with an unbiased estimate such as the expectation value, rather than with the sample variance $s_Y^2$. For completeness, when the sample variance is replaced by $2 s_Y^2$, the resulting
expression of $\gamma$ can be recognized as the conventional Lomb-Scargle (LS) periodogram statistic [Press et al., 1986], e.g.,

$$\gamma_{LS} = (I_p)_{max} (2 s_Y^2)^{-1}; \quad 1 \leq p \leq N/2. \tag{16b}$$

Therefore the statistical test consists of checking whether or not the value of $\gamma$ differs from zero at some significance level for the $\chi_2^2$ distribution, i.e., whether or not all $I_p = 0$. The probability distribution of a $\chi_2^2$ is a simple exponential function [Hoel, 1954]:

$$f(z) = 2^{-1} \exp(-z/2). \tag{17}$$

Hence, for any value of $z > 0$, the probability that $I_p / \sigma_Y^2$ does not exceed z is given by:

$$p[(I_p / \sigma_Y^2) \leq z] = \int_0^z f(x) \, dx = 1 - \exp(-z/2). \tag{18a}$$

Under the null hypothesis that $\gamma$ represents one of the N/2 independently distributed exponential variables, then for any z:

$$p(\gamma > z) = 1 - p[(I_p / \sigma_Y^2) \leq z, \text{ for all } p] = 1 - [1 - \exp(-z/2)]^{N/2}. \tag{18b}$$

This makes it possible to test whether or not the largest value in a periodogram is statistically different from a zero mean distribution with variance $\sigma_Y^2$. If such a nonzero peak exists, the distribution is unlikely to be random, and this is the end of the test. To be significant, $\gamma$ must exceed the critical test value of:

geometrical arguments, for N odd, later analytically proven by Grenander and Rosenblatt [1957]. Fisher showed that the probability of the power at one frequency over the total power of the set can be expressed (relative to the arbitrary level z) by:

$$p[g > z] = \sum_{i=1}^{a} \frac{(-1)^{i-1} \, n! \, (1 - iz)^{n-1}}{i! \, (n-i)!}, \tag{22}$$

where a is the next largest integer greater than $z^{-1}$. The first term of the expansion of Equation (22) can be recognized as the expansion of a simple exponential, such that for large values of n (n >> 10,000):

$$p[g^* > z] = 1 - [1 - \exp(-z/2)]^n, \tag{23}$$

which is the same result obtained in Equation (18b).

The critical value z is the highest value in the periodogram to be tested. For the approximation of Equation (23), at some critical value ($P_c$) of the confidence level probability the asymptotic value of z to be exceeded at that level is (see Equation (19)):

$$z_c = -2 \ln[1 - (P_c)^{1/n}]. \tag{24}$$

However, Whittle [1952] has proposed that the quantity $g_2$ could be used to continue the significance test to lesser amplitude peaks than the highest:

$$g_2 = \frac{I_{p=2}}{N^{-1} \left[\left(\sum_{p=1} I_p\right) - I_{p,max}\right]}, \tag{25}$$
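The phase shift of Equation (12), the periodogram of Equations (13) and (14a), and the significance probability of Equation (18b) can be combined into a short numerical sketch. The code below assumes NumPy; the function names (`lomb_tau`, `periodogram`, `schuster_p_value`), the synthetic signal, and the choice of treating the noise variance as known are all illustrative assumptions, not the procedure used in the paper. Note in particular that for irregularly sampled data the trial frequencies are not independent, so using their count in Equation (18b) is only a placeholder for the number of degrees of freedom discussed in the text.

```python
import numpy as np

def lomb_tau(t, w, omega):
    """Phase shift of Equation (12), written with omega = 2*pi*k/T.

    arctan2 is used in place of a plain arctangent; either branch makes the
    weighted sin*cos cross term of Equation (11) vanish.
    """
    num = np.sum(w * np.sin(2.0 * omega * t))
    den = np.sum(w * np.cos(2.0 * omega * t))
    return np.arctan2(num, den) / (2.0 * omega)

def periodogram(t, y, omega, w=None):
    """I(omega) = A^2 + B^2 as in Equation (14a), with the shift tau applied."""
    w = np.ones_like(t) if w is None else w
    arg = omega * (t - lomb_tau(t, w, omega))
    c, s = np.cos(arg), np.sin(arg)
    a2_term = np.sum(w * y * c) ** 2 / np.sum(w * c * c)
    b2_term = np.sum(w * y * s) ** 2 / np.sum(w * s * s)
    return a2_term + b2_term

def schuster_p_value(gamma, n_freq):
    """p(gamma > z) of Equation (18b) for n_freq independent ordinates."""
    return 1.0 - (1.0 - np.exp(-gamma / 2.0)) ** n_freq

# Hypothetical example: a 0.1 Hz sinusoid in unit-variance Gaussian noise,
# sampled at 200 irregular times over 100 s.
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 100.0, 200))
y = np.sin(2.0 * np.pi * 0.1 * t) + rng.normal(0.0, 1.0, t.size)
y = y - y.mean()                       # remove the zero-frequency term
sigma2 = 1.0                           # noise variance, taken as known here

freqs = np.arange(1, 101) / 100.0      # k/T for k = 1..N/2 with T = 100 s
power = np.array([periodogram(t, y, 2.0 * np.pi * f) for f in freqs])
gamma = power.max() / sigma2           # statistic of Equation (16a)
print("peak frequency:", freqs[power.argmax()])
print("gamma:", gamma, "p-value:", schuster_p_value(gamma, freqs.size))
```

For evenly spaced data at a natural frequency, `periodogram` reduces to $N 2^{-1} c^2(\omega)$ as in Equation (14b), which gives a quick consistency check on the implementation.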
trates the Schuster $\chi_2^2$ distribution significance test results and the conventional Lomb-Scargle test [Press et al., 1986], utilizing the
Figure 1. Noisy and unevenly spaced experimental data chosen for examination of hidden periodicities. (Horizontal axis: Time (seconds).)

$$dF = [\sigma (2\pi)^{1/2}]^{-1} \exp[-x^2 (2\sigma^2)^{-1}] \, dx$$

$$= (N^{-1} 4\pi \sigma_Y^2)^{-1} \exp\{-[a_k^2 + b_k^2] (4\sigma_Y^2 / N)^{-1}\} \times da_k \, db_k. \tag{28a}$$

From the definitions given in Equations (14b) and (21) one can recognize, using the different estimates of $\sigma_Y^2$, that the argument of the exponential is:
[Figure: periodogram panels labeled "Schuster" and "Lomb-Scargle (corrected)", each with the 0.95 significance level marked.]

$$y_c = -\ln(1 - P_c^{2/N}), \tag{29}$$

which is to be compared with $\gamma_c$ of Equation (19) and $z_c$ of Equation (24) and is the result given by Press et al. [1986] for the conventional Lomb-Scargle statistical significance test. Since the statistic in Equation (28c) and the test of Equation (29) are half the value of the relevant statistic and test in Equations (15b) and (24), the resultant significance tests are seen to be the same as the other two tests, albeit with a scale change. The results of the significance test just described are shown in Figure 4, indicating the (corrected) Lomb-Scargle statistic calculated using the expression of Equation (28c).
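The factor-of-two relation between the critical values of Equations (24) and (29) is easy to check numerically. The following minimal sketch uses only the Python standard library; the function names and the choice N = 128, n = N/2 are assumptions made for illustration.

```python
import math

def schuster_critical(p_c, n_freq):
    """z_c of Equation (24): -2 ln[1 - P_c**(1/n)]."""
    return -2.0 * math.log(1.0 - p_c ** (1.0 / n_freq))

def lomb_scargle_critical(p_c, n_points):
    """y_c of Equation (29): -ln[1 - P_c**(2/N)]."""
    return -math.log(1.0 - p_c ** (2.0 / n_points))

n_points = 128                 # hypothetical number of data points N
n_freq = n_points // 2         # number of natural frequencies, n = N/2
p_c = 0.95                     # confidence level

z_c = schuster_critical(p_c, n_freq)
y_c = lomb_scargle_critical(p_c, n_points)
print("z_c =", z_c, " y_c =", y_c, " ratio =", z_c / y_c)

# Substituting z_c back into Equation (23) recovers 1 - P_c.
print("p[g* > z_c] =", 1.0 - (1.0 - math.exp(-z_c / 2.0)) ** n_freq)
```

When n = N/2 the two thresholds differ by exactly the scale factor of two described above, so either form leads to the same decision once the statistic is scaled consistently.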
true variance. Thus, in particular for small samples, it becomes necessary to obtain an estimate of the number of the sample's degrees of freedom in order to have a proper estimate of the true
becomes spread over frequency, and less noticeable, by the convolution operation. Details on tapering are found in most statistical references [Blackman and Tukey, 1959; Priestley, 1981; Percival and Walden, 1993], where they are discussed in terms of the (original function) Fejer kernel, its manipulation, and the resultant effects in the analysis. Here we will examine the effects that the tapering process has on our data sample.

As expected, there exist a large number of window functions and figures of merit associated with them [see Blackman and
Figure 9. Hyper-Gauss window and the original time series data of Figure 1 after multiplication by this window. (Horizontal axis: Time (seconds).)

Figure 11. Periodogram using the number of natural frequencies allowed by the available degrees of freedom with a hyper-Gauss window. As in Figure 7, the fundamental frequency has been redefined by cropping the data length. (Horizontal axis: Frequency (Hertz); the Fisher 0.95 level is marked.)
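Tapering, as discussed above, amounts to multiplying the mean-subtracted series by a window that falls smoothly to zero at the ends of the record. The sketch below assumes NumPy and uses the common parabolic form of the Welch window evaluated at the sample times; this particular formula, the function names, and the synthetic data are assumptions for illustration and may differ from the window definitions used in the paper.

```python
import numpy as np

def welch_window(t):
    """Parabolic (Welch) taper evaluated at sorted sample times t.

    The window is 1 at the record midpoint and 0 at both ends.
    """
    mid = 0.5 * (t[0] + t[-1])
    half = 0.5 * (t[-1] - t[0])
    return 1.0 - ((t - mid) / half) ** 2

def taper(t, y):
    """Subtract the mean, then multiply by the window."""
    return (y - np.mean(y)) * welch_window(t)

# Hypothetical irregularly sampled record over 250 s.
rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0.0, 250.0, 120))
y = 10.0 * np.sin(2.0 * np.pi * 0.05 * t) + rng.normal(0.0, 5.0, t.size)

y_tapered = taper(t, y)
print("end values after tapering:", y_tapered[0], y_tapered[-1])
```

Because the window reduces the amplitude away from the midpoint, the total power of the tapered series is lower than that of the original, which has to be kept in mind when a variance estimate is used in a significance test.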
Figure 10. Periodogram using the number of natural frequencies allowed by the available degrees of freedom with a Welch window. Note, in particular, the relative enhancement of the power at the lower frequency, when compared to Figure 7. Here, as in Figure 7, the fundamental frequency has been redefined by cropping the data length.

duct of the original spectrum and the filter transfer function or by only calculating the spectrum for the region of interest, which requires calculating the full power spectrum anyway, as given in Equation (15b). The results obtained by either of these methods are indistinguishable from each other.

Filtering is a concept which is more applicable to the scenario in which data are being obtained continuously and the signal-to-noise ratio is continuously increasing. After some time has elapsed, an arbitrary coherent frequency present in the informa-
tion stream reaches a desired statistically significant level. This time is practically provided by the 'time constant' of the filter and/or the coherent integration of the signal. However, when one has a fixed total measurement interval, the information is frozen within the data and no further knowledge is available (unless new information is provided, usually in the form of assumptions, etc.).

Although de-trending data as part of statistical analysis is not normally considered a filtering operation, its effects on the results can be substantial and very similar in character to those occurring in filtering. A trend is normally described as an amplitude variation in the time series which has a periodicity much longer than the length of the available data, and de-trending consists of arbitrarily removing this signal [Blackman and Tukey, 1959]. Therefore de-trending becomes a specialized method of filtering, with all its associated pitfalls.

In principle, it is possible to separate a 'trend' from the underlying information as long as the form of the trend is known. There arises the immediate question as to how a statistical test of significance performed on de-trended data relates to the original data. Since the power associated with the trend has been removed, one is no longer measuring the significance of a given feature relative to the full spectrum. Thus the results no longer refer to the original data and are likely to be invalid when applied to it. In addition, the de-trending process may substantially change the serial coherence of the signal and the associated

Figure 12. Periodograms of the data of Figure 1, before and after filtering with a square filter of 0.025 Hz bandwidth centered at 0.26 Hz. Note that the information available on the feature near 0.26 Hz has not changed after the arbitrary filtering operation. The 0.95 significance level for the upper periodogram for the natural frequencies, presuming all data points are independent, is also shown. (Horizontal axis: Frequency (Hertz); the Fisher 0.95 level is marked.)

$$f'(x) \supset i 2\pi s F(s), \tag{35}$$

where $f'(x)$ denotes the first derivative of $f(x)$, $i = (-1)^{1/2}$ and s is the Fourier plane abscissa. The latter is identified with frequency when x is in time units.

Equation (35) clearly shows that a differencing operation depresses the amplitude of the low-frequency components and increases the amplitude of the high-frequency components of the original signal's spectrum. This operation conforms to the general definition of filtering as an enhancement of a region of a spectrum relative to the rest of the spectrum. The upper panel of Figure 13 shows the periodogram of a raw synthetic series. Although the raw synthetic time series consists of six equal-amplitude oscillations (see Figure A2(a)), the periodogram shows two of these frequencies as missing because they are outside the Nyquist limit determined from the serial coherence of the data (see Figure A2(b)), and the surviving four frequencies as having unequal power. This latter effect is caused by the mismatch between the frequencies of the oscillations and the natural frequencies of the periodogram. The lower panel of Figure 13 shows the periodogram of the differenced synthetic time series. As expected from the derivation of Equation (35), the low frequencies of the differenced periodogram have been depressed while its high frequencies have been enhanced. At the same time the number of degrees of freedom of the 'differenced' series has increased, as can be seen by the increased frequency range of the lower panel periodogram in Figure 13, and the appearance of the sixth frequency with exaggerated power and enhanced noise.

The results shown in Figure 13 clearly show the changes introduced by the difference procedure employed in de-trending. Three of the frequencies known to be statistically significant in the original data have disappeared, while two new frequencies not supported by the serial coherence of the original time series have appeared because of the increased degrees of freedom created by the differencing operation. Clearly, the lower panel periodogram in Figure 13 bears little similarity to the original raw synthetic data periodogram and cannot be described as representing the spectrum of the original time series. Thus, the statistical results derived from the examination of a differenced signal no longer refer to the original signal and therefore are not relevant to it.
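The high-pass character of differencing implied by Equation (35) can be seen with a small numerical experiment. The sketch below assumes NumPy, unit sample spacing, and a circular first difference (via `np.roll`) so that the result is exact for a signal that is periodic over the record; these choices, the two test frequencies, and the helper name `amplitude` are illustrative assumptions. For a discrete first difference the amplitude gain at frequency s is $2|\sin(\pi s)|$, which approaches the $2\pi s$ factor of Equation (35) at low frequency.

```python
import numpy as np

n = 256
t = np.arange(n)                         # unit sample spacing (assumption)
k_low, k_high = 8, 96                    # two natural-frequency indices
y = (np.cos(2.0 * np.pi * k_low / n * t)
     + np.cos(2.0 * np.pi * k_high / n * t))   # equal-amplitude oscillations

def amplitude(x, k):
    """Amplitude of the k-th natural frequency from the real FFT."""
    return 2.0 * np.abs(np.fft.rfft(x)[k]) / len(x)

# Circular first difference: dy[j] = y[j] - y[j-1], wrapping at the ends.
dy = y - np.roll(y, 1)

for k in (k_low, k_high):
    s = k / n
    print("s =", s,
          "before:", amplitude(y, k),
          "after:", amplitude(dy, k),
          "2|sin(pi s)|:", 2.0 * abs(np.sin(np.pi * s)),
          "2 pi s:", 2.0 * np.pi * s)
```

Two oscillations that enter with equal amplitude leave the differencing step with very different amplitudes, so a significance test applied to the differenced series is weighted toward the high-frequency end of the spectrum.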
Figure A3. Six-frequency synthetic series, where the first five oscillations have equal (12 units) amplitude while the 2.2-s oscillation now has higher amplitude (28 units), as denoted. Gaussian noise with a zero mean and a standard deviation of 15 units has (Horizontal axis: Time (seconds).)

Figure A4. Periodogram of the time series of Figure A3. Note the increase in the number of degrees of freedom brought by the increase in amplitude of the highest frequency oscillation. The Fisher test is also shown. (Horizontal axis: Frequency (Hertz).)

Figure A6. Periodogram of the time series of Figure A5. Note the modulation effect of the data gaps. The Fisher significance test is also shown. (Horizontal axis: Frequency (Hertz).)

References