
Scientific Research in Geography: Project Alpine glaciers and their future. Jan-Feb. 2007

WEEK 3: Circular statistics & graphs. Ian Evans, 5 February 2007

Note that you can copy and paste from this file on DUO, rather than retyping the Stata commands.

[cf. last week: the option c means connect, and ms means marker symbol. If there are several variables, the letters in brackets following c or ms can be multiple, in the same order as the variable list; i means invisible and l means connect with a line. titlegap and ,ang(h) are cosmetic improvements in graph appearance. lfit is a variant of regress. anova is, for qualitative controls, comparable to regress for quantitative controls (either can be multiple, and the two can be combined: see below). In the significance tests, the higher t and F are, the more significant the relationship. The p columns give the bottom line: the chance that a t or F at least this large could be obtained from samples of this size if the null hypothesis of no relationship were true. The smaller p is, the more significant our result.]

Copy the data file to your space on j: and open Stata (see NJC term 1 handouts).

Further information on these methods can be found on DUO in the file CircularstatStataNJC.doc, or type

    whelp circular

The most useful routines for this project (you might want to try others too) are circsummarize (circsu), circlccorr, circvplot, circdplot and fourier.

    cd j:
    log using myalps2.log
    u alps05lean
    circsu aspect ablationAspect, det

These variables are very similar; there is slightly more scatter (lower strength) for ablationAspect, as it is influenced by pre-existing valleys. Note that if you expand the variables panel, you get a fuller description of each variable. This description is the one used to label the x and y axes of graphs, unless you specify xti and/or yti.

Or, for the Southern & Maritime Alps:

    circsu aspect ablationAspect if district==1, det

Note the consistency (northward), but also the wider confidence interval for smaller data sets, i.e. the estimate of the mean is less precise.
95% confidence intervals are useful because, with repeated sampling, intervals constructed in this way would contain the true mean direction 95% of the time. The net northward tendency in glacier aspect is moderate for the Alps as a whole, but stronger for the southern district of small glaciers starting only a little above ELA. The distribution of glacier numbers over aspect is significantly non-uniform by both tests (Rayleigh and Kuiper: the latter considers any deviation from uniformity, while Rayleigh's test considers unimodal deviation).
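To see what a vector mean and its strength measure, the arithmetic can be sketched outside Stata. The Python below is only an illustration with made-up aspects, not the Alps data and not the circsu code itself: each aspect is treated as a unit vector, the mean direction comes from the averaged cosines and sines, and the strength is the length of that averaged vector (0 = no preferred direction, 1 = all identical). The Rayleigh statistic n × strength², with the rough large-sample p-value exp(-z), is included as an assumption about the usual textbook form of the test.

```python
import math

def circular_summary(degrees):
    """Return (mean direction in degrees, vector strength, Rayleigh z)."""
    n = len(degrees)
    # Average the unit vectors pointing in each direction.
    c = sum(math.cos(math.radians(a)) for a in degrees) / n
    s = sum(math.sin(math.radians(a)) for a in degrees) / n
    strength = math.hypot(c, s)                      # mean resultant length, 0..1
    mean_dir = math.degrees(math.atan2(s, c)) % 360  # undefined if strength is 0
    rayleigh_z = n * strength ** 2                   # large z = unimodal clustering
    return mean_dir, strength, rayleigh_z

# Made-up aspects clustered around north (illustrative only).
aspects = [350, 10, 20, 340, 0, 30, 330, 15]

mean_dir, strength, z = circular_summary(aspects)
p_approx = math.exp(-z)  # first-order approximation, adequate only for larger n

print("arithmetic mean (misleading):", sum(aspects) / len(aspects))
print("vector mean direction:", round(mean_dir, 1))
print("vector strength:", round(strength, 3))
print("Rayleigh z and approximate p:", round(z, 2), p_approx)
```

The ordinary arithmetic mean of these values points roughly south-east, although every aspect is within 30 degrees of north; this is the 0=360 problem that makes circular methods necessary.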

    circsu aspect , det by(cl1)

Note that (ice)field and outlet (glacier) are not sufficiently numerous to provide significant results.

    circsu aspect ,det by(major)

The differences between major divisions are subtle; the 95% confidence intervals on vector means overlap.

Often we need to relate a linear variable to a circular one: DO NOT use corr, which is for linear variables only. Note that in the command circlccorr (circular-linear correlation), the linear variable precedes the circular:

    circlccorr midalt aspect

The relation is highly significant but not strong.

Useful graphs (dot and vector plots) are provided by:

    circdplt aspect
    circdplt aspect, uti(Alps; all glaciers)
    circvplt aspect, uti(Alps; all glaciers)

In case you mislaid the key to district numbers, tabulate them both ways:

    tab district
    tab district , nolabel
    circvplt aspect if district==18, uti(Adamello glaciers)
    circvplt aspect if district==14, uti(Glarus glaciers)

You might check that the number of glaciers is correct:

    circsu aspect if district==14

Or to remind yourself of type numbers:

    tab type
    tab type, nolabel
    circvplt aspect if type==0, uti(normal glaciers in the Alps)

Regressions have to be on cosines and sines (because of the 0=360 situation): these are provided by fourier, e.g.:

    fourier ablationAspect

The Alps file already contains cosine and sine for (accumulation area) aspect, so:

    reg length cosaspect sinaspect

The regression is just significant at the 0.05 level, because of the relation to sine(aspect): t is higher and p lower for sine than for cosine. But the relation is very weak (R² is negligible).

    reg midalt cosaspect sinaspect

For mid-altitude, however, both terms are highly significant and account for 10% of variation. The north-south contrast (expressed in the cosine coefficient) is considerably stronger than the east-west (sine).

If you want to produce a graph for a sub-area, first regress, and save the prediction as a new variable (e.g. ABpred) so it can be plotted. Combining districts 18 and 19:

    reg midalt cosaspect sinaspect if district==18|district==19
    predict ABpred
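As a rough illustration of what a regression on cosine and sine is doing, the Python sketch below fits y = b0 + b1 cos(aspect) + b2 sin(aspect) by ordinary least squares. The data are made up (not the Alps file), and the helper names solve and fit_cos_sin are hypothetical, written only for this example. The cosine coefficient carries the north-south contrast and the sine coefficient the east-west contrast; together they give the amplitude of the variation with aspect and the aspect at which the fitted value is highest.

```python
import math

def solve(a, b):
    """Solve a small linear system a x = b (Gauss-Jordan, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_cos_sin(y, aspect_deg):
    """Least-squares fit of y = b0 + b1*cos(aspect) + b2*sin(aspect)."""
    rows = [[1.0, math.cos(math.radians(a)), math.sin(math.radians(a))]
            for a in aspect_deg]
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(3)]
    return solve(xtx, xty)

# Made-up altitudes: lower facing north (0 deg), higher facing south (180 deg).
aspects = [0, 45, 90, 135, 180, 225, 270, 315]
midalt = [2700 - 150 * math.cos(math.radians(a)) + 40 * math.sin(math.radians(a))
          for a in aspects]

b0, b1, b2 = fit_cos_sin(midalt, aspects)
amplitude = math.hypot(b1, b2)                        # half the aspect-related range
highest_at = math.degrees(math.atan2(b2, b1)) % 360   # aspect of highest fitted value

print("constant, cosine, sine coefficients:", b0, b1, b2)
print("amplitude (m):", amplitude)
print("aspect of highest fitted altitude (deg):", highest_at)
```

Because cos and sin take the same values at 0 and 360, the fitted curve joins up at the edges of the graph, which a regression on aspect in degrees would not do.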

To produce the graph, repeating values for 0 at 360 (the edges of a rectangular graph, but imagine this being wrapped around a cylinder):

    circscatter midalt aspect if district==18|district==19, xsca(titlegap(2)) ysca(titlegap(2)) xla(0(90)360) yla(2000(400)3200 ,ang(h)) msy(+) sort legend(off) yti(Glacier Mid-altitude (m)) xti(Accumulation area aspect ({c 176})) ti(Adamello & Bernina districts) plot(mspline ABpred aspect if district==18|district==19, clpattern(solid)) xcirc pad(10)

mspline fits a smooth curve (rather than a series of straight lines). clpattern varies the pattern of the line drawn. pad permits the duplication of 0 degree values at 360 degrees.

Note that further variables can be considered simultaneously, in a multiple regression equation predicting midalt:

    reg midalt cosaspect sinaspect logarea logheightdi

Both of these variables improve the prediction significantly.

If you are happy to go on increasing model complexity, you can combine qualitative variables such as type or district with these quantitative controls, in an analysis of variance, so long as you state which variables are quantitative (continuous):

    anova midalt type cosaspect sinaspect logarea logheightdi, cont(cosaspect sinaspect logarea logheightdi)

As logarea is insignificant (p>.05) in that particular combination, we should drop it (from BOTH lists) and re-run:

    anova midalt type cosaspect sinaspect logheightdi, cont(cosaspect sinaspect logheightdi)

Again, these commands are useful with the Alps data, but these examples just illustrate some of the ways in which you can use them. You can learn more about unfamiliar commands, such as anova, by typing whelp followed by the command name.

In the final week, week 4, we will be in the GRC computer room to answer your queries and help you complete the computation for whatever version of the project you have chosen, but no further new methods will be introduced. Go back to the main handout and see if you can think of an interesting variation.
I hope you can go beyond just temperature change; remember that the several effects as listed by Oerlemans are treated as additive. Turn your .log file into your report, in WORD or an editor, or copy and paste from it.

Report: text up to 5 pages, plus 5 to 15 figures. Due at the Dept. Office, 4 pm, Thursday 22 February.
