
Original paper

Data preparation
Data processing and visualization

Westward shift of western North Pacific tropical cyclogenesis
Wu et al. (2015), Geophys. Res. Lett.
doi:10.1002/2015GL063450

Jonathon S. Wright

jswright@tsinghua.edu.cn

21 March, 2017

Original paper
Introduction
Results
Discussion

Data preparation
The netCDF4 module
The datetime module
Pre-processing

Data processing and visualization
Mapping: the cartopy module
Linear regression and correlations
More contour plots and masked arrays

Motivation
- Tropical cyclones (TCs) in the western North Pacific (WNP) account for about one third of all TCs
- Changes in TC genesis location could affect billions of people

Previous studies indicate that TC genesis location may change:

- Poleward movement of the mean location of TC maximum intensity
- Large-scale changes in vertical wind shear and potential intensity
- Changes in the distribution of mid-tropospheric relative humidity (RH)
- More synoptic-scale disturbances in the central Pacific

The tropical upper tropospheric trough (TUTT)

- Extends from 15°N in the WNP to 35°N in the eastern North Pacific (ENP)
- Apparent in the 200 hPa wind field during boreal summer
- Strong vertical wind shear along the eastern flank limits the eastward extension of TC activity in the WNP

Data and methodology


- Tropical cyclone best track data: JTWC and ADT-HURSAT
- Reanalysis data: 20CR, NCEP-NCAR, ERA-Interim, JRA-25, MERRA, NCEP-DOE, CFSR
- Linear trends and correlations, with significance testing

Figure 1. Monthly mean 200 hPa wind field (vectors, m s⁻¹) and vertical shear
of zonal wind between 850 hPa and 200 hPa (shaded, m s⁻¹) during
June–November. Blue dots indicate TC genesis locations and thick green lines
show the TUTT trough line.

from Wu et al., 2015



Figure 2. Time series of (a) annual mean TC genesis longitude (blue) from the
JTWC dataset and annual mean TUTT longitude (red), and (b) annual mean
TC longitude from the ADT-HURSAT data set with (blue) and without (red)
the ENSO effect.

from Wu et al., 2015



Figure 3. Annual means of (a) TC formation frequency in the western and
eastern portions of the WNP basin, and (b) the difference of the tropospheric
temperature (red, K) between the tropics and the subtropics and the vertical
shear of the zonal wind (m s⁻¹) in the TUTT region.

from Wu et al., 2015



Figure 4. (a) July–November mean zonal wind speed (contour, m s⁻¹) and the
associated trends (shaded, m s⁻¹ decade⁻¹) averaged over 5°N–25°N during
1979–2012 and (b) July–November mean temperature (contour, K) and the
associated trend (shaded, K decade⁻¹) averaged over 145°E–170°W.

from Wu et al., 2015



Much of the data we use is in NetCDF format, which we can read (and write) using the netCDF4 module:

from netCDF4 import Dataset

# read in IBTrACS tropical cyclone best track data
ncdf = Dataset('../data/Basin.WP.ibtracs_all.v03r08.nc')
 
The variables are stored in the variables attribute of the Dataset object, which behaves like a dictionary:
In[10]: ncdf.variables.keys()
Out[10]:
[u'storm_sn',
 u'name',
 u'numObs',
 u'season',
 u'track_type',
 u'genesis_basin',
 u'num_basins',
 ...]

We need to read in several variables from the IBTrACS dataset:

from netCDF4 import Dataset

# read in IBTrACS tropical cyclone best track data
ncdf = Dataset('../data/Basin.WP.ibtracs_all.v03r08.nc')
year = ncdf.variables['season'][:]             # year
genb = ncdf.variables['genesis_basin'][:]      # 2 = Western North Pacific
tc_v = ncdf.variables['source_wind'][:,:,10]   # wind speed, in knots
tc_y = ncdf.variables['source_lat'][:,:,10]    # latitude of storm center
tc_x = ncdf.variables['source_lon'][:,:,10]    # longitude of storm center
tc_t = ncdf.variables['source_time'][:,:]      # modified Julian day
ttyp = ncdf.variables['track_type'][:]         # 0,1 include cyclogenesis
ncdf.close()
 
The index 10 refers to data from the Joint Typhoon Warning
Center (JTWC) in the Western Pacific, as we can see (while the file is still open) from
In[11]: ''.join(ncdf.variables['source'][10])
Out[11]: 'jtwc_wp'

IBTrACS stores dates as modified Julian days (number of days since 17 November 1858). We want storms during specific months (June–November). We can find the month of a given date using datetime:
 
import datetime

# function to return month given modified Julian day
def mjd2month(mjd, y0=1858, m0=11, d0=17):
    """
    A function for finding the month of a date expressed in
    days relative to some baseline date

    Variables:
        mjd :: relative date (scalar)
               (note: arrays will not work for mjd under this approach)

    Parameters:
        y0 :: year of baseline date
        m0 :: month of baseline date
        d0 :: day of baseline date

    Note:
        The default date used in this function corresponds to
        the modified Julian date used by the IBTrACS data set.
    """
    date0 = datetime.datetime(year=y0, month=m0, day=d0)
    # float() so that numpy scalar types are accepted by timedelta
    date1 = date0 + datetime.timedelta(days=float(mjd))
    return date1.month
 

This example function is easy to use:


In[12]: print(mjd2month(56628.75))
12

but it is also somewhat limited. In particular, it cannot handle arrays of dates (we will learn about methods for dealing with arrays of dates later). However, the datetime module is often useful when dealing with relative times like modified Julian days, or when we need to output dates in a particular format (see the datetime.strftime() method).
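As a preview of the array case, one possible approach uses numpy's datetime64 type. This is only a sketch: the function name mjd2month_array is my own, and it assumes the input contains no missing values.

```python
import numpy as np

# Baseline of the modified Julian day count (midnight, 17 November 1858)
MJD_EPOCH = np.datetime64('1858-11-17T00:00:00')

def mjd2month_array(mjd):
    """Return calendar months (1-12) for an array of modified Julian days."""
    mjd = np.asarray(mjd, dtype=float)
    # convert fractional days to whole seconds so the time of day is kept
    seconds = np.round(mjd * 86400.0).astype('int64')
    dates = MJD_EPOCH + seconds.astype('timedelta64[s]')
    # datetime64[M] counts months since 1970-01, so modulo 12 gives the month
    return dates.astype('datetime64[M]').astype('int64') % 12 + 1

months = mjd2month_array([0.0, 56628.75])
```

The first value is the baseline date itself (November 1858) and the second is the same date used in the scalar example.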

We then need to find the genesis locations, which in this case are
defined as the storm position at the first time that the maximum wind speed
reaches 25 knots (about 13 m s⁻¹):
 
import numpy as np

# loop through storms and find genesis locations
cgx = []; cgy = []; cgm = []; cgt = []
# start with all tracks that include cyclogenesis in the WP
sdx = np.where(((ttyp == 0) | (ttyp == 1)) & (genb == 2))[0]
# note that loop is over indices, not over a range!
for ss in sdx:
    # check to see if the wind ever reaches 25 knots
    if np.any(tc_v[ss,:] >= 25):
        cg = np.where(tc_v[ss,:] >= 25)[0][0]
        # append lat/lon/month/year of cyclogenesis
        cgy.append(tc_y[ss,cg])
        cgx.append(tc_x[ss,cg])
        cgm.append(mjd2month(tc_t[ss,cg]))
        cgt.append(year[ss])  # season (year) of the storm, used later as cgt
# convert to arrays
cgy = np.array(cgy)
cgx = np.array(cgx)
cgm = np.array(cgm)
cgt = np.array(cgt)
# longitude in IBTrACS is (-180,180); convert to (0,360)
cgx[cgx < 0] += 360
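The search for the first time step at or above the threshold can also be done without a loop. The sketch below is an illustration on a small synthetic array, not the code used for the figures; the function name first_exceedance and the use of -1 to flag storms that never reach the threshold are my own choices.

```python
import numpy as np

def first_exceedance(wind, threshold=25.0):
    """Index of the first time step with wind >= threshold for each storm.

    wind is a (storm, time) array, possibly masked; storms that never
    reach the threshold are flagged with -1.
    """
    wind = np.ma.asarray(wind)
    # masked (missing) observations are treated as not reaching the threshold
    hit = np.ma.filled(wind >= threshold, False)
    found = hit.any(axis=1)
    # argmax returns the position of the first True along the time axis
    return np.where(found, hit.argmax(axis=1), -1)

# synthetic winds in knots; values above 900 stand in for missing data
raw = np.array([[10., 20., 30., 40.],
                [ 5., 10., 15., 20.],
                [25., 10., 30.,  5.],
                [999., 999., 26., 30.]])
genesis_index = first_exceedance(np.ma.masked_greater(raw, 900))
```

The explicit loop is easier to read and to extend, which is why it is used in the main example.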
 

We also need some reanalysis data for the winds and temperatures.
Here, we will use the JRA-55 reanalysis, which was not used in the
original paper. The data files are quite large, and it is therefore
convenient to preprocess the files to make them manageable. To
do this, I use the Climate Data Operators (CDO) utilities
developed at the Max-Planck-Institut für Meteorologie. First, I
select the temperature and wind data for June through November
using the selmon (select month) command:
cdo selmon,6,7,8,9,10,11 jra55nl_tmp_monthly_1958-2015.nc4 tmp_jjason.nc4
cdo selmon,6,7,8,9,10,11 jra55nl_ugrd_monthly_1958-2015.nc4 uwd_jjason.nc4
cdo selmon,6,7,8,9,10,11 jra55nl_vgrd_monthly_1958-2015.nc4 vwd_jjason.nc4

CDO can be installed by downloading and building the source from the project website, but on Mac OS X it is easier to use MacPorts or Homebrew. CDO has excellent documentation.

To produce a JRA-55 version of Fig. 1 from the paper, we need to do a bit more preprocessing using CDO. Specifically, we need to calculate monthly climatologies of zonal and meridional winds over 1958–2015 using the ymonmean command, which calculates monthly means spanning multiple years (i.e., an annual cycle):
cdo ymonmean uwd_jjason.nc4 uwd_jjason_mm.nc4
cdo ymonmean vwd_jjason.nc4 vwd_jjason_mm.nc4

and then select the 200 and 850 hPa levels, removing the rest of
the vertical profile (recall that we only need 850 hPa zonal wind to
calculate vertical shear, and do not need 850 hPa meridional wind):
cdo sellevel,85000,20000 uwd_jjason_mm.nc4 fig1_uwd.nc4
cdo sellevel,20000 vwd_jjason_mm.nc4 fig1_vwd.nc4

We can then read in the data and calculate the zonal wind shear:
 
from netCDF4 import Dataset

# ddir is the path to the directory holding the preprocessed files
# read in JRA55 wind data
ncdf = Dataset(ddir + 'fig1_uwd.nc4')
jlon = ncdf.variables['lon'][:]
jlat = ncdf.variables['lat'][:]
u850 = ncdf.variables['ugrd'][:,0,:,:]
u200 = ncdf.variables['ugrd'][:,1,:,:]
ncdf.close()
ncdf = Dataset(ddir + 'fig1_vwd.nc4')
v200 = ncdf.variables['vgrd'][:,0,:,:]
ncdf.close()
# calculate vertical wind shear
ushr = u850 - u200
 

We can plot a map using cartopy:


 
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeat
from cartopy.mpl.ticker import LongitudeFormatter, LatitudeFormatter

# clv (contour levels) and dns (streamline density) are set elsewhere

# formatting for tick marks
xfr = LongitudeFormatter(zero_direction_label=True)
yfr = LatitudeFormatter()

fig = plt.figure(figsize=(13, 4))
axs = fig.add_subplot(111, projection=ccrs.PlateCarree(central_longitude=180))
prj = ccrs.PlateCarree(central_longitude=0)
# zonal wind shear
cs0 = axs.contourf(jlon, jlat, ushr[0,:,:], clv, cmap=plt.cm.PuOr,
                   extend='both', transform=prj)
# streamplot to show winds
axs.streamplot(jlon, jlat, u200[0,:,:], v200[0,:,:], color='k',
               density=dns, transform=prj)
# scatter plot of TC genesis locations for June
idx = np.where(cgm == 6)
axs.scatter(cgx[idx], cgy[idx], marker='o', c='#377eb8', s=50, zorder=10,
            transform=prj)
axs.set_extent([100, 240, 0, 40], prj)
axs.set_xticks(range(100, 241, 20), crs=prj)
axs.set_yticks(range(0, 41, 10), crs=prj)
axs.xaxis.set_major_formatter(xfr)
axs.yaxis.set_major_formatter(yfr)
axs.add_feature(cfeat.COASTLINE, edgecolor='#333333')
 

The result is effectively the same as what is shown in the paper, although our version is missing the location of the TUTT. The cartopy module has been developed by the UK Met Office.
[Figure: map for June covering 0–40°N, 100°E–120°W, showing zonal wind shear (shaded), 200 hPa streamlines, and TC genesis locations]

The cartopy interface can sometimes be confusing, but it has some nice features that make it (in most cases) preferable to the alternative basemap toolkit. We have also used the streamplot function from matplotlib to represent the 200 hPa wind speeds and directions, and the contourf function from matplotlib to show the wind shear.

Adding the location of the TUTT is more challenging, as the paper does not describe this procedure clearly. The authors mention that the TUTT is defined as the location where easterly winds transition to westerly; here, I have extrapolated from the idea that the transition from easterly to westerly indicates mass divergence to define the TUTT location at each latitude as the longitude of the maximum (positive) zonal gradient in zonal wind.
 
import numpy as np
# for smoothing the visualization of the TUTT
from statsmodels.nonparametric.smoothers_lowess import lowess

# longitudes to search (120°E to 120°W)
xdx = np.where((jlon >= 120) & (jlon <= 240))[0]
# TUTT
ttt = np.empty(20)
for yy in range(45,65):
    dux = np.gradient(np.squeeze(u200[5,yy,xdx]))
    ttt[yy-45] = jlon[xdx[dux.argmax()]]
tut = lowess(ttt, jlat[45:65], frac=0.75, return_sorted=False)
axs.plot(tut, jlat[45:65], color='#999999', linewidth=3, transform=prj)
 
I have also used a local regression (lowess) filter from the
statsmodels module to smooth the curve.
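To check that the maximum-gradient definition behaves as intended, it can help to test it on an idealized wind field where the answer is known. The sketch below does this with a synthetic tanh-shaped transition from easterlies to westerlies; the function name max_gradient_longitude and the synthetic field are assumptions made for illustration, and no smoothing is applied.

```python
import numpy as np

def max_gradient_longitude(u, lon):
    """Longitude of the maximum zonal gradient of u at each latitude.

    u has shape (nlat, nlon) and lon has shape (nlon,).
    """
    dudx = np.gradient(u, lon, axis=1)
    return lon[dudx.argmax(axis=1)]

# synthetic 200 hPa zonal wind: easterly to the west, westerly to the east,
# with the transition centred at a different longitude on each latitude row
lon = np.arange(120.0, 241.0, 1.25)
centers = np.array([150.0, 160.0, 170.0])
u = 10.0 * np.tanh((lon[None, :] - centers[:, None]) / 10.0)

tutt_lon = max_gradient_longitude(u, lon)
```

For this idealized field the recovered longitudes match the prescribed transition centers; real reanalysis winds are much noisier, which is why the lowess smoothing is useful.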

[Figure: six-panel version of the map, one panel each for June, July, August, September, October, and November (0–40°N, 100°E–120°W)]

We need to do a bit more data processing to get the mean longitude of cyclogenesis and the location of the TUTT between 5°N and 25°N for each year:
 
# get mean longitude of cyclogenesis for each year
# (cgt holds the year of each cyclogenesis event, matching cgx and cgm)
yrs = np.arange(1958,2015)
mln = np.full(yrs.shape, np.nan)   # NaN where a year has no storms
for ii in range(len(yrs)):
    idx = np.where((cgt == yrs[ii]) & (cgm >= 6) & (cgm <= 11))[0]
    if idx.size > 0:
        mln[ii] = cgx[idx].mean()

# get location of TUTT (transition from easterlies to westerlies)
# (u200 is assumed here to be indexed by year and longitude)
xdx = np.where((jlon >= 120) & (jlon <= 240))[0]
tutt = np.empty(yrs.shape)
for ii in range(len(yrs)):
    u = u200[ii,xdx]
    x = jlon[xdx]
    tutt[ii] = x[np.where(u > 0)[0][0]]
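Taking the first grid point with westerly wind limits the result to the grid spacing, and it fails if no westerlies are found. One possible refinement, shown here only as a sketch, is to interpolate linearly to the longitude where the wind changes sign. The function name first_westerly_longitude is hypothetical, and this is not the method used for the figures in these slides.

```python
import numpy as np

def first_westerly_longitude(u, lon):
    """Longitude of the first easterly-to-westerly transition.

    Linearly interpolates between the two grid points that bracket the
    sign change; returns NaN if no transition is found.
    """
    u = np.asarray(u, dtype=float)
    lon = np.asarray(lon, dtype=float)
    # grid intervals with easterly (or calm) wind on the west side and
    # westerly wind on the east side
    cross = np.where((u[:-1] <= 0) & (u[1:] > 0))[0]
    if cross.size == 0:
        return np.nan
    i = cross[0]
    frac = -u[i] / (u[i + 1] - u[i])
    return lon[i] + frac * (lon[i + 1] - lon[i])

# synthetic wind profile that changes sign at 165.3 degrees east
lon = np.arange(120.0, 241.0, 2.5)
u = 0.5 * (lon - 165.3)
x0 = first_westerly_longitude(u, lon)
```

Note that, unlike the simpler approach, this version returns NaN when the wind is already westerly at the western edge of the search region.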
 

Once we have these variables, we have multiple options for trend analysis. For example, we could use the scipy.stats module to work directly with the numpy arrays:
 
# linear regressions
from scipy import stats

# linear regression over full time series
a0, b0, r0, p0, s0 = stats.linregress(yrs, mln)
# linear regression from 1979 onward (yrs[21] is 1979)
a1, b1, r1, p1, s1 = stats.linregress(yrs[21:], mln[21:])

# linear regression over full time series
a2, b2, r2, p2, s2 = stats.linregress(yrs, tutt)
# linear regression from 1979 onward (yrs[21] is 1979)
a3, b3, r3, p3, s3 = stats.linregress(yrs[21:], tutt[21:])

# linear correlation
r, p = stats.pearsonr(mln, tutt)
 
stats.linregress returns the slope, intercept, correlation
coefficient R, 2-tailed p-value (for the null hypothesis that the slope is zero), and the
standard error of the slope estimate; stats.pearsonr returns the
Pearson correlation coefficient and the associated 2-tailed p-value.
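To make these quantities concrete, the slope, intercept, correlation coefficient, and standard error of the slope can be written out in a few lines of numpy. This is a sketch for illustration (the function name simple_linregress is my own, the data are synthetic, and the p-value is omitted because it requires the t distribution); in practice the scipy function is the better choice.

```python
import numpy as np

def simple_linregress(x, y):
    """Slope, intercept, correlation, and standard error of the slope."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    dx = x - x.mean()
    dy = y - y.mean()
    sxx = (dx ** 2).sum()
    sxy = (dx * dy).sum()
    syy = (dy ** 2).sum()
    slope = sxy / sxx
    intercept = y.mean() - slope * x.mean()
    r = sxy / np.sqrt(sxx * syy)
    # residual variance uses n - 2 degrees of freedom
    resid = y - (intercept + slope * x)
    stderr = np.sqrt((resid ** 2).sum() / (n - 2) / sxx)
    return slope, intercept, r, stderr

# synthetic series (not real data): a downward trend plus a small wiggle
x = np.arange(36.0)
y = 163.0 - 0.25 * x + 0.5 * np.cos(x)
slope, intercept, r, stderr = simple_linregress(x, y)
```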

A convenient alternative in many cases is to construct a pandas DataFrame and use seaborn.regplot(), especially if we only need to see the trend:
 
# -*- coding: utf-8 -*-
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# plot parameters
sns.set_style('darkgrid')

# make a pandas dataframe
df = pd.DataFrame({'jtwc': mln, 'tutt': tutt}, index=yrs)

# plot time series of mean longitude of cyclogenesis, with trends
fig = plt.figure(figsize=(12,6))
axa = fig.add_subplot(211)
axa.plot(df.index, df['jtwc'], '-', color='#377eb8')
sns.regplot(x=df.index, y=df['jtwc'], color='#377eb8', ax=axa, truncate=True)
sns.regplot(x=df.loc[1979:].index, y=df.loc[1979:]['jtwc'], color='#377eb8',
            line_kws={'linestyle': '--'}, truncate=True, ci=None, ax=axa)
axa.set_xlabel('')
axa.set_xlim(1955,2015)
axa.set_xticks(range(1960, 2011, 10))
axa.set_ylabel(u'TC mean genesis longitude [\u00b0E]', color='#377eb8')
axa.set_ylim(125,150)
axa.set_yticks(range(125,151,5))
axa.set_yticklabels(range(125,151,5), color='#377eb8')
 

 
sns.regplot(x=df.index, y=df['jtwc'], color='#377eb8', ax=axa, truncate=True)
sns.regplot(x=df.loc[1979:].index, y=df.loc[1979:]['jtwc'], color='#377eb8',
            line_kws={'linestyle': '--'}, truncate=True, ci=None, ax=axa)
 
[Figure: time series of TC mean genesis longitude (°E, left axis) and TUTT longitude (°E, right axis) with linear trends, 1960–2010; R = 0.64]

A more sophisticated alternative to scipy.stats.linregress() is offered by statsmodels.OLS():
 
import statsmodels.api as sm
# independent variable: years since 1979, so the intercept applies to 1979
df['year'] = df.index - 1979
# add_constant prepares the x values
xvl = sm.add_constant(df['year'])
# use standard-type OLS to generate slope and intercept
trd = sm.OLS(df['tutt'], xvl).fit()

import statsmodels.formula.api as smf
# use formula-type ols to generate slope and intercept
trd = smf.ols('tutt ~ year', data=df).fit()
# for 1979-2014
trd = smf.ols('tutt ~ year', data=df.loc[1979:]).fit()
 
Here we focus only on the 1979–2014 trend in TUTT longitude for
comparison with the results in the paper. The function
sm.add_constant() adds a column of ones to the independent
(x) variable array, which ensures that the regression model
contains an intercept. We also subtract the initial year from the
independent variable, so that the intercept corresponds to
the year 1979 rather than the year 0.
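The role of the column of ones, and the effect of shifting the years, can be seen with a plain least-squares fit in numpy. The sketch below uses a synthetic, noise-free series (not the TUTT data) so that the expected answers are known exactly.

```python
import numpy as np

years = np.arange(1979, 2015)
# synthetic stand-in for a longitude series: 163 in 1979, falling 0.25 per year
y = 163.0 - 0.25 * (years - 1979)

# design matrix: a column of ones (the intercept) plus years since 1979
X = np.column_stack([np.ones(years.size), years - 1979])
(intercept, slope), *_ = np.linalg.lstsq(X, y, rcond=None)

# same fit without shifting the years: the slope is unchanged, but the
# intercept is now the extrapolated value at year 0
X0 = np.column_stack([np.ones(years.size), years])
(intercept0, slope0), *_ = np.linalg.lstsq(X0, y, rcond=None)
```

With the shift, the intercept is the fitted value in 1979 (163 here); without it, the intercept is 657.75, which has no physical meaning for this series.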

We can print a detailed summary of the regression model:


In[13]: print(trd.summary())
Out[13]:
OLS Regression Results
==============================================================================
Dep. Variable: tutt R-squared: 0.187
Model: OLS Adj. R-squared: 0.163
Method: Least Squares F-statistic: 7.809
Date: Tue, 22 Mar 2016 Prob (F-statistic): 0.00848
Time: 08:50:25 Log-Likelihood: -114.38
No. Observations: 36 AIC: 232.8
Df Residuals: 34 BIC: 235.9
Df Model: 1
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [95.0% Conf. Int.]
------------------------------------------------------------------------------
const 163.4347 1.950 83.830 0.000 159.473 167.397
x1 -0.2677 0.096 -2.794 0.008 -0.462 -0.073
==============================================================================
Omnibus: 3.578 Durbin-Watson: 1.963
Prob(Omnibus): 0.167 Jarque-Bera (JB): 1.571
Skew: 0.004 Prob(JB): 0.456
Kurtosis: 1.977 Cond. No. 39.9
==============================================================================

which tells us that the trend in TUTT longitude is −0.27° longitude yr⁻¹ (a westward shift), and that this trend is significant at the 99% level (p = 0.008).

ENSO can also cause interannual variability in the locations of cyclogenesis and the TUTT, so we should remove its influence from the time series. Here we use statsmodels to calculate the ordinary least squares linear regression of mean cyclogenesis longitude against the Niño 3.4 index and then subtract the fitted values from the original time series:
 
import pandas as pd
import statsmodels.formula.api as smf

# make a pandas dataframe (nino holds the Nino 3.4 index for each year)
df = pd.DataFrame({'jtwc': mln, 'tutt': tutt, 'nino': nino}, index=yrs)
# regress JTWC mean longitude of cyclogenesis against Nino 3.4
rgr = smf.ols('jtwc ~ nino', data=df).fit()
# remove ENSO component of variability
df['estm'] = df['jtwc'].mean() + (df['jtwc'] - rgr.predict(df))
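The same operation can be written with numpy alone, which makes it easy to verify two properties of the adjusted series: its mean is unchanged, and it is uncorrelated with the index that was regressed out. This is a sketch with synthetic data; the function name remove_linear_influence is my own.

```python
import numpy as np

def remove_linear_influence(y, index):
    """Subtract the least-squares linear fit of y on index, keeping the mean."""
    y = np.asarray(y, dtype=float)
    index = np.asarray(index, dtype=float)
    slope, intercept = np.polyfit(index, y, 1)
    fitted = intercept + slope * index
    return y.mean() + (y - fitted)

# synthetic example: a series that depends partly on a climate index
rng = np.random.default_rng(0)
index = rng.standard_normal(50)
series = 140.0 + 3.0 * index + rng.standard_normal(50)
adjusted = remove_linear_influence(series, index)
```

Only the linear, simultaneous relationship is removed; lagged or nonlinear ENSO effects would remain in the adjusted series.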
 

[Figure: time series of TC mean genesis longitude (°E) by year, 1960–2010, for JTWC (original) and JTWC (ENSO influence removed)]

Although ENSO affects the year-to-year variability, the trend with ENSO removed (−0.11° longitude yr⁻¹) is within the 95% confidence interval around the original trend (−0.14 ± 0.14° longitude yr⁻¹), so we can neglect ENSO effects.

The next step is to count the annual number of TCs generated in the western (< 145°E) and eastern (≥ 145°E) parts of the WNP:
 
# get number of TCs forming in the eastern/western WNP for each year
yrs = np.arange(1958,2015)
w_n = np.zeros(yrs.shape)   # a year with no storms counts as zero
e_n = np.zeros(yrs.shape)
for ii in range(len(yrs)):
    idx = np.where((cgt == yrs[ii]) & (cgm >= 6) & (cgm <= 11))[0]
    if idx.size > 0:
        w_n[ii] = (cgx[idx] < 145).sum()
        e_n[ii] = (cgx[idx] >= 145).sum()
 
[Figure: annual TC counts in the western WNP and eastern WNP, 1960–2010; R = 0.27]

To get the mean tropical–subtropical temperature gradient we return to cdo, first selecting levels between 850 and 200 hPa and then calculating the vertical and annual averages:
cdo vertmean tmp_jjason_ts.nc4 tmp_jjason_tsmean.nc4
cdo yearmean tmp_jjason_tsmean.nc4 tmp_yr_tsmean.nc4
cdo sellonlatbox,145,180,-10,5 tmp_yr_tsmean.nc4 tmp_yr_tsmean_tropics.nc4
cdo sellonlatbox,145,180,15,30 tmp_yr_tsmean.nc4 tmp_yr_tsmean_subtrop.nc4
cdo fldmean tmp_yr_tsmean_tropics.nc4 fig3_tmp_tropics.nc4
cdo fldmean tmp_yr_tsmean_subtrop.nc4 fig3_tmp_subtrop.nc4

where sellonlatbox selects a region and fldmean takes the area-weighted spatial average. The processing for calculating the mean zonal wind shear in the TUTT region is similar:
cdo sellevel,85000,20000 uwd_jjason.nc4 uwd_jjason_vs.nc4
cdo sellonlatbox,145,180,5,25 uwd_jjason_vs.nc4 uwd_jjason_subtrop.nc4
cdo fldmean uwd_jjason_subtrop.nc4 uwd_jjason_subtropmean.nc4
cdo yearmean uwd_jjason_subtropmean.nc4 fig3_uwd.nc4

The actual subtraction of 850 hPa u from 200 hPa u is left for the
python code.
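For intuition about what an area-weighted spatial average does, the sketch below weights each latitude row by the cosine of latitude, which is a common approximation on a regular latitude–longitude grid. It is not a reimplementation of cdo fldmean (which works from the grid cell areas), and the function name area_weighted_mean is my own.

```python
import numpy as np

def area_weighted_mean(field, lat):
    """Approximate area-weighted mean of a (lat, lon) field.

    Assumes a regular latitude-longitude grid with no missing values,
    so that cell area is proportional to cos(latitude).
    """
    field = np.asarray(field, dtype=float)
    weights = np.cos(np.deg2rad(np.asarray(lat, dtype=float)))
    # average along longitude first, then weight the latitude rows
    return np.average(field.mean(axis=1), weights=weights)

# two latitude rows: the row at 60N covers half the area of the row at 0
lat = np.array([0.0, 60.0])
field = np.array([[0.0, 0.0, 0.0],
                  [60.0, 60.0, 60.0]])
weighted = area_weighted_mean(field, lat)
unweighted = field.mean()
```

Here the weighted mean is 20 while the unweighted mean is 30, because the higher-latitude row represents less area.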

[Figure: time series of temperature gradient (K, left axis) and zonal wind shear (m s⁻¹, right axis) by year, 1960–2010; R = 0.97]

The strong relationship between these two time series is consistent with the hypothesis that interannual variability and trends in wind shear (and hence the preferred location of cyclogenesis) in this region are driven by the thermal wind relationship.
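For reference, the thermal wind relationship for the zonal wind in pressure coordinates can be written as follows. This is the standard textbook form rather than an equation taken from the paper: u_g is the geostrophic zonal wind, R the gas constant for dry air, f the Coriolis parameter, ⟨T⟩ the layer-mean temperature, and p_0 and p_1 the lower and upper pressure levels (here 850 and 200 hPa).

```latex
\frac{\partial u_g}{\partial \ln p}
  = \frac{R}{f}\left(\frac{\partial T}{\partial y}\right)_p
\quad\Longrightarrow\quad
u_g(p_1) - u_g(p_0)
  = -\frac{R}{f}\,\frac{\partial \langle T \rangle}{\partial y}\,
    \ln\frac{p_0}{p_1}
```

In the Northern Hemisphere (f > 0), a layer-mean temperature that decreases poleward therefore corresponds to westerly shear between the two levels, and a change in the meridional temperature gradient implies a proportional change in that shear.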

To calculate vertical cross-sections of zonal wind and temperature trends, we first calculate the appropriate vertical cross-sections of zonal wind and temperature using cdo. For temperature:
cdo sellonlatbox,145,190,-50,50 tmp_jjason.nc4 tmp_jason_pac.nc4
cdo yearmean tmp_jason_pac.nc4 tmp_annual_pac.nc4
cdo zonmean tmp_annual_pac.nc4 fig4_tmp_paczm.nc4

and for zonal wind:


cdo sellonlatbox,90,210,5,25 uwd_jjason.nc4 uwd_jason_pac.nc4
cdo yearmean uwd_jason_pac.nc4 uwd_annual_pac.nc4
cdo mermean uwd_annual_pac.nc4 fig4_uwd_pacmm.nc4

Note that we calculate the zonal mean (zonmean) of temperature across the selected region, but the meridional mean (mermean) of zonal wind.

To calculate trends in a time series effectively, we can define a function that uses statsmodels.OLS():
 
import numpy as np
import statsmodels.api as sm

# function to calculate trend and significance from a pandas time series
def pdtrend(x, ci=0.95):
    """
    A basic function for calculating trends given a pandas time
    series.

    Variables:
        x :: the series

    Parameters:
        ci :: confidence level for significance testing
    """
    # statsmodels regression requires us to add a constant
    xvl = sm.add_constant(np.array(x.index))
    # ordinary least squares linear regression
    rgr = sm.OLS(x.values, xvl).fit()
    # trend slope (per year in this case)
    trnd = rgr.params[1]
    # simple representation of statistical significance (True/False)
    tsig = (rgr.pvalues[1] < (1 - ci))
    return (trnd, tsig)
 

An important note at this point: the example paper we are working from uses a non-parametric Mann-Kendall test, and accounts for auto-correlation in the time series in significance testing. The function on the previous slide uses Student's t-test, which is parametric and assumes that the underlying data are normally distributed. We have also not adjusted the effective sample size to account for auto-correlation. Our approach is acceptable for exploratory analysis, but would be unsuitable for publication-quality work. The reason that we have used a function in this case is that this makes it easier to later add or modify the criteria for statistical significance to make them more sophisticated or data-aware. For more details, see Chapter 17 of Statistical Analysis in Climate Research by Hans von Storch and Francis W. Zwiers.
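To give a sense of what the non-parametric alternative involves, here is a minimal sketch of the basic Mann-Kendall trend test using the normal approximation. It is deliberately simplified: it includes no correction for tied values and no adjustment for auto-correlation, so it is not equivalent to the procedure used in the paper. The function name mann_kendall is my own.

```python
import math
import numpy as np

def mann_kendall(x, alpha=0.05):
    """Basic Mann-Kendall test (no tie or auto-correlation corrections).

    Returns the S statistic, the normal score z, the two-tailed p-value,
    and a boolean indicating significance at level alpha.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # S counts later-minus-earlier increases minus decreases over all pairs
    s = 0.0
    for i in range(n - 1):
        s += np.sign(x[i + 1:] - x[i]).sum()
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # continuity correction
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # two-tailed p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p, p < alpha

s_up, z_up, p_up, sig_up = mann_kendall(np.arange(20.0))
s_alt, z_alt, p_alt, sig_alt = mann_kendall([0, 1] * 5)
```

A steadily increasing series gives the largest possible S and a significant result, while the alternating series does not. Because the function returns a True/False flag, it could be swapped into a trend function in place of the t-test.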

Our function returns both the slope of the trend and a boolean
(True or False) value indicating whether it is significant or not,
which we can use in tandem with numpy masked arrays:
 
import numpy as np
import pandas as pd

tru = np.ma.masked_all(mm_u.shape[1:])
trt = np.ma.masked_all(zm_t.shape[1:])
for zz in range(tru.shape[0]):
    for xx in range(tru.shape[1]):
        ser = pd.Series(mm_u[:,zz,xx])
        m, s = pdtrend(ser)
        tru.data[zz,xx] = m*10  # convert trend to per decade
        tru.mask[zz,xx] = ~s
    for yy in range(trt.shape[1]):
        ser = pd.Series(zm_t[:,zz,yy])
        m, s = pdtrend(ser)
        trt.data[zz,yy] = m*10  # convert trend to per decade
        trt.mask[zz,yy] = ~s
 
Masked arrays make it convenient to emphasize significant trends
or hide insignificant trends in the contour plots.
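A small example may help to show how the mask behaves. The values below are made up for illustration; the point is that masked (insignificant) entries are skipped by array methods and by plotting functions.

```python
import numpy as np

# made-up trends and significance flags on a 2 x 3 grid
trend = np.array([[0.8, -0.1, 0.3],
                  [-0.6, 0.05, 0.4]])
significant = np.array([[True, False, False],
                        [True, False, True]])

# mask=True hides a value, so we mask where the trend is NOT significant
masked_trend = np.ma.masked_array(trend, mask=~significant)

n_shown = masked_trend.count()          # number of unmasked values
mean_significant = masked_trend.mean()  # mean over unmasked values only
filled = masked_trend.filled(np.nan)    # plain array with NaN where masked
```

Note that ~ inverts numpy boolean values and arrays as expected, but applied to a built-in Python bool it gives an integer (~True is -2), so the significance flag should be a numpy boolean.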

 
cs1 = ax.contour(jlon, jlev, mm_u.mean(axis=0), np.arange(-50,50,2),
                 colors='k')
cs2 = ax.contourf(jlon, jlev, tru.data, np.linspace(-1,1,11),
                  cmap=plt.cm.RdBu_r, extend='both')
ax.contourf(jlon, jlev, tru.mask.astype(int), [-0.5,0.5],
            hatches=['xx', None], colors='none', edgecolor='#666666',
            zorder=10)
cb = plt.colorbar(cs2, orientation='vertical', extend='both', aspect=50)
cb.set_ticks([-1,-0.5,0,0.5,1])
cb.set_label('Trend in zonal wind [m s$^{-1}$ dec$^{-1}$]')
 
[Figure: pressure–longitude cross-section of the trend in zonal wind (m s⁻¹ dec⁻¹) and pressure–latitude cross-section of the trend in temperature (K dec⁻¹), 1000–100 hPa]

We can also limit our trend analysis to only the period after 1979,
in which case our results are more similar to those in the paper.
 
ncdf = Dataset(ddir + 'fig4_uwd_pacmm.nc4')
jlev = ncdf.variables['lev'][:27]*0.01   # Pa to hPa
jlon = ncdf.variables['lon'][:]
# time index 21 onward corresponds to 1979 and later
mm_u = np.squeeze(ncdf.variables['ugrd'][21:,:27,:,:])
ncdf.close()
 

[Figure: as in the previous figure, but with trends calculated from 1979 onward]
