36 views

Uploaded by Sam Caunca Ortiz

R Programming for numerical measures

- 3-statistical-description-of-data.ppt
- MMEF Portfolio Theory
- 2014b December Paper 1
- S&P500 Garch Output Final
- Lampiran 3
- Phenotypic and Genotypic Correlation Co-efficient of Quantitative Characters and Character Association of Aromatic Rice-libre
- RueyTsay
- ARTICOL PIETE Capital Market Europa Pieteas
- Chap03CT and Variation
- MBA-KERALA-FIRST
- Statistics Manual
- Standard Deviation and Variance
- Tutorial Kf
- 07112542
- A Profitable Trading Strategy
- 3. 2009 Central Limit Theorem
- Lecture Slides Stats1.13.L04.AIR
- Chapter_4
- dexin-zhou.pdf
- paramest.pdf

You are on page 1of 8

CSSSPEC4

(Modeling and Simulation Using MATLAB)

EXERCISE

5

ELEMENTARY STATISTICS WITH R/MATLAB:

NUMERICAL MEASURES

<Name of Student>

<Data Performed>

<Name of Professor>

<Date Submitted>

I.

Objectives:

Cognitive

a) Review on the basic statistical concepts on numerical measures.

Psychomotor:

a Create fast solutions for elementary statistics on numerical measures.

Affective

a Appreciate how R programming can simplify and hasten the solution of numerical

measures problems.

II.

BACKGROUND INFORMATION

Mean

The mean of an observation variable is a numerical measure of the central location of the

data values. It is the sum of its data values divided by data count.

Hence, for a data sample of size n, its sample mean is defined as follows:

To find the mean eruption duration in the data set faithful, we apply the mean function to

compute the mean value of eruptions.

Median

The median of an observation variable is the value at the middle when the data is sorted

in ascending order. It is an ordinal measure of the central location of the data values.

To find the median of the eruption duration in the data set faithful, we apply the median

function to compute the median value of eruptions.

Quartile

There are several quartiles of an observation variable. The first quartile, or lower quartile,

is the value that cuts off the first 25% of the data when it is sorted in ascending order. The

second quartile, or median, is the value that cuts off the first 50%. The third quartile, or

upper quartile, is the value that cuts off the first 75%.

To find the quartiles of the eruption durations in the data set faithful, we apply the

quantile function to compute the quartiles of eruptions.

The first, second and third quartiles of the eruption duration are 2.1627, 4.0000 and

4.4543 minutes respectively.

Percentile

The nth percentile of an observation variable is the value that cuts off the first n percent of

the data values when it is sorted in ascending order.

To find the 32nd, 57th and 98th percentiles of the eruption durations in the data set

faithful, we apply the quantile function to compute the percentiles of eruptions with the

desired percentage ratios.

The 32nd, 57th and 98th percentiles of the eruption duration are 2.3952, 4.1330 and 4.9330

minutes respectively.

Range

The range of an observation variable is the difference of its largest and smallest data

values. It is a measure of how far apart the entire data spreads in value.

To find the range of the eruption duration in the data set faithful, we apply the max and

min function to compute the largest and smallest values of eruptions, then take the

difference.

Interquartile Range

The interquartile range of an observation variable is the difference of its upper and lower

quartiles. It is a measure of how far apart the middle portion of data spreads in value.

To find the interquartile range of eruption duration in the data set faithful, we apply the

IQR function to compute the interquartile range of eruptions.

Box Plot

The box plot of an observation variable is a graphical representation based on its

quartiles, as well as its smallest and largest values. It attempts to provide a visual shape of

the data distribution.

To find the box plot of the eruption duration in the data set faithful, we apply the boxplot

function to produce the box plot of eruptions.

Variance

The variance is a numerical measure of how the data values is dispersed around the mean.

In particular, the sample variance is defined as:

Similarly, the population variance is defined in terms of the population mean and

population size N:

To find the variance of the eruption duration in the data set faithful, we apply the var

function to compute the variance of eruptions.

Standard Deviation

The standard deviation of an observation variable is the square root of its variance.

To find the standard deviation of the eruption duration in the data set faithful, we apply

the sd function to compute the standard deviation of eruptions.

Covariance

The covariance of two variables x and y in a data sample measures how the two are

linearly related. A positive covariance would indicates a positive linear relationship

between the variables, and a negative covariance would indicate the opposite.

The sample covariance is defined in terms of the sample means as:

as:

To find the covariance of the eruption duration and waiting time in the data set faithful,

we observe if there is any linear relationship between the two variables and we apply the

cov function to compute the covariance of eruptions and waiting.

The covariance of the eruption duration and waiting time is 13.978. It indicates a positive

linear relationship between the two variables.

Central Moment

The kth central moment (or moment about the mean) of a data population is:

Similarly, the kth central moment of a data sample is:

To find the third central moment of eruption duration in the data set faithful, we apply the

function moment from the e1071 package. As it is not in the core R library, the package

has to be installed and loaded into the R workspace.

Skewness

The skewness of a data population is defined by the following formula, where 2 and 3

are the second and third central moments.

Intuitively, the skewness is a measure of symmetry. As a rule, negative skewness

indicates that the mean of the data values is less than the median, and the data distribution

is left-skewed. Positive skewness would indicates that the mean of the data values is

larger than the median, and the data distribution is right-skewed.

To find the skewness of eruption duration in the data set faithful, we apply the function

skewness from the e1071 package to compute the skewness coefficient of eruptions. As

the package is not in the core R library, it has to be installed and loaded into the R

workspace.

The skewness of eruption duration is -0.41355. It indicates that the eruption duration

distribution is skewed towards the left.

Kurtosis

The kurtosis of a univariate population is defined by the following formula, where 2 and

4 are the second and fourth central moments.

Intuitively, the kurtosis is a measure of the peakedness of the data distribution. Negative

kurtosis would indicates a flat data distribution, which is said to be platykurtic. Positive

kurtosis would indicates a peaked distribution, which is said to be leptokurtic.

Incidentally, the normal distribution has zero kurtosis, and is said to be mesokurtic.

To find the kurtosis of eruption duration in the data set faithful, we apply the function

kurtosis from the e1071 package to compute the kurtosis of eruptions. As the package is

not in the core R library, it has to be installed and loaded into the R workspace.

The kurtosis of eruption duration is -1.5116, which indicates that eruption duration

distribution is platykurtic. This is consistent with the fact that its histogram is not bellshaped.

III.

EXPERIMENTAL PROCEDURE:

Use the faithful dataset to answer for the following:

1.

Find the mean eruption waiting periods in faithful.

2.

Findthemedianoftheeruptionwaitingperiodsinfaithful.

3.

Find the quartiles of the eruption waiting periods in faithful.

4.

Find the 17th, 43rd, 67th and 85th percentiles of the eruption waiting periods in faithful.

5.

Find the range of the eruption waiting periods in faithful.

6.

Find the interquartile range of eruption waiting periods in faithful.

7.

Find the box plot of the eruption waiting periods in faithful.

8.

Findthevarianceoftheeruptionwaitingperiodsinfaithful.

9.

Find the standard deviation of the eruption waiting periods in faithful.

10.

Find the third central moment of eruption waiting period in faithful.

11.

Find the skewness of eruption waiting period in faithful.

12.

Find the kurtosis of eruption waiting period in faithful.

1. Cite samples of situations when a left skew is preferred than a right skew. Cite samples of

situations when a right skew is preferred than a left skew.

2. Why is kurtosis a measured value in statistics? What does it imply and how does its value

contribute to the analysis of a given data set?

- 3-statistical-description-of-data.pptUploaded byAnonymous rkGhCUh
- MMEF Portfolio TheoryUploaded byRajkumar35
- 2014b December Paper 1Uploaded byecobalas7
- S&P500 Garch Output FinalUploaded byanandparanilesh
- Lampiran 3Uploaded bykaderman harita
- Phenotypic and Genotypic Correlation Co-efficient of Quantitative Characters and Character Association of Aromatic Rice-libreUploaded byMnaSiddique
- RueyTsayUploaded bymirusiv
- ARTICOL PIETE Capital Market Europa PieteasUploaded bySergiu Pupazan
- Chap03CT and VariationUploaded byAbhey Bansal
- MBA-KERALA-FIRSTUploaded byunni
- Statistics ManualUploaded byJeypi Ceron
- Standard Deviation and VarianceUploaded byAgha Shafi Jawaid Khan
- Tutorial KfUploaded byBedadipta Bain
- 07112542Uploaded byVikas Goyal
- A Profitable Trading StrategyUploaded byfredtag4393
- 3. 2009 Central Limit TheoremUploaded byfabremil7472
- Lecture Slides Stats1.13.L04.AIRUploaded byPepe Mejia
- Chapter_4Uploaded byNJeric Ligeralde Ares
- dexin-zhou.pdfUploaded by951m753m
- paramest.pdfUploaded byMouhamadou Diaby
- Chpt5Uploaded bykpchak
- Outline 2016Uploaded bysqu
- Tr the Proper Use of RiskUploaded byFaiaz Ahamed
- Math B22 Practice Exam 1Uploaded byblueberrymuffinguy
- ProbabilityUploaded bybenitogaldos19gmail.com
- Mathematical symbols list (+,-,x,º,=,(,),...)Uploaded bytgoip
- lec2Uploaded byAdeyemi Odeneye
- NWU-EECS-07-03Uploaded byeecs.northwestern.edu
- Output estadísticaUploaded bySandra Martinez
- Classification project- application using SAS base programingUploaded byAndra

- Chi SquareUploaded byJazzmin Angel Comaling
- super bowl projectUploaded byapi-282062604
- Very Basic StatisticsUploaded byElwin Michaelraj Victor
- anl05Uploaded byRoni Tua Yohanes
- Ejemplo Grubbs y CochranUploaded byCarlos Timana
- S1 Summary of DataUploaded byAdnan Muridi Baba
- Time Series Analysis in Python With StatsmodelsUploaded byChong Hoo Chuah
- Lag SelectionUploaded byYenny Khairina Nirmayantina
- PROBLEM SET-4 Continuous Probability - SolutionsUploaded bymaxentiuss
- Chapter 19Uploaded byPRIYANKA H MEHTA
- Mathematics Promotional Exam Cheat SheetUploaded byAlan Aw
- Manual for SOA Exam MLC.Uploaded bydyulises
- hypothesistestingz-testt-test-150831182208-lva1-app6892.pdfUploaded byG Nathan Jd
- chap13-8Uploaded byKarla Hoffman
- Effect Size 4-9-2012Uploaded byMustafa Ersoy
- Independent Sample T-TestUploaded byMichael John F. Natividad
- s2Uploaded byAnoop Chakkingal
- Walkshäusl - 2016 - Mispricing and the Five-factor ModelUploaded byeertsii
- CAPM vs. Fama-French3 ModelUploaded byKaloyan Manolov
- ch6Uploaded byGulleds Ibrahim
- Sta301lec1to45mcqsUploaded bySarfraz Ali
- ELEN4903_hw1_Spring2018Uploaded byAzraf Anwar
- Ch10 Slides 1Uploaded byJenningsJingjingXu
- Pelican SolutionUploaded byrkpatham1718
- Descriptive Statistics - Tasks (Example)-1Uploaded by0p00
- Quantitative Approach Disadvantages Quantitative Approach+DisadvantagesUploaded byrscordova
- Getting Started - Module 1Uploaded byhuynh dung
- Course OverviewUploaded byRocio Gill
- ch03Uploaded byAnkit
- Income inequality and macroeconomic activity in GreeceUploaded byKOSTAS CHRISSIS