
A PROJECT REPORT

Title: Relation between Temperature and Humidity
Submitted by
Vennela Choutpally (16MES0022)
Pragna Kaduru (16MES0038)
Vani Manasa Nyshadham (16MES0053)
Nisarg Shah (16MES0057)
Course Code: MAT5007
Course Title: Applied Statistical Methods

Under the guidance of

Prof. Rita.S
Assistant Professor (Sr), SAS,
VIT University, Vellore.
SCHOOL OF ELECTRONICS ENGINEERING

Abstract
In this project we calculate the measures of central tendency
(mean, median and mode) and the correlation between temperature
and humidity for each individual month over the period 1901 to
2015.

Introduction
Statistics is the study of the collection, analysis, interpretation,
presentation, and organization of data. In applying statistics to a
scientific, industrial, or social problem, it is conventional to begin with a
statistical population or a statistical model to be studied.
Populations can be as diverse as "all people living in a country"
or "every atom composing a crystal". Statistics deals with all aspects of
data, including the planning of data collection in terms of the design of
surveys and experiments.
There are two broad categories of statistics: descriptive and
inferential.
1. Descriptive statistics summarize population data numerically or
graphically by deriving
   - statistics pertaining to central tendency, such as the mean,
     median, or mode;
   - statistics pertaining to dispersion around the central tendency,
     such as the range or standard deviation;
   - statistics or graphs depicting the shape of a distribution.
2. Inferential statistics allow one to infer population parameters from
sample statistics and to model relationships within the data.
The categories of inferential statistics are
   - Estimation: the group of statistics that allow population values
     to be estimated from sample data.
   - Modelling: the development of mathematical equations that
     describe the interrelationships between two or more variables.

Methodology

In this project we collected secondary data on temperature and
humidity for the city of Vellore for the years 1901-2015. We then
computed the measures of central tendency (mean, median, mode)
and the correlation on the data grouped month-wise, i.e. the
calculations were done on the January data for all 115 years,
then on the February data, and so on, in order to analyse the
relationship between temperature and humidity.
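The month-wise grouping described above can be sketched in R as follows. This is only an illustration under stated assumptions: the data are randomly generated stand-ins (not the Vellore measurements), and the long-format layout with YEAR and MONTH columns is hypothetical; only the column names TEMP and HUMIDITY follow the transcripts in this report.

```r
# Hypothetical long-format data: one row per year and month.
# YEAR and MONTH are assumed column names; the values are synthetic.
set.seed(1)
weather <- expand.grid(YEAR = 1901:2015, MONTH = month.abb[1:4])
weather$TEMP <- round(rnorm(nrow(weather), mean = 27, sd = 1), 2)
weather$HUMIDITY <- round(rnorm(nrow(weather), mean = 68, sd = 3), 2)

# Split into one data frame per month so each month is analysed separately.
by_month <- split(weather, weather$MONTH)

# Per-month mean and median temperature, plus the Pearson correlation.
month_stats <- sapply(by_month, function(d) {
  c(mean_temp    = mean(d$TEMP),
    median_temp  = median(d$TEMP),
    cor_temp_hum = cor(d$TEMP, d$HUMIDITY))
})
print(round(month_stats, 3))
```

Each element of by_month then plays the role of a per-month data frame such as jan_max or feb_max used in the calculations.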
Statistics approach:
We calculate the measures of central tendency (mean, median and
mode) along with the correlation between temperature and
humidity.
Testing the hypothesis:
Hypothesis testing allows us to test whether a hypothesis we have
developed is supported by a systematic analysis of the data.

Null hypothesis:
H0: There is no correlation between temperature and humidity.
Alternate hypothesis:
H1: There is a correlation between temperature and humidity.
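As a minimal sketch of how these two hypotheses are usually decided with cor.test(), the example below compares the p-value with a significance level. The data are synthetic, and the level alpha = 0.05 is an assumption, since the report does not state one.

```r
# Synthetic stand-in data (NOT the Vellore measurements).
set.seed(42)
temp <- rnorm(115, mean = 24, sd = 0.7)
humidity <- 66 - 1.2 * (temp - 24) + rnorm(115, sd = 3)

alpha <- 0.05  # assumed significance level; the report does not state one
result <- cor.test(temp, humidity, method = "pearson")

# Reject H0 (no correlation) only when the p-value falls below alpha.
decision <- if (result$p.value < alpha) "reject H0" else "fail to reject H0"
cat("r =", round(result$estimate, 3),
    " p =", signif(result$p.value, 3), "->", decision, "\n")
```

A small p-value is evidence against H0; a large p-value means only that the data do not contradict H0.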

Calculations
Month - January

Line Graph: Temperature vs Years

Bar Graph: Temperature vs Humidity

The correlation between the temperatures and the humidity

Scatter Graph: Temperature vs Humidity

Line Graph: Humidity vs Years
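Graphs of the kinds listed above can be drawn with base R graphics. The sketch below uses a synthetic data frame, jan_demo, which is a hypothetical stand-in and not the report's jan_max data.

```r
# Synthetic stand-in for one month's data (NOT the Vellore measurements).
set.seed(7)
jan_demo <- data.frame(YEAR = 1901:2015)
jan_demo$TEMP <- rnorm(115, mean = 23.6, sd = 0.7)
jan_demo$HUMIDITY <- 65.7 - 1.1 * (jan_demo$TEMP - 23.6) + rnorm(115, sd = 2.5)

# Line graph: temperature against year.
plot(jan_demo$YEAR, jan_demo$TEMP, type = "l",
     xlab = "Year", ylab = "Temperature", main = "Temperature vs Years")

# Line graph: humidity against year.
plot(jan_demo$YEAR, jan_demo$HUMIDITY, type = "l",
     xlab = "Year", ylab = "Humidity", main = "Humidity vs Years")

# Scatter graph with a least-squares line showing the direction of the relationship.
plot(jan_demo$TEMP, jan_demo$HUMIDITY,
     xlab = "Temperature", ylab = "Humidity", main = "Temperature vs Humidity")
abline(lm(HUMIDITY ~ TEMP, data = jan_demo), lty = 2)
```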

Mean Maximum Temperature in January in the time period:


> Mean=mean(jan_max$TEMP)
> Mean
[1] 23.63513
Mean Humidity in January in the time period:
> Mean_humidity=mean(jan_max$HUMIDITY)
> Mean_humidity
[1] 65.66303

Calculation of Mode:
> table_jan=table(jan_max$TEMP)
> mode=which(table_jan==max(table_jan));
> mode
23.57 23.61 23.91
43 46 59
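In the output above, the first row lists the modal temperature values and the second row gives their positions in the frequency table, because which() returns indices rather than counts. A small helper that reports each modal value together with its frequency might look like the sketch below; the function name stat_modes and the sample vector are illustrative and not part of the original analysis.

```r
# Return every value that attains the highest frequency, with its count.
stat_modes <- function(x) {
  counts <- table(x)
  top <- counts[counts == max(counts)]
  data.frame(value = as.numeric(names(top)),
             count = as.integer(top))
}

# Small made-up example (not the report's data):
# 23.61 and 23.91 each occur twice, every other value once.
temps <- c(23.57, 23.61, 23.61, 23.91, 23.91, 24.10)
modes <- stat_modes(temps)
print(modes)
```

When several values tie for the highest frequency, as here, the data are multimodal and all tied values are returned.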
Calculation of Median:
> Median=median(jan_max$TEMP)
> Median
[1] 23.61
Standard Deviation of Temperatures:
> SD=sd(jan_max$TEMP)
> SD
[1] 0.7390883

Variance in Temperatures:
> Variance=SD*SD
> Variance
[1] 0.5462515
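The relation used above, variance equals the square of the standard deviation, can be checked against the definitions. The sketch below uses a few made-up values rather than the report's data; R's sd() and var() both use the sample (n - 1) denominator.

```r
x <- c(22.0, 23.1, 23.6, 24.2, 25.7)  # made-up temperatures
n <- length(x)

# Sample variance from its definition, with the n - 1 denominator.
variance_manual <- sum((x - mean(x))^2) / (n - 1)
sd_manual <- sqrt(variance_manual)

# Each comparison should report TRUE.
print(all.equal(variance_manual, var(x)))
print(all.equal(sd_manual, sd(x)))
print(all.equal(sd(x)^2, var(x)))
```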
Summary Statistics :
> summary(jan_max$TEMP)
Min. 1st Qu. Median Mean 3rd Qu. Max.
22.00 23.08 23.61 23.64 24.16 25.66
> summary(jan_max$HUMIDITY)
Min. 1st Qu. Median Mean 3rd Qu. Max.
60.25 63.50 66.21 65.66 67.41 70.32

Correlation between Temperature and Humidity


> cor.test(jan_max$TEMP,jan_max$HUMIDITY,method="pearson");
Pearson's product-moment correlation
data: jan_max$TEMP and jan_max$HUMIDITY
t = -3.0507, df = 113, p-value = 0.002845
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.43689369 -0.09767445

sample estimates:
cor
-0.2758513
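The t statistic and p-value in this output follow from the sample correlation r and the degrees of freedom through t = r * sqrt(df) / sqrt(1 - r^2). As a check, the sketch below recomputes both from the reported January values; the results should agree with the output above up to rounding.

```r
r  <- -0.2758513   # sample correlation reported for January
df <- 113          # reported degrees of freedom, i.e. n - 2 with n = 115

# Test statistic for H0: true correlation equals 0.
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution with df degrees of freedom.
p_value <- 2 * pt(-abs(t_stat), df)

print(round(t_stat, 4))
print(signif(p_value, 4))
```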

Month - February

Line Graph: Temperature vs Years

Bar Graph: Temperature vs Humidity

The correlation between the Temperatures and Humidity

Scatter Graph: Temperature vs Humidity

Line Graph: Humidity vs Years

Mean Maximum Temperature in February in the period:


> Mean=mean(feb_max$TEMP)
> Mean
[1] 25.52843
Mean Humidity in February during the period:
> Mean=mean(feb_max$HUMIDITY)
> Mean
[1] 65.66303

Calculation of Mode:
> table_feb=table(feb_max$TEMP)
> Mode=which(table_feb==max(table_feb))
> Mode
25.12 25.35 26.07
40 47 66
Calculation of Median:
> Median=median(feb_max$TEMP)
> Median
[1] 25.39
Standard Deviation of the Temperatures:
> SD=sd(feb_max$TEMP)
> SD
[1] 1.030881
Variance:
> Variance=var(feb_max$TEMP)
> Variance
[1] 1.062715
Summary Statistics:
> summary(feb_max$TEMP)
Min. 1st Qu. Median Mean 3rd Qu. Max.
22.83 24.78 25.39 25.53 26.28 29.33
> summary(feb_max$HUMIDITY)
Min. 1st Qu. Median Mean 3rd Qu. Max.
60.25 63.50 66.21 65.66 67.41 70.32

Correlation between the Temperature and the Humidity


> cor.test(feb_max$TEMP,feb_max$HUMIDITY,method="pearson")
Pearson's product-moment correlation
data: feb_max$TEMP and feb_max$HUMIDITY
t = -2.5786, df = 113, p-value = 0.0112
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.40152011 -0.05500594

sample estimates:
cor
-0.235742
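The 95 percent confidence interval reported by cor.test() for a Pearson correlation is based on Fisher's z-transformation. The sketch below recomputes it from the reported February correlation, taking n = 115 from df + 2; the result should agree with the interval above up to rounding.

```r
r <- -0.235742   # sample correlation reported for February
n <- 115         # number of yearly observations (df + 2)

z      <- atanh(r)          # Fisher z-transform of r
se     <- 1 / sqrt(n - 3)   # approximate standard error of z
z_crit <- qnorm(0.975)      # critical value for a two-sided 95% interval

# Transform the interval for z back to the correlation scale.
ci <- tanh(z + c(-1, 1) * z_crit * se)
print(round(ci, 4))
```

Because the whole interval lies below zero, it is consistent with the p-value being smaller than 0.05.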

Month - March

Line Graph: Temperature vs Years

Bar Graph: Temperature vs Humidity

Correlation between the Temperatures and the Humidity

Scatter Graph: Temperature vs Humidity

Line Graph: Humidity vs Years

Mean of the Maximum Temperatures in March in the period


> Mean=mean(mar_max$TEMP)
> Mean
[1] 29.03339

Mean of the Humidity in the period

> Mean=mean(mar_max$HUMIDITY)
> Mean
[1] 65.66303

Calculation of Mode:
> table_mar=table(mar_max$TEMP)
> View(table_mar)
> Mode=which(table_mar==max(table_mar))
> Mode
Temperature values sharing the highest frequency:
27.04 27.31 27.62 27.78 28 28.11 28.21 28.61 28.7 28.74 28.75 28.89 28.9 28.95 29.04 29.07 30.32 30.79
Positions of these values in table_mar (list incomplete):
2 13 19 20 27 31 33 34 40 41 42 45 47 87 94

Calculation of Median:
> Median=median(mar_max$TEMP)
> Median
[1] 29.02

Standard Deviation:
> SD=sd(mar_max$TEMP)
> SD
[1] 0.99691

Variance:
> Variance=var(mar_max$TEMP)
> Variance
[1] 0.9938296

Summary Statistics:
> summary(mar_max$TEMP)
Min. 1st Qu. Median Mean 3rd Qu. Max.
26.68 28.31 29.02 29.03 29.58 32.33
> summary(mar_max$HUMIDITY)
Min. 1st Qu. Median Mean 3rd Qu. Max.
60.25 63.50 66.21 65.66 67.41 70.32

Correlation:
> cor.test(mar_max$TEMP,mar_max$HUMIDITY,method="pearson");

Pearson's product-moment correlation

data: mar_max$TEMP and mar_max$HUMIDITY


t = -1.0955, df = 113, p-value = 0.2756
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.28035776 0.08214257
sample estimates:
cor
-0.1025098

Month - April

Line Graph: Temperature vs Years

Bar Graph: Temperature vs Humidity

Correlation:

Scatter Graph: Temperature vs Humidity

Line Graph: Humidity vs Years

Mean of the Temperatures in April during the period:


> Mean=mean(april_max$TEMP)
> Mean
[1] 31.92035

Mean of the Humidity:


> Mean=mean(april_max$HUMIDITY)

> Mean
[1] 76.01823

Calculation of Mode:
> table_apr=table(april_max$TEMP)
> Mode=which(table_apr==max(table_apr))
> Mode
31.7
36

Calculation of Median:
> Median=median(april_max$TEMP)
> Median
[1] 31.95

Standard Deviation:
> SD=sd(april_max$TEMP)
> SD
[1] 0.7891133

Variance:
> Variance=var(april_max$TEMP)
> Variance
[1] 0.6226999

Summary Statistics:

> summary(april_max$TEMP)
Min. 1st Qu. Median Mean 3rd Qu. Max.
30.01 31.44 31.95 31.92 32.40 34.07
> summary(april_max$HUMIDITY)
Min. 1st Qu. Median Mean 3rd Qu. Max.
70.00 74.23 76.32 76.02 78.36 80.32

Correlation between the Temperature and the Humidity:


> cor.test(april_max$TEMP,april_max$HUMIDITY,method="pearson")

Pearson's product-moment correlation

data: april_max$TEMP and april_max$HUMIDITY


t = 0.75939, df = 113, p-value = 0.4492
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.1133332 0.2510904
sample estimates:
cor
0.07125604

Inference
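Taking the conventional 5% significance level (p < 0.05), the p-values
for January (0.002845) and February (0.0112) are below the threshold, so
the null hypothesis is rejected for these months: there is a statistically
significant but weak negative correlation (r = -0.28 and r = -0.24). For
March (p = 0.2756) and April (p = 0.4492) the null hypothesis cannot be
rejected. Overall, the correlation between temperature and humidity is
weak at best.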
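Applying the usual decision rule to the four reported results can be sketched as below. The correlations and p-values are copied from the cor.test() outputs in the Calculations section; the significance level alpha = 0.05 is an assumption, since the report does not state one.

```r
# Values copied from the cor.test() outputs reported for each month.
results <- data.frame(
  month   = c("January", "February", "March", "April"),
  r       = c(-0.2758513, -0.235742, -0.1025098, 0.07125604),
  p_value = c(0.002845, 0.0112, 0.2756, 0.4492)
)

alpha <- 0.05  # assumed significance level; not stated in the report

# H0 (no correlation) is rejected only where the p-value is below alpha.
results$reject_H0 <- results$p_value < alpha
print(results)
```

Under this assumption H0 is rejected for January and February and not rejected for March and April.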

References
[1] Elizabeth Weatherhead, Gregory Noonan, Tressa Fowler, Ligia
Bernardet, Louisa Nance and Steve Koch, "Summaries of Statistical
Analyses of Differences in Relative Humidity, Temperature".
[2] Climate change and Biodiversity, p. 63, 2002.
[3] M. Li, Y. Dirong, "Some results of applications of statistical
method to climate changes and short term prediction in China", Adv.
Atmos. Sci., vol. 2, pp. 271-281, 1985.
[4] https://en.wikipedia.org/wiki/Statistic
[5] http://www.indiawaterportal.org/articles/meteorological-datasetsdownload-entire-datasets-various-meteorological-indicators-1901
