
Collating Datasets

Cross Sectional, Panel and Time Series Data


Group 2 – Abijeet Singh, Rohan Mahajan, Ayushi Khurana, Rithvik Kumar

INTRODUCTION
Over the years, India has come a long way in improving its economic and social indices. Despite vast
improvements, the country continues to be held back by deterrents to development such as a high rate of
population growth, poverty and inequalities in income distribution. In addition, caste barriers and
patriarchal traditions, combined with low literacy, pose further barriers to development in the
country.

The Government takes responsibility for collecting and organising data for further analysis.
However, given the diversity of the population and the limited survey methods available, the process is
not easy. Many respondents are unable to participate because they are unaware of the survey
methodology or do not understand the survey questionnaire, and data collectors can be unreliable.
These are a few of the reasons why collecting data is difficult at the ground level.

In the following sections, three datasets are considered to assess the factors that affect
key economic and social indicators in the country. Since developmental factors are to be
analysed, the data collected broadly pertain to agriculture, health and crime. Further, the
methods of collection used by the primary sources are assessed to judge the authenticity, accuracy
and quality of the data. In addition to analysing the macroeconomic indicators, we wish to
understand the challenges in collecting and collating such data.

THEME: Prevalence of Crime Across States (Cross Sectional Data)
The prevalence of crime in a state is a major issue for both the people living in the area and the
government ruling it. It disturbs the well-being of the people. We examine how various social and
structural factors affect the levels of crime in different states.

A cross-sectional dataset comprises a set of observational units, say households, states or
countries, which are measured at a point in time. It must be noted that a significant drawback of this
form of data collection is that it is highly unlikely that all units correspond to exactly the same time
period. In the selected dataset, 28 states in India are the observational units for the year
2015-16.

Variables and Sources


Population per Policeperson (PoP): The variable was sourced from documents by the National Crime
Records Bureau (NCRB), Ministry of Home Affairs. The data are submitted by each state police
department following parameters set by the NCRB.
Population projections for 2015 are based on the 2001 Census. The census projections give the population
as on 1st March of each year from 2001 to 2026. Population enumeration in the 2001 census was carried
out from 9th to 28th February of the same year. This was preceded by a household estimation to make it
easier to enumerate and distinguish the household and houseless population.

Violent Crime Rate (VC rate): The variable was obtained from documents by the National Crime
Records Bureau (NCRB), Ministry of Home Affairs. Violent crime cases are reported at various police
stations in the districts of a state. These data are then consolidated by state-level police agencies,
which include the State Crime Records Bureau (SCRB) and the criminal investigation department. The
number of violent crimes is then submitted by the respective state governments to the NCRB
in a predetermined format set by the NCRB.

The Bureau adopts the ‘Principal Offence Rule’, which implies that if more than one crime is
registered in a single First Information Report (FIR), it is counted as a single crime. The crime
counted is the most heinous one (the one which carries the maximum punishment).
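
The counting rule described above can be illustrated with a minimal sketch. The offence names and punishment values are hypothetical and are not taken from NCRB documents; the function name `principal_offence` is ours.

```python
# Hypothetical FIR: each offence is listed with its maximum punishment in years.
# The names and values below are illustrative only, not NCRB data.
def principal_offence(offences):
    """Return the single offence counted for an FIR:
    the one that carries the highest maximum punishment."""
    return max(offences, key=lambda o: o["max_punishment_years"])

fir = [
    {"offence": "theft", "max_punishment_years": 3},
    {"offence": "robbery", "max_punishment_years": 10},
    {"offence": "hurt", "max_punishment_years": 1},
]

counted = principal_offence(fir)
print(counted["offence"])  # robbery
```

Under this rule the FIR above adds one case (robbery) to the crime count rather than three, which is one reason reported counts can understate the number of offences committed.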

Unemployment Rate (UE Rate): Unemployment data have been obtained for 28 states in India for the
year 2013-14 from documents by the Ministry of Labour and Employment (MoLE). The Directorates of
Economics and Statistics of the selected states were requested to collect data through fieldwork. For
the states that did not comply, the MoLE assigned contractors to carry out the surveys.

The states used stratified samples to carry out the survey. Every district in each state was
divided broadly into two strata: (i) a rural stratum and (ii) an urban stratum. Sampling
happens in two stages. First, an initial set of units is drawn from the urban and rural strata using the
circular systematic sampling technique. Second-stage strata are then formed from the first-stage units
based on the number of people in each household who are above 15 years of age.
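
The first-stage draw can be sketched as follows. This is a minimal illustration of circular systematic sampling in general, not the MoLE's actual procedure; the frame of 23 villages, the sample size and the choice of interval (floor of N/n) are assumptions.

```python
import random

def circular_systematic_sample(units, n, seed=None):
    """Select n units by circular systematic sampling.

    A random start is drawn from the whole frame, then every k-th unit is
    taken, wrapping around to the beginning when the end is reached.
    """
    N = len(units)
    if not 0 < n <= N:
        raise ValueError("sample size must be between 1 and len(units)")
    k = N // n                   # sampling interval (assumed: floor of N/n)
    rng = random.Random(seed)
    start = rng.randrange(N)     # random start anywhere in the frame
    return [units[(start + j * k) % N] for j in range(n)]

# Hypothetical sampling frame of 23 first-stage units in one stratum.
villages = [f"village_{i:02d}" for i in range(1, 24)]
sample = circular_systematic_sample(villages, n=5, seed=1)
print(sample)
```

Because the list is treated as a circle, every unit has the same chance of selection even when the frame size is not a multiple of the sample size.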

Per Capita Income: The data was retrieved from the Central Statistics Office (CSO) documents and
the National Employment Survey (NES). CSO annually collects data on Compensation of Employees
(COE) for activity branches or parts of activity branches not already covered by existing surveys, for
use in the annual National Accounts compilation. The variable was taken at 2015-16 prices.

Methodology
Population per Policeperson (PoP): It is the ratio of police personnel to population in each state,
and so indicates the comparative population under the responsibility of one policeperson. The values
in the appendix appear to be expressed as police personnel per lakh of population, as the chart title
there suggests. Each state has a sanctioned strength which is determined by the central government.

$$\text{PoP} = \frac{\text{Number of policemen in given state}}{\text{Population of the state}}$$

Violent Crime Rate (VC rate): It is the number of violent crimes that were reported divided by the
population of the given state, for each of the 28 states that have been chosen.

$$\text{VC rate} = \frac{\text{Number of violent crimes that are reported in a state}}{\text{Mid-year population projection of the respective state (in lakhs)}}$$

Unemployment Rate (UE Rate): It is the number of people (of 15 years and above) who are
unemployed per 1000 people in the labour force. The labour force and the unemployed are
calculated using the Usual Principal and Subsidiary Status approach. Under this approach, a
person who has engaged in any economic activity for 30 days or more during the preceding
365 days is considered employed. The labour force includes people who are either
‘working’ (employed) or ‘seeking and available for work’ (unemployed).

$$\text{UE Rate} = \frac{\text{Number of people unemployed}}{\text{Number of people in the labour force}} \times 1000$$

Per Capita Income: It is the mean income of the people in an economic unit such as a country or city.
It is calculated by taking an aggregate measure of all sources of income (such as GDP or gross
national income) and dividing it by the total population.

$$\text{Per Capita Income} = \frac{\text{Total personal income}}{\text{Total population}}$$
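
As a rough illustration of the definitions in this section, the sketch below computes three of the measures. The function names and all input figures are hypothetical; only the scaling (per lakh for crime, per 1000 for unemployment) comes from the definitions above.

```python
LAKH = 100_000

def violent_crime_rate(crimes, mid_year_population):
    """Reported violent crimes per lakh (100,000) of population."""
    return crimes / (mid_year_population / LAKH)

def unemployment_rate_per_1000(unemployed, labour_force):
    """Unemployed persons (15 years and above) per 1000 persons in the labour force."""
    return 1000 * unemployed / labour_force

def per_capita_income(total_income, population):
    """Aggregate income divided by total population."""
    return total_income / population

# Hypothetical state: 2 crore people, 7,000 reported violent crimes.
vc = violent_crime_rate(crimes=7_000, mid_year_population=20_000_000)
# Hypothetical survey: 46 unemployed out of 1,000 persons in the labour force.
ue = unemployment_rate_per_1000(unemployed=46, labour_force=1_000)
pci = per_capita_income(total_income=2_400_000_000_000, population=20_000_000)
print(vc, ue, pci)
```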

Summary Statistics
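| Variable | Count | Mean | Sd | Var | Min | Max |
|---|---|---|---|---|---|---|
| PoP | 28 | 344.2679 | 287.6145 | 82722.12 | 74.8 | 965.8 |
| VC rate | 28 | 35.31429 | 15.73989 | 247.7442 | 12.6 | 77.5 |
| UE Rate | 28 | 46.35714 | 28.74693 | 826.3862 | 6 | 106 |
| PCI | 28 | 116928.1 | 70699.21 | 5.00E+09 | 30213 | 334576 |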
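
The summary statistics can be reproduced from the appendix table. The sketch below does so for the UE Rate column using Python's standard library; it assumes, as the reported figures suggest, that Sd and Var are the sample versions (n - 1 denominator).

```python
import statistics

# UE Rate (per 1000) for the 28 states, in the order of the appendix table.
ue_rate = [39, 40, 44, 12, 31, 90, 6, 33, 102, 66, 22, 14, 106, 30,
           15, 34, 40, 15, 56, 38, 58, 25, 89, 38, 100, 61, 58, 36]

summary = {
    "count": len(ue_rate),
    "mean": statistics.mean(ue_rate),
    "sd": statistics.stdev(ue_rate),        # sample standard deviation
    "var": statistics.variance(ue_rate),    # sample variance
    "min": min(ue_rate),
    "max": max(ue_rate),
}
for name, value in summary.items():
    print(f"{name}: {value:.5f}" if isinstance(value, float) else f"{name}: {value}")
```

The same calls applied to the other three columns should give the remaining rows of the table.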

| Variable | PoP | VC Rate | UE Rate | PCI |
|---|---|---|---|---|
| PoP | 1 | | | |
| VC Rate | -0.0434 | 1 | | |
| UE Rate | 0.2086 | -0.0438 | 1 | |
| PCI | 0.0851 | 0.0253 | 0.2801 | 1 |
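
Each entry of the correlation matrix is a Pearson correlation coefficient between two columns. A minimal sketch of the calculation is given below; the function name is ours, and the example uses only the first five states of the appendix table, so its result will not equal the full-sample figure in the matrix.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# First five states in the appendix (Arunachal Pradesh to Delhi), for illustration only.
vc_rate = [50.4, 74.2, 44.7, 29.7, 77.5]
ue_rate = [39, 40, 44, 12, 31]
print(round(pearson(vc_rate, ue_rate), 4))
```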
Limitations
• There is a difference between the sanctioned and the actual police strength. The
government sanctions a larger budget and police force than is actually deployed. Also,
many police personnel who are on duty are unable to provide services due to personal or
professional reasons. This reduces the number of police personnel effectively available in
any state.
• The increase in the police force does not keep pace with the increase in population, as the
population of India is growing at a higher rate. Also, the number of people taking up police
service as a job has fallen, adding to the problem.
• In many cases crimes are not reported or FIRs are not lodged, which leads to a
misleading count of reported crimes.
• Corruption and crime go hand in hand. Many crimes committed by people of higher social class
or in positions of power are concealed and hence are also missing from the totals.

THEME: Socio-Economic Impact of Female Literacy in India over Three Decades (Panel Data)
Educated women are known to take informed reproductive and healthcare decisions. Social
determinants of health such as education and gender equality are strongly related to health-seeking
behaviour and overall health outcomes. This results in population stabilisation and better infant care,
reflected in lower birth rates and infant mortality rates (IMR), respectively. Education of families,
especially of women, has a ‘multiplier effect’ on development (United Nations, 2012). A reduction in
fertility has been observed in relation to better educational attainment in women (Graff et al.,
2010). Education of women, reflected as higher literacy, has also been seen to reduce IMRs
independent of socioeconomic status or residence in a rural or urban area (Peña et al., 1999).

Panel data or longitudinal data typically refer to data containing time series observations of a
number of individuals. Therefore, observations in panel data involve at least two dimensions: a
cross-sectional dimension and a time series dimension. Panel data, by blending inter-individual
differences and intra-individual dynamics, has several advantages over cross-sectional or
time-series data. It offers greater capacity for capturing the complexity of human behaviour than a
single cross-section or time series. A panel data model is also of practical value since it allows for
cross-sectional heterogeneity in the data.

The dataset has been collated for 21 states over three time periods, subject to data availability.

Variables and Sources


Literacy Rate (Females) (LR): Literacy levels are available for each state from the Census of India,
1991, 2001 and 2011. These figures were used for the index on literacy.

Infant Mortality Rates (IMR): Infant Mortality Rates are sourced from National Rural Health Mission
Publications, Ministry of Health and Family Welfare, and the Sample Registration System, Office of
Registrar General, India.

Total Fertility Rates (TFR): Total Fertility Rates are collated from the Ministry of Home Affairs and
the Sample Registration System, Office of Registrar General, India. The cumulative value of the
age-specific fertility rates at the end of the child-bearing age gives a measure of fertility known as
the Total Fertility Rate.

Poverty Rates (PR): Poverty Rates are sourced from NITI Aayog and Planning Commission
documents. The data refer to the number of persons (in lakhs) in Rural and Urban India separately and
Combined (Rural + Urban) for the years 1993-94, 2004-05 and 2011-12.

Methodology
Literacy Rate (Females) (LR): The literacy rate was calculated as the percentage share of all
literates in a state in the total population above 7 years of age in the state.

$$\text{LR} = \frac{\text{Number of literates above 7 years of age in a given geographical area}}{\text{Total population above 7 years of age}} \times 100$$

Infant Mortality Rates (IMR): The main objective of the Sample Registration System (SRS) is to provide
reliable estimates of birth rate, death rate and infant mortality rate at the natural division level for
rural areas and at the State level for urban areas.

$$\text{IMR} = \frac{\text{Number of deaths of live-born infants under 1 year of age}}{\text{Number of births}} \times 1000$$

Total Fertility Rates (TFR): The SRS also provides data for other measures of fertility and mortality,
including total fertility and maternal and child mortality rates, at higher geographical levels.

$$\text{TFR} = \sum \left(\text{Five-year age-specific birth rates for females between 10 and 49 years of age}\right) \times 5$$

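
The summation above can be sketched in a few lines. The age-specific rates below are hypothetical and are expressed as births per woman per year; if rates are published per 1000 women, the result would need to be divided by 1000.

```python
def total_fertility_rate(age_specific_rates, group_width=5):
    """Sum of age-specific fertility rates multiplied by the width of each age group."""
    return group_width * sum(age_specific_rates)

# Hypothetical rates (births per woman per year) for the eight five-year
# age groups 10-14, 15-19, ..., 45-49.
asfr = [0.00, 0.03, 0.18, 0.14, 0.06, 0.02, 0.01, 0.00]
tfr = total_fertility_rate(asfr)
print(round(tfr, 2))
```

Multiplying by five reflects that a woman spends five years in each age group, so each annual rate applies five times over her reproductive span.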
The Office of Registrar General, India, initiated the scheme of sample registration of births and
deaths in India popularly known as Sample Registration System (SRS) in 1964-65 on a pilot basis and
on full scale from 1969-70. The SRS since then has been providing data on a regular basis.

The SRS in India is based on a dual record system. The field investigation under Sample Registration
System consists of continuous enumeration of births and deaths in a sample of villages/urban blocks
by a resident part-time enumerator, and an independent six-monthly retrospective survey by a full-
time supervisor. The data obtained through these two sources are matched. The unmatched and
partially matched events are re-verified in the field to get an unduplicated count of correct events.

The advantage of this procedure, in addition to elimination of errors of duplication, is that it leads to
a quantitative assessment of the sources of distortion in the two sets of records making it a self-
evaluating technique.
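
The matching step of the dual record system can be sketched as a comparison of two sets of events. The event keys, household identifiers and the re-verification outcome below are all hypothetical; the real SRS matching rules are more detailed than an exact key comparison.

```python
def reconcile(enumerator_events, survey_events):
    """Split events from the two independent records into matched events
    and events recorded by only one source (to be re-verified in the field)."""
    matched = enumerator_events & survey_events
    to_verify = enumerator_events ^ survey_events
    return matched, to_verify

# Hypothetical event keys: (household id, event type, month).
enumerator = {("H01", "birth", "2011-03"), ("H02", "death", "2011-05"),
              ("H03", "birth", "2011-07")}
survey = {("H01", "birth", "2011-03"), ("H03", "birth", "2011-07"),
          ("H04", "birth", "2011-08")}

matched, to_verify = reconcile(enumerator, survey)

# Assume field re-verification confirms only one of the unmatched events.
confirmed = {("H04", "birth", "2011-08")}
final_events = matched | (to_verify & confirmed)

missed_by_enumerator = len(final_events - enumerator)  # one source of distortion
missed_by_survey = len(final_events - survey)          # the other source
print(len(final_events), missed_by_enumerator, missed_by_survey)
```

Counting what each source missed is what makes the procedure self-evaluating: the two miss counts give a rough measure of the coverage of each record.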

Poverty Rates (PR): The variable is based on the Tendulkar Methodology for Rural, Urban and Total,
showing the percentage of persons and the number of persons (in lakhs) below the poverty line. The
methodology recommended a shift away from the calorie-based model and made the poverty line more
broad-based by also considering monthly spending on education, health, electricity and transport.

$$\text{PR} = \frac{\text{People below poverty line}}{\text{Total population of the state}} \times 100$$

Summary Statistics
| Variable | Count | Mean | Var | Sd | Min | Max |
|---|---|---|---|---|---|---|
| LR91 | 21 | 44.51429 | 223.8243 | 14.96076 | 20.4 | 86.1 |
| LR01 | 21 | 57.9381 | 145.3545 | 12.0563 | 33.1 | 87.9 |
| LR11 | 21 | 69.51905 | 106.0266 | 10.29692 | 51.5 | 92.1 |
| IMR91 | 21 | 66.30225 | 739.9353 | 27.20175 | 16 | 124 |
| IMR01 | 21 | 53.20988 | 565.8543 | 23.78769 | 9.85706 | 91.21837 |
| IMR11 | 21 | 36.85178 | 240.3127 | 15.50202 | 10.70049 | 59.33175 |
| TFR91 | 21 | 3.291409 | 0.837449 | 0.915123 | 1.741969 | 5.1 |
| TFR01 | 21 | 2.747813 | 0.799678 | 0.894247 | 1.393705 | 4.5 |
| TFR11 | 21 | 2.155552 | 0.443002 | 0.665584 | 1.359394 | 3.6 |
| PR93-94 | 21 | 41.73333 | 136.2593 | 11.67302 | 20.8 | 65.1 |
| PR04-05 | 21 | 33.55714 | 117.1496 | 10.82357 | 16.1 | 57.2 |
| PR11-12 | 21 | 18.47143 | 117.3501 | 10.83283 | 5.1 | 37 |

As can be seen from the table above, over the three decades the literacy rates of females have
increased, while the mean IMR, TFR and Poverty Rates have declined steadily.
The standard deviation falls through the decades for LR, IMR and TFR; for Poverty Rates it falls
between 1993-94 and 2004-05 and is then roughly unchanged in 2011-12. This suggests some convergence,
in absolute terms, between more developed and less developed states.
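
The statement about dispersion can be checked directly against the summary table. The sketch below uses the reported means and standard deviations; the coefficient of variation (Sd divided by mean) is our addition, included because the means also fall over time, so a falling Sd alone does not settle whether states are converging in relative terms.

```python
# Means and standard deviations as reported in the summary table, in period order.
mean_by_period = {
    "LR": [44.51429, 57.9381, 69.51905],
    "IMR": [66.30225, 53.20988, 36.85178],
    "TFR": [3.291409, 2.747813, 2.155552],
    "PR": [41.73333, 33.55714, 18.47143],
}
sd_by_period = {
    "LR": [14.96076, 12.0563, 10.29692],
    "IMR": [27.20175, 23.78769, 15.50202],
    "TFR": [0.915123, 0.894247, 0.665584],
    "PR": [11.67302, 10.82357, 10.83283],
}

def strictly_falling(values):
    """True if every value is smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))

sd_falls = {name: strictly_falling(sds) for name, sds in sd_by_period.items()}

# Coefficient of variation: dispersion relative to the mean in each period.
cv_by_period = {
    name: [sd / mean for sd, mean in zip(sd_by_period[name], mean_by_period[name])]
    for name in sd_by_period
}
cv_falls = {name: strictly_falling(cvs) for name, cvs in cv_by_period.items()}

print(sd_falls)
print(cv_falls)
```

On these figures the relative measure falls steadily only for literacy, so the convergence reading is best treated as tentative for the other three variables.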

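| | LR91 | LR01 | LR11 | IMR91 | IMR01 | IMR11 | TFR91 | TFR01 | TFR11 | PR93-94 | PR04-05 | PR11-12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LR91 | 1 | | | | | | | | | | | |
| LR01 | 0.9594 | 1 | | | | | | | | | | |
| LR11 | 0.9524 | 0.948 | 1 | | | | | | | | | |
| IMR91 | -0.7224 | -0.6581 | -0.6916 | 1 | | | | | | | | |
| IMR01 | -0.7882 | -0.7609 | -0.8143 | 0.9291 | 1 | | | | | | | |
| IMR11 | -0.7681 | -0.7429 | -0.7541 | 0.8729 | 0.9239 | 1 | | | | | | |
| TFR91 | -0.8347 | -0.8214 | -0.8096 | 0.6844 | 0.773 | 0.8601 | 1 | | | | | |
| TFR01 | -0.7771 | -0.7968 | -0.8049 | 0.5735 | 0.7039 | 0.7873 | 0.9497 | 1 | | | | |
| TFR11 | -0.6963 | -0.7352 | -0.7559 | 0.5553 | 0.688 | 0.7606 | 0.8829 | 0.9702 | 1 | | | |
| PR93-94 | -0.4987 | -0.5283 | -0.5177 | 0.3852 | 0.3442 | 0.2895 | 0.2875 | 0.3345 | 0.337 | 1 | | |
| PR04-05 | -0.5883 | -0.5722 | -0.5291 | 0.5943 | 0.4758 | 0.3801 | 0.3596 | 0.3598 | 0.396 | 0.7071 | 1 | |
| PR11-12 | 0.0337 | -0.0463 | -0.01 | -0.0809 | -0.0078 | 0.0621 | 0.1231 | 0.1641 | 0.1513 | 0.2127 | -0.0548 | 1 |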

As can be seen from the correlation matrix, Literacy Rates are highly positively correlated with past
literacy rates (0.948 or higher), which is consistent with more educated females fostering education
for future females. Literacy Rates are also strongly negatively correlated with IMR and TFR, and
moderately negatively correlated with the Poverty Rates for 1993-94 and 2004-05, while their correlation
with the 2011-12 Poverty Rates is close to zero. The negative association is to be expected, since a
larger number of literate females tends to go with lower child mortality, fertility and poverty in the
society.

IMR and TFR are both strongly positively correlated with their own past levels, implying the
effects of social programmes in spreading awareness through the decades and the long-term impact of
capital building in the health sector. Furthermore, the two are highly correlated with each other,
showing that a high TFR tends to go with a high IMR.

Poverty Rates are only weakly to moderately correlated with IMR and TFR: the 1993-94 and 2004-05
rates have correlations between 0.29 and 0.59 with them, and the 2011-12 rates between -0.08 and 0.16.
This is unusual, since it is anecdotally expected that poor households, with little to no knowledge of
family planning and poor access to medical facilities, would have high IMR and TFR.

| | LR | IMR | TFR | PR |
|---|---|---|---|---|
| LR | 1 | | | |
| IMR | -0.7754 | 1 | | |
| TFR | -0.8151 | 0.7338 | 1 | |
| PR | -0.5088 | 0.3929 | 0.3905 | 1 |

[Figure: scatterplot matrix of LR, IMR, TFR and PR]

Limitations
• With the change in poverty rate estimation after the establishment of NITI Aayog, the
Tendulkar estimates are no longer as relevant.
• Poverty Rates are available for different years than the other variables, which could
be a reason for the low correlation.
• Collection of SRS data could be inaccurate due to ill-informed respondents and unreliable
data collectors.
• The crude definition of literacy (being able to read, write and understand in any language) may
not reflect the appropriate level of literacy.
• Associations observed at the state level (ecological associations) should not be generalised to
the family unit. However, the analysis implicitly treats them as such.

THEME: Agricultural Yield Over Time (Time Series Data)
India is an agro-based country. The main occupation of Indians has been agriculture and its allied
activities such as poultry, cattle rearing, fishing and animal husbandry. However, due to
defective planning and improper implementation, the productivity of Indian agriculture is very poor.
An improper land tenure system, poor landholding patterns, an inadequate credit system, primitive
technology and old systems of ploughing and irrigation are the main reasons behind the low productivity
of Indian agriculture. To overcome these difficulties, the government has adopted several measures,
including land reforms, a new tenancy system and economic subsidies, to raise per hectare agricultural
production.

The fertilizer industry in India is one of the core sectors and second only to steel in terms of
investment. Fertilizers are plant nutrients applied to agricultural fields to supplement the
required elements found naturally in the soil. Fertilizers have been used since the start of
agriculture. Farmers turn to fertilizers because these substances contain plant nutrients such as
nitrogen, phosphorus and potassium.

A time series dataset consists of observations on one or more variables taken over
time. What makes time series data unique is that the observations can very rarely be assumed
to be independent across time. In the selected dataset, agricultural data for India have
been recorded over a 28-year period from 1977-78 to 2004-05.
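
Dependence across time can be gauged with the lag-1 autocorrelation, the correlation between a series and its own previous value. The sketch below is a minimal illustration with our own function name; the example uses the first ten yield observations from the appendix and is not a result reported in this document.

```python
def lag1_autocorrelation(series):
    """Lag-1 autocorrelation: covariance of consecutive observations
    divided by the overall sum of squared deviations."""
    n = len(series)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((value - mean) ** 2 for value in series)
    return num / den

# Yield (kg/hectare) for 1977-78 to 1986-87, taken from the appendix table.
yield_first_decade = [991, 1022, 876, 1023, 1032, 1035, 1162, 1149, 1175, 1128]
print(round(lag1_autocorrelation(yield_first_decade), 3))
```

A value well above zero indicates that a high-yield year tends to be followed by another high-yield year, which is what one expects of a trending series.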

Variables and Sources


Yield Per Hectare: The data were sourced from the Directorate of Economics and Statistics,
Department of Agriculture and Cooperation. Estimates of the yield or output of major crops are
obtained through crop cutting experiments (CCE) conducted under crop estimation surveys. 95% of
yield estimates are based on yield rates from CCEs. The chosen dataset shows how the
productivity of land has changed over 28 years.

Area under Irrigation: The data is collected every year by respective state governments and then
compiled by the Directorate of Economics and Statistics, Department of Agriculture and
Cooperation.

Fertiliser Consumption (N + P2O5 + K2O): These fertilisers include nitrogen (N), phosphorus
pentoxide (P2O5) and potassium oxide (K2O), which are used across the country. Data collection
is usually done through a questionnaire which is sent out by the Fertilizer Association of India to all
areas under cultivation. If questionnaires are missing information, variables like change in stock are
used to estimate the amount consumed.

Fertiliser Subsidies: The fertilizer subsidies are borne by the Central Government. The government
introduced the Retention Price Scheme (RPS) in 1977. Under the RPS, the
government fixed a fair ex-factory retention price for the various fertilizers of different
manufacturers. The Government pays the manufacturers their cost of production along with a profit
margin of 12 percent (post tax) if the factory utilises 90 percent of its installed capacity.

Methodology
Yield Per Hectare: Yield is calculated in kilograms per hectare (kg/hectare).

$$\text{Yield} = \frac{\text{Kilograms of major crop harvested in a given year}}{\text{Total area under cultivation}}$$

Area under Irrigation: The variable was calculated as the percentage of crop area under irrigation in
total crop area.

$$\text{AUI} = \frac{\text{Crop area under irrigation}}{\text{Total area under cultivation}} \times 100$$

Fertiliser Consumption (N + P2O5 + K2O): The variable was calculated as the sum of all
fertiliser consumed over the crop area in a year (‘000 tonnes).

Fertiliser Subsidies: The variable was obtained by summing all the subsidies to fertiliser producers
(organised and unorganised) in a financial year.

Summary Statistics
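| Variable | Count | Mean | Sd | Var | Min | Max |
|---|---|---|---|---|---|---|
| YIELD | 28 | 1356.214 | 261.9647 | 68625.51 | 876 | 1734 |
| AUI | 28 | 36.34286 | 5.384644 | 28.99439 | 27.7 | 44.2 |
| FC | 28 | 11608.04 | 4496.096 | 2.02E+07 | 4286 | 18398 |
| TS | 28 | 5688.786 | 4908.698 | 2.41E+07 | 266 | 15879 |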

| | YIELD | AUI | FC | TS |
|---|---|---|---|---|
| YIELD | 1 | | | |
| AUI | 0.966 | 1 | | |
| FC | 0.9764 | 0.9855 | 1 | |
| TS | 0.9156 | 0.9567 | 0.9558 | 1 |

• YIELD AND AUI: The correlation coefficient is 0.9660, which shows that yield and area under
irrigation are highly and positively correlated. The coefficient measures the strength of the
linear association; it is not the number of units by which yield rises per unit of irrigated area.

• YIELD AND FERTILISER CONSUMPTION: The correlation coefficient is 0.9764, which shows that
yield and fertiliser consumption are highly and positively correlated. Years with more
fertiliser use tend to be years with better yield.

• YIELD AND TOTAL SUBSIDY: The correlation coefficient is 0.9156, which means that yield and
subsidy on fertilisers are strongly and positively correlated.

• SUBSIDY AND FERTILISER CONSUMPTION: The correlation coefficient is 0.9558, which again
shows a strong positive relationship. The larger the subsidies provided, the higher the
fertiliser consumption tends to be.

• SUBSIDY AND AUI: The correlation coefficient is 0.9567, which shows that subsidy and area under
irrigation are highly and positively correlated. This is consistent with subsidies on
fertilisers encouraging farmers to increase the area under irrigation, although correlation alone
does not establish this.

• AUI AND FERTILISER CONSUMPTION: The correlation coefficient is 0.9855, which means that area
under irrigation and fertiliser consumption are strongly and positively correlated. The larger
the area under irrigation, the greater the use of fertilisers tends to be.

Since all four series trend upwards over the period, part of these high correlations may reflect the
common time trend rather than a direct relationship between the variables.
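
A correlation coefficient and a regression slope answer different questions: the first measures how tightly two series move together, the second how many units one changes per unit of the other. The sketch below, with our own function name, computes both for the first six years of the appendix data, purely for illustration.

```python
import math

def correlation_and_slope(x, y):
    """Return the Pearson correlation of x and y and the least-squares
    slope of y on x (units of y per unit of x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    return r, slope

# Area under irrigation (%) and yield (kg/hectare), 1977-78 to 1982-83, from the appendix.
aui = [27.7, 28.8, 30.3, 29.7, 29.6, 30.8]
yield_kg = [991, 1022, 876, 1023, 1032, 1035]

r, slope = correlation_and_slope(aui, yield_kg)
print(f"correlation = {r:.3f}, slope = {slope:.1f} kg/hectare per percentage point")
```

The correlation is unit-free and bounded between -1 and 1, whereas the slope carries the units of the data and can take any value.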

[Figure: scatterplot matrix of Yield, AUI, FC and TS]

Limitations
• India does not collect agricultural data at regular intervals. Also, the methods of data
collection sometimes require respondents either to fill in a form or to reply in person. In
both cases the farmer or cultivator needs to be familiar with the measuring methods and the
questions asked. Lack of awareness and literacy is an additional problem.
• Yield is generalised independently of land type. To understand yield per
hectare properly, the data should be collected by type of land and the soil on it.
Fertile land will have a higher yield than arid land. Accounting for this difference
in the type of land and soil would change the data considerably.
• Fertiliser data are collected through questionnaires which have a low response rate. Also,
most farmers are not very open to revealing the amount of fertiliser they use, as
excessive use of fertilisers may increase the yield but has an environmental impact to
which the government may respond negatively.

CHALLENGES FACED
During the course of this assignment, many challenges were faced in collecting and collating the
data.

• Compromising on variables: To make our exercise more comprehensive, total
subsidies were initially considered as the variable for the time series dataset. However, due to
the unavailability of data for the period 1977-1995, subsidies to fertiliser producers were used.
• Time period of the data: For the panel data, poverty rates were available only for the years
1993-94, 2004-05 and 2011-12, whereas the time periods of study were 1991, 2001 and 2011.
Furthermore, some variables were collected for financial years while the rest were collected over
calendar years. This is a major shortcoming of our dataset, as one of our variables is measured at
different points in time than the rest, making our analysis inconsistent. However, it is still
reflective of the general trend.
• Data retrieval: Certain variables were difficult to collect since Government websites no
longer host data from before 2001. Statistical Handbooks of individual years had to be sourced to
validate the data collated. For instance, the data on subsidies to fertiliser producers were
collected from three statistical handbooks from online libraries, since Ministry websites have
not uploaded them.
• Methodology used for variables: Poverty levels differ based on the methodology used to
calculate them. Similarly, Unemployment Rates differ based on the approach to collecting
data. This posed the problem of choosing which methodology's version of a variable to use. We
selected the methodology that took the broadest view of the variable. For
example, the Unemployment Rate according to the Usual Principal & Subsidiary Status Approach
was chosen instead of an individual approach.
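
The period mismatch described under the time-period challenge can be made explicit when assembling the panel. The sketch below is only an illustration of one way to record the alignment; the mapping and helper names are ours, and the Kerala figures are taken from the appendix table.

```python
# Census year of the panel period -> poverty survey round actually used.
census_to_poverty_round = {1991: "1993-94", 2001: "2004-05", 2011: "2011-12"}

def start_year(round_label):
    """First calendar year of a financial-year label such as '1993-94'."""
    return int(round_label.split("-")[0])

# Gap, in years, between each census year and the poverty round paired with it.
gaps = {census: start_year(label) - census
        for census, label in census_to_poverty_round.items()}

# Long-format panel rows for one state (Kerala), keyed by (state, period).
kerala = {"LR": [86.1, 87.9, 92.1], "PR": [31.3, 19.7, 20.9]}
rows = [
    {"state": "Kerala", "period": census, "LR": lr, "PR": pr,
     "PR_round": census_to_poverty_round[census]}
    for census, lr, pr in zip(sorted(census_to_poverty_round), kerala["LR"], kerala["PR"])
]
print(gaps)
print(rows[0])
```

Keeping the survey round alongside each poverty figure makes the two to three year offset visible to anyone reusing the dataset.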

REFERENCES
• All-India Consumption of Fertiliser Nutrients - 1950-51 to 2017-18, Fertilizer Association of India, Government of India, Available at: https://www.faidelhi.org/statistics/statistical-database, Accessed on 20th Jan 2019
• Census (1991), Primary Census Abstracts, Registrar General of India, Ministry of Home Affairs, Government of India, Available at: http://censusindia.gov.in/DigitalLibrary/data/Census_1991/Publication/India/45969_1991_CHN.pdf, Accessed on 20th Jan 2019
• Census (2001), Primary Census Abstracts, Registrar General of India, Ministry of Home Affairs, Government of India, Available at: http://www.censusindia.gov.in/2011-common/census_data_2001.html, Accessed on 20th Jan 2019
• Census (2011), Primary Census Abstracts, Registrar General of India, Ministry of Home Affairs, Government of India, Available at: http://censusindia.gov.in/2011-Common/CensusData2011.html, Accessed on 20th Jan 2019
• Crime in India 2016 Statistics, National Crime Records Bureau, Ministry of Home Affairs, Government of India, Available at: http://ncrb.gov.in/StatPublications/CII/CII2016/pdfs/NEWPDFs/Crime%20in%20India%20-%202016%20Complete%20PDF%20291117.pdf, Accessed on 20th Jan 2019
• Economic Survey 1991-92, Ministry of Finance, Government of India, Available at: https://www.indiabudget.gov.in/previouses.asp, Accessed on 20th Jan 2019
• Economic Survey 2001-02, Ministry of Finance, Government of India, Available at: https://www.indiabudget.gov.in/previouses.asp, Accessed on 20th Jan 2019
• Economic Survey 2011-12, Ministry of Finance, Government of India, Available at: https://www.indiabudget.gov.in/previouses.asp, Accessed on 20th Jan 2019
• Handbook of State Statistics, Niti Aayog, Government of India, Available at: https://niti.gov.in/state-statistics, Accessed on 20th Jan 2019
• Handbook of Statistics on Indian Economy (2017-18), Reserve Bank of India, Available at: https://www.rbi.org.in/SCRIPTs/AnnualPublications.aspx?head=Handbook%20of%20Statistics%20on%20Indian%20Economy, Accessed on 20th Jan 2019
• National Accounts Statistics (1996), Central Statistical Organisation, Government of India, various issues
• Percent of Population below Poverty Line – 1993-94 & 2004-05 and Poverty by Castes & Other Sub-Groups – 1983, 1993-94 & 2004-05, Planning Commission, Government of India, Available at: http://planningcommission.nic.in/data/datatable/, Accessed on 20th Jan 2019
• Pocket Book of Agricultural Statistics – 2017, Ministry of Agriculture & Farmers Welfare, Department of Agriculture, Cooperation & Farmers Welfare, Directorate of Economics & Statistics, Government of India, Available at: https://eands.dacnet.nic.in/, Accessed on 20th Jan 2019
• Statistical Year Book India 2017, Ministry of Statistics and Programme Implementation, Government of India, Available at: http://www.mospi.gov.in/statistical-year-book-india/2017/176, Accessed on 20th Jan 2019
• Wages and Statistics, Ministry of Labour & Employment, Government of India, Available at: https://labour.gov.in/wages-and-statistics, Accessed on 20th Jan 2019

APPENDIX

Tables: Cross-Sectional Data

| States | PoP | VC Rate | UE Rate | PCI |
|---|---|---|---|---|
| Arunachal Pradesh | 878.4 | 50.4 | 39 | 112312 |
| Assam | 169.6 | 74.2 | 40 | 60817 |
| Bihar | 74.8 | 44.7 | 44 | 30213 |
| Chhattisgarh | 228.6 | 29.7 | 12 | 76025 |
| Delhi | 383.3 | 77.5 | 31 | 271305 |
| Goa | 352.4 | 23.6 | 90 | 334576 |
| Gujarat | 120.2 | 18.8 | 6 | 139254 |
| Haryana | 164.8 | 52.1 | 33 | 162034 |
| Himachal Pradesh | 225.4 | 22.4 | 102 | 135512 |
| Jammu & Kashmir | 627 | 46.2 | 66 | 73229 |
| Jharkhand | 175 | 27.5 | 22 | 52754 |
| Karnataka | 145.1 | 31.4 | 14 | 142267 |
| Kerala | 174.5 | 37.9 | 106 | 148011 |
| Madhya Pradesh | 125.4 | 34.1 | 30 | 62817 |
| Maharashtra | 186.5 | 35.2 | 15 | 147610 |
| Manipur | 962.7 | 35 | 34 | 55447 |
| Meghalaya | 442.7 | 29.5 | 40 | 68836 |
| Mizoram | 702.1 | 22.5 | 15 | 114055 |
| Nagaland | 965.8 | 12.6 | 56 | 82466 |
| Odisha | 132.9 | 44.8 | 38 | 65650 |
| Punjab | 275 | 23.7 | 58 | 118858 |
| Rajasthan | 121.7 | 22.1 | 25 | 83977 |
| Sikkim | 822.6 | 33.1 | 89 | 245987 |
| Tamil Nadu | 184.2 | 15.6 | 38 | 140441 |
| Tripura | 619.7 | 42.9 | 100 | 80027 |
| Uttar Pradesh | 90.4 | 29.7 | 61 | 47062 |
| Uttarakhand | 186.3 | 21.8 | 58 | 146454 |
| West Bengal | 102.4 | 49.8 | 36 | 75992 |

PoP: Population Per Policeperson
VC Rate: Rate of Violent Crimes (Crime Per One Lakh of Population)
UE Rate: Unemployment Rate (per 1000) for persons aged 15 years & above according to Usual Principal & Subsidiary Status Approach
PCI: Per Capita Income at 2015-16 Prices

Tables: Panel Data

LR: Literacy Rate (%) - Females; IMR: Infant Mortality Rate; TFR: Total Fertility Rates; PR: Poverty Rate (%)

| States | LR91 | LR01 | LR11 | IMR91 | IMR01 | IMR11 | TFR91 | TFR01 | TFR11 | PR93-94 | PR04-05 | PR11-12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Andhra Pradesh | 32.70 | 50.40 | 59.10 | 73.00 | 66.44 | 42.86 | 3.00 | 2.34 | 1.80 | 44.60 | 29.90 | 9.20 |
| Assam | 43.00 | 54.60 | 66.30 | 81.00 | 74.05 | 55.42 | 3.50 | 3.00 | 2.40 | 51.80 | 34.40 | 32.00 |
| Bihar | 22.00 | 33.10 | 51.50 | 69.00 | 61.82 | 44.07 | 4.40 | 4.40 | 3.60 | 60.50 | 54.40 | 33.70 |
| Goa | 67.10 | 75.40 | 84.70 | 20.84 | 14.00 | 10.70 | 1.74 | 1.39 | 1.43 | 20.80 | 25.00 | 9.90 |
| Gujarat | 48.60 | 58.60 | 69.70 | 69.00 | 60.35 | 40.54 | 3.10 | 2.93 | 2.40 | 37.80 | 31.80 | 5.10 |
| Haryana | 40.50 | 45.70 | 65.90 | 68.00 | 65.78 | 44.40 | 4.00 | 3.10 | 2.30 | 35.90 | 24.10 | 16.60 |
| Himachal Pradesh | 52.10 | 67.40 | 75.90 | 74.58 | 42.78 | 37.51 | 3.12 | 2.19 | 1.76 | 34.60 | 22.90 | 11.20 |
| Karnataka | 44.30 | 56.90 | 68.10 | 77.00 | 58.01 | 34.52 | 3.10 | 2.42 | 1.90 | 49.50 | 33.40 | 37.00 |
| Kerala | 86.10 | 87.90 | 92.10 | 16.00 | 11.34 | 12.09 | 1.80 | 1.80 | 1.80 | 31.30 | 19.70 | 20.90 |
| Madhya Pradesh | 29.40 | 50.30 | 59.20 | 117.00 | 86.00 | 59.33 | 4.60 | 3.93 | 3.10 | 44.60 | 48.60 | 7.10 |
| Maharashtra | 52.30 | 67.00 | 75.90 | 60.00 | 45.06 | 24.57 | 3.00 | 2.40 | 1.80 | 47.80 | 38.10 | 31.60 |
| Manipur | 47.60 | 60.50 | 72.40 | 21.68 | 9.86 | 11.29 | 2.47 | 2.04 | 1.46 | 65.10 | 38.00 | 17.40 |
| Meghalaya | 44.90 | 59.60 | 72.90 | 57.00 | 51.22 | 51.97 | 4.25 | 3.75 | 2.77 | 35.20 | 16.10 | 36.90 |
| Odisha | 34.70 | 50.50 | 64.00 | 124.00 | 91.22 | 56.55 | 3.30 | 2.60 | 2.20 | 59.10 | 57.20 | 18.90 |
| Punjab | 50.40 | 63.40 | 70.70 | 53.00 | 51.51 | 30.35 | 3.10 | 2.44 | 1.80 | 22.40 | 20.90 | 32.60 |
| Rajasthan | 20.40 | 43.90 | 52.10 | 79.00 | 80.17 | 51.61 | 4.60 | 4.00 | 3.00 | 38.30 | 34.40 | 8.30 |
| Sikkim | 46.70 | 60.40 | 75.60 | 51.04 | 29.21 | 26.46 | 2.77 | 2.39 | 1.59 | 31.80 | 31.10 | 14.70 |
| Tamil Nadu | 51.30 | 64.40 | 73.40 | 57.00 | 49.32 | 21.82 | 2.20 | 2.00 | 1.70 | 44.60 | 28.90 | 8.20 |
| Tripura | 49.70 | 64.90 | 82.70 | 56.19 | 35.07 | 29.13 | 2.76 | 1.68 | 1.36 | 32.90 | 40.60 | 11.30 |
| Uttar Pradesh | 24.40 | 42.20 | 57.20 | 97.00 | 82.85 | 56.75 | 5.10 | 4.50 | 3.40 | 48.40 | 40.90 | 14.00 |
| West Bengal | 46.60 | 59.60 | 70.50 | 71.00 | 51.35 | 31.94 | 3.20 | 2.40 | 1.70 | 39.40 | 34.30 | 11.30 |

Tables: Time Series Data

| Year | Yield | AUI | FC | TS |
|---|---|---|---|---|
| 1977-78 | 991 | 27.7 | 4286 | 266 |
| 1978-79 | 1022 | 28.8 | 5117 | 343 |
| 1979-80 | 876 | 30.3 | 5255 | 604 |
| 1980-81 | 1023 | 29.7 | 5516 | 505 |
| 1981-82 | 1032 | 29.6 | 6067 | 375 |
| 1982-83 | 1035 | 30.8 | 6401 | 605 |
| 1983-84 | 1162 | 30.9 | 7710 | 1042 |
| 1984-85 | 1149 | 31.9 | 8211 | 1927 |
| 1985-86 | 1175 | 31.4 | 8474 | 1923 |
| 1986-87 | 1128 | 32.6 | 8645 | 1897 |
| 1987-88 | 1173 | 33.5 | 8784 | 2164 |
| 1988-89 | 1331 | 34.4 | 11040 | 3201 |
| 1989-90 | 1349 | 35.0 | 11568 | 4542 |
| 1990-91 | 1380 | 35.1 | 12546 | 4389 |
| 1991-92 | 1382 | 37.4 | 12728 | 5185 |
| 1992-93 | 1457 | 37.4 | 12155 | 5769 |
| 1993-94 | 1501 | 38.7 | 12366 | 4562 |
| 1994-95 | 1546 | 39.6 | 13564 | 5769 |
| 1995-96 | 1491 | 40.1 | 13876 | 6735 |
| 1996-97 | 1614 | 40.0 | 14308 | 7578 |
| 1997-98 | 1552 | 40.8 | 16188 | 9918 |
| 1998-99 | 1627 | 42.4 | 16798 | 11596 |
| 1999-00 | 1704 | 43.9 | 18069 | 13244 |
| 2000-01 | 1626 | 43.4 | 16702 | 13811 |
| 2001-02 | 1734 | 43.0 | 17360 | 12595 |
| 2002-03 | 1535 | 42.8 | 16094 | 11015 |
| 2003-04 | 1727 | 42.2 | 16799 | 11847 |
| 2004-05 | 1652 | 44.2 | 18398 | 15879 |

Yield: Kg./Hectare
AUI: Area Under Irrigation (%)
FC: Fertilizer Consumed (N+P2O5+K2O) '000 Tonnes
TS: Total Subsidy (Cr)

Charts: Cross-Sectional Data

[Figure: four bar charts by state (Arunachal Pradesh to West Bengal): Per Capita Income, Unemployment Rate, Rate of Violent Crimes, and Police per Lakh Population]

Charts: Panel Data
[Figure: three charts titled 1991, 2001 and 2011, each with LR, IMR and PR on the left axis (LR/IMR/PR) and TFR on the right axis. Series: LR91, IMR91, PR93-94, TFR91; LR01, IMR01, PR04-05, TFR01; LR11, IMR11, PR11-12, TFR11]


Charts: Time Series Data

[Figure: three charts over 1977-78 to 2004-05: Area and Yield (series AUI and Yield), Yield and Fertiliser Consumption (series Yield and FC), and Subsidy and Fertiliser Consumption (series TS and FC)]
