Module 14

Introduction to Statistics

Component 1A

Role                    Name                     Affiliation

Principal Investigator  Dr. Geeta Balakrishnan   College of Social Work, Nirmala Niketan, Mumbai
Paper Coordinator       Dr. Yamini Suvarna       College of Social Work, Nirmala Niketan, Mumbai
Content Writer          Dr. Graciella Tavares    St. Andrew’s College, Mumbai
Content Reviewer        Dr. Melita Vaz           Tata Institute of Social Sciences, Mumbai
Language Editor         Dr. Melita Vaz           Tata Institute of Social Sciences, Mumbai

Component 1B

Description of Module

Subject Name    Social Work Education
Paper Name      Research Methods and Statistics
Module Name     Introduction to Statistics
Module ID       SWE/RMS/IS/M14
Prerequisites   None
Objectives      To understand the concept of statistics and how to use it in research
Key words       Statistics, Variables, Hypothesis, Independent and Dependent Variables, Predictor and
                Criterion Variables, Population and Parameters, Sample and Sample Statistics

SWE/RMS/IS/M14 by Dr. Graciella Tavares


Quadrant 1

1. Introduction

The word statistics is derived from the Latin word ‘status’, the Italian word ‘statista’ or the
German word ‘Statistik’, each of which relates to a political state. Initially statistics was concerned with
the economic, political and demographic particulars of a country. Today, its scope has broadened
considerably. It is so important to our way of living that many of us use statistical reasoning in making
decisions without even realizing it. In recent years the growth of statistics has made itself felt in almost
every phase of human activity. Statistics no longer consists merely of collecting data and presenting them in
charts and tables; it is now considered to encompass the science of basing inferences on observed data
and the entire problem of reaching decisions in the face of uncertainty.

Statistics often causes anxiety and tension in the minds of students undertaking this study of research
methods. But an easy way to reduce this stress is to recognize that we are already exposed to statistics in
our everyday life. For instance, as students we are familiar with the notion of Class X percentages. While
watching cricket matches, we are used to comparing players and teams on the basis of various aspects
such as run rate and wickets. We are used to deciding whether the rainfall our cities and villages receive is
above or below average based on the figures of the previous years. During elections we see much
discussion on the vote share of different party candidates.

Statistics is simply the science of organizing and making sense of groups of numbers. Such a group of
numbers is called data. The purpose of statistics is to take these data and present them in a more
efficient and, above all, more comprehensible way. A statistician
needs to be a good storyteller. A statistician must be able to articulate what was found in an experiment
or observational study, why this finding is important and to whom, and what the data may
mean for us in the future. A statistical background is indispensable for understanding research reports.

1.1 Reasons to Study Statistics

• Being an Informed “Information Consumer”
    ◦ Extract information from charts and graphs
    ◦ Follow numerical arguments
    ◦ Know the basics of how data should be gathered, summarized and analyzed to draw statistical
      conclusions
• Understanding and Making Decisions
    ◦ Decide if existing information is adequate
    ◦ Collect more information in an appropriate way
    ◦ Summarize the available data effectively
    ◦ Analyze the available data
    ◦ Draw conclusions, make decisions, and assess the risks of an incorrect decision
• Evaluating Decisions That Affect Your Life
    ◦ Help understand the validity and appropriateness of processes and decisions that affect your life

1.2 What is statistics?


The most popular and comprehensive definition of statistics in the plural sense (statistics as numerical
data) is given by Prof. Horace Secrist. His definition states that “Statistics may be defined as the aggregate of facts
affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated
according to a reasonable standard of accuracy, collected in a systematic manner, for a predetermined
purpose and placed in relation to each other.”

The most crucial aspect of applying statistics is analyzing the data in such a way as to obtain an
efficient and comprehensive summary of the overall results. To this end, statistics is divided
into two areas: descriptive and inferential statistics.

1.2.1. Important Concepts in Statistics

• Population: This is the entire set of events, people or phenomena in which the researcher is
interested. For instance, one researcher may be interested in all tribal societies in India. Another
researcher may be interested in all students in International Baccalaureate schools in India. So
a population may range from a relatively small set of events, people or phenomena that can be
enumerated easily to a very large set that cannot be traced without a significant amount of time,
money and personnel. Think back to all that you may have read in the newspapers or heard in the
mass media about the Census of India. This once-in-10-years exercise
involves counting all the people living in India.
• Parameters: When we describe key aspects of the whole population, we are generating
population parameters. Population parameters on the Census of India website include the
literacy rate, the male-to-female ratio and the number of persons who report that they are employed.

• Sample: Most research cannot attempt census-like operations. So most research
is undertaken on smaller sets that are drawn systematically from the population. We call this
smaller set of events, people or phenomena a sample. For instance, if you want to check whether
there is sugar already in a cup of tea that is served to you in a restaurant, you would usually pick up a
spoon and sip a little to make sure before adding any more. The little that you taste off that spoon
is essentially a sample representing the larger tea-cup. In this case, the tea-cup contains the
“population” in which you are momentarily interested.
• Statistics and Parameters: Here you will learn another meaning of the term statistics. In this
context, it refers to the numerical values that the researcher computes to describe the
sample (such as the average value or the range of values). For every population parameter, there
is a corresponding sample statistic. The sample mean, which is written as X with a bar on top (and
read aloud as X-bar), corresponds to the population parameter mu (µ). By convention, sample statistics are
written in Roman letters while population parameters are written in Greek letters.
Statisticians use carefully selected techniques to draw inferences about the larger population from
what they know about the sample. How we select our sample, known as the sampling procedure,
determines how much confidence we can place in conclusions about the larger population of interest. For
instance, think back to the tasting of that spoon of tea to check for the sugar content in the cup:
What happens if you taste a spoonful of tea from the surface but there is actually sugar at the
bottom of the cup which has not been mixed in by stirring? The spoonful would then misrepresent the
cup, and a poorly chosen sample misleads in the same way.
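
The difference between a population parameter and a sample statistic, and the effect of the sampling procedure, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "cup" of 1,000 spoonfuls, the sweetness scores and the sample size of 50 are all invented for the example.

```python
import random
from statistics import mean

# Hypothetical population: a sweetness score for each of 1,000 "spoonfuls"
# in an unstirred cup, ordered from the surface (index 0) to the bottom.
# Only the bottom 200 spoonfuls contain sugar.
population = [0.0] * 800 + [5.0] * 200

# Population parameter (mu): computed from every member of the population.
mu = mean(population)

# Sample statistic (X-bar) from a poor sampling procedure:
# only the 50 spoonfuls nearest the surface are tasted.
surface_sample = population[:50]
x_bar_surface = mean(surface_sample)

# Sample statistic (X-bar) from a simple random sample of the same size.
rng = random.Random(14)  # fixed seed so that the run is repeatable
random_sample = rng.sample(population, 50)
x_bar_random = mean(random_sample)

print(f"population mean (mu)     = {mu:.2f}")
print(f"surface-only sample mean = {x_bar_surface:.2f}")
print(f"random sample mean       = {x_bar_random:.2f}")
```

The surface-only sample reports no sugar at all, although the cup as a whole is sweetened. The random sample will not match the population mean exactly either, but it is not systematically misled by where the sugar happens to sit.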

1.2.2 Descriptive statistics:

Historically, this is the older area. It supplies several tools, such as tables, graphs and basic numerical
summaries such as averages or means. These tools help in collecting, classifying and summarizing
information about a collection of actual observations. Tables listing the different types of crimes against
women reported in your state in the past year, a graph showing the daily movement of the rupee in
the last two weeks, and the grade point average in your mark sheet are all examples of statistical tools that
organize and summarize information about a collection of actual observations. The most well-known
descriptive statistics are the mean (the arithmetic average) and the mode (the most frequently occurring value).
Knowledge of this aspect of statistics enables us to evaluate critically the information presented in
reports, articles, etc.
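
As a small illustration of the two descriptive statistics just named, the sketch below computes the mean and the mode with Python's standard `statistics` module. The ten marks are invented for the example.

```python
from statistics import mean, mode

# Hypothetical marks scored by ten students (illustrative values only).
marks = [62, 75, 75, 58, 90, 75, 62, 81, 69, 73]

average_mark = mean(marks)      # the arithmetic mean: sum of marks / number of marks
most_common_mark = mode(marks)  # the most frequently occurring value

print("mean:", average_mark)
print("mode:", most_common_mark)
```

Here the marks add up to 720, so the mean is 72, and the mode is 75 because that mark occurs three times.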

1.2.3 Inferential statistics:
Inferential statistics is primarily a product of the twentieth century and supplies several tools for
generalizing beyond actual observations. It involves making informed and calculated guesses (inferences)
about a large group of data (called the population) from a smaller group of data (called the sample).
Typically, the sample is drawn randomly from the population. Random sampling means that every
member of the population has an equal chance of being chosen for the sample.
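
The idea of an equal chance of selection can be checked with a short simulation. The sketch below is an illustration only: the population of 20 named persons, the sample size of 5 and the number of repetitions are assumptions made for the example. It uses `random.sample`, which draws without replacement.

```python
import random
from collections import Counter

# Hypothetical population of 20 people and a sample size of 5.
population = [f"person_{i}" for i in range(1, 21)]
sample_size = 5
rng = random.Random(2024)  # fixed seed so that the run is repeatable

# One simple random sample, drawn without replacement.
one_sample = rng.sample(population, sample_size)
print("one sample:", one_sample)

# Repeat the draw many times and count how often each person is chosen.
draws = 20_000
times_chosen = Counter()
for _ in range(draws):
    times_chosen.update(rng.sample(population, sample_size))

# Under simple random sampling each person should be chosen in roughly
# sample_size / population size = 5 / 20 = 0.25 of the draws.
expected_share = sample_size / len(population)
observed_shares = {p: times_chosen[p] / draws for p in population}
print("expected share:", expected_share)
print("smallest observed share:", min(observed_shares.values()))
print("largest observed share:", max(observed_shares.values()))
```

If some persons turned up far more often than others, the procedure would not be giving every member of the population an equal chance.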

1.2.4 Perspective:
Often the areas of descriptive and inferential statistics overlap. Depending on one’s perspective, a given
set of observations can exemplify either descriptive or inferential statistics. For instance, the recorded
weights of a hundred trucks that pass through a city and use its bridges can be viewed from the
perspective of descriptive statistics, if one is concerned only with how those hundred loads compare with
the load-bearing capacity of the bridges, or from the perspective of inferential statistics, if one wishes to
draw some conclusions about the weight of all trucks.

2. Types of data:

A statistical analysis is performed on data, that is, on a collection of actual observations from a survey or
experiment. The precise form of a statistical analysis often depends on whether the data are numbers or
words.

2.1 Quantitative data:


When, among a set of observations, any single observation is a number that represents an amount or a
count, the data are quantitative. The weights of the hundred trucks recorded as per our earlier example are
quantitative data, because any single observation represents an amount of weight.

2.2 Qualitative data:


When, among a set of observations, any single observation is a word or code that represents a class or
category, the data are qualitative. For instance, if a survey records the response of students with the code
Y for Yes and N for No to the question "Do you drink?", the resulting data are qualitative, since each
individual observation is a code that represents a particular class of replies. Often numbers are assigned to
the Yes and No replies to permit computer processing. But these numbers are only labels, so an average of
them does not represent an amount. We can only report the most frequent value or the relative percentage of each value.
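
A minimal sketch of how such qualitative data are summarized is given below. The ten Y/N responses are invented for the example; the point is that we count the codes rather than average them.

```python
from collections import Counter

# Hypothetical replies of ten students to a Yes/No question.
responses = ["Y", "N", "N", "Y", "N", "N", "N", "Y", "N", "N"]

# Count how many times each code occurs.
counts = Counter(responses)

# The most frequent value (the mode) of the codes.
most_frequent, _ = counts.most_common(1)[0]

# The relative percentage of each code.
percentages = {code: 100 * n / len(responses) for code, n in counts.items()}

print("counts:", dict(counts))
print("most frequent reply:", most_frequent)
print("percentages:", percentages)
```

With these replies, N is the most frequent value (7 of 10, or 70 per cent) and Y accounts for the remaining 30 per cent.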

3. Types of variables:

Another distinction from the realm of research methods is based on two types of variables. A variable is a
characteristic or property that can take on different values. The weights of trucks and the replies (Y or N) are
examples of variables. Any single observation, by contrast, can be described as a constant, because it takes
on only one value.
3.1. Independent variable:
When a variable is manipulated by an investigator in an experiment, it is an independent variable.
3.2. Dependent variable:
When a variable is measured, counted or recorded by the investigator, it is a dependent variable. It is not
manipulated. Instead it represents an outcome: the data produced by an experiment in relation to changes
in the independent variable. Thus the values that appear for the dependent variable cannot be specified in
advance.
3.3. Predictor and Criterion variables:
Let us look at another example: empowerment of women. Social work researchers may be interested to
see which women are able to achieve economic stability and/or independence. Are younger women more
likely to achieve economic stability? Are more educated women more likely to achieve it? One
complication is that younger women are also more likely to be educated. So is it age or is it education that is the
special ingredient that leads to independence? Here researchers are not able to manipulate these variables.
They can only draw inferences from observational or archival data sources. So we speak instead of
predictor and criterion variables.

3.3.1. Predictor variables


A predictor variable is one “from which a prediction is made” (Howell, 1999, p.144). In our example, age
and education are predictor variables.
3.3.2. Criterion variables
A criterion variable is one which is “to be predicted” (Howell, 1999, p.144). In our example,
empowerment is the criterion variable. When depicting these variables graphically, we traditionally
designate the predictor variables as X (or X1, X2 and so on) and plot them on the X axis (the
horizontal axis). Criterion variables are designated as Y (or Y1, Y2 and so on) and plotted on the Y
axis (the vertical axis).

4. Characteristics of statistics

The major characteristics of statistics can be stated as:

1. Statistics are aggregates of facts: These facts can be compared or related to other figures within
the same framework. For example, a single figure about a person’s income cannot be considered
statistics because it is not informative enough. If you get to know one social worker’s salary, you
are not informed well enough about the entire field. But if you were to learn through a survey
about the salary of 100 social workers, depending on how and where you sampled the group in
your study, you may be able to start drawing conclusions about the average pay in the field, the
average pay for someone who just entered employment versus someone who has been working as
a social worker for 10 years, and finally the average pay for a social worker involved in service
delivery versus a social worker involved in academics and research. Such data relating to the
income of a group of persons or perhaps all the employees of a social work organisation would be
considered statistics.

2. Statistics are affected to a marked extent by multiplicity of causes: This means that statistics are
facts and figures which are the consequence of multiple causation. For example, a change in the
travel habits of an individual may be due to many factors, such as changes in prices,
health, income and time. While these factors can be identified individually, the effect of each
cannot easily be isolated and measured independently of the others.

3. Statistics are expressed numerically: Qualitative statements are not statistics unless they are
supported by numbers. For example, a statement like “India has reduced the number of people
who are considered poor” does not constitute statistics. On the other hand, a comparison of the per
capita income of India with that of Bangladesh would be called statistics. Similarly, data
giving the number of poor in India for each of the last ten years would also be considered
statistics.

4. Statistics are enumerated or estimated according to reasonable standards of accuracy:
Enumeration simply means counting the actual number in the data, such as the number of male
teachers in a school. However, when it is impossible or impractical to observe an entire set of
observations, the data are estimated using techniques of sampling and estimation.
The estimated values will not be as precise and accurate as actual counts. The degree of
accuracy required of the estimates will depend on the nature and purpose of the enquiry.

5. Statistics are collected in a systematic manner: The data must be collected in a systematic
manner because gathering data in a haphazard manner leads to erroneous conclusions.

6. Statistics are collected for a pre-determined purpose: The purpose and objective of collecting
the data must be clearly defined, decided and determined prior to data collection. This
facilitates the collection of proper and relevant data. For example, data on agricultural production
do not in themselves serve any purpose unless we know the period, commodities and regions for
which they are required.

7. Statistics must be placed in relation to each other: The statistical data that are collected should
be comparable with each other. For example, data collected on the prices of different
commodities in a store would not be considered statistics, since such prices are not comparable. However,
the prices of one commodity in different departmental stores constitute statistical data since these prices are comparable.
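
The first characteristic above, that a single salary tells us little while a surveyed group of salaries lets us compare averages, can be sketched as follows. The six survey records, the salary amounts and the ten-year cut-off are invented for the illustration and are not real pay data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical survey records: (years of experience, monthly salary in rupees).
survey = [
    (0, 18000), (1, 20000), (0, 19000),
    (10, 42000), (12, 45000), (10, 39000),
]

# A single figure on its own: not informative about the field as a whole.
single_figure = survey[0][1]

# Group the salaries by length of experience.
salaries_by_group = defaultdict(list)
for years, salary in survey:
    group = "10 or more years" if years >= 10 else "fewer than 10 years"
    salaries_by_group[group].append(salary)

# Averages that can be placed in relation to each other.
group_means = {group: mean(values) for group, values in salaries_by_group.items()}
overall_mean = mean([salary for _, salary in survey])

print("one salary alone:", single_figure)
print("average by experience:", group_means)
print("overall average:", overall_mean)
```

How far such averages can be trusted still depends on how and where the group was sampled, as the text notes.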

5. Functions of Statistics

5.1. Preciseness and definiteness: Statistical techniques enable us to present facts in a clear, precise and
definite form.

5.2. Condensation: With the help of statistical tools a mass of data can be condensed into a few
presentable, understandable and significant figures. Data that are complex can be presented in terms of
percentages, averages etc.

5.3. Comparison: This is also one of the important functions of statistics; the absolute figures themselves
do not convey any significant meaning. It is their comparison that helps us draw conclusions. Statistical
devices like averages, ratios, percentages and graphs are the tools that can be employed for the purposes
of comparison.
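
A brief sketch of why comparison needs percentages rather than absolute figures is given below. The two districts and their counts are hypothetical.

```python
# Hypothetical figures for two districts (illustrative values only).
districts = {
    "District A": {"literate": 450_000, "population": 600_000},
    "District B": {"literate": 300_000, "population": 375_000},
}

# Absolute figures: which district has more literate persons?
higher_count = max(districts, key=lambda name: districts[name]["literate"])

# Percentages: literate persons per 100 of the population.
literacy_rate = {
    name: 100 * d["literate"] / d["population"] for name, d in districts.items()
}
higher_rate = max(literacy_rate, key=literacy_rate.get)

print("more literate persons:", higher_count)
print("literacy rates (%):", literacy_rate)
print("higher literacy rate:", higher_rate)
```

District A has more literate persons in absolute terms, yet District B has the higher literacy rate (80 per cent against 75 per cent). The absolute figures alone would have suggested the opposite conclusion.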

5.4. Formulation and testing of hypotheses: Statistical methods are extremely useful in formulating and
testing hypotheses for the purpose of establishing a relationship between two or more variables. For
example, the degree of association between demand and price, between amount of rainfall and yield of
crop, or between advertising expenditure and sales can be obtained using statistical tools. Similarly, a
hypothesis such as whether or not an advertising campaign is effective in increasing sales can be tested by using
appropriate statistical tools.
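
One common measure of the degree of association between two quantitative variables is Pearson's correlation coefficient, which ranges from -1 to +1. The module does not name a particular measure, so the choice here is an assumption made for illustration, and the rainfall and yield figures are invented.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data for five seasons.
rainfall_mm = [600, 700, 800, 900, 1000]
crop_yield = [18, 21, 25, 26, 30]  # e.g. quintals per acre

r = pearson_r(rainfall_mm, crop_yield)
print(f"correlation between rainfall and yield: {r:.3f}")
```

For these invented figures the coefficient is close to +1, indicating that higher rainfall goes with higher yield. A strong association of this kind does not by itself show that one variable causes the other.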

5.5. Prediction: Statistical techniques can be used to analyse the past data and predict the future trends.
For example, the demand for a particular product for the next year can be predicted by knowing the
demand for that product in the past and the current market trends and possible changes in the factors that
affect the demand.
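
One simple way to project past demand forward is to fit a straight-line trend by least squares and extend it by one period. This is only one of many forecasting techniques and is chosen here for illustration; the module does not prescribe a method, and the demand figures are invented. The sketch also ignores the market trends and other factors mentioned above.

```python
# Hypothetical demand (units sold) for five past years.
years = [1, 2, 3, 4, 5]
demand = [120, 130, 145, 150, 165]

n = len(years)
mean_x = sum(years) / n
mean_y = sum(demand) / n

# Least-squares slope and intercept of the trend line: demand = intercept + slope * year.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, demand)) / sum(
    (x - mean_x) ** 2 for x in years
)
intercept = mean_y - slope * mean_x

# Extend the fitted line to the next year.
next_year = 6
forecast = intercept + slope * next_year

print(f"trend: demand = {intercept:.1f} + {slope:.1f} * year")
print(f"forecast for year {next_year}: {forecast:.1f}")
```

With these figures the fitted trend rises by 11 units a year, giving a forecast of 175 units for year 6. Such a forecast assumes that the past pattern continues unchanged.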

5.6. Formulation of suitable policies: Statistics helps in the formulation of various economic, business
and other policies at state, national or global level. For example, government policies on education,
taxation, pollution, law and order, social welfare etc. are formulated on the basis of statistical data and
inferences drawn from their analysis. Similarly, business organizations also use statistics to design their
policies in areas of finance, marketing and personnel.

6. Uses of Statistics

Historically, the use of statistics can be traced back to the ancient Egyptians and Chinese, who used
statistics for keeping state records. The Chinese under the Chou (Zhou) dynasty, which ruled from about the
eleventh century BCE, maintained extensive lists of revenue collection and government expenditure. They
also maintained records on the availability of warriors. Today, statistics is one of the most powerful tools
in any scientific enquiry and has found applications in a very large number of disciplines, such as
agriculture, industry, psychology, sociology, economics, insurance, business, biology, planning and
management. The range of fields to which statistics is applied is constantly widening.

In business, commerce and industry, statistics is considered an indispensable tool for analysing activities.
Statisticians are employed by every progressive industry to direct quality control processes
and to assist in establishing good advertising and sales programmes for products. The main
objective of quality control in any production process is to ensure that the manufactured product
conforms to certain specifications. In business the statistician supports decision making, the
analysis of time series and the formulation of index numbers. Statistics can be used to study trends and to
obtain estimates of probable demand for goods. Time series analysis can also be used to study
‘business cycles’. The study of index numbers gives the businessperson an idea of the
purchasing power of the currency.
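
As an illustration of how an index number relates to purchasing power, the sketch below computes a simple aggregate price index for a small basket of goods. There are several index-number formulas; this unweighted one is chosen only for simplicity, and the basket and prices are invented.

```python
# Hypothetical prices (rupees per unit) of a small basket in two years.
base_year_prices = {"rice": 40, "dal": 100, "oil": 110}
current_year_prices = {"rice": 50, "dal": 120, "oil": 130}

# Simple aggregate price index: total current cost as a percentage of total base cost.
base_total = sum(base_year_prices.values())
current_total = sum(current_year_prices.values())
price_index = 100 * current_total / base_total

# Purchasing power of one rupee relative to the base year.
purchasing_power = 100 / price_index

print(f"price index (base year = 100): {price_index:.1f}")
print(f"purchasing power of the rupee relative to the base year: {purchasing_power:.2f}")
```

An index of 120 means the basket costs 20 per cent more than in the base year, so a rupee now buys about 0.83 of what it bought then.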

Statistics in economics: Statistics is used to study various economic problems such as production,
consumption, distribution, wealth, savings, investment, unemployment and poverty. Data on national income and
wealth are useful in formulating policies for reducing disparities of income.

Statistics in physical and natural sciences: In studies of astronomy, biology, agriculture, zoology, etc.,
statistical techniques are extremely useful. In medicine, the study of the effects of drugs on individuals is one area
where the use of statistical methods is well established. Statistical methods have even been applied to estimate the total
population of various species.

Statistics in research: Most of the advancement in research has taken place because of experiments
conducted with the help of statistical methods. These methods are now widely applied in market and production
research, including advertising and promotional strategies, forecasting and new product decisions.

7. Limitations of Statistics

The field of statistics, though widely used in almost every sphere of life, does have some limitations.
7.1. Statistics does not deal with individual figures: It deals only with aggregates of facts or figures. Single
or isolated figures cannot be called statistics. For example, a single figure relating to the height of a
student in a class is not statistical in nature unless the heights of other students in the same class are also
mentioned.
7.2. Statistics does not study qualitative data: Statistics is the study of only those facts that are capable of
being stated in numbers or quantity. Qualitative phenomena such as honesty, intelligence, beauty and
poverty cannot be studied directly in statistics unless these attributes are expressed in terms of numbers.
7.3. Statistical laws are not exact: Unlike the laws of natural science, the laws of statistics are not exact.
The conclusions based on them are therefore true only on average.
7.4. Statistics can be misused: A famous statement says that “figures do not lie, but liars can
figure”, which is testimony to the misuse of statistics. The results can be manipulated or interpreted
according to one’s own interest.
7.5. Interpreting statistics requires a great degree of skill and understanding: In order to get
meaningful results, it is necessary that the data be carefully collected and critically interpreted. Only people having
adequate knowledge of statistical methods can properly handle statistical data.

8. Using Computer Programmes in Statistics

When statistics was developing as a branch of science, people did computations by hand. Statisticians
needed to memorise formulae, which became more and more complex with the complexity of the data.
Computing statistics often took time and, like all manual work, was prone to human error
during the calculation. Today computers remove much of the pain suffered by earlier statisticians.
Computers are becoming more sophisticated and capable of handling large databases. Today the field is
moving towards the analysis of big data because there are computers and programmes to handle
such huge data-sets. Data mining is becoming a specialty subject in research.

Using statistical programmes, students can derive statistical output in a minute (depending on the speed of
the computer’s processor). Using a computer package like SPSS, you may be able to order the
computer to give you multiple statistics using just one command. But this is not a good use of statistics.
Good researchers use the appropriate method for their studies. So as you learn other modules, you would
be wise to understand where to apply a particular technique, and to avoid the common beginner’s mistake
of presenting a report with every type of statistic that the computer programme offers,
irrespective of whether it is appropriate to the data.

9. Summary

• By learning statistics, one can interpret the statistical messages of
everyday life, understand references in research reports and plan a simple statistical analysis of
one’s own research. Statistics consists of two main subdivisions: descriptive and inferential
statistics.
• It is important to distinguish between quantitative and qualitative data. When, among a set of
observations, any single observation is a number that represents an amount or a count, the data
are quantitative. If any single observation is a word or code that represents a class or category, the
data are qualitative.
• It is also helpful to distinguish between independent and dependent variables. The former are
manipulated by the investigator, while the latter are outcomes measured, counted or recorded by
the investigator.
• Statistics are affected by a multiplicity of causes and must be placed in relation to each other.
Statistical devices like averages, ratios, percentages and graphs are the tools that can be employed for the
purposes of comparison.
• Today, the subject of statistics is one of the most powerful tools in any scientific enquiry and has found
applications in a very large number of disciplines. Most of the advancement in research has taken place
because of experiments conducted with the help of statistical methods. The field of statistics, though
widely used in almost every sphere of life, does have some limitations. Only people having adequate
knowledge of statistical methods can properly handle statistical data.

References
1) Howell, D.C. (1999). Fundamental Statistics for the Behavioral Sciences (4th ed.). Pacific Grove,
CA, USA: Duxbury Press.
