
ABF 102

BUSINESS STATISTICS Vol-2

ACeL
Amity University

In the modern world of computers and information technology, the importance of statistics is well recognised by all disciplines. Statistics originated as a science of statehood and has slowly and steadily found applications in agriculture, economics, commerce, biology, medicine, industry, planning, education and so on. Today there is hardly any walk of human life where statistics cannot be applied.

Table of Contents

CHAPTER SIX: REGRESSION ANALYSIS
6.1 Meaning
6.2 Definitions
6.3 Regression Line
6.4 Regression Equations and Regression Coefficient
6.5 Difference between Correlation and Regression Analysis
Chapter Six: End Chapter Quizzes

CHAPTER SEVEN: TIME SERIES ANALYSIS
7.1 Meaning
7.2 Definitions
7.4 Uses or importance of Time-series
7.6 Components of time series
7.6.1 Trend Component
7.6.2 Cyclical Component
7.6.3 Seasonal Component
7.6.4 Irregular Component
7.7 Methods of measuring secular trend or trend
7.8 Measurement of seasonal variations
7.8.2 Ratio to trend method
7.9 Practical Problems
Chapter Seven: End Chapter Quizzes

CHAPTER EIGHT: PROBABILITY
8.1 Introduction
8.2.1 Trial
8.2.2 Random Trial or Random Experiment
8.2.3 Sample space
8.2.4 Event
8.2.5 Equally Likely Events
8.2.7 Exhaustive Events
8.2.8 Independent Events
8.2.9 Dependent Events
8.2.10 Complementary Events
8.3 Definitions
8.3.1 Mathematical (or A Priori or Classic) Definition
8.3.2 Van Mises Statistical (or Empirical) Definition
8.4 The Law of Probability
8.5 Importance of Probability
8.6 Practical Problems
Chapter Eight: End Chapter Quizzes

BIBLIOGRAPHY

CHAPTER SIX: REGRESSION ANALYSIS


6.1 Meaning
In statistics, regression analysis is a collective name for techniques for the modeling and analysis of numerical data consisting of values of a dependent variable (also called response variable or measurement) and of one or more independent variables (also known as explanatory variables or predictors). The dependent variable in the regression equation is modeled as a function of the independent variables, corresponding parameters ("constants"), and an error term. So regression analysis is any statistical method where the mean of one or more random variables is predicted based on other measured random variables.

There are two types of regression analysis, chosen according to whether the data approximate a straight line: linear regression is used when they do, and non-linear regression when they do not. Regression can be used for prediction (including forecasting of time series data), inference, hypothesis testing, and modeling of causal relationships. These uses of regression rely heavily on the underlying assumptions being satisfied. Regression analysis has been criticized as being misused for these purposes in many cases where the appropriate assumptions cannot be verified to hold. One factor contributing to the misuse of regression is that it can take considerably more skill to critique a model than to fit a model.

6.2 Definitions :
Regression is the measure of the average relationship between two or more variables in terms of the original units of the data. ------- Morris M. Blair
One of the most frequently used techniques in economics and business research, to find a relation between two or more variables that are related causally, is regression analysis. ------- Taro Yamane
It is often more important to find out what the relation actually is, in order to estimate or predict one variable; and the statistical technique appropriate to such a case is called regression analysis. ------- Wallis and Roberts

6.3 Regression Line


A regression line is a line drawn through a scatterplot of two variables. The line is chosen so that it comes as close to the points as possible. Regression analysis, on the other hand, is more than curve fitting. It involves fitting a model with both deterministic and stochastic components. The deterministic component is called the predictor and the stochastic component is called the error term. The simplest form of a regression model contains a dependent variable, also called the "Y-variable" and a single independent variable, also called the "X-variable".

6.4 Regression Equations and Regression Coefficient


Regression equations or estimating equations are algebraic expressions of regression lines. As there are two regression lines, there are two regression equations, i.e. the regression equation of X on Y and the regression equation of Y on X.

The regression equation of X on Y is:

$$X = a + bY$$

Here X is the dependent variable and Y is the independent variable. a is the X intercept and b is the slope of the line; it represents the change in variable X when there is a unit change in variable Y. With N pairs of observations, the constants are obtained from the two normal equations:

$$\sum X = aN + b\sum Y \qquad \text{(i)}$$
$$\sum XY = a\sum Y + b\sum Y^2 \qquad \text{(ii)}$$

If we solve these two equations, we can compute the values of the constants a and b.

Similarly, the regression equation of Y on X is:

$$Y = a + bX$$

And if we solve the following two equations, we can find the values of the constants a and b:

$$\sum Y = aN + b\sum X \qquad \text{(i)}$$
$$\sum XY = a\sum X + b\sum X^2 \qquad \text{(ii)}$$
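As a worked step, the two normal equations for the regression of Y on X (read with the summation signs, i.e. $\sum Y = aN + b\sum X$ and $\sum XY = a\sum X + b\sum X^2$) can be solved in closed form. The sketch below multiplies (i) by $\sum X$ and (ii) by $N$ and subtracts; the symbols $\bar{X}$ and $\bar{Y}$ for the two means are notation introduced here for convenience.

```latex
\begin{aligned}
b &= \frac{N\sum XY - \sum X \sum Y}{N\sum X^{2} - \left(\sum X\right)^{2}}, \\[4pt]
a &= \frac{\sum Y - b\sum X}{N} = \bar{Y} - b\,\bar{X}.
\end{aligned}
```

The constants for the regression of X on Y follow by interchanging the roles of X and Y in both expressions.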

Illustration: Students of a class have obtained marks as given below in Paper I and Paper II of statistics:

Paper I:   45  55  56  58  60  65  68  70  75  80  85
Paper II:  56  50  48  60  62  64  65  70  74  82  90

Find the mean, coefficient of correlation and regression coefficients.
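The quantities asked for in this illustration can be checked numerically. The following is a minimal Python sketch, assuming Paper I is treated as X and Paper II as Y; the function name `regression_summary` and the labels `b_yx` (regression coefficient of Y on X) and `b_xy` (of X on Y) are illustrative choices, not notation from the text.

```python
from math import sqrt

# Marks from the illustration: Paper I is taken as X, Paper II as Y (an assumption).
paper_1 = [45, 55, 56, 58, 60, 65, 68, 70, 75, 80, 85]
paper_2 = [56, 50, 48, 60, 62, 64, 65, 70, 74, 82, 90]


def regression_summary(x, y):
    """Return the means, the correlation coefficient and both regression coefficients."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products of deviations from the means.
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "r": s_xy / sqrt(s_xx * s_yy),  # coefficient of correlation
        "b_yx": s_xy / s_xx,            # slope of the line of Y on X
        "b_xy": s_xy / s_yy,            # slope of the line of X on Y
    }


summary = regression_summary(paper_1, paper_2)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

A useful check on any hand calculation is that the product of the two regression coefficients equals the square of the correlation coefficient.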

6.5 Difference between Correlation and Regression Analysis


Both correlation and regression analysis are important statistical tools for studying the relationship between variables. The difference between the two can be analysed as under:

Correlation
1. Correlation measures the relationship between the two variables which vary in the same or opposite direction.
2. Here both X and Y variables are random variables.
3. There can be nonsense or spurious correlation between two variables.
4. The coefficient of correlation is a relative measure and it ranges between -1 and +1.

Regression Analysis
1. Regression means going back or act of return. It is a mathematical measure which shows the average relationship between the two variables.
2. Here X is a random variable and Y is a fixed variable. However, both variables may be random variables.
3. There is no such nonsense regression equation.
4. Regression coefficient is an absolute measure. If we know the value of the independent variable, we can estimate the value of the dependent variable.

Chapter Six: End Chapter Quizzes


1. The term regression was introduced by
a- R. A. Fisher
b- Sir Francis Galton
c- Karl Pearson
d- none of the above

2. If X and Y are two variates, there can be at most
a- one regression line
b- two regression lines
c- three regression lines
d- an infinite number of regression lines

3. In regression line of Y on X, the variable X is known as
a- independent variable
b- regressor
c- explanatory variable
d- all the above

4. Regression equation is also named as
a- prediction equation
b- estimating equation
c- line of average relationship
d- all the above

5. Scatter diagram of the variate values (X, Y) gives the idea about
a- functional relationship
b- regression model
c- distribution of errors
d- none of the above

6. If ρ = 0, the lines of regression are
a- coincident
b- parallel
c- perpendicular to each other
d- none of the above

7. Regression coefficient is independent of
a- origin
b- scale
c- both origin and scale
d- neither origin nor scale

8. Regression analysis can be used for
a- reducing the length of confidence interval
b- for prediction of dependent variate value
c- to know the true effect of certain treatments
d- all the above

9. Probable error is used for
a- measuring the error in r
b- testing the significance of r
c- both (a) and (b)
d- neither (a) nor (b)

10. If ρ = 0, the angle between the two lines of regression is
a- 0 degree
b- 90 degree
c- 60 degree
d- 30 degree

CHAPTER SEVEN: TIME SERIES ANALYSIS


7.1 Meaning
In statistics, signal processing, and many other fields, a time series is a sequence of data points, measured typically at successive times, spaced at (often uniform) time intervals. Time series analysis comprises methods that attempt to understand such time series, often either to understand the underlying context of the data points (where did they come from? what generated them?), or to make forecasts (predictions). Time series forecasting is the use of a model to forecast future events based on known past events: to forecast future data points before they are measured. A standard example in econometrics is the opening price of a share of stock based on its past performance.

The term time series analysis is used to distinguish a problem, firstly from more ordinary data analysis problems (where there is no natural ordering of the context of individual observations), and secondly from spatial data analysis, where there is a context that observations (often) relate to geographical locations. There are additional possibilities in the form of space-time models (often called spatial-temporal analysis).

A time series model will generally reflect the fact that observations close together in time will be more closely related than observations further apart. In addition, time series models will often make use of the natural one-way ordering of time, so that values in a series for a given time will be expressed as deriving in some way from past values, rather than from future values.

So a time series is a sequence of observations which are ordered in time (or space). If observations are made on some phenomenon throughout time, it is most sensible to display the data in the order in which they arose, particularly since successive observations will probably be dependent. Time series are best displayed in a scatter plot. The series value X is plotted on the vertical axis and time t on the horizontal axis. Time is called the independent variable (in this case however, something over which you have little control). There are two kinds of time series data:

1. Continuous, where we have an observation at every instant of time, e.g. lie detectors, electrocardiograms. We denote this using observation X at time t, X(t).
2. Discrete, where we have an observation at (usually regularly) spaced intervals. We denote this as Xt.

7.2 Definitions
A set of data depending on the time is called a time series. ------- Kenny and Keeping
A time series consists of data arranged chronologically. ------- Croxton and Cowden
A time series may be defined as a sequence of repeated measurements of a variable made periodically through time. ------- C.H.Mayers

7.3 Applications of time series
The application of time series models is twofold:

1. Obtain an understanding of the underlying forces and structure that produced the observed data.
2. Fit a model and proceed to forecasting, monitoring or even feedback and feedforward control.

Time series analysis is used for many applications. A few of them are as follows:

Economic Forecasting
Sales Forecasting
Budgetary Analysis
Stock Market Analysis
Yield Projections
Process and Quality Control
Inventory Studies
Workload Projections
Utility Studies
Census Analysis

7.4 Uses or importance of Time-series


Analysis of time series is useful in every walk of life like business, economics, science, state, sociology, research work etc. However, following are its main objectives :

7.4.1 Study of past behaviour: Analysis of time series studies the past behaviour of data and indicates the changes that have taken place in the past.

7.4.2 Prediction for future: On the basis of analysis of time series, future predictions can be made easily. For instance, we can predict future sales, and the necessary alterations can be made in the production policy.

7.4.3 Facilitate comparisons: We can make comparisons of various time series to know the death rate, birth rate, yield per acre etc.

7.4.4 Evaluation of actual data: On the basis of deviation analysis of actual data and estimated data obtained from analysis of time series, we can come to know about the causes of this change.

7.4.5 Prediction of trade cycle: We can know about the factors of cyclical variations like boom, depression, recession and recovery, which are very important to the business community.

7.4.6 Universal utility: The analysis of time series is not only useful to the business community and economists, but is equally useful to agriculturists, government, researchers, political and social institutions, scientists etc.

7.5 Difference between seasonal and cyclical variations


Following are the main differences between the two:

7.5.1 Time period: The duration of seasonal variations is always one year, while the duration of cyclical variation is more than one year and varies from three to eight years.

7.5.2 Regularity: We find regularity in the components of seasonal variation, while there is no regularity in the components of cyclical variations, and even the length of the components of cyclical variations, viz., boom, disinflation, depression and recovery, is not equal.

7.5.3 Causes of variations: Seasonal variation takes place due to change in seasons, customs, habits, fashion etc., while cyclical variation takes place due to change in economic activity.

7.5.4 Measurement: Both the variations can be measured; however, their techniques differ. The seasonal variation can be measured more precisely as its variation is regular in nature.

7.5.5 Effect of variation: Seasonal variation affects different people in a different manner, while the effect of cyclical variation is the same on the whole economy.

7.6 Components of time series


Following are the components of time series :

7.6.1 Trend Component


We want to increase our understanding of a time series by picking out its main features. One of these main features is the trend component. Descriptive techniques may be extended to forecast (predict) future values. Trend is a long term movement in a time series. It is the underlying direction (an upward or downward tendency) and rate of change in a time series, when allowance has been made for the other components. A simple way of detecting trend in seasonal data is to take averages over a certain period. If these averages change with time we can say that there is evidence of a trend in the series. There are also more formal tests to enable detection of trend in time series.

It can be helpful to model trend using straight lines, polynomials etc.

7.6.2 Cyclical Component


We want to increase our understanding of a time series by picking out its main features. One of these main features is the cyclical component. Descriptive techniques may be extended to forecast (predict) future values. In weekly or monthly data, the cyclical component describes any regular fluctuations. It is a non-seasonal component which varies in a recognisable cycle.

7.6.3 Seasonal Component


We want to increase our understanding of a time series by picking out its main features. One of these main features is the seasonal component. Descriptive techniques may be extended to forecast (predict) future values. In weekly or monthly data, the seasonal component, often referred to as seasonality, is the component of variation in a time series which is dependent on the time of year. It describes any regular fluctuations with a period of less than one year. For example, the costs of various types of fruits and vegetables, unemployment figures and average daily rainfall, all show marked seasonal variation. We are interested in comparing the seasonal effects within the years, from year to year; removing seasonal effects so that the time series is easier to cope with; and, also interested in adjusting a series for seasonal effects using various models.

7.6.4 Irregular Component


We want to increase our understanding of a time series by picking out its main features. One of these main features is the irregular component (or 'noise'). Descriptive techniques may be extended to forecast (predict) future values.

The irregular component is that left over when the other components of the series (trend, seasonal and cyclical) have been accounted for.

7.7 Methods of measuring secular trend or trend


Broadly speaking there are four methods of measuring trend, they are as follows :

7.7.1 Free hand curve method: This is the easiest and simplest method of computing secular trend. In this method, time is plotted on the X-axis and the other variable is plotted on the Y-axis. A free hand curve is then drawn so as to pass through the center of the original fluctuations.

Merits:
1. It is the easiest and simplest method of knowing the trend values.
2. The trend line is drawn without using scale, so it may be a straight line or a smooth curve line.
3. The method is free from any mathematical formulas.

Demerits:
1. The straight line trends (Yt) drawn on graph will differ from person to person in the absence of any mathematical formula.
2. If the statistician is biased, the free hand curve will also be biased.

7.7.2 Semi average method: It is a better technique in comparison to the free hand curve method. Under this method the variable (Y) is divided into two equal parts and the average of each part is computed separately.

Merits:
1. This method is simple and easy to understand in relation to the moving average and least square methods.
2. The trend line (Yt) in this method is a fixed straight line, unlike the free hand curve method where the trend line depends upon the personal judgement of the statistician.

Demerits:
1. The method is based on the assumption of linear trend whether it exists or not.
2. The method is affected by the limitation of the arithmetic means.
3. This method is not suitable for removing trend from the original data.
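The semi average method can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sales figures are borrowed from the moving average illustration in section 7.9, each half-average is placed at the middle of its half, and with an odd number of observations the middle value is simply left out. The function name `semi_average_trend` is hypothetical.

```python
def semi_average_trend(years, values):
    """Fit a straight trend line through the averages of the two halves."""
    n = len(values)
    half = n // 2
    # Assumption: for an odd number of observations the middle one is omitted.
    first_avg = sum(values[:half]) / half
    second_avg = sum(values[n - half:]) / half
    # Each average is plotted against the mid-point of its own period.
    first_mid = sum(years[:half]) / half
    second_mid = sum(years[n - half:]) / half
    slope = (second_avg - first_avg) / (second_mid - first_mid)
    trend = [first_avg + slope * (t - first_mid) for t in years]
    return first_avg, second_avg, slope, trend


years = list(range(1990, 2000))
sales = [3, 8, 10, 9, 12, 15, 13, 18, 17, 20]  # Rs. lakh, from section 7.9
first_avg, second_avg, slope, trend = semi_average_trend(years, sales)
print(f"first half average:  {first_avg:.2f}")
print(f"second half average: {second_avg:.2f}")
print(f"annual increment:    {slope:.2f}")
```

Because the line is fixed by only two points, it always comes out straight, which is exactly the first demerit listed above.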

7.7.3 Moving Average method: This method is a better technique of knowing trend in relation to the semi average method. The trend values are obtained with a fair degree of accuracy by eliminating cyclical fluctuations. In this method we calculate averages on the basis of a moving technique. The period of the moving average is determined on the basis of the length of cyclical fluctuations, which varies from 3 to 11 years.

Merits:
1. This technique is easier in relation to the method of least square.
2. This technique is effective if the trend of the series is irregular.

Demerits:
1. In this method we cannot obtain the trend values for all the years, as we leave out the first and last year values of data while computing a three years moving average, and so on.
2. The basic purpose of trend value is to predict the trend of the future. In this method we cannot extend the trend line in both directions, so this method cannot be used for prediction purposes.

7.7.4 Method of least square: This is the best method of measuring secular trend. It is a mathematical as well as analytical tool. This method can be fitted to economic and business time series to make future predictions. The trend line may be linear or non-linear.

Merits:
1. The method of least square does not suffer from subjectivity or personal judgement as it is a mathematical method.
2. We can compute the trend value of all the given years by this method.

Demerits:
1. The method is based on mathematical technique, so it is not easily understandable to a non-mathematical person.
2. If we add or delete some observations in the data, the value of constants a and b will change and a new trend line will follow.
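A linear trend by the method of least square can be sketched as follows. The code assumes a straight-line trend Y = a + bt with time coded as t = 0, 1, 2, ... from the first year, and solves the two normal equations directly; the sales figures are borrowed from the illustration in section 7.9, and the function name `fit_linear_trend` is hypothetical.

```python
def fit_linear_trend(values):
    """Return (a, b) of the least-squares line Y = a + b*t, with t = 0, 1, 2, ..."""
    n = len(values)
    t = list(range(n))
    sum_t = sum(t)
    sum_y = sum(values)
    sum_tt = sum(ti * ti for ti in t)
    sum_ty = sum(ti * yi for ti, yi in zip(t, values))
    # Closed-form solution of the two normal equations.
    b = (n * sum_ty - sum_t * sum_y) / (n * sum_tt - sum_t ** 2)
    a = (sum_y - b * sum_t) / n
    return a, b


sales = [3, 8, 10, 9, 12, 15, 13, 18, 17, 20]  # Rs. lakh, 1990 to 1999
a, b = fit_linear_trend(sales)
trend = [a + b * ti for ti in range(len(sales))]
next_year_estimate = a + b * len(sales)  # extends the line one year ahead
print(f"a = {a:.4f}, b = {b:.4f}")
print(f"estimate for the year after the series: {next_year_estimate:.2f}")
```

Unlike the moving average, this line yields a trend value for every year and can be extended beyond the data, which is why the text treats it as suitable for prediction.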

7.8 Measurement of seasonal variations


The short term variations within a year in a time series are referred to as seasonal variations. These variations are periodic in nature, viz., weekly, monthly or quarterly changes. These variations may take place due to change in seasons like summer, winter, rainy, autumn etc. Thus, seasonal variations refer to the annual repetitive pattern in economic and business activity. The following measures are used to measure the seasonal variations:

7.8.1 Method of simple averages: This method involves the following steps:
1. The given time series is arranged by years, months or quarters.
2. Totals of each month for the given years are obtained.
3. The average of each month is then obtained by dividing the totals of months by the no. of years.
4. The total of the monthly averages is obtained and divided by the no. of months in a year.
5. Considering the average of monthly averages as base, the seasonal index is computed for each month by applying the following formula:

Seasonal index = (Monthly average for the month / Average of monthly averages) * 100
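The steps of the method of simple averages translate directly into code. The sketch below applies them to quarters instead of months, using the quarterly figures from the link relative illustration in section 7.9; the function name `simple_average_seasonal_indices` is a hypothetical label.

```python
def simple_average_seasonal_indices(table):
    """table holds one row per year and one column per season (month or quarter)."""
    n_years = len(table)
    # Steps 2 and 3: total and average of each season over the years.
    season_averages = [sum(column) / n_years for column in zip(*table)]
    # Step 4: average of the seasonal averages.
    grand_average = sum(season_averages) / len(season_averages)
    # Step 5: each seasonal average expressed as a percentage of the base.
    return [average / grand_average * 100 for average in season_averages]


quarterly = [
    [45, 54, 72, 60],  # Year I
    [48, 56, 63, 56],  # Year II
    [49, 63, 70, 65],  # Year III
    [52, 65, 75, 72],  # Year IV
]
indices = simple_average_seasonal_indices(quarterly)
for quarter, index in zip(["I", "II", "III", "IV"], indices):
    print(f"Quarter {quarter}: {index:.2f}")
```

With quarterly data the indices add up to 400 (with monthly data, to 1200), which serves as a quick arithmetic check.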

7.8.2 Ratio to trend method:


This method is based on the multiplicative model of time series. It assumes that seasonal variation for a given period is a constant fraction of the trend value. The steps for computation of this method are:
1. First of all, trend values are calculated by applying the method of least square on the yearly averages.
2. Trend values for each quarter are obtained based on the trend values so obtained.
3. Now divide the original quarterly data by the trend value of the corresponding quarter and multiply the quotient by hundred. These values are free from trend.
4. To free the data from cyclical and irregular variations, the quarterly data are averaged.

7.8.3 Link relative method: This is one of the most difficult methods of obtaining seasonal variations. Steps involved in this method are:
1. Link relatives are calculated from the given quarterly data by applying the formula:
Link relative = (Current quarter / Previous quarter) * 100
2. Averages of link relatives are obtained for each quarter.
3. Chain relatives are then calculated by using the formula:
Chain index = (Current quarter L.R. * Previous quarter chain index) / 100
4. The I quarter chain index is calculated based on the IV quarter.
5. Chain relatives are adjusted for each quarter by subtracting (quarterly effect * 1, quarterly effect * 2, quarterly effect * 3) from the II, III and IV quarters.
6. The seasonal index is finally computed. Since the total of the quarterly indices should be 400, while the real total will be much more, the seasonal index is computed as:
Seasonal index = (Chain index of quarter * 400) / Actual total of chain index of four quarters
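The six steps of the link relative method can be followed in a short Python sketch. It is only an illustration under stated assumptions: the quarterly figures are those of the illustration in section 7.9, the first quarter's chain index is set to 100, and the "quarterly effect" is taken as one quarter of the gap between the recomputed first-quarter chain index and 100. The function name is hypothetical.

```python
def link_relative_seasonal_indices(table):
    """table holds one row per year with four quarterly values."""
    flat = [value for year in table for value in year]
    quarters = 4
    # Step 1: link relatives; the very first observation has no predecessor.
    links = [None] + [flat[i] / flat[i - 1] * 100 for i in range(1, len(flat))]
    # Step 2: average link relative of each quarter.
    average_links = []
    for q in range(quarters):
        values = [links[i] for i in range(q, len(flat), quarters) if links[i] is not None]
        average_links.append(sum(values) / len(values))
    # Step 3: chain relatives, starting from 100 for the first quarter.
    chain = [100.0]
    for q in range(1, quarters):
        chain.append(average_links[q] * chain[q - 1] / 100)
    # Step 4: first-quarter chain index recomputed from the fourth quarter.
    second_first_quarter = average_links[0] * chain[-1] / 100
    # Step 5: spread the discrepancy over the quarters (assumed quarterly effect).
    quarterly_effect = (second_first_quarter - 100) / quarters
    adjusted = [chain[q] - q * quarterly_effect for q in range(quarters)]
    # Step 6: rescale so that the four indices total 400.
    total = sum(adjusted)
    indices = [value * 400 / total for value in adjusted]
    return average_links, adjusted, indices


quarterly = [
    [45, 54, 72, 60],
    [48, 56, 63, 56],
    [49, 63, 70, 65],
    [52, 65, 75, 72],
]
average_links, adjusted, indices = link_relative_seasonal_indices(quarterly)
for quarter, index in zip(["I", "II", "III", "IV"], indices):
    print(f"Quarter {quarter}: {index:.2f}")
```

Keeping the intermediate lists (`average_links`, `adjusted`) in the return value makes it easier to compare each step with a hand-worked table.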

7.9 Practical Problems:


Illustration: Find the 3-year moving averages from the following data:

Year:                 1990  1991  1992  1993  1994  1995  1996  1997  1998  1999
Sales (in Rs. lakh):     3     8    10     9    12    15    13    18    17    20
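The 3-year moving averages for this illustration can be computed with a short Python sketch. Each average is placed against the middle year of its three-year window, so the first and last years receive no trend value; the function name `moving_average` is an illustrative choice.

```python
def moving_average(values, window=3):
    """Return the simple moving averages of consecutive windows of the given length."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


years = list(range(1990, 2000))
sales = [3, 8, 10, 9, 12, 15, 13, 18, 17, 20]  # Rs. lakh

ma3 = moving_average(sales, window=3)
centred_years = years[1:-1]  # a 3-year average belongs to the middle year
for year, value in zip(centred_years, ma3):
    print(f"{year}: {value:.2f}")
```

The output covers 1991 to 1998 only, which illustrates the first demerit of the moving average method noted in section 7.7.3.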


Illustration: Compute seasonal variations by using the Link Relative Method from the following data:

Year    I Quarter    II Quarter    III Quarter    IV Quarter
I       45           54            72             60
II      48           56            63             56
III     49           63            70             65
IV      52           65            75             72

(iv) Total of corrected chain relatives = 100 + 120.08 + 140.86 + 124.74 = 485.68
(v) Seasonal Index

Chapter Seven: End Chapter Quizzes

1. A time series is a set of data recorded
a- periodically
b- at time or space intervals
c- at successive points of time
d- all the above

2. The time series analysis helps
a- to compare the two or more series
b- to know the behaviour of business
c- to make predictions
d- all the above

3. A time series is unable to adjust the influences like
a- customs and policy changes
b- seasonal changes
c- long-term influences
d- none of the above

4. A time series consists of
a- two components
b- three components
c- four components
d- five components

5. The forecasts on the basis of a time series are
a- cent per cent true
b- true to a great extent
c- never true
d- none of the above

6. The component of the time series attached to long-term variations is termed as
a- cyclic variation
b- secular trend
c- irregular variation
d- all the above

7. Secular trend is indicative of long-term variation towards
a- increase only
b- decrease only
c- either increase or decrease
d- none of the above

8. Linear trend of a time series indicates towards
a- constant rate of change
b- constant rate of growth
c- change in geometric progression
d- all the above

9. Seasonal variation means the variations occurring within
a- a number of years
b- parts of year
c- parts of month
d- none of the above

10. Cyclic variations in a time series are caused by
a- lockouts in a factory
b- war in a country
c- floods in the states
d- none of the above

CHAPTER EIGHT: PROBABILITY


8.1 Introduction
The theory of probability was developed towards the end of the 18th century and its history suggests that it developed with the study of games of chance, such as rolling a dice, drawing a card, flipping a coin etc. Apart from these, uncertainty prevails in every sphere of life. For instance, one often predicts: "It will probably rain tonight." "It is quite likely that there will be a good yield of cereals this year" and so on. This indicates that, in layman's terminology, the word probability connotes that there is an uncertainty about the happening of events. To put probability on a better footing we define it. But before doing so, we have to explain a few terms.

8.2 Concepts of probability calculation


Following are the fundamental concepts of probability calculation:
8.2.1 Trial

A procedure or an experiment to collect any statistical data such as rolling a dice or flipping a coin is called a trial.
8.2.2 Random Trial or Random Experiment

When the outcome of an experiment cannot be predicted precisely, the experiment is called a random trial or random experiment. In other words, if a random experiment is repeated under identical conditions, the outcome will vary at random, as it is impossible to predict the result of any single performance of the experiment. For example, if we toss an honest coin or roll an unbiased dice, we may not get the result we expect.

8.2.3 Sample space

The totality of all the outcomes or results of a random experiment is denoted by the Greek letter Ω or the English letter S and is called the sample space. Each outcome or element of this sample space is known as a sample point.
8.2.4 Event

Any subset of a sample space is called an event. A sample space S serves as the universal set for all questions related to an experiment, and an event A with respect to it is the set of all possible outcomes favorable to the event A. For example:

A random experiment: flipping a coin twice
Sample space: Ω or S = {(HH), (HT), (TH), (TT)}
The question: "both the flips show the same face"
Therefore, the event A: {(HH), (TT)}
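The coin example can be reproduced by enumerating the sample space and picking out the favorable outcomes. This is a small Python sketch; the variable names are illustrative, and outcomes are written as tuples such as ('H', 'T') rather than (HT).

```python
from itertools import product

# Sample space for flipping a coin twice: every ordered pair of H and T.
sample_space = list(product("HT", repeat=2))

# Event A: both flips show the same face.
event_a = [outcome for outcome in sample_space if outcome[0] == outcome[1]]

print("S =", sample_space)
print("A =", event_a)
```

Listing the sample space explicitly makes it clear that the event is simply a subset of it.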
8.2.5 Equally Likely Events

The possible results of a random experiment are called equally likely outcomes if we have no reason to expect any one rather than the other. For example, as the result of drawing a card from a well shuffled pack, any card may appear in the draw, so that the 52 cards become 52 different events which are equally likely.
8.2.6 Mutually Exclusive Events

Events are called mutually exclusive or disjoint or incompatible if the occurrence of one of them precludes the occurrence of all the others. For example, in tossing a coin there are two mutually exclusive events, viz. turning up of a head and turning up of a tail, since both these events cannot happen simultaneously. But note that events are compatible if it is possible for them to happen simultaneously. For instance, in rolling two dice, the cases of the face marked 5 appearing on one dice and face 5 appearing on the other are compatible.
8.2.7 Exhaustive Events

Events are exhaustive when they include all the possibilities associated with the same trial. In throwing a coin, the turning up of head and of a tail are exhaustive events assuming of course that the coin cannot rest on its edge.
8.2.8 Independent Events

Two events are said to be independent if the occurrence of either event does not affect the occurrence of the other. For example, in tossing a coin, the events corresponding to two successive tosses are independent. Likewise, the flip of a penny does not affect in any way the flip of a nickel.
8.2.9 Dependent Events

If the occurrence or non-occurrence of one event affects the happening of the other, then the events are said to be dependent events. For example, in drawing a card from a pack of cards, let the event A be the occurrence of a king in the first draw and B be the occurrence of a king in the second draw. If the card drawn at the first trial is not replaced, then events A and B are dependent events.

Note
(1) If an event contains a single sample point, i.e. it is a singleton set, then this event is called an elementary or a simple event.
(2) An event corresponding to the empty set is an impossible event.
(3) An event corresponding to the entire sample space is called a certain event.

8.2.10 Complementary Events

Let S be the sample space for an experiment and A be an event in S. Then A is a subset of S. Hence $\bar{A}$, the complement of A in S, is also an event in S, which contains the outcomes that are not favorable to the occurrence of A; i.e. if A occurs, then the outcome of the experiment belongs to A, but if A does not occur, then the outcome of the experiment belongs to $\bar{A}$. It is obvious that A and $\bar{A}$ are mutually exclusive : $A \cap \bar{A} = \varnothing$ and $A \cup \bar{A} = S$. If S contains n equally likely, mutually exclusive and exhaustive points and A contains m out of these n points, then $\bar{A}$ contains (n - m) sample points.

8.3 Definitions

We shall now consider two definitions of probability :

8.3.1 Mathematical or a priori or classical.
8.3.2 Statistical or empirical.


8.3.1 Mathematical (or A Priori or Classic) Definition

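If there are n exhaustive, mutually exclusive and equally likely cases and m of them are favorable to an event A, the probability of A happening is defined as the ratio m/n. Expressed as a formula :

$P(A) = p = \frac{m}{n} = \frac{\text{number of favorable cases}}{\text{total number of equally likely cases}}$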

This definition is due to Laplace. Thus probability is a concept which measures numerically the degree of certainty or uncertainty of the occurrence of an event. For example, the probability of randomly drawing a king from a well-shuffled deck of cards is 4/52, since 4 is the number of favorable outcomes (i.e. the 4 kings of diamonds, spades, clubs and hearts) and 52 is the number of total outcomes (the number of cards in a deck).

If A is any event of a sample space having probability p, then clearly p is a non-negative number (expressed as a fraction or usually as a decimal) not greater than unity : $0 \le p \le 1$, i.e. from a low of 0 (no chance, for an impossible event) to a high of 1 (certainty).

Since the number of cases not favorable to A is (n - m), the probability q that event A will not happen is

$q = \frac{n - m}{n} = 1 - \frac{m}{n} = 1 - p$

Now note that the probability q is nothing but the probability of the complementary event $\bar{A}$. Thus $P(\bar{A}) = 1 - p$, or $P(\bar{A}) = 1 - P(A)$, so that $P(A) + P(\bar{A}) = 1$, i.e. p + q = 1.
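The ratio m/n and the complement rule p + q = 1 can be checked with exact fractions. The sketch below is illustrative only; the helper name classical_probability is an assumption, not something defined in the text.

```python
from fractions import Fraction


def classical_probability(favorable: int, total: int) -> Fraction:
    """Return m/n for m favorable cases out of n equally likely cases."""
    if total <= 0 or not 0 <= favorable <= total:
        raise ValueError("need total > 0 and 0 <= favorable <= total")
    return Fraction(favorable, total)


p = classical_probability(4, 52)   # drawing a king from a full deck
q = 1 - p                          # not drawing a king

print(p, q, p + q)   # 1/13 12/13 1
```

Using Fraction rather than floating point keeps the results in the same form as the hand calculation.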

Relative Frequency Definition


The classical definition of probability has a disadvantage : the words "equally likely" are vague. In fact, since these words seem to be synonymous with "equally probable", the definition is circular, as it defines probability in terms of itself. Therefore, the estimated or empirical probability of an event is taken as the relative frequency of the occurrence of the event when the number of observations is very large.

8.3.2 Von Mises Statistical (or Empirical) Definition

If trials are repeated a great number of times under essentially the same conditions, then the limit of the ratio of the number of times that an event happens to the total number of trials, as the number of trials increases indefinitely, is called the probability of the happening of the event. It is assumed that the limit exists and is finite and unique. Symbolically,

$P(A) = p = \lim_{n \to \infty} \frac{m}{n}$

provided it is finite and unique, where m is the number of times the event A happens in n trials. The two definitions are apparently different, but both of them can be reconciled in the same sense.

Example Find the probability of getting heads in tossing a coin.

Solution :
Experiment : Tossing a coin
Sample space : S = {H, T}
Event A : getting heads, A = {H}
n(A) = 1, n(S) = 2

Therefore, P(A) = n(A)/n(S) = 1/2 or 0.5
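The statistical definition can be illustrated by simulation: as the number of tosses grows, the relative frequency of heads should settle near the classical value 0.5. The sketch below assumes a fair simulated coin and a fixed seed for repeatability; the function name relative_frequency_of_heads is hypothetical.

```python
import random


def relative_frequency_of_heads(trials: int, seed: int = 0) -> float:
    """Toss a simulated fair coin `trials` times and return heads / trials."""
    rng = random.Random(seed)          # seeded so that runs are repeatable
    heads = sum(rng.random() < 0.5 for _ in range(trials))
    return heads / trials


# The ratio m/n for an increasing number of trials n.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

A simulation can only suggest the limit, since any finite run still carries sampling error.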

Example Find the probability of getting 3 or 5 in throwing a die.

Solution :
Experiment : Throwing a die
Sample space : S = {1, 2, 3, 4, 5, 6}
Event A : getting 3 or 5, A = {3, 5}
n(A) = 2, n(S) = 6

Therefore, P(A) = 2/6 = 1/3

Example Two dice are rolled. Find the probability that the score on the second die is greater than the score on the first die.

Solution :
Experiment : Two dice are rolled
Sample space : S = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
...
(6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)}
n(S) = 6 × 6 = 36
Event A : The score on the second die > the score on the first die, i.e.
A = {(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 3), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6), (4, 5), (4, 6), (5, 6)}
n(A) = 15

Therefore, P(A) = 15/36 = 5/12

Example A coin is tossed three times. Find the probability of getting at least one head.

Solution :
Experiment : A coin is tossed three times.
Sample space : S = {(HHH), (HHT), (HTH), (HTT), (THH), (THT), (TTH), (TTT)}
n(S) = 8
Event A : getting at least one head, so that $\bar{A}$ : getting no head at all = {(TTT)}
$n(\bar{A}) = 1$, so $P(\bar{A}) = 1/8$

Therefore, $P(A) = 1 - P(\bar{A}) = 1 - 1/8 = 7/8$

Example A ball is drawn at random from a box containing 6 red balls, 4 white balls and 5 blue balls. Determine the probability that the ball drawn is (i) red (ii) white (iii) blue (iv) not red (v) red or white.

Solution : Let R, W and B denote the events of drawing a red ball, a white ball and a blue ball respectively. The box contains 6 + 4 + 5 = 15 balls.

(i) P(R) = 6/15 = 2/5
(ii) P(W) = 4/15
(iii) P(B) = 5/15 = 1/3
(iv) P(not red) = 1 - P(R) = 1 - 2/5 = 3/5
(v) P(R or W) = P(R) + P(W) = 6/15 + 4/15 = 10/15 = 2/3

Note : The two events R and W are disjoint events.

Example What is the chance that a leap year selected at random will contain 53 Sundays ?

Solution : A leap year has 52 weeks and 2 more days. The two days can be : Monday - Tuesday, Tuesday - Wednesday, Wednesday - Thursday, Thursday - Friday, Friday - Saturday, Saturday - Sunday and Sunday - Monday. There are 7 outcomes and 2 are favorable to the 53rd Sunday (Saturday - Sunday and Sunday - Monday).

Now for 53 Sundays in a leap year, P(A) = 2/7 = 0.29 (approximately)

Example If four ladies and six gentlemen sit for a photograph in a row at random, what is the probability that no two ladies will sit together ?

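Solution : Ten persons can sit in a row in 10! ways, and the six gentlemen can be arranged among themselves in 6! ways. Now if no two ladies are to be together, the ladies have 7 positions, 2 at the ends and 5 between the gentlemen :

Arrangement : L, G1, L, G2, L, G3, L, G4, L, G5, L, G6, L

The four ladies can occupy these 7 positions in $^{7}P_{4}$ = 7 × 6 × 5 × 4 = 840 ways.

Therefore, required probability = $\frac{6! \times {}^{7}P_{4}}{10!} = \frac{720 \times 840}{3628800} = \frac{1}{6}$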
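The counting argument for the photograph example (seat the gentlemen first, then place the ladies in the gaps around them) can be checked numerically. This is a sketch under the assumption that all 10! seatings are equally likely; the variable names are illustrative, and math.perm requires Python 3.8 or later.

```python
from fractions import Fraction
from math import factorial, perm

gentlemen, ladies = 6, 4

# All seatings of the ten people in a row.
total = factorial(gentlemen + ladies)

# Seat the gentlemen first; this leaves gentlemen + 1 gaps (2 ends, 5 between).
gaps = gentlemen + 1

# Place the ladies in distinct gaps, order mattering: 6! * 7P4 seatings.
favorable = factorial(gentlemen) * perm(gaps, ladies)

print(favorable, total, Fraction(favorable, total))   # 604800 3628800 1/6
```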

Example In a class there are 13 students. 5 of them are boys and the rest are girls. Find the probability that two students selected at random will both be girls.

Solution : Two students out of 13 can be selected in $^{13}C_{2}$ = 78 ways, and two girls out of 8 can be selected in $^{8}C_{2}$ = 28 ways.

Therefore, required probability = $\frac{^{8}C_{2}}{^{13}C_{2}} = \frac{28}{78} = \frac{14}{39}$

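Example A box contains 5 white balls, 4 black balls and 3 red balls. Three balls are drawn randomly. What is the probability that they will be (i) white (ii) black (iii) red ?

Solution : Let W, B and R denote the events of drawing three white, three black and three red balls respectively. Three balls out of 12 can be drawn in $^{12}C_{3}$ = 220 ways.

(i) P(W) = $^{5}C_{3}$ / $^{12}C_{3}$ = 10/220 = 1/22
(ii) P(B) = $^{4}C_{3}$ / $^{12}C_{3}$ = 4/220 = 1/55
(iii) P(R) = $^{3}C_{3}$ / $^{12}C_{3}$ = 1/220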
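The two selection examples above (two girls out of a class of 13, and three balls of one colour out of a box of 12) follow the same pattern: ways of choosing from the favoured group divided by ways of choosing from everything. A minimal sketch, with the helper name prob_all_from_group being an assumption of this illustration:

```python
from fractions import Fraction
from math import comb


def prob_all_from_group(group_size: int, total: int, draws: int) -> Fraction:
    """Probability that all `draws` items, taken without replacement from
    `total` items, come from one group of `group_size` items."""
    return Fraction(comb(group_size, draws), comb(total, draws))


# Class of 13 students with 8 girls: both selected students are girls.
p_two_girls = prob_all_from_group(8, 13, 2)

# Box with 5 white, 4 black and 3 red balls: all three drawn balls match.
p_three_white = prob_all_from_group(5, 12, 3)
p_three_black = prob_all_from_group(4, 12, 3)
p_three_red = prob_all_from_group(3, 12, 3)

print(p_two_girls)                                  # 14/39
print(p_three_white, p_three_black, p_three_red)    # 1/22 1/55 1/220
```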

8.4 The Law of Probability


So far we have discussed probabilities of single events. In many situations we come across two or more events occurring together. If A and B are two events, the event that either A or B or both occur is denoted by $A \cup B$ or (A + B), and the event that both A and B occur is denoted by $A \cap B$ or AB. We term these situations compound events or the joint occurrence of events. We may need the probability that A or B will happen; it is denoted by $P(A \cup B)$ or P(A + B). Also we may need the probability that A and B (both) will happen simultaneously; it is denoted by $P(A \cap B)$ or P(AB).

Consider a situation : you are asked to choose any 3 or any diamond or both from a well-shuffled pack of 52 cards. Now you are interested in the probability of this situation.

[Diagram: the 52 cards of the pack shown as dots]

Now count the dots in the area which fulfils the condition any 3 or any diamond or both. They are 16.

Thus the required probability = 16/52 = 4/13

In the language of set theory, the set any 3 or any diamond or both is the union of the sets any 3, which contains 4 cards, and any diamond, which contains 13 cards. The number of cards in their union is equal to the sum of these numbers minus the number of cards in the space where they overlap. Any point in this space, called the intersection of the two sets, is counted twice (double counting), once in each set; here it is the single card 3 of diamonds. Dividing by 52 we get the required probability. Thus

P(any 3 or any diamond or both) = 4/52 + 13/52 - 1/52 = 16/52 = 4/13

In general, if the letters A and B stand for any two events, then

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Clearly, the events A and B here are not mutually exclusive.

Example Two dice are rolled. Find the probability that the score is an even number or a multiple of 3.

Solution : Two dice are rolled.
Sample space = {(1, 1), (1, 2), ............, (6, 6)}
n(S) = 6 × 6 = 36

Event E : The score is an even number or a multiple of 3. Note that here score means the sum of the numbers on both the dice when they land. For example, (1, 1) has score 1 + 1 = 2. It is clear that the least score is 2 and the highest score, from (6, 6), is 6 + 6 = 12, i.e. the possible scores are 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.

Let Event A : The score is an even number
A = {(1, 1), (1, 3), (1, 5), (2, 2), (2, 4), (2, 6), (3, 1), (3, 3), (3, 5), (4, 2), (4, 4), (4, 6), (5, 1), (5, 3), (5, 5), (6, 2), (6, 4), (6, 6)}
Therefore n(A) = 18

Let Event B : The score is a multiple of 3, i.e. 3, 6, 9, 12
B = {(1, 2), (1, 5), (2, 1), (2, 4), (3, 3), (3, 6), (4, 2), (4, 5), (5, 1), (5, 4), (6, 3), (6, 6)}
n(B) = 12

Let Event $A \cap B$ or AB : The score is an even number and a multiple of 3 (i.e. common to both A and B)
AB = {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1), (6, 6)}
n(AB) = 6

Therefore, P(E) = P(A) + P(B) - P(AB) = 18/36 + 12/36 - 6/36 = 24/36 = 2/3
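The counts in the two-dice example can be verified by listing all 36 rolls and applying the addition law directly. The following is a sketch only; the set names even, multiple_of_3 and so on are chosen for this illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered rolls of two dice; the score is the sum of the two faces.
rolls = set(product(range(1, 7), repeat=2))

even = {roll for roll in rolls if sum(roll) % 2 == 0}            # event A
multiple_of_3 = {roll for roll in rolls if sum(roll) % 3 == 0}   # event B
both = even & multiple_of_3                                       # event AB


def prob(event: set) -> Fraction:
    return Fraction(len(event), len(rolls))


print(len(even), len(multiple_of_3), len(both))   # 18 12 6

# Addition law: P(A or B) = P(A) + P(B) - P(AB)
by_law = prob(even) + prob(multiple_of_3) - prob(both)
by_counting = prob(even | multiple_of_3)
print(by_law, by_counting)                        # 2/3 2/3
```

Computing the union both ways shows why the overlap must be subtracted once: the six rolls in AB would otherwise be counted twice.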

Multiplication Law of Probability

If there are two independent events, the respective probabilities of which are known, then the probability that both will happen is the product of the probabilities of their happening respectively :

P(AB) = P(A) × P(B)

To compute the probability of two or more independent events all occurring (joint occurrence), extend the above law to the required number of events. For example, first flip a penny, then a nickel and finally a dime. On landing, the probability of heads is 1/2 for the penny, 1/2 for the nickel and 1/2 for the dime.

Thus the probability of landing three heads will be 1/2 × 1/2 × 1/2 = 1/8 or 0.125. (Note that all three events are independent.)

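Example Three machines I, II and III manufacture respectively 0.4, 0.5 and 0.1 of the total production. The percentage of defective items produced by I, II and III is 2, 4 and 1 percent respectively. For an item randomly chosen, what is the probability that it is defective ?

Solution : An item is defective if it comes from machine I and is defective, or from machine II and is defective, or from machine III and is defective. These three cases are mutually exclusive, so

P(defective) = (0.4 × 0.02) + (0.5 × 0.04) + (0.1 × 0.01) = 0.008 + 0.020 + 0.001 = 0.029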
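One way to work the machine example is to weight each machine's defect rate by its share of production and add the results, since an item comes from exactly one machine. The sketch below uses the figures from the example; the dictionary names shares and defect_rates are assumptions of this illustration.

```python
from fractions import Fraction

# Share of total production per machine, and each machine's defect rate.
shares = {"I": Fraction("0.4"), "II": Fraction("0.5"), "III": Fraction("0.1")}
defect_rates = {"I": Fraction("0.02"), "II": Fraction("0.04"), "III": Fraction("0.01")}

# Multiply within each machine, then add across the mutually exclusive cases.
p_defective = sum(shares[m] * defect_rates[m] for m in shares)

print(p_defective, float(p_defective))   # 29/1000 0.029
```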

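Example In shuffling a pack of cards, 4 are accidentally dropped one after another. Find the chance that the missing cards should be one from each suit.

Solution : Let H, D, C and S denote heart, diamond, club and spade cards respectively. The probability that the four missing cards are one from each suit in one particular order, say H, D, C, S, is

13/52 × 13/51 × 13/50 × 13/49

The four suits can appear in 4! = 24 different orders, so the required probability = 24 × (13/52 × 13/51 × 13/50 × 13/49) = 2197/20825 = 0.105 (approximately)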
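The dropped-cards example can be approached in two ways that should agree: by counting unordered sets of four cards, or by multiplying the chance that each successive card comes from a suit not yet seen. The sketch below is an illustration under the assumption that every set of four cards is equally likely to be dropped; the variable names are hypothetical.

```python
from fractions import Fraction
from math import comb

# Counting: one card from each of the 4 suits, out of all 4-card selections.
p_by_counting = Fraction(13 ** 4, comb(52, 4))

# Sequential: the 1st card is free, the 2nd must avoid 1 suit (39 of 51 left),
# the 3rd must avoid 2 suits (26 of 50), the 4th must avoid 3 suits (13 of 49).
p_sequential = (
    Fraction(52, 52) * Fraction(39, 51) * Fraction(26, 50) * Fraction(13, 49)
)

print(p_by_counting, p_sequential)   # 2197/20825 2197/20825
print(round(float(p_by_counting), 3))
```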

Example A problem in statistics is given to three students A, B and C, each with a known chance of solving it (1/2 for A). What is the probability that the problem will be solved ?

Solution : The probability that A can solve the problem = 1/2
The probability that A cannot solve the problem = 1 - 1/2 = 1/2
Similarly, the probabilities that B and C cannot solve the problem are 1 - P(B) and 1 - P(C) respectively.
Since the students work independently, the problem remains unsolved only if all three fail, so
P(problem solved) = 1 - (1/2) × [1 - P(B)] × [1 - P(C)]

Conditional Probability
In many situations you get more information than simply the total outcomes and favorable outcomes you already have and, hence, you are in a position to make more informed judgements regarding the probabilities of such situations. For example, suppose a card is drawn at random from a deck of 52 cards. Let B denote the event that the card is a diamond and A denote the event that the card is red. We may then consider the following probabilities :

P(A) = 26/52 = 1/2, P(B) = 13/52 = 1/4, P(AB) = 13/52 = 1/4

Since there are 26 red cards of which 13 are diamonds, the probability that the card is a diamond, knowing that it is red, is 13/26 = 1/2. In other words, the probability of event B knowing that A has occurred is 1/2.

The probability of B under the condition that A has occurred is known as conditional probability and it is denoted by P(B/A). Thus P(B/A) = 1/2. It should be observed that the probability of the event B is increased due to the additional information that the event A has occurred. Conditional probability is found using the formula

$P(B/A) = \frac{P(AB)}{P(A)}$

Justification :- $P(B/A) = \frac{P(AB)}{P(A)} = \frac{13/52}{26/52} = \frac{1}{2}$

Similarly $P(A/B) = \frac{P(AB)}{P(B)}$

In both the cases, if A and B are independent events then P(A/B) = P(A) and P(B/A) = P(B). Therefore

$P(A) = \frac{P(AB)}{P(B)}$ or P(AB) = P(A) . P(B)
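The conditional probability formula can be tried out on a full deck. The sketch below rebuilds the red/diamond example and then adds a king/red pair, which is an extra illustration not taken from the text, to show a case where the events are independent and P(AB) = P(A) . P(B) holds. The function names prob and conditional are assumptions.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(ranks, suits))


def prob(event: set) -> Fraction:
    return Fraction(len(event), len(deck))


def conditional(b: set, a: set) -> Fraction:
    """P(B/A) = P(AB) / P(A)."""
    return prob(a & b) / prob(a)


red = {card for card in deck if card[1] in ("diamonds", "hearts")}   # event A
diamond = {card for card in deck if card[1] == "diamonds"}           # event B
king = {card for card in deck if card[0] == "K"}

print(prob(diamond), conditional(diamond, red))     # 1/4 1/2
print(prob(king), conditional(king, red))           # 1/13 1/13
print(prob(king & red) == prob(king) * prob(red))   # True
```

Knowing the card is red doubles the chance of a diamond but leaves the chance of a king unchanged, which is exactly the distinction between dependent and independent events.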

8.5 Importance of Probability


The theory of probability has its origin in the seventeenth century, when a quantitative measure of probability was developed for problems related to games of dice in gambling. Later, mathematicians applied the theory to other problems pertaining to chance, such as the tossing of a coin, the possibility of getting a card of a specific suit, or the possibility of drawing balls of a specific colour from a bag of balls. Nowadays the law of probability is used to solve economic and business problems, and even the problems of our day-to-day life. The utility of probability can be known by its various uses. Following are the areas where probability theory has been used :

1. The fundamental laws of statistics, like the Law of Statistical Regularity and the Law of Inertia of Large Numbers, are based on the theory of probability.
2. The various tests of significance, like the Z test, F test and Chi-square test, are derived from the theory of probability.
3. This theory gives solutions to the problems relating to games of chance.
4. The decision theories are based on the fundamental laws of probability.
5. The theory is generally used in economic and business decision making. The theory is very useful in situations where risk and uncertainty prevail.
6. Subjective probability is widely used in those situations where actual measurement of probability is not feasible. It has, thus, added a new dimension to the theory of probability. These probabilities can be revised at a later stage on the basis of experience.

8.6 Practical Problems

Illustration: A single letter is selected at random from the word PROBABILITY. What is the probability that it is a vowel?

Solution: Total number of letters in the word PROBABILITY = n = 11
Number of favourable cases = m = 4 (the vowels are O, A, I, I)
We know that P(A) = m/n = 4/11

Illustration: Find the probability of having at least one son in a family with two children.

Solution: Two children in a family may be either :
(1) Both sons
or (2) Son and daughter
or (3) Daughter and son
or (4) Both daughters
Thus, total number of equally likely cases = n = 4
At least one son implies that a family may have one son or two sons. Thus, favourable number of cases = m = 3 (i.e., options 1, 2 and 3)
P(A) = m/n = 3/4

Illustration: Find the chance of getting an ace in a draw from a pack of 52 cards.

Solution: Total number of cards = n = 52
Number of favourable cases = m = 4 (number of aces)
P(A) = m/n = 4/52 = 1/13

Illustration: Suppose an ideal die is tossed twice. What is the probability of getting a sum of 10 in the two tosses?

Solution: A die can be tossed the first time in 6 ways
A die can be tossed the second time in 6 ways
A die can be tossed twice in 6 × 6 = 36 ways (as per the rule of counting)
Number of ways in which we can throw two dice to get a sum of 10 = m = 3 ways (i.e., 4 + 6, 5 + 5 and 6 + 4)
P(A) = m/n = 3/36 = 1/12
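Two of the illustrations above reduce to counting favourable cases, which is easy to confirm by enumeration. The sketch below checks the vowel count in PROBABILITY and the rolls that sum to 10; the variable names are chosen for this illustration only.

```python
from fractions import Fraction
from itertools import product

# Vowels in the word PROBABILITY, counting repeated letters separately.
word = "PROBABILITY"
vowel_count = sum(letter in "AEIOU" for letter in word)
p_vowel = Fraction(vowel_count, len(word))

# Ordered rolls of a die tossed twice whose faces add up to 10.
rolls = list(product(range(1, 7), repeat=2))
sum_is_ten = [roll for roll in rolls if sum(roll) == 10]
p_sum_ten = Fraction(len(sum_is_ten), len(rolls))

print(vowel_count, p_vowel)    # 4 4/11
print(sum_is_ten, p_sum_ten)   # [(4, 6), (5, 5), (6, 4)] 1/12
```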

Chapter Eight: End Chapter Quizzes


1. The outcome of tossing a coin is a
a- simple event
b- mutually exclusive event
c- complementary event
d- compound event

2. Classical probability is measured in terms of
a- an absolute value
b- a ratio
c- absolute value and ratio both
d- none of the above

3. Probability is expressed as
a- ratio
b- proportion
c- percentage
d- all the above

4. Classical probability is also known as
a- Laplace's probability
b- mathematical probability
c- a priori probability
d- all the above

5. Each outcome of a random experiment is called
a- Primary event
b- Compound event
c- Derived event
d- All the above

6. The definition of statistical probability was originally given by
a- De Moivre
b- Laplace
c- Von-Mises
d- Pascal

7. The definition of a priori probability was originally given by
a- De Moivre
b- Laplace
c- Von-Mises
d- Feller

8. Probability by classical approach has
a- no lacuna
b- only one lacuna
c- only two lacunae
d- many lacunae

9. An event consisting of those elements which are not in A is called
a- primary event
b- derived event
c- simple event
d- complementary event

10. The probability of the intersection of two mutually exclusive events is always
a- infinity
b- zero
c- one
d- none of the above

Answer key to End Chapter Quizzes:


Chapter One : (1) b, (2) c, (3) c, (4) a, (5) b, (6) c, (7) b, (8) c, (9) c, (10) d
Chapter Two : (1) b, (2) b, (3) c, (4) b, (5) d, (6) c, (7) c, (8) c, (9) c, (10) c
Chapter Three : (1) d, (2) c, (3) a, (4) d, (5) b, (6) b, (7) a, (8) d, (9) a, (10) c
Chapter Four : (1) d, (2) c, (3) b, (4) a, (5) a, (6) b, (7) c, (8) b, (9) c, (10) d
Chapter Five : (1) c, (2) b, (3) c, (4) c, (5) a, (6) b, (7) b, (8) b, (9) c, (10) b
Chapter Six : (1) b, (2) b, (3) d, (4) d, (5) a, (6) c, (7) a, (8) d, (9) b, (10) b
Chapter Seven : (1) d, (2) d, (3) a, (4) c, (5) b, (6) b, (7) c, (8) a, (9) b, (10) d
Chapter Eight : (1) a, (2) b, (3) d, (4) d, (5) a, (6) c, (7) b, (8) d, (9) d, (10) b

