
TUTOR MARKED ASSIGNMENT

Course Code: ECO-07
Course Title: Elements of Statistics
Assignment Code: ECO-07/TMA/2011-12
Assignment Coverage: All Blocks
Maximum Marks: 100

1. What is a statistical survey? Explain various sampling methods. (20)

Statistics is any systematic set of data, expressed in numeric terms, that can be analysed to produce useful results. As a subject, it deals with large bodies of data acquired for a particular analytical purpose. Such data are acquired through systematic and purposeful surveys conducted to determine the results. An inquiry of this kind is known as a statistical survey.

A primary characteristic of statistical data is that it may not be exact, but it must be accurate and relevant to the topic of analysis. Hence, it is important that the statistical survey is done properly so that the desired results can be derived from the data. A statistical survey involves a few steps; by following them one can achieve the target accuracy of the data. The steps are as follows:

1. Defining the problem
2. Determining the objectives and scope
3. Preliminaries to the collection of data:
   a. Source of data
   b. Type of enquiry
   c. Statistical unit
   d. Degree of accuracy
4. Collection of data
5. Editing of data
6. Coding of data
7. Classification and tabulation of data
8. Analysis of the data
9. Interpretation of the data
10. Writing the report

Once the above steps are followed, one can rest assured that the data collected from the survey are appropriate for the purpose and for interpretation.

The surveys conducted are usually of two types: (1) census survey, wherein every item of the group (the population) is surveyed; (2) sample survey, wherein only selected items of the group are surveyed. In the second type of survey, some items are picked from the group for the survey. The items picked form a sample, and the method of selecting them is known as the sampling method. Sampling methods can be categorized broadly as Probability Sampling Methods and Non-probability Sampling Methods.

Probability Sampling Method

In this type of sampling method, each and every item of the population has a known chance, or probability, of inclusion in the survey. The methods can be categorized as follows:

1. Random Sampling: This is a simple selection method wherein samples are randomly selected from the population.
2. Systematic Sampling: In this method, the population is arranged systematically in a list, such as in alphabetical order, and items are then selected at a fixed interval, e.g. every 10th item is selected.
3. Stratified Sampling: The population is divided into homogeneous groups (strata) and samples are then picked randomly from each stratum.
4. Cluster Sampling: In this method, the population is divided into internally heterogeneous groups (clusters), some clusters are selected at random, and the samples are picked from the selected clusters.
5. Area Sampling: Groups are created on the basis of area and samples are picked from them. This is used only when a small geographical area is taken into consideration.
6. Multi-stage Sampling: This is used when a large geographical area is taken into consideration. The area is divided into groups such as states. A few states are picked and divided into districts, then a few towns are selected from a few districts, and so on.

Non-probability Sampling Method

Under this method, the samples are selected purposefully and not all items have a chance of selection. The items that are more meaningful and closer to the subject are selected. The methods are further categorized as follows:

1. Convenience Sampling: In this method, the sampling is done on the basis of the convenience of the interviewer, and not all the available resources are taken into consideration.
2. Judgmental Sampling: In this method, the samples are selected purely at the discretion of the interviewer and depend on his or her judgment.
3. Quota Sampling: In this method, the population is divided into homogeneous groups. Each interviewer is allotted a quota, which he or she fills by selecting items from the groups at his or her own discretion.
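The first three probability methods can be illustrated with a short Python sketch. The population of 100 numbered items, the two strata, the sample sizes and all function names below are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Draw n items without replacement; every item has an equal chance."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, k, start=0):
    """Take every k-th item of an ordered list, beginning at index `start`."""
    return population[start::k]

def stratified_sample(strata, n_per_stratum, seed=0):
    """Draw a simple random sample of fixed size from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(items, n_per_stratum)
            for name, items in strata.items()}

# Hypothetical population: items numbered 1..100, split into two strata.
population = list(range(1, 101))
strata = {"low": list(range(1, 51)), "high": list(range(51, 101))}

srs = simple_random_sample(population, 10)
sys_sample = systematic_sample(population, 10, start=9)  # every 10th item
strat = stratified_sample(strata, 5)
```

With `start=9` the systematic sample picks the 10th, 20th, ... items of the list, which matches the "every 10th item" example in the text.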

2. (a) What precautions should be taken while using secondary data? (10)

Statistics is any systematic set of data, expressed in numeric terms, that can be analysed to produce useful results. As a subject, it deals with large bodies of data acquired for a particular analytical purpose. Statistical data may be of two types on the basis of its source: primary and secondary. Primary data is that which a person gathers first hand. Secondary data is that which is taken from other sources rather than gathered by oneself. For example, data published by the Government is primary data for the Government but secondary data for a person or organization that takes the published data for reference and analysis.

Since secondary data is gathered from another source, it is important to understand the objective and scope of the enquiry before gathering the data. We must ensure that the data taken is relevant and helpful to the analysis. The following precautions must be observed when using secondary data.

Reliability
It is important to note the source of the secondary data carefully. Reliability is the most important factor when considering secondary data. The data taken should be authentic; otherwise the results derived will be of no value.

Adequacy
We must properly understand the need and scope of the enquiry, as this helps us to determine which secondary data to consider. It is important to check whether the data under consideration fully satisfies the need of the enquiry and is adequate for the survey.

Suitability
We must also ensure that the secondary data we are considering suits the need of the enquiry. The data may have been prepared for a different enquiry and may have a presentation that differs from what the present enquiry requires.

2. (b) Why is tabulation important? Explain the requisites of a good statistical table. (10)

Statistical data are accumulated by means of surveys and are a combination of primary and secondary data. After the data is accumulated, it is edited and combined. It is also analysed and interpreted as required. Once the data is ready, it must be presented. The data can be presented in tabular, diagrammatic or graphical form. The process of presenting the data in tabular form is called tabulation.

Tabulation is a very important step of a statistical survey. Even for a diagrammatic or graphical representation, tabulation has to be done first. Tabulation involves arranging the data systematically in rows and columns. This simplifies the presentation, which otherwise involves a lot of complex and vast data. The data is divided and subdivided into columns and rows and presented with the totals and sub-totals of each group, which makes the relationship between related data easy to interpret. Since the table makes complex data simple and easy to understand, it also helps one to identify errors, if any. These tables are well titled and presented. Tabulation is an easy and cheap form of data presentation that also requires less time.

The basic requisites of a good statistical table are as follows:

1. The primary and most important feature of a good statistical table is that it should be simple and clear. It is therefore important to give the table a title and sub-heads.
2. The various numeric columns must be presented with totals and sub-totals.
3. The title and the various headings must be short and clear. This facilitates easy understanding of the data.
4. The table can be presented with brief captions and remarks, and it is advisable to number it for easy reference.
5. Tables accumulate data from various references. These references need to be clearly stated in the tabular presentation.
6. The data accumulated are analysed and various derived results, such as ratios, percentages and averages, are obtained for reference. These must also be presented in the table.
7. The ditto mark must be avoided.
8. Abbreviations are better avoided in a table. If they are needed, a proper footnote should be provided for them.
9. Columns that are related or are to be compared must be presented side by side.
10. The columns must be clearly separated by lines for easy understanding.
11. All the data must be properly aligned.

3. Differentiate between (a) Classification and Tabulation (5)

Although the two are closely related, the differences are as follows:

Classification
1. It is the basis for tabulation.
2. It is the basis for simplification.
3. Data is divided into groups and sub-groups on the basis of similarities and dissimilarities.

Tabulation
1. It is the basis for further analysis.
2. It is the basis for presentation.
3. Data is listed according to a logical sequence of related characteristics.

3. Differentiate between (b) Multiple Bar Diagram and Sub Divided Bar Diagram. (4)

Sometimes it is desired to represent more than one interrelated series of data on a bar diagram. In such cases a simple bar diagram is not suitable, and we have to use what is known as a multiple bar diagram. Here the number of bars for each year, region or zone is equal to the number of variables (data series) to be represented. For example, imports and exports will be represented by two bars; selling price, cost price and profits by three bars, and so on. Normally we do not take more than three bars because the diagram becomes complicated.

A simple bar diagram is used to present only one variable. But when a breakdown of a total or of a series of totals is to be represented, we have to use what is called a subdivided or component bar diagram, in which each bar shows a total and is divided into segments for its components. For example, we may like to represent the trend in sales of Super Bazaar in a big city like Delhi for (say) three years, divided into four zones: North, East, West and South.

3. Differentiate between (c) Mixed Graph and Range Graph (5)

A mixed graph is a type of historigram prepared for two dependent variables whose units of measurement are not the same. The values of these two dependent variables are represented on two different scales: one on the usual Y axis, and the other on a second Y axis drawn at the right-hand end of the horizontal axis.

Sometimes two extreme values (i.e. the maximum and the minimum values) are given for the dependent variable. The maximum and minimum temperature of a particular day is an example. The graph showing these two extreme values is called a range graph, because it shows the range in the values of the given data.

3. Differentiate between (d) Histogram and Historigram (5)

A histogram is a bar diagram which represents a frequency distribution with continuous classes. The width of each bar is equal to the class interval and the heights of the bars are proportional to the frequencies of the respective classes. Each rectangle adjoins the next so as to give a continuous picture. In the case of unequal class intervals there will be rectangles of unequal width. The heights are then adjusted proportionally (frequency divided by class width) so that the area of each rectangle remains proportional to the frequency of its class.

A Few Points to Remember
1. To draw a histogram from a discrete frequency distribution, consider the given values as the midpoints of continuous classes, assuming that the frequencies are uniformly distributed throughout each class.
2. If the class intervals are of the inclusive type, change the class limits into class boundaries.
3. Histogram and Historigram: The word histogram should not be confused with the term historigram, which stands for time series graphs. Graphs of continuous time series are known as historigrams. If the absolute value of the variable is taken into consideration, the graphs obtained are known as absolute historigrams. Historigrams may be constructed on the natural scale or on the ratio scale.
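The height adjustment for unequal class intervals can be sketched in Python as follows. The class limits and frequencies are hypothetical and the function name is illustrative; the point is only that height times width gives back the class frequency.

```python
def frequency_density(classes):
    """Return (lower, upper, height) per class, with height = frequency / width."""
    return [(lo, hi, f / (hi - lo)) for lo, hi, f in classes]

# Hypothetical distribution with unequal class intervals: (lower, upper, frequency).
classes = [(0, 10, 5), (10, 20, 8), (20, 40, 12), (40, 70, 6)]

densities = frequency_density(classes)

# Area of each rectangle = width * height, which recovers the class frequency.
areas = [(hi - lo) * height for lo, hi, height in densities]
```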

4. Find mean from the following distribution by step deviation method.

Class interval   15-25  25-35  35-45  45-55  55-65  65-75
Frequency        4      11     19     14     8      2

Solution:

C.I.        15-25  25-35  35-45  45-55  55-65  65-75  TOTAL
x           20     30     40     50     60     70
f           4      11     19     14     8      2      58
d = x - a   -20    -10    0      10     20     30
u = d/h     -2     -1     0      1      2      3
fu          -8     -11    0      14     16     6      17

Here, a = 40; h = 10

$$\bar{x} = a + \frac{\sum fu}{\sum f} \times h$$

= 40 + (17/58) × 10

= 40 + 2.93

= 42.93
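The step deviation calculation can be checked with a short Python sketch. It uses the midpoints and frequencies of the distribution with a = 40 and h = 10, and compares the result with the direct mean Σfx / Σf. The function and variable names are illustrative only.

```python
def step_deviation_mean(midpoints, freqs, a, h):
    """Mean = a + (sum of f*u / sum of f) * h, where u = (x - a) / h."""
    u = [(x - a) / h for x in midpoints]
    sum_fu = sum(f * ui for f, ui in zip(freqs, u))
    n = sum(freqs)
    return a + (sum_fu / n) * h

midpoints = [20, 30, 40, 50, 60, 70]   # class midpoints x
freqs = [4, 11, 19, 14, 8, 2]          # class frequencies f

mean = step_deviation_mean(midpoints, freqs, a=40, h=10)

# Direct method for comparison: sum of f*x divided by sum of f.
direct_mean = sum(f * x for f, x in zip(freqs, midpoints)) / sum(freqs)
```

Both methods give the same value, about 42.93, and any other choice of the assumed mean a would give the same result.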

5. Write short notes on (a) Dispersion (5)

The data values in a sample are not all the same. This variation between values is called dispersion. When the dispersion is large, the values are widely scattered; when it is small, they are tightly clustered. The width of diagrams such as dot plots, box plots and stem-and-leaf plots is greater for samples with more dispersion and vice versa. There are several measures of dispersion, the most common being the standard deviation. These measures indicate to what degree the individual observations of a data set are dispersed or 'spread out' around their mean. In manufacturing or measurement, high precision is associated with low dispersion.

5. Write short notes on (b) Mean Deviation (5)

The mean deviation (also called the mean absolute deviation) is the mean of the absolute deviations of a set of data about the data's mean. For a sample of size $N$, the mean deviation is defined by

$$MD = \frac{1}{N}\sum_{i=1}^{N} |x_i - \bar{x}|$$

where $\bar{x}$ is the mean of the distribution. The mean deviation of a list of numbers is implemented in Mathematica as MeanDeviation[data]. The mean deviation for a discrete distribution $P_i$ defined for $i = 1, 2, \ldots, N$ is given by

$$MD = \sum_{i=1}^{N} P_i \, |x_i - \bar{x}|$$

Mean deviation is an important descriptive statistic that is not frequently encountered in mathematical statistics. This is essentially because, while mean deviation has a natural intuitive definition as the "mean deviation from the mean," the introduction of the absolute value makes analytical calculations using this statistic much more complicated than those using the standard deviation

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^2}$$

As a result, least squares fitting and other standard statistical techniques rely on minimizing the sum of squared residuals instead of the sum of absolute residuals.
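As a small worked example, the mean deviation about the mean can be computed in Python as sketched below. The five data values are hypothetical and the function name is illustrative.

```python
def mean_deviation(data):
    """Mean of the absolute deviations of the values from their arithmetic mean."""
    mean = sum(data) / len(data)
    return sum(abs(x - mean) for x in data) / len(data)

# Hypothetical sample: the mean is 6, the absolute deviations are 4, 2, 0, 2, 4.
data = [2, 4, 6, 8, 10]
md = mean_deviation(data)   # 12 / 5 = 2.4
```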

5. Write short notes on (c) Standard Deviation (5)

Standard deviation is a measure of the spread or dispersion of a set of data. It is calculated by taking the square root of the variance and is symbolized by s.d. or s. In other words,

$$s = \sqrt{\text{variance}}$$

The more widely the values are spread out, the larger the standard deviation. For example, say we have two separate lists of exam results from a class of 30 students; one ranges from 31% to 98%, the other from 82% to 93%, then the standard deviation would be larger for the results of the first exam.
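The comparison of two sets of exam results can be sketched with Python's standard `statistics` module. The two short lists of marks below are hypothetical (eight marks each rather than thirty) and serve only to show that the more widely spread list has the larger standard deviation.

```python
import statistics

# Hypothetical exam results in percent: one widely spread, one tightly clustered.
wide = [31, 45, 58, 67, 74, 82, 90, 98]
narrow = [82, 84, 86, 87, 88, 90, 91, 93]

# pstdev treats the list as a whole population: square root of the population variance.
sd_wide = statistics.pstdev(wide)
sd_narrow = statistics.pstdev(narrow)
```

Here `pstdev` divides by the number of values; `statistics.stdev` would instead divide by one less than that, which is the usual choice when the data are a sample from a larger group.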

5. Write short notes on (d) Lorenz Curve (5)

The Lorenz curve is a curve derived from the cumulative percentages of the given variables. It was devised by Dr. Max O. Lorenz, a well-known economic statistician, who used it to study the distribution of wealth and income. It is a graphic method of studying dispersion and helps in studying the variability in different components of a distribution, especially an economic one. The basis of the Lorenz curve is that cumulative percentages are taken along the X and Y axes; joining the plotted points gives the Lorenz curve. The Lorenz curve is of much importance in comparing two series graphically, as it gives a clear visual view of the series to be compared.
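The cumulative percentages that are plotted can be computed as in the following Python sketch. The five income values are hypothetical and the function name is illustrative; the X value is the cumulative percentage of persons and the Y value the cumulative percentage of income.

```python
def lorenz_points(incomes):
    """Return (cumulative % of persons, cumulative % of income) pairs, from (0, 0)."""
    values = sorted(incomes)          # arrange from the lowest income to the highest
    total = sum(values)
    n = len(values)
    points = [(0.0, 0.0)]
    running = 0
    for i, value in enumerate(values, start=1):
        running += value
        points.append((100 * i / n, 100 * running / total))
    return points

# Hypothetical incomes of five persons.
incomes = [10, 20, 30, 40, 100]
points = lorenz_points(incomes)
```

The further the plotted points fall below the diagonal joining (0, 0) and (100, 100), the more unequal the distribution.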
