
RESEARCH METHODOLOGY : MFM SEM II GROUP 10

Group: 10
Roll No.: 8, 17, 24, 41, 55
Name: Sarvesh Desai, Pooja Gupta, Nilesh Jadhav, Rupesh Phalke, Venugopalan Swaminathan

RM Assignment: RM 5

Q1 Differentiate between following,
1. Parameter and statistic.
2. Level of significance and level of confidence.
3. Null and Alternate hypothesis.
4. Type-I and type-II error.
5. One-tailed and two-tailed test of hypothesis.
6. Testing of hypothesis and estimation.
7. Point estimate and interval estimate.
8. Parametric and non-parametric test of hypothesis.
9. Z-test and t-test of hypothesis.
10. Test of goodness of fit and test of independence, under chi-square test.
11. 1-way ANOVA and 2-way ANOVA.
12. Test of confirmation and test of comparison.

Solution:

Q 1.1 Parameter and statistic
Parameter
1 A parameter describes a full population.
2 A parameter is a property of the underlying population distribution.
3 As the sample becomes large, the sample mean approaches the population mean, which is a parameter.
Statistic
1 A statistic describes a sample.
2 A statistic is "a function of a sample/observation."
3 The sample mean is a statistic.

Q 1.2 Level of significance and level of confidence
Level of significance
1 It indicates the likelihood that the answer will fall outside the stated range.
2 A 1% significance level means a 99% confidence level.
Level of confidence
1 It is the expected % of times that the actual value will fall within the stated precision limits.
2 A 95% confidence level means 95 chances in 100 that the sample represents the true condition.

Q 1.3 Null and alternate hypothesis
Null hypothesis
1 H0: The finding occurred by chance.
2 The null hypothesis is assumed to be true unless we find evidence to the contrary.
Alternate hypothesis
1 H1: The finding did not occur by chance.
2 If we find that the evidence is just too unlikely given the null hypothesis, we assume the alternative hypothesis is more likely to be correct.

Q 1.4 Type I and type II error
Type I error
1 Means rejection of a hypothesis which should have been accepted.
2 Denoted by alpha.
3 Can be controlled by fixing it at a lower level.
Type II error
1 Means accepting a hypothesis which should have been rejected.
2 Denoted by beta.
3 It depends on the type I error.

Q 1.5 One-tailed and two-tailed test of hypothesis
One-tailed hypothesis
1 The rejection area lies on one side of the distribution only.
Two-tailed hypothesis
1 The rejection area lies on both sides of the distribution.

Q 1.6 Testing of hypothesis and estimation
Testing of hypothesis
1 Hypothesis testing is carried out to test an assumed criterion.
Estimation
1 Population parameters are unknown, so they have to be estimated from the sample.

Q 1.7 Point estimate and interval estimate
Point estimate
1 The estimate of a population parameter may be one single value or it could be a range. A point estimate, as the name suggests, is the estimation of the population parameter with one number.
Interval estimate
1 A single estimate of the parameter is not sufficient. It is necessary to analyse how confident we can be about this particular estimate. One way of doing this is to define confidence intervals. If we have estimated q, we want to know whether the true parameter is close to our estimate. In other words, we want to find an interval (a, b) that contains the true parameter with a chosen probability: P(a < q < b) = 1 - alpha.
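To make the contrast in Q 1.7 concrete, here is a minimal Python sketch that returns both a point estimate and an interval estimate of a population mean. The sample values and the function name `mean_confidence_interval` are illustrative assumptions, and the multiplier z = 1.96 assumes a large-sample normal approximation at roughly 95% confidence; for a small sample like this one a t critical value would normally be used instead.

```python
import math
import statistics


def mean_confidence_interval(sample, z=1.96):
    """Return (point_estimate, (lower, upper)) for the population mean.

    z=1.96 is an assumed normal critical value for about 95% confidence.
    """
    n = len(sample)
    point = statistics.mean(sample)                  # point estimate: one number
    std_error = statistics.stdev(sample) / math.sqrt(n)
    return point, (point - z * std_error, point + z * std_error)


# Hypothetical sample, used only for illustration.
sample = [12, 15, 11, 14, 13, 16, 12, 15]
point, (low, high) = mean_confidence_interval(sample)
print(f"point estimate = {point}")
print(f"interval estimate = ({low:.2f}, {high:.2f})")
```

The point estimate is a single number, while the interval reports a range together with the confidence attached to it.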



Q 1.8 Parametric and non-parametric test of hypothesis
Parametric test of hypothesis
1 The observations must be independent.
2 The observations must be drawn from normally distributed populations.
3 These populations must have the same variances.
4 The means of these normal and homoscedastic populations must be linear combinations of effects due to columns and/or rows.
Non-parametric test of hypothesis
1 Observations are independent.
2 The variable under study has underlying continuity.

Q 1.9 Z-test and t-test
Z-test
1 A Z-test is a statistical hypothesis test whose test statistic follows a normal distribution.
2 A Z-test is appropriate when you are handling moderate to large samples (n > 30).
3 A Z-test will often require certain conditions to be reliable.
4 Z-tests are less commonly used than t-tests.
t-test
1 A t-test follows Student's t-distribution.
2 A t-test is appropriate when you are handling small samples (n < 30).
3 A t-test is more adaptable than a Z-test.
4 t-tests are more commonly used than Z-tests.

Q 1.10 Test of goodness of fit and test of independence, under chi-square test
Test of goodness of fit
1 A goodness-of-fit test is a one-variable chi-square test.
2 The goal of a chi-square goodness-of-fit test is to determine whether a set of frequencies or proportions is similar to, and therefore fits with, a hypothesized set of frequencies or proportions.
3 A chi-square goodness-of-fit test is similar to a one-sample t-test.
4 It determines if a sample is similar to, and representative of, a population.
Test of independence
1 A test of independence is a two-variable chi-square test.
2 The goal of a two-variable chi-square test is to determine whether or not the first variable is related to, or independent of, the second variable.
3 A two-variable chi-square test, or test of independence, is similar to the test for an interaction effect in ANOVA.
4 It asks: is the outcome in one variable related to the outcome in some other variable?

Q 1.11 1-way ANOVA and 2-way ANOVA
1-way ANOVA
1 The purpose of one-way ANOVA is to verify whether the data collected from different sources converge on a common mean.
2 One-way ANOVA finds out whether the groups carried out the same procedures in conducting research. It is used in the comparison of treatment means.
2-way ANOVA
1 The purpose of two-way ANOVA is to verify whether the data collected from different sources converge on a common mean based on two categories of defining characteristics.
2 This involves the introduction of a randomized block design. The experiment conducted in the case of two-way ANOVA normally gets split into many mini experiments. In short, two-way ANOVA is employed for a design with two or more treatment means, which can be called a factorial design.

Q 1.12 Test of confirmation and test of comparison
Test of confirmation
Test of comparison
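As an illustration of the two chi-square tests contrasted in Q 1.10, the sketch below computes the chi-square statistic by hand for a one-variable goodness-of-fit case and for a two-variable test of independence. The function names and the frequency tables are assumptions made for this example; comparing the statistic with a tabulated critical value is left out.

```python
def chi_square_goodness_of_fit(observed, expected):
    """Chi-square statistic for one variable: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a two-way table.

    Expected frequency of each cell = row total * column total / grand total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical frequencies, for illustration only.
gof = chi_square_goodness_of_fit([20, 30, 50], [25, 25, 50])
stat, dof = chi_square_independence([[10, 20], [20, 10]])
print(gof, stat, dof)
```

The goodness-of-fit version needs a hypothesized set of expected frequencies, whereas the independence version derives the expected frequencies from the row and column totals of the observed table.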



Q2 State whether following statements are true or false, giving reasons,
1) Level of significance is type-I error.
2) In 1-way ANOVA, we need all samples to be of equal size.
3) Point estimate is often insufficient because it is either right or wrong.
4) In Z distribution, area contained between +/- 3 * standard deviation is equal to 100%.
5) In fixing critical value of t, we need to specify level of significance or degrees of freedom or one/two tailed.
6) All tests of hypothesis are repetitive and hence universal.
7) If the test fails to support null hypothesis, it also indicates why test fails.
8) (1 - beta error) is called power of test.
9) 1% level of significance gives greater confidence to decision maker than 5% level of significance.
10) In 1-way ANOVA, if F calculated is lesser than 1, it means the factor which differentiates columns is the strong reason explaining variation in data.
11) If all data values are increased by 5, ANOVA inference drawn earlier will change.
12) Client is supposed to give beta error to researcher in advance.
13) In chi-square test, we want to confirm whether chi-square value is zero or not.
14) Level of significance is rejection area under the sampling distribution beyond critical value of test statistic.
15) Good hypothesis can result into type-II error only.
16) Alternate hypothesis can decide whether test is one tailed or two tailed in case of large sample Z test.
17) Randomised block experimental design results into one-way ANOVA.
18) Difference between sample statistic and population parameter is always significant.
19) We use chi-square test of goodness of fit on nominal data 2-way classified.
20) Latin square experimental design will lead to 3-way ANOVA.

Solution:

Q 2.1 Level of significance is type-I error.
Answer: TRUE
Reason: The level of significance is the likelihood of rejecting the hypothesis even though it is true, which is the Type-I error.

Q 2.2 In 1-way ANOVA, we need all samples to be of equal size.
Answer: FALSE
Reason: Not necessary. 1-way ANOVA can be carried out for unequal sample sizes also.

Q 2.3 Point estimate is often insufficient because it is either right or wrong.
Answer: TRUE
Reason: A point estimate gives one value, which can be right or wrong, whereas an interval gives a range against which the answer can be checked.

Q 2.4 In Z distribution, area contained between +/- 3 * standard deviation is equal to 100%.
Answer: FALSE
Reason: In the Z distribution, the area contained between +/- 3 SD is about 99.73%.

Q 2.5 In fixing critical value of t, we need to specify level of significance or degrees of freedom or one/two tailed.
Answer: TRUE
Reason: To fix the critical value of t, we need to specify the level of significance (LOS), the degrees of freedom (DOF), and whether the test is one- or two-tailed.
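The area quoted in Q 2.4 can be checked numerically. This is a small sketch using the standard identity P(|Z| < k) = erf(k / sqrt(2)) for a standard normal variable; the function name `area_within` is an assumption introduced here.

```python
import math


def area_within(k):
    """Area under the standard normal curve between -k and +k SD."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within +/-{k} SD: {area_within(k) * 100:.2f}%")
```

The area within +/- 3 SD is close to, but never equal to, 100%, because the normal curve extends indefinitely in both directions.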



Q 2.6 All tests of hypothesis are repetitive and hence universal.
Answer: TRUE
Reason: When the sample changes, we need to repeat the test of hypothesis.

Q 2.7 If the test fails to support null hypothesis, it also indicates why test fails.
Answer: FALSE
Reason: No. It does not tell why the test fails.

Q 2.8 (1 - beta error) is called power of test.
Answer: TRUE
Reason: Beta is the probability of a type-II error, in which a false H0 is accepted. (1 - beta) is therefore the probability of correctly rejecting a false H0, which is the power of the test.

Q 2.9 1% level of significance gives greater confidence to decision maker than 5% level of significance.
Answer: TRUE
Reason: 1% LOS corresponds to a 99% confidence level, which is greater than the 95% confidence level given by 5% LOS.

Q 2.10 In 1-way ANOVA, if F calculated is lesser than 1, it means the factor which differentiates columns is the strong reason explaining variation in data.
Answer: FALSE
Reason: F is the ratio of the variance between columns to the variance within columns. F less than 1 means the variation between columns is smaller than the variation within them, so the column factor is not a strong reason explaining the variation in data.

Q 2.11 If all data values are increased by 5, ANOVA inference drawn earlier will change.
Answer: FALSE
Reason: Adding the same constant to every value shifts all the means equally and leaves the variances, and hence F, unchanged.

Q 2.12 Client is supposed to give beta error to researcher in advance.
Answer: TRUE
Reason: The researcher should know the client's expected success rate.

Q 2.13 In chi-square test, we want to confirm whether chi-square value is zero or not.
Answer: TRUE

Q 2.14 Level of significance is rejection area under the sampling distribution beyond critical value of test statistic.
Answer: TRUE
Reason: LOS indicates the % of the sampling distribution of the test statistic that falls in the rejection area.

Q 2.15 Good hypothesis can result into type-II error only.
Answer: TRUE
Reason: Here a false H0 is accepted, indicating that failures are accepted, hence a good hypothesis.

Q 2.16 Alternate hypothesis can decide whether test is one tailed or two tailed in case of large sample Z test.
Answer: TRUE
Reason: Alternate hypothesis tells the ...

Q 2.17 Randomised block experimental design results into one-way ANOVA.
Answer: FALSE
Reason: It is the completely randomized (CR) design that results in one-way ANOVA.

Q 2.18 Difference between sample statistic and population parameter is always significant.
Answer: FALSE
Reason: Suppose the population has a seasonality factor. If the sampling is not done in the proper way, the sample statistic and the population parameter can be different.

Q 2.19 We use chi-square test of goodness of fit on nominal data 2-way classified.
Answer: TRUE



Q 2.20 Latin square experimental design will lead to 3-way ANOVA.
Answer: TRUE
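Two of the ANOVA statements in Q2 (2.2 on unequal sample sizes and 2.11 on adding 5 to every value) can be checked with a small hand computation of the one-way F ratio. This is a sketch under assumed data; `one_way_anova_f` is a hypothetical helper, not a library function.

```python
def one_way_anova_f(groups):
    """F ratio for one-way ANOVA: mean square between / mean square within."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical samples of unequal size (3, 4 and 3 observations).
groups = [[4, 5, 6], [6, 7, 8, 9], [9, 10, 11]]
shifted = [[x + 5 for x in g] for g in groups]

f_original = one_way_anova_f(groups)
f_shifted = one_way_anova_f(shifted)
print(f_original, f_shifted)
```

The computation runs with unequal group sizes, and the F ratio is the same before and after the shift, so the inference does not change.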



Q3 State whether following statements are true or false, giving reasons
1. Partial correlation analysis is same as multiple correlation analysis.
2. If byx = 0.8, bxy = - 0.2, hence r = - 0.4.
3. If byx = 0.8, bxy = 1.6, hence r = 1.13.
4. byx and bxy must be less than 1, always.
5. y = a + bx: this equation can be used to estimate value of x for a given value of y always.
6. If two regression lines are perpendicular to each other, correlation coefficient is 1.
7. If r = 0.7, amount of variation in y because of x is 70%.
8. Coefficient of determination can be negative sometimes.
9. If one variable is constant, correlation between x and y is positive perfect.
10. If coefficient of determination is less, stronger will be relationship between x and y.
11. Coefficient of indetermination and standard error of estimate are same in concepts.
12. Variance and co-variance mean the same thing.
13. If correlation coefficient between x and y is 0.90, this definitely proves that relationship is always causal.
14. If two regression lines coincide, coefficient of correlation is always +1.
15. Intersection of two regression lines is the mean of each variable.

Solution:

Q 3.1 Partial correlation analysis is same as multiple correlation analysis.
Answer: FALSE
Reason: Partial correlation measures the effect of one independent variable on the dependent variable, whereas multiple correlation takes into account two independent variables and one dependent variable.

Q 3.2 If byx = 0.8, bxy = - 0.2, hence r = - 0.4.
Answer: FALSE
Reason: r = square root of (byx * bxy), and sqrt(0.8 * 0.2) = sqrt(0.16) = 0.4 in magnitude. However, byx, bxy and r always carry the same sign, so byx = 0.8 and bxy = - 0.2 cannot occur together.

Q 3.3 If byx = 0.8, bxy = 1.6, hence r = 1.13.
Answer: FALSE
Reason: sqrt(0.8 * 1.6) = sqrt(1.28) = 1.13, but r cannot exceed 1, so these two regression coefficients cannot belong to the same data.

Q 3.4 byx and bxy must be less than 1, always.
Answer: FALSE
Reason: Only the product byx * bxy (= r squared) must not exceed 1. One coefficient can be greater than 1 if the other is small enough.

Q 3.5 y = a + bx: this equation can be used to estimate value of x for a given value of y always.
Answer: FALSE
Reason: y = a + bx is the regression line of y on x and is used to estimate y for a given x. To estimate x for a given y, the regression line of x on y is used.

Q 3.6 If two regression lines are perpendicular to each other, correlation coefficient is 1.
Answer: FALSE
Reason: When the two regression lines are perpendicular, r = 0. The lines coincide when r = +/- 1.

Q 3.7 If r = 0.7, amount of variation in y because of x is 70%.
Answer: FALSE
Reason: The explained variation is given by r squared = 0.49, i.e. 49%.
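The checks behind Q 3.2 to Q 3.4 follow from the relation r squared = byx * bxy. Below is a minimal sketch of that relation with the two consistency conditions made explicit; the function name `correlation_from_regression` and its error messages are assumptions of this example.

```python
import math


def correlation_from_regression(byx, bxy):
    """Return r from the two regression coefficients.

    r^2 = byx * bxy, and r takes the common sign of byx and bxy.
    Raises ValueError when the pair cannot come from real data.
    """
    product = byx * bxy
    if product < 0:
        raise ValueError("byx and bxy must have the same sign")
    if product > 1:
        raise ValueError("byx * bxy cannot exceed 1, since |r| <= 1")
    r = math.sqrt(product)
    return r if byx >= 0 else -r


print(correlation_from_regression(0.8, 0.2))    # both positive
print(correlation_from_regression(-0.8, -0.2))  # both negative
print(correlation_from_regression(1.6, 0.5))    # one coefficient above 1 is fine
```

Calling the function with (0.8, -0.2) or (0.8, 1.6) raises an error, which mirrors why the statements in Q 3.2 and Q 3.3 cannot hold.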



Q 3.8 Coefficient of determination can be negative sometimes.
Answer: TRUE
Reason: Negative values of R2 may occur when fitting non-linear trends to data.

Q 3.9 If one variable is constant, correlation between x and y is positive perfect.
Answer: FALSE

Q 3.10 If coefficient of determination is less, stronger will be relationship between x and y.
Answer: FALSE

Q 3.11 Coefficient of indetermination and standard error of estimate are same in concepts.
Answer: FALSE

Q 3.12 Variance and co-variance mean the same thing.
Answer: FALSE

Q 3.13 If correlation coefficient between x and y is 0.90, this definitely proves that relationship is always causal.
Answer: FALSE

Q 3.14 If two regression lines coincide, coefficient of correlation is always +1.
Answer: FALSE
Reason: When r = +/- 1, there is an exact linear relationship between X and Y and the two regression lines coincide with each other, so r may be -1 as well as +1.

Q 3.15 Intersection of two regression lines is the mean of each variable.
Answer: TRUE
Reason: The two regression lines always intersect each other at the point (mean of X, mean of Y).
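The claim in Q 3.15 can be verified on a small data set by fitting both regression lines with least squares and solving for their intersection. The data and the helper `fit_line` are assumptions made for illustration.

```python
def fit_line(u, v):
    """Least-squares line v = a + b * u; returns (a, b)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    b = (sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
         / sum((ui - mean_u) ** 2 for ui in u))
    return mean_v - b * mean_u, b


# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a_yx, b_yx = fit_line(x, y)  # regression of y on x: y = a_yx + b_yx * x
a_xy, b_xy = fit_line(y, x)  # regression of x on y: x = a_xy + b_xy * y

# Substitute the first line into the second and solve for x.
x_star = (a_xy + b_xy * a_yx) / (1 - b_xy * b_yx)
y_star = a_yx + b_yx * x_star

print((x_star, y_star), (sum(x) / len(x), sum(y) / len(y)))
```

The intersection point agrees with the pair of means. The division by (1 - b_xy * b_yx) also shows why the intersection is not a single point when r = +/- 1: the two lines then coincide.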



Q4 Explain importance of following in statistical analysis (under what circumstances will you recommend following in analyzing data collected?)
1. Mode as measure of central tendency.
2. Coefficient of variation.
3. Interquartile range.
4. Measures of skewness and kurtosis.
5. Syx: standard error of estimate of y because of x.
6. Coefficient of determination (r2).
7. Co-variance in bivariate analysis.
8. Interval estimate.
9. Classification, tabulation, presentation of data.
10. Frequency curve and histogram.
11. Correlation and regression analysis.
12. Yule's coefficient of association.

Solution:

1) Mode as measure of central tendency.
The mode is the most frequently occurring value in the data set. The mode in a distribution is that item around which there is maximum concentration. In general, the mode is the size of the item which has the maximum frequency. For example, in the data set {1,2,3,4,4}, the mode is equal to 4. A data set can have more than a single mode, in which case it is multimodal. In the data set {1,1,2,3,3} there are two modes: 1 and 3.
The mode can be very useful for dealing with categorical data. For example, if a sandwich shop sells 10 different types of sandwiches, the mode would represent the most popular sandwich. The mode can also be used with ordinal, interval, and ratio data. However, in interval and ratio scales, the data may be spread thinly with no data points having the same value. In such cases, the mode may not exist or may not be very meaningful.

2) Coefficient of variation
The coefficient of variation measures variability in relation to the mean (or average) and is used to compare the relative dispersion in one type of data with the relative dispersion in another type of data. The data to be compared may be in the same units, in different units, with the same mean, or with different means.
Suppose you want to evaluate the relative dispersion of grades for two classes of students: Class A and Class B. The coefficient of variation can be used to compare these two groups and determine how the grade dispersion in Class A compares to the grade dispersion in Class B. This is one example of how the coefficient of variation can be applied.
The coefficient of variation is a calculation built on other calculations, the standard deviation and the mean, as follows:

CV = (standard deviation / mean) x 100

This reads as "the coefficient of variation is equal to the standard deviation divided by the mean, multiplied by 100" (to produce a percentage). The steps required for calculating the coefficient of variation are:
1. Calculate the mean for the data set.
2. Calculate the standard deviation.
3. Divide the standard deviation by the mean.
4. Multiply the result of step 3 by 100.

3) Interquartile range.
The interquartile range (IQR) is the distance between the 75th percentile and the 25th percentile. The IQR is essentially the range of the middle 50% of the data. Because it uses the middle 50%, the IQR is not affected by outliers or extreme values. The IQR is also equal to the length of the box in a box plot.

4) Measures of skewness and kurtosis
Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A distribution, or data set, is symmetric if it looks the same to the left and right of the center point. For univariate data Y1, Y2, ..., YN, a commonly used formula for skewness is:

skewness = [ sum over i of (Yi - Ybar)^3 / N ] / s^3

where Ybar is the mean, s is the standard deviation, and N is the number of data points. The skewness for a normal distribution is zero, and any symmetric data should have a skewness near zero. Negative values for the skewness indicate data that are skewed left and positive values for the skewness indicate data that are skewed right. By skewed left, we mean that the left tail is long relative to the right tail. Similarly, skewed right means that the right tail is long relative to the left tail. Some measurements have a lower bound and are skewed right. For example, in reliability studies, failure times cannot be negative.
Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution. That is, data sets with high kurtosis tend to have a distinct peak near the mean, decline rather rapidly, and have heavy tails. Data sets with low kurtosis tend to have a flat top near the mean rather than a sharp peak. A uniform distribution would be the extreme case. For univariate data Y1, Y2, ..., YN, a commonly used formula for kurtosis is:

kurtosis = [ sum over i of (Yi - Ybar)^4 / N ] / s^4

where Ybar is the mean, s is the standard deviation, and N is the number of data points.
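The measures in items 1 to 4 can be computed together in a few lines. The sketch below is one possible reading, not the only one: it assumes the population standard deviation (divisor N) throughout, the "inclusive" quartile method of Python's `statistics.quantiles`, and a hypothetical helper named `describe`. Other conventions (sample standard deviation, other quartile rules) give slightly different numbers.

```python
import statistics


def describe(data):
    """Mode(s), coefficient of variation, IQR, skewness and kurtosis."""
    n = len(data)
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)  # population SD (divisor N), an assumption
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    m3 = sum((y - mean) ** 3 for y in data) / n
    m4 = sum((y - mean) ** 4 for y in data) / n
    return {
        "modes": statistics.multimode(data),
        "cv_percent": sd / mean * 100,
        "iqr": q3 - q1,
        "skewness": m3 / sd ** 3,
        "kurtosis": m4 / sd ** 4,
    }


print(describe([1, 1, 2, 3, 3]))            # two modes, as in the text
print(describe([2, 4, 4, 4, 5, 5, 7, 9]))   # mean 5, population SD 2
```

For the second data set the coefficient of variation works out to (2 / 5) x 100 = 40%, following the four steps listed above.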

5) Syx: standard error of estimate of y because of x.
Let us consider yest as the estimated value of y for a given value of x. This estimated value can be obtained from the regression curve of y on x. From this, the measure of the scatter about the regression curve is supplied by the quantity:

Syx = square root of [ sum of (y - yest)^2 / N ]

The above equation is called the Standard Error of Estimate of y on x. It is important to note that this Standard Error of Estimate has properties analogous to those of standard deviation.

6) Coefficient of determination (r2)
The coefficient of determination, r2, is useful because it gives the proportion of the variance (fluctuation) of one variable that is predictable from the other variable. It is a measure that allows us to determine how certain one can be in making predictions from a certain model/graph. The coefficient of determination is the ratio of the explained variation to the total variation. The coefficient of determination is such that 0 <= r2 <= 1, and denotes the strength of the linear association between x and y.
For example, if r = 0.922, then r2 = 0.850, which means that 85% of the total variation in y can be explained by the linear relationship between x and y (as described by the regression equation). The other 15% of the total variation in y remains unexplained.
The coefficient of determination is a measure of how well the regression line represents the data. If the regression line passes exactly through every point on the scatter plot, it would be able to explain all of the variation. The further the line is away from the points, the less it is able to explain.

7) Co-variance in bivariate analysis
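As a worked illustration for items 5 to 7, the sketch below fits the regression line of y on x and then reports the covariance, the standard error of estimate and the coefficient of determination for a small data set. The data and the helper `regression_summary` are assumptions of this example, and the divisor N (rather than N - 1 or N - 2) is used throughout to match the Syx formula above.

```python
import math


def regression_summary(x, y):
    """Covariance, standard error of estimate Syx and r^2 for y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)       # total variation in y
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    y_est = [intercept + slope * a for a in x]
    unexplained = sum((b - e) ** 2 for b, e in zip(y, y_est))
    return {
        "covariance": sxy / n,
        "syx": math.sqrt(unexplained / n),
        "r2": 1 - unexplained / syy,              # explained / total variation
    }


# Hypothetical paired observations.
summary = regression_summary([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(summary)
```

Here r2 comes out at 0.6, meaning 60% of the total variation in y is explained by the line and 40% remains unexplained. The covariance only indicates the direction in which x and y move together, in the product of their units.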

8) Interval estimate.
An interval estimate is defined by two numbers, between which a population parameter is said to lie. For example, a < mu < b is an interval estimate of the population mean mu. It indicates that the population mean is greater than a but less than b.

9) Classification, tabulation, presentation of data
Tabulation refers to the systematic arrangement of the information in rows and columns. Rows are the horizontal arrangement and columns the vertical arrangement. In simple words, tabulation is a layout of figures in rectangular form with appropriate headings to explain different rows and columns. The main purpose of the table is to simplify the presentation and to facilitate comparisons.
"A statistical table is a systematic organisation of data in columns and rows."
"Tabulation involves the orderly and systematic presentation of numerical data in a form designed to elucidate the problem under consideration."

10) Frequency curve and histogram
A frequency curve is obtained by joining the points of a frequency polygon by a freehand smoothed curve. Unlike the frequency polygon, where the points are joined by straight lines, we join the points freehand in order to get a smoothed frequency curve. It is used to remove the ruggedness of the polygon and to present it in a good form or shape. We smoothen the angularities of the polygon only, without making any basic change in the shape of the curve. In this case also the curve begins and ends at the base line, as in the case of the polygon. The area under the curve must remain almost the same as in the case of the polygon.
A histogram is a way of summarising data that are measured on an interval scale (either discrete or continuous). It is often used in exploratory data analysis to illustrate the major features of the distribution of the data in a convenient form. It divides up the range of possible values in a data set into classes or groups. For each group, a rectangle is constructed with a base length equal to the range of values in that specific group, and an area proportional to the number of observations falling into that group. This means that the rectangles might be drawn of non-uniform height. The histogram is only appropriate for variables whose values are numerical and measured on an interval scale. It is generally used when dealing with large data sets (>100 observations), when stem and leaf plots become tedious to construct. A histogram can also help detect any unusual observations or any gaps in the data set.

11) Correlation and regression analysis
Regression analysis is the mathematical process of using observations to find the line of best fit through the data in order to make estimates and predictions about the behaviour of the variables. This line of best fit may be linear (straight) or curvilinear, following some mathematical formula.
Correlation analysis is the process of finding how well (or badly) the line fits the observations, such that if all the observations lie exactly on the line of best fit, the correlation is considered to be 1 or unity.

12) Yule's coefficient of association
In order to find the degree of intensity of association between two or more sets of attributes, we should work out the coefficient of association. Professor Yule's coefficient of association is:

QAB = {(AB)(ab) - (Ab)(aB)} / {(AB)(ab) + (Ab)(aB)}

where
QAB = Yule's coefficient of association between attributes A & B
(AB) = Frequency of class AB, in which A & B are present
(Ab) = Frequency of class Ab, in which A is present & B is absent
(aB) = Frequency of class aB, in which A is absent & B is present
(ab) = Frequency of class ab, in which both A & B are absent
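The formula for QAB translates directly into code. The following is a minimal sketch with made-up class frequencies; the function and argument names are assumptions chosen to mirror the notation above.

```python
def yule_q(AB, Ab, aB, ab):
    """Yule's coefficient of association from the four class frequencies."""
    concordant = AB * ab
    discordant = Ab * aB
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical 2 x 2 attribute frequencies.
print(yule_q(AB=30, Ab=10, aB=10, ab=30))  # positive association
print(yule_q(AB=20, Ab=20, aB=20, ab=20))  # no association
print(yule_q(AB=10, Ab=30, aB=30, ab=10))  # negative association
```

The coefficient ranges from -1 to +1, with 0 indicating that the two attributes are independent. It is undefined when both products are zero, a case the sketch does not guard against.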



RM Assignment: RM 6

Q1 Differentiate between following
1. Completely randomized (CR) and randomized block (RB) experimental design.
2. Stratified sampling and cluster sampling.
3. Sampling and non-sampling errors.
4. Probability and non-probability sampling.
5. Survey and experiment.
6. Simple random sampling and systematic sampling.
7. Nominal data and ratio data.
8. Exploratory and diagnostic research.
9. Validity and reliability in attitude measurement.
10. Bias and error in research.
11. Structured and un-structured interview.
12. Latin square and factorial experimental design.
13. Principle of randomizing and principle of replication.
14. Multi-stage sampling and multi-phase sampling.
15. Informal experimental and formal experimental design.

Solution:

Q 1.1 Completely randomized (CR) and randomized block (RB) experimental design
Completely randomized (CR)
1 It is a simpler design than RB.
2 Involves 2 principles, viz. the principle of replication and the principle of randomization.
3 Subjects are randomly assigned to experimental treatments.
4 Is analysed by 1-way ANOVA.
Randomized block (RB)
1 It is an improvement over CR.
2 The principle of local control can be applied along with the other two principles of experimental design.
3 Subjects are divided into groups (blocks) such that within each group the subjects are relatively homogeneous in respect to some other variable.
4 Is analysed by 2-way ANOVA.

Q 1.2 Stratified sampling and cluster sampling
Stratified sampling
1 If a population from which a sample is to be drawn does not constitute a homogeneous group, the stratified sampling technique is used.
2 Generally used to obtain a representative sample.
3 The sampling population is divided into several sub-populations (strata) that are individually more homogeneous than the total population, and then items are selected from each stratum for sampling.
4 Sample size ni = {n x Ni x si} / {N1 x s1 + N2 x s2 + ... + Nk x sk}
5 High cost required.
6 More precise.
Cluster sampling
1 For bigger samples, divide the area into a number of smaller non-overlapping areas and then randomly select a number of these smaller areas (clusters).
2 The sample is divided into clusters, which may themselves consist of smaller clusters.
3 Low cost involved.
4 Less precise.

Q 1.3 Probability and non-probability sampling
Probability sampling
1 Also known as random sampling or chance sampling.
2 Every item of the universe has an equal chance of inclusion in the sample.
3 Probability is 1/NCn.
Non-probability sampling
1 Also known as deliberate sampling.
2 Organisers of the inquiry purposively choose the particular units of the universe for constituting a sample, on the basis that the small mass that they so select out of a huge one will be typical or representative of the whole.
3 Example: quota sampling, which has no probability basis.

Q 1.4 Survey and experiment
Survey
1 Surveys are conducted in case of descriptive research studies.
2 Larger samples are used.
3 Normally used for social & behavioural sciences.
4 Example: field research.
Experiment
1 The process of examining the truth of a statistical hypothesis relating to some research problem is known as an experiment. There are two types, absolute & comparative.
2 Experiments are part of experimental research studies.
3 Small samples are used.
4 Used to measure the effects of an experiment which the researcher conducts intentionally.
5 Example: laboratory research.

Q 1.5 Simple random sampling and systematic sampling
Simple random sampling
1 Just a random sample.
2 Every entity from the universe may become part of the sample.
3 Low cost.
Systematic sampling
1 A systematic selection logic is defined in order to have better control on the sample.
2 High cost is involved.

Q 1.6 Nominal data and ratio data
Nominal data
1 Simply a system of assigning number symbols to events in order to label them.
2 Convenient for keeping track.
3 Only the mode is a measure of central tendency.
4 Widely used in surveys.
Ratio data
1 Has an absolute or true zero of measurement.
2 Represents the actual amounts of variables.
3 Geometric or harmonic means are used as measures of central tendency.
4 Used for physical measurement.

Q 1.7 Exploratory and diagnostic research
Exploratory research
1 This is carried out for exploring new ideas with support.
2 This is general research leading to surveys.
3 Low to moderate cost compared to diagnostic research.
Diagnostic research
1 This is carried out for diagnosing a certain problem.
2 This is extensive research that involves in-depth study and statistical tools.
3 High cost compared to exploratory research.

Q 1.8 Validity and reliability in attitude measurement
Validity in attitude measurement
Reliability in attitude measurement

Q 1.9 Bias and error in research
Bias in research
1 This may impact the results of the research.
2 This is attitude related.
Error in research
1 This impacts the results of the research a lot.
2 This is system related.

Q 1.10 Structured and un-structured interview
Structured interview
1 Involves a set of predetermined questions.
2 Highly standardised techniques of recording.
3 Rigid procedure to interview.
4 Question order is usually fixed.
Un-structured interview
1 Questions are not fixed.
2 Normal standards for recording.
3 Freedom to conduct the interview.
4 Question sequence may be changed.

Q 1.11 Latin square and factorial experimental design
Latin square
1 Very frequently used in agricultural research.
2 Assumes that there is no interaction between the row factor & column factor.
3 The number of rows & columns is required to be equal.
4 Accuracy is low compared to factorial design.
Factorial experimental design
1 Used in experiments where the effects of varying more than one factor are to be determined.
2 There is interaction between row & column entities.
3 More complex problems are looked at, with multiple rows and columns.
4 Provides equivalent accuracy with less labour and as such is a source of economy.

Q 1.12 Principle of randomizing and principle of replication
Principle of randomizing
Principle of replication

Q 1.13 Multi-stage sampling and multi-phase sampling
Multi-stage sampling
1 It is a further development of cluster sampling.
2 Easier to administer.
3 A large number of units can be sampled for a given cost under multi-stage sampling.
Multi-phase sampling



Q 1.14 Informal experimental and formal experimental design
Informal experimental design
1 Of 3 types:
  Before & after without control design
  After only with control design
  Before & after with control design
2 Less sophisticated.
3 Based on differences of magnitude.
Formal experimental design
1 Of 4 types:
  Completely randomized design (CR)
  Randomized block design (RB)
  Latin square design (LS)
  Factorial design
2 Offers more control.
3 Uses precise statistical procedures for analysis.
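The sampling schemes compared in Q 1.2 and Q 1.5 can be sketched in a few lines of Python. Everything named here (the helper functions, the population of 100 numbered units, the stratum sizes and standard deviations) is an assumption made for illustration. The stratum allocation follows the sample size formula given under stratified sampling, and the systematic sampler assumes the population size is a multiple of the sample size.

```python
import random


def simple_random_sample(population, n, rng):
    """Every unit has the same chance of selection."""
    return rng.sample(population, n)


def systematic_sample(population, n, rng):
    """Pick a random start, then every k-th unit (k = N // n)."""
    k = len(population) // n
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]


def stratum_sample_sizes(n, sizes, sds):
    """Allocate n across strata in proportion to Ni * si."""
    weights = [size * sd for size, sd in zip(sizes, sds)]
    total = sum(weights)
    return [n * w / total for w in weights]


rng = random.Random(42)            # fixed seed so the sketch is repeatable
population = list(range(100))      # hypothetical numbered units

srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)
allocation = stratum_sample_sizes(30, sizes=[100, 200], sds=[2, 1])
print(srs, sys_sample, allocation)
```

In the allocation example the smaller stratum receives as many units as the larger one because its variability is twice as high, which is the point of weighting by Ni x si rather than by Ni alone.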

Q2 Justify following statements 1. Quota sampling is a non-probability sampling. 2. We don t need hypothesis firmed up in diagnostic research. 3. Wording of questionnaire can cause ineffective instrument. 4. In Latin square experimental design it is assumed that factors are independent of each other. 5. Stratified sampling method assumes strata to be homogeneous within and heterogeneous between. 6. Convenience sampling is a method of probability sampling. 7. Semantic differential scale requires identifying bi-polar adjectives describing the object. 8. Likert scale is a summative model for attitude measurement. 9. Principle of replication in experimental design is aimed at increasing statistical accuracy 10. Principle of local control in experimental design is identifying effect of known source of variation in data. 11. Non-sampling errors cannot be totally avoided in research. 12. Word association test is a projective method of data collection. 13. Defining the problem involves in identifying unit of analysis and characteristic of interest, time and space references and environmental conditions. 14. Projective methods of data collection are used for inferred characteristics 15. On ordinal data, we can do all mathematical operations. 16. Optimal sample size is based on degree of accuracy and level of confidence expected. 17. Cluster sampling needs each cluster to be homogeneous between and heterogeneous within. 18. Systematic sampling is not truly probability sampling. 19. Parameters of quality data are same whether it is primary data or secondary data. 20. We firm up hypothesis based on exploratory, descriptive and diagnostic research. Solution: 1) Quota sampling is a non-probability sampling. The first step in non-probability quota sampling is to divide the population into exclusive subgroups. Then, the researcher must identify the proportions of these subgroups in the population; this same proportion will be applied in the sampling process. 
Finally, the researcher selects subjects from the various subgroups while taking into consideration the proportions noted in the previous step. This final step is intended to make the sample representative of the population with respect to the chosen subgroups. It also allows the researcher to study traits and characteristics that are noted for each subgroup. Because subjects within each subgroup are chosen by the researcher's judgment rather than by random selection, the probability of any unit being selected is not known; hence quota sampling is called non-probability sampling.

2) We don't need hypothesis firmed up in diagnostic research. Diagnostic research (DR) aims to identify the causes of a problem and its possible solutions.

3) Wording of questionnaire can cause ineffective instrument. Consistent wording and ordering of questions ensure that each respondent receives the same stimulus; otherwise the purpose of the survey will not be served.

4) In Latin square experimental design it is assumed that factors are independent of each other. A Latin square is used in experimental designs in which one wishes to compare treatments and to control for two other known sources of variation. It was recognized that within a field there would be fertility trends running both across the field and up and down the field. So in an experiment to test, say, four different fertilizers, A, B, C and D, the field would be divided into four horizontal strips and four vertical strips, thus producing 16 smaller plots. A Latin square design gives a random allocation of fertilizer type to plots in such a way that each fertilizer type is used once in each horizontal strip (row) and once in each vertical strip (column).

5) Stratified sampling method assumes strata to be homogeneous within and heterogeneous between.

6) Convenience sampling is a method of probability sampling. The statement is incorrect: convenience sampling is a non-probability sampling technique in which subjects are selected because of their convenient accessibility and proximity to the researcher.

7) Semantic differential scale requires identifying bi-polar adjectives describing the object.
Yes. The semantic differential is a type of rating scale designed to measure the connotative meaning of objects, events, and concepts.

8) Likert scale is a summative model for attitude measurement. Likert (1932) developed the principle of measuring attitudes by asking people to respond to a series of statements about a topic, in terms of the extent to which they agree with them, and so tapping into the cognitive and affective components of attitudes. The scores on the individual statements are added to give an overall attitude score, which is why the scale is called summative.

9) Principle of replication in experimental design is aimed at increasing statistical accuracy. Measurements are usually subject to variation and uncertainty. Measurements are repeated and full experiments are replicated to help identify the sources of variation, to better estimate the true effects of treatments, to further strengthen the experiment's reliability and validity, and to add to the existing knowledge about the topic.[13] However, certain conditions must be met before the replication of the experiment is commenced: the original research question has been published in a peer-reviewed journal or widely cited, the researcher is independent of the original experiment, the researcher must first try to replicate the original findings using the original data, and the write-up should state that the study conducted is a replication study that tried to follow the original study as strictly as possible.

10) Principle of local control in experimental design is identifying effect of known source of variation in data. Local control refers to grouping the experimental units in such a way that the units within a group (i.e., block) are more homogeneous than units in different groups. The experimental materials or conditions are more alike within a group. Thus, the variation among experimental units within a group is less than it would have been without grouping.

11) Non-sampling errors cannot be totally avoided in research. Non-sampling errors are part of the total error that can arise in a statistical analysis; the remainder of the total error arises from sampling error. Unlike sampling error, non-sampling error is not reduced by increasing the sample size. Unfortunately, it is virtually impossible to eliminate non-sampling errors entirely.

12) Word association test is a projective method of data collection. In a word association test, an individual is given a clue or hint and asked to respond with the first thing that comes to mind. The association can take the shape of a picture or a word, and there can be many interpretations of the same thing. A list of words is given, and the respondent does not know which words the researcher is most interested in.

13) Defining the problem involves identifying unit of analysis and characteristic of interest, time and space references and environmental conditions.
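As an illustration of the Latin square allocation described under statement 4 (four fertilizers A, B, C and D on a 4 x 4 grid of plots), the following is a minimal Python sketch. The function names `latin_square` and `is_latin` are hypothetical, and the method shown (shuffling the rows and columns of a cyclic square) is only one simple way to randomize; it does not draw uniformly from all possible Latin squares.

```python
import random

def latin_square(treatments, seed=None):
    """Return a randomized Latin square as a list of rows (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(treatments)
    # Start from the cyclic square: row i is the treatment list shifted by i.
    square = [[treatments[(i + j) % n] for j in range(n)] for i in range(n)]
    # Permuting whole rows and whole columns preserves the Latin property.
    rng.shuffle(square)
    cols = list(range(n))
    rng.shuffle(cols)
    return [[row[c] for c in cols] for row in square]

def is_latin(square):
    """Check that every treatment appears exactly once in each row and column."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return len(symbols) == n and rows_ok and cols_ok

# Allocate four fertilizers to the 16 plots of the field.
layout = latin_square(["A", "B", "C", "D"], seed=1)
for row in layout:
    print(" ".join(row))
```

Each printed row corresponds to a horizontal strip of the field and each column to a vertical strip, so both fertility trends are balanced across the four fertilizers.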
14) Projective methods of data collection are used for inferred characteristics. Projective methods rest on the premise that an individual imposes structure on an ambiguous situation in a way that is consistent with their own conscious and unconscious needs.

15) On ordinal data, we can do all mathematical operations. The statement is incorrect. The experimental (scientific) method depends on physically measuring things, and the concept of measurement has been developed in conjunction with the concepts of numbers and units of measurement. Statisticians categorize measurements according to levels, and each level corresponds to how the measurement can be treated mathematically. Ordinal data is the second level of measurement: values can be ranked, but the differences between ranks are not defined, so operations such as addition and averaging are not meaningful.

16) Optimal sample size is based on degree of accuracy and level of confidence expected.

17) Cluster sampling needs each cluster to be homogeneous between and heterogeneous within. A common motivation for cluster sampling is to reduce the average cost per interview. Given a fixed budget, this can allow an increased sample size.

18) Systematic sampling is not truly probability sampling. Systematic sampling is still thought of as being random, as long as the periodic interval is determined beforehand and the starting point is random. For example, if you wanted to select a random group of 1,000 people from a population of 50,000 using systematic sampling, you would select every 50th person, since 50,000/1,000 = 50.

19) Parameters of quality data are same whether it is primary data or secondary data. Data collected from first-hand experience is known as primary data. Primary data has not been published yet and is generally considered more reliable, authentic and objective. Because primary data has not been changed or altered by others, its validity is taken to be greater than that of secondary data. The review of literature in any research is based on secondary data, mostly from books, journals and periodicals.

20) We firm up hypothesis based on exploratory, descriptive and diagnostic research.
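The systematic sampling example under statement 18 (1,000 people from a population of 50,000, interval 50) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `systematic_sample` is hypothetical, the population is represented by ID numbers 1 to 50,000, and the population size is assumed to be at least the sample size.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick every k-th unit after a random start (hypothetical helper)."""
    rng = random.Random(seed)
    k = len(population) // sample_size  # sampling interval, fixed beforehand
    if k < 1:
        raise ValueError("sample_size must not exceed the population size")
    start = rng.randrange(k)            # random starting point within the first interval
    return [population[start + i * k] for i in range(sample_size)]

# Assumed population: people identified by the numbers 1..50,000.
people = list(range(1, 50001))
sample = systematic_sample(people, 1000, seed=42)

print(len(sample))             # number of people selected
print(sample[1] - sample[0])   # gap between consecutive selected IDs
```

Only the starting point is random; once it is drawn, every later selection is determined by the interval, which is the reason the statement questions whether the method is fully probabilistic.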
