Prof. Dr. SAMSUBAR SALEH, M.Soc.Sc
: LECTURER, FEB-UGM
: PROFESSOR OF ECONOMICS, IV/c
: JL WELING II / 39 A, DEPOK-SLEMAN
: PUBLIC ECONOMICS, APPLIED STATISTICS, MACRO/MICRO ECONOMIC THEORY
WORK EXPERIENCE
: LECTURER, FEB-UGM
: ASSISTANT TO VICE DEAN III & SECRETARY OF THE ECONOMICS DEPARTMENT
: MANAGER OF THE FEB-UGM MASTER'S/DOCTORAL PROGRAMS
: SUPERVISOR, BSNP DIKNAS
: MANAGER OF ECONOMIC RESEARCH
: PLANNING/DEVELOPMENT CONSULTANT, DIY
: CONSULTANT & FACILITATOR FOR REGIONAL FINANCE
HP: 0811253224. Email: ssamsubar@yahoo.com
Text-Books
1. Lind & Marchal: Statistics for Business
2. Wonnacott: Statistics
3. Samsubar Saleh: Statistik Induktif
PROBABILITY
PROBABILITY IS A NUMERICAL MEASURE OF THE LIKELIHOOD THAT ONE OR MORE EVENTS WILL OCCUR AS THE RESULT OF AN EXPERIMENT.
THE CONCEPTS OF PROBABILITY ARE ESPECIALLY RELEVANT IN BUSINESS. BUSINESSMEN MAKE DECISIONS IN THE FACE OF UNCERTAINTY DAY BY DAY. FOR EXAMPLE:
1. WHEN A COMPANY LAUNCHES A NEW PRODUCT.
2. THE LIKELIHOOD OF A NEW INVESTMENT PROJECT BEING PROFITABLE MAY BE UNKNOWN.
3. ENGINEERING INSTALLATION DESIGN.
THEREFORE, DECISION MAKING HAS TO BE BASED ON SOME ASSESSMENT OF THE PROBABILITY OF THE POSSIBLE OUTCOMES OCCURRING.
EACH EXPERIMENT HAS TWO OR MORE POSSIBLE RESULTS.
OUTCOME: A PARTICULAR RESULT OF AN EXPERIMENT.
EVENT: A COLLECTION OF ONE OR MORE OUTCOMES OF AN EXPERIMENT.
TOSS 2 COINS: THE POSSIBLE OUTCOMES ARE HH, HT, TH AND TT. AN EVENT IS, FOR EXAMPLE, GETTING AT LEAST 1 TAIL.
ROLL 2 DICE: THERE ARE 36 POSSIBLE OUTCOMES.
2. DEPENDENT EVENTS: TWO EVENTS ARE DEPENDENT WHEN THE OCCURRENCE OF ONE EVENT AFFECTS THE PROBABILITY OF OCCURRENCE OF THE OTHER EVENT. P ( A and B ) = P ( A ) . P ( B/A ), WHERE P ( B/A ) IS THE PROBABILITY OF B GIVEN THAT A HAS OCCURRED.
Example
A BOX CONTAINS 10 RED BALLS, 6 WHITE BALLS AND 4 BLUE BALLS. ONE BALL IS SELECTED AT RANDOM 3 TIMES. WHAT IS THE PROBABILITY OF OBTAINING:
a. All balls are white ( with replacement ).
b. All balls are white ( without replacement ).
c. The 1st selection is red, the 2nd selection is red and the 3rd selection is blue ( without replacement ).
d. There are 2 red balls and 1 blue ball ( without replacement ).
e. There are 1 red, 1 white and 1 blue ball ( without replacement ).
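The five questions above can be checked with a short sketch. It uses exact fractions and the multiplication rule P(A and B) = P(A) . P(B/A); the variable names are illustrative only, and parts d and e are read as "in any order".

```python
from fractions import Fraction
from math import comb

RED, WHITE, BLUE = 10, 6, 4
TOTAL = RED + WHITE + BLUE  # 20 balls in the box

# a. three white balls with replacement: the three draws are independent
p_a = Fraction(WHITE, TOTAL) ** 3

# b. three white balls without replacement: each draw changes the box
p_b = (Fraction(WHITE, TOTAL)
       * Fraction(WHITE - 1, TOTAL - 1)
       * Fraction(WHITE - 2, TOTAL - 2))

# c. red, then red, then blue (this exact order), without replacement
p_c = (Fraction(RED, TOTAL)
       * Fraction(RED - 1, TOTAL - 1)
       * Fraction(BLUE, TOTAL - 2))

# d. two red and one blue in any order (assumed reading), without replacement
p_d = Fraction(comb(RED, 2) * comb(BLUE, 1), comb(TOTAL, 3))

# e. one red, one white and one blue in any order, without replacement
p_e = Fraction(RED * WHITE * BLUE, comb(TOTAL, 3))

for label, p in zip("abcde", (p_a, p_b, p_c, p_d, p_e)):
    print(f"{label}. {p} = {float(p):.4f}")
```

Counting combinations in d and e is equivalent to multiplying the ordered probability from c by the number of possible orderings.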
EXAMPLE
SCORES AND CLASSES

SCORE    CLASS X   CLASS Y   CLASS Z
A          10        13         8
B          15        16        11
C           8         7         6
F           2         4         5

1. P ( A or B or F )
2. P ( X or C )
4. P ( Z or A )
5. P ( C/X )
CONDITIONAL PROBABILITY
THE PROBABILITY OF A PARTICULAR EVENT OCCURRING, GIVEN THAT ANOTHER EVENT HAS ALREADY OCCURRED. IF P ( A and B ) = P ( A ) . P ( B/A ), THEN: P ( B/A ) = P ( A and B ) / P ( A )
Example
In lab A, there are 10 Acer computers, 5 Compact and 15 Toshiba. In lab B, there are 15 Acer computers, 10 Compact and 10 Toshiba. In lab C, there are 10 Acer, 10 Compact and 10 Toshiba.
A. If one computer is selected at random from each lab, what is the probability that all of the selected computers are Toshiba?
B. If one computer is selected at random from one lab, what is the probability that the computer is an Acer or a Compact?
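A minimal sketch of both parts follows. Part A multiplies the Toshiba share of each lab, since the three selections are independent. Part B assumes, as the slide does not say otherwise, that each of the three labs is equally likely to be the one chosen; the helper `share` is a hypothetical name.

```python
from fractions import Fraction

labs = {
    "A": {"Acer": 10, "Compact": 5, "Toshiba": 15},
    "B": {"Acer": 15, "Compact": 10, "Toshiba": 10},
    "C": {"Acer": 10, "Compact": 10, "Toshiba": 10},
}

def share(lab, *brands):
    """Probability that a random computer from `lab` is one of `brands`."""
    counts = labs[lab]
    return Fraction(sum(counts[b] for b in brands), sum(counts.values()))

# A. one computer from each lab, all three are Toshiba (independent draws)
p_all_toshiba = Fraction(1)
for lab in labs:
    p_all_toshiba *= share(lab, "Toshiba")

# B. pick one lab (assumed equally likely), then one computer from it
p_lab = Fraction(1, len(labs))
p_acer_or_compact = sum(p_lab * share(lab, "Acer", "Compact") for lab in labs)

print(p_all_toshiba, float(p_all_toshiba))
print(p_acer_or_compact, float(p_acer_or_compact))
```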
Example:2
Classes A, B, C and D have 35, 30, 40 and 25 students respectively. The examination records show that 10% of the students in class A failed, 5% in class B, 15% in class C and 20% in class D. One student is selected at random and is found to have done well in the exam. What is the probability that the student is from class B or D?
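This is a conditional probability question of the form P ( class / pass ). The sketch below works it through with exact fractions, assuming that "did well" means "did not fail"; all names are illustrative.

```python
from fractions import Fraction

students = {"A": 35, "B": 30, "C": 40, "D": 25}
fail_rate = {
    "A": Fraction(10, 100),
    "B": Fraction(5, 100),
    "C": Fraction(15, 100),
    "D": Fraction(20, 100),
}

total = sum(students.values())

# joint probability P(class and pass) = P(class) . P(pass / class)
joint_pass = {
    c: Fraction(n, total) * (1 - fail_rate[c]) for c, n in students.items()
}

# total probability of passing, summed over the four classes
p_pass = sum(joint_pass.values())

# conditional probability P(class / pass) = P(class and pass) / P(pass)
posterior = {c: joint_pass[c] / p_pass for c in students}

# B and D are mutually exclusive, so their probabilities add
p_b_or_d = posterior["B"] + posterior["D"]
print(p_b_or_d, round(float(p_b_or_d), 4))
```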
Mathematical Expectation
Finding the mean (expected value), variance and standard deviation of a probability distribution:
Mean: $\mu = \sum X_i \, P(X_i)$, where $X_i$ is a value of the variable X and $P(X_i)$ is the probability of $X_i$.
Variance: $\sigma^2 = \sum (X_i - \mu)^2 \, P(X_i)$ or $\sigma^2 = \sum X_i^2 \, P(X_i) - \mu^2$
Standard deviation: $\sigma = \sqrt{\sigma^2}$
Example
A family has five children. Calculate the expected value and standard deviation of the number of male children.
Step 1: find the possible values of Xi ( number of male children ).
Step 2: find the probability of each Xi.
Xi : ? P( Xi ): ?
Example
Quantity of X demanded   Probability
200 units                0.15
225                      0.20
250                      0.30
275                      0.25
300                      0.10
a. What is the mean of the quantity of X demanded?
b. What is the variance of the distribution?
c. When the price of X = $ 10 and AC = $ 8, what is the expected value of the profit?
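The mean and variance formulas can be applied to this table directly. The sketch below assumes that profit per unit is price minus average cost and that every unit demanded is sold; the variable names are illustrative.

```python
demand = [200, 225, 250, 275, 300]
prob = [0.15, 0.20, 0.30, 0.25, 0.10]

# a. mean: sum of Xi . P(Xi)
mean = sum(x * p for x, p in zip(demand, prob))

# b. variance: sum of (Xi - mean)^2 . P(Xi)
variance = sum((x - mean) ** 2 * p for x, p in zip(demand, prob))
std_dev = variance ** 0.5

# c. expected profit, assuming profit per unit = price - average cost
price, avg_cost = 10, 8
expected_profit = (price - avg_cost) * mean

print(f"mean = {mean:.2f} units")
print(f"variance = {variance:.4f}, std deviation = {std_dev:.2f}")
print(f"expected profit = $ {expected_profit:.2f}")
```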
Probability distributions
1. Discrete probability distribution: Binomial distribution:
a. In each trial only two outcomes are possible: success or failure, good or defective, pass or fail, girl or boy, odd number or even number.
b. Each trial is independent of the others.
c. The probability of an event is assumed to remain constant over all trials ( n trials ).
Binomial Distribution
The probability of obtaining the outcome success is denoted by $\pi$ and the outcome failure by q or $(1 - \pi)$. If we are interested in the number of successes ( x ) occurring in ( n ) trials, then the binomial probability distribution is given by the formula:
$P(x) = C^{n}_{x} \, \pi^{x} \, q^{\,n-x}$, where $C^{n}_{x} = \dfrac{n!}{x!\,(n-x)!}$
Example: The four engines of a commercial aircraft are designed so that they each operate independently. Tests carried out over a long period of time show that there is a one-in-a-hundred chance of in-flight failure of a single engine. What is the probability that on a given flight:
a. No failures occur?
b. No more than two failures occur?
c. At least two failures occur?
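A small sketch of the binomial formula applied to this example, with n = 4 engines and a failure probability of 0.01 per engine. The function name `binomial_pmf` and the symbol `p` for the probability of the counted outcome are illustrative choices.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x occurrences in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 4, 0.01  # four engines, each fails with probability 0.01

# a. no failures
p_none = binomial_pmf(0, n, p)

# b. no more than two failures: P(0) + P(1) + P(2)
p_at_most_two = sum(binomial_pmf(x, n, p) for x in range(0, 3))

# c. at least two failures: complement of P(0) + P(1)
p_at_least_two = 1 - binomial_pmf(0, n, p) - binomial_pmf(1, n, p)

print(f"a. {p_none:.8f}")
print(f"b. {p_at_most_two:.8f}")
print(f"c. {p_at_least_two:.8f}")
```

Using the complement in part c avoids summing the upper tail term by term, which matters more as n grows.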
BINOMIAL DISTRIBUTION
THE PROBABILITY THAT A STUDENT WILL PASS THE FINAL STATISTICS EXAM = 0.80. IF 6 STUDENTS ARE SELECTED AT RANDOM, WHAT IS THE PROBABILITY THAT:
a. At most 2 of them fail the exam.
b. At least 4 of them pass the exam.
Example
If 7 students are selected at random, what is the probability that:
A. More than 4 students will get score A or C.
B. At least 5 students will get F.
C. Less than 3 students will get B or C.
D. All of them will get A or B or F.
Example : 2
Suppose that a family has 5 children. What is the probability that:
a. At least 3 of them are male.
b. All of them are female.
c. At most one is male.
d. The number of females is more than 4.
Example : 3
Two dice are thrown 5 times. What is the probability that:
a. A total of 10 on both dice appears at most twice.
b. A total of 5 on both dice appears at least 4 times.
c. A total of 8 on both dice appears exactly 3 times.
The normal probability distribution has the following major characteristics:
1. It is bell-shaped and has a single peak.
2. The mean, median and mode are equal.
3. The location and spread of the normal distribution are determined by the mean $\mu$ and the standard deviation $\sigma$.
4. It falls off smoothly in either direction from $\mu$, and the curve gets closer and closer to the X axis but never actually touches it.
NORMAL DISTRIBUTION
1. $X_i \sim N(\mu, \sigma)$
2. $Z_i \sim N(0, 1)$
What is the difference in meaning between the first and the second expressions?
$Z_i = \dfrac{X_i - \mu}{\sigma}$
Zi MEASURES THE DISTANCE OF ANY PARTICULAR VALUE OF X FROM THE MEAN $\mu$, MEASURED IN UNITS OF THE STANDARD DEVIATION. THE SIGN OF Z MAY BE POSITIVE, NEGATIVE OR ZERO, DEPENDING ON THE DIFFERENCE BETWEEN THE Xi VALUE AND ITS MEAN ( $\mu$ ).
The z value is always measured starting from the mean, where Zi = 0, up to the zi value of the selected Xi ( the difference between the selected value of Xi and the mean, in units of the standard deviation ).
The scores of 250 students in a mathematics class follow the normal distribution, with a mean of 69 and a standard deviation of 10. What is the probability, or area under this normal curve, for scores:
a. Between 60 and 65.
b. 80
c. Between 64 and 80.
d. If the minimum passing grade is 55, how many students fail this exam?
e. What is the minimum score of the best 5%?
f. What is the maximum score of the 37.5% above the average?
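The areas asked for here are normally read from the Z table. As a cross-check, the sketch below uses `statistics.NormalDist` from the Python standard library. It covers the questions whose wording is complete. The last one is read, as an assumption, as the score that bounds an area of 0.375 measured upward from the mean.

```python
from statistics import NormalDist

scores = NormalDist(mu=69, sigma=10)
n_students = 250

# area between 60 and 65: difference of two cumulative probabilities
p_60_65 = scores.cdf(65) - scores.cdf(60)

# area between 64 and 80
p_64_80 = scores.cdf(80) - scores.cdf(64)

# expected number of students below the passing grade of 55
expected_fail = n_students * scores.cdf(55)

# minimum score of the best 5%: the 95th percentile
top5_cutoff = scores.inv_cdf(0.95)

# score bounding an area of 0.375 above the mean (assumed reading)
upper_375 = scores.inv_cdf(0.5 + 0.375)

print(f"P(60 < X < 65) = {p_60_65:.4f}")
print(f"P(64 < X < 80) = {p_64_80:.4f}")
print(f"students expected to fail = {expected_fail:.1f}")
print(f"minimum score of the best 5% = {top5_cutoff:.2f}")
print(f"upper bound of 37.5% above the mean = {upper_375:.2f}")
```

Results from a printed Z table may differ in the last digits because table look-ups round z to two decimals.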
Normal Distribution
An electrical firm manufactures light bulbs that have a length of life that is normally distributed with a mean of 800 hours and a variance of 1600 hours² ( standard deviation = 40 hours ). Find:
a. The probability that a bulb burns between 778 and 834 hours.
b. The probability that a bulb burns 850 hours or 775 hours.
c. The probability that a bulb burns between 720 and 780 hours.
d. The minimum length of life of the longest-lasting 12.5% of bulbs.
e. The minimum length of life for the 22% below the average.
MID-EXAM QUESTIONS
1. MEASURES OF CENTRAL TENDENCY: MEAN, MEDIAN AND MODE AND COEFFICIENT OF VARIATION.
2. MEASURES OF LOCATION: QUARTILES, DECILES AND PERCENTILES.
3. DISPERSION: VARIANCE, STANDARD DEVIATION AND COEFFICIENT OF VARIATION FOR UNGROUPED DATA.
The Normal approximation to the Binomial
The binomial distribution is practical when the number of trials is relatively small, but generating a binomial distribution for a large number of trials would be very time consuming. A more efficient approach is to apply the normal approximation to the binomial, determined as follows:
$\mu = n\pi$, $\sigma^2 = n\pi q$, then $Z = \dfrac{X_i - n\pi}{\sqrt{n\pi q}}$
Example
If 10 % of all business executives fill out a given marketing survey questionnaire and 200 questionnaires are distributed to executives, what is the probability that:
a. At least 15 questionnaires will come back?
b. Between 23 and 30 questionnaires will be filled out?
c. 17 questionnaires or fewer will be filled out?
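A sketch of the normal approximation for this example, with n = 200 and a response probability of 0.10. It applies a continuity correction of 0.5, which is an assumption here because the slides do not say whether one is expected, and it reads "between 23 and 30" as inclusive.

```python
from math import sqrt
from statistics import NormalDist

n, p = 200, 0.10             # questionnaires sent, response probability
mu = n * p                   # mean of the binomial
sigma = sqrt(n * p * (1 - p))  # standard deviation of the binomial
approx = NormalDist(mu, sigma)

# a. at least 15 returned: area to the right of 14.5
p_at_least_15 = 1 - approx.cdf(14.5)

# b. between 23 and 30 inclusive (assumed): area from 22.5 to 30.5
p_23_to_30 = approx.cdf(30.5) - approx.cdf(22.5)

# c. 17 or fewer: area to the left of 17.5
p_17_or_less = approx.cdf(17.5)

print(f"mu = {mu:.1f}, sigma = {sigma:.4f}")
print(f"a. {p_at_least_15:.4f}")
print(f"b. {p_23_to_30:.4f}")
print(f"c. {p_17_or_less:.4f}")
```

Without the continuity correction the boundaries would be 15, 23, 30 and 17 themselves, and the answers shift by a few hundredths.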
Extra-Bonus
A manufacturer produces 10,000 units of product daily. On the basis of frequent inspections it is discovered that 500 units of product are classified as defective. What is the probability that:
a. More than 525 units of product are defective.
b. At least 9450 units of product are good.
c. Less than 475 units of product are defective.
d. The number of good units is between 9450 and 9550.
1. Binomial distribution 2. Normal distribution 3. From Binomial to Normal. The quiz will be given on Tuesday next week. ( Don't forget to bring your calculator and statistics tables. )
Sampling Methods
REASONS TO SAMPLE
1. TO OBSERVE THE WHOLE POPULATION WOULD BE TIME CONSUMING AND VERY EXPENSIVE.
2. THE PHYSICAL IMPOSSIBILITY OF CHECKING ALL ITEMS IN THE POPULATION.
3. THE DESTRUCTIVE NATURE OF SOME TESTS.
4. THE SAMPLE RESULTS ARE ADEQUATE TO INFER THE CHARACTERISTICS OF THE POPULATION.
SAMPLING METHODS
1. SIMPLE RANDOM SAMPLE: A SAMPLE SELECTED SO THAT EACH ITEM IN THE POPULATION HAS THE SAME PROBABILITY OF BEING INCLUDED.
2. SYSTEMATIC RANDOM SAMPLE: A RANDOM STARTING POINT IS SELECTED, AND THEN EVERY kth MEMBER OF THE POPULATION IS SELECTED.
3. STRATIFIED RANDOM SAMPLE: A POPULATION IS DIVIDED INTO SUB-GROUPS, CALLED STRATA, AND A SAMPLE IS RANDOMLY SELECTED FROM EACH STRATUM.
4. CLUSTER SAMPLING: A POPULATION IS DIVIDED INTO CLUSTERS USING NATURALLY OCCURRING GEOGRAPHIC OR OTHER BOUNDARIES. THEN, CLUSTERS ARE RANDOMLY SELECTED AND A SAMPLE IS COLLECTED BY RANDOMLY SELECTING FROM EACH CLUSTER.
Sampling Methods
A SAMPLE IS A TOOL TO INFER SOMETHING ABOUT A POPULATION: STATISTICS ARE USED TO FIND OUT SOMETHING ABOUT A CHARACTERISTIC OF THE POPULATION, THAT IS, A PARAMETER.
THE SAMPLING DISTRIBUTION OF THE SAMPLE MEAN IS A PROBABILITY DISTRIBUTION OF ALL POSSIBLE SAMPLE MEANS OF A GIVEN SAMPLE SIZE ( n ). THE FOLLOWING EXAMPLE ILLUSTRATES THE CONSTRUCTION OF A SAMPLING DISTRIBUTION OF THE SAMPLE MEAN. LET US ASSUME THAT A POPULATION CONSISTS OF FOUR ELEMENTS X1 = 1, X2 = 2, X3 = 3 AND X4 = 4. CONSIDER ALL THE SAMPLES OF A GIVEN SIZE n = 2 THAT COULD BE DRAWN FROM THIS POPULATION. ( CONSIDER CAREFULLY WHETHER ONE IS SAMPLING WITH OR WITHOUT REPLACEMENT ).
Without Replacement
All possible samples   Sample means
1, 2                   1.5
1, 3                   2.0
1, 4                   2.5
2, 3                   2.5
2, 4                   3.0
3, 4                   3.5
$\mu_{\bar{x}} = (1.5 + 2.0 + \dots + 3.5) / 6 = 2.5$
$\sigma^2_{\bar{x}} = [(1.5 - 2.5)^2 + (2.0 - 2.5)^2 + \dots + (3.5 - 2.5)^2] / 6 = (1 + 0.25 + 0 + 0 + 0.25 + 1) / 6 = 2.5 / 6 = 0.417$
$\sigma_{\bar{x}} = \sqrt{0.417} = 0.645$
or $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}$
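The enumeration can be reproduced in a few lines. The sketch below lists every sample of size 2 drawn without replacement from the population 1, 2, 3, 4, and checks that the standard error obtained by enumeration matches the formula with the finite population correction. Variable names are illustrative.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev, pvariance

population = [1, 2, 3, 4]
N = len(population)
n = 2

# every sample of size n drawn without replacement (order ignored)
samples = list(combinations(population, n))
sample_means = [sum(s) / n for s in samples]

# mean and variance of the sampling distribution, by enumeration
mean_of_means = mean(sample_means)
var_of_means = pvariance(sample_means)
se_enumerated = sqrt(var_of_means)

# the same standard error from the formula with finite population correction
se_formula = pstdev(population) / sqrt(n) * sqrt((N - n) / (N - 1))

for s, m in zip(samples, sample_means):
    print(s, m)
print(f"mean = {mean_of_means}, variance = {var_of_means:.3f}")
print(f"standard error: {se_enumerated:.4f} (enumerated), {se_formula:.4f} (formula)")
```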
A population consists of three elements: X1 = 25, X2= 30 and X3 = 35. Calculate: a. Sampling distribution of the mean if the sample size = 2 ( with replacement and without replacement ). b. Variance and std deviation of the sample means.
The confidence interval for $\mu$ is estimated by: $\bar{X} \pm$ a sampling error, or $\bar{X} \pm Z \cdot SE$. The Z value depends on the confidence level. SE = standard error of the mean. $SE = \sigma / \sqrt{n}$ if the std deviation of the population is known. If the std deviation of the population is unknown, $SE = s / \sqrt{n}$, where s = std deviation of the sample.
Sampling error: The difference between a sample statistic and its corresponding population parameter. This error will decrease as the sample size increases. Confidence interval estimate: A range of values constructed from sample data so that the population parameter is likely to occur within that range at a specified probability. The specified probability is called the level of confidence.
The weights of 7 similar boxes of cereal are 9.8, 10.2, 10.4, 9.8, 10.0, 10.2 and 9.6 ounces.
Find a 90% confidence interval for the mean of all such boxes of cereal ( use the t table ).
Step 1: calculate the mean and std deviation of the sample.
Step 2: use the t table to find the critical value for a 90% confidence level ( $t_v$ ).
Step 3: substitute all the information from steps 1 and 2 into the formula: $\bar{X} - t_v \, (s / \sqrt{n}) \le \mu \le \bar{X} + t_v \, (s / \sqrt{n})$
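The three steps can be sketched as follows. The critical value 1.943 is taken from a standard t table for 6 degrees of freedom and 5% in each tail; it is hard-coded because the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

weights = [9.8, 10.2, 10.4, 9.8, 10.0, 10.2, 9.6]  # ounces
n = len(weights)

# step 1: sample mean and sample standard deviation (n - 1 in the divisor)
x_bar = mean(weights)
s = stdev(weights)

# step 2: critical t value, df = n - 1 = 6, 90% confidence (from a t table)
t_crit = 1.943

# step 3: interval = mean +/- t . s / sqrt(n)
margin = t_crit * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"mean = {x_bar:.2f}, s = {s:.4f}")
print(f"90% confidence interval: {lower:.3f} to {upper:.3f} ounces")
```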
Level of Confidence
Required level of confidence   Value of α   Value of Z ( table )
90 %                           10 %         1.65
95 %                           5 %          1.96
99 %                           1 %          2.58
Example: The prices at which a certain type of instant coffee was being sold on a given day were collected from a random sample of 45 shops around the country. The mean price was $ 1.95 with a standard deviation of $ 0.27. Compute an 80 % confidence interval for the population mean.
Confidence interval for a proportion: $P \pm Z_v \sqrt{P(1 - P) / n}$, where $Z_v$ is the Z table value for the chosen confidence level.
page: 300 no: 18
Application
The distribution of households that favor a certain bath soap in West Java Province ( use a 90% confidence level ).
Brand          Number of households
Palmolive      1500
Lux            1800
Zest           1500
Beauty         1000
Minty          1200
Other brands   2000
Calculate the confidence interval for the proportion of households that favor LUX bath soap for their families.
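A sketch of the interval for the Lux proportion, treating the 9000 households in the table as the sample. The Z value of 1.65 for 90% confidence is the one listed in the level-of-confidence table of these slides; the variable names are illustrative.

```python
from math import sqrt

households = {
    "Palmolive": 1500,
    "Lux": 1800,
    "Zest": 1500,
    "Beauty": 1000,
    "Minty": 1200,
    "Other brands": 2000,
}

n = sum(households.values())   # total households in the sample
p_hat = households["Lux"] / n  # sample proportion favoring Lux

z = 1.65                                 # 90% confidence, from the slides' table
se = sqrt(p_hat * (1 - p_hat) / n)       # standard error of a proportion
lower, upper = p_hat - z * se, p_hat + z * se

print(f"n = {n}, P = {p_hat:.3f}, SE = {se:.5f}")
print(f"90% confidence interval: {lower:.4f} to {upper:.4f}")
```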
4. Confidence interval for the population mean ( $\mu$ ) when the sample size is small ( n < 30 ): use the t table ( p: 722 )
The confidence interval for $\mu$ is estimated by: $\bar{X} \pm$ a sampling error, or $\bar{X} \pm t_{table} \cdot SE$, where $SE = s / \sqrt{n}$ ( the std deviation of the population is unknown ).
Example : The operating life of rechargeable cordless screwdrivers produced by a firm is assumed to be normally distributed. A sample of 15 screwdrivers is tested and the mean life is found to be 8900 hours, with a sample std deviation of 500 hours. Construct a 90% confidence interval estimate for the population mean.
Finding the t value
Example: n = 10, confidence level = 90%. n = 24, confidence level = 95%.
Degrees of freedom of the t value: df = n - 1
df   Two-tailed α   t value
9    10%            1.833
23   5%             2.069
Confidence interval for the difference between two means: $(\bar{X}_1 - \bar{X}_2) \pm Z_v \, \sigma_{\bar{x}}$
$\sigma_{\bar{x}}$ = pooled standard deviation.
$\bar{X}_1$ = the sample mean of X1
$\bar{X}_2$ = the sample mean of X2
n1 = the sample size of X1
n2 = the sample size of X2
Example
A study was made to estimate the difference in salaries of college professors in the private and state colleges of Virginia. A random sample of 100 professors in the private colleges showed an average salary of $ 15,000 per month with a standard deviation of $ 1200. A random sample of 200 professors in state colleges showed an average salary of $ 16,000 with a standard deviation of $ 1400. Find a 90% confidence interval for the difference between the average salaries of professors teaching in state and private colleges in Virginia.
A new program for a youth club is planned in a small city. To determine whether or not the program will get the city government's support, it is necessary to estimate the proportion of young people who plan to use the club's facilities. A survey of 100 randomly selected young people has shown that 22 will use the facilities if they become available. Construct a 90% confidence interval for the true proportion. n = 100, X = 22, P = 22/100 = 0.22 and ( 1 - P ) = 0.78. Z value for CL = 90 % = 1.65.
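For the salary example, the sketch below takes the standard error of the difference as the square root of var1/n1 + var2/n2, the same expression that appears in the Z test for two means later in these slides, and uses Z = 1.65 for 90% confidence. The dictionary layout is illustrative.

```python
from math import sqrt

private = {"n": 100, "mean": 15000, "sd": 1200}
state = {"n": 200, "mean": 16000, "sd": 1400}

# point estimate: state minus private average salary
diff = state["mean"] - private["mean"]

# standard error of the difference: sqrt(var1/n1 + var2/n2)
se = sqrt(private["sd"] ** 2 / private["n"] + state["sd"] ** 2 / state["n"])

z = 1.65  # 90% confidence, from the slides' table
lower, upper = diff - z * se, diff + z * se

print(f"difference = {diff}, SE = {se:.2f}")
print(f"90% confidence interval: $ {lower:.2f} to $ {upper:.2f}")
```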
Example:
The performance of stocks of Cement Industries and Oil Industries in 2007 ( in % ).
                 Cement Industries   Oil Industries
Average yield    22                  18
Variance         3                   2
Sample size      35                  40
Calculate the interval estimate of the difference in average yields between cement industries and oil industries ( CL = 85 % ).
a. Construct the regression line by using OLS.
b. Calculate R and R square.
c. If you scored 75 in the mid-exam, what is your predicted score in the final exam?
TESTING HYPOTHESES
( CH: 10 P 116-521)
HYPOTHESIS TESTING: A PROCEDURE BASED ON SAMPLE EVIDENCE TO DETERMINE WHETHER THE HYPOTHESIS IS A REASONABLE STATEMENT.
HYPOTHESIS: A STATEMENT ABOUT A POPULATION DEVELOPED FOR THE PURPOSE OF TESTING.
NULL HYPOTHESIS ( Ho ): SPECIFIES THE VALUE OF THE PARAMETER TO BE TESTED, e.g. $\mu$, $\pi$, ( $\mu_1 - \mu_2$ ); OR THE HYPOTHESIS THAT WE FORMULATE WITH THE HOPE OF REJECTING.
ALTERNATIVE HYPOTHESIS ( Ha ): THE HYPOTHESIS THAT WE FORMULATE WITH THE HOPE OF ACCEPTING.
TWO-TAILED TEST: USED WHEN THE QUESTION ASKS ABOUT A DIFFERENCE, CHANGE OR EFFECT. ( ALPHA IS DIVIDED BY 2, THAT IS, $\alpha / 2$ IN EACH TAIL )
ONE-TAILED TEST: USED WHEN THE QUESTION ASKS ABOUT GREATER THAN, LESS THAN, INCREASE, DECREASE, OR MORE THAN. THE CRITICAL REGION MAY BE IN THE LEFT OR IN THE RIGHT TAIL OF THE CURVE.
RESEARCH AT THE UGM INDICATES THAT 50% OF THE STUDENTS CHANGE THEIR MAJOR OF STUDY AFTER THEIR FIRST YEAR IN A PROGRAM. A RANDOM SAMPLE OF 100 STUDENTS IN THE BUSINESS PROGRAM REVEALED THAT 48% HAD CHANGED THEIR MAJOR AREA OF STUDY AFTER THEIR FIRST YEAR OF THE PROGRAM. Has there been a significant decrease in the proportion of students who change their major after the first year in this program? Test at the 0.05 level of significance.
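A sketch of this one-tailed test for a proportion. It assumes the hypotheses Ho: proportion = 0.50 against Ha: proportion < 0.50, since the question asks about a decrease, and it computes the standard error from the hypothesized value. Names such as `z_stat` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.50      # hypothesized proportion under Ho
p_hat = 0.48   # sample proportion
n = 100
alpha = 0.05

# test statistic: standard error is based on the hypothesized proportion
se = sqrt(p0 * (1 - p0) / n)
z_stat = (p_hat - p0) / se

# left-tail critical value, because Ha says "decrease"
z_crit = NormalDist().inv_cdf(alpha)

reject_h0 = z_stat < z_crit
print(f"z statistic = {z_stat:.2f}, critical value = {z_crit:.3f}")
print("reject Ho" if reject_h0 else "do not reject Ho")
```

With these numbers the statistic lies well inside the non-rejection region, so the sample does not show a significant decrease.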
Example:2
Brand of bath soap   Number of households
Palmolive            1000
Lux                  1500
Zest                 1000
Beauty               750
Minty                1250
Maya                 1000
Other brands         3500
a. Can you conclude that the proportion of households that like Lux is different from 14 %?
b. Can you conclude that the proportion of households that like Lux is higher than 14 %? ( Alpha = 15 % ).
Example
The gasoline consumption of a sample of 6 Honda Astrea can be reported as follows:
Sample   Range
1        54 km/l
2        53
3        56
4        52
5        50
6        55
The manufacturer claimed that the average gasoline consumption was 55 km/l. Can you conclude that this claim is overestimated? ( use alpha = 5 % ).
PARAMETRIC STATISTICS: 1. SPSS: PAIRED SAMPLES 2. EVIEWS: UNPAIRED SAMPLES: EQUALITY OF VARIANCE TEST 3. EXCEL: DATA ANALYSIS NON PARAMETRIC STATISTICS: SPSS : RELATED SAMPLES, WILCOXON SIGN RANK TEST.
$Z_h = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{var_1 / n_1 + var_2 / n_2}}$
EXAMPLE
THE DAILY PRODUCTION OF ELECTRONIC ITEMS IN COMPANY ABC:
MALE WORKERS:     450 UNITS   90 UNITS    40 WORKERS
FEMALE WORKERS:   460 UNITS   110 UNITS   50 WORKERS
CAN YOU CONCLUDE THAT FEMALE WORKERS' PRODUCTIVITY IS HIGHER THAN MALE WORKERS'? ( USE ALPHA = 10 % )
Testing ( $\mu_1 - \mu_2$ ), n < 30
Dr. Dony, a psychologist, administered IQ tests to determine if female FEB students were as smart as male students. The random sample of 15 females had a mean score of 131 with a std deviation of 17. The random sample of 13 male students had a mean of 126 and a std deviation of 14. At the 0.01 level of significance:
a. Is there a significant difference in their IQ?
b. Can you conclude that females' IQ is greater than males' IQ?
Productivity of Workers
Worker   After Training   Before Training   d     ( d - d̄ )   ( d - d̄ )²
1        235 units        228 units         7     2.4         5.76
2        210              205               5     0.4         0.16
3        231              219               12    7.4         54.76
4        242              240               2     -2.6        6.76
5        205              198               7     2.4         5.76
6        230              223               7     2.4         5.76
7        231              227               4     -0.6        0.36
8        210              215               -5    -9.6        92.16
9        225              222               3     -1.6        2.56
10       249              245               4     -0.6        0.36
Can you conclude that the training program increases the productivity of the employees? ( alpha = 5% ).
$\sum d = 46$
$\bar{d} = 46 / 10 = 4.6$
$S_D = \sqrt{\dfrac{\sum (d - \bar{d})^2}{n - 1}}$
$t = \dfrac{\bar{d}}{S_D / \sqrt{n}}$
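The paired-sample calculation for the training data can be sketched as follows. The critical value 1.833 is the one-tailed 5% t value for 9 degrees of freedom from a standard t table, hard-coded because the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

after = [235, 210, 231, 242, 205, 230, 231, 210, 225, 249]
before = [228, 205, 219, 240, 198, 223, 227, 215, 222, 245]

# paired differences: after minus before, worker by worker
d = [a - b for a, b in zip(after, before)]
n = len(d)

d_bar = mean(d)   # mean difference
s_d = stdev(d)    # standard deviation of the differences (n - 1 divisor)

# test statistic for Ho: mean difference <= 0 against Ha: mean difference > 0
t_stat = d_bar / (s_d / sqrt(n))

t_crit = 1.833    # one-tailed, alpha = 5%, df = 9 (from a t table)
reject_h0 = t_stat > t_crit

print(f"sum of d = {sum(d)}, mean d = {d_bar}, S_D = {s_d:.3f}")
print(f"t statistic = {t_stat:.3f}, critical value = {t_crit}")
print("reject Ho" if reject_h0 else "do not reject Ho")
```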
Example:2
Sample of customers   Score of: New menu   Old menu
1                     36                   35
2                     48                   46
3                     50                   51
4                     76                   74
5                     55                   56
6                     60                   59
7                     71                   72
8                     66                   64
Can you conclude that the new menu is more delicious than the old one? ( alpha = 1% ).
Chi-square tests are used in a procedure that involves the comparison of the differences between the observed sample frequencies ( Oij ) and the hypothetical or theoretical population frequencies ( Eij ) ( expected values ). ( Goodness of Fit ). They can also be used to test the relationship between variables ( independence test ). The critical value of $\chi^2$ depends on the number of rows and columns.
The chi-square test is always one-tailed, in the right side of the curve. The critical value is $\chi^2$ with df = ( r - 1 )( c - 1 ). Ho is accepted if the calculated test statistic is less than or equal to the critical value. Ho is rejected if the calculated test statistic is greater than the critical value.
Chi-Square test
Test statistic: $\chi^2_h = \sum \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$
Oij = observed frequencies in the ith row & jth column
Eij = expected frequencies in the ith row & jth column
EXAMPLE
A GARMENT COMPANY IN CAKUNG RECORDS THE PERFORMANCE OF ITS LABOUR PRODUCTIVITY RANDOMLY.
                    THE LEVEL OF PRODUCTIVITY
WORK-SHIFT          LOW   MODERATE   HIGH
MORNING (I)         40    45         50
AFTERNOON (II)      60    55         60
NIGHT (III)         40    30         25
FROM THE DATA ABOVE, WHAT IS YOUR CONCLUSION? ( ALPHA = 5 % ).
Color of bath soap:   Pink   White   Yellow
Gender: Man           10     10      20
        Woman         25     20      5
What is your conclusion? ( Use Alpha = 5 % )
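For the garment company data, a sketch of the chi-square calculation is shown below. It assumes the intended test is one of independence between work shift and productivity level, with expected frequencies computed as row total times column total over the grand total. The critical value 9.488 is the 5% chi-square table value for 4 degrees of freedom.

```python
observed = {
    "Morning": [40, 45, 50],     # low, moderate, high
    "Afternoon": [60, 55, 60],
    "Night": [40, 30, 25],
}

rows = list(observed.values())
row_totals = [sum(r) for r in rows]
col_totals = [sum(c) for c in zip(*rows)]
grand_total = sum(row_totals)

# chi-square statistic: sum over all cells of (O - E)^2 / E
chi2 = 0.0
for i, row in enumerate(rows):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total  # expected frequency
        chi2 += (o - e) ** 2 / e

df = (len(rows) - 1) * (len(col_totals) - 1)
chi2_crit = 9.488  # alpha = 5%, df = 4 (from a chi-square table)
reject_h0 = chi2 > chi2_crit

print(f"chi-square = {chi2:.3f}, df = {df}, critical value = {chi2_crit}")
print("reject Ho" if reject_h0 else "do not reject Ho")
```

The same loop works for the bath soap table by replacing the `observed` dictionary and looking up the critical value for its degrees of freedom.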
ANALYSIS of VARIANCE
This method can be used to test the difference among population means when the number of samples is more than two ( k > 2 ). The steps of testing the hypothesis:
1. Calculate the variance within samples ( $\sigma^2_w$ ).
2. Calculate the variance between or among the samples ( $\sigma^2_b$ ).
3. Find the critical F value from the F table.
4. Calculate the F statistic.
ANALYSIS OF VARIANCE
SAMPLE   FERTILIZER: A   B        C        D
1        10 kgs          12 kgs   11 kgs   9 kgs
2        12              11       10       10
3        13              10       9        8
4        11              10       10       8
5        14              12       10       10
Can you conclude that the average productivity of those fertilizers is significantly different? ( alpha = 5% ).
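The ANOVA steps can be sketched for the fertilizer data as follows. The variance between samples and the variance within samples are computed as mean squares, and the critical value 3.24 is the 5% F table value for 3 and 16 degrees of freedom. Variable names are illustrative.

```python
from statistics import mean

yields = {
    "A": [10, 12, 13, 11, 14],
    "B": [12, 11, 10, 10, 12],
    "C": [11, 10, 9, 10, 10],
    "D": [9, 10, 8, 8, 10],
}

k = len(yields)                                  # number of samples
n_total = sum(len(v) for v in yields.values())   # total observations
grand_mean = mean([x for v in yields.values() for x in v])

# variation between samples: group means around the grand mean
ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in yields.values())

# variation within samples: observations around their own group mean
ss_within = sum((x - mean(v)) ** 2 for v in yields.values() for x in v)

ms_between = ss_between / (k - 1)        # variance between samples
ms_within = ss_within / (n_total - k)    # variance within samples
f_stat = ms_between / ms_within

f_crit = 3.24  # alpha = 5%, df = (3, 16), from an F table
reject_h0 = f_stat > f_crit

print(f"variance between = {ms_between:.4f}, variance within = {ms_within:.4f}")
print(f"F statistic = {f_stat:.3f}, critical value = {f_crit}")
print("reject Ho" if reject_h0 else "do not reject Ho")
```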