
******EIDC Stata Lab Session: Introduction******

*Getting your DATA into Stata


//This command sets the working directory to the STATA_session folder, from which
//the data files are loaded throughout the do-files.

cd "C:\Users\Vivek\Desktop\STATA_session"
use school_a

*DESCRIBE command: lists each variable name along with its storage type
*and variable label
describe

*Each element of data is either of type numeric or of type string. The word
*real is sometimes used in place of numeric. Each data type has associated storage types.
*Numbers are stored as byte, int, long, float, or double, with the default being float.
*byte, int, and long are integer types: they can hold only integers.
*Strings are stored as str#, for instance, str1, str2, str3, ..., str2045, or as strL.
*The number after str indicates the maximum length of the string.
*A str5 could hold the word male, but not the word female, because female has six characters.
*Storage requirements: byte=1 byte, int=2 bytes, long and float=4 bytes, double=8 bytes.
*Digits of accuracy: float=7 digits, long=9 digits, and double=16 digits.
*Few people have data that is accurate to 1 part in 10 to the 7th.
*Among the exceptions are banks, which keep records accurate to the penny on amounts
*of billions of INR.
*If you are dealing with such financial data, store your currency amounts as doubles.
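
*A small illustration of the precision limits described above (a sketch; the
*numbers are illustrative and are not taken from the session data).
*float() rounds its argument to float precision, so values that need more than
*about 7 digits are not represented exactly.
display %12.0f float(16777217)   //shows 16777216: float cannot hold this integer exactly
display 0.1 == float(0.1)        //shows 0 (false): 0.1 as a double differs from 0.1 as a float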

//Describing certain variables only (all variables from totalteacher through classroom_num)

describe totalteacher-classroom_num

//COMPRESS command: attempts to reduce the amount of memory used by your
//data by demoting variables to smaller storage types and coalescing duplicate strL values
compress

//nocoalesce option: specifies that compress not try to find duplicate values
//within strL variables

compress, nocoalesce

*SUMMARIZE command: a basic summarize command gives you the number of observations,
*mean, standard deviation, minimum, and maximum
summarize school_type

sum school_type school_id

sum totalteacher, detail //detail adds the variance, skewness, kurtosis,
//and percentiles of the variable

sum totalteacher-classroom_num //summarizes all variables from totalteacher through classroom_num

*TABULATE command: gives the frequency, percentage, and cumulative percentage of each value

tab school_type

tab school_type electricity //Cross-tabulation giving counts by type of school
//and electricity facility
tab school_type electricity, row //row percentages: for each type of school,
//the share of schools with and without electricity
tab school_type electricity, col //column percentages: within each electricity
//category, the share accounted for by each type of school
tab electricity, missing

tab roof_type if school_type==1 // Types of Roof existing in schools of type 1

*TABLE command: combines the features of the tab and summarize commands

table school_type, c(mean totalteacher mean total_student)


table school_type electricity, c(mean totalteacher mean total_student)
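
*Note (assumption about your Stata version): the c() option above is the syntax of
*Stata 16 and earlier. In Stata 17 and later the table command was redesigned; a
*rough equivalent of the two commands above would be:
*    table (school_type), statistic(mean totalteacher total_student)
*    table (school_type) (electricity), statistic(mean totalteacher total_student)
*Alternatively, prefix the original commands with "version 16:" to run them as written.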

*COUNT command: gives the number of observations satisfying the condition

count if total_student<totalteacher

count if school_type==1 & electricity==1
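
*Caution: Stata treats numeric missing values as larger than any number, so the
*first count above also includes schools where totalteacher is missing.
*A sketch that excludes observations with a missing value in either variable:
count if total_student<totalteacher & !missing(total_student, totalteacher)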

*RECODE command: changes the values of an existing variable

recode electricity (1=1) (2=0) //1: Electricity connection 0: No electricity connection

*RENAME command: gives a new name to a previously existing variable


rename s14_a_p58_q07 grades_taught

*LABEL command: attaches a descriptive label that explains a variable more clearly
//Provide a label for an already existing variable

label variable electricity "Does school have electricity connection? 1-Yes 0-No"
label variable roof_type "1-Thatched 2-Wood 3-Cement 4-tin/Asbestos sheet"

//Labeling values of variables


//We can also label the values of a variable. The variable electricity is
//coded as 1/0 (1=Yes, 0=No).
//Value labels help us remember what the codes mean.
//To attach labels to the values of a variable, there are two steps

*Step 1: Define the value label

label define electlabel 1 "Electricity" 0 "No Electricity"

//Step 2: Assign this label to the variable

label values electricity electlabel
tab electricity
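
*To check the mapping, list the label definition and tabulate the underlying codes
*(a short sketch using the electlabel label defined above):
label list electlabel
tab electricity, nolabel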

//Later, if we no longer need the value label for the variable electricity,
//we can drop its definition
label drop electlabel
