charlemagne.nikiema@sciencespo.fr, il nero

Introduction to Econometrics using Stata


1st lecture
Approf su mutue
*Econometrics: a field of mathematical statistics (and a sub-field of economics) used to test
economic theories (e.g. Keynes's), to forecast future values of economic variables, to fit
mathematical economic models to real data, and to make recommendations about
quantitative policies in government and business.
The most common application is FORECASTING important macroeconomic variables (interest
rates, inflation rates, GDP, etc.). It is also used in finance, labor economics,
microeconomics, marketing and economic policy, and now in political economics too.
4 definitions -> defs on the ppt
Datasets: raw material of econometrics
*Where do data come from?
-experimental data: collected on individuals, firms or segments of the economy in order to
evaluate the causal effect of a treatment or policy
-nonexperimental data: these can in turn be observational or retrospective, collected by
observing behavior outside an experimental setting, through surveys or administrative
records.
*Data in general come in three main types:
-cross-sectional data: a sample of units taken at a given point in time (single time
period), for example the height or weight of a given number of individuals
-time series data: observations on one or several variables over time.
-panel data: a combination of the first two: a time series for each
cross-sectional member of the data set (a two-way table of units by periods), so the
time dimension is taken into account.
-pooled cross sections: data collected by combining/pooling multiple cross sections from
different time periods. Difference from panel data: for pooled cross sections we just stack
the cross-sectional units together, regardless of the period, whereas panel data follow
the same individuals over (e.g.) two periods.
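The four layouts above can be sketched with small toy tables. This is only an illustration in plain Python (the course itself uses Stata); every name and number below is invented, and the helper `is_balanced_panel` is a hypothetical function written for this sketch, not something from the slides.

```python
# Hypothetical toy data: all names and numbers are invented for illustration.

# Cross-sectional data: several units observed in a single period.
cross_section = [
    {"person": "A", "year": 2020, "height_cm": 170},
    {"person": "B", "year": 2020, "height_cm": 182},
]

# Time series data: one variable observed over several periods.
time_series = [
    {"year": 2019, "inflation_pct": 1.2},
    {"year": 2020, "inflation_pct": 0.5},
    {"year": 2021, "inflation_pct": 2.6},
]

# Panel data: the SAME units followed over time (a units-by-periods table).
panel = [
    {"person": "A", "year": 2019, "wage": 20.0},
    {"person": "A", "year": 2020, "wage": 21.0},
    {"person": "B", "year": 2019, "wage": 30.0},
    {"person": "B", "year": 2020, "wage": 31.5},
]

# Pooled cross sections: separate samples from different periods stacked
# together; the units need not repeat from one period to the next.
pooled = [
    {"person": "A", "year": 2019, "wage": 20.0},
    {"person": "B", "year": 2019, "wage": 30.0},
    {"person": "C", "year": 2020, "wage": 25.0},
    {"person": "D", "year": 2020, "wage": 27.0},
]


def is_balanced_panel(rows, unit_key="person", time_key="year"):
    """Return True if every unit is observed exactly once in every period."""
    units = {r[unit_key] for r in rows}
    periods = {r[time_key] for r in rows}
    pairs = [(r[unit_key], r[time_key]) for r in rows]
    return len(pairs) == len(set(pairs)) == len(units) * len(periods)


print(is_balanced_panel(panel))   # True
print(is_balanced_panel(pooled))  # False
```

The check makes the difference concrete: in the panel each person appears in every year, while in the pooled cross sections each year has its own set of people.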
Depending on the data we deal with and on what we want to study, we get different
results.
*Nature of variables at hand can be:
-continuous: can take any value over a particular range of real numbers (height,
weight)
-discrete: variable having only integer values (number of trials of an experiment)
-limited: there is a limit or boundary on the variable (labor supply, tickets sold for a
game)
-count: non-negative integer, concentrated on a few small discrete values (number of
children of a couple)
-categorical: contains values indicating membership in one of several possible
categories (gender, marital status). Often they are assigned numerical values used as
labels (0=male, 1=female).
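As a small illustration of the last point, the numeric codes of a categorical variable are only labels. The sketch below uses plain Python rather than Stata; the label mapping comes from the example above, while the five observations are invented.

```python
from collections import Counter

# Value labels as in the example above (0=male, 1=female).
GENDER_LABELS = {0: "male", 1: "female"}

# Hypothetical coded observations, invented for illustration.
gender_codes = [0, 1, 1, 0, 1]

# Translate the codes back into their categories and tabulate them.
labelled = [GENDER_LABELS[code] for code in gender_codes]
counts = Counter(labelled)

print(counts["female"], counts["male"])  # 3 2
```

The codes could just as well be 1 and 2: what matters is which category each value points to, not its size.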
*Econometric models:
There are several. E.g.:
-linear regression
-probit and logit (binary outcome) models (we'll deal with these).
We'll use software packages, Stata in particular.
*Random variables and probability distributions.
Y is a random variable if it represents a random draw from some population (a student's
age, height).
A discrete random variable can take on only selected values (e.g. integer numbers).
A continuous random variable can take on any value in a real interval.
Associated with each random variable is a probability distribution (that is, the function
that assigns a probability to each value, or range of values, the variable can take).
For continuous random variables the probability distribution is called probability
density function or p.d.f.
A cumulative distribution function or c.d.f. gives the probability that a random variable
is less than or equal to a particular value (e.g. the probability that a person is aged 21
or under).
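The age example can be worked through on a toy population. This is a minimal sketch in plain Python (not Stata), assuming ten invented student ages and treating Y as one random draw from them; the function name `cdf` is chosen for this sketch.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical ages of ten students, invented for illustration.
ages = [19, 20, 20, 21, 21, 21, 22, 23, 23, 25]
n = len(ages)

# Probability distribution of a random draw Y from this population:
# each possible value paired with its relative frequency.
distribution = {
    age: Fraction(count, n) for age, count in sorted(Counter(ages).items())
}


def cdf(y):
    """Return P(Y <= y) by summing the probabilities of all values up to y."""
    return sum(p for value, p in distribution.items() if value <= y)


print(float(cdf(21)))  # 0.6
```

Six of the ten students are 21 or younger, so the c.d.f. evaluated at 21 is 6/10. For a continuous variable the sum would be replaced by the integral of the p.d.f. up to that value.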
*Copy slides 16, 17, 18, 19 through to the end.
