
DECISION TREE

STATISTICAL DECISION THEORY
 Statistical Experiment
Statistical Model: A collection of data-generating
distributions P ≜ {Pθ | θ ∈ Θ}, where
▶ Θ is called the parameter space; it may be finite,
countably infinite, or uncountable.
▶ Pθ(·) is a probability distribution that accounts
for the implicit randomness in experiments,
sampling, or making observations.
 Data (Sample/Outcome/Observation): X is generated
by a random draw from Pθ, that is, X ∼ Pθ.
▶ X could be a random variable, vector, matrix,
process, etc.
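As a concrete illustration of X ∼ Pθ, here is a minimal sketch that assumes a Bernoulli model: Θ = [0, 1] and Pθ is the law of n independent coin flips with success probability θ. The model choice, the function name draw_sample, and the values θ = 0.3 and n = 10 are illustrative assumptions, not part of the slides.

```python
import random

def draw_sample(theta, n, rng):
    """Draw X = (X_1, ..., X_n), i.i.d. from P_theta = Bernoulli(theta).

    The Bernoulli model is an assumption made for this example only.
    """
    # rng.random() is uniform on [0, 1), so each entry is 1 with probability theta.
    return [1 if rng.random() < theta else 0 for _ in range(n)]

rng = random.Random(0)  # fixed seed so the draw is reproducible
X = draw_sample(theta=0.3, n=10, rng=rng)
print(X)
```

Here the statistician sees only X; the value θ = 0.3 passed to draw_sample plays the role of the unknown true parameter.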
 Inference Task
Objective: T(θ), a function of the parameter θ.
Given the data X ∼ Pθ, one would like to infer
T(θ).
 Decision Rule
Decision rule (deterministic): τ(·) is a function of
X. T̂ = τ(X) is the inferred result.
Decision rule (randomized): τ(·, ·) is a function of
(X, U), where U is external randomness.
T̂ = τ(X, U) is the inferred result.
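The two kinds of decision rule can be sketched as plain functions. This example assumes the data X is a list of 0/1 observations and that the target is T(θ) = θ; the names tau_mean and tau_pick are hypothetical. The deterministic rule returns the sample mean, while the randomized rule uses an external U, uniform on [0, 1), to report one observation picked at random.

```python
import random

def tau_mean(X):
    """Deterministic rule: T_hat = tau(X) depends on the data only."""
    return sum(X) / len(X)

def tau_pick(X, U):
    """Randomized rule: T_hat = tau(X, U) also depends on external randomness U.

    U is assumed uniform on [0, 1); it selects which observation to report.
    """
    # min() guards against a floating-point round-up to len(X).
    index = min(int(U * len(X)), len(X) - 1)
    return X[index]

X = [1, 0, 1, 1]
T_hat_det = tau_mean(X)                             # same X always gives the same answer
T_hat_rand = tau_pick(X, random.Random(0).random()) # answer may change with U
print(T_hat_det, T_hat_rand)
```

Calling tau_mean twice on the same X always gives the same value, whereas tau_pick can return different values for the same X when U changes.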
 Performance Evaluation: how good is a
decision rule τ?
Loss function: l(T(θ), τ(X)) measures how bad
the decision rule τ is for a specific data point
X.
Note: since X is random, l(T(θ), τ(X)) is also
random.
Risk: Lθ(τ) ≜ E_{X∼Pθ}[l(T(θ), τ(X))] measures
how bad the decision rule τ is on average
when the true parameter is θ.
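Because the risk is an expectation over X ∼ Pθ, it can be approximated by averaging the loss over many simulated data sets. The sketch below assumes a Bernoulli(θ) model with n observations, the target T(θ) = θ, squared-error loss, and the sample mean as the rule τ; none of these choices come from the slides. Under these assumptions the risk has the closed form θ(1 − θ)/n, which the simulation can be checked against.

```python
import random

def squared_loss(target, estimate):
    """l(T(theta), tau(X)): squared error, an assumed choice of loss."""
    return (target - estimate) ** 2

def tau_mean(X):
    """Deterministic rule: the sample mean."""
    return sum(X) / len(X)

def estimate_risk(theta, n, rule, trials, rng):
    """Monte Carlo approximation of E_{X ~ P_theta}[ l(theta, rule(X)) ]."""
    total = 0.0
    for _ in range(trials):
        # One fresh data set X ~ P_theta (n i.i.d. Bernoulli(theta) draws).
        X = [1 if rng.random() < theta else 0 for _ in range(n)]
        total += squared_loss(theta, rule(X))
    return total / trials

theta, n = 0.3, 10
mc_risk = estimate_risk(theta, n, tau_mean, trials=20000, rng=random.Random(0))
exact_risk = theta * (1 - theta) / n  # variance of the sample mean
print(mc_risk, exact_risk)
```

The loss on any single data set is random; only the average over many draws settles near the risk, which is why the risk is indexed by θ and not by X.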
 Performance Evaluation: what if the decision rule
τ is randomized?
Loss function becomes l(T(θ), τ(X, U)).
Risk becomes Lθ(τ) ≜ E_{U, X∼Pθ}[l(T(θ), τ(X, U))].
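For a randomized rule, the same averaging idea applies, except that each simulated trial draws both X and U. The sketch below again assumes a Bernoulli(θ) model, T(θ) = θ, and squared-error loss, and uses a hypothetical rule tau_pick that reports a single observation chosen by U. Since that reported value is itself a Bernoulli(θ) draw, the risk under these assumptions is θ(1 − θ).

```python
import random

def tau_pick(X, U):
    """Randomized rule: report the observation selected by U in [0, 1)."""
    # min() guards against a floating-point round-up to len(X).
    index = min(int(U * len(X)), len(X) - 1)
    return X[index]

def estimate_randomized_risk(theta, n, trials, rng):
    """Monte Carlo approximation of E_{U, X ~ P_theta}[ (theta - tau(X, U))^2 ]."""
    total = 0.0
    for _ in range(trials):
        X = [1 if rng.random() < theta else 0 for _ in range(n)]  # X ~ P_theta
        U = rng.random()                                          # external randomness
        total += (theta - tau_pick(X, U)) ** 2
    return total / trials

theta, n = 0.3, 10
mc_risk = estimate_randomized_risk(theta, n, trials=20000, rng=random.Random(0))
exact_risk = theta * (1 - theta)  # variance of one Bernoulli(theta) draw
print(mc_risk, exact_risk)
```

In this example the randomized rule has risk θ(1 − θ), which is n times the risk θ(1 − θ)/n of the sample mean under the same assumptions, so randomization here costs accuracy.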
