MATH 5763 - Stochastic Processes, Section 990 - Fall 2015
Tue 1:30–4:20 p.m., Classroom Building 3104
Instructor:
Nikola Petrov, 802 PHSC, (405)325-4316, npetrov AT math.ou.edu
First day handout
Prerequisite:
Basic calculus-based probability theory (including axioms of probability, random variables,
expectation, probability distributions, independence, conditional probability).
The class will also require knowledge of elementary analysis (including sequences, series, continuity),
linear algebra (including linear spaces, eigenvalues, eigenvectors), and ordinary differential equations.
Course description:
The theory of stochastic processes studies systems that evolve
randomly in time;
it can be regarded as the "dynamical" part of probability theory.
It has many important practical applications,
as well as in other branches
in mathematics such as partial differential equations.
This course is a graduate-level
introduction to stochastic processes,
and should be of interest to students of mathematics,
statistics, physics, engineering, and economics.
The emphasis will be on the fundamental concepts,
but we will avoid using the theory of Lebesgue measure
and integration in any essential way.
Many examples of stochastic phenomena in applications
and some modeling issues will also be discussed in class
and given as homework problems.
Texts (all available free for OU students from the
OU Library):
-
[L]
M. Lefebvre, Applied Stochastic Processes,
1st edition, Springer, 2007
-
[BZ]
Z. Brzezniak, T. Zastawniak, Basic Stochastic Processes, Springer, 1999
-
[P]
E. Parzen, Stochastic Processes, SIAM, 1999
-
[D]
R. Durrett, Essentials of Stochastic Processes, 2nd edition, Springer, 2012
-
[R]
S. Ross, Introduction to Probability Models, 8th edition, Elsevier, 2003
Main topics (a tentative list):
-
a brief review of probability theory;
-
discrete Markov chains: Chapman-Kolmogorov equations,
persistence and transience, generating functions,
stationary distributions, reducibility, limit theorems, ergodicity;
-
continuous-time Markov processes:
Poisson process, birth-death and branching processes,
embedding of a discrete-time Markov chain
in a continuous-time Markov process;
-
conditional expectation, martingales;
-
stationary processes (autocorrelation function,
spectral representation);
-
renewal processes, queues;
-
diffusion processes, Wiener processes (Brownian motion);
-
introduction to stochastic differential equations, Itô calculus;
-
Fokker-Planck equation, Ornstein-Uhlenbeck process.
Homework:
-
Homework 1,
due by 4:30 p.m. on Wednesday, September 2, in Mrs. Wagenblatt's office.
-
Homework 2,
due by 4:30 p.m. on Wednesday, September 9, in Mrs. Wagenblatt's office.
-
Homework 3,
due by 1:30 p.m. on Tuesday, September 15, in class.
-
Homework 4,
due by 4:30 p.m. on Wednesday, September 23, in Ms. Arnett's office.
-
Homework 5,
due on Tuesday, October 6, in class.
-
Homework 6,
due by 4:30 p.m. on Wednesday, October 21, in Mrs. Wagenblatt's office.
-
Homework 7,
due by 12:00 p.m. on Friday, October 30, in Mrs. Wagenblatt's office.
-
Homework 8,
due by 12:00 p.m. on Thursday, November 12, in Mrs. Wagenblatt's office.
-
Homework 9,
due by 12:00 p.m. on Tuesday, November 24, in Mrs. Wagenblatt's office.
Content of the lectures:
-
Lecture 1 (Tue, Aug 25):
Review of probability:
sample space Ω, events as subsets of the sample space,
elementary events as elements of the sample space,
operations with events (complement, union,
intersection, difference, symmetric difference,
subset, impossible event);
σ-algebra (σ-field), examples;
De Morgan's laws, disjoint events,
distributivity properties of intersection and union;
probability (probability measure) P,
probability space,
elementary properties of probability measures
(including the inclusion-exclusion formula);
conditional probability P(A|B),
properties of conditional probability,
partitions of the sample space,
law of total probability, Bayes' formula,
independent events, independent family of events,
pairwise independent family of events;
an example of using conditioning
[pages 1-3, 5-8 of Sec. 1.1 of [L]]
Random variables:
random variables (RVs),
(cumulative) distribution function (c.d.f.) F_X(x)
of a RV X,
properties of c.d.f.s;
discrete RVs, probability mass function (p.m.f.)
p_X(x) of a discrete RV,
important discrete RVs (Bernoulli, binomial, Poisson, geometric)
[pages 8, 9, 11, 12 of Sec. 1.2 of [L]]
-
Lecture 2 (Tue, Sep 1):
Random variables (cont.):
continuous RVs; p.d.f. ƒ_X(x) of a continuous RV;
important continuous RVs (uniform, exponential, normal/Gaussian, standard normal);
p.d.f. ƒ_Y(y) of a function Y = g(X) of the RV X;
conditional c.d.f. F_X(x|A)
of a RV X conditioned on an event A;
conditional p.m.f. p_X(x|A)
or p.d.f. ƒ_X(x|A) of a RV X conditioned on an event A;
expectation E[X] of a RV X;
expectation E[g(X)] of a function of a RV X;
rth moment E[X^r] of a RV X
for r = 0,1,2,...;
variance Var X and standard deviation
σ_X = (Var X)^{1/2}
of a RV X; properties of E[X] and Var X;
conditional expectation E[X|A],
conditional moments E[X^r|A],
and conditional variance Var(X|A)
of a RV X given an event A; example;
characteristic function C_X(ω)
and moment-generating function M_X(t)
of a RV X, expressing E[X^r]
as a derivative of M_X(t)
[pages 10, 11, 13-20 of Sec. 1.2 of [L]]
Random vectors:
definition; (joint) c.d.f. F_{X,Y}(x,y)
of a random vector (X,Y); properties of F_{X,Y}(x,y);
marginal c.d.f.s F_X(x) = F_{X,Y}(x,∞)
and F_Y(y) = F_{X,Y}(∞,y);
marginal p.m.f.s p_X(x) and p_Y(y),
respectively p.d.f.s ƒ_X(x) and ƒ_Y(y),
of a random vector (X,Y);
independence of the components of a random vector;
proof that if the RVs X and Y are independent, then
M_{X+Y}(t) = M_X(t) M_Y(t);
conditional c.d.f. F_{X|Y}(x|Y=y_m)
and conditional p.m.f.
p_{X|Y}(x_k|Y=y_m)
of the discrete RV X conditioned on the discrete RV Y;
conditional c.d.f. F_{X|Y}(x|y)
and conditional p.d.f.
ƒ_{X|Y}(x|y)
of the continuous RV X conditioned on the continuous RV Y;
conditional expectation E[X|Y=y] of the RV X
conditioned on the RV Y;
the conditional expectation E[X|Y=y]
depends only on y (not on X), so it can be regarded as a function of Y;
therefore we can think of the conditional expectation E[X|Y]
as a new random variable which is a function of the RV Y: namely,
E[X|Y]: Ω → R is defined by
E[X|Y](ω) := E[X|Y=y],
where y = Y(ω);
tower rule E[E[X|Y]] = E[X]
[pages 21-27 of Sec. 1.3 of [L]]
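The tower rule above lends itself to a quick numerical check. The following short Python sketch (illustrative only; the model, with Y uniform on {1,...,6} and X given Y = y binomial, is a hypothetical choice, not an example from [L]) verifies E[E[X|Y]] = E[X] by Monte Carlo.

import numpy as np

rng = np.random.default_rng(0)
p, n = 0.3, 200_000

Y = rng.integers(1, 7, size=n)        # Y ~ Uniform{1,...,6}
X = rng.binomial(Y, p)                # given Y = y, X ~ Binomial(y, p)

E_X_given_Y = p * Y                   # E[X|Y] = p*Y is a function of the RV Y
print("E[X]      =", X.mean())            # close to p*E[Y] = 0.3*3.5 = 1.05
print("E[E[X|Y]] =", E_X_given_Y.mean())  # matches E[X], as the tower rule says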
-
Lecture 3 (Tue, Sep 8):
Stochastic processes:
definition of a stochastic process (random process);
classification of random processes:
discrete-time or continuous-time,
discrete-state space or continuous-state space
[pages 47-48 of Sec 2.1 of [L]]
Markov chains - introduction:
Markov property; Markov chain (MC);
example: simple 1-dimensional random walk (RW), symmetric simple 1-dim RW;
the future of a MC depends only on the most recent available information (Prop. 3.1.1);
more examples: 2-dimensional and d-dimensional RWs, the Ehrenfests' urn model,
birth-death processes
[pages 73-75 of Sec. 3.1 of [L]]
Discrete-time Markov chains - definitions and notations:
time-homogeneous discrete-time discrete-state space MCs;
stationary (time-homogeneous) MCs;
one-step and n-step transition probabilities;
one-step transition probability matrix P of a MC;
stochastic and doubly-stochastic matrices;
n-step transition probability matrices P^{(n)};
the Chapman-Kolmogorov equations in matrix form
(P^{(m+n)} = P^{(m)} P^{(n)})
and in components;
corollary: P^{(n)} = P^n;
an example of a MC with 2 states;
probability ρ_{ij}(n)
of visiting state j for the first time
in n steps starting from state i;
probability ρ_{ii}(n)
of first return to state i in n steps;
representation of p_{ij}(n)
as the sum over k from 1 to n of
ρ_{ij}(k) p_{jj}(n−k);
examples of direct computation of ρ_{ij}(n) for a simple MC;
initial distribution (p.m.f.) a = (a_0, a_1, a_2, ...),
a_i = P(X_0 = i), of a MC;
distribution (p.m.f.)
a^{(n)} = (a_0^{(n)}, a_1^{(n)}, a_2^{(n)}, ...),
a_i^{(n)} = P(X_n = i), of a MC at time n;
formula for evolution of the probability distribution:
a^{(n)} = a P^n;
examples: simple 1-dim random walk on Z,
simple 1-dim random walk on Z_+ with reflecting
or absorbing boundary condition at 0,
a Markov chain coming from sums of i.i.d. random variables (read Example 2 on page 85)
[Sec. 3.2.1 of [L]]
Properties of Markov chains:
accessibility of state j from state i, i→j;
communicating states i↔j;
properties of the relation ↔
(reflexivity, symmetry, transitivity),
↔ as an equivalence relation,
equivalence classes with respect to ↔,
closed sets of the state space (Def. 3.2.7)
[pages 85-86 of Sec. 3.2.2 of [L]]
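A minimal numerical sketch in Python of the material above (the 3-state transition matrix is a hypothetical example, not taken from [L]): it checks the Chapman-Kolmogorov corollary P^{(n)} = P^n, evolves an initial distribution via a^{(n)} = a P^n, and tests accessibility i → j by looking for a positive (i,j) entry of some power P^n.

import numpy as np

# hypothetical 3-state chain; state 2 is absorbing
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.5, 0.3],
              [0.0, 0.0, 1.0]])

P2 = np.linalg.matrix_power(P, 2)
P3 = np.linalg.matrix_power(P, 3)
P5 = np.linalg.matrix_power(P, 5)
print(np.allclose(P5, P2 @ P3))      # Chapman-Kolmogorov: P^(5) = P^(2) P^(3)

a = np.array([1.0, 0.0, 0.0])        # start in state 0
print("a^(5) =", a @ P5)             # distribution after 5 steps

# accessibility 0 -> 2: is (P^n)_{02} > 0 for some n?
print("0 -> 2:", any(np.linalg.matrix_power(P, n)[0, 2] > 0 for n in range(1, 4)))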
-
Lecture 4 (Tue, Sep 15):
Properties of Markov chains (cont.):
irreducible MCs; irreducibility criteria; examples;
absorbing states;
probability ƒ_{ij}
of eventual visit of state j starting from state i;
probability ƒ_{ii}
of eventual return to state i;
expressing ƒ_{ij}
as a sum of the first-visit probabilities
ρ_{ij}(n);
recurrent (persistent) and transient states;
Decomposition Theorem;
an example of identifying closed irreducible sets of recurrent states
and sets of transient states, and structure of the stochastic
matrix; a necessary and sufficient criterion of recurrence
of state i in terms of the expected value E[N_i]
of the number N_i of returns to this state (Prop. 3.2.3);
a necessary and sufficient criterion of recurrence of state i
in terms of the sum of the (i,i) entries of
P^{(n)} over n (Prop. 3.2.4);
recurrence is a class property (Prop. 3.2.5);
average number μ_i of transitions for first
return to state i;
positive recurrent and null-recurrent states;
criterion for null-recurrence;
type of recurrence (positive or null) is a class property;
recurrent states of a finite MC are positive recurrent;
periodic and aperiodic states; remarks about periodicity; examples;
simple random walk on Z: computing the number of itineraries
by using combinatorial arguments, Stirling's formula,
recurrence in the symmetric case (p=1/2)
and transience otherwise
[pages 86-93 of Sec. 3.2.2 of [L]]
Limiting probabilities:
limiting probabilities π_i,
limiting probability distribution
π = (π_0, π_1, π_2, ...),
ergodic states,
Ergodic Theorem (giving conditions for existence and uniqueness
of a limiting probability distribution,
the relation between π_i and
μ_i,
and an algorithm for computing π)
[pages 94-95 of Sec. 3.2.3 of [L]]
Mathematical digression:
computing high powers of a matrix by diagonalizing it first;
general methods for solving linear recurrence relations.
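A small Python sketch of the digression (the 2-state matrix is a hypothetical example): compute a high power of a stochastic matrix by diagonalization, P^n = C D^n C^{-1}, and compare its rows with the limiting distribution π promised by the Ergodic Theorem.

import numpy as np

P = np.array([[0.9, 0.1],
              [0.4, 0.6]])                      # irreducible, aperiodic

evals, C = np.linalg.eig(P)                     # P = C D C^{-1}, D = diag(evals)
n = 50
Pn = C @ np.diag(evals**n) @ np.linalg.inv(C)
print(np.allclose(Pn, np.linalg.matrix_power(P, n)))

# limiting distribution: normalized left eigenvector of P for eigenvalue 1
w, V = np.linalg.eig(P.T)
pi = np.real(V[:, np.argmin(np.abs(w - 1))])
pi /= pi.sum()
print("pi   =", pi)                             # here (0.8, 0.2)
print("P^50 =", Pn)                             # each row is close to pi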
-
Lecture 5 (Tue, Sep 22):
a recap of the Ergodic Theorem; examples showing
the importance of the aperiodicity (otherwise
the limit of p_{ij}(n)
as n→∞ may not exist),
the irreducibility (otherwise the stationary distribution
may not be unique), and the recurrence
(otherwise the average number μ_j
of transitions for first return to state j will be infinite,
so that the relation π_j = 1/μ_j
will not make sense); an example of identifying the closed irreducible sets
of recurrent states and of the sets of transient states of a MC,
and the structure of the stochastic matrix of a MC;
stationary distribution of a doubly stochastic irreducible aperiodic MC
with a finite state space (Prop. 3.2.6);
simple RW on {0,1,2,...} with a partially reflecting boundary
- obtaining the stationary distribution π
when the probability of moving to the right is smaller than 1/2,
and showing that a stationary distribution does not exist
when the probability of moving to the right is greater than 1/2
(exercise: study this MC when the probability of moving to the right is equal to 1/2)
[pages 95-100 of Sec. 3.2.3 of [L]]
Absorption problems:
definition of the probability
r_i^{(n)}(C)
of absorption by the closed subset C of the state space S
after exactly n steps (starting from state i);
definition of the probability r_i(C)
of eventual absorption by the closed subset C of the state space S
(starting from state i);
a theorem giving r_i(C)
in terms of the (p_{ij}) (Theorem 3.2.2);
an example: the gambler's ruin problem
(see the numerical sketch after this lecture's summary);
martingales
[Sec. 3.2.4 of [L]]
Continuous-time discrete-state space MCs:
definition of a continuous-time discrete-state space MC;
Markov property;
transition functions
p_{ij}(s,t) = P(X_t = j | X_s = i) for t > s;
stationary (time-homogeneous) MCs - for which
p_{ij}(s,t) = p_{ij}(0, t−s),
notation:
p_{ij}(t) = p_{ij}(0,t) = P(X_t = j | X_0 = i) for t > 0;
a discrete-time MC {Y_n}_{n∈{0,1,2,...}}
embedded in the continuous-time MC {X_t}_{t≥0};
irreducibility; the analogue of the condition of being
a stochastic matrix for p_{ij}(t)
(the sum over j is 1);
evolution of the occupation probabilities
p_j(t) = P(X_t = j)
expressed in terms of the initial occupation probabilities
p_i(0) and the transition probabilities p_{ij}(t);
Chapman-Kolmogorov equations;
discussion of the meaning of the memorylessness properties of the geometric
and the exponential random variables (Prop. 3.3.1)
[pages 121-123 of Sec. 3.3.2 and pages 109-110 of Sec. 3.3.1 of [L]]
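The numerical sketch of the gambler's ruin problem referred to above (the parameters N = 10 and p = 0.45 are hypothetical): solve the linear system for the probabilities r_i of eventual absorption at N and compare with the classical closed-form answer.

import numpy as np

N, p = 10, 0.45
q = 1 - p

# boundary conditions r_0 = 0, r_N = 1; interior: r_i = p*r_{i+1} + q*r_{i-1}
A = np.zeros((N + 1, N + 1))
b = np.zeros(N + 1)
A[0, 0] = 1.0
A[N, N] = 1.0; b[N] = 1.0
for i in range(1, N):
    A[i, i - 1], A[i, i], A[i, i + 1] = -q, 1.0, -p
r = np.linalg.solve(A, b)

i = np.arange(N + 1)
r_exact = (1 - (q / p)**i) / (1 - (q / p)**N)   # closed form, valid for p != 1/2
print(np.allclose(r, r_exact))                  # True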
-
Lecture 6 (Tue, Sep 29):
Continuous-time discrete-state space MCs (cont.):
exponential random variable: definition,
proof that it is memoryless, moment-generating function,
other properties (Prop. 3.3.4 and the remarks after it, Prop. 3.3.5)
[pages 109-111, 112-114 of Sec. 3.3.1 of [L]]
Poisson process:
counting process;
"little o(h)" notation, examples;
definition of a Poisson process N as a nondecreasing process with N(0)=0,
certain short-time transition probabilities
p_{ij}(h) (for small h),
and independence of events occurring at a later time interval
from the events occurring at a non-overlapping earlier time interval;
derivation of the distribution of N(t)
for a Poisson process N by deriving an initial-value problem
for an infinite system of ODEs for p_{ij}(t)
and solving the system by induction and by the method of generating functions;
an "elementary" way of deriving that N(t) ∼ Poisson(λt)
by dividing the interval [0,t] into a large number n of short intervals
of length t/n and applying the binomial distribution
to the number k of events occurring in the n short intervals
[loosely following pages 231, 232, 236 of Sec. 5.1 of [L]]
Poisson process and distribution of interarrival times:
independence and exponential distribution of the interarrival times τ_j
of a Poisson process (Prop. 5.1.3);
arrival times T_j as sums of interarrival times;
basic properties of the Γ(α,λ) random variables;
the sum of n i.i.d. Exp(λ) random variables is a Γ(n,λ)
random variable (Prop. 3.3.6);
reconstructing a Poisson process from the interarrival times τ_j
[pages 237-240 of Sec. 5.1, pages 115-119 of Sec. 3.3.1 of [L]]
Miscellaneous facts about Poisson processes (optional):
a sum of independent Poisson processes is a Poisson process (Prop. 5.5.1);
decomposition of Poisson processes (Prop. 5.5.2);
distribution of T_1 given that N(t)=1 (Prop. 5.1.5);
generating a Poisson process by using a generator of uniform random variables (Example 5.1.2);
first occurrence of two or more independent Poisson processes (∼Prop. 3.3.4, Prop. 3.3.5)
[pages 234-236, 239, 240 of Sec. 5.1, pages 113, 114 of Sec. 3.3.1 of [L]]
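A short Python sketch related to the last point (generating a Poisson process), with hypothetical parameters: build arrival times from i.i.d. Exp(λ) interarrival times and check that N(t) has the Poisson(λt) mean, variance, and p.m.f.

import math
import numpy as np

rng = np.random.default_rng(1)
lam, t, n_paths = 2.0, 3.0, 100_000

tau = rng.exponential(1 / lam, size=(n_paths, 40))   # 40 interarrival times: plenty for t = 3
T = np.cumsum(tau, axis=1)                           # arrival times T_j = tau_1 + ... + tau_j
N_t = (T <= t).sum(axis=1)                           # N(t) = number of arrivals in [0, t]

print("mean N(t) =", N_t.mean(), " lambda*t =", lam * t)
print("var  N(t) =", N_t.var())
print("P(N(t)=6) =", (N_t == 6).mean(),
      " Poisson p.m.f.:", math.exp(-lam * t) * (lam * t)**6 / math.factorial(6))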
-
Lecture 7 (Tue, Oct 6):
Continuous-time discrete-state space MCs (cont.):
stochastic semigroup {P_t}_{t≥0};
standard semigroups;
generator
G = (ν_{ij}) := (dP_t/dt)|_{t=0}
of a stochastic semigroup;
properties of G
(the sum of the elements ν_{ij} in each row of G is zero);
holding time U_i of the ith state;
proof that U_i is an Exp(−ν_{ii}) random variable;
discussion of the meaning of ν_{ij}
expressed in terms of the rates ν_i
(such that the probability of leaving state i in a time interval of length h
is ν_i h + o(h))
and the matrix elements γ_{ij} of the 1-step transition
probability matrix of the jump chain {Y_n}_{n≥0};
obtaining the 1-step transition probability matrix (γ_{ij})
of the jump chain from the generator G;
obtaining P_t from G:
Kolmogorov forward and backward equations
P_t' = P_t G,
resp.
P_t' = G P_t,
initial condition
P_t|_{t=0} = I;
definition of the exponential of a matrix, e^A;
main properties:
e^{sA} e^{tA} = e^{(s+t)A},
d(e^{tA})/dt = A e^{tA};
computing e^A by simplifying A by a similarity
transformation, e.g., A = C^{-1} D C
for a diagonal matrix D,
and using that A^n = C^{-1} D^n C
to show that e^A = C^{-1} e^D C;
expressing the solution of the initial value problem
x'(t) = A x(t), x(0) = x^{(0)}
(for a constant matrix A) as
x(t) = e^{tA} x^{(0)};
expressing the stochastic semigroup P_t
through the generator G: P_t = e^{tG};
computing P_t for a continuous-time, two-state MC
(see the numerical sketch after this lecture's summary);
remarks on the Laplace transform and its use to solve initial-value problems for ODEs;
birth process; computing the expectation of the time T_n
for a birth process starting at X_0 = 1 to reach
X_t = n for the first time;
solving the Kolmogorov forward equations for the birth process by using generating functions
G_i(ξ,t);
using the generating function G_i(ξ,t)
to prove that the birth process is honest (G_i(1,t) = 1),
to compute the conditional average of the birth process given that X(0) = k
as E[X(t)|X(0)=k] = (∂G_k(ξ,t)/∂ξ)|_{ξ=1}
and the conditional variance Var(X(t)|X(0)=k);
another way of computing the conditional expectation E[X(t)|X(0)=k]
[roughly following Sec. 3.3.3, and pages 129-133 of Sec. 3.3.4 of [L]]
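The numerical sketch referred to above: for a two-state continuous-time MC with (hypothetical) jump rates a = 1.5 (state 0 to 1) and b = 0.7 (state 1 to 0), compute P_t = e^{tG} with a matrix exponential and compare with the well-known closed-form transition probabilities.

import numpy as np
from scipy.linalg import expm

a, b = 1.5, 0.7
G = np.array([[-a,  a],
              [ b, -b]])               # generator: each row sums to zero

t = 0.8
Pt = expm(t * G)                       # P_t = e^{tG}

e = np.exp(-(a + b) * t)               # closed form: p_00(t) = (b + a*e)/(a+b), etc.
Pt_exact = np.array([[b + a * e, a - a * e],
                     [b - b * e, a + b * e]]) / (a + b)
print(np.allclose(Pt, Pt_exact))       # True; rows of Pt sum to 1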
-
Lecture 8 (Tue, Oct 13):
Limiting probabilities and balance equations:
stationary distribution π
of a stochastic semigroup P_t;
reason for the term "stationary distribution":
if P(X(0)=i) = π_i,
where π is a stationary distribution,
then
P(X(t)=j) = π_j
for all j∈S and all t≥0;
recurrence time T_{ii},
mean recurrence time μ_{ii} = E[T_{ii}];
recurrent and transient states,
positive recurrent and null recurrent states of a continuous-time Markov chain;
irreducible Markov chains;
Ergodic Theorem for continuous-time Markov processes, remarks;
relation between the stationary distribution π_j,
the rate ν_j of leaving state j
(where the holding time for state j is U_j ∼ Exp(ν_j)),
and the mean recurrence time μ_{jj} = E[T_{jj}];
finding stationary distributions from the generator:
πG = 0 (see the numerical sketch after this lecture's summary);
balance equations and their interpretation
[pages 138-140 of Sec. 3.3.5 of [L]]
Birth and death processes:
birth-death-immigration-disaster process;
detailed derivation of the short-time transition probabilities p_{ij}(h)
(hence, of the generator G = (ν_{ij})) of a death-immigration process;
proving that the stationary distribution of a death-immigration process
is Poisson(ρ/μ), i.e.,
π_j = e^{−ρ/μ} (ρ/μ)^j / j!
[roughly following pages 135, 136 of Sec. 3.3.4 of [L]]
Nonhomogeneous Poisson processes:
nonhomogeneous Poisson process with intensity function λ(t);
the number of arrivals N(s+t)−N(s)
of a nonhomogeneous Poisson process with intensity function λ(t)
is Poisson(m(s+t)−m(s)),
where m(t) is the mean value function of the process
(m(0)=0, m'(t)=λ(t))
[pages 250, 252 of Sec. 5.2 of [L]]
Reading assignment (mandatory):
the p.d.f. of the first arrival time T_1
of a nonhomogeneous Poisson process {N(t)}_{t≥0}
given that N(t_1) = 1 (for some fixed t_1 > 0)
(Prop. 5.5.2) [pages 21, 22 of the lecture notes from Lecture 8, page 253 of [L]]
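The numerical sketch referred to above (hypothetical rates ρ = 2, μ = 1): find the stationary distribution of a death-immigration process from its generator by solving πG = 0 (truncating the state space at a large K), and compare with the Poisson(ρ/μ) distribution.

from math import exp, factorial
import numpy as np

rho, mu, K = 2.0, 1.0, 30
G = np.zeros((K + 1, K + 1))
for j in range(K + 1):
    if j < K:
        G[j, j + 1] = rho              # immigration: j -> j+1 at rate rho
    if j > 0:
        G[j, j - 1] = j * mu           # deaths: j -> j-1 at rate j*mu
    G[j, j] = -G[j].sum()              # rows of the generator sum to zero

# solve pi G = 0 together with sum_j pi_j = 1 (least squares on the stacked system)
A = np.vstack([G.T, np.ones(K + 1)])
b = np.zeros(K + 2); b[-1] = 1.0
pi = np.linalg.lstsq(A, b, rcond=None)[0]

poisson = np.array([exp(-rho / mu) * (rho / mu)**j / factorial(j) for j in range(K + 1)])
print(np.abs(pi - poisson).max())      # tiny (only the truncation at K matters)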
-
Lecture 9 (Tue, Oct 20):
Nonhomogeneous Poisson processes (cont.):
"homogenizing" a nonhomogeneous Poisson process N(t)
(with a strictly positive rate function λ(t) > 0)
by rescaling the time:
M(t) = N(m^{-1}(t))
is a Poisson process with rate 1
[pages 253, 254 of Sec. 5.2 of [L]]
Compound Poisson processes:
compound random variable;
derivation of the mean, variance, and moment generating function
of a compound random variable (Prop. 5.3.1);
definition of a compound Poisson process;
mean, variance, and moment generating function
of a compound Poisson process;
approximating the distribution of a compound Poisson process
for large times by using the Central Limit Theorem (Prop. 5.3.2);
the sum of two independent compound Poisson processes
Y_1(t) and Y_2(t)
corresponding to Poisson processes
N_1(t) and N_2(t)
with rates λ_1 and λ_2
is a compound Poisson process
corresponding to a Poisson process with rate
λ_1 + λ_2
(a numerical illustration of compound Poisson processes follows this lecture's summary)
[Sec. 5.3 of [L]]
Doubly stochastic Poisson processes:
definition of a conditional (or "mixed") Poisson process
(whose rate is a random variable, independent of time);
proof that a conditional Poisson process
has stationary, but not independent increments (Prop. 5.4.1);
best estimator of the rate of a Poisson process;
definition of a doubly stochastic Poisson process ("Cox process")
[pages 258-260, 262 of Sec. 5.4 of [L]]
Renewal processes:
definition of a renewal process;
modified ("delayed") renewal process
(when the distribution of τ_0
differs from the distributions of
τ_1, τ_2, ...);
relations between the process N(t),
the times of the events T_n,
and the interevent times τ_n;
expression for the p.m.f. of N(t)
in terms of the c.d.f. of T_n (Prop. 5.6.1);
renewal function m(t) = E[N(t)];
expression for the renewal function m(t)
in terms of the c.d.f.'s of T_n (Prop. 5.6.2)
[pages 267-269 of Sec. 5.6 of [L]]
Mathematical digression:
definition of Riemann integral;
definition of the Riemann-Stieltjes integral;
particular cases of the Riemann-Stieltjes integral
when g(t) is differentiable and non-decreasing,
and when g(t) is piecewise constant;
applications to computing expected values of discrete
and continuous random variables.
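A Monte Carlo sketch of the compound Poisson process discussed above (the jump-size distribution N(2, 0.25) and the rates are hypothetical choices): check E[Y(t)] = λt E[X_1] and Var Y(t) = λt E[X_1^2] from Prop. 5.3.1.

import numpy as np

rng = np.random.default_rng(2)
lam, t, n_paths = 1.5, 4.0, 50_000
mu_X, sd_X = 2.0, 0.5                  # jump sizes X_j ~ N(2, 0.25)

N = rng.poisson(lam * t, size=n_paths)                           # N(t) ~ Poisson(lam*t)
Y = np.array([rng.normal(mu_X, sd_X, size=k).sum() for k in N])  # Y(t) = X_1 + ... + X_{N(t)}

print("E[Y(t)]  =", Y.mean(), " lam*t*E[X]   =", lam * t * mu_X)
print("Var Y(t) =", Y.var(),  " lam*t*E[X^2] =", lam * t * (mu_X**2 + sd_X**2))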
-
Lecture 10 (Tue, Oct 27):
Mathematical digression (cont.):
expected value of an N-valued random variable X
as a sum (over n from 1 to infinity)
of the probabilities of X being greater than or equal to n;
expected value of a non-negative continuous random variable
X as an integral of
[1 − F_X(x)],
geometric meaning.
Renewal processes (cont.):
recursive formula for the c.d.f.'s of the arrival times T_n
in terms of the c.d.f. F_τ of the inter-arrival
times τ_j through Riemann-Stieltjes integrals;
derivation of an integral equation for the renewal function
m(t) = E[N(t)];
solving renewal-type equations by using the Laplace transform;
another derivation of the formula for the renewal function
m(t) by performing a Laplace transformation on the formula
representing m(t)
as a sum of the c.d.f.'s of all the T_n's;
computing the renewal function m(t)
for a Poisson process in three ways:
(1) by using the fact that N(t) is a Poisson(λt) random variable,
(2) by expressing it as a sum of the c.d.f.'s of T_n (Prop. 5.6.2),
(3) by solving the integral equation for m(t) using the Laplace transform;
the moment-generating function M_X
of a (0,∞)-valued random variable X: Ω → (0,∞)
is equal to the Laplace-Stieltjes transform of the c.d.f. F_X
of X and, if the random variable X is continuous,
equal to the Laplace transform of the p.d.f. ƒ_X of X.
Queues:
set-up of the problem, examples of queues
(queues with baulking, multiple servers, airline check-in, FIFO, LIFO,
group service, "student discipline", "continental queueing");
A/S/s/c/p/D classification of the queues,
where A and S are deterministic (D),
Markovian
(M - with exponentially distributed interarrival/service times),
Γ (or Erlang), or general (G) distributions,
s is the number of servers, c is the capacity of the system,
p is the size of the population, D is the discipline (i.e., service policy);
stability of a queue;
a detailed solution of the M(λ)/M(μ)/1 queue:
obtaining the stationary distribution π
(and the condition for existence of a stationary distribution),
computing the probability that the waiting time W will be zero,
the conditional average of W given that there are j customers
in the queue, and the average E[W]
(both averages for large t so that the stationary distribution
has been reached);
M(λ)/G/1 queue - construction of a discrete-time Markov
chain embedded in the queueing process
and derivation of the transition probability matrix of this Markov chain.
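A simulation sketch of the M(λ)/M(μ)/1 queue solved above (hypothetical rates λ = 1, μ = 1.5): simulate the chain through its jump chain and exponential holding times, and compare the long-run fraction of time spent with j customers to the stationary distribution π_j = (1 − ρ)ρ^j, ρ = λ/μ.

import numpy as np

rng = np.random.default_rng(3)
lam, mu = 1.0, 1.5
rho = lam / mu

T_total, t, x = 50_000.0, 0.0, 0
time_in_state = {}
while t < T_total:
    rate = lam if x == 0 else lam + mu            # total rate of leaving state x
    hold = rng.exponential(1 / rate)              # holding time ~ Exp(rate)
    time_in_state[x] = time_in_state.get(x, 0.0) + hold
    t += hold
    x = x + 1 if rng.random() < lam / rate else x - 1   # arrival vs departure

for j in range(5):
    print(j, round(time_in_state.get(j, 0.0) / t, 4), round((1 - rho) * rho**j, 4))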
-
Lecture 11 (Tue, Nov 3):
General properties of stochastic processes:
cumulative distribution function
F(x_1,...,x_k; t_1,...,t_k) = F_{X(t_1),...,X(t_k)}(x_1,...,x_k),
probability mass function
p(x_1,...,x_k; t_1,...,t_k) = p_{X(t_1),...,X(t_k)}(x_1,...,x_k),
and probability density function
ƒ(x_1,...,x_k; t_1,...,t_k) = ƒ_{X(t_1),...,X(t_k)}(x_1,...,x_k)
of order k of a stochastic process
X = {X_t : t ∈ [0,∞)};
mean
m_X(t) = E[X_t],
autocorrelation function
R_X(t_1,t_2) = E[X_{t_1} X_{t_2}],
autocovariance function
C_X(t_1,t_2) = R_X(t_1,t_2) − m_X(t_1) m_X(t_2),
variance
Var X(t) = C_X(t,t),
and autocorrelation coefficient
ρ_X(t_1,t_2)
of a stochastic process X;
processes with independent increments;
processes with stationary increments;
strict-sense stationary (SSS, strongly stationary) processes;
wide-sense stationary (WSS, weakly stationary) processes;
an example of a WSS stochastic process that is not SSS;
average power E[X_t^2] of a stochastic process;
E[X_t^2] of a WSS stochastic process
does not depend on t;
spectral density S_X(ω) of a WSS process;
properties of S_X(ω)
[Sec. 2.1 and 2.2 of [L]]
Gaussian and Markov processes:
multinormal distribution of a random vector
X = (X_1,...,X_n) ∼ N(m,K),
vector of the means m, covariance matrix
K = (cov(X_i,X_j));
characteristic function φ_X(ω) = E[exp(iωX)]
of a random variable X,
(joint) characteristic function
φ_X(ω) = E[exp(i ω⋅X)]
of a multinormal random vector X
(Prop. 2.4.1);
if two components of
X = (X_1,...,X_n) ∼ N(m,K)
are uncorrelated, then they are independent;
Gaussian process {X_t}
- a continuous-time stochastic process with
(X_{t_1},...,X_{t_n})
being multinormal for any n and times
t_1,...,t_n;
if {X_t} is a Gaussian process
such that its mean m_X(t)
does not depend on t
and its autocovariance function
C_X(t_1,t_2)
depends only on t_2 − t_1,
then the process is SSS (Prop. 2.4.2);
definition of a Markov (or Markovian) process, examples
(random walk, Poisson process);
(first-order) density function
ƒ(x;t) = ƒ_{X(t)}(x);
conditional transition density function
p(x,x_0;t,t_0) = ƒ_{X(t)|X(t_0)}(x|x_0);
the integrals of ƒ(x;t)
and
p(x,x_0;t,t_0)
over x are equal to 1;
expressing ƒ(x;t)
as an integral of
ƒ(x_0;t_0) p(x,x_0;t,t_0)
over x_0;
more on the meaning of the p.d.f. of a continuous RV:
P(X ∈ (x, x+Δx]) ≈ ƒ_X(x) Δx,
generalization for jointly continuous random vectors
P(X ∈ A) ≈ ƒ_X(x) vol(A),
where A is a small domain in R^k
containing x;
application to kth order p.d.f.'s of a random process:
P(X_{t_1} ∈ (x_1, x_1+Δx_1], ..., X_{t_k} ∈ (x_k, x_k+Δx_k]) ≈ ƒ_{X_{t_1},...,X_{t_k}}(x_1,...,x_k) Δx_1 ⋯ Δx_k;
Chapman-Kolmogorov equations for the
conditional transition density function
p(x,x_0;t,t_0) = ƒ_{X_t|X_{t_0}}(x|x_0)
[pages 58-63 of Sec. 2.4 of [L]]
A digression on generalized functions (distributions):
test functions (infinitely smooth compactly supported functions);
Dirac δ-function δ_a,
defined by δ_a(ƒ) := ƒ(a);
derivatives of generalized functions - defined by applying integration by parts,
treating the generalized function as a regular function
and using that for a test function ƒ
lim_{x→∞} ƒ(x) = 0
and
lim_{x→−∞} ƒ(x) = 0
(because ƒ has compact support):
this gives us
that the integral of ƒ times the kth derivative
ξ^{(k)} of a generalized function ξ
is equal to (−1)^k times the
integral of ξ times ƒ^{(k)},
which symbolically can be written as
ξ^{(k)}(ƒ) := (−1)^k ξ(ƒ^{(k)});
following this recipe, the derivatives of δ_a
are defined by δ_a'(ƒ) := −ƒ'(a),
δ_a''(ƒ) := (−1)^2 ƒ''(a),
and in general
δ_a^{(k)}(ƒ) := (−1)^k ƒ^{(k)}(a);
example: the generalized derivative
of the Heaviside (unit step) function:
H_a' = δ_a.
-
Lecture 12 (Tue, Nov 10):
A digression on generalized functions (distributions) (cont.):
interpretation of generalized functions as a "rough" signal,
and of the test function as a "smoothing" function
corresponding to the "smearing" due to the experimental device.
The Wiener process (Brownian motion):
normal (Gaussian) random variables N(μ,σ^2)
- p.d.f., mean, variance, characteristic function,
standard normal random variable Z ∼ N(0,1),
which can be obtained from X ∼ N(μ,σ^2)
as Z = (X − E[X])/σ_X;
computing the characteristic function of a symmetric simple random walk
on the state space ηZ = {..., −2η, −η, 0, η, 2η, ...},
with jumps (of size η) occurring as a Poisson process with rate λ/2,
proof that in the limit η→0, λ→∞, λη^2 = 1,
X_t ∼ N(0,t);
definition of a Brownian motion/Wiener process
W_t ∼ N(0, σ^2 t)
and a standard Wiener process
B_t ∼ N(0,t);
the Wiener process as a limit of simple random walk;
historical remarks (Robert Brown, Albert Einstein,
Marian Smoluchowski, Norbert Wiener, Andrey Kolmogorov);
p.d.f. of order k of a Wiener process;
moments of W_t:
mean E[W_t] = 0,
autocovariance function
C_W(t,s) = E[W_t W_s] = σ^2 min(t,s),
autocorrelation function
R_W(t,s) = E[W_t W_s] = σ^2 min(t,s);
proof that t W_{1/t}
is a Brownian motion (Example 4.1.2)
[pages 173-179 of Sec. 4.1 of [L]]
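A Python sketch (illustrative, not from [L]) of the standard Wiener process as a limit of simple random walk: B_t is approximated by (ξ_1 + ... + ξ_⌊nt⌋)/√n with i.i.d. ±1 steps, and the moments E[B_t] = 0, Var B_t = t, Cov(B_s, B_t) = min(s,t) are checked by Monte Carlo.

import numpy as np

rng = np.random.default_rng(4)
n, n_paths, T = 500, 10_000, 2.0            # n random-walk steps per unit time

steps = rng.choice([-1.0, 1.0], size=(n_paths, int(n * T)))
B = np.cumsum(steps, axis=1) / np.sqrt(n)   # B at times 1/n, 2/n, ...

s, t = 0.7, 1.6
Bs, Bt = B[:, int(n * s) - 1], B[:, int(n * t) - 1]
print("E[B_t]        =", Bt.mean())
print("Var B_t       =", Bt.var(), "  (t =", t, ")")
print("Cov(B_s, B_t) =", np.mean(Bs * Bt), "  (min(s,t) =", s, ")")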
-
Lecture 13 (Tue, Nov 17):
σ-algebras and probability measures:
sample space Ω; outcome (elementary event) ω - an element of Ω;
σ-algebra F of subsets of Ω;
event - an element of F; examples;
Borel σ-algebra B(R) of subsets of R;
sub-σ-algebra G ⊆ F
of F;
σ-algebra σ(A_1, A_2, ...)
generated by a collection A_1, A_2, ... of subsets of Ω;
examples;
probability measure P: F → [0,1] on (Ω,F);
Lebesgue measure L:B([0,1])→[0,1] on [0,1] defined by
L((a,b))=b−a;
probability space (Ω,F,P);
F-measurable functions
X:Ω→R - for which
{X∈B}∈F for all
B∈B(R);
random variable on (Ω,F)
- an F-measurable function X:Ω→R;
an example: a σ(∅)-measurable function is a constant function;
more examples;
σ-algebra σ(X) generated by a random variable X;
σ-algebra σ(F_1, ..., F_n)
generated by a collection of σ-algebras;
σ-algebra σ(X_1, ..., X_n)
generated by a family of random variables
X_1, ..., X_n;
filtration
F_1 ⊆ F_2 ⊆ F_3 ⊆ ... of σ-algebras generated by a sequence
X_1, X_2, X_3, ...
of functions X_k: Ω → R,
where F_k = σ(X_1, ..., X_k);
example: filtration of σ-algebras generated by a sequence of coin tosses;
distribution (cumulative distribution function, c.d.f.)
F_X(x) = P(X ≤ x) = P({ω∈Ω : X(ω) ≤ x}) of a random variable X;
expectation E[X] of a random variable X as an integral over R
of x dF_X(x)
or, equivalently, as an integral over Ω of X(ω) P(dω);
integrable (L^1-) random variables (for which E[|X|] < ∞).
Conditional expectation and martingales:
conditional expectation E[X|A]
of a random variable X conditioned on an event A;
conditional expectation E[X|F]
of a random variable X conditioned on a σ-algebra F;
conditional expectation E[X|Y]
of a random variable X conditioned on another random variable Y;
discussion of the meaning of the filtration
F_1 ⊆ F_2 ⊆ F_3 ⊆ ...
with
F_n = σ(X_1, ..., X_n)
in the context of "coin tossing" (where X_n is the result of the nth toss)
- F_n represents our knowledge at time n;
a sequence Y_1, Y_2, ... of random variables
adapted to the filtration
F_1 ⊆ F_2 ⊆ F_3 ⊆ ...
- each Y_n is
F_n-measurable
(i.e., can be determined from the values of the random variables
X_1, ..., X_n
generating the σ-algebra
F_n = σ(X_1, ..., X_n));
an example - the running averages S_n;
martingale M_1, M_2, ...
with respect to a filtration F_1 ⊆ F_2 ⊆ F_3 ⊆ ...
- a sequence of L^1-random variables adapted to the filtration
and such that E[M_{n+1}|F_n] = M_n;
an example of a martingale - the positions
Y_1, Y_2, ... of a particle in a simple symmetric random walk;
a continuous-time example of a martingale - for a Poisson process {N_t}_{t≥0}
with intensity λ,
M_t = N_t − λt is a martingale.
-
Lecture 14 (Tue, Nov 24):
Conditional expectation and martingales (cont.):
example - exponential martingale
exp(αB_t − α^2 t/2),
obtaining a family of polynomial martingales from the Taylor
expansion of the exponential martingale with respect to α around α = 0:
exp(αB_t − α^2 t/2) = 1 + B_t α + (1/2)(B_t^2 − t)α^2 + (1/6)(B_t^3 − 3tB_t)α^3 + (1/24)(B_t^4 − 6tB_t^2 + 3t^2)α^4 + (1/120)(B_t^5 − 10tB_t^3 + 15t^2 B_t)α^5 + ...
The Wiener process (Brownian motion) (cont.):
a brief review - definition of W_t and B_t,
increments, moments
(E[W_t^{odd power}] = 0,
Var W_t = E[W_t^2] = σ^2 t,
E[W_t^4] = 3σ^4 t^2,
E[W_t^6] = 15σ^6 t^3, ...),
autocorrelation and autocovariance functions,
probability density function
ƒ(x_1,...,x_k; t_1,...,t_k) = ƒ_{W(t_1),...,W(t_k)}(x_1,...,x_k)
of order k expressed as a product of the p.d.f.
ƒ(x_1; t_1) = ƒ_{W(t_1)}(x_1)
and the conditional p.d.f.'s
ƒ_{W(t_j)|W(t_{j−1})}(x_j|x_{j−1}) for j = 2,...,k,
explicit expressions for all p.d.f.'s;
short-time behavior: for Δt > 0 and
ΔB_t := B_{t+Δt} − B_t,
computing
E[(ΔB_t)^{odd power}] = 0 and
E[(ΔB_t/Δt)^2] = 1/Δt → ∞
as Δt → 0^+,
nondifferentiability of the Brownian motion;
Gaussian white noise ξ_t := dB_t/dt;
making sense of the derivative dB_t/dt
as a (random) generalized function acting on a test function φ,
definition of a functional Ξ(φ) as an integral of ξ_t φ(t)
over t from 0 to ∞
(meaning: a measurement "smeared by φ"),
and a proof that E[Ξ(φ)] = 0 and that E[Ξ(φ)^2]
equals the integral of φ(t)^2 over t from 0 to ∞,
interpreting these facts as
E[ξ_t] = 0 and E[ξ_t ξ_s] = δ(t−s).
Stochastic differential equations (SDEs):
the standard Brownian motion can be considered as the solution
of the initial value problem
dB_t/dt = ξ_t, B_0 = 0
for the unknown function B_t whose evolution
is driven by Gaussian white noise ξ_t;
on the meaning of an SDE - computing the transition probability density
ƒ_{B_t|B_s}(x|y) = ƒ(x,y|t,s)
for 0 ≤ s < t
as a solution of an initial-value problem for a partial differential equation,
e.g., ∂_t ƒ(x,y|t,s) = (1/2) ∂_{xx} ƒ(x,y|t,s),
with the limit of ƒ(x,y|t,s) as t → s^+ equal to δ(x−y);
a generalization:
dX_t/dt = ƒ(t,X_t) + g(t,X_t) ξ_t;
discretization by using the values at the left end:
ΔX_t ≈ ƒ(t,X_t) Δt + g(t,X_t) ΔB_t,
X_{t+Δt} = X_t + ΔX_t
(similar to the Euler method for integration of ODEs);
the main reason for using this discretization - the increment ΔB_t
is independent of the values of X_t and B_t;
Itô integrals as a limit (in some sense) of left Riemann sums.
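A sketch of the left-endpoint (Euler) discretization just described, in Python; the drift ƒ(t,x) = −x and noise coefficient g(t,x) = 1 below are hypothetical choices (an Ornstein-Uhlenbeck-type equation), not an example prescribed in [L].

import numpy as np

rng = np.random.default_rng(5)

def euler_maruyama(f, g, x0, T, dt):
    """Left-endpoint discretization dX ~ f(t,X)dt + g(t,X)dB of an SDE."""
    n_steps = int(T / dt)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        t = k * dt
        dB = rng.normal(0.0, np.sqrt(dt))          # increment dB ~ N(0, dt)
        x[k + 1] = x[k] + f(t, x[k]) * dt + g(t, x[k]) * dB
    return x

path = euler_maruyama(lambda t, x: -x, lambda t, x: 1.0, x0=2.0, T=5.0, dt=0.001)
print(path[::1000])          # decays toward 0, with fluctuations of size ~ 1/sqrt(2)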
-
Lecture 15 (Tue, Dec 1):
Stochastic differential equations and Itô integrals:
using left Riemann sums to approximate the solution of the SDE
dX_t = ƒ(t,X_t) dt + g(t,X_t) dB_t;
definition and examples of the L^1-limit
and the m.s.-limit (mean-square limit, L^2-limit) of sequences of random variables;
definition of the Itô integral as the m.s.-limit of the left Riemann sums
∑_i g(t_i, X_i) ΔB_i;
useful facts for calculations:
E[(ΔB_i)^{odd power}] = 0,
E[(ΔB_i)^2] = Δt_i,
E[(ΔB_i)^4] = 3(Δt_i)^2,
E[g(t_i, B_i) ΔB_i] = 0,
E[g(t_i, B_i)(ΔB_i)^2] = E[g(t_i, B_i)] Δt_i,
E[(ΔB_i)^k (ΔB_j)^m] = E[(ΔB_i)^k] E[(ΔB_j)^m] for i ≠ j;
computing the Itô integral from t_0 to t of B_s dB_s;
writing the result about the integral in the form
d(B_t^2) = 2 B_t dB_t + dt; similar results for
d(B_t^k) for k = 3, 4, 5, ...;
Itô formula for dΨ(t,X_t),
where Ψ(t,x) is a function of two variables
and X_t satisfies the SDE
dX_t = ƒ(t,X_t) dt + g(t,X_t) dB_t; mnemonic rules for deriving the formula;
remarks on the meaning of the solution X_t of an SDE;
non-anticipating functions and expectation of Itô integrals;
properties of Itô integrals; correlation formula; Itô isometry;
example - simple population growth at a noisy rate:
dX_t/dt = (r + α ξ_t) X_t
or, equivalently,
dX_t = r X_t dt + α X_t dB_t; obtaining the solution X_t = X_0 e^{(r − α^2/2)t + αB_t} by using the Itô formula; computing the average
E[X_t] = E[X_0] e^{rt};
discussion of the behavior of the solutions for
r > α^2/2 and for r < α^2/2;
remarks about the interpretation of the numerical simulations of the SDE.
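A Python sketch of the "noisy population growth" example above (hypothetical parameters r = 0.1, α = 0.3): compare an Euler path of dX_t = rX_t dt + αX_t dB_t with the exact solution X_t = X_0 e^{(r−α^2/2)t+αB_t} driven by the same noise, and check E[X_t] = E[X_0] e^{rt} by Monte Carlo.

import numpy as np

rng = np.random.default_rng(6)
r, alpha, x0, T, dt = 0.1, 0.3, 1.0, 1.0, 1e-3
n = int(T / dt)

dB = rng.normal(0.0, np.sqrt(dt), size=n)
x = x0
for k in range(n):                                   # Euler scheme on one path
    x += r * x * dt + alpha * x * dB[k]
x_exact = x0 * np.exp((r - alpha**2 / 2) * T + alpha * dB.sum())
print("Euler:", x, " exact:", x_exact)               # close for small dt

BT = rng.normal(0.0, np.sqrt(T), size=200_000)       # B_T ~ N(0, T)
XT = x0 * np.exp((r - alpha**2 / 2) * T + alpha * BT)
print("E[X_T] =", XT.mean(), " x0*exp(r*T) =", x0 * np.exp(r * T))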
-
Lecture 16 (Tue, Dec 8):
Stochastic differential equations and Itô integrals (cont.):
using the exponential martingale to analyze the average of the population
in the problem of simple population growth at a noisy rate
("geometric Brownian motion"), computing the variance of the population at time t;
example: linear growth with noise that is proportional to the population:
dX_t = dt + X_t dB_t,
solving the problem by "integration by parts" in stochastic calculus,
d(X_t Y_t) = X_t dY_t + Y_t dX_t + (dX_t)(dY_t),
computing the mean and the variance of the solution;
meaning and derivation of the Fokker-Planck equation for the conditional transition
density function p(x, x_0; t, t_0);
solution of the Fokker-Planck equation for the standard Brownian motion,
physical interpretation of the solution;
solution of the Fokker-Planck equation for the geometric Brownian motion
(lognormal distribution);
idea of the Stratonovich integral.
Grading:
Your grade will be determined by your performance
on the following coursework:
Homework (lowest grade dropped) | 50%
Take-home midterm exam | 20%
Take-home final exam | 30%
Homework:
Homework assignments will be given
regularly throughout the semester and will be posted on this web-site.
The homework will be due at the start of class on the due date.
Each homework will consist of several problems,
of which some pseudo-randomly chosen problems will be graded.
Your lowest homework grade will be dropped.
All homework should be written on 8.5"×11" paper
with your name clearly written, and should be stapled.
No late homework will be accepted!
You are encouraged to discuss the homework problems
with other students.
However, you have to write your solutions clearly
and in your own words - this is the only way to
achieve real understanding!
It is advisable that you first write a draft
of the solutions and then copy them neatly.
Please write the problems in the same order
in which they are given in the assignment.
There is no need to type the homework, but please use your best handwriting!
Exams:
There will be one take-home midterm and a comprehensive take-home final.
All tests must be taken at the scheduled times, except in extraordinary circumstances.
Please do not arrange travel plans that will prevent you
from taking any of the exams at the scheduled time.
Good to know: