MATH 5763 - Stochastic Processes, Section 001 - Spring 2019
TR 1:30-2:45 p.m., 809 PHSC
Instructor:
Nikola Petrov, 1101 PHSC, npetrov AT ou.edu
Office Hours:
Mon 2:30-3:30, Wed 10:30-11:30 (subject to change), or by appointment, in 1101 PHSC
First day handout
Prerequisite:
Basic calculus-based probability theory at the level of MATH 4733
(including axioms of probability, random variables, expectation,
probability distributions, independence, conditional probability).
The class will also require knowledge of elementary analysis
(including sequences, series, continuity),
linear algebra (including linear spaces, eigenvalues,
eigenvectors),
and ordinary differential equations (at the level
of MATH 3113 or MATH 3413).
Course description:
The theory of stochastic processes studies systems that evolve
randomly in time;
it can be regarded as the "dynamical" part of probability theory.
It has many important practical applications,
as well as applications in other branches
of mathematics such as partial differential equations.
This course is a graduate-level
introduction to stochastic processes,
and should be of interest to students of mathematics,
statistics, physics, engineering, and economics.
The emphasis will be on the fundamental concepts,
but we will avoid using the theory of Lebesgue measure
and integration in any essential way.
Many examples of stochastic phenomena in applications
and some modeling issues will also be discussed in class
and given as homework problems.
Texts:
We may use parts of the following books, freely available from the
OU Library for OU students:
-
[L] Mario Lefebvre, Applied Stochastic Processes, Springer, 2006
-
[BZ] Zdzisław Brzeźniak, Tomasz Zastawniak, Basic Stochastic Processes, Springer, 1999
-
[P] Emanuel Parzen, Stochastic Processes, SIAM, 1999
-
[D] Richard Durrett, Essentials of Stochastic Processes, Second ed., Springer, 2012
-
[R] Sheldon Ross, Introduction to Probability Models, Eighth ed., Elsevier, 2003
-
[K] Hui-Hsiung Kuo, Introduction to Stochastic Integration, Springer, 2007
-
[MO] Kosto V. Mitov, Edward Omey, Renewal Processes, Springer, 2014
-
[H] Moshe Haviv, Queues: A Course in Queueing Theory, Springer, 2013
-
[BGMT] Gunter Bolch, Stefan Greiner, Hermann de Meer, Kishor S. Trivedi,
Queueing Networks and Markov Chains, Second ed., Wiley Interscience, 2006
-
[MS] Yuliya Mishura, Georgiy Shevchenko, Theory and Statistical Applications of Stochastic Processes, Wiley, 2017
-
[Ø] Bernt Øksendal, Stochastic Differential Equations:
An Introduction with Applications, Sixth ed., Springer, 2013
Main topics (a tentative list):
-
a brief review of probability theory;
-
discrete Markov chains: Chapman-Kolmogorov equations,
persistence and transience, generating functions,
stationary distributions, reducibility, limit theorems, ergodicity;
-
continuous-time Markov processes:
Poisson process, birth-death and branching processes,
embedding of a discrete-time Markov chain
in a continuous-time Markov process;
-
conditional expectation, martingales;
-
stationary processes (autocorrelation function,
spectral representation);
-
renewal processes, queues;
-
diffusion processes, Wiener processes (Brownian motion);
-
introduction to stochastic differential equations, Itô calculus;
-
Fokker-Planck equation, Ornstein-Uhlenbeck process.
Homework:
-
Homework 1, due January 24 (Thursday).
-
Homework 2, due January 31 (Thursday).
-
Homework 3, due February 7 (Thursday).
-
Homework 4, due February 14 (Thursday).
-
Homework 5, due February 21 (Thursday).
-
Homework 6, due March 7 (Thursday).
-
Homework 7, due March 26 (Tuesday).
-
Homework 8, due April 11 (Thursday).
-
Homework 9, due April 25 (Thursday).
-
Homework 10, solved in class.
Content of the lectures:
-
Lecture 1 (Tue, Jan 15)
Review of probability:
-
sample space Ω, events as subsets of the sample space,
elementary events as elements of the sample space,
operations with events (complement, union,
intersection, difference, symmetric difference,
subset, impossible event);
σ-algebra (σ-field), examples;
De Morgan's laws, disjoint events,
distributivity properties of intersection and union;
-
probability (probability measure) P,
probability space,
elementary properties of probability measures
(including the inclusion-exclusion formula);
-
conditional probability P(A|B),
properties of conditional probability,
partitions of the sample space,
law of total probability
[pages 1-3, 5-6 of Sec. 1.1 of [L]]
-
Lecture 2 (Thu, Jan 17)
Review of probability (cont.):
-
independent events, independent family of events,
pairwise independent family of events;
conditional independence given an event;
the conditional independence of the events A and B
given an event C neither implies nor is implied
by the independence of A and B;
an example of using conditioning;
-
random variables (RVs),
(cumulative) distribution function (c.d.f.) FX(x)
of a RV X, properties of c.d.f.s;
discrete RVs, probability mass function (p.m.f.)
pX(x) of a discrete RV;
continuous RVs, p.d.f. ƒX(x) of a continuous RV;
p.d.f. ƒY(y)
of a function Y=g(X) of the RV X;
-
conditional c.d.f. FX(x|A)
of a RV X conditioned on an event A;
conditional p.m.f. pX(x|A)
or p.d.f. ƒX(x|A) of a RV X
conditioned on an event A;
-
expectation E[X] of a RV X;
expectation E[g(X)] of a function of a RV X;
rth moment E[Xr] of a RV X
for r=0,1,2,...;
variance Var X and standard deviation
σX=(Var X)1/2
of a RV X
[pages 7, 8 of Sec. 1.1 of [L];
pages 8, 9, 13, 12, 16, 17 of Sec. 1.2 of [L]]
-
Lecture 3 (Tue, Jan 22)
Review of probability (cont.):
-
representing the expectation of a random variable as an integral over Ω
with respect to the probability measure P;
-
conditional expectation E[X|A],
conditional moments E[Xr|A]
and conditional variance Var(X|A)
of a RV X given an event A;
-
random vectors: definition; (joint) c.d.f. FX,Y(x,y)
of a random vector X=(X,Y); properties of FX,Y(x,y);
marginal c.d.f.s FX(x)=FX,Y(x,∞)
and FY(y)=FX,Y(∞,y);
marginal p.m.f.s pX(x) and pY(y),
respectively p.d.f.s ƒX(x) and ƒY(y),
of a random vector (X,Y);
-
conditional c.d.f. FX|Y(x|Y=ym)
and conditional p.m.f.
pX|Y(xk|Y=ym)
of the discrete RV X conditioned on the discrete RV Y;
conditional c.d.f. FX|Y(x|y)
and conditional p.d.f.
ƒX|Y(x|y)
of the continuous RV X conditioned on the continuous RV Y;
-
conditional expectation E[X|Y=y] of the RV X
conditioned on the RV Y;
the conditional expectation E[X|Y=y]
depends only on y (but not on X), so it can be considered as a function of Y,
therefore we can think of the conditional expectation E[X|Y]
as a new random variable which is a function of the RV Y: namely,
E[X|Y]:Ω→R is defined by
E[X|Y](ω):=E[X|Y=y],
where y=Y(ω);
tower rule E[E[X|Y]]=E[X];
an example of using conditional expectation in computing the average grade
of all students in two classes of different sizes (a short worked version appears after this list)
[pages 14, 17 of Sec. 1.2 of [L];
pages 21-27 of Sec. 1.3 of [L]]
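A short worked version of the two-class example above, in LaTeX notation (the symbols n1, n2, m1, m2 are introduced here only for illustration): pick a student uniformly at random from two classes with n1 and n2 students and class averages m1 and m2, and let X be the student's grade and Y the class label; then

  \[
  \mathbb{E}[X\mid Y=1]=m_1,\qquad \mathbb{E}[X\mid Y=2]=m_2,\qquad
  \mathbb{P}(Y=1)=\frac{n_1}{n_1+n_2},
  \]
  \[
  \mathbb{E}[X]=\mathbb{E}\bigl[\mathbb{E}[X\mid Y]\bigr]
  =m_1\,\frac{n_1}{n_1+n_2}+m_2\,\frac{n_2}{n_1+n_2},
  \]

so the overall average is the size-weighted average of the class averages, not (m1+m2)/2 unless the classes have equal size.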
-
Lecture 4 (Thu, Jan 24)
Stochastic processes:
-
definition of a stochastic process (random process);
-
classification of random processes:
discrete-time or continuous-time,
discrete-state space or continuous-state space
[pages 47-48 of Sec. 2.1 of [L]]
Markov chains - introduction:
-
Markov property; Markov chain (MC);
-
example: simple 1-dimensional random walk (RW), symmetric simple 1-dim RW;
-
the future of a MC depends only on the most recent available information (Prop. 3.1.1);
-
more examples: 2-dimensional and d-dimensional RWs, the Ehrenfests' urn model,
birth-death processes
[pages 73-75 of Sec. 3.1 of [L]]
Discrete-time Markov chains - definitions and notations:
-
time-homogeneous discrete-time discrete-state space MCs;
stationary (time-homogeneous) MCs;
-
one-step and n-step transition probabilities;
one-step transition probability matrix P of a MC;
stochastic and doubly-stochastic matrices;
-
n-step transition probability matrices P(n);
-
Chapman-Kolmogorov equations in matrix form
(P(m+n)=P(m)P(n))
and in components;
corollary: P(n)=Pn;
-
an example of a MC with 2 states (a closed-form expression for its n-step transition matrix is sketched after this list);
-
probability ρij(n) of visiting state j for the first time
in n steps starting from state i;
probability ρii(n) of first return to state i in n steps;
representation of pij(n)
as a sum over k from 1 to n of
ρij(k)pjj(n−k);
-
examples of direct computation of ρij(n)
in the example of a MC with 2 states
[pages 73-82 of Sec. 3.2.1 of [L]]
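A sketch of the closed-form n-step transition matrix for a generic two-state chain (the letters a and b below are illustrative labels for the off-diagonal entries, not notation from [L]): with

  \[
  P=\begin{pmatrix}1-a & a\\ b & 1-b\end{pmatrix},\qquad 0<a+b<2,
  \]

the relation P(n)=Pn gives

  \[
  P^n=\frac{1}{a+b}\begin{pmatrix}b & a\\ b & a\end{pmatrix}
  +\frac{(1-a-b)^n}{a+b}\begin{pmatrix}a & -a\\ -b & b\end{pmatrix},
  \]

and since |1−a−b|<1 each row converges to (b/(a+b), a/(a+b)), anticipating the limiting distributions discussed later in the course.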
-
Lecture 5 (Tue, Jan 29)
Discrete-time Markov chains - definitions and notations (cont.):
-
initial distribution (p.m.f.) a=(a0,a1,a2,...),
ai=P(X0=i), of a MC;
distribution (p.m.f.)
a(n)=(a0(n),a1(n),a2(n),...),
ai(n)=P(Xn=i), of a MC at time n;
-
formula for evolution of the probability distribution:
a(n)=aPn;
-
examples: simple 1-dim random walk on Z,
simple 1-dim random walk on Z+ with reflecting
and absorbing boundary conditions at 0
[Sec. 3.2.1 of [L]]
Properties of Markov chains:
-
accessibility of state j from state i, i→j;
communicating states i↔j;
-
properties of the relation ↔
(reflexivity, symmetry, transitivity),
↔ is an equivalence relation;
-
equivalence classes with respect to ↔;
-
closed sets of the state space (Def. 3.2.7);
-
irreducible MCs;
irreducibility criteria; examples;
-
absorbing states
[pages 85-86 of Sec. 3.2.2 of [L]]
-
Lecture 6 (Thu, Jan 31)
Discrete-time Markov chains - definitions and notations (cont.):
-
probability ƒij
of eventual visit of state j starting from state i;
probability ƒii
of eventual return to state i;
-
expressing ƒij as a sum of the first visit probabilities
ρij(n);
-
recurrent (persistent) and transient states;
-
Decomposition Theorem;
-
an example of identifying closed irreducible sets of recurrent states
and sets of transient states, and structure of the stochastic matrix;
-
a necessary and sufficient criterion of recurrence
of state i in terms of the expected value E[Ni]
of the number Ni of returns to this state (Prop. 3.2.3);
-
a necessary and sufficient criterion of recurrence of state i
in terms of a sum of the (ii)th matrix element of
P(n) over n (Prop. 3.2.4);
-
recurrence is a class property (Prop. 3.2.5);
-
average number μi of transitions for first return to state i;
-
positive recurrent and null-recurrent states;
-
criterion for null-recurrence;
-
type of recurrence (positive or null) is a class property;
-
recurrent states of a finite MC are positive recurrent;
-
examples of identifying the transient and recurrent states
and splitting an MC into classes according to the Decomposition Theorem
[pages 87-90 of Sec. 3.2.2 of [L]]
Reading assignment (mandatory):
Read the proof of Proposition 3.2.5 (page 88, 89 of [L])
-
Lecture 7 (Tue, Feb 5)
Discrete-time Markov chains - definitions and notations (cont.):
-
periodic and aperiodic states; remarks about periodicity; examples;
-
simple random walk on Z:
computing the number of itineraries by using combinatorial arguments, Stirling's formula,
recurrence in the symmetric case (p=1/2)
and transience otherwise
-
simple symmetric random walk on Zd:
it is recurrent for d=2, and transient for d=3,4,5,...
[pages 91-93 of Sec. 3.2.2 of [L]]
Limiting probabilities:
-
limiting probabilities πi,
limiting probability distribution
π=(π0,π1,π2,...);
-
ergodic states;
-
Ergodic Theorem (giving conditions for existence and uniqueness
of a limiting probability distribution,
the relation between πi and
the average value μi of the first return time to state i,
and an algorithm for computing π
as a normalized left eigenvector of the one-step transition probability matrix P);
a numerical sketch of this eigenvector computation appears after this list;
-
origin of the word "stationary":
if pXn(i)=πi,
then
pXn+1(i)=πi
(proof: exercise!);
-
computing high powers of a matrix by diagonalizing it first;
-
example of application of the Ergodic Theorem;
-
example: simple random walk on {0,1,2,3,...} with a partially reflecting boundary:
setting up the problem, writing down the one-step transition probability matrix P
and the system of equations for the stationary distribution π
[pages 94-97, 99 of Sec. 3.2.3 of [L]]
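A minimal numerical sketch of the left-eigenvector recipe above; the 3×3 matrix P below is a made-up example, not one from [L] or from class.

  import numpy as np

  # A made-up stochastic matrix for illustration (rows sum to 1).
  P = np.array([[0.5, 0.3, 0.2],
                [0.1, 0.6, 0.3],
                [0.2, 0.2, 0.6]])

  # The stationary distribution pi solves pi P = pi, i.e., pi is a left
  # eigenvector of P for eigenvalue 1 (a right eigenvector of P^T),
  # normalized so that its entries sum to 1.
  eigvals, eigvecs = np.linalg.eig(P.T)
  k = np.argmin(np.abs(eigvals - 1.0))        # index of the eigenvalue 1
  pi = np.real(eigvecs[:, k])
  pi = pi / pi.sum()

  # Check: the distribution a(n) = a P^n converges to pi.
  a = np.array([1.0, 0.0, 0.0])               # start in state 0
  print(pi, a @ np.linalg.matrix_power(P, 50))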
-
Lecture 8 (Thu, Feb 7)
Limiting probabilities (cont.):
-
example (cont.):
simple random walk on {0,1,2,3,...} with a partially reflecting boundary:
solving linear recurrence relations (characteristic equation,
general form of the solution, finding the arbitrary constants
from the boundary conditions);
-
solving the system of equations for the stationary distribution π:
computing stationary distribution when the probability p
of moving to the right is smaller than 1/2,
showing that a stationary distribution does not exist when p>1/2
[pages 98-100 of Sec. 3.2.3 of [L]]
Absorption problems:
-
definition of the probability
ri(n)(C)
of absorption by the closed subset C of the state space S
after exactly n steps (starting from state i);
-
definition of the probability ri(C)
of eventual absorption by the closed subset C of the state space S
(starting from state i);
-
a theorem giving ri(C)
in terms of the (pij) (Theorem 3.2.2);
-
an example: the gambler's ruin problem;
-
martingales;
-
solving the gambler's ruin problem using martingales (the key martingale step in the fair case is summarized after this list)
[Sec. 3.2.4 of [L]]
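A summary of the martingale step in the fair case p=1/2 (here k denotes the gambler's initial fortune and N the target fortune; the general case p∈(0,1) is the exercise below): the fortune Yn is a martingale, and stopping it at T=min{n: Yn∈{0,N}} gives E[YT]=E[Y0]=k, so

  \[
  k=0\cdot\mathbb{P}(Y_T=0)+N\cdot\mathbb{P}(Y_T=N)
  \quad\Longrightarrow\quad
  \mathbb{P}(\text{reach }N\text{ before }0)=\frac{k}{N},\qquad
  \mathbb{P}(\text{ruin})=1-\frac{k}{N}.
  \]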
Exercise: Study the simple random walk on {0,1,2,3,...}
with a partially reflecting boundary if the walk is symmetric
(i.e., p=1/2).
Exercise/reading assignment:
Solve the gambler's ruin problem in the general case p∈(0,1).
-
Lecture 9 (Tue, Feb 12)
Continuous-time discrete-state space MCs:
-
definition of a continuous-time discrete-state space MC;
-
Markov property;
-
transition functions
pij(s,t)=P(Xt=j|Xs=i) for t>s;
-
stationary (time-homogeneous) MCs - for which
pij(s,t)=pij(0,t−s),
notation:
pij(t)=pij(0,t)=P(Xt=j|X0=i) for t>0;
-
a discrete-time MC {Yn}n∈{0,1,2,...}
embedded in the continuous-time MC {Xt}t≥0;
-
irreducibility;
-
analogue of the condition of being
a stochastic matrix for pij(t)
(the sum over j is 1);
-
evolution of the occupation probabilities
pj(t)=P(Xt=j)
expressed in terms of the initial occupation probabilities
pi(0) and the transition probabilities pij(t);
-
Chapman-Kolmogorov equations;
-
discussion of the meaning of the memorylessness properties of the geometric
and the exponential random variables (Prop. 3.3.1)
-
exponential random variable: definition,
proof that it is memoryless, moment-generating function,
other properties (Prop. 3.3.4 and the remarks after it, Prop. 3.3.5)
[pages 121-123 of Sec. 3.3.2 and pages 109-110 of Sec. 3.3.1 of [L]]
Poisson process:
-
counting process;
-
"little o(h)" notation, examples;
-
definition of a Poisson process N as a nondecreasing process with N(0)=0,
certain short-time transition probabilities
pij(h) (for small h),
and independence of the events occurring in a later time interval
from the events occurring in a non-overlapping earlier time interval
[loosely following pages 231, 232, 236 of Sec. 5.1 of [L]]
-
Lecture 10 (Thu, Feb 14)
Poisson process (cont.):
-
derivation of the distribution of N(t)
for a Poisson process N by deriving an initial-value problem
for an infinite system of ODEs for pij(t)
and solving the system:
-
by mathematical induction,
-
by the method of generating functions, and
-
an "elementary" way of deriving that N(t)∼Poisson(λt)
by dividing the interval [0,t] into a large number n of short intervals
of length t/n and applying the binomial distribution
to the number of events occurring in the n short intervals.
Poisson process and distribution of interarrival times:
-
arrival times Tj as sums of interarrival times;
-
reconstructing the Poisson process from the interarrival times (a small simulation sketch appears after this list)
[pages 237-238 of Sec. 5.1 of [L]]
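A minimal simulation sketch of this reconstruction (the rate and the time horizon below are arbitrary values chosen only for illustration):

  import numpy as np

  rng = np.random.default_rng(0)
  lam, horizon = 2.0, 10.0                 # rate and time horizon (arbitrary)

  # Interarrival times tau_j ~ Exp(lam), i.i.d.; the arrival times T_n are
  # their partial sums; N(t) counts the arrivals in [0, t].
  tau = rng.exponential(1.0 / lam, size=200)
  T = np.cumsum(tau)
  T = T[T <= horizon]

  def N(t):
      """Number of arrivals up to time t (the reconstructed Poisson process)."""
      return np.searchsorted(T, t, side="right")

  # Sanity check: E[N(horizon)] = lam * horizon for a Poisson process.
  print(N(horizon), lam * horizon)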
-
Lecture 11 (Tue, Feb 19)
Lecture cancelled due to weather.
-
Lecture 12 (Thu, Feb 21)
Poisson process and distribution of interarrival times (cont.):
-
independence and exponential distribution of the interarrival times τj
of a Poisson process;
-
basic properties of the Γ(α,λ) random variables;
-
the sum of n i.i.d. Exp(λ) random variables is a Γ(n,λ)
random variable (Prop. 3.3.6);
-
reconstructing a Poisson process from the interarrival times τj;
-
several facts about the Poisson process:
-
if Mt and Wt
are independent Poisson processes with rates μ and ν, respectively,
then
Nt=Mt+Wt
is a Poisson process with rate μ+ν (Prop. 5.1.1),
-
if Mt, Wt, and Nt
are defined as above, then the conditional distribution of Mt given {Nt=n}
is Binomial with parameters n and μ/(μ+ν),
-
if X∼Exp(μ) and Y∼Exp(ν) are independent, then
Z=min(X,Y)∼Exp(μ+ν)
(think about the first arrival time of a Poisson process
that is a "combination" of two Poisson processes of rates μ and ν
running simultaneously (if we do not distinguish between the events
of the two Poisson processes)) (~Prop. 3.3.4, 3.3.5).
[pages 234, 235, 237-239 of Sec. 5.1, pages 113, 115-119 of Sec. 3.3.1 of [L]]
Continuous-time discrete-state space MCs (cont. from Lecture 9):
-
stochastic semigroup {Pt}t≥0;
-
generator
G=(νij):=(dPt/dt)|t=0
of a stochastic semigroup;
-
properties of G
(the sum of the elements νij in each row of G is zero);
-
obtaining Pt from G:
Kolmogorov forward and backward equations
Pt'=PtG,
resp.
Pt'=GPt,
-
initial condition
Pt|t=0=I
[roughly following Sec. 3.3.3 of [L]]
-
Lecture 13 (Tue, Feb 26)
Continuous-time discrete-state space MCs (cont.):
-
definition of exponential of a matrix eA;
-
computing eA by simplifying A by a similarity
transformation, e.g., A=C−1DC
for a diagonal matrix D,
and using that An=C−1DnC
to show that eA=C−1eDC;
-
expressing the solution of the initial value problem
x'(t)=Ax(t), x(0)=x0
(where x:R→Rd is the unknown function
and A is a constant d×d matrix)
as x(t)=etAx0;
-
expressing the stochastic semigroup Pt
through the generator G: Pt=etG;
-
computing Pt for a continuous-time, two-state MC (the resulting closed form is written out after this list);
-
remarks on the Laplace transform and its usage to solve initial-value problems for ODEs;
-
definition of a birth process;
-
solving the Kolmogorov forward equations for the birth process by using generating functions
Gi(ξ,t);
-
using the generating function Gi(ξ,t)
to prove that the birth process is honest (Gi(1,t)=1)
[roughly following pages 129-133 of Sec. 3.3.4 of [L]]
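For the two-state example mentioned above, a sketch of the closed form of Pt=etG (the rates λ and μ below are illustrative labels for the two off-diagonal entries of the generator):

  \[
  G=\begin{pmatrix}-\lambda & \lambda\\ \mu & -\mu\end{pmatrix},\qquad
  P_t=e^{tG}=\frac{1}{\lambda+\mu}
  \begin{pmatrix}\mu+\lambda e^{-(\lambda+\mu)t} & \lambda\bigl(1-e^{-(\lambda+\mu)t}\bigr)\\
  \mu\bigl(1-e^{-(\lambda+\mu)t}\bigr) & \lambda+\mu e^{-(\lambda+\mu)t}\end{pmatrix},
  \]

so P0=I, each row of Pt sums to 1, and as t→∞ each row tends to (μ/(λ+μ), λ/(λ+μ)).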
-
Lecture 14 (Thu, Feb 28)
Continuous-time discrete-state space MCs (cont.):
-
using the generating function Gi(ξ,t)
to compute the conditional average
E[X(t)|X(0)=k]=(∂Gk(ξ,t)/∂ξ)|ξ=1
and the conditional variance Var(X(t)|X(0)=k)
of the birth process given that X(0)=k;
-
another way of computing the conditional expectation E[X(t)|X(0)=k]
by writing an ODE for it.
Limiting probabilities and balance equations:
-
stationary distribution π of a stochastic semigroup Pt;
-
reason for the term "stationary distribution":
if P(X(0)=i)=πi
where π is a stationary distribution, then
P(X(t)=j)=πj
for all j∈S and all t≥0;
-
recurrence time Tii,
mean recurrence time μii=E[Tii];
-
recurrent and transient states,
positive recurrent and null recurrent states of a continuous-time Markov chain;
-
irreducible Markov chains;
-
Ergodic Theorem for continuous-time Markov processes, remarks;
-
relation between the stationary distribution πj,
the rate νj of leaving state j
(where the holding time for state j is Uj∼Exp(νj)),
and the mean recurrence time μii=E[Tii];
-
finding stationary distributions from the generator:
πG=0;
balance equations and their interpretation (a birth-death example is sketched after this list)
[pages 138-140 of Sec. 3.3.5 of [L]]
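A sketch of how πG=0 (equivalently, the balance equations) is solved for a pure birth-death process, assuming birth rates λj, death rates μj, and a normalizable solution:

  \[
  \lambda_0\pi_0=\mu_1\pi_1,\qquad
  (\lambda_j+\mu_j)\pi_j=\lambda_{j-1}\pi_{j-1}+\mu_{j+1}\pi_{j+1}\quad(j\ge 1),
  \]
  \[
  \pi_j=\pi_0\prod_{i=1}^{j}\frac{\lambda_{i-1}}{\mu_i},\qquad
  \pi_0=\Bigl(1+\sum_{j\ge 1}\prod_{i=1}^{j}\frac{\lambda_{i-1}}{\mu_i}\Bigr)^{-1}.
  \]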
Birth and death processes:
-
birth-death-immigration-disaster process - general set-up.
-
Lecture 15 (Tue, Mar 5)
Birth and death processes (cont.):
-
detailed derivation of the short-time transition probabilities pij(h)
(hence, the generator νij) of a death-immigration process;
-
proving that the stationary distribution π of a death-immigration process
is Poisson(ρ/μ), i.e.,
πj=e−ρ/μ(ρ/μ)j/j!
[roughly following pages 135, 136 of Sec. 3.3.4 of [L]]
Nonhomogeneous Poisson processes:
-
nonhomogeneous Poisson process with intensity function λ(t);
-
the number of arrivals Ns+t−Ns
of a nonhomogeneous Poisson process with intensity function λ(t)
is Poisson(m(s+t)−m(s)),
where m(t) is the mean value function of the process
(defined by m(0)=0, m'(t)=λ(t)),
-
"homogenizing" a nonhomogeneous Poisson process N(t)
(with a strictly positive rate function λ(t)>0)
by rescaling the time:
Mt:=Nm−1(t)
is a Poisson process with rate 1
[pages 250, 252-254 of Sec. 5.2 of [L]]
Reading assignment (optional):
the p.d.f. of the first arrival time T1
of a nonhomogeneous Poisson process {Nt}t≥0
given that Nt1=1 (for some fixed t1>0)
(Prop. 5.2.2) [page 253 of [L]]
-
Lecture 16 (Thu, Mar 7)
Compound Poisson processes:
-
compound random variable;
-
derivation of the mean, variance, and moment generating function
of a compound random variable (Prop. 5.3.1);
-
definition of a compound Poisson process;
-
mean, variance, and moment generating function of a compound Poisson process (summarized after this list);
-
approximating the distribution of a compound Poisson process
for large times by using the Central Limit Theorem (Prop. 5.3.2);
-
the sum of two independent compound Poisson processes
Y1(t) and Y2(t)
corresponding to Poisson processes
N1(t) and N2(t)
with rates λ1 and λ2
is a compound Poisson process
corresponding to a Poisson process with rate
λ1+λ2
[Sec. 5.3 of [L]]
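For reference, the standard formulas behind the items above, for Y(t)=X1+...+XN(t) with the Xi i.i.d. and independent of the Poisson process N(t) of rate λ:

  \[
  \mathbb{E}[Y(t)]=\lambda t\,\mathbb{E}[X_1],\qquad
  \operatorname{Var}[Y(t)]=\lambda t\,\mathbb{E}[X_1^2],\qquad
  M_{Y(t)}(s)=\exp\bigl(\lambda t\,(M_{X_1}(s)-1)\bigr).
  \]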
Doubly stochastic Poisson processes:
definition of a conditional (or "mixed") Poisson process
(whose rate is a random variable, independent of time)
[page 258 of Sec. 5.4 of [L]]
-
Lecture 17 (Tue, Mar 12)
Doubly stochastic Poisson processes (cont.):
-
proof that a conditional Poisson process
has stationary, but not independent increments (Prop. 5.4.1)
[pages 258, 259 of Sec. 5.4 of [L]]
Renewal processes:
-
definition of a renewal process;
-
modified ("delayed") renewal process (when the distribution of τ0
differs from the distributions of τ1, τ2,...);
-
relations between the process N(t),
the times of the events Tn,
and the interevent times τn;
-
expression for the p.m.f. of N(t)
in terms of the c.d.f. of Tn (Prop. 5.6.1);
-
renewal function m(t)=E[N(t)];
-
expression for the renewal function m(t)
in terms of the c.d.f.'s of Tn (Prop. 5.6.2)
[pages 267-269 of Sec. 5.6 of [L]]
Mathematical digression:
-
definition of Riemann integral;
-
definition of the Riemann-Stieltjes integral;
-
a particular case of the Riemann-Stieltjes integral
when g(t) is differentiable and non-decreasing.
-
Lecture 18 (Thu, Mar 14)
Mathematical digression (cont.):
-
a particular case of the Riemann-Stieltjes integral
when g(t) is non-decreasing and piecewise constant;
-
applications of the Riemann-Stieltjes integral
to computing expected values of discrete and continuous random variables;
-
expected value of an N-valued random variable X
as a sum (over n from 1 to infinity)
of the probabilities that X is greater than or equal to n;
-
expected value of a non-negative continuous random variable
X as an integral of
[1−FX(x)],
geometric meaning.
Renewal processes (cont.):
-
recursive formula for the c.d.f.'s of the arrival times Tn
in terms of the c.d.f. Fτ of the inter-arrival
times τj through Riemann-Stieltjes integrals;
-
derivation of an integral equation for the renewal function
m(t)=E[Nt];
-
solving renewal-type equations by using Laplace transform.
-
Lecture 19 (Tue, Mar 26)
Renewal processes (cont.):
-
another derivation of the formula for the renewal function
m(t) by performing Laplace transformation on the formula
representing m(t)
as a sum of the c.d.f.s of all the Tn's;
-
computing the renewal function m(t) for a Poisson process in three ways (the Laplace-transform computation is sketched after this list):
-
by using the fact that N(t) is a Poisson(λt) random variable,
-
by expressing it as a sum of the c.d.f.'s of Tn (Prop. 5.6.2),
-
by solving the integral equation for m(t) using Laplace transform;
the moment-generating function MX
of a (0,∞)-valued random variable X:Ω→(0,∞)
is equal to the Laplace-Stieltjes transform of the c.d.f. FX
of X and, if the random variable X is continuous,
equal to the Laplace transform of the p.d.f. ƒX of X.
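A sketch of the Laplace-transform computation for the Poisson case (tilde denotes the ordinary Laplace transform; the interarrival distribution here is Exp(λ), so it has a density): the renewal equation m(t)=Fτ(t)+∫0t m(t−x) dFτ(x) transforms into

  \[
  \tilde m(s)=\tilde F_\tau(s)+\tilde m(s)\,\tilde f_\tau(s)
  \;\Longrightarrow\;
  \tilde m(s)=\frac{\tilde F_\tau(s)}{1-\tilde f_\tau(s)}
  =\frac{\lambda/\bigl(s(s+\lambda)\bigr)}{s/(s+\lambda)}=\frac{\lambda}{s^{2}},
  \]

so m(t)=λt, in agreement with E[N(t)] for a Poisson(λt) random variable.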
Queues:
-
set-up of the problem, examples of queues
(queues with baulking, multiple servers, airline check-in, FIFO, LIFO,
group service, "student discipline", "continental queueing");
-
A/S/s/c/p/D classification of the queues,
where A and S are deterministic (D),
Markovian
(M - with exponentially distributed interarrival/service times),
Γ (or Erlang), or general (G) distributions,
s is the number of servers, c is the capacity of the system,
p is the size of the population, D is the discipline (i.e., service policy);
-
stability of a queue;
-
M(λ)/G/1 queue - construction of a discrete-time Markov
chain embedded in the queueing process;
scanned notes of the derivation
of the transition probability matrix of this Markov chain.
General properties of stochastic processes:
-
cumulative distribution function
F(x1,...,xk;t1,...,tk)=FXt1,...,Xtk(x1,...,xk),
probability mass function
p(x1,...,xk;t1,...,tk)=pXt1,...,Xtk(x1,...,xk),
and probability density function
ƒ(x1,...,xk;t1,...,tk)=ƒXt1,...,Xtk(x1,...,xk)
of order k of a stochastic process
X={Xt}t∈[0,∞);
-
mean
mX(t)=E[Xt],
autocorrelation function
RX(t1,t2)=E[Xt1Xt2],
autocovariance function
CX(t1,t2)=RX(t1,t2)−mX(t1)mX(t2),
variance
var Xt=CX(t,t),
and autocorrelation coefficient
ρX(t1,t2)
of a stochastic process X.
[pages 48-50 of Sec. 2.1 of [L]]
-
Lecture 20 (Thu, Mar 28)
General properties of stochastic processes (cont.):
-
a reminder about the meaning of the correlation between two random variables;
-
stochastic processes with independent increments;
-
stochastic processes with stationary increments;
-
strict-sense stationary (SSS, strongly stationary) stochastic processes;
-
wide-sense stationary (WSS, weakly stationary) stochastic processes;
-
average power E[Xt2] of a stochastic process;
-
E[Xt2] of a WSS stochastic process
does not depend on t;
-
spectral density SX(ω) of a WSS process
[pages 50-55 of Sec. 2.1 and 2.2 of [L]]
Gaussian and Markov processes:
-
multinormal distribution of a random vector
X=(X1,...,Xn)∼N(m,K),
vector of the means m, covariance matrix
K=(cov(Xi,Xj));
-
characteristic function φX(ω)=E[exp(iωX)]
of a random variable X,
(joint) characteristic function
φX(ω)=E[exp(iω⋅X)]
of a multinormal random variable X
(Prop. 2.4.1);
-
if two components of
X=(X1,...,Xn)∼N(m,K)
are uncorrelated, then they are independent;
-
Gaussian process {Xt}
- a continuous-time stochastic process with
(Xt1,...,Xtn)
being multinormal for any n and any times
t1,...tn;
-
if {Xt} is a Gaussian process
such that its mean mX(t)
does not depend on t and its autocovariance function
CX(t1,t2)
depends only on t2−t1,
then the process is SSS (Prop. 2.4.2);
-
definition of a Markov (or Markovian) process, examples
(random walk, Poisson process);
-
(first-order) density function
ƒ(x;t)=ƒXt(x);
-
conditional transition density function
p(x,x0;t,t0)=ƒXt|Xt0(x|x0);
-
a Markovian, continuous-time continuous-state space stochastic process {Xt}
is completely determined by ƒ(x;t)=ƒXt(x)
and p(x,x0;t,t0)=ƒXt|Xt0(x|x0);
-
integrals of ƒ(x;t)
and
p(x,x0;t,t0)
over x are equal to 1
[pages 58-62 of Sec. 2.4 of [L]]
-
Lecture 21 (Tue, Apr 2)
Gaussian and Markov processes (cont.):
-
expressing ƒ(x;t) as an integral of
ƒ(x0;t0)p(x,x0;t,t0)
over x0;
-
more on the meaning of the p.d.f. of a continuous RV:
P(X∈(x,x+Δx])≈ƒX(x)Δx,
generalization for jointly continuous random vectors
P(X∈A)≈ƒX(x)vol(A),
where A is a small domain in Rk containing x;
-
application to kth order p.d.f.'s of a random process:
P(Xt1∈(x1,x1+Δx1],...,Xtk∈(xk,xk+Δxk])≈ƒ(Xt1,...,Xtk)(x1,...,xk)Δx1...Δxk;
-
Chapman-Kolmogorov equations for the conditional transition density function
p(x,x0;t,t0)=ƒXt|Xt0(x|x0)
-
since in the limit t→t0+,
the process did not have time to evolve,
p(x,x0;t,t0)→δ(x−x0)
as
t→t0+
[pages 58-63 of Sec. 2.4 of [L]]
A digression on generalized functions (distributions):
-
test functions (infinitely smooth compactly supported functions);
-
Dirac δ-function δa
defined by δa(ƒ):=ƒ(a);
-
derivatives of generalized functions - defined by applying integration by parts,
treating the generalized function as a regular function
and using that a test function ƒ satisfies
limx→∞ƒ(x)=0
and
limx→−∞ƒ(x)=0
(because ƒ has compact support);
-
the above recipe gives us that integral of ƒ times the kth derivative
ξ(k) of a generalized function ξ
is equal to (−1)k times
integral of ξ times ƒ(k),
which symbolically can be written as
ξ(k)(ƒ):=(−1)kξ(ƒ(k));
-
following this recipe, the derivatives of δa
are defined by δa'(ƒ):=−ƒ'(a),
δa''(ƒ):=(−1)2ƒ''(a),
and in general
δa(k)(ƒ):=(−1)kƒ(k)(a);
-
example: generalized derivative
of the Heaviside (unit step) function:
Ha'=δa.
-
interpretation of generalized functions as a "rough" signal, and of the test function as a "smoothing" function corresponding to the "smearing" due to the experimental device.
-
Lecture 22 (Thu, Apr 4)
The Wiener process (Brownian motion):
-
normal (Gaussian) random variables N(μ,σ2):
p.d.f., mean, variance, characteristic function,
standard normal random variable Z∼N(0,1)
which can be obtained from X∼N(μ,σ2)
as Z=(X−EX)/σX;
-
definition of Brownian motion/Wiener process
Wt∼N(0,σ2t)
and a standard Wiener process Bt∼N(0,t);
-
the Wiener process as a limit of a simple random walk (a small simulation sketch appears after this list);
-
historical remarks (Robert Brown, Albert Einstein,
Norbert Wiener, Andrey Kolmogorov);
-
p.d.f. of order k of a Wiener process;
-
moments of Wt:
mean E[Wt]=0,
autocovariance function
CW(t,s)=E[WtWs]=σ2min(t,s),
autocorrelation function
RW(t,s)=E[WtWs]=σ2min(t,s);
-
for constants a, b∈R and Wt a Brownian motion,
the process a+bWt
is a Brownian motion (proof: exercise);
-
proof that tW1/t
is a Brownian motion (Example 4.1.2)
[pages 173-179 of Sec. 4.1 of [L]]
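A minimal simulation sketch of the random-walk approximation mentioned above (the time horizon, step count, and number of sample paths are arbitrary choices):

  import numpy as np

  rng = np.random.default_rng(0)
  T, n = 1.0, 10_000               # time horizon and number of steps (arbitrary)
  dt = T / n

  # One path of the scaled simple symmetric random walk: steps of size sqrt(dt),
  # so the walk after k steps approximates a standard Brownian motion at time k*dt.
  steps = rng.choice([-1.0, 1.0], size=n) * np.sqrt(dt)
  B = np.concatenate(([0.0], np.cumsum(steps)))

  # Sanity check over many paths: E[B_T] = 0 and Var(B_T) = T.
  ends = np.array([rng.choice([-1.0, 1.0], size=n).sum() * np.sqrt(dt)
                   for _ in range(2000)])
  print(B[-1], ends.mean(), ends.var())   # mean near 0, variance near T = 1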
-
Lecture 23 (Tue, Apr 9)
σ-algebras and probability measures:
-
sample space Ω - a set of outcomes (elementary events) ω;
-
σ-algebra F of subsets of Ω;
event - an element of F;
sub-σ-algebra G⊆F
of F;
examples;
-
Borel σ-algebra B(R) of subsets of R;
-
σ-algebra σ(A1,A2,...)
generated by a collection A1, A2,... of subsets of Ω;
examples;
-
Lebesgue measure L:B(R)→R on R defined by
L((a,b))=b−a;
-
probability measure P:F→[0,1] on (Ω,F);
probability space (Ω,F,P);
-
F-measurable functions X:Ω→R - for which
{X∈B}∈F for all
B∈B(R);
-
random variable on (Ω,F)
- an F-measurable function X:Ω→R;
-
an example: a σ(∅)-measurable function is a constant function;
more examples;
-
σ-algebra σ(X) generated by a random variable X;
-
σ-algebra σ(F1,...,Fn)
generated by a collection of σ-algebras;
-
σ-algebra σ(X1,...,Xn)
generated by a family of random variables
X1,...,Xn;
-
filtration
F1⊆F2⊆F3⊆... of σ-algebras generated by a sequence
X1,X2,X3,...
of functions Xk:Ω→R,
where Fk=σ(X1,...,Xk);
-
example: filtration of σ-algebras generated by a sequence of coin tosses.
-
Lecture 24 (Thu, Apr 11)
σ-algebras and probability measures (cont.):
-
distribution (cumulative distribution function, c.d.f.)
FX(x)=P(X≤x)=P({ω∈Ω:X(ω)≤x}) of a random variable X;
-
expectation E[X] of a random variable X as an integral over R
of x dFX(x)
or, equivalently, as an integral over Ω of X(ω) P(dω).
Conditional expectation and martingales:
-
conditional expectation E[X|A]
of a random variable X conditioned on an event A∈F;
-
conditional expectation E[X|F]
of a random variable X conditioned on a σ-algebra F;
-
conditional expectation E[X|Y]
of a random variable X conditioned on another random variable Y;
-
discussion of the meaning of the filtration
F1⊆F2⊆F3⊆...
with
Fn=σ(X1,...,Xn)
in the context of "coin tossing" (where Xn is the result of the nth toss)
- Fn represents our knowledge at time n;
-
a sequence Y1,Y2,... of random variables
adapted to the filtration
F1⊆F2⊆F3⊆...
- each Yn is
Fn-measurable
(i.e., can be determined from the values of the random variables
X1,...,Xn
generating the σ-algebra
Fn=σ(X1,...,Xn));
-
an example - the running averages Sn;
-
martingale Y1,Y2,...
with respect to a filtration F1⊆F2⊆F3⊆...
- a sequence of L1-random variables (i.e., such that E[|Yn|]<∞) adapted to the filtration
and such that E[Yn+1|Fn]=Yn;
-
an example of a martingale - the positions
Y1,Y2,... of a particle in a simple symmetric random walk;
-
an example of a continuous-time martingale - for a Poisson process {Nt}t≥0
with intensity λ,
Yt=Nt−λt is a martingale.
-
Lecture 25 (Tue, Apr 16)
Conditional expectation and martingales (cont.):
-
example - exponential martingale
exp(αBt−α2t/2),
obtaining a family of polynomial martingales from the Taylor
expansion of the exponential martingale with respect to α around α=0:
exp(αBt−α2t/2)=1+Btα+(1/2)(Bt2−t)α2+(1/6)(Bt3−3tBt)α3+(1/24)(Bt4−6tBt2+3t2)α4+(1/120)(Bt5−10tBt3+15t2Bt)α5+..., so that Bt, (Bt2−t), (Bt3−3tBt), (Bt4−6tBt2+3t2), (Bt5−10tBt3+15t2Bt), ..., are martingales.
The Wiener process (Brownian motion) (cont.):
-
a brief review - definition of Wt and Bt,
increments, moments
(E[Wtk]=0 for odd k,
Var Wt=E[Wt2]=σ2t,
E[Wt4]=3σ4t2,
E[Wt6]=15σ6t3,...),
autocorrelation and autocovariance functions,
probability density function
ƒ(x1,...,xk;t1,...,tk)=ƒW(t1),...,W(tk)(x1,...,xk)
of order k expressed as product of the p.d.f.
ƒ(x1;t1)=ƒW(t1)(x1)
and the conditional p.d.f.'s
ƒW(tj)|W(tj−1)(xj|xj−1) for j=2,...,k,
explicit expressions for all p.d.f.'s;
-
short-time behavior: for Δt>0 and
ΔBt:=Bt+Δt−Bt,
computing
E[(ΔBt)k]=0 for odd k, E[(ΔBt)2]=Δt,
and, for the difference quotient, E[(ΔBt/Δt)2]=1/Δt→∞
as Δt→0+,
nondifferentiability of the Brownian motion;
-
Gaussian white noise ξt:=dBt/dt;
-
making sense of the derivative dBt/dt
as a (random) generalized function acting on a test function φ,
definition of a functional Ξ(φ) as an integral of ξtφ(t)
over t from 0 to ∞
(meaning: a measurement "smeared" by φ);
-
a proof that E[Ξ(φ)]=0 and that E[Ξ(φ)2]
equals integral of φ(t)2 over t from 0 to ∞,
interpreting these facts as
E[ξt]=0 and E[ξtξs]=δ(t−s).
Reading assignment (mandatory):
expectation of a "smeared" derivative of a Wiener process.
-
Lecture 26 (Thu, Apr 18)
Stochastic differential equations (SDEs):
-
the standard Brownian motion can be considered as the solution
of the initial value problem
dBt/dt=ξt, B0=0
for the unknown function Bt whose evolution
is driven by Gaussian white noise ξt;
-
on the meaning of an SDE - computing the transition probability density
ƒBt|Bs(x|y)=ƒ(x,y|t,s)
for 0≤s<t
as a solution of an initial-value problem for a partial differential equation
- in this case,
∂tƒ(x,y|t,s)=(1/2)∂xxƒ(x,y|t,s),
limit of ƒ(x,y|t,s) as t→s+ equals δ(x−y);
-
a generalization:
dXt/dt=ƒ(t,Xt)+g(t,Xt)ξt;
-
discretization by using the values at the left end:
ΔXt≈ƒ(t,Xt)Δt+g(t,Xt)ΔBt,
Xt+Δt=Xt+ΔXt
(similar to the Euler method for integration of ODEs),
main reason for using this discretization - the increment ΔBt
is independent of the values of Xt and Bt (a small simulation sketch of this scheme appears after this list);
-
Itô integrals as a limit (in some sense) of left Riemann sums; in what sense?
Stochastic differential equations and Itô integrals:
-
using left Riemann sums to approximate the solution of the SDE
dXt=ƒ(t,Xt)dt+g(t,Xt)dBt;
-
definition and examples of L1-limit
and m.s.-limit (mean-square limit, L2-limit) of series of functions;
-
definition of Itô integral ∫0tg(t,Xt)dBt
as a m.s.-limit of the left Riemann sums
∑i g(ti,Xi) ΔBi;
-
useful facts for calculations:
E[(ΔBi)k]=0 for odd k,
E[(ΔBi)2]=Δti,
E[(ΔBi)4]=3(Δti)2,
E[g(ti,Bi)ΔBi]=0,
E[g(ti,Bi)(ΔBi)2]=E[g(ti,Bi)]Δti,
E[(ΔBi)k(ΔBj)m]=E[(ΔBi)k]E[(ΔBj)m] for i≠j.
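A minimal simulation sketch of the left-endpoint discretization described above (commonly called the Euler-Maruyama scheme); the drift and noise coefficients in the illustration below are arbitrary sample choices:

  import numpy as np

  def euler_maruyama(f, g, x0, T=1.0, n=1000, rng=None):
      """Approximate a solution of dX_t = f(t, X_t) dt + g(t, X_t) dB_t
      using left-endpoint (non-anticipating) increments."""
      rng = rng or np.random.default_rng()
      dt = T / n
      t = np.linspace(0.0, T, n + 1)
      x = np.empty(n + 1)
      x[0] = x0
      for i in range(n):
          dB = rng.normal(0.0, np.sqrt(dt))    # Brownian increment over [t_i, t_i + dt]
          x[i + 1] = x[i] + f(t[i], x[i]) * dt + g(t[i], x[i]) * dB
      return t, x

  # Illustration: geometric Brownian motion dX_t = r X_t dt + alpha X_t dB_t
  # (r and alpha are arbitrary sample values).
  r, alpha = 1.0, 0.5
  t, x = euler_maruyama(lambda t, x: r * x, lambda t, x: alpha * x, x0=1.0)
  print(x[-1])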
Reading assignment (mandatory):
computing the Itô integrals ∫t0tBsdBs and ∫t0tBs2dBs
(read only the computation of the integral of BsdBs).
-
Lecture 27 (Tue, Apr 23)
Stochastic differential equations (SDEs):
-
writing the result about the definite Itô integral
∫t0tBsdBs=(Bt2−Bt02)/2−(t−t0)/2
as indefinite integral
∫ BtdBt=Bt2/2−t/2
and in the form
d(Bt2)=2BtdBt+dt; similar result for
d(Btk) for k=3,4,5,...;
-
Itô formula for dΨ(t,Xt)
where Ψ(t,x) is a function of two variables
and Xt satisfies the SDE
dXt=ƒ(t,Xt)dt+g(t,Xt)dBt, mnemonic rules for deriving the formula (the formula is written out after this list);
-
remarks on the meaning of the solution Xt of a SDE;
-
non-anticipating functions and expectation of Itô integrals.
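For reference, the Itô formula referred to above, written out for a sufficiently smooth Ψ(t,x) and Xt satisfying dXt=ƒ(t,Xt)dt+g(t,Xt)dBt:

  \[
  d\Psi(t,X_t)=\Bigl(\partial_t\Psi+f\,\partial_x\Psi
  +\tfrac{1}{2}\,g^{2}\,\partial_{xx}\Psi\Bigr)(t,X_t)\,dt
  +\bigl(g\,\partial_x\Psi\bigr)(t,X_t)\,dB_t,
  \]

consistent with the mnemonic rules (dt)2=0, dt·dBt=0, (dBt)2=dt.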
-
Lecture 28 (Thu, Apr 25)
Stochastic differential equations (SDEs) (cont.):
-
properties of Itô integrals:
-
additivity of domain,
-
linearity,
-
zero average of
∫t0tg(s,Xs)dBs for any non-anticipating function g(t,Xt),
-
∫t0tg(s,Xs)dBs for any non-anticipating function g(t,Xt)
is a martingale with respect to the filtration
{Ft}
of σ-algebras generated by the Wiener process {Bt},
-
correlation formula,
-
Itô isometry;
-
example - simple population growth at a noisy rate ("geometric Brownian motion"):
-
dXt/dt=(r+αξt)Xt
or, equivalently,
dXt=rXtdt+αXtdBt,
-
obtaining the solution Xt=X0e(r−α2/2)t+αBt by using the Itô formula (sketched after this list),
-
computing the average
E[Xt]=E[X0]ert,
-
discussion of the behavior of the solutions for
r>α2/2 and for r<α2/2,
-
remarks about the interpretation of the numerical simulations of the SDE.
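A sketch of the computation behind the solution quoted above: apply the Itô formula with Ψ(t,x)=ln x (X0 is assumed independent of the Brownian motion):

  \[
  d(\ln X_t)=\Bigl(\frac{rX_t}{X_t}-\frac{\alpha^{2}X_t^{2}}{2X_t^{2}}\Bigr)dt
  +\frac{\alpha X_t}{X_t}\,dB_t
  =\Bigl(r-\frac{\alpha^{2}}{2}\Bigr)dt+\alpha\,dB_t,
  \]
  \[
  X_t=X_0\,e^{(r-\alpha^{2}/2)t+\alpha B_t},\qquad
  \mathbb{E}[X_t]=\mathbb{E}[X_0]\,e^{(r-\alpha^{2}/2)t}\,
  \mathbb{E}\bigl[e^{\alpha B_t}\bigr]=\mathbb{E}[X_0]\,e^{rt},
  \]

using E[exp(αBt)]=exp(α2t/2), the moment-generating function of N(0,t).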
-
Lecture 29 (Tue, Apr 30)
Stochastic differential equations (SDEs) (cont.):
-
using the exponential martingale to analyze the average of the population
in the problem of simple population growth at a noisy rate ("geometric Brownian motion");
-
computing the variance of the population at time t;
-
meaning and derivation of the Fokker-Planck equation for the conditional transition
density function
p(x,z;t,s)=ρ(x,t|z,s)
for s<t (see the Reading assignment);
-
solution of the Fokker-Planck equation for the standard Brownian motion Bt (written out after this list);
-
physical interpretation of the solution of the Fokker-Planck equation for Bt
(propagation of heat);
-
solution of the Fokker-Planck equation for the geometric Brownian motion
(lognormal distribution);
-
idea of Stratonovich integral:
defined as an m.s.-limit of midpoint Riemann sums,
the regular calculus rules are valid, it is not a martingale.
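For reference, the Fokker-Planck (Kolmogorov forward) equation discussed above for dXt=ƒ(t,Xt)dt+g(t,Xt)dBt, together with its solution in the standard Brownian case ƒ=0, g=1:

  \[
  \partial_t\,\rho(x,t\mid z,s)=-\,\partial_x\bigl[f(t,x)\,\rho(x,t\mid z,s)\bigr]
  +\tfrac{1}{2}\,\partial_{xx}\bigl[g(t,x)^{2}\,\rho(x,t\mid z,s)\bigr],
  \qquad \rho(x,t\mid z,s)\to\delta(x-z)\ \ (t\to s^{+});
  \]

for ƒ=0, g=1 this is the heat equation ∂tρ=(1/2)∂xxρ, whose solution is

  \[
  \rho(x,t\mid z,s)=\frac{1}{\sqrt{2\pi(t-s)}}\,
  \exp\Bigl(-\frac{(x-z)^{2}}{2(t-s)}\Bigr).
  \]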
Reading assignment (optional):
derivation of the Fokker-Planck equation.
Good to know: