MATH 5763 - Stochastic Processes, Section 001 - Spring 2019
TR 1:30-2:45 p.m., 809 PHSC
Instructor:
Nikola Petrov, 1101 PHSC, npetrov AT ou.edu
Office Hours:
Mon 2:30-3:30, Wed 10:30-11:30 (subject to change), or by appointment, in 1101 PHSC
First day handout
Prerequisite:
Basic calculus-based probability theory at the level of MATH 4733
(including axioms of probability, random variables, expectation,
probability distributions, independence, conditional probability).
The class will also require knowledge of elementary analysis
(including sequences, series, continuity),
linear algebra (including linear spaces, eigenvalues,
eigenvectors),
and ordinary differential equations (at the level
of MATH 3113 or MATH 3413).
Course description:
The theory of stochastic processes studies systems that evolve
randomly in time;
it can be regarded as the "dynamical" part of probability theory.
It has many important practical applications,
as well as applications in other branches
of mathematics such as partial differential equations.
This course is a graduate-level
introduction to stochastic processes,
and should be of interest to students of mathematics,
statistics, physics, engineering, and economics.
The emphasis will be on the fundamental concepts,
but we will avoid using the theory of Lebesgue measure
and integration in any essential way.
Many examples of stochastic phenomena in applications
and some modeling issues will also be discussed in class
and given as homework problems.
Texts:
We may use parts of the following books, freely available from the
OU Library for OU students:
-
[L] Mario Lefebvre, Applied Stochastic Processes, Springer, 2006
-
[BZ] Zdzisław Brzeźniak, Tomasz Zastawniak, Basic Stochastic Processes, Springer, 1999
-
[P] Emanuel Parzen, Stochastic Processes, SIAM, 1999
-
[D] Richard Durrett, Essentials of Stochastic Processes, Second ed., Springer, 2012
-
[R] Sheldon Ross, Introduction to Probability Models, Eighth ed., Elsevier, 2003
-
[K] Hui-Hsiung Kuo, Introduction to Stochastic Integration, Springer, 2007
-
[MO] Kosto V. Mitov, Edward Omey, Renewal Processes, Springer, 2014
-
[H] Moshe Haviv, Queues: A Course in Queueing Theory, Springer, 2013
-
[BGMT] Gunter Bolch, Stefan Greiner, Hermann de Meer, Kishor S. Trivedi,
Queueing Networks and Markov Chains, Second ed., Wiley Interscience, 2006
-
[MS] Yuliya Mishura, Georgiy Shevchenko, Theory and Statistical Applications of Stochastic Processes, Wiley, 2017
-
[Ø] Bernt Øksendal, Stochastic Differential Equations:
An Introduction with Applications, Sixth ed., Springer, 2013
Main topics (a tentative list):
-
a brief review of probability theory;
-
discrete Markov chains: Chapman-Kolmogorov equations,
persistence and transience, generating functions,
stationary distributions, reducibility, limit theorems, ergodicity;
-
continuous-time Markov processes:
Poisson process, birth-death and branching processes,
embedding of a discrete-time Markov chain
in a continuous-time Markov process;
-
conditional expectation, martingales;
-
stationary processes (autocorrelation function,
spectral representation);
-
renewal processes, queues;
-
diffusion processes, Wiener processes (Brownian motion);
-
introduction to stochastic differential equations, Itô calculus;
-
Fokker-Planck equation, Ornstein-Uhlenbeck process.
Homework:
-
Homework 1, due January 24 (Thursday).
-
Homework 2, due January 31 (Thursday).
-
Homework 3, due February 7 (Thursday).
-
Homework 4, due February 14 (Thursday).
-
Homework 5, due February 21 (Thursday).
-
Homework 6, due March 7 (Thursday).
-
Homework 7, due March 26 (Tuesday).
-
Homework 8, due April 11 (Thursday).
-
Homework 9, due April 25 (Thursday).
-
Homework 10, solved in class.
Content of the lectures:
-
Lecture 1 (Tue, Jan 15)
Review of probability:
-
sample space Ω, events as subsets of the sample space,
elementary events as elements of the sample space,
operations with events (complement, union,
intersection, difference, symmetric difference,
subset, impossible event);
σ-algebra (σ-field), examples;
De Morgan's laws, disjoint events,
distributivity properties of intersection and union;
-
probability (probability measure) P,
probability space,
elementary properties of probability measures
(including the inclusion-exclusion formula);
-
conditional probability P(A|B),
properties of conditional probability,
partitions of the sample space,
law of total probability
[pages 1-3, 5-6 of Sec. 1.1 of [L]]
-
Lecture 2 (Thu, Jan 17)
Review of probability (cont.):
-
independent events, independent family of events,
pairwise independent family of events;
conditional independence given an event;
the conditional independence of the events A and B
given an event C neither implies nor is implied
by the independence of A and B;
an example of using conditioning;
-
random variables (RVs),
(cumulative) distribution function (c.d.f.) FX(x)
of a RV X, properties of c.d.f.s;
discrete RVs, probability mass function (p.m.f.)
pX(x) of a discrete RV;
continuous RVs, p.d.f. ƒX(x) of a continuous RV;
p.d.f. ƒY(y)
of a function Y=g(X) of the RV X;
-
conditional c.d.f. FX(x|A)
of a RV X conditioned on an event A;
conditional p.m.f. pX(x|A)
or p.d.f. ƒX(x|A) of a RV X
conditioned on an event A;
-
expectation E[X] of a RV X;
expectation E[g(X)] of a function of a RV X;
rth moment E[Xr] of a RV X
for r=0,1,2,...;
variance Var X and standard deviation
σX=(Var X)1/2
of a RV X
[pages 7, 8 of Sec. 1.1 of [L];
pages 8, 9, 13, 12, 16, 17 of Sec. 1.2 of [L]]
-
Lecture 3 (Tue, Jan 22)
Review of probability (cont.):
-
representing the expectation of a random variable as an integral over Ω
with respect to the probability measure P;
-
conditional expectation E[X|A],
conditional moments E[Xr|A]
and conditional variance Var(X|A)
of a RV X given an event A;
-
random vectors: definition; (joint) c.d.f. FX,Y(x,y)
of a random vector X=(X,Y); properties of FX,Y(x,y);
marginal c.d.f.s FX(x)=FX,Y(x,∞)
and FY(y)=FX,Y(∞,y);
marginal p.m.f.s pX(x) and pY(y),
respectively p.d.f.s ƒX(x) and ƒY(y),
of a random vector (X,Y);
-
conditional c.d.f. FX|Y(x|Y=ym)
and conditional p.m.f.
pX|Y(xk|Y=ym)
of the discrete RV X conditioned on the discrete RV Y;
conditional c.d.f. FX|Y(x|y)
and conditional p.d.f.
ƒX|Y(x|y)
of the continuous RV X conditioned on the continuous RV Y;
-
conditional expectation E[X|Y=y] of the RV X
conditioned on the RV Y;
the conditional expectation E[X|Y=y]
depends only on y (but not on X), so it can be considered as a function of Y,
therefore we can think of the conditional expectation E[X|Y]
as a new random variable which is a function of the RV Y: namely,
E[X|Y]:Ω→R is defined by
E[X|Y](ω):=E[X|Y=y],
where y=Y(ω);
tower rule E[E[X|Y]]=E[X];
an example of using conditional expectation in computing the average grade
of all students in two classes of different sizes (a short worked version appears after this list)
[pages 14, 17 of Sec. 1.2 of [L];
pages 21-27 of Sec. 1.3 of [L]]
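A short worked version of the two-class example above, in LaTeX notation (the symbols n1, n2, m1, m2 are introduced here only for illustration): pick a student uniformly at random from two classes with n1 and n2 students and class averages m1 and m2, and let X be the student's grade and Y the class label; then

  \[
  \mathbb{E}[X\mid Y=1]=m_1,\qquad \mathbb{E}[X\mid Y=2]=m_2,\qquad
  \mathbb{P}(Y=1)=\frac{n_1}{n_1+n_2},
  \]
  \[
  \mathbb{E}[X]=\mathbb{E}\bigl[\mathbb{E}[X\mid Y]\bigr]
  =m_1\,\frac{n_1}{n_1+n_2}+m_2\,\frac{n_2}{n_1+n_2},
  \]

so the overall average is the size-weighted average of the class averages, not (m1+m2)/2 unless the classes have equal size.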
-
Lecture 4 (Thu, Jan 24)
Stochastic processes:
-
definition of a stochastic process (random process);
-
classification of random processes:
discrete-time or continuous-time,
discrete-state space or continuous-state space
[pages 47-48 of Sec. 2.1 of [L]]
Markov chains - introduction:
-
Markov property; Markov chain (MC);
-
example: simple 1-dimensional random walk (RW), symmetric simple 1-dim RW;
-
the future of a MC depends only on the most recent available information (Prop. 3.1.1);
-
more examples: 2-dimensional and d-dimensional RWs, the Ehrenfests' urn model,
birth-death processes
[pages 73-75 of Sec. 3.1 of [L]]
Discrete-time Markov chains - definitions and notations:
-
time-homogeneous discrete-time discrete-state space MCs;
stationary (time-homogeneous) MCs;
-
one-step and n-step transition probabilities;
one-step transition probability matrix P of a MC;
stochastic and doubly-stochastic matrices;
-
n-step transition probability matrices P(n);
-
Chapman-Kolmogorov equations in matrix form
(P(m+n)=P(m)P(n))
and in components;
corollary: P(n)=Pn;
-
an example of a MC with 2 states (a closed-form expression for its n-step transition matrix is sketched after this list);
-
probability ρij(n) of visiting state j for the first time
in n steps starting from state i;
probability ρii(n) of first return to state i in n steps;
representation of pij(n)
as a sum over k from 1 to n of
ρij(k)pjj(n−k);
-
examples of direct computation of ρij(n)
in the example of a MC with 2 states
[pages 73-82 of Sec. 3.2.1 of [L]]
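A sketch of the closed-form n-step transition matrix for a generic two-state chain (the letters a and b below are illustrative labels for the off-diagonal entries, not notation from [L]): with

  \[
  P=\begin{pmatrix}1-a & a\\ b & 1-b\end{pmatrix},\qquad 0<a+b<2,
  \]

the relation P(n)=Pn gives

  \[
  P^n=\frac{1}{a+b}\begin{pmatrix}b & a\\ b & a\end{pmatrix}
  +\frac{(1-a-b)^n}{a+b}\begin{pmatrix}a & -a\\ -b & b\end{pmatrix},
  \]

and since |1−a−b|<1 each row converges to (b/(a+b), a/(a+b)), anticipating the limiting distributions discussed later in the course.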
-
Lecture 5 (Tue, Jan 29)
Discrete-time Markov chains - definitions and notations (cont.):
-
initial distribution (p.m.f.) a=(a0,a1,a2,...),
ai=P(X0=i), of a MC;
distribution (p.m.f.)
a(n)=(a0(n),a1(n),a2(n),...),
ai(n)=P(Xn=i), of a MC at time n;
-
formula for evolution of the probability distribution:
a(n)=aPn;
-
examples: simple 1-dim random walk on Z,
simple 1-dim random walk on Z+ with reflecting
and absorbing boundary conditions at 0
[Sec. 3.2.1 of [L]]
Properties of Markov chains:
-
accessibility of state j from state i, i→j;
communicating states i↔j;
-
properties of the relation ↔
(reflexivity, symmetry, transitivity),
↔ is an equivalence relation;
-
equivalence classes with respect to ↔;
-
closed sets of the state space (Def. 3.2.7);
-
irreducible MCs;
irreducibility criteria; examples;
-
absorbing states
[pages 85-86 of Sec. 3.2.2 of [L]]
-
Lecture 6 (Thu, Jan 31)
Discrete-time Markov chains - definitions and notations (cont.):
-
probability ƒij
of eventual visit of state j starting from state i;
probability ƒii
of eventual return to state i;
-
expressing ƒij as a sum of the first visit probabilities
ρij(n);
-
recurrent (persistent) and transient states;
-
Decomposition Theorem;
-
an example of identifying closed irreducible sets of recurrent states
and sets of transient states, and structure of the stochastic matrix;
-
a necessary and sufficient criterion of recurrence
of state i in terms of the expected value E[Ni]
of the number Ni of returns to this state (Prop. 3.2.3);
-
a necessary and sufficient criterion of recurrence of state i
in terms of a sum of the (ii)th matrix element of
P(n) over n (Prop. 3.2.4);
-
recurrence is a class property (Prop. 3.2.5);
-
average number μi of transitions for first return to state i;
-
positive recurrent and null-recurrent states;
-
criterion for null-recurrence;
-
type of recurrence (positive or null) is a class property;
-
recurrent states of a finite MC are positive recurrent;
-
examples of identifying the transient and recurrent states
and splitting an MC into classes according to the Decomposition Theorem
[pages 87-90 of Sec. 3.2.2 of [L]]
Reading assignment (mandatory):
Read the proof of Proposition 3.2.5 (page 88, 89 of [L])
-
Lecture 7 (Tue, Feb 5)
Discrete-time Markov chains - definitions and notations (cont.):
-
periodic and aperiodic states; remarks about periodicity; examples;
-
simple random walk on Z:
computing the number of itineraries by using combinatorial arguments, Stirling's formula,
recurrence in the symmetric case (p=1/2)
and transience otherwise
-
simple symmetric random walk on Zd:
it is recurrent for d=2, and transient for d=3,4,5,...
[pages 91-93 of Sec. 3.2.2 of [L]]
Limiting probabilities:
-
limiting probabilities πi,
limiting probability distribution
π=(π0,π1,π2,...);
-
ergodic states;
-
Ergodic Theorem (giving conditions for existence and uniqueness
of a limiting probability distribution,
the relation between πi and
the average value μi of the first return time to state i,
and an algorithm for computing π
as a normalized left eigenvector of the one-step transition probability matrix P);
a numerical sketch of this eigenvector computation appears after this list;
-
origin of the word "stationary":
if pXn(i)=πi,
then
pXn+1(i)=πi
(proof: exercise!);
-
computing high powers of a matrix by diagonalizing it first;
-
example of application of the Ergodic Theorem;
-
example: simple random walk on {0,1,2,3,...} with a partially reflecting boundary:
setting up the problem, writing down the one-step transition probability matrix P
and the system of equations for the stationary distribution π
[pages 94-97, 99 of Sec. 3.2.3 of [L]]
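A minimal numerical sketch of the left-eigenvector recipe above; the 3×3 matrix P below is a made-up example, not one from [L] or from class.

  import numpy as np

  # A made-up stochastic matrix for illustration (rows sum to 1).
  P = np.array([[0.5, 0.3, 0.2],
                [0.1, 0.6, 0.3],
                [0.2, 0.2, 0.6]])

  # The stationary distribution pi solves pi P = pi, i.e., pi is a left
  # eigenvector of P for eigenvalue 1 (a right eigenvector of P^T),
  # normalized so that its entries sum to 1.
  eigvals, eigvecs = np.linalg.eig(P.T)
  k = np.argmin(np.abs(eigvals - 1.0))        # index of the eigenvalue 1
  pi = np.real(eigvecs[:, k])
  pi = pi / pi.sum()

  # Check: the distribution a(n) = a P^n converges to pi.
  a = np.array([1.0, 0.0, 0.0])               # start in state 0
  print(pi, a @ np.linalg.matrix_power(P, 50))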
-
Lecture 8 (Thu, Feb 7)
Limiting probabilities (cont.):
-
example (cont.):
simple random walk on {0,1,2,3,...} with a partially reflecting boundary:
solving linear recurrence relations (characteristic equation,
general form of the solution, finding the arbitrary constants
from the boundary conditions);
-
solving the system of equations for the stationary distribution π:
computing stationary distribution when the probability p
of moving to the right is smaller than 1/2,
showing that a stationary distribution does not exist when p>1/2
[pages 98-100 of Sec. 3.2.3 of [L]]
Absorption problems:
-
definition of the probability
ri(n)(C)
of absorption by the closed subset C of the state space S
after exactly n steps (starting from state i);
-
definition of the probability ri(C)
of eventual absorption by the closed subset C of the state space S
(starting from state i);
-
a theorem giving ri(C)
in terms of the (pij) (Theorem 3.2.2);
-
an example: the gambler's ruin problem;
-
martingales;
-
solving the gambler's ruin problem using martingales (the key martingale step in the fair case is summarized after this list)
[Sec. 3.2.4 of [L]]
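A summary of the martingale step in the fair case p=1/2 (here k denotes the gambler's initial fortune and N the target fortune; the general case p∈(0,1) is the exercise below): the fortune Yn is a martingale, and stopping it at T=min{n: Yn∈{0,N}} gives E[YT]=E[Y0]=k, so

  \[
  k=0\cdot\mathbb{P}(Y_T=0)+N\cdot\mathbb{P}(Y_T=N)
  \quad\Longrightarrow\quad
  \mathbb{P}(\text{reach }N\text{ before }0)=\frac{k}{N},\qquad
  \mathbb{P}(\text{ruin})=1-\frac{k}{N}.
  \]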
Exercise: Study the simple random walk on {0,1,2,3,...}
with a partially reflecting boundary if the walk is symmetric
(i.e., p=1/2).
Exercise/reading assignment:
Solve the gambler's ruin problem in the general case p∈(0,1).
-
Lecture 9 (Tue, Feb 12)
Continuous-time discrete-state space MCs:
-
definition of a continuous-time discrete-state space MC;
-
Markov property;
-
transition functions
pij(s,t)=P(Xt=j|Xs=i) for t>s;
-
stationary (time-homogeneous) MCs - for which
pij(s,t)=pij(0,t−s),
notation:
pij(t)=pij(0,t)=P(Xt=j|X0=i) for t>0;
-
a discrete-time MC {Yn}n∈{0,1,2,...}
embedded in the continuous-time MC {Xt}t≥0;
-
irreducibility;
-
analogue of the condition of being
a stochastic matrix for pij(t)
(the sum over j is 1);
-
evolution of the occupation probabilities
pj(t)=P(Xt=j)
expressed in terms of the initial occupation probabilities
pi(0) and the transition probabilities pij(t);
-
Chapman-Kolmogorov equations;
-
discussion of the meaning of the memorylessness properties of the geometric
and the exponential random variables (Prop. 3.3.1)
-
exponential random variable: definition,
proof that it is memoryless, moment-generating function,
other properties (Prop. 3.3.4 and the remarks after it, Prop. 3.3.5)
[pages 121-123 of Sec. 3.3.2 and pages 109-110 of Sec. 3.3.1 of [L]]
Poisson process:
-
counting process;
-
"little o(h)" notation, examples;
-
definition of a Poisson process N as a nondecreasing process with N(0)=0,
certain short-time transition probabilities
pij(h) (for small h),
and independence of the events occurring in a later time interval
from the events occurring in a non-overlapping earlier time interval
[loosely following pages 231, 232, 236 of Sec. 5.1 of [L]]
-
Lecture 10 (Thu, Feb 14)
Poisson process (cont.):
-
derivation of the distribution of N(t)
for a Poisson process N by deriving an initial-value problem
for an infinite system of ODEs for pij(t)
and solving the system:
-
by mathematical induction,
-
by the method of generating functions, and
-
an "elementary" way of deriving that N(t)∼Poisson(λt)
by dividing the interval [0,t] into a large number n of short intervals
of length t/n and applying the binomial distribution
to the number of events occurring in the n short intervals.
Poisson process and distribution of interarrival times:
-
arrival times Tj as sums of interarrival times;
-
reconstructing the Poisson process from the interarrival times (a small simulation sketch appears after this list)
[pages 237-238 of Sec. 5.1 of [L]]
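A minimal simulation sketch of this reconstruction (the rate and the time horizon below are arbitrary values chosen only for illustration):

  import numpy as np

  rng = np.random.default_rng(0)
  lam, horizon = 2.0, 10.0                 # rate and time horizon (arbitrary)

  # Interarrival times tau_j ~ Exp(lam), i.i.d.; the arrival times T_n are
  # their partial sums; N(t) counts the arrivals in [0, t].
  tau = rng.exponential(1.0 / lam, size=200)
  T = np.cumsum(tau)
  T = T[T <= horizon]

  def N(t):
      """Number of arrivals up to time t (the reconstructed Poisson process)."""
      return np.searchsorted(T, t, side="right")

  # Sanity check: E[N(horizon)] = lam * horizon for a Poisson process.
  print(N(horizon), lam * horizon)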
-
Lecture 11 (Tue, Feb 19)
Lecture cancelled due to weather.
-
Lecture 12 (Thu, Feb 21)
Poisson process and distribution of interarrival times (cont.):
-
independence and exponential distribution of the interarrival times τj
of a Poisson process;
-
basic properties of the Γ(α,λ) random variables;
-
the sum of n i.i.d. Exp(λ) random variables is a Γ(n,λ)
random variable (Prop. 3.3.6);
-
reconstructing a Poisson process from the interarrival times τj;
-
several facts about the Poisson process:
-
if Mt and Wt
are independent Poisson processes with rates μ and ν, respectively,
then
Nt=Mt+Wt
is a Poisson process with rate μ+ν (Prop. 5.1.1),
-
if Mt, Wt, and Nt
are defined as above, then the conditional distribution of Mt given {Nt=n}
is Binomial with parameters n and μ/(μ+ν),
-
if X∼Exp(μ) and Y∼Exp(ν) are independent, then
Z=min(X,Y)∼Exp(μ+ν)
(think about the first arrival time of a Poisson process
that is a "combination" of two Poisson processes of rates μ and ν
running simultaneously (if we do not distinguish between the events
of the two Poisson processes)) (~Prop. 3.3.4, 3.3.5).
[pages 234, 235, 237-239 of Sec. 5.1, pages 113, 115-119 of Sec. 3.3.1 of [L]]
Continuous-time discrete-state space MCs (cont. from Lecture 9):
-
stochastic semigroup {Pt}t≥0;
-
generator
G=(νij):=(dPt/dt)|t=0
of a stochastic semigroup;
-
properties of G
(the sum of the elements νij in each row of G is zero);
-
obtaining Pt from G:
Kolmogorov forward and backward equations
Pt'=PtG,
resp.
Pt'=GPt,
-
initial condition
Pt|t=0=I
[roughly following Sec. 3.3.3 of [L]]
-
Lecture 13 (Tue, Feb 26)
Continuous-time discrete-state space MCs (cont.):
-
definition of exponential of a matrix eA;
-
computing eA by simplifying A by a similarity
transformation, e.g., A=C−1DC
for a diagonal matrix D,
and using that An=C−1DnC
to show that eA=C−1eDC;
-
expressing the solution of the initial value problem
x'(t)=Ax(t), x(0)=x0
(where x:R→Rd is the unknown function
and A is a constant d×d matrix)
as x(t)=etAx0;
-
expressing the stochastic semigroup Pt
through the generator G: Pt=etG;
-
computing Pt for a continuous-time, two-state MC (the resulting closed form is written out after this list);
-
remarks on the Laplace transform and its usage to solve initial-value problems for ODEs;
-
definition of a birth process;
-
solving the Kolmogorov forward equations for the birth process by using generating functions
Gi(ξ,t);
-
using the generating function Gi(ξ,t)
to prove that the birth process is honest (Gi(1,t)=1)
[roughly following pages 129-133 of Sec. 3.3.4 of [L]]
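For the two-state example mentioned above, a sketch of the closed form of Pt=etG (the rates λ and μ below are illustrative labels for the two off-diagonal entries of the generator):

  \[
  G=\begin{pmatrix}-\lambda & \lambda\\ \mu & -\mu\end{pmatrix},\qquad
  P_t=e^{tG}=\frac{1}{\lambda+\mu}
  \begin{pmatrix}\mu+\lambda e^{-(\lambda+\mu)t} & \lambda\bigl(1-e^{-(\lambda+\mu)t}\bigr)\\
  \mu\bigl(1-e^{-(\lambda+\mu)t}\bigr) & \lambda+\mu e^{-(\lambda+\mu)t}\end{pmatrix},
  \]

so P0=I, each row of Pt sums to 1, and as t→∞ each row tends to (μ/(λ+μ), λ/(λ+μ)).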
-
Lecture 14 (Thu, Feb 28)
Continuous-time discrete-state space MCs (cont.):
-
using the generating function Gi(ξ,t)
to compute the conditional average
E[X(t)|X(0)=k]=(∂Gk(ξ,t)/∂ξ)|ξ=1
and the conditional variance Var(X(t)|X(0)=k)
of the birth process given that X(0)=k;
-
another way of computing the conditional expectation E[X(t)|X(0)=k]
by writing an ODE for it.
Limiting probabilities and balance equations:
-
stationary distribution π of a stochastic semigroup Pt;
-
reason for the term "stationary distribution":
if P(X(0)=i)=πi
where π is a stationary distribution, then
P(X(t)=j)=πj
for all j∈S and all t≥0;
-
recurrence time Tii,
mean recurrence time μii=E[Tii];
-
recurrent and transient states,
positive recurrent and null recurrent states of a continuous-time Markov chain;
-
irreducible Markov chains;
-
Ergodic Theorem for continuous-time Markov processes, remarks;
-
relation between the stationary distribution πj,
the rate νj of leaving state j
(where the holding time for state j is Uj∼Exp(νj)),
and the mean recurrence time μii=E[Tii];
-
finding stationary distributions from the generator:
πG=0;
balance equations and their interpretation (a birth-death example is sketched after this list)
[pages 138-140 of Sec. 3.3.5 of [L]]
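A sketch of how πG=0 (equivalently, the balance equations) is solved for a pure birth-death process, assuming birth rates λj, death rates μj, and a normalizable solution:

  \[
  \lambda_0\pi_0=\mu_1\pi_1,\qquad
  (\lambda_j+\mu_j)\pi_j=\lambda_{j-1}\pi_{j-1}+\mu_{j+1}\pi_{j+1}\quad(j\ge 1),
  \]
  \[
  \pi_j=\pi_0\prod_{i=1}^{j}\frac{\lambda_{i-1}}{\mu_i},\qquad
  \pi_0=\Bigl(1+\sum_{j\ge 1}\prod_{i=1}^{j}\frac{\lambda_{i-1}}{\mu_i}\Bigr)^{-1}.
  \]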
Birth and death processes:
-
birth-death-immigration-disaster process - general set-up.
-
Lecture 15 (Tue, Mar 5)
Birth and death processes (cont.):
-
detailed derivation of the short-time transition probabilities pij(h)
(hence, the generator νij) of a death-immigration process;
-
proving that the stationary distribution π of a death-immigration process
is Poisson(ρ/μ), i.e.,
πj=e−ρ/μ(ρ/μ)j/j!
[roughly following pages 135, 136 of Sec. 3.3.4 of [L]]
Nonhomogeneous Poisson processes:
-
nonhomogeneous Poisson process with intensity function λ(t);
-
the number of arrivals Ns+t−Ns
of a nonhomogeneous Poisson process with intensity function λ(t)
is Poisson(m(s+t)−m(s)),
where m(t) is the mean value function of the process
(defined by m(0)=0, m'(t)=λ(t)),
-
"homogenizing" a nonhomogeneous Poisson process N(t)
(with a strictly positive rate function λ(t)>0)
by rescaling the time:
Mt:=Nm−1(t)
is a Poisson process with rate 1
[pages 250, 252-254 of Sec. 5.2 of [L]]
Reading assignment (optional):
the p.d.f. of the first arrival time T1
of a nonhomogeneous Poisson process {Nt}t≥0
given that Nt1=1 (for some fixed t1>0)
(Prop. 5.2.2) [page 253 of [L]]
-
Lecture 16 (Thu, Mar 7)
Compound Poisson processes:
-
compound random variable;
-
derivation of the mean, variance, and moment generating function
of a compound random variable (Prop. 5.3.1);
-
definition of a compound Poisson process;
-
mean, variance, and moment generating function of a compound Poisson process (summarized after this list);
-
approximating the distribution of a compound Poisson process
for large times by using the Central Limit Theorem (Prop. 5.3.2);
-
the sum of two independent compound Poisson processes
Y1(t) and Y2(t)
corresponding to Poisson processes
N1(t) and N2(t)
with rates λ1 and λ2
is a compound Poisson process
corresponding to a Poisson process with rate
λ1+λ2
[Sec. 5.3 of [L]]
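For reference, the standard formulas behind the items above, for Y(t)=X1+...+XN(t) with the Xi i.i.d. and independent of the Poisson process N(t) of rate λ:

  \[
  \mathbb{E}[Y(t)]=\lambda t\,\mathbb{E}[X_1],\qquad
  \operatorname{Var}[Y(t)]=\lambda t\,\mathbb{E}[X_1^2],\qquad
  M_{Y(t)}(s)=\exp\bigl(\lambda t\,(M_{X_1}(s)-1)\bigr).
  \]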
Doubly stochastic Poisson processes:
definition of a conditional (or "mixed") Poisson process
(whose rate is a random variable, independent of time)
[page 258 of Sec. 5.4 of [L]]
-
Lecture 17 (Tue, Mar 12)
Doubly stochastic Poisson processes (cont.):
-
proof that a conditional Poisson process
has stationary, but not independent increments (Prop. 5.4.1)
[pages 258, 259 of Sec. 5.4 of [L]]
Renewal processes:
-
definition of a renewal process;
-
modified ("delayed") renewal process (when the distribution of τ0
differs from the distributions of τ1, τ2,...);
-
relations between the process N(t),
the times of the events Tn,
and the interevent times τn;
-
expression for the p.m.f. of N(t)
in terms of the c.d.f. of Tn (Prop. 5.6.1);
-
renewal function m(t)=E[N(t)];
-
expression for the renewal function m(t)
in terms of the c.d.f.'s of Tn (Prop. 5.6.2)
[pages 267-269 of Sec. 5.6 of [L]]
Mathematical digression:
-
definition of Riemann integral;
-
definition of the Riemann-Stieltjes integral;
-
a particular case of the Riemann-Stieltjes integral
when g(t) is differentiable and non-decreasing.
-
Lecture 18 (Thu, Mar 14)
Mathematical digression (cont.):
-
a particular case of the Riemann-Stieltjes integral
when g(t) is non-decreasing and piecewise constant;
-
applications of the Riemann-Stieltjes integral
to computing expected values of discrete and continuous random variables;
-
expected value of an N-valued random variable X
as a sum (over n from 1 to infinity)
of the probabilities that X is greater than or equal to n;
-
expected value of a non-negative continuous random variable
X as an integral of
[1−FX(x)],
geometric meaning.
Renewal processes (cont.):
-
recursive formula for the c.d.f.'s of the arrival times Tn
in terms of the c.d.f. Fτ of the inter-arrival
times τj through Riemann-Stieltjes integrals;
-
derivation of an integral equation for the renewal function
m(t)=E[Nt];
-
solving renewal-type equations by using Laplace transform.
-
Lecture 19 (Tue, Mar 26)
Renewal processes (cont.):
-
another derivation of the formula for the renewal function
m(t) by performing Laplace transformation on the formula
representing m(t)
as a sum of the c.d.f.s of all the Tn's;
-
computing the renewal function m(t) for a Poisson process in three ways (the Laplace-transform computation is sketched after this list):
-
by using the fact that N(t) is a Poisson(λt) random variable,
-
by expressing it as a sum of the c.d.f.'s of Tn (Prop. 5.6.2),
-
by solving the integral equation for m(t) using Laplace transform;
the moment-generating function MX
of a (0,∞)-valued random variable X:Ω→(0,∞)
is equal to the Laplace-Stieltjes transform of the c.d.f. FX
of X and, if the random variable X is continuous,
equal to the Laplace transform of the p.d.f. ƒX of X.
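A sketch of the Laplace-transform computation for the Poisson case (tilde denotes the ordinary Laplace transform; the interarrival distribution here is Exp(λ), so it has a density): the renewal equation m(t)=Fτ(t)+∫0t m(t−x) dFτ(x) transforms into

  \[
  \tilde m(s)=\tilde F_\tau(s)+\tilde m(s)\,\tilde f_\tau(s)
  \;\Longrightarrow\;
  \tilde m(s)=\frac{\tilde F_\tau(s)}{1-\tilde f_\tau(s)}
  =\frac{\lambda/\bigl(s(s+\lambda)\bigr)}{s/(s+\lambda)}=\frac{\lambda}{s^{2}},
  \]

so m(t)=λt, in agreement with E[N(t)] for a Poisson(λt) random variable.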
Queues:
-
set-up of the problem, examples of queues
(queues with baulking, multiple servers, airline check-in, FIFO, LIFO,
group service, "student discipline", "continental queueing");
-
A/S/s/c/p/D classification of the queues,
where A and S are deterministic (D),
Markovian
(M - with exponentially distributed interarrival/service times),
Γ (or Erlang), or general (G) distributions,
s is the number of servers, c is the capacity of the system,
p is the size of the population, D is the discipline (i.e., service policy);
-
stability of a queue;
-
M(λ)/G/1 queue - construction of a discrete-time Markov
chain embedded in the queueing process;
scanned notes of the derivation
of the transition probability matrix of this Markov chain.
General properties of stochastic processes:
-
cumulative distribution function
F(x1,...,xk;t1,...,tk)=FXt1,...,Xtk(x1,...,xk),
probability mass function
p(x1,...,xk;t1,...,tk)=pXt1,...,Xtk(x1,...,xk),
and probability density function
ƒ(x1,...,xk;t1,...,tk)=ƒXt1,...,Xtk(x1,...,xk)
of order k of a stochastic process
X={Xt}t∈[0,∞);
-
mean
mX(t)=E[Xt],
autocorrelation function
RX(t1,t2)=E[Xt1Xt2],
autocovariance function
CX(t1,t2)=RX(t1,t2)−mX(t1)mX(t2),
variance
var Xt=CX(t,t),
and autocorrelation coefficient
ρX(t1,t2)
of a stochastic process X.
[pages 48-50 of Sec. 2.1 of [L]]
-
Lecture 20 (Thu, Mar 28)
General properties of stochastic processes (cont.):
-
a reminder about the meaning of the correlation between two random variables;
-
stochastic processes with independent increments;
-
stochastic processes with stationary increments;
-
strict-sense stationary (SSS, strongly stationary) stochastic processes;
-
wide-sense stationary (WSS, weakly stationary) stochastic processes;
-
average power E[Xt2] of a stochastic process;
-
E[Xt2] of a WSS stochastic process
does not depend on t;
-
spectral density SX(ω) of a WSS process
[pages 50-55 of Sec. 2.1 and 2.2 of [L]]
Gaussian and Markov processes:
-
multinormal distribution of a random vector
X=(X1,...,Xn)∼N(m,K),
vector of the means m, covariance matrix
K=(cov(Xi,Xj));
-
characteristic function φX(ω)=E[exp(iωX)]
of a random variable X,
(joint) characteristic function
φX(ω)=E[exp(iω⋅X)]
of a multinormal random variable X
(Prop. 2.4.1);
-
if two components of
X=(X1,...,Xn)∼N(m,K)
are uncorrelated, then they are independent;
-
Gaussian process {Xt}
- a continuous-time stochastic process with
(Xt1,...,Xtn)
being multinormal for any n and any times
t1,...tn;
-
if {Xt} is a Gaussian process
such that its mean mX(t)
does not depend on t and its autocovariance function
CX(t1,t2)
depends only on t2−t1,
then the process is SSS (Prop. 2.4.2);
-
definition of a Markov (or Markovian) process, examples
(random walk, Poisson process);
-
(first-order) density function
ƒ(x;t)=ƒXt(x);
-
conditional transition density function
p(x,x0;t,t0)=ƒXt|Xt0(x|x0);
-
a Markovian, continuous-time continuous-state space stochastic process {Xt}
is completely determined by ƒ(x;t)=ƒXt(x)
and p(x,x0;t,t0)=ƒXt|Xt0(x|x0);
-
integrals of ƒ(x;t)
and
p(x,x0;t,t0)
over x are equal to 1
[pages 58-62 of Sec. 2.4 of [L]]
-
Lecture 21 (Tue, Apr 2)
Gaussian and Markov processes (cont.):
-
expressing ƒ(x;t) as an integral of
ƒ(x0;t0)p(x,x0;t,t0)
over x0;
-
more on the meaning of the p.d.f. of a continuous RV:
P(X∈(x,x+Δx])≈ƒX(x)Δx,
generalization for jointly continuous random vectors
P(X∈A)≈ƒX(x)vol(A),
where A is a small domain in Rk containing x;
-
application to kth order p.d.f.'s of a random process:
P(Xt1∈(x1,x1+Δx1],...,Xtk∈(xk,xk+Δxk])≈ƒ(Xt1,...,Xtk)(x1,...,xk)Δx1...Δxk;
-
Chapman-Kolmogorov equations for the conditional transition density function
p(x,x0;t,t0)=ƒXt|Xt0(x|x0)
-
since in the limit t→t0+,
the process did not have time to evolve,
p(x,x0;t,t0)→δ(x−x0)
as
t→t0+
[pages 58-63 of Sec. 2.4 of [L]]
A digression on generalized functions (distributions):
-
test functions (infinitely smooth compactly supported functions);
-
Dirac δ-function δa
defined by δa(ƒ):=ƒ(a);
-
derivatives of generalized functions - defined by applying integration by parts,
treating the generalized function as a regular function
and using that a test function ƒ satisfies
limx→∞ƒ(x)=0
and
limx→−∞ƒ(x)=0
(because ƒ has compact support);
-
the above recipe gives us that integral of ƒ times the kth derivative
ξ(k) of a generalized function ξ
is equal to (−1)k times
integral of ξ times ƒ(k),
which symbolically can be written as
ξ(k)(ƒ):=(−1)kξ(ƒ(k));
-
following this recipe, the derivatives of δa
are defined by δa'(ƒ):=−ƒ'(a),
δa''(ƒ):=(−1)2ƒ''(a),
and in general
δa(k)(ƒ):=(−1)kƒ(k)(a);
-
example: generalized derivative
of the Heaviside (unit step) function:
Ha'=δa.
-
interpretation of generalized functions as a "rough" signal, and of the test function as a "smoothing" function corresponding to the "smearing" due to the experimental device.
-
Lecture 22 (Thu, Apr 4)
The Wiener process (Brownian motion):
-
normal (Gaussian) random variables N(μ,σ2):
p.d.f., mean, variance, characteristic function,
standard normal random variable Z∼N(0,1)
which can be obtained from X∼N(μ,σ2)
as Z=(X−EX)/σX;
-
definition of Brownian motion/Wiener process
Wt∼N(0,σ2t)
and a standard Wiener process Bt∼N(0,t);
-
the Wiener process as a limit of a simple random walk (a small simulation sketch appears after this list);
-
historical remarks (Robert Brown, Albert Einstein,
Norbert Wiener, Andrey Kolmogorov);
-
p.d.f. of order k of a Wiener process;
-
moments of Wt:
mean E[Wt]=0,
autocovariance function
CW(t,s)=E[WtWs]=σ2min(t,s),
autocorrelation function
RW(t,s)=E[WtWs]=σ2min(t,s);
-
for constants a, b∈R and Wt a Brownian motion,
the process a+bWt
is a Brownian motion (proof: exercise);
-
proof that tW1/t
is a Brownian motion (Example 4.1.2)
[pages 173-179 of Sec. 4.1 of [L]]
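A minimal simulation sketch of the random-walk approximation mentioned above (the time horizon, step count, and number of sample paths are arbitrary choices):

  import numpy as np

  rng = np.random.default_rng(0)
  T, n = 1.0, 10_000               # time horizon and number of steps (arbitrary)
  dt = T / n

  # One path of the scaled simple symmetric random walk: steps of size sqrt(dt),
  # so the walk after k steps approximates a standard Brownian motion at time k*dt.
  steps = rng.choice([-1.0, 1.0], size=n) * np.sqrt(dt)
  B = np.concatenate(([0.0], np.cumsum(steps)))

  # Sanity check over many paths: E[B_T] = 0 and Var(B_T) = T.
  ends = np.array([rng.choice([-1.0, 1.0], size=n).sum() * np.sqrt(dt)
                   for _ in range(2000)])
  print(B[-1], ends.mean(), ends.var())   # mean near 0, variance near T = 1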
-
Lecture 23 (Tue, Apr 9)
σ-algebras and probability measures:
-
sample space Ω - a set of outcomes (elementary events) ω;
-
σ-algebra F of subsets of Ω;
event - an element of F;
sub-σ-algebra G⊆F
of F;
examples;
-
Borel σ-algebra B(R) of subsets of R;
-
σ-algebra σ(A1,A2,...)
generated by a collection A1, A2,... of subsets of Ω;
examples;
-
Lebesgue measure L:B(R)→R on R defined by
L((a,b))=b−a;
-
probability measure P:F→[0,1] on (Ω,F);
probability space (Ω,F,P);
-
F-measurable functions X:Ω→R - for which
{X∈B}∈F for all
B∈B(R);
-
random variable on (Ω,F)
- an F-measurable function X:Ω→R;
-
an example: a σ(∅)-measurable function is a constant function;
more examples;
-
σ-algebra σ(X) generated by a random variable X;
-
σ-algebra σ(F1,...,Fn)
generated by a collection of σ-algebras;
-
σ-algebra σ(X1,...,Xn)
generated by a family of random variables
X1,...,Xn;
-
filtration
F1⊆F2⊆F3⊆... of σ-algebras generated by a sequence
X1,X2,X3,...
of functions Xk:Ω→R,
where Fk=σ(X1,...,Xk);
-
example: filtration of σ-algebras generated by a sequence of coin tosses.
-
Lecture 24 (Thu, Apr 11)
σ-algebras and probability measures (cont.):
-
distribution (cumulative distribution function, c.d.f.)
FX(x)=P(X≤x)=P({ω∈Ω:X(ω)≤x}) of a random variable X;
-
expectation E[X] of a random variable X as an integral over R
of x dFX(x)
or, equivalently, as an integral over Ω of X(ω) P(dω).
Conditional expectation and martingales:
-
conditional expectation E[X|A]
of a random variable X conditioned on an event A∈F;
-
conditional expectation E[X|F]
of a random variable X conditioned on a σ-algebra F;
-
conditional expectation E[X|Y]
of a random variable X conditioned on another random variable Y;
-
discussion of the meaning of the filtration
F1⊆F2⊆F3⊆...
with
Fn=σ(X1,...,Xn)
in the context of "coin tossing" (where Xn is the result of the nth toss)
- Fn represents our knowledge at time n;
-
a sequence Y1,Y2,... of random variables
adapted to the filtration
F1⊆F2⊆F3⊆...
- each Yn is
Fn-measurable
(i.e., can be determined from the values of the random variables
X1,...,Xn
generating the σ-algebra
Fn=σ(X1,...,Xn));
-
an example - the running averages Sn;
-
martingale Y1,Y2,...
with respect to a filtration F1⊆F2⊆F3⊆...
- a sequence of L1-random variables (i.e., such that E[|Yn|]<∞) adapted to the filtration
and such that E[Yn+1|Fn]=Yn;
-
an example of a martingale - the positions
Y1,Y2,... of a particle in a simple symmetric random walk;
-
an example of a continuous-time martingale - for a Poisson process {Nt}t≥0
with intensity λ,
Yt=Nt−λt is a martingale.
-
Lecture 25 (Tue, Apr 16)
Conditional expectation and martingales (cont.):
-
example - exponential martingale
exp(αBt−α2t/2),
obtaining a family of polynomial martingales from the Taylor
expansion of the exponential martingale with respect to α around α=0:
exp(αBt−α2t/2)=1+Btα+(1/2)(Bt2−t)α2+(1/6)(Bt3−3tBt)α3+(1/24)(Bt4−6tBt2+3t2)α4+(1/120)(Bt5−10tBt3+15t2Bt)α5+..., so that Bt, (Bt2−t), (Bt3−3tBt), (Bt4−6tBt2+3t2), (Bt5−10tBt3+15t2Bt), ..., are martingales.
The Wiener process (Brownian motion) (cont.):
-
a brief review - definition of Wt and Bt,
increments, moments
(E[Wtk]=0 for odd k,
Var Wt=E[Wt2]=σ2t,
E[Wt4]=3σ4t2,
E[Wt6]=15σ6t3,...),
autocorrelation and autocovariance functions,
probability density function
ƒ(x1,...,xk;t1,...,tk)=ƒW(t1),...,W(tk)(x1,...,xk)
of order k expressed as product of the p.d.f.
ƒ(x1;t1)=ƒW(t1)(x1)
and the conditional p.d.f.'s
ƒW(tj)|W(tj−1)(xj|xj−1) for j=2,...,k,
explicit expressions for all p.d.f.'s;
-
short-time behavior: for Δt>0 and
ΔBt:=Bt+Δt−Bt,
computing
E[(ΔBt)k]=0 for odd k, E[(ΔBt)2]=Δt,
and, for the difference quotient, E[(ΔBt/Δt)2]=1/Δt→∞
as Δt→0+,
nondifferentiability of the Brownian motion;
-
Gaussian white noise ξt:=dBt/dt;
-
making sense of the derivative dBt/dt
as a (random) generalized function acting on a test function φ,
definition of a functional Ξ(φ) as an integral of ξtφ(t)
over t from 0 to ∞
(meaning: a measurement "smeared" by φ);
-
a proof that E[Ξ(φ)]=0 and that E[Ξ(φ)2]
equals integral of φ(t)2 over t from 0 to ∞,
interpreting these facts as
E[ξt]=0 and E[ξtξs]=δ(t−s).
Reading assignment (mandatory):
expectation of a "smeared" derivative of a Wiener process.
-
Lecture 26 (Thu, Apr 18)
Stochastic differential equations (SDEs):
-
the standard Brownian motion can be considered as the solution
of the initial value problem
dBt/dt=ξt, B0=0
for the unknown function Bt whose evolution
is driven by Gaussian white noise ξt;
-
on the meaning of an SDE - computing the transition probability density
ƒBt|Bs(x|y)=ƒ(x,y|t,s)
for 0≤s<t
as a solution of an initial-value problem for a partial differential equation
- in this case,
∂tƒ(x,y|t,s)=(1/2)∂xxƒ(x,y|t,s),
limit of ƒ(x,y|t,s) as t→s+ equals δ(x−y);
-
a generalization:
dXt/dt=ƒ(t,Xt)+g(t,Xt)ξt;
-
discretization by using the values at the left end:
ΔXt≈ƒ(t,Xt)Δt+g(t,Xt)ΔBt,
Xt+Δt=Xt+ΔXt
(similar to the Euler method for integration of ODEs),
main reason for using this discretization - the increment ΔBt
is independent of the values of Xt and Bt (a small simulation sketch of this scheme appears after this list);
-
Itô integrals as a limit (in some sense) of left Riemann sums; in what sense?
Stochastic differential equations and Itô integrals:
-
using left Riemann sums to approximate the solution of the SDE
dXt=ƒ(t,Xt)dt+g(t,Xt)dBt;
-
definition and examples of L1-limit
and m.s.-limit (mean-square limit, L2-limit) of series of functions;
-
definition of Itô integral ∫0tg(t,Xt)dBt
as a m.s.-limit of the left Riemann sums
∑i g(ti,Xi) ΔBi;
-
useful facts for calculations:
E[(ΔBi)k]=0 for odd k,
E[(ΔBi)2]=Δti,
E[(ΔBi)4]=3(Δti)2,
E[g(ti,Bi)ΔBi]=0,
E[g(ti,Bi)(ΔBi)2]=E[g(ti,Bi)]Δti,
E[(ΔBi)k(ΔBj)m]=E[(ΔBi)k]E[(ΔBj)m] for i≠j.
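A minimal simulation sketch of the left-endpoint discretization described above (commonly called the Euler-Maruyama scheme); the drift and noise coefficients in the illustration below are arbitrary sample choices:

  import numpy as np

  def euler_maruyama(f, g, x0, T=1.0, n=1000, rng=None):
      """Approximate a solution of dX_t = f(t, X_t) dt + g(t, X_t) dB_t
      using left-endpoint (non-anticipating) increments."""
      rng = rng or np.random.default_rng()
      dt = T / n
      t = np.linspace(0.0, T, n + 1)
      x = np.empty(n + 1)
      x[0] = x0
      for i in range(n):
          dB = rng.normal(0.0, np.sqrt(dt))    # Brownian increment over [t_i, t_i + dt]
          x[i + 1] = x[i] + f(t[i], x[i]) * dt + g(t[i], x[i]) * dB
      return t, x

  # Illustration: geometric Brownian motion dX_t = r X_t dt + alpha X_t dB_t
  # (r and alpha are arbitrary sample values).
  r, alpha = 1.0, 0.5
  t, x = euler_maruyama(lambda t, x: r * x, lambda t, x: alpha * x, x0=1.0)
  print(x[-1])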
Reading assignment (mandatory):
computing the Itô integrals ∫t0tBsdBs and ∫t0tBs2dBs
(read only the computation of the integral of BsdBs).
-
Lecture 27 (Tue, Apr 23)
Stochastic differential equations (SDEs):
-
writing the result about the definite Itô integral
∫t0tBsdBs=(Bt2−Bt02)/2−(t−t0)/2
as indefinite integral
∫ BtdBt=Bt2/2−t/2
and in the form
d(Bt2)=2BtdBt+dt; similar result for
d(Btk) for k=3,4,5,...;
-
Itô formula for dΨ(t,Xt)
where Ψ(t,x) is a function of two variables
and Xt satisfies the SDE
dXt=ƒ(t,Xt)dt+g(t,Xt)dBt, mnemonic rules for deriving the formula (the formula is written out after this list);
-
remarks on the meaning of the solution Xt of a SDE;
-
non-anticipating functions and expectation of Itô integrals.
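For reference, the Itô formula referred to above, written out for a sufficiently smooth Ψ(t,x) and Xt satisfying dXt=ƒ(t,Xt)dt+g(t,Xt)dBt:

  \[
  d\Psi(t,X_t)=\Bigl(\partial_t\Psi+f\,\partial_x\Psi
  +\tfrac{1}{2}\,g^{2}\,\partial_{xx}\Psi\Bigr)(t,X_t)\,dt
  +\bigl(g\,\partial_x\Psi\bigr)(t,X_t)\,dB_t,
  \]

consistent with the mnemonic rules (dt)2=0, dt·dBt=0, (dBt)2=dt.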
-
Lecture 28 (Thu, Apr 25)
Stochastic differential equations (SDEs) (cont.):
-
properties of Itô integrals:
-
additivity of domain,
-
linearity,
-
zero average of
∫t0tg(s,Xs)dBs for any non-anticipating function g(t,Xt),
-
∫t0tg(s,Xs)dBs for any non-anticipating function g(t,Xt)
is a martingale with respect to the filtration
{Ft}
of σ-algebras generated by the Wiener process {Bt},
-
correlation formula,
-
Itô isometry;
-
example - simple population growth at a noisy rate ("geometric Brownian motion"):
-
dXt/dt=(r+αξt)Xt
or, equivalently,
dXt=rXtdt+αXtdBt,
-
obtaining the solution Xt=X0e(r−α2/2)t+αBt by using the Itô formula (sketched after this list),
-
computing the average
E[Xt]=E[X0]ert,
-
discussion of the behavior of the solutions for
r>α2/2 and for r<α2/2,
-
remarks about the interpretation of the numerical simulations of the SDE.
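A sketch of the computation behind the solution quoted above: apply the Itô formula with Ψ(t,x)=ln x (X0 is assumed independent of the Brownian motion):

  \[
  d(\ln X_t)=\Bigl(\frac{rX_t}{X_t}-\frac{\alpha^{2}X_t^{2}}{2X_t^{2}}\Bigr)dt
  +\frac{\alpha X_t}{X_t}\,dB_t
  =\Bigl(r-\frac{\alpha^{2}}{2}\Bigr)dt+\alpha\,dB_t,
  \]
  \[
  X_t=X_0\,e^{(r-\alpha^{2}/2)t+\alpha B_t},\qquad
  \mathbb{E}[X_t]=\mathbb{E}[X_0]\,e^{(r-\alpha^{2}/2)t}\,
  \mathbb{E}\bigl[e^{\alpha B_t}\bigr]=\mathbb{E}[X_0]\,e^{rt},
  \]

using E[exp(αBt)]=exp(α2t/2), the moment-generating function of N(0,t).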
-
Lecture 29 (Tue, Apr 30)
Stochastic differential equations (SDEs) (cont.):
-
using the exponential martingale to analyze the average of the population
in the problem of simple population growth at a noisy rate ("geometric Brownian motion");
-
computing the variance of the population at time t;
-
meaning and derivation of the Fokker-Planck equation for the conditional transition
density function
p(x,z;t,s)=ρ(x,t|z,s)
for s<t (see the Reading assignment);
-
solution of the Fokker-Planck equation for the standard Brownian motion Bt (written out after this list);
-
physical interpretation of the solution of the Fokker-Planck equation for Bt
(propagation of heat);
-
solution of the Fokker-Planck equation for the geometric Brownian motion
(lognormal distribution);
-
idea of Stratonovich integral:
defined as an m.s.-limit of midpoint Riemann sums,
the regular calculus rules are valid, it is not a martingale.
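For reference, the Fokker-Planck (Kolmogorov forward) equation discussed above for dXt=ƒ(t,Xt)dt+g(t,Xt)dBt, together with its solution in the standard Brownian case ƒ=0, g=1:

  \[
  \partial_t\,\rho(x,t\mid z,s)=-\,\partial_x\bigl[f(t,x)\,\rho(x,t\mid z,s)\bigr]
  +\tfrac{1}{2}\,\partial_{xx}\bigl[g(t,x)^{2}\,\rho(x,t\mid z,s)\bigr],
  \qquad \rho(x,t\mid z,s)\to\delta(x-z)\ \ (t\to s^{+});
  \]

for ƒ=0, g=1 this is the heat equation ∂tρ=(1/2)∂xxρ, whose solution is

  \[
  \rho(x,t\mid z,s)=\frac{1}{\sqrt{2\pi(t-s)}}\,
  \exp\Bigl(-\frac{(x-z)^{2}}{2(t-s)}\Bigr).
  \]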
Reading assignment (optional):
derivation of the Fokker-Planck equation.
Good to know: