STA261: Probability and Statistics II

Undergraduate course, University of Toronto, 2020

STA261 is intended to be a rigorous (but friendly) introduction to mathematical statistics for students in the Theory and Methods Statistics Specialist program at the University of Toronto (UofT). I taught this course four times, from 2020 to 2024. The content here is mainly from the Summer 2024 offering.

Syllabi

Summer 2021
Summer 2022
Summer 2024

Modules (Summer 2024)

Module 1: Statistics

Module 2: Point Estimation

Slides

Module 2 - Point Estimation
Module 2 - Point Estimation (Annotated)

Assignment

Assignment 2

Optional Readings and Resources

Psychology Wiki - Likelihood Principle
Cross Validated - When is a biased estimator preferable to an unbiased one?
John Aldrich - R. A. Fisher and the Making of Maximum Likelihood 1912 - 1922 (A wonderful minimal-math history/overview of the MLE, with further appearances by sufficiency and efficiency)
R. A. Fisher - Professor Karl Pearson and the Method of Moments and Karl Pearson - Method of Moments and Method of Maximum Likelihood (Pearson essentially invented the MOM, while Fisher did the same for the MLE. Fisher and Pearson openly despised each other, and neither of them had warm personalities to begin with. In these two remarkable papers, Fisher and Pearson use the academic publication as a medium to publicly attack one another. Both papers are dripping with sarcasm and scorn, which you can easily get a sense of even if you skip past all the math.)
L. Bondesson - On Uniformly Minimum Variance Unbiased Estimation when no Complete Sufficient Statistics Exist

Module 3: Hypothesis Testing

Slides

Module 3 - Hypothesis Testing
Module 3 - Hypothesis Testing (Annotated)

Assignment

Assignment 3

Optional Readings and Resources

Shane Pederson - What Does a Lady Tasting Tea Have to Do with Science?
Christie Aschwanden - Not Even Scientists Can Easily Explain P-Values
William Huber (whuber) - A Dialog [sic] Between a Teacher and a Thoughtful Student (whuber is a well-known Cross Validated contributor/moderator)
Itai Yanai and Martin Lercher - A Hypothesis is a Liability (This rather provocative article has generated much discussion among statisticians since it was published)
Kristoffer Magnusson - Understanding Statistical Power and Significance Testing: an interactive visualization (An amazing visualization of the Z-test that lets you play around with various parameters of the test)
Raymond S. Nickerson - Null Hypothesis Significance Testing: A Review of an Old and Continuing Controversy
John D. Cook - Are Coffee and Wine Good for You or Bad for You?
George B. Dantzig - On the Non-Existence of Tests of “Student’s” Hypothesis Having Power Functions Independent of σ (You ever hear the story of the university student who arrived late to lecture one day, saw two problems written on the blackboard, assumed they were homework and solved them before the next lecture, only to find out that they were actually unsolved problems? It’s not an urban legend. The student was George Dantzig, an outstanding mathematician who later went on to invent the simplex algorithm, among many other things; if you ever take a course in optimization, you’ll learn about it. The class was a statistics course, and the professor was Jerzy Neyman, he of the Neyman-Pearson lemma and much else. Both problems were conjectures related to hypothesis testing, and Neyman encouraged Dantzig to publish his proofs, which he did (and then stapled them together into what became his PhD thesis). The linked paper is the first of the two. After Module 3, you should have seen enough about hypothesis testing, and t-tests in particular, to understand most everything up to the end of Section 2 of the paper, including the statement of the problem that Dantzig solved.)

Module 4: Intervals and Model Checking

Slides

Module 4 - Intervals and Model Checking
Module 4 - Intervals and Model Checking (Annotated)

Assignment

Assignment 4

Optional Readings and Resources

Hoekstra et al. - Robust Misinterpretation of Confidence Intervals
Kristoffer Magnusson - Interpreting Confidence Intervals (Another excellent visualization from the same author as the Module 3 visualization for hypothesis testing)
Wikipedia - All Models Are Wrong (George Box’s quote is probably the most famous aphorism in Statistics)
Vincent Scheurer - Convicted on Statistics? (A tragic case study on the dangers of poor model assumptions. This is a difficult read.)
StackExchange - How to Understand Degrees of Freedom?
Wikipedia - Chernoff Face (In lecture, I mentioned that visual assessments are useful because the human eye is much better at noticing deviations from expected visuals than computers are. Chernoff faces are a very cool way of exploiting this to visualize different sets of multivariate observations)

When Uniform Numbers Aren’t That Uniform…

library(rgl) # for plotting

####

# A linear congruential generator: starting with an initial X_{0}, we recursively update X_{i+1} = (c + a*X_{i}) mod m
# for some fixed increment c, prime multiplier a, and modulus m, and then return (X_{0}, ..., X_{n-1})/m. Here c = 0.

####

a <- 65539
m <- 2^31
seed <- 1234

lcg <- function(n, seed, a, m) {
  X <- numeric(n)
  X[1] <- seed
  for (i in 2:n) {
    X[i] <- (a * X[i-1]) %% m  # parentheses are required: %% binds tighter than * in R
  }
  return(X / m)
}

n <- 10000
X <- lcg(n, seed, a, m)

rgl::plot3d(X[1:(n-2)], X[2:(n-1)], X[3:n], xlab = 'X[i]', ylab = 'X[i+1]', zlab = 'X[i+2]', col = 'blue')
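
The planes in the 3D plot can also be checked numerically, without rgl. The sketch below is an illustration rather than part of the original demo: it regenerates the same integer sequence (same a, m, and seed, with c = 0) and uses the fact that a^2 = 6a - 9 (mod 2^31) when a = 2^16 + 3, so every consecutive triple satisfies X[i+2] - 6*X[i+1] + 9*X[i] = k*m for some integer k. The names x, k, and planes are introduced here for illustration only.

```r
# Same generator parameters as above (c = 0)
a <- 65539
m <- 2^31
n <- 10000

# Integer-valued sequence before dividing by m; all products stay below 2^53,
# so double-precision arithmetic is exact here
x <- numeric(n)
x[1] <- 1234
for (i in 2:n) {
  x[i] <- (a * x[i - 1]) %% m
}

# Since a^2 = 6a - 9 (mod m), this combination is always a multiple of m
k <- x[3:n] - 6 * x[2:(n - 1)] + 9 * x[1:(n - 2)]
planes <- sort(unique(k / m))

print(planes)          # each value indexes one plane in the unit cube
print(length(planes))  # at most 15, since the values are integers in -5, ..., 9
```

Each distinct value of k corresponds to one of the parallel planes u3 - 6*u2 + 9*u1 = k that the normalized triples (u1, u2, u3) fall on, which is the structure the rotating plot reveals.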

Module 5: Asymptotic Extensions

Slides

Module 5 - Asymptotic Extensions
Module 5 - Asymptotic Extensions (Annotated)

Assignment

Assignment 5

Optional Readings and Resources

StackExchange - Large sample asymptotic/theory - Why to care about?
A. Buse - The Likelihood Ratio, Wald, and Lagrange Multiplier Tests: An Expository Note
Alan Agresti and Brent A. Coull - Approximate Is Better than “Exact” for Interval Estimation of Binomial Proportions
Brad Efron and David E. Hinkley - Assessing the accuracy of the maximum likelihood estimator: Observed versus expected Fisher information (The math gets pretty heavy after Section 1, but you can still skim through the empirical results)
Aad van der Vaart - Superefficiency (Just Section 27.1, for the story of Hodges’ estimator that appears in the assignment.)

Asymptotics in Action

If you know some basic R and you’ve worked through the assignment, try to figure out what this does and then run it. If you aren’t amazed, you should be!

set.seed(123456)
wald <- 0
score <- 0
J <- 1000
N <- 10000

for (j in 1:J) {
    dat <- rnorm(n=N, mean=10, sd=sqrt(4))
    vardat <- (1/N)*sum((dat-10)^2)

    L_w <- (1-sqrt(2/N)*1.96)*vardat
    U_w <- (1+sqrt(2/N)*1.96)*vardat
    if (L_w < 4 && 4 < U_w) {
        wald <- wald + 1
    }

    L_s <- vardat/(1+sqrt(2/N)*1.96)
    U_s <- vardat/(1-sqrt(2/N)*1.96)
    if (L_s < 4 && 4 < U_s) {
        score <- score + 1
    }
}

score_conf <- score/J
wald_conf <- wald/J
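
Once you have worked out what the loop does, one way to explore further is to wrap the same calculation in a function and rerun it for several sample sizes. The following is only a sketch under stated assumptions: the function name coverage, its arguments, and the chosen sample sizes are hypothetical and not part of the course materials, and N must be larger than about 8 so that 1 - 1.96*sqrt(2/N) stays positive.

```r
# Hypothetical helper: repeats the simulation above for a given sample size N
# and returns the proportion of the J intervals of each type that contain sigma2
coverage <- function(N, J = 1000, mu = 10, sigma2 = 4, z = 1.96) {
  half <- z * sqrt(2 / N)
  # One variance estimate (with known mean mu) per simulated data set
  vardat <- replicate(J, mean((rnorm(N, mean = mu, sd = sqrt(sigma2)) - mu)^2))
  wald <- (1 - half) * vardat < sigma2 & sigma2 < (1 + half) * vardat
  score <- vardat / (1 + half) < sigma2 & sigma2 < vardat / (1 - half)
  c(wald = mean(wald), score = mean(score))
}

set.seed(123456)
sizes <- c(50, 500, 10000)
res <- sapply(sizes, coverage)
colnames(res) <- paste0("N=", sizes)
print(res)
```

Comparing the columns shows how the two proportions behave as N grows, which is the point of the module.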

Module 6: Bayesian Statistics

Slides

Module 6 - Bayesian Statistics
Module 6 - Bayesian Statistics (Annotated)

Assignment

Assignment 6

Optional Readings and Resources

Assessments

Summer 2021

Summer 2021 - Quiz 1
Summer 2021 - Quiz 2
Summer 2021 - Quiz 3
Summer 2021 - Quiz 4
Summer 2021 - Quiz 5
Summer 2021 - Final Assessment

Summer 2022

Summer 2022 - Midterm 1
Summer 2022 - Midterm 2
Summer 2022 - Final Exam

Summer 2024

Summer 2024 - Midterm 1
Summer 2024 - Midterm 2
Summer 2024 - Final Exam


Robert Zimmerman
