Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Pages

Posts

Non-Integer Moments and Distributions

9 minute read

Published:

Does there exist a distribution which is determined only by its non-integer moments? To put it another way, for $p \geq 0$, do there exist random variables $X$ and $Y$ supported on $(0, \infty)$ such that $\mathbb{E}[X^p] = \mathbb{E}[Y^p]$ if and only if $p \not \in \mathbb{N}$?

Limits of Self-Normalized Random Variables

5 minute read

Published:

I recently tweeted something very silly (redundant information, I know). The tweet in question asked the following:

Let $X_0$ be supported on some nonempty $A \subseteq \mathbb{N}_{>0}$ with $\mathbb{P}(X_0 = k) = p_{0,k}$ and $\mathbb{E}[X_0] < \infty$. For each $n \geq 1$, recursively define $X_n$ on $\mathbb{N}_{>0}$ by $\mathbb{P}(X_n = k) = c_n \cdot k \cdot p_{n-1,k}$, where $c_n$ is a normalizing constant. Then, as $n \to \infty$

…then what?
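For readers who want to experiment with the recursion before reading the post, here is a minimal sketch of one way to iterate it numerically. It assumes a finite support so that every $c_n$ is well defined; the starting distribution on $A = \{1, 2, 3\}$ and the helper name `size_bias` are illustrative choices, not taken from the post.

```python
from fractions import Fraction


def size_bias(pmf):
    """Apply one step of the recursion: P(X_n = k) = c_n * k * p_{n-1,k}."""
    weights = {k: k * p for k, p in pmf.items()}
    total = sum(weights.values())  # this is E[X_{n-1}], so c_n = 1 / total
    return {k: w / total for k, w in weights.items()}


# Hypothetical starting distribution p_{0,k} on A = {1, 2, 3} (an assumption).
pmf = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}

history = [pmf]
for n in range(1, 6):
    pmf = size_bias(pmf)
    history.append(pmf)

# Print the first few iterates so the trend in n can be inspected by eye.
for n, p in enumerate(history):
    print(n, {k: float(v) for k, v in p.items()})
```

Exact fractions are used so that each iterate sums to exactly 1. The sketch says nothing about infinite supports, where the normalizing constants need not exist.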

Self-Independence by Ancillarity and Completeness

4 minute read

Published:

Back in 2020, I taught STA261 for the first time. The first part of that course deals with statistics (i.e., functions of random samples, not the subject as a whole!) and I chose to provide a light introduction to completeness because of how elegant the Lehmann-Scheffé theorem and related results in point estimation are down the road…

portfolio

publications

An Introduction to the c-Statistic

STA2101H - Methods of Applied Statistics I (Prof. Jerry Brunner, Fall 2018)

In this project, we provide an introduction to the c-statistic, a measure of discrimination particularly suited to logistic regression. We discuss its advantages over pseudo-R-squared measures and its equivalence to the area under the ROC curve. Through practical examples, such as model selection on a dataset of student grades, we demonstrate the c-statistic's utility while addressing the potential of overfitting. (Download)

How to Estimate 1 and Other Interesting Quantities

STA3431H - Monte Carlo Methods (Prof. Jeffrey S. Rosenthal, Fall 2019)

In this project, we explore methods for estimating expectations and variances of functionals with respect to the limiting distribution of the Anderson-Darling test statistic under the null hypothesis. We derive the density function for this complex distribution and implement efficient algorithms in R to compute it. We then evaluate the effectiveness of random-walk Metropolis, Metropolis-Hastings, and parallel tempering algorithms in sampling from the distribution. (Download)

An Overview of Transport Map MCMC

STA4519H - Optimal Transport (Prof. Leonard Wong, Fall 2019)

In this report, we give an overview of the transport map MCMC algorithm, which integrates the theory of optimal transport into MCMC sampling to improve efficiency in exploring complex target distributions. We discuss the algorithm's use of transport maps to adaptively refine proposal distributions, thereby enhancing sampling performance (particularly in high-dimensional settings). We also examine the theoretical guarantees of ergodicity for the algorithm and discuss the potential for parallelization and future directions for extending the method to large-scale Bayesian problems. (Download)

Simulations of Determinantal Processes

MATH1128H - Topics in Probability: Random Matrices and Random Planar Geometry (Prof. Bálint Virág, Winter 2021)

In this project, we simulate and analyze determinantal point processes (DPPs), a class of random point processes characterized by repulsion. We generate realizations of DPPs including the Gaussian unitary ensemble, circular unitary ensemble, complex Ginibre ensemble, and configurations of non-intersecting random walks. We compare the observed distributions to theoretical joint densities, which allows us to highlight the characteristic repulsion of these processes. (Download)

Report on a Sampling Algorithm for Nested Archimedean Copulas

STA4528H - Dependence Modelling with Applications to Risk Management (Prof. Silvana Pesenti, Winter 2021)

In this report, we review and implement an algorithm of Hofert for sampling from nested Archimedean copulas. We outline the foundational theory of Archimedean and nested Archimedean copulas, including their stochastic representations and sampling challenges. We then apply the sampling algorithm to a range of nested copulas, illustrating its computational efficiency and flexibility across different hierarchical structures. Our simulations highlight the algorithm's strengths, as well as potential limitations when working with non-standard Archimedean generators, offering insights for future applications and improvements. (Download)

Edgeworth Expansions and Saddlepoint Approximations

STA4508H - Topics in Likelihood Inference (Prof. Nancy Reid, Winter 2022)

In this project, we delve into a foundational paper on Edgeworth expansions and saddlepoint approximations by Barndorff-Nielsen and Cox (1979). Our focus is on tracing their derivations and discussing their statistical applications, including conditional likelihood inference and likelihood ratio tests within exponential family models. We follow their exposition closely, providing additional clarity on key derivations and examples, while highlighting the significant impact of their work on the development of modern asymptotic techniques in statistics. (Download)

Copulas and Information Geometry: Strange Bedfellows?

STA4531H - Information Geometry (Prof. Leonard Wong, Fall 2022)

In this project, we critically examine the intersection of copulas and information geometry through a review of two key papers. The first paper explores variational Bayes inference using copulas by integrating a number of information-geometric principles, while the second investigates clustering multivariate time series by comparing divergences between copulas. Our report evaluates the theoretical contributions and practical implications of these works, highlighting their limitations and the challenges inherent in combining these two seemingly unrelated fields. (Download)

Copula Modeling of Serially Correlated Multivariate Data with Hidden Structures

Published in Journal of the American Statistical Association, 2023

We develop an efficient algorithm to fit HMMs with multivariate observations distributed according to state-dependent copulas. Joint work with Radu V. Craiu and Vianey Leos Barajas.

Recommended citation: Zimmerman et al. (2023). Copula Modeling of Serially Correlated Multivariate Data with Hidden Structures. Journal of the American Statistical Association. 119(548):2598-2609. https://www.tandfonline.com/doi/full/10.1080/01621459.2023.2263202

Separating States in Astronomical Sources Using Hidden Markov Models: With a Case Study of Flaring and Quiescence on EV Lac

Published in Monthly Notices of the Royal Astronomical Society, 2024

We use state-space models to distinguish between different states in astronomical sources with count data. Joint work with David A. van Dyk, Vinay L. Kashyap, and Aneta Siemiginowska.

Recommended citation: Zimmerman et al. (2024). Separating States in Astronomical Sources Using Hidden Markov Models: With a Case Study of Flaring and Quiescence on EV Lac. Monthly Notices of the Royal Astronomical Society. 534(3):2142-2167. https://academic.oup.com/mnras/article/534/3/2142/7754163

talks

teaching

STA201: Why Numbers Matter

Undergraduate course, University of Toronto, Winter 2019

STA201 is intended to provide an introduction to quantitative reasoning in a variety of fields to non-science students at the University of Toronto. I co-taught this course in Winter 2019 with Prof. Karen Huynh Wong.

STA261: Probability and Statistics II

Undergraduate course, University of Toronto, Summer 2020, Summer 2021, Summer 2022, Summer 2024

STA261 is intended to be a rigorous (but friendly) introduction to mathematical statistics for students in the Theory and Methods Statistics Specialist program at the University of Toronto. I taught this course four times, from 2020 to 2024.

STA2311: Advanced Computational Methods for Statistics I

PhD course, University of Toronto, Fall 2023

STA2311 is a new course required for most first-year students in the Statistical Theory and Applications PhD stream at the University of Toronto. The course, which examines optimization and sampling techniques (focusing on both underlying motivation and theoretical justification), was fully designed by Prof. Radu V. Craiu and me.

underrevision