Speaker: Dr. Maximillian Chen (Johns Hopkins University Applied Physics Laboratory)

  • Event Date: 2023-12-29


Topic: Inferential Procedures for Matrix- and Tensor-Variate Data

Date Time: Fri. Dec 29, 2023, 10:10 AM - 11:00 AM

Place: 4F-427, Assembly Building I

Online Seminars - Google Meet

 
Abstract

High-dimensional data analysis has been a prominent topic of statistical research in recent years, driven by the growing volume of high-dimensional electronic data. Much of the current work analyzes samples of high-dimensional multivariate data; far less research has addressed matrix- and tensor-variate data. A matrix-variate dataset can consist of independent images, while a tensor-variate dataset can consist of correlated images, such as frames extracted from video or medical images taken of the same patient at multiple time points. It is particularly important to develop inferential methods for these types of datasets that account for the dependencies they contain, reduce the data to its most important components, and still allow conclusions to be drawn. Two possible frameworks are the population value decomposition (PVD), introduced in Crainiceanu et al. (2011), for matrix data and the third-order Tucker decomposition for three-dimensional tensor data. In both frameworks, a matrix or three-dimensional tensor is decomposed into a product of two matrices/tensors with population-specific features and one matrix/tensor with subject-specific features. We develop inferential procedures for detecting significant differences in matrix- and tensor-variate datasets. For matrix-variate data, we assume the data follow a matrix normal distribution and model them with the PVD framework. For tensor-variate data, we assume the data follow a tensor normal distribution and model them with a third-order Tucker decomposition. For both setups, we introduce likelihood-ratio tests, score tests, and regression-based tests for the one-, two-, and k-population problems and derive the distributions of the resulting test statistics.
We implement our methods on simulated data, a facial imagery dataset, and a real train video dataset, and we conclude by discussing our results and areas for future work.