Speaker: Professor Lih-Yuan Deng (Department of Mathematical Sciences, University of Memphis)

  • Event Date: 2025-04-18


Topic: Subdata for high-impact-variable selection in rare event classification with application to Taiwan bankruptcy data

Time: Friday, April 18, 2025, 10:10-11:00

Place: 4F-427, Assembly Building I

Online Seminar: Google Meet

           
Abstract
 
Big data sets with large numbers of both observations and potential input variables carry an increased risk of overfitting when building statistical models. We propose a general variable selection procedure that identifies the key input variables by applying elastic net regression to representative subdata in place of the full sample. We combine the lists of variables selected from each subdata set through ensemble techniques, using the frequency with which a variable is selected across the subdata sets as a measure of its importance. Using only variables that are chosen frequently (e.g., in 90% or 100% of the subdata sets), we are able to build a parsimonious model that optimizes predictive accuracy. We adapt this method to the rare event setting and show its application to Taiwanese bankruptcy data.
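
The selection-frequency idea in the abstract can be illustrated with a minimal sketch. This is not the speaker's implementation. It assumes that each subdata set keeps every rare-event row and adds a random draw of non-event rows, that a plain coordinate-descent elastic net is fit to the 0/1 response, and that a variable counts as selected when its coefficient is nonzero. The function names (`elastic_net_cd`, `selection_frequency`), the penalty settings, the subdata sizes, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def elastic_net_cd(X, y, alpha=0.05, l1_ratio=0.5, n_iter=100):
    """Elastic net by cyclic coordinate descent (illustrative, not optimized).

    Minimizes (1/2n)||y - Xb||^2 + alpha*l1_ratio*||b||_1
              + 0.5*alpha*(1 - l1_ratio)*||b||_2^2.
    Assumes the columns of X are standardized and y is centered.
    """
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.copy()                      # current residual y - X @ beta
    col_sq = (X ** 2).sum(axis=0) / n     # per-column mean squares
    l1 = alpha * l1_ratio
    l2 = alpha * (1.0 - l1_ratio)
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue                  # constant column: never selected
            # Covariance of column j with the partial residual
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            # Soft-threshold (L1 part), then shrink (L2 part)
            new = np.sign(rho) * max(abs(rho) - l1, 0.0) / (col_sq[j] + l2)
            if new != beta[j]:
                resid += X[:, j] * (beta[j] - new)
                beta[j] = new
    return beta


def selection_frequency(X, y, n_subdata=20, majority_size=200, seed=0, **enet_kwargs):
    """Fraction of subdata sets in which each variable gets a nonzero coefficient.

    Assumption: each subdata set contains all rare events (y == 1) plus a
    random sample of non-events (y == 0) drawn without replacement.
    """
    rng = np.random.default_rng(seed)
    rare = np.flatnonzero(y == 1)
    common = np.flatnonzero(y == 0)
    counts = np.zeros(X.shape[1])
    for _ in range(n_subdata):
        pick = rng.choice(common, size=min(majority_size, common.size), replace=False)
        idx = np.concatenate([rare, pick])
        Xs, ys = X[idx], y[idx].astype(float)
        sd = Xs.std(axis=0)
        sd[sd == 0] = 1.0                 # avoid division by zero
        Xs = (Xs - Xs.mean(axis=0)) / sd  # standardize within the subdata set
        ys = ys - ys.mean()
        beta = elastic_net_cd(Xs, ys, **enet_kwargs)
        counts += beta != 0
    return counts / n_subdata


# Synthetic rare-event data: only the first two of 20 variables matter.
rng = np.random.default_rng(1)
n, p = 5000, 20
X = rng.normal(size=(n, p))
logit = -4.0 + 2.0 * X[:, 0] - 2.0 * X[:, 1]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

freq = selection_frequency(X, y)
selected = np.flatnonzero(freq >= 0.9)    # keep variables chosen in at least 90% of subdata sets
print("selection frequency:", np.round(freq, 2))
print("selected variables:", selected)
```

Variables whose frequency meets the cutoff would then be passed to a final, smaller model. The 0.9 cutoff mirrors the 90% example in the abstract; how the subdata are actually constructed and how the penalty is tuned are details for the talk itself.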