Seminar  Speaker: Prof. Lih-Yuan Deng (鄧利源) (Department of Mathematical Sciences, University of Memphis)

Title: Subdata for high-impact-variable selection in rare event classification with application to Taiwan bankruptcy data
Time: Friday, April 18, 2025 (ROC year 114), 10:10-11:00 AM
    (Tea reception 09:50-10:10 AM in Room 428, 綜合一館)
Venue: Room 427, 綜合一館
 
Abstract
 
 
Big data with large numbers of both observations and potential input variables carry an increased risk of overfitting when building statistical models. We propose a general variable selection procedure that identifies the key input variables by applying elastic net regression to representative subdata rather than to the full sample. We combine the lists of variables selected from each subdata through ensemble techniques, using the frequency with which a variable is selected across the different subdata as a measure of its importance. Using only variables that are chosen frequently (e.g., in 90% or 100% of the subdata), we are able to build a parsimonious model that optimizes predictive accuracy. We adapt this method to the rare event setting and show its application to Taiwanese bankruptcy data.
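
The ensemble step described in the abstract, in which each variable is ranked by how often it is selected across subdata, can be illustrated with a minimal sketch. This is not the speaker's implementation: the function names `selection_frequency` and `high_impact`, the variable labels `x1`, `x2`, ..., and the ten toy selection lists are all hypothetical, and the elastic net fits that would produce such lists are assumed to have been run already.

```python
from collections import Counter


def selection_frequency(selected_lists):
    """Return the fraction of subdata fits in which each variable was selected.

    `selected_lists` holds one list of variable names per subdata fit
    (hypothetical input; in the abstract these would come from elastic net).
    """
    n_fits = len(selected_lists)
    if n_fits == 0:
        raise ValueError("need at least one subdata selection list")
    # set() guards against counting a variable twice within a single fit.
    counts = Counter(v for sel in selected_lists for v in set(sel))
    return {v: c / n_fits for v, c in counts.items()}


def high_impact(freqs, threshold=0.9):
    """Keep the variables whose selection frequency meets the threshold."""
    return sorted(v for v, f in freqs.items() if f >= threshold)


# Toy, made-up selections from 10 subdata fits (illustrative only).
toy_selections = [
    ["x1", "x2", "x5"],
    ["x1", "x2"],
    ["x1", "x2", "x7"],
    ["x1", "x5"],
    ["x1", "x2", "x5"],
    ["x1", "x2"],
    ["x1", "x2"],
    ["x1", "x2", "x5"],
    ["x1", "x2"],
    ["x1", "x2"],
]

freqs = selection_frequency(toy_selections)
print(high_impact(freqs, threshold=0.9))
print(high_impact(freqs, threshold=1.0))
```

In this toy input, `x1` appears in all ten lists and `x2` in nine, so the 90% cutoff keeps both while the 100% cutoff keeps only `x1`. The choice of cutoff trades model size against the risk of dropping a useful variable.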
 
            
 
 