12/23 Seminar. Speaker: Professor Lih-Yuan Deng (University of Memphis, U.S.A.)
Title: Random Integrated Subdata Ensemble (RISE) for Big Data Model Building
Speaker: Professor Lih-Yuan Deng (鄧利源)
University of Memphis, U.S.A.
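Time: Friday, December 23, 2022 (ROC year 111), 10:40-11:30 a.m.
(A tea reception will be held 10:20-10:40 a.m. in Room 428, Institute of Statistics, Chiao Tung University)
Venue: Room 427, Assembly Building I (綜合一館)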
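The talk will be streamed live on Google Meet.
The meeting opens 20 minutes before the talk begins; click the link below and then press "Ask to join" (要求加入).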
Abstract
We discuss our newly proposed Random Integrated Subdata Ensemble (RISE) method for big data tasks such as variable selection and model building. Generally speaking, a large sample size tends to make some variables statistically significant even when they have little practical importance, so such variables are more likely to be over-selected simply because the sample is (relatively) large. In most "big data" applications, the number of observations in the whole data set, and in the training data, can easily exceed a few thousand. Consequently, most typical statistical variable selection procedures tend to over-select less important variables that are merely statistically significant. RISE is proposed as a better strategy: it selects and analyzes multiple subdata sets of smaller size, then combines (ensembles) their results to build a more efficient and reliable model.
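
The over-selection effect described in the abstract can be illustrated with a small simulation. The sketch below is not the speaker's RISE algorithm, whose details are not given here. It only shows the general idea of testing variables on many small random subsamples and keeping those selected in a majority of them. The sample sizes, the effect size 0.05, the marginal correlation test, and the 0.5 voting threshold are all assumptions chosen for illustration.

```python
import math
import random


def corr_t_stat(x, y):
    """t-statistic for testing zero correlation between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt((n - 2) / (1 - r * r))


def is_significant(x, y, crit=1.96):
    # Normal approximation to the two-sided 5% critical value.
    return abs(corr_t_stat(x, y)) > crit


rng = random.Random(0)
n, m, n_sub = 50_000, 500, 50  # hypothetical full size, subdata size, number of subsamples

# Simulated data: one variable with a large effect, one with a negligible effect.
strong = [rng.gauss(0, 1) for _ in range(n)]
weak = [rng.gauss(0, 1) for _ in range(n)]
y = [s + 0.05 * w + rng.gauss(0, 1) for s, w in zip(strong, weak)]

# Test each variable once on the full data.
full_strong = is_significant(strong, y)
full_weak = is_significant(weak, y)

# Test each variable on many small random subsamples and count selections.
counts = {"strong": 0, "weak": 0}
for _ in range(n_sub):
    idx = rng.sample(range(n), m)  # subsample without replacement
    ys = [y[i] for i in idx]
    counts["strong"] += is_significant([strong[i] for i in idx], ys)
    counts["weak"] += is_significant([weak[i] for i in idx], ys)

# Keep variables selected in at least half of the subsamples (assumed voting rule).
freq = {name: c / n_sub for name, c in counts.items()}
selected = [name for name, f in freq.items() if f >= 0.5]

print("significant on full data:", {"strong": full_strong, "weak": full_weak})
print("selection frequency over subsamples:", freq)
print("kept by majority vote:", selected)
```

On the full data both variables pass the significance test, while across the small subsamples the negligible variable is selected only occasionally, so a majority vote keeps just the practically important one. A real analysis would fit a joint model on each subsample instead of using marginal tests.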