12/27 Seminar, Speaker: Professor Lih-Yuan Deng (University of Memphis, U.S.A.)
Title: Big Data Model Building using Dimension Reduction and Sample Selection
Speaker: Professor Lih-Yuan Deng (鄧利源)
University of Memphis, U.S.A.
Time: Tuesday, December 27, 2022, 15:30-16:20
Venue: Room AB102, 綜合一館
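The talk will be streamed live on Google Meet.
The meeting opens 20 minutes before the talk begins; click the link below and press "Ask to join" to enter.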
Abstract
Today's computational resources and techniques struggle to handle the extraordinary volume of data generated in many fields, and applying conventional statistical techniques in the big data setting is a challenge. It is common to divide big data into subdata for training, testing, and validation. The main purpose of the training data is to learn the objective task in a way that is also suitable for the big data. To achieve this goal, it is essential to choose training subdata that retain the characteristics of the big data. Recently, several procedures have been proposed to select “optimal design points” as training subdata under pre-specified models. However, these subdata will not be “optimal” if the assumed model is not appropriate. Furthermore, such subdata are not useful for building alternative models because they are not “similar” to the original big data. In this talk, we propose a new algorithm for better model building and prediction via a process of selecting a “good” training sample. The proposed subdata retain most characteristics of the original big data and should be more robust in the sense that one can entertain various response models.