
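School 10th Anniversary Academic Lecture Series, No. 11: Lecture by Professor Jing Zeng (曾靖), University of Science and Technology of China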

Published: 2025-05-06

Time: May 16, 2025, 15:00-16:00

Venue: Room 106, School of Statistics and Data Science

Title: Best Subset Selection EM Algorithm: Statistical Analysis and Applications

Abstract: The mixed linear regression (MLR) model is extensively used to model data from heterogeneous populations. When the data are ultra-high-dimensional, heterogeneity and high dimensionality together pose great challenges for parameter estimation. While some works have been devoted to addressing these challenges, they have various limitations, such as the absence of statistical analysis for the algorithm iterate sequence or a lack of theoretical guarantees for variable selection consistency. In this article, we develop an $L_{2,0}$-constrained expectation-maximization (EM) algorithm and propose an efficient algorithm for solving the $L_{2,0}$-penalized optimization problem by leveraging a best subset selection approach. We also introduce an information criterion for selecting the sparsity level and establish its consistency. Theoretically, we establish a non-asymptotic error bound for the algorithm iterate sequence and prove that the proposed procedure accurately recovers the important variables. Numerically, our theoretical findings are supported by extensive studies on both synthetic data and real data from the Cancer Cell Line Encyclopedia (CCLE).

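About the speaker: Jing Zeng (曾靖) is a specially appointed associate professor (特任副教授) at the School of Management, University of Science and Technology of China. Zeng received a bachelor's degree from the University of Science and Technology of China in 2017 and a Ph.D. from Florida State University, USA, in 2022. Current research interests include dimension reduction, high-dimensional data analysis, tensor data analysis, robust statistics, mixture models, and transfer learning. Zeng has published multiple papers in journals including the Journal of the American Statistical Association and Statistica Sinica, and currently leads a project funded by the Young Scientists Fund of the National Natural Science Foundation of China.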


Editor in charge: Zhang Hongyun (张红云)
