Confidence intervals for the precision matrix in high-dimensional network data |
| |
Authors: | ZHENG Zemin, ZHOU Huiting |
| |
Affiliation: | Department of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei 230026 |
| |
Funding: | Supported by the National Natural Science Foundation of China (1601501, 11671374, 71731010) |
| |
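Abstract: | With the development of the Internet and of science and technology, big data has surged on an unprecedented scale, and complex network data have formed among different individuals. Confidence intervals for the precision (inverse covariance) matrix of a graphical model play a very important role in recovering the connections within a network. How to obtain confidence intervals for the precision matrix quickly is a pressing problem. This paper proposes the De-ISEE (De-innovated scalable efficient estimation) statistic; the confidence intervals constructed from it maintain a desirable coverage rate while substantially improving computational efficiency. Simulation experiments demonstrate the method's advantages in coverage and computation on network data. Applying the De-ISEE method to riboflavin data and gene expression data shows that it can serve as an important tool for studying gene associations.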
|
Keywords: | network data; high-dimensional graphical model; confidence interval; precision matrix; debiased statistic |
Received: | 2020-04-15 |
Revised: | 2020-06-28 |
Research on confidence intervals for the precision matrix in high-dimensional network data |
| |
Authors: | ZHENG Zemin ZHOU Huiting |
| |
Institution: | Department of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei 230026, China |
| |
Abstract: | With the development of the Internet, science, and technology, the surge of big data on an unprecedented scale has given rise to complex network data among different individuals. Studying confidence intervals for the precision (inverse covariance) matrix in graphical models is of practical significance for uncovering network connections. A natural and important question is how to obtain confidence intervals for the precision matrix efficiently. This paper proposes the De-ISEE (De-innovated scalable efficient estimation) statistic, whose confidence intervals are efficient to compute while maintaining a desirable coverage rate. Numerical studies on network data demonstrate the method's advantages in both average coverage and computation. Moreover, this paper applies the De-ISEE method to riboflavin data and gene expression data, and finds that the De-ISEE method could be an important tool for studying gene associations. |
| |
Keywords: | network data; high-dimensional graphical models; confidence intervals; precision matrix; de-sparsified statistic |
|
|