Similar articles
20 similar articles found; search time 31 ms
1.
This article is concerned with the question of whether the missing data mechanism routinely referred to as missing completely at random (MCAR) is statistically examinable via a test for lack of distributional differences between groups with observed and missing data, and related consequences. A discussion is initially provided, from a formal logic standpoint, of the distinction between necessary conditions and sufficient conditions. This distinction is then used to argue that testing for lack of these group distributional differences is not a test for MCAR, and an example is given. The view is next presented that the desirability of MCAR has been frequently overrated in empirical research. The article concludes with a reference to principled, likelihood-based methods for analyzing incomplete data sets in social and behavioral research.

2.
A multiple testing procedure for examining implications of the missing completely at random (MCAR) mechanism in incomplete data sets is discussed. The approach uses the false discovery rate concept and is concerned with testing group differences on a set of variables. The method can be used for ascertaining violations of MCAR and disproving this mechanism in empirical behavioral and social research. The procedure can also be employed when locating violations of MCAR in observed measures is of interest. The outlined approach is illustrated with data from a cognitive intervention study.
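
The abstract does not say which false discovery rate procedure is used. As a minimal sketch, assuming the common Benjamini-Hochberg step-up rule, the per-variable p values from comparing the observed-data and missing-data groups could be screened as follows. The function name `benjamini_hochberg` and the p values are hypothetical illustrations, not taken from the article.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Assumption: this is the Benjamini-Hochberg step-up rule, chosen here
    only for illustration; the article's own procedure may differ.
    """
    m = len(p_values)
    # Indices of the p values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p value is at most (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p values from group-difference tests on ten variables.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(p_vals, q=0.05)
print([i for i, r in enumerate(flags) if r])
```

Any variable flagged this way counts as evidence against MCAR. An empty result does not establish MCAR, because equal group distributions are only a necessary condition for it.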

3.
A procedure for evaluating candidate auxiliary variable correlations with response variables in incomplete data sets is outlined. The method provides point and interval estimates of the outcome-residual correlations with potentially useful auxiliaries, and of the bivariate correlations of outcome(s) with the latter variables. Auxiliary variables found in this way can enhance considerably the plausibility of the popular missing at random (MAR) assumption if included in ensuing maximum likelihood analyses, or can alternatively be incorporated in imputation models for subsequent multiple imputation analyses. The approach can be particularly helpful in empirical settings where violations of the MAR assumption are suspected, as is the case in many longitudinal studies, and is illustrated with data from cognitive aging research.

4.
The purpose of this study is to investigate the effects of missing data techniques in longitudinal studies under diverse conditions. A Monte Carlo simulation examined the performance of 3 missing data methods in latent growth modeling: listwise deletion (LD), maximum likelihood estimation using the expectation-maximization algorithm with a nonnormality correction (robust ML), and the pairwise asymptotically distribution-free method (pairwise ADF). The effects of 3 independent variables (sample size, missing data mechanism, and distribution shape) were investigated on convergence rate, parameter and standard error estimation, and model fit. The results favored robust ML over LD and pairwise ADF in almost all respects. The exceptions included convergence rates under the most severe nonnormality in the missing not at random (MNAR) condition and recovery of standard error estimates across sample sizes. The results also indicate that nonnormality, small sample size, MNAR, and multicollinearity might adversely affect convergence rate and the validity of statistical inferences concerning parameter estimates and model fit statistics.

5.
Missing data are common in studies that rely on multiple informant data to evaluate relationships among variables for distinguishable individuals clustered within groups. Estimation of structural equation models using raw data allows for incomplete data, and so all groups can be retained for analysis even if only 1 member of a group contributes data. Statistical inference is based on the assumption that data are missing completely at random or missing at random. Importantly, whether or not data are missing is assumed to be independent of the missing data. A saturated correlates model that incorporates correlates of the missingness or the missing data into an analysis and multiple imputation that might also use such correlates offer advantages over the standard implementation of SEM when data are not missing at random because these approaches could result in a data analysis problem for which the missingness is ignorable. This article considers these approaches in an analysis of family data to assess the sensitivity of parameter estimates and statistical inferences to assumptions about missing data, a strategy that could be easily implemented using SEM software.

6.
A 2-stage robust procedure as well as an R package, rsem, were recently developed for structural equation modeling with nonnormal missing data by Yuan and Zhang (2012). Several test statistics that have been used for complete data analysis are employed to evaluate model fit in the 2-stage robust method. However, properties of these statistics under robust procedures for incomplete nonnormal data analysis have never been studied. This study aims to systematically evaluate and compare 5 test statistics, including a test statistic derived from normal-distribution-based maximum likelihood, a rescaled chi-square statistic, an adjusted chi-square statistic, a corrected residual-based asymptotically distribution-free chi-square statistic, and a residual-based F statistic. These statistics are evaluated under a linear growth curve model by varying 8 factors: population distribution, missing data mechanism, missing data rate, sample size, number of measurement occasions, covariance between the latent intercept and slope, variance of measurement errors, and downweighting rate of the 2-stage robust method. The performance of the test statistics varies and the one derived from the 2-stage normal-distribution-based maximum likelihood performs much worse than the other four. Application of the 2-stage robust method and of the test statistics is illustrated through growth curve analysis of mathematical ability development, using data on the Peabody Individual Achievement Test mathematics assessment from the National Longitudinal Survey of Youth 1997 Cohort.

7.
A 2-stage procedure for estimation and testing of observed measure correlations in the presence of missing data is discussed. The approach uses maximum likelihood for estimation and the false discovery rate concept for correlation testing. The method can be used in initial exploration-oriented empirical studies with missing data, where it is of interest to estimate manifest variable interrelationship indexes and test hypotheses about their population values. The procedure is applicable also with violations of the underlying missing at random assumption, via inclusion of auxiliary variables. The outlined approach is illustrated with data from an aging research study.

8.
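This article explains the concept and significance of data cleaning, introduces the main methods used for missing-value imputation and outlier detection, and points out problems that require further research.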

9.
Respondent attrition is a common problem in national longitudinal panel surveys. To make full use of the data, weights are provided to account for attrition. Weight adjustments are based on sampling design information and data from the base year; information from subsequent waves is typically not utilized. Alternative methods to address bias from nonresponse are full information maximum likelihood (FIML) or multiple imputation (MI). The effects on bias of growth parameter estimates from using these methods are compared via a simulation study. The results indicate that caution is needed when utilizing panel weights in the presence of missing data, and that methods like FIML and MI, which are not as susceptible to the omission of important auxiliary variables, should be considered.

10.
Myriad approaches for handling missing data exist in the literature. However, few studies have investigated the tenability and utility of these approaches when used with intensive longitudinal data. In this study, we compare and illustrate two multiple imputation (MI) approaches for coping with missingness in fitting multivariate time-series models under different missing data mechanisms. They include a full MI approach, in which all dependent variables and covariates are imputed simultaneously, and a partial MI approach, in which missing covariates are imputed with MI, whereas missingness in the dependent variables is handled via full information maximum likelihood estimation. We found that under correctly specified models, partial MI produces the best overall estimation results. We discuss the strengths and limitations of the two MI approaches, and demonstrate their use with an empirical data set in which children’s influences on parental conflicts are modeled as covariates over the course of 15 days (Schermerhorn, Chow, & Cummings, 2010).

11.
Multivariate analysis of variance (MANOVA) is widely used in educational research to compare means on multiple dependent variables across groups. Researchers faced with the problem of missing data often use multiple imputation of values in place of the missing observations. This study compares the performance of 2 methods for combining p values in the context of a MANOVA with the typical default for dealing with missing data: listwise deletion. When data are missing at random, the new methods maintained the nominal Type I error rate and had power comparable to the complete data condition. When 40% of the data were missing completely at random, the Type I error rates for the new methods were inflated, but not for lower percentages.

12.
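This article discusses estimation for varying-coefficient partially linear models when the response variable is missing at random. Based on the idea of local imputation, the local linear method and an averaging technique are used to obtain estimates of all the quantities of interest simultaneously, and the asymptotic properties of the estimators are then given.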

13.
Competence data from low-stakes educational large-scale assessment studies allow for evaluating relationships between competencies and other variables. The impact of item-level nonresponse has not been investigated with regard to statistics that determine the size of these relationships (e.g., correlations, regression coefficients). Classical approaches such as ignoring missing values or treating them as incorrect are currently applied in many large-scale studies, while recent model-based approaches that can account for nonignorable nonresponse have been developed. Estimates of item and person parameters have been demonstrated to be biased for classical approaches when missing data are missing not at random (MNAR). In our study, we focus on parameter estimates of the structural model (i.e., the true regression coefficient when regressing competence on an explanatory variable), simulating data according to various missing data mechanisms. We found that model-based approaches and ignoring missing values performed well in retrieving regression coefficients even when we induced missing data that were MNAR. Treating missing values as incorrect responses can lead to substantial bias. We demonstrate the validity of our approach empirically and discuss the relevance of our results.

14.
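An RBF-based method for repairing missing time-series data is proposed. RBF is used to build a trained relationship between template data and the current data containing missing values, and the missing data are repaired through that relationship. Experiments show that the method can be applied to tracking the motion or deformation of rigid and non-rigid bodies, and that it is an effective method for repairing missing time-series data.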

15.
As useful multivariate techniques, structural equation models have attracted significant attention from various fields. Most existing statistical methods and software for analyzing structural equation models have been developed based on the assumption that the response variables are normally distributed. Several recently developed methods can partially address violations of this assumption, but still encounter difficulties in analyzing highly nonnormal data. Moreover, the presence of missing data is a practical issue in substantive research. Simply ignoring missing data or improperly treating nonignorable missingness as ignorable could seriously distort statistical inference results. The main objective of this article is to develop a Bayesian approach for analyzing transformation structural equation models with highly nonnormal and missing data. Different types of missingness are discussed and selected via the deviance information criterion. The empirical performance of our method is examined via simulation studies. Application to a study concerning people’s job satisfaction, home life, and work attitude is presented.

16.
Rubin’s classic missingness mechanisms are central to handling missing data and minimizing biases that can arise due to missingness. However, the formulaic expressions that posit certain independencies among missing and observed data are difficult to grasp. As a result, applied researchers often rely on informal translations of these assumptions. We present a graphical representation of missing data mechanisms, formalized in Mohan, Pearl, and Tian (2013). We show that graphical models provide a tool for comprehending, encoding, and communicating assumptions about the missingness process. Furthermore, we demonstrate on several examples how graph-theoretical criteria can determine whether biases due to missing data might emerge in some estimates of interest and which auxiliary variables are needed to control for such biases, given assumptions about the missingness process.

17.
A well-known ad-hoc approach to conducting structural equation modeling with missing data is to obtain a saturated maximum likelihood (ML) estimate of the population covariance matrix and then to use this estimate in the complete data ML fitting function to obtain parameter estimates. This 2-stage (TS) approach is appealing because it minimizes a familiar function while being only marginally less efficient than the full information ML (FIML) approach. Additional advantages of the TS approach include that it allows for easy incorporation of auxiliary variables and that it is more stable in smaller samples. The main disadvantage is that the standard errors and test statistics provided by the complete data routine will not be correct. Empirical approaches to finding the right corrections for the TS approach have failed to provide unequivocal solutions. In this article, correct standard errors and test statistics for the TS approach with missing completely at random and missing at random normally distributed data are developed and studied. The new TS approach performs well in all conditions, is only marginally less efficient than the FIML approach (and is sometimes more efficient), and has good coverage. Additionally, the residual-based TS statistic outperforms the FIML test statistic in smaller samples. The TS method is thus a viable alternative to FIML, especially in small samples, and its further study is encouraged.

18.
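The EM algorithm is a general iterative method for parameter estimation with incomplete data and has been widely applied in many fields of modern science. This article derives EM algorithms for estimating the recombination rate between two loci with different marker types, covering three patterns (codominant-codominant, codominant-dominant, and dominant-dominant), when marker genotypes are missing for some individuals. A Monte Carlo simulation study carried out with a purpose-written SAS/IML program verifies the effectiveness and practicality of the described method in genetic linkage analysis.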

19.
Methods of uniform differential item functioning (DIF) detection have been extensively studied in the complete data case. However, less work has been done examining the performance of these methods when missing item responses are present. Research that has been done in this regard appears to indicate that treating missing item responses as incorrect can lead to inflated Type I error rates (false detection of DIF). The current study builds on this prior research by investigating the utility of multiple imputation methods for missing item responses, in conjunction with standard DIF detection techniques. Results of the study support the use of multiple imputation for dealing with missing item responses. The article concludes with a discussion of these results for multiple imputation in conjunction with other research findings supporting its use in the context of item parameter estimation with missing data.

20.
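A spatial data mining model based on a spatial data warehouse is proposed. The article discusses the goals of four kinds of spatial data mining tasks (spatial association rule discovery, spatial classification discovery, spatial clustering discovery, and spatial data summarization), the mining methods adopted, and the methods for processing spatially sampled data. Finally, problems requiring further study and exploration are raised.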
