Similar literature
1.
Missing data are common in studies that rely on multiple informant data to evaluate relationships among variables for distinguishable individuals clustered within groups. Estimation of structural equation models using raw data allows for incomplete data, and so all groups can be retained for analysis even if only 1 member of a group contributes data. Statistical inference is based on the assumption that data are missing completely at random or missing at random. Importantly, whether or not data are missing is assumed to be independent of the missing data. A saturated correlates model that incorporates correlates of the missingness or the missing data into an analysis and multiple imputation that might also use such correlates offer advantages over the standard implementation of SEM when data are not missing at random because these approaches could result in a data analysis problem for which the missingness is ignorable. This article considers these approaches in an analysis of family data to assess the sensitivity of parameter estimates and statistical inferences to assumptions about missing data, a strategy that could be easily implemented using SEM software.  相似文献   

2.
A 2-stage procedure for estimation and testing of observed measure correlations in the presence of missing data is discussed. The approach uses maximum likelihood for estimation and the false discovery rate concept for correlation testing. The method can be used in initial exploration-oriented empirical studies with missing data, where it is of interest to estimate manifest variable interrelationship indexes and test hypotheses about their population values. The procedure is applicable also with violations of the underlying missing at random assumption, via inclusion of auxiliary variables. The outlined approach is illustrated with data from an aging research study.  相似文献   

3.
Using Monte Carlo simulations, this research examined the performance of four missing data methods in SEM under different multivariate distributional conditions. The effects of four independent variables (sample size, missing proportion, distribution shape, and factor loading magnitude) were investigated on six outcome variables: convergence rate, parameter estimate bias, MSE of parameter estimates, standard error coverage, model rejection rate, and model goodness of fit—RMSEA. A three-factor CFA model was used. Findings indicated that FIML outperformed the other methods in MCAR, and MI should be used to increase the plausibility of MAR. SRPI was not comparable to the other three methods in either MCAR or MAR.  相似文献   

4.
Myriad approaches for handling missing data exist in the literature. However, few studies have investigated the tenability and utility of these approaches when used with intensive longitudinal data. In this study, we compare and illustrate two multiple imputation (MI) approaches for coping with missingness in fitting multivariate time-series models under different missing data mechanisms. They include a full MI approach, in which all dependent variables and covariates are imputed simultaneously, and a partial MI approach, in which missing covariates are imputed with MI, whereas missingness in the dependent variables is handled via full information maximum likelihood estimation. We found that under correctly specified models, partial MI produces the best overall estimation results. We discuss the strengths and limitations of the two MI approaches, and demonstrate their use with an empirical data set in which children’s influences on parental conflicts are modeled as covariates over the course of 15 days (Schermerhorn, Chow, & Cummings, 2010).  相似文献   

5.
A multiple testing procedure for examining the assumption of normality that is often made in analyses of incomplete data sets is outlined. The method is concerned with testing normality within each missingness pattern and arriving at an overall statement about normality using the available data. The approach is readily applied in empirical research with missing data using the popular software Mplus, Stata, and R. The procedure can be used to ascertain a main assumption underlying frequent applications of maximum likelihood in incomplete data modeling with continuous outcomes. The discussed approach is illustrated with numerical examples.  相似文献   

6.
Difficulties arise in multiple-group evaluations of factorial invariance if particular manifest variables are missing completely in certain groups. Ad hoc analytic alternatives can be used in such situations (e.g., deleting manifest variables), but some common approaches, such as multiple imputation, are not viable. At least 3 solutions to this problem are viable: analyzing differing sets of variables across groups, using pattern mixture approaches, and a new method using random number generation. The latter solution, proposed in this article, is to generate pseudo-random normal deviates for all observations for manifest variables that are missing completely in a given sample and then to specify multiple-group models in a way that respects the random nature of these values. An empirical example is presented in detail comparing the 3 approaches. The proposed solution can enable quantitative comparisons at the latent variable level between groups using programs that require the same number of manifest variables in each group.  相似文献   
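A minimal sketch of the data-preparation step described in item 6, assuming each observation is stored as a dictionary keyed by variable name. The function name `fill_missing_variable`, the standard normal distribution, and the seed argument are illustrative assumptions, not details taken from the article. The sketch covers only the generation of the placeholder values, not the specification of the multiple-group model that must then treat those values as random.

```python
import random


def fill_missing_variable(rows, variable, seed=0):
    """Return copies of `rows` with `variable` set to a pseudo-random normal deviate.

    Hypothetical helper: intended for a manifest variable that is missing
    completely in one group, so that every group ends up with the same
    number of manifest variables.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    filled = []
    for row in rows:
        new_row = dict(row)  # copy, so the original data are left untouched
        new_row[variable] = rng.gauss(0.0, 1.0)  # assumed N(0, 1) deviates
        filled.append(new_row)
    return filled
```

The generated values carry no information about the construct, which is why the abstract stresses that the subsequent model must be specified in a way that respects their random nature.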

7.
Respondent attrition is a common problem in national longitudinal panel surveys. To make full use of the data, weights are provided to account for attrition. Weight adjustments are based on sampling design information and data from the base year; information from subsequent waves is typically not utilized. Alternative methods to address bias from nonresponse are full information maximum likelihood (FIML) or multiple imputation (MI). The effects on bias of growth parameter estimates from using these methods are compared via a simulation study. The results indicate that caution is needed when using panel weights in the presence of missing data, and that methods such as FIML and MI, which are less susceptible to the omission of important auxiliary variables, should be considered.

8.
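This paper considers estimation of the response mean in a linear model when the response variable is missing at random. Estimators of the response mean are obtained from the completely observed sample data, from the "complete sample" produced by linear regression imputation, and from the "complete sample" produced by inverse probability weighted imputation, and their asymptotic normality is proved.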
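The three kinds of estimator named in item 8 can be illustrated with a small sketch. It assumes a single covariate `x`, a list `observed` of True/False flags, and, for the weighted version, response probabilities `propensity` that are simply supplied by the caller. The function names are hypothetical, and the weighted estimator shown is a simple normalized inverse-probability-weighted mean, which may differ from the exact inverse probability weighted imputation estimator studied in the paper.

```python
def complete_case_mean(y, observed):
    """Mean of the responses that were actually observed."""
    values = [yi for yi, seen in zip(y, observed) if seen]
    return sum(values) / len(values)


def regression_imputation_mean(x, y, observed):
    """Fit y = a + b*x on observed cases, impute the rest, then average."""
    xs = [xi for xi, seen in zip(x, observed) if seen]
    ys = [yi for yi, seen in zip(y, observed) if seen]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Keep observed responses; replace missing ones with fitted values.
    completed = [
        yi if seen else intercept + slope * xi
        for xi, yi, seen in zip(x, y, observed)
    ]
    return sum(completed) / len(completed)


def ipw_mean(y, observed, propensity):
    """Normalized inverse-probability-weighted mean of the observed responses.

    Assumption: `propensity` holds each case's probability of being observed.
    """
    num = sum(yi / pi for yi, pi, seen in zip(y, propensity, observed) if seen)
    den = sum(1.0 / pi for pi, seen in zip(propensity, observed) if seen)
    return num / den
```

When missingness depends on `x`, the complete-case mean is generally biased, while the other two estimators use the covariate (through the regression fit or through the weights) to correct for it.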

9.
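A comparison of several missing-value imputation methods   Cited by: 1 (self-citations: 0, citations by others: 1)
Missing data arise frequently in data mining and machine learning. This paper analyzes, from both theoretical and experimental perspectives, the strengths and weaknesses of several commonly used methods for handling missing data.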

10.
Missing data is endemic in much educational research. However, practices such as step-wise regression common in the educational research literature have been shown to be dangerous when significant data are missing, and multiple imputation (MI) is generally recommended by statisticians. In this paper, we provide a review of these advances and their implications for educational research. We illustrate the issues with an educational, longitudinal survey in which missing data was significant, but for which we were able to collect much of these missing data through subsequent data collection. We thus compare methods, that is, step-wise regression (basically ignoring the missing data) and MI models, with the model from the actual enhanced sample. The value of MI is discussed and the risks involved in ignoring missing data are considered. Implications for research practice are discussed.  相似文献   

11.
When missingness is suspected to be not at random (MNAR) in longitudinal studies, researchers sometimes compare the fit of a target model that assumes missingness at random (here termed a MAR model) and a model that accommodates a hypothesized MNAR missingness mechanism (here termed a MNAR model). It is well known that such comparisons are only interpretable conditional on the validity of the chosen MNAR model’s assumptions about the missingness mechanism. For that reason, researchers often perform a sensitivity analysis comparing the MAR model to not one, but several, plausible alternative MNAR models. In the social sciences, it is not widely known that such model comparisons can be particularly sensitive to case influence, such that conclusions drawn could depend on a single case. This article describes two convenient diagnostics suited for detecting case influence on MAR–MNAR model comparisons. Both diagnostics require much less computational burden than global influence diagnostics that have been used in other disciplines for MNAR sensitivity analyses. We illustrate the interpretation and implementation of these diagnostics with simulated and empirical latent growth modeling examples. It is hoped that this article increases awareness of the potential for case influence on MAR–MNAR model comparisons and how it could be detected in longitudinal social science applications.  相似文献   

12.
As useful multivariate techniques, structural equation models have attracted significant attention from various fields. Most existing statistical methods and software for analyzing structural equation models have been developed based on the assumption that the response variables are normally distributed. Several recently developed methods can partially address violations of this assumption, but still encounter difficulties in analyzing highly nonnormal data. Moreover, the presence of missing data is a practical issue in substantive research. Simply ignoring missing data or improperly treating nonignorable missingness as ignorable could seriously distort statistical inference results. The main objective of this article is to develop a Bayesian approach for analyzing transformation structural equation models with highly nonnormal and missing data. Different types of missingness are discussed and selected via the deviance information criterion. The empirical performance of our method is examined via simulation studies. Application to a study concerning people’s job satisfaction, home life, and work attitude is presented.

13.
Although structural equation modeling software packages use maximum likelihood estimation by default, there are situations where one might prefer to use multiple imputation to handle missing data rather than maximum likelihood estimation (e.g., when incorporating auxiliary variables). The selection of variables is one of the nuances associated with implementing multiple imputation, because the imputer must take special care to preserve any associations or special features of the data that will be modeled in the subsequent analysis. For example, this article deals with multiple group models that are commonly used to examine moderation effects in psychology and the behavioral sciences. Special care must be exercised when using multiple imputation with multiple group models, as failing to preserve the interactive effects during the imputation phase can produce biased parameter estimates in the subsequent analysis phase, even when the data are missing completely at random or missing at random. This study investigates two imputation strategies that have been proposed in the literature, product term imputation and separate group imputation. A series of simulation studies shows that separate group imputation adequately preserves the multiple group data structure and produces accurate parameter estimates.  相似文献   

14.
A multiple testing procedure for examining implications of the missing completely at random (MCAR) mechanism in incomplete data sets is discussed. The approach uses the false discovery rate concept and is concerned with testing group differences on a set of variables. The method can be used for ascertaining violations of MCAR and disproving this mechanism in empirical behavioral and social research. The procedure can also be employed when locating violations of MCAR in observed measures is of interest. The outlined approach is illustrated with data from a cognitive intervention study.  相似文献   
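The false discovery rate step that item 14 relies on can be sketched with the Benjamini-Hochberg step-up rule. This is an illustration under assumptions: the abstract does not say which FDR procedure is used, the function name `benjamini_hochberg` is hypothetical, and the p-values are taken to come from tests of group differences (for example, cases with versus without missing values) on each variable.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: find the largest rank k such that the k-th smallest
    p-value is <= (k / m) * q, then reject the k smallest p-values.
    """
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank  # keep the largest rank that passes
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected
```

Any rejected test points to a variable on which the groups differ, which is evidence against MCAR; the variables flagged in this way also indicate where the violation is located.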

15.
Many large-scale educational surveys have moved from linear form design to multistage testing (MST) design. One advantage of MST is that it can provide more accurate latent trait (θ) estimates using fewer items than required by linear tests. However, MST generates incomplete response data by design; hence, questions remain as to how to calibrate items using the incomplete data from MST design. Further complication arises when there are multiple correlated subscales per test, and when items from different subscales need to be calibrated according to their respective score reporting metric. The current calibration-per-subscale method produced biased item parameters, and there is no available method for resolving the challenge. Drawing on missing data principles, we showed that when all items are calibrated together, Rubin's ignorability assumption is satisfied, so that the traditional single-group calibration is sufficient. When calibrating items per subscale, we proposed a simple modification to the current calibration-per-subscale method that helps reinstate the missing-at-random assumption and therefore corrects for the estimation bias that would otherwise be present. Three mainstream calibration methods are discussed in the context of MST: marginal maximum likelihood estimation, the expectation maximization method, and fixed parameter calibration. An extensive simulation study is conducted and a real data example from NAEP is analyzed to provide convincing empirical evidence.

16.
Small samples are common in growth models due to financial and logistical difficulties of following people longitudinally. For similar reasons, longitudinal studies often contain missing data. Though full information maximum likelihood (FIML) is popular to accommodate missing data, the few studies in this area have found that FIML tends to perform poorly with small-sample growth models. This report demonstrates that the fault lies not with how FIML accommodates missingness but rather with maximum likelihood estimation itself. We discuss how the less popular restricted likelihood form of FIML, along with small-sample-appropriate methods, yields trustworthy estimates for growth models with small samples and missing data. That is, previously reported small sample issues with FIML are attributable to the finite sample bias of maximum likelihood estimation, not to direct likelihood. Estimation issues pertinent to joint multiple imputation and predictive mean matching are also included and discussed.

17.
In this article, grade point average (GPA) is considered a missing data technique for unavailable grades in school grade records. In Study 1, theoretical and empirical differences between GPA and seven alternative missing grade techniques were considered. These seven techniques are subject mean substitution, corrected subject mean, subject correlation substitution, regression imputation, expectation maximization algorithm imputation, and two multiple imputation methods: stochastic regression imputation and data augmentation. The missing grade techniques differ greatly. Data augmentation and stochastic regression imputation appear to be superior as missing grade techniques. In Study 2, the completed grade records (observed and imputed values) were used in two prediction analyses of academic achievement. One analysis was based on unweighted grades, the other on weighted grades. In both analyses, alternative missing grade methods produced better and more consistent predictions. It is concluded that some alternative missing grade methods are superior to GPA.

18.
This is the first study to test whether the stages of change of the transtheoretical model are qualitatively different through exploring discontinuity patterns in theory of planned behavior (TPB) variables using latent multigroup structural equation modeling (MSEM) with AMOS. Discontinuity patterns in terms of latent means and prediction patterns for the different stage groups were examined. Adults (n = 3,462) were assessed on their physical activity stages of change and TPB variables. The TPB was separately examined within the five stage groups. The TPB measurement model fit was acceptable. Latent mean analyses with post-hoc contrast and MSEM indicated discontinuity patterns. Results underscore the qualitative differences between the stages that may guide further research and the design of interventions integrating the approaches.  相似文献   

19.
When data for multiple outcomes are collected in a multilevel design, researchers can select a univariate or multivariate analysis to examine group-mean differences. When correlated outcomes are incomplete, a multivariate multilevel model (MVMM) may provide greater power than univariate multilevel models (MLMs). For a two-group multilevel design with two correlated outcomes, a simulation study was conducted to compare the performance of MVMM to MLMs. The results showed that MVMM and MLM performed similarly when data were complete or missing completely at random. However, when outcome data were missing at random, MVMM continued to provide unbiased estimates, whereas MLM produced grossly biased estimates and severely inflated Type I error rates. As such, this study provides further support for using MVMM rather than univariate analyses, particularly when outcome data are incomplete.  相似文献   

20.
Technical difficulties occasionally lead to missing item scores and hence to incomplete data on computerized tests. It is not straightforward to report scores to the examinees whose data are incomplete due to technical difficulties. Such reporting essentially involves imputation of missing scores. In this paper, a simulation study based on data from three educational tests is used to compare the performances of six approaches for imputation of missing scores. One of the approaches, based on data mining, is the first application of its kind to the problem of imputation of missing data. The approach based on data mining and a multiple imputation approach based on chained equations led to the most accurate imputation of missing scores, and hence to the most accurate score reporting. A simple approach based on linear regression performed the next best overall. Several recommendations are made regarding the reporting of scores to examinees with incomplete data.
