Similar articles
 20 similar articles found (search time: 31 ms)
1.
As with any psychometric model, the validity of inferences from cognitive diagnosis models (CDMs) determines the extent to which these models can be useful. For inferences from CDMs to be valid, it is crucial that the fit of the model to the data is ascertained. Based on a simulation study, this study investigated the sensitivity of various fit statistics for absolute or relative fit under different CDM settings. The investigation covered various types of model–data misfit that can occur with the misspecifications of the Q‐matrix, the CDM, or both. Six fit statistics were considered: –2 log likelihood (–2LL), Akaike's information criterion (AIC), Bayesian information criterion (BIC), and residuals based on the proportion correct of individual items (p), the correlations (r), and the log‐odds ratio of item pairs (l). An empirical example involving real data was used to illustrate how the different fit statistics can be employed in conjunction with each other to identify different types of misspecifications. With these statistics and the saturated model serving as the basis, relative and absolute fit evaluation can be integrated to detect misspecification efficiently.

2.
The accuracy of structural model parameter estimates in latent variable mixture modeling was explored with a 3 (sample size) × 3 (exogenous latent mean difference) × 3 (endogenous latent mean difference) × 3 (correlation between factors) × 3 (mixture proportions) factorial design. In addition, the efficacy of several likelihood-based statistics (Akaike's Information Criterion [AIC], Bayesian Information Criterion [BIC], the sample-size adjusted BIC [ssBIC], the consistent AIC [CAIC], the Vuong-Lo-Mendell-Rubin adjusted likelihood ratio test [aVLMR]), classification-based statistics (CLC [classification likelihood information criterion], ICL-BIC [integrated classification likelihood], normalized entropy criterion [NEC], entropy), and distributional statistics (multivariate skew and kurtosis test) were examined to determine which statistics best recover the correct number of components. Results indicate that the structural parameters were recovered, but the model fit statistics were not exceedingly accurate. The ssBIC statistic was the most accurate statistic, and the CLC, ICL-BIC, and aVLMR showed limited utility. However, none of these statistics were accurate for small samples (n = 500).
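
The likelihood-based criteria named in this abstract can be sketched from a fitted model's log-likelihood. The block below is a minimal illustration, not the authors' procedure: it assumes the commonly cited penalty forms for AIC, BIC, CAIC, and ssBIC, and the log-likelihoods and parameter counts in `fits` are hypothetical values made up for the example.

```python
import math


def information_criteria(log_likelihood: float, n_params: int, n_obs: int) -> dict:
    """Return likelihood-based criteria; smaller values indicate the preferred model.

    Assumed (commonly cited) definitions:
      AIC   = -2LL + 2k
      BIC   = -2LL + k ln(n)
      CAIC  = -2LL + k (ln(n) + 1)
      ssBIC = -2LL + k ln((n + 2) / 24)
    """
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        "CAIC": deviance + n_params * (math.log(n_obs) + 1),
        "ssBIC": deviance + n_params * math.log((n_obs + 2) / 24),
    }


# Hypothetical fits: number of components -> (log-likelihood, free parameters).
fits = {1: (-5230.4, 10), 2: (-5101.7, 17), 3: (-5095.2, 24)}
n_obs = 500

table = {k: information_criteria(ll, p, n_obs) for k, (ll, p) in fits.items()}

# For each criterion, pick the number of components with the smallest value.
best = {
    name: min(table, key=lambda k: table[k][name])
    for name in ("AIC", "BIC", "CAIC", "ssBIC")
}
print(best)
```

Because the criteria differ only in how heavily they penalize extra parameters, they can disagree on the number of components when the gain in log-likelihood is modest relative to the sample size.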

3.
This research focuses on the problem of model selection between the latent change score (LCS) model and the autoregressive cross-lagged (ARCL) model when the goal is to infer the longitudinal relationship between variables. We conducted a large-scale simulation study to (a) investigate the conditions under which these models return statistically (and substantively) different results concerning the presence of bivariate longitudinal relationships, and (b) ascertain the relative performance of an array of model selection procedures when such different results arise. The simulation results show that the primary sources of differences in parameter estimates across models are model parameters related to the slope factor scores in the LCS model (specifically, the correlation between the intercept factor and the slope factor scores) as well as the size of the data (specifically, the number of time points and sample size). Among several model selection procedures, correct selection rates were higher when using model fit indexes (i.e., comparative fit index, root mean square error of approximation) than when using a likelihood ratio test or any of several information criteria (i.e., Akaike’s information criterion, Bayesian information criterion, consistent AIC, and sample-size-adjusted BIC).

4.
Little research has examined factors influencing statistical power to detect the correct number of latent classes using latent profile analysis (LPA). This simulation study examined power related to interclass distance between latent classes given true number of classes, sample size, and number of indicators. Seven model selection methods were evaluated. None had adequate power to select the correct number of classes with a small (Cohen's d = .2) or medium (d = .5) degree of separation. With a very large degree of separation (d = 1.5), the Lo–Mendell–Rubin test (LMR), adjusted LMR, bootstrap likelihood ratio test, Bayesian Information Criterion (BIC), and sample-size-adjusted BIC were good at selecting the correct number of classes. However, with a large degree of separation (d = .8), power depended on number of indicators and sample size. Akaike's Information Criterion and entropy poorly selected the correct number of classes, regardless of degree of separation, number of indicators, or sample size.
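
The abstract does not define the entropy statistic it evaluates. As a rough sketch, the block below assumes the normalized (relative) entropy commonly reported by mixture-modeling software, computed from each case's posterior class probabilities; the function name and the input layout are illustrative choices, not taken from the study.

```python
import math


def relative_entropy(posteriors: list[list[float]]) -> float:
    """Normalized entropy in [0, 1]; values near 1 indicate clear class separation.

    Assumed definition: 1 - sum_i sum_k (-p_ik ln p_ik) / (n ln K),
    where p_ik is case i's posterior probability of belonging to class k.
    """
    n_cases = len(posteriors)
    n_classes = len(posteriors[0])
    if n_classes < 2:
        raise ValueError("relative entropy needs at least two classes")
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0:  # treat 0 * ln(0) as 0
                total -= p * math.log(p)
    return 1.0 - total / (n_cases * math.log(n_classes))


# Perfectly separated cases versus completely ambiguous ones.
print(relative_entropy([[1.0, 0.0], [0.0, 1.0]]))
print(relative_entropy([[0.5, 0.5], [0.5, 0.5]]))
```

Under this definition the statistic summarizes classification certainty rather than model fit, which is one plausible reason it can perform poorly as a tool for choosing the number of classes.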

5.
The purpose of the current study is to examine the performance of four information criteria (Akaike's information criterion [AIC], corrected AIC [AICC], Bayesian information criterion [BIC], sample-size adjusted BIC [SABIC]) for detecting the correct number of latent classes in the mixture Rasch model through simulations. The simulation study manipulated various class-distinction features (percentages of class-variant items, magnitudes, and patterns of item difficulty differences) and mixing proportions, assuming that a mixture Rasch model with two latent classes was the true model. Unlike previous studies that showed BIC's superiority to other indices, our findings from this study suggested that the four information criteria had differential performance depending on the percentage of class-variant items and the magnitude and pattern of item difficulty differences under a two-class structure. Furthermore, the present study revealed that AICC and SABIC generally performed as well as or better than their counterparts, AIC and BIC, respectively, for the class-class structure with a sample of 3,000.

6.
In psychological research, available data are often insufficient to estimate item factor analysis (IFA) models using traditional estimation methods, such as maximum likelihood (ML) or limited information estimators. Bayesian estimation with common-sense, moderately informative priors can greatly improve efficiency of parameter estimates and stabilize estimation. There are a variety of methods available to evaluate model fit in a Bayesian framework; however, past work investigating Bayesian model fit assessment for IFA models has assumed flat priors, which have no advantage over ML in limited data settings. In this paper, we evaluated the impact of moderately informative priors on ability to detect model misfit for several candidate indices: posterior predictive checks based on the observed score distribution, leave-one-out cross-validation, and the widely applicable information criterion (WAIC). We found that although Bayesian estimation with moderately informative priors is an excellent aid for estimating challenging IFA models, methods for testing model fit in these circumstances are inadequate.

7.
This study investigated the performance of fit indexes in selecting a covariance structure for longitudinal data. Data were simulated to follow a compound symmetry, first-order autoregressive, first-order moving average, or random-coefficients covariance structure. We examined the ability of the likelihood ratio test (LRT), root mean square error of approximation (RMSEA), comparative fit index (CFI), and Tucker–Lewis Index (TLI) to reject misspecified models with varying degrees of misspecification. With a sample size of 20, RMSEA, CFI, and TLI are high in both Type I and Type II error rates, whereas LRT has a high Type II error rate. With a sample size of 100, these indexes generally have satisfactory performance, but CFI and TLI are affected by a confounding effect of their baseline model. Akaike's Information Criterion (AIC) and Bayesian Information Criterion (BIC) have high success rates in identifying the true model when sample size is 100. A comparison with the mixed model approach indicates that separately modeling the means and covariance structures in structural equation modeling dramatically improves the success rate of AIC and BIC.

8.
Model comparison is one useful approach in applications of structural equation modeling. Akaike’s information criterion (AIC) and the Bayesian information criterion (BIC) are commonly used for selecting an optimal model from the alternatives. We conducted a comprehensive evaluation of various model selection criteria, including AIC, BIC, and their extensions, in selecting an optimal path model under a wide range of conditions over different compositions of candidate set, distinct values of misspecified parameters, and diverse sample sizes. The chance of selecting an optimal model rose as the values of misspecified parameters and sample sizes increased. The relative performance of AIC and BIC type criteria depended on the magnitudes of the misspecified parameters. The BIC family in general outperformed its AIC counterparts, except under small values of omitted parameters and small sample sizes, where AIC performed better. Scaled unit information prior BIC (SPBIC) and Haughton's BIC (HBIC) demonstrated the highest accuracy ratios across most of the conditions investigated in this simulation.

9.
This simulation study examines the efficacy of multilevel factor mixture modeling (ML FMM) for measurement invariance testing across unobserved groups when the groups are at the between level of multilevel data. To this end, latent classes are generated with class-specific item parameters (i.e., factor loading and intercept) across the between-level classes. The efficacy of ML FMM is evaluated in terms of class enumeration, class assignment, and the detection of noninvariance. Various classification criteria such as Akaike’s information criterion, Bayesian information criterion, and bootstrap likelihood ratio tests are examined for the correct enumeration of between-level latent classes. For the detection of measurement noninvariance, free and constrained baseline approaches are compared with respect to true positive and false positive rates. This study evidences the adequacy of ML FMM. However, its performance heavily depends on the simulation factors such as the classification criteria, sample size, and the magnitude of noninvariance. Practical guidelines for applied researchers are provided.

10.
Using a complex simulation study we investigated parameter recovery, classification accuracy, and performance of two item‐fit statistics for correct and misspecified diagnostic classification models within a log‐linear modeling framework. The basic manipulated test design factors included the number of respondents (1,000 vs. 10,000), attributes (3 vs. 5), and items (25 vs. 50) as well as different attribute correlations (.50 vs. .80) and marginal attribute difficulties (equal vs. different). We investigated misspecifications of interaction effect parameters under correct Q‐matrix specification and two types of Q‐matrix misspecification. While the misspecification of interaction effects had little impact on classification accuracy, invalid Q‐matrix specifications led to notably decreased classification accuracy. Two proposed item‐fit indexes were more strongly sensitive to overspecification of Q‐matrix entries for items than to underspecification. Information‐based fit indexes AIC and BIC were sensitive to both over‐ and underspecification.

11.
12.
A key challenge facing child protective services (CPS) is identifying children who are at greatest risk of future maltreatment. This analysis examined a cohort of children with a first report to CPS during infancy, a vulnerable population at high risk of future CPS reports. Birth records of all infants born in California in 2006 were linked to CPS records; 23,871 infants remaining in the home following an initial report were followed for 5 years to determine if another maltreatment report occurred. Latent class analysis (LCA) was used to identify subpopulations of infants based on varying risks of re-report. LCA model fit was examined using the Bayesian information criterion, a likelihood ratio test, and entropy. Statistical indicators and interpretability suggested the four-class model best fit the data. A second LCA included infant re-report as a distal outcome to examine the association between class membership and the likelihood of re-report. In Class 1 and Class 2 (lowest risk), the probability of a re-report was 44%; in contrast, the probability in Class 4 (highest risk) was 78%. Two birth characteristics clustered in the medium- and highest-risk classes: lack of established paternity and delayed or absent prenatal care. Two risk factors from the initial report of maltreatment emerged as predictors of re-report in the highest-risk class: an initial allegation of neglect and a family history of CPS involvement involving older siblings. Findings suggest that statistical techniques can be used to identify families with a heightened risk of experiencing later CPS contact.

13.
Multilevel structural equation models are most often estimated from a frequentist framework via maximum likelihood. However, as shown in this article, frequentist results are not always accurate. Alternatively, one can apply a Bayesian approach using Markov chain Monte Carlo estimation methods. This simulation study compared estimation quality using Bayesian and frequentist approaches in the context of a multilevel latent covariate model. Continuous and dichotomous variables were examined because it is not yet known how different types of outcomes—most notably categorical—affect parameter recovery in this modeling context. Within the Bayesian estimation framework, the impact of diffuse, weakly informative, and informative prior distributions was compared. Findings indicated that Bayesian estimation may be used to overcome convergence problems and improve parameter estimate bias. Results highlight the differences in estimation quality between dichotomous and continuous variable models and the importance of prior distribution choice for cluster-level random effects.

14.
This study compared 5 scoring methods in terms of their statistical assumptions. They were then used to score the Teacher Observation of Classroom Adaptation Checklist, a measure consisting of 3 subscales and 21 Likert-type items. The 5 methods used were (a) sum/average scores of items, (b) latent factor scores with continuous indicators, (c) latent factor scores with ordered categorical indicators using the mean- and variance-adjusted weighted least squares estimation method, (d) latent factor scores with ordered categorical indicators using the full information maximum likelihood estimation method, and (e) multidimensional graded response model using the Bock-Aitkin expectation-maximization estimation procedure. Measurement invariance between gender groups and between free/reduced-price lunch status groups was evaluated with the second, third, fourth, and fifth methods. Group mean differences based on the 5 methods were calculated and compared.

15.
Despite its importance to structural equation modeling, model evaluation remains underdeveloped in the Bayesian SEM framework. Posterior predictive p-values (PPP) and deviance information criteria (DIC) are now available in popular software for Bayesian model evaluation, but they remain underutilized. This is largely due to the lack of recommendations for their use. To address this problem, PPP and DIC were evaluated in a series of Monte Carlo simulation studies. The results show that both PPP and DIC are influenced by severity of model misspecification, sample size, model size, and choice of prior. The cutoffs PPP < 0.10 and ΔDIC > 7 work best in the conditions and models tested here to maintain low false detection rates and misspecified model selection rates, respectively. The recommendations provided in this study will help researchers evaluate their models in a Bayesian SEM analysis and set the stage for future development and evaluation of Bayesian SEM fit indices.
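
To make the two cutoffs concrete, the following sketch computes DIC from posterior deviance draws and applies the thresholds quoted in the abstract. Several points are assumptions rather than details from the study: DIC is taken in its usual form (mean deviance plus the effective number of parameters), ΔDIC is read as the DIC difference between two competing models, and the function names and the "undecided" outcome are illustrative.

```python
from statistics import mean


def dic(deviance_draws: list[float], deviance_at_posterior_mean: float) -> float:
    """DIC = D_bar + p_D, with p_D = D_bar - D(theta_bar) (assumed standard form)."""
    d_bar = mean(deviance_draws)
    p_d = d_bar - deviance_at_posterior_mean
    return d_bar + p_d


def flag_misfit(ppp: float, cutoff: float = 0.10) -> bool:
    """Flag a model as misfitting when its posterior predictive p-value is below the cutoff."""
    return ppp < cutoff


def prefer_by_dic(dic_a: float, dic_b: float, cutoff: float = 7.0) -> str:
    """Prefer the lower-DIC model only when the difference exceeds the cutoff (assumed reading)."""
    delta = dic_a - dic_b
    if delta > cutoff:
        return "b"
    if -delta > cutoff:
        return "a"
    return "undecided"


# Hypothetical values for illustration only.
print(dic([100.0, 102.0, 104.0], 99.0))
print(flag_misfit(0.05))
print(prefer_by_dic(210.0, 200.0))
```

The abstract notes that both statistics depend on sample size, model size, and the prior, so fixed cutoffs like these are best treated as starting points for the conditions the authors tested.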

16.
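Reliability and validity are key indicators of the quality of a measurement instrument, and research on reliability and validity in educational cognitive diagnostic tests has attracted researchers' attention in recent years. Reliability coefficients for diagnostic tests derive mainly from the attribute reliability coefficient based on coefficient α, the empirical attribute reliability coefficient, the tetrachoric correlation coefficient, simulated test-retest consistency, and classification consistency indices; validity coefficients mainly include the simulated correct classification rate, classification accuracy, and theoretical construct validity. Research on the reliability and validity of educational cognitive diagnostic tests is relatively new: it still has certain shortcomings, lacks comprehensive comparative studies, and, above all, lacks a systematic evaluation framework.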

17.
The assumption of conditional independence between the responses and the response times (RTs) for a given person is common in RT modeling. However, when the speed of a test taker is not constant, this assumption will be violated. In this article we propose a conditional joint model for item responses and RTs, which incorporates a covariance structure to explain the local dependency between speed and accuracy. To obtain information about the population of test takers, the new model was embedded in the hierarchical framework proposed by van der Linden (2007). A fully Bayesian approach using a straightforward Markov chain Monte Carlo (MCMC) sampler was developed to estimate all parameters in the model. The deviance information criterion (DIC) and the Bayes factor (BF) were employed to compare the goodness of fit between the models with two different parameter structures. The Bayesian residual analysis method was also employed to evaluate the fit of the RT model. Based on the simulations, we conclude that (1) the new model noticeably improves the parameter recovery for both the item parameters and the examinees’ latent traits when the assumptions of conditional independence between the item responses and the RTs are relaxed and (2) the proposed MCMC sampler adequately estimates the model parameters. The applicability of our approach is illustrated with an empirical example, and the model fit indices indicated a preference for the new model.

18.
The multiple indicators multiple causes (MIMIC) latent class analysis (LCA) model is an excellent classification method when researchers cannot find a "gold standard" to classify participants. The MIMIC-LCA model includes features of a typical LCA model and also introduces a new relation between the latent class and covariates. In other words, a logistic regression type of analysis between participants' categorical latent status and their background information is added. Detailed statistical setups of the MIMIC-LCA model and algorithmic procedures are derived. The model features, parameter estimations, and model selections for MIMIC-LCA models are also presented. Specifically, the MIMIC-LCA model is estimated by a generalized expectation-maximization algorithm under the maximum likelihood framework. A substantive application of the MIMIC-LCA model in diagnosing alcoholics and, in particular, examining potential risk factors for alcoholism is demonstrated.

19.
The main purpose of this article is to develop a Bayesian approach for a general multigroup nonlinear factor analysis model. Joint Bayesian estimates of the factor scores and the structural parameters subject to some constraints across different groups are obtained simultaneously. A hybrid algorithm that combines the Metropolis-Hastings algorithm and the Gibbs sampler is implemented to produce these joint Bayesian estimates. It is shown that this algorithm is computationally efficient. The Bayes factor approach is introduced for comparing models under various degrees of invariance across groups. The Schwarz criterion (BIC), a simple and useful approximation of the Bayes factor, is calculated on the basis of simulated observations from the Gibbs sampler. Efficiency and flexibility of the proposed Bayesian procedure are illustrated by some simulation results and a real-life example.
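
The link between the Schwarz criterion and the Bayes factor mentioned in this abstract can be written out as follows. This is the standard large-sample approximation, not a formula from the article; the symbols are assumptions introduced here: $B_{12}$ is the Bayes factor for model 1 against model 2, $\hat\theta_m$ the maximum likelihood estimate under model $m$, $d_m$ its number of free parameters, and $n$ the sample size.

```latex
S_{12} \;=\; \ln L_1(\hat\theta_1) - \ln L_2(\hat\theta_2) - \tfrac{1}{2}\,(d_1 - d_2)\,\ln n \;\approx\; \ln B_{12},
\qquad
2 \ln B_{12} \;\approx\; \mathrm{BIC}_2 - \mathrm{BIC}_1,
\quad
\mathrm{BIC}_m = -2 \ln L_m(\hat\theta_m) + d_m \ln n .
```

Under this convention a lower BIC for model 1 corresponds to a Bayes factor favoring model 1, which is why BIC differences can stand in for Bayes factors when comparing models with different degrees of invariance.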

20.
A paucity of research has compared estimation methods within a measurement invariance (MI) framework and determined if research conclusions using normal-theory maximum likelihood (ML) generalize to the robust ML (MLR) and weighted least squares means and variance adjusted (WLSMV) estimators. Using ordered categorical data, this simulation study aimed to address these queries by investigating 342 conditions. When testing for metric and scalar invariance, Δχ² results revealed that Type I error rates varied across estimators (ML, MLR, and WLSMV) with symmetric and asymmetric data. The Δχ² power varied substantially based on the estimator selected, type of noninvariant indicator, number of noninvariant indicators, and sample size. Although some of the changes in approximate fit indexes (ΔAFI) are relatively sample size independent, researchers who use the ΔAFI with WLSMV should use caution, as these statistics do not perform well with misspecified models. As a supplemental analysis, our results evaluate and suggest cutoff values based on previous research.
