Similar Articles
20 similar articles found (search time: 31 ms)
1.
The purpose of this study was to compare and evaluate three on-line pretest item calibration-scaling methods (the marginal maximum likelihood estimate with one expectation maximization [EM] cycle [OEM] method, the marginal maximum likelihood estimate with multiple EM cycles [MEM] method, and Stocking's Method B) in terms of item parameter recovery when the item responses to the pretest items in the pool are sparse. Simulations of computerized adaptive tests were used to evaluate the results yielded by the three methods. The MEM method produced the smallest average total error in parameter estimation, and the OEM method yielded the largest total error.
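To make the OEM/MEM distinction concrete, the sketch below runs one EM cycle versus many EM cycles of marginal maximum likelihood calibration for a single sparse pretest item. It is a minimal illustration under assumed details (a Rasch item, a discretized standard normal prior), not the study's implementation; all names and settings are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated sparse pretest data: 1000 examinees, one pretest item,
# each examinee sees the item with probability 0.2 (CAT-style sparseness).
n = 1000
theta = rng.normal(size=n)
b_true = 0.5
answered = rng.random(n) < 0.2
p = 1 / (1 + np.exp(-(theta - b_true)))                 # Rasch model
y = np.where(answered, (rng.random(n) < p).astype(float), np.nan)

# Discretized N(0, 1) ability prior for the marginal likelihood.
nodes = np.linspace(-4, 4, 41)
weights = np.exp(-0.5 * nodes**2)
weights /= weights.sum()

def em_cycles(y, b_init=0.0, n_cycles=1):
    """One cycle approximates the OEM idea; many cycles the MEM idea."""
    b = b_init
    resp = y[~np.isnan(y)]
    for _ in range(n_cycles):
        # E-step: posterior weight of each quadrature node per examinee.
        pq = 1 / (1 + np.exp(-(nodes[None, :] - b)))    # P(correct | node)
        like = np.where(resp[:, None] == 1, pq, 1 - pq)
        post = like * weights[None, :]
        post /= post.sum(axis=1, keepdims=True)
        n_q = post.sum(axis=0)                # expected examinees per node
        r_q = post[resp == 1].sum(axis=0)     # expected correct per node
        # M-step: one Newton step for the item difficulty b.
        pq1 = 1 / (1 + np.exp(-(nodes - b)))
        score = (r_q - n_q * pq1).sum()       # observed minus expected correct
        info = (n_q * pq1 * (1 - pq1)).sum()
        b -= score / info                     # more correct than expected => easier item
    return b

print("OEM-like (1 cycle):  ", em_cycles(y, n_cycles=1))
print("MEM-like (20 cycles):", em_cycles(y, n_cycles=20))
```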

2.
Detection of differential item functioning (DIF) on items intentionally constructed to favor one group over another was investigated on item parameter estimates obtained from two item response theory-based computer programs, LOGIST and BILOG. Signed- and unsigned-area measures based on joint maximum likelihood estimation, marginal maximum likelihood estimation, and two marginal maximum a posteriori estimation procedures were compared with each other to determine whether detection of DIF could be improved using prior distributions. Results indicated that item parameter estimates obtained using either prior condition were less deviant than when priors were not used. Differences in detection of DIF appeared to be related to item parameter estimation condition and to some extent to sample size.

3.
《教育实用测度》2013,26(2):199-210
When an item response theory (IRT) model is estimated by marginal maximum likelihood, person parameters are treated as random and assigned a prior distribution in order to estimate the structural parameters of the model. For example, both PARSCALE (Muraki & Bock, 1999) and BILOG 3 (Mislevy & Bock, 1990) use a standard normal distribution as the default person prior. When the fixed-item linking method is used with an IRT program that fixes the person prior, the misspecified prior biases estimated ability growth downward or upward, depending on the direction of the growth. This study demonstrated by simulation how much the fixed prior distribution biases estimates of ability growth in fixed-item linking for mixed-format test data. In addition, the study demonstrated how to recover the growth through an iterative prior-update calibration procedure, showing that fixed-item linking remains a viable linking method for IRT calibration with a fixed person prior.

4.
5.
Robust maximum likelihood (ML) and categorical diagonally weighted least squares (cat-DWLS) estimation have both been proposed for use with categorized and nonnormally distributed data. This study compares results from the 2 methods in terms of parameter estimate and standard error bias, power, and Type I error control, with unadjusted ML and WLS estimation included for comparison. Conditions manipulated include model misspecification, level of asymmetry, level of categorization, sample size, and type and size of the model. Results indicate that cat-DWLS yields the least parameter estimate and standard error bias under the majority of conditions studied, and its estimates and standard errors were generally the least affected by model misspecification. Robust ML also performed well, yielding relatively unbiased parameter estimates and standard errors. However, both cat-DWLS and robust ML resulted in low power under conditions of high data asymmetry, small sample sizes, and mild model misspecification. Under more favorable conditions, power for these estimators was adequate.

6.
Research in covariance structure analysis suggests that nonnormal data will invalidate chi-square tests and produce erroneous standard errors. However, much remains unknown about the extent to which, and the conditions under which, highly skewed and kurtotic data can affect parameter estimates, standard errors, and fit indices. Using actual kurtotic and skewed data and varying sample sizes and estimation methods, we found that (a) normal theory maximum likelihood (ML) and generalized least squares estimators were fairly consistent and almost identical, (b) standard errors tended to underestimate the true variation of the estimators, but the problem was not very serious for large samples (n = 1,000) and conservative (99%) confidence intervals, and (c) the adjusted chi-square tests seemed to yield acceptable results with appropriate sample sizes.

7.
The choice of constraints used to identify a simple factor model can affect the shape of the likelihood. Specifically, under some nonzero constraints, standard errors may be inestimable even at the maximum likelihood estimate (MLE). For a broader class of nonzero constraints, symmetric normal approximations to the modal region may not be appropriate. A simple graphical technique to gain insight into the relative location of equivalent modes is introduced. Implications for estimation and inference in factor models, and latent trait models more generally, are discussed.

8.
This paper puts forward a Poisson-generalized Pareto (Poisson-GP) distribution. This new form of compound extreme value distribution extends the existing applications of compound extreme value distributions and can be used to predict financial risk, large insurance settlements, strong earthquakes, and the like. Probability-weighted moment estimation (PWME) is used to estimate the parameters of the distribution function and is compared with maximum likelihood estimation (MLE) and compound moment estimation (CME); the specific formulas are presented. Monte Carlo simulation with sample sizes of 10, 20, 50, 100, and 1,000 shows that PWME is an efficient and stable method: the mean square errors (MSE) of the PWME estimators are much smaller than those of the CME estimators, and there is no significant difference between PWME and MLE. Finally, an example based on foreign exchange rates is given. For Dollar/Pound exchange rates from 1990-01-02 to 2006-12-29, the paper formulates the distribution function of the largest loss among investment losses exceeding a given threshold using the Poisson-GP compound extreme value distribution and obtains predictive values at different confidence levels.
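For readers who want to see what PWME involves, the generalized Pareto part has a closed-form probability-weighted moment estimator (Hosking & Wallis, 1987). The sketch below illustrates that standard estimator under the parameterization F(x) = 1 − (1 − kx/σ)^(1/k); it is not the paper's code.

```python
import numpy as np

def gpd_pwm(x):
    """Probability-weighted moment (PWM) estimates for the generalized
    Pareto distribution, F(x) = 1 - (1 - k*x/sigma)**(1/k) (Hosking &
    Wallis, 1987). Returns (shape k, scale sigma)."""
    x = np.sort(x)
    n = len(x)
    # a_s = E[X * (1 - F(X))**s]; unbiased sample versions for s = 0, 1.
    a0 = x.mean()
    a1 = np.sum(x * (n - np.arange(1, n + 1)) / (n - 1)) / n
    k = a0 / (a0 - 2 * a1) - 2
    sigma = 2 * a0 * a1 / (a0 - 2 * a1)
    return k, sigma

# Quick check on simulated exceedances; k = 0 is the exponential case
# (scipy's genpareto shape c equals -k in this parameterization).
rng = np.random.default_rng(1)
sample = rng.exponential(scale=2.0, size=1000)
print(gpd_pwm(sample))   # k near 0, sigma near 2
```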

9.
Marginal likelihood-based methods are commonly used in factor analysis for ordinal data. To obtain the maximum marginal likelihood estimator, the full information maximum likelihood (FIML) estimator relies on (adaptive) Gauss–Hermite quadrature or stochastic approximation. However, the computational burden increases rapidly with the number of factors, which renders FIML impractical for large factor models. Another limitation of the marginal likelihood-based approach is that it does not allow inference on the factors themselves. In this study, we propose a hierarchical likelihood approach using the Laplace approximation that remains computationally efficient in large models. We also propose confidence intervals for the factors that maintain their level of confidence as the sample size increases. A simulation study shows that the proposed approach generally works well.

10.
LISREL 8 invokes a ridge option when maximum likelihood or generalized least squares estimation is used with a nonpositive definite covariance or correlation matrix. The implications of the ridge option for model fit statistics, parameter estimates, and standard errors are explored through 2 examples. The results indicate that maximum likelihood estimates are quite stable under the ridge option, whereas fit statistics and standard errors vary considerably and therefore cannot be trusted. Given these findings, applying the ridge method to structural equation models is not recommended.

11.
The analytically derived asymptotic standard errors (SEs) of maximum likelihood (ML) item estimates can be approximated by a mathematical function without examinees' responses to test items, whereas empirically determined SEs of marginal maximum likelihood estimation (MMLE)/Bayesian item estimates are obtained by repeatedly estimating the same set of items from simulated (or resampled) test data. The empirical method yields stable and accurate SE estimates as the number of replications increases, but requires cumbersome and time-consuming calculations. This study therefore examined the adequacy of the analytic method for predicting the SEs of item parameter estimates by comparing results produced by the two approaches. The SEs yielded by the two approaches were in most cases very similar, especially for the generalized partial credit model. This finding encourages test practitioners and researchers to apply the analytic asymptotic SEs of item estimates to item-linking studies, as well as to quantifying the SEs of equated scores under the item response theory (IRT) true-score method. Three-dimensional plots of the analytic SEs as a bivariate function of item difficulty and item discrimination are also provided for a better understanding of several frequently used IRT models.

12.
This article illustrates five different methods for estimating Angoff cut scores using item response theory (IRT) models: maximum likelihood (ML), expected a posteriori (EAP), modal a posteriori (MAP), and weighted maximum likelihood (WML) estimators, as well as the most commonly used approach based on translating ratings through the test characteristic curve (the IRT true-score [TS] estimator). The five methods are compared using a simulation study and a real data example. Results indicate that different methods can lead to different estimated cut scores, and that impact data can differ in important ways when the IRT TS estimator is used rather than the other methods. Because the methods have distinct features and properties, the choice of method for estimating ability and cut scores deserves careful thought. An important consideration in the application of the Bayesian methods is the choice of prior and the bias that priors may introduce into the estimates.
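As a concrete point of contrast among the estimators, here is a minimal sketch of the EAP ability estimate under a 2PL model, computed by brute-force quadrature against a standard normal prior; the item parameters and response pattern are invented for illustration.

```python
import numpy as np

def eap_theta(responses, a, b, n_quad=81):
    """Expected a posteriori (EAP) ability estimate under the 2PL model
    with a standard normal prior, via simple quadrature. The prior's
    normalizing constant cancels in the posterior mean, so it is omitted."""
    nodes = np.linspace(-4, 4, n_quad)
    prior = np.exp(-0.5 * nodes**2)
    p = 1 / (1 + np.exp(-a[:, None] * (nodes[None, :] - b[:, None])))
    like = np.prod(np.where(np.asarray(responses)[:, None] == 1, p, 1 - p),
                   axis=0)
    post = like * prior
    return np.sum(nodes * post) / np.sum(post)

# Hypothetical 5-item panel with illustrative 2PL parameters.
a = np.array([1.2, 0.8, 1.5, 1.0, 0.9])    # discriminations (made up)
b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])  # difficulties (made up)
print(eap_theta([1, 1, 1, 0, 0], a, b))
```

The TS approach would instead map a rater's expected total score through the test characteristic curve, the sum of P_j(θ) over items.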

13.
The usefulness of item response theory (IRT) models depends, in large part, on the accuracy of item and person parameter estimates. For the standard 3-parameter logistic model, for example, these parameters include the item parameters of difficulty, discrimination, and pseudo-chance, as well as the person ability parameter. Several factors affect traditional marginal maximum likelihood (ML) estimation of IRT model parameters, including sample size: smaller samples are generally associated with lower parameter estimation accuracy and inflated standard errors for the estimates. This deleterious impact of small samples makes estimation difficult precisely where IRT might prove particularly useful, such as with low-incidence populations, especially for more complex models. Recently, a pairwise estimation method for Rasch model parameters has been suggested for use with missing data, and it may also hold promise for parameter estimation with small samples. This simulation study compared the item difficulty estimation accuracy of ML with that of the pairwise approach to ascertain the benefits of the latter method. The results support the use of the pairwise method with small samples, particularly for obtaining item location estimates.
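One classic pairwise scheme (Choppin's) rests on the fact that, under the Rasch model and conditional on exactly one of two items being answered correctly, log(n_ij/n_ji) estimates b_j − b_i, where n_ij counts examinees who got item i right and item j wrong; missing responses simply drop out of the counts. The sketch below illustrates that scheme, which may differ in detail from the method studied in the article; names are ours.

```python
import numpy as np

def pairwise_rasch(X):
    """Choppin-style pairwise estimate of Rasch item difficulties.
    X: (persons, items) array of 0/1 with np.nan for missing responses.
    Returns difficulties centered at mean zero."""
    n_items = X.shape[1]
    B = np.zeros((n_items, n_items))
    for i in range(n_items):
        for j in range(n_items):
            if i == j:
                continue
            both = ~np.isnan(X[:, i]) & ~np.isnan(X[:, j])
            n_ij = np.sum((X[both, i] == 1) & (X[both, j] == 0))
            n_ji = np.sum((X[both, i] == 0) & (X[both, j] == 1))
            # log(n_ij / n_ji) estimates b_j - b_i under the Rasch model.
            B[i, j] = np.log((n_ij + 0.5) / (n_ji + 0.5))  # +0.5 smooths zeros
    b = B.mean(axis=0)          # average pairwise differences => b_j up to a constant
    return b - b.mean()

# Tiny demo with simulated data and 30% missingness.
rng = np.random.default_rng(2)
theta = rng.normal(size=300)
b_true = np.array([-1.0, 0.0, 1.0])
P = 1 / (1 + np.exp(-(theta[:, None] - b_true[None, :])))
X = (rng.random(P.shape) < P).astype(float)
X[rng.random(X.shape) < 0.3] = np.nan
print(pairwise_rasch(X))   # roughly -1, 0, 1
```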

14.
Ill-conditioning of the covariance and weight matrices used in structural equation modeling (SEM) is a possible source of inadequate performance of SEM statistics in nonasymptotic samples. A maximum a posteriori (MAP) covariance matrix is proposed for weight matrix regularization in normal theory generalized least squares (GLS) estimation. Maximum likelihood (ML), GLS, and regularized GLS test statistics (RGLS and rGLS) are studied by simulation in a 15-variable, 3-factor model with 15 levels of sample size varying from 60 to 100,000. A key result is that, in terms of nominal rejection rates, RGLS outperformed ML at all sample sizes below 500 and GLS at most sample sizes below 500; in larger samples their performance was equivalent. The second regularization method (rGLS) performed well asymptotically but poorly in small samples. Regularization in SEM deserves further study.
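The general idea of regularizing a weight matrix can be sketched by shrinking an ill-conditioned sample covariance toward a diagonal target before inverting it. This is only a generic illustration; the MAP covariance matrix proposed in the article is more elaborate, and the shrinkage weight used here is an arbitrary choice.

```python
import numpy as np

def regularized_cov(S, n, tau=None):
    """Shrink an ill-conditioned sample covariance S toward its diagonal
    before it is inverted for a GLS weight matrix. A minimal sketch, not
    the article's MAP estimator."""
    p = S.shape[0]
    if tau is None:
        tau = p / (n + p)          # heavier shrinkage in smaller samples (ad hoc)
    return (1 - tau) * S + tau * np.diag(np.diag(S))

# Near-singular example: 15 variables, 60 cases, two nearly collinear columns.
rng = np.random.default_rng(3)
X = rng.normal(size=(60, 15))
X[:, 0] = X[:, 1] + 1e-6 * rng.normal(size=60)
S = np.cov(X, rowvar=False)
print(np.linalg.cond(S))                       # enormous
print(np.linalg.cond(regularized_cov(S, 60)))  # far smaller
```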

15.
Classical accounts of maximum likelihood (ML) estimation of structural equation models for continuous outcomes involve normality assumptions: standard errors (SEs) are obtained using the expected information matrix, and the goodness of fit of the model is tested using the likelihood ratio (LR) statistic. Satorra and Bentler (1994) introduced SEs and mean or mean-and-variance adjustments to the LR statistic (also involving the expected information matrix) that are robust to nonnormality. In recent years, however, SEs obtained using the observed information matrix and alternative test statistics have become available. Using an extensive simulation study, we investigate which choice of SE and test statistic yields better results. We found that robust SEs computed using the expected information matrix, coupled with a mean- and variance-adjusted LR test statistic (i.e., MLMV), is the optimal choice, even with normally distributed data, as it yielded the best combination of accurate SEs and Type I error rates.

16.
The analysis of interaction among latent variables has received much attention. This article introduces a Bayesian approach to a general structural equation model that accommodates general nonlinear terms of latent variables and covariates. The approach produces a Bayesian estimate with the same optimal statistical properties as a maximum likelihood estimate; other advantages over the traditional approaches are also discussed. More importantly, we demonstrate through examples how to use the freely available software WinBUGS to obtain Bayesian results for estimation and model comparison. Simulation studies assess the empirical performance of the approach for various sample sizes and prior inputs.

17.
This simulation study demonstrates how the choice of estimation method affects indexes of fit and parameter bias for different sample sizes when nested models vary in specification error and the data exhibit different levels of kurtosis. Using a fully crossed design, data were generated for 11 conditions of peakedness, 3 conditions of misspecification, and 5 sample sizes. Three estimation methods (maximum likelihood [ML], generalized least squares [GLS], and weighted least squares [WLS]) were compared in terms of overall fit and the discrepancy between estimated parameter values and the true parameter values used to generate the data. Consistent with earlier findings, the results show that under misspecification ML, compared to GLS, provides more realistic indexes of overall fit and less biased parameter values for paths that overlap with the true model. However, despite recommendations in the literature that WLS be used when data are not normally distributed, WLS was under no condition preferable to the other 2 estimation procedures in terms of parameter bias and fit. Only for large sample sizes (N = 1,000 and 2,000) and mildly misspecified models did WLS provide estimates and fit indexes close to those obtained with ML and GLS; for wrongly specified models, WLS tended to give unreliable estimates and overly optimistic values of fit.

18.
In Bayesian inference, the marginal likelihood function involves high-dimensional, complex integrals, so computing it exactly is often difficult. In practice an approximation method is usually chosen to estimate the marginal likelihood; this paper introduces several such approximate estimation methods for the marginal likelihood function.
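The best known of these approximations is the Laplace approximation, which replaces the integrand with a Gaussian centered at the posterior mode θ*: log m(y) ≈ log p(y|θ*) + log p(θ*) + (d/2)log 2π − (1/2)log|−H|, with H the Hessian of the log posterior at θ*. The sketch below checks it on a one-parameter conjugate normal model where the exact marginal likelihood is known; the model and all names are ours.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import multivariate_normal, norm

# Toy conjugate model so the exact answer is available:
#   y_i ~ N(theta, 1), theta ~ N(0, 1).
rng = np.random.default_rng(4)
y = rng.normal(loc=0.7, size=50)
n = len(y)

def neg_log_post(t):
    """Negative unnormalized log posterior: -[log p(y|t) + log p(t)]."""
    return -(norm.logpdf(y, t, 1).sum() + norm.logpdf(t, 0, 1))

# Laplace: log m(y) ~= log p(y|t*) + log p(t*) + 0.5*log(2*pi) - 0.5*log(-H).
t_star = minimize_scalar(neg_log_post).x
H = -(n + 1)                                  # exact log-posterior curvature here
log_m_laplace = (-neg_log_post(t_star) + 0.5 * np.log(2 * np.pi)
                 - 0.5 * np.log(-H))

# Exact marginal likelihood: y is jointly N(0, I + 11').
log_m_exact = multivariate_normal.logpdf(y, mean=np.zeros(n),
                                         cov=np.eye(n) + np.ones((n, n)))
print(log_m_laplace, log_m_exact)   # agree: the posterior is exactly Gaussian
```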

19.
Lord's Wald test for differential item functioning (DIF) has not been studied extensively in the context of the multidimensional item response theory (MIRT) framework. In this article, Lord's Wald test was implemented using two estimation approaches, marginal maximum likelihood estimation and Bayesian Markov chain Monte Carlo estimation, to detect uniform and nonuniform DIF under MIRT models. The Type I error and power rates for Lord's Wald test were investigated under various simulation conditions, including different DIF types and magnitudes, different means and correlations of two ability parameters, and different sample sizes. Furthermore, English usage data were analyzed to illustrate the use of Lord's Wald test with the two estimation approaches.
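Lord's Wald statistic itself is simple once group-wise item parameter estimates and their covariance matrices are in hand: χ² = (v̂₁ − v̂₂)ᵀ(Σ̂₁ + Σ̂₂)⁻¹(v̂₁ − v̂₂), with degrees of freedom equal to the number of item parameters. A minimal sketch with invented numbers:

```python
import numpy as np
from scipy.stats import chi2

def lords_wald(v1, cov1, v2, cov2):
    """Lord's Wald test for DIF: compares one item's parameter vector
    across two groups using the estimates' covariance matrices."""
    d = np.asarray(v1, float) - np.asarray(v2, float)
    stat = d @ np.linalg.solve(np.asarray(cov1) + np.asarray(cov2), d)
    df = len(d)
    return stat, chi2.sf(stat, df)

# Hypothetical (a, b) estimates for one 2PL item in reference/focal groups.
v_ref, cov_ref = [1.10, 0.20], [[0.010, 0.001], [0.001, 0.020]]
v_foc, cov_foc = [1.05, 0.55], [[0.012, 0.001], [0.001, 0.025]]
print(lords_wald(v_ref, cov_ref, v_foc, cov_foc))   # (statistic, p-value)
```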

20.
Maximum likelihood algorithms for use with missing data are becoming commonplace in microcomputer packages. Specifically, 3 maximum likelihood algorithms are currently available in existing software packages: the multiple-group approach, full information maximum likelihood estimation, and the EM algorithm. Although they belong to the same family of estimators, confusion appears to exist over the differences among the 3 algorithms. This article provides a comprehensive, nontechnical overview of the 3 maximum likelihood algorithms. Multiple imputation, which is frequently used in conjunction with the EM algorithm, is also discussed.
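Of the three algorithms, full information maximum likelihood is the simplest to sketch: under multivariate normality, each case contributes a log-likelihood computed only over its observed variables, using the matching sub-vector of μ and sub-matrix of Σ, so incomplete cases are not discarded. A minimal illustration (data and names are ours):

```python
import numpy as np
from scipy.stats import multivariate_normal

def fiml_loglik(data, mu, Sigma):
    """Casewise (full information) ML log-likelihood under multivariate
    normality: each row uses only its observed entries, so incomplete
    cases still contribute information instead of being listwise-deleted."""
    total = 0.0
    for row in data:
        obs = ~np.isnan(row)
        if not obs.any():
            continue
        total += multivariate_normal.logpdf(
            row[obs], mean=mu[obs], cov=Sigma[np.ix_(obs, obs)])
    return total

# Toy data: 3 variables, roughly 20% of values missing at random.
rng = np.random.default_rng(5)
X = rng.multivariate_normal([0, 0, 0], np.eye(3) + 0.5, size=200)
X[rng.random(X.shape) < 0.2] = np.nan
print(fiml_loglik(X, np.zeros(3), np.eye(3) + 0.5))
```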
