Similar literature
20 similar documents found.
1.
The development of cognitive diagnostic‐computerized adaptive testing (CD‐CAT) has provided a new perspective for gaining information about examinees' mastery on a set of cognitive attributes. This study proposes a new item selection method within the framework of dual‐objective CD‐CAT that simultaneously addresses examinees' attribute mastery status and overall test performance. The new procedure is based on the Jensen‐Shannon (JS) divergence, a symmetrized version of the Kullback‐Leibler divergence. We show that the JS divergence resolves the noncomparability problem of the dual information index and has close relationships with Shannon entropy, mutual information, and Fisher information. The performance of the JS divergence is evaluated in simulation studies in comparison with the methods available in the literature. Results suggest that the JS divergence achieves parallel or more precise recovery of latent trait variables compared to the existing methods and maintains practical advantages in computation and item pool usage.
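As a rough illustration of the quantity named in this abstract, the sketch below computes the Jensen‐Shannon divergence between two discrete distributions as the average Kullback‐Leibler divergence of each from their midpoint. It is not the item selection index proposed in the study, whose details the abstract does not give; the function names and the example distributions are assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions given as equal-length lists.

    Terms with p_i == 0 contribute 0 by convention.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q from their midpoint m.

    Because m_i > 0 whenever p_i > 0 or q_i > 0, no division by zero occurs.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical example distributions (not taken from the study).
p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]
print(js_divergence(p, q), js_divergence(q, p))  # the two values agree (symmetry)
```

Unlike the KL divergence, this quantity is symmetric in its arguments and bounded above by log 2 (in nats), which is reached when the two distributions have disjoint support.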

2.
The intent of this research was to find an item selection procedure in the multidimensional computer adaptive testing (CAT) framework that yielded higher precision for both the domain and composite abilities, had a higher usage of the item pool, and controlled the exposure rate. Five multidimensional CAT item selection procedures (minimum angle; volume; minimum error variance of the linear combination; minimum error variance of the composite score with optimized weight; and Kullback‐Leibler information) were studied and compared with two methods for item exposure control (the Sympson‐Hetter procedure and the fixed‐rate procedure, the latter simply meaning a limit placed on the item exposure rate) using simulated data. The maximum priority index method was used for the content constraints. Results showed that the Sympson‐Hetter procedure yielded better precision than the fixed‐rate procedure but had much lower item pool usage and took more time. The five item selection procedures performed similarly under Sympson‐Hetter. For the fixed‐rate procedure, there was a trade‐off between the precision of the ability estimates and the item pool usage, and the five procedures showed different patterns. It was found that (1) Kullback‐Leibler had better precision but lower item pool usage; (2) minimum angle and volume had balanced precision and item pool usage; and (3) the two methods minimizing the error variance had the best item pool usage and comparable overall score recovery but less precision for certain domains. The priority index for content constraints and item exposure was implemented successfully.

3.
Cognitive diagnosis models (CDMs) have been developed to evaluate the mastery status of individuals with respect to a set of defined attributes or skills that are measured through testing. When individuals are repeatedly administered a cognitive diagnosis test, a new class of multilevel CDMs is required to assess the changes in their attributes and simultaneously estimate the model parameters from the different measurements. In this study, the most general CDM of the generalized deterministic input, noisy “and” gate (G‐DINA) model was extended to a multilevel higher order CDM by embedding a multilevel structure into higher order latent traits. A series of simulations based on diverse factors was conducted to assess the quality of the parameter estimation. The results demonstrate that the model parameters can be recovered fairly well and attribute mastery can be precisely estimated if the sample size is large and the test is sufficiently long. The range of the location parameters had opposing effects on the recovery of the item and person parameters. Ignoring the multilevel structure in the data by fitting a single‐level G‐DINA model decreased the attribute classification accuracy and the precision of latent trait estimation. The number of measurement occasions had a substantial impact on latent trait estimation. Satisfactory model and person parameter recoveries could be achieved even when assumptions of the measurement invariance of the model parameters over time were violated. A longitudinal basic ability assessment is outlined to demonstrate the application of the new models.

4.
The assessment of differential item functioning (DIF) is routinely conducted to ensure test fairness and validity. Although many DIF assessment methods have been developed in the context of classical test theory and item response theory, they are not applicable for cognitive diagnosis models (CDMs), as the underlying latent attributes of CDMs are multidimensional and binary. This study proposes a very general DIF assessment method in the CDM framework which is applicable for various CDMs, more than two groups of examinees, and multiple grouping variables that are categorical, continuous, observed, or latent. The parameters can be estimated with Markov chain Monte Carlo algorithms implemented in the freeware WinBUGS. Simulation results demonstrated a good parameter recovery and advantages in DIF assessment for the new method over the Wald method.

5.
This article summarizes the continuous latent trait IRT approach to skills diagnosis as particularized by a representative variety of continuous latent trait models using item response functions (IRFs). First, several basic IRT-based continuous latent trait approaches are presented in some detail. Then estimation, model checking, and assessment scoring aspects are briefly summarized. Finally, the University of California at Berkeley multidimensional Rasch-model-grounded SEPUP middle school science-focused embedded assessment project is briefly described as one significant illustrative application.

6.
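Cognitive diagnosis models are the core of cognitive diagnosis theory, a new generation of psychometric theory. They fall into two broad classes: latent trait models and latent class models. Latent class models are mainly used to analyze examinees' response processes and thereby explore their latent knowledge structures; they overcome shortcomings of classical test theory (CTT) and item response theory (IRT) and mark a new milestone in educational and psychological measurement. This article first introduces the rule space model, which serves as the foundation of this class of models, then focuses on the newer latent class models developed from it, and finally evaluates these models and discusses their prospects.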

7.
The goal of the current study was to introduce a new stopping rule for computerized adaptive testing. The predicted standard error reduction stopping rule (PSER) uses the predictive posterior variance to determine the reduction in standard error that would result from the administration of additional items. The performance of the PSER was compared to that of the minimum standard error stopping rule and a modified version of the minimum information stopping rule in a series of simulated adaptive tests, drawn from a number of item pools. Results indicate that the PSER makes efficient use of CAT item pools, administering fewer items when predictive gains in information are small and increasing measurement precision when information is abundant.

8.
The traditional kappa statistic for assessing interrater agreement is not adequate when multiple raters and multiple attributes are involved. In this article, latent trait models are proposed to assess the multirater multiattribute (MRMA) agreement. Data from the Third International Mathematics and Science Study (TIMSS) are used to illustrate the application of the latent trait models. Results showed that among four possible latent trait models, the correlated uniqueness model had the best fit to assess the MRMA agreement. Furthermore, in coding a set of different attributes, the coding accuracy within the same rater may differ across attributes. Likewise, when different raters rate the same attribute, the accuracy in rating varies among the raters. Thus, the latent models provide us with a more refined and accurate assessment of interrater agreement. The application of the latent trait models is important in school psychology research and intervention because accurate assessment of children's functioning is fundamental in designing effective intervention strategies. © 2007 Wiley Periodicals, Inc. Psychol Schs 44: 515–525, 2007.
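For context on the baseline this abstract criticizes, the sketch below computes the traditional two-rater kappa (Cohen's kappa) as observed agreement corrected for chance agreement. It covers exactly two raters and one attribute, which is the limitation the abstract points to; it is not the latent trait approach proposed in the article. The function name and the ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who coded the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement rate
    and p_e is the agreement expected by chance from each rater's marginals.
    Undefined (division by zero) when p_e == 1, i.e. both raters always
    assign the same single category.
    """
    n = len(ratings_a)
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    p_e = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes from two raters on four items.
print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))  # 0.5
```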

9.
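Cognitive diagnosis infers examinees' mastery status on cognitive attributes by analyzing their item responses, and it provides very valuable information for designing remedial instruction for students with learning difficulties. After investigating the cognitive attributes underlying primary school students' multi-digit multiplication skills and constructing two cognitive diagnostic tests with the same assessment pattern, the authors selected 310 upper-grade students from a primary school in Jiangxi as participants. The first test was administered and analyzed with the DINA model using a self-written parameter estimation program, yielding each participant's attribute mastery pattern classification and the overall mastery of each attribute across participants. Remedial instruction was then designed and delivered, after which the second test was administered to check its effect. The study found that: (1) upper-grade students at this school had less than ideal mastery of the 0×N rule, the procedure for multiplying a multi-digit number by a two-digit number, and multiplication carrying, especially carrying; (2) participants classified into the full-mastery pattern accounted for 86.47%, and the remaining participants were classified into mastery patterns with various cognitive deficiencies; (3) a comparison of the two diagnostic test reports showed that remedial instruction guided by cognitive diagnosis was well targeted: after remediation, participants answered more items correctly and mastered more attributes, indicating a good remedial effect.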
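The abstract names the DINA model without stating its form. As a minimal sketch, assuming the standard textbook parameterization (an examinee who has mastered every attribute an item requires answers correctly unless they slip; anyone else answers correctly only by guessing), the item response probability can be written as below. The function name, attribute patterns, and slip and guess values are hypothetical and are not taken from the study.

```python
def dina_prob_correct(alpha, q_row, slip, guess):
    """Probability of a correct response under the DINA model.

    alpha : list of 0/1, the examinee's attribute mastery pattern
    q_row : list of 0/1, the item's row of the Q-matrix (required attributes)
    slip  : probability of answering incorrectly despite mastering all required attributes
    guess : probability of answering correctly without mastering all required attributes
    """
    # eta is True only if every attribute the item requires has been mastered.
    eta = all(a >= q for a, q in zip(alpha, q_row))
    return 1 - slip if eta else guess


# Hypothetical item requiring attributes 1 and 2 out of three.
q_row = [1, 1, 0]
print(dina_prob_correct([1, 1, 0], q_row, slip=0.1, guess=0.2))  # masters both: 1 - slip
print(dina_prob_correct([1, 0, 1], q_row, slip=0.1, guess=0.2))  # lacks attribute 2: guess
```

Because the model is conjunctive, missing one required attribute is treated the same as missing all of them, which is why diagnosis rests on comparing response patterns across items with different Q-matrix rows.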

10.
Item response theory (IRT) procedures have been used extensively to study normal latent trait distributions and have been shown to perform well; however, less is known concerning the performance of IRT with non-normal latent trait distributions. This study investigated the degree of latent trait estimation error under normal and non-normal conditions using four latent trait estimation procedures and also evaluated whether the test composition, in terms of item difficulty level, reduces estimation error. Most importantly, both true and estimated item parameters were examined to disentangle the effects of latent trait estimation error from item parameter estimation error. Results revealed that non-normal latent trait distributions produced a considerably larger degree of latent trait estimation error than normal data. Estimated item parameters tended to have comparable precision to true item parameters, thus suggesting that increased latent trait estimation error results from latent trait estimation rather than item parameter estimation.

11.
Historically, research focusing on rater characteristics and rating contexts that enable the assignment of accurate ratings and research focusing on statistical indicators of accurate ratings has been conducted by separate communities of researchers. This study demonstrates how existing latent trait modeling procedures can identify groups of raters who may be of substantive interest to those studying the experiential, cognitive, and contextual aspects of ratings. We employ two data sources in our demonstration: simulated data and data from a large‐scale state‐wide writing assessment. We apply latent trait models to these data to identify examples of rater leniency, centrality, inaccuracy, and differential dimensionality; and we investigate the association between rater training procedures and the manifestation of rater effects in the real data.

12.
The qualitative characterization of individual performance that is central to modern psychological theory is not adequately modeled by traditional psychometric theory that assumes, among other things, unidimensionality. In the present study, data are presented that are more adequately modeled by HYBRID, a model that incorporates both latent trait and latent class components. The latent classes were defined by a cognitive analysis of the understanding that individuals have for a circumscribed domain. In addition to providing a better statistical fit, the analysis also improves the amount of diagnostic information available for a given individual.

13.
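The BP (backpropagation) neural network is one of the most widely used artificial neural network models and performs well in classification and recognition, so researchers have applied it to cognitive diagnostic assessment to classify examinees diagnostically. A simulation study examined how five factors (number of attributes, attribute hierarchy, test length, item quality, and test sample size) affect the classification accuracy of BP neural networks in cognitive diagnosis. The results show that: 1) the classification accuracy of BP neural network–based cognitive diagnosis does not depend on the test sample size; 2) item quality and test length have significant positive effects on the diagnostic accuracy of the BP neural network; 3) the number of attributes has a negative effect on its classification accuracy; 4) item quality affects, to some extent, the classification accuracy of the BP diagnostic method across different attribute hierarchy structures.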

14.
The continuous testing framework, where both successful and unsuccessful examinees have to demonstrate continued proficiency at frequent prespecified intervals, is a framework that is used in noncognitive assessment and is gaining in popularity in cognitive assessment. Despite the rigor this framework offers, this paper demonstrates that there is significant inflation in false negatives as both passers and failers continually take a test, especially for examinees closer to the passing score. Several passing policies are investigated to control the inflation of false negatives while maintaining low false‐positive rates for fixed‐length tests. Lastly, recommendations are made for testing professionals who wish to utilize the rigorous nature of the continuous testing framework while also avoiding inflation in the number of qualified examinees who fail.

15.
To diagnose the English as a Foreign Language (EFL) reading ability of Chinese high-school students, the study explored how an educational theory, the revised taxonomy of educational objectives, could be used to create the attribute list. Q-matrices were proposed and refined qualitatively and quantitatively. The final Q-matrix specified the relationship between 53 reading items and 9 cognitive attributes. Thereafter, 978 examinees’ responses were calibrated by cognitive diagnosis models (CDMs) to explore their strengths and weaknesses in EFL reading. Results showed strengths and weaknesses on the 9 attributes of the sampled population, examinees at three proficiency levels and individual learners. A diagnostic score report was also developed to communicate multi-layered information to various stakeholders. The goodness of fit of the selected CDM was evaluated from multiple measures. The results provide empirical evidence for the utility of educational theories in cognitive diagnosis, and the feasibility of retrofitting non-diagnostic tests for diagnostic purposes in language testing. In addition, the study also demonstrates procedures of model selection and a post-hoc approach of model verification in language diagnosis.

16.
17.
This study evaluates latent differential equation models on binary and ordinal data. Binary and ordinal data are widely used in psychology research and many statistical models have been developed, such as the probit model and the logit model. We combine the latent differential equation model with the probit model through a threshold approach, and then compare the threshold model with a naive model, which blindly treats binary and ordinal data as continuous. Simulation results suggest that the naive model leads to bias on binary data and on ordinal data with fewer than 5 levels, whereas the threshold model is unbiased and efficient for binary and ordinal data. Two example analyses on empirical binary data and ordinal data show that the threshold model also has better external validity. The R code for the threshold model is provided.

18.
Fu Chen, Yue Yan. Educational Psychology (《教育心理学》), 2017, 37(2): 128-144
The current study focuses on developing the learning progression of number sense for primary school students, and it applies a cognitive diagnostic model, the rule space model, to data analysis. The rule space model analysis firstly extracted nine cognitive attributes and their hierarchy model from the analysis of previous research and the mathematics textbook used in Beijing. A cognitive diagnostic test for number sense was then developed based upon the cognitive attributes. Finally, the model was used to analyse a sample of 1207 Chinese primary school students’ observed item responses to identify their knowledge states and to validate and modify the hypothesised learning progression. The results showed that the test was of good psychometric quality, and that the hypothesised learning progression was generally validated. By applying the rule space model, the hypothesised learning progression was modified at each level. The results also showed that students in grade 3, grade 4 and grade 5 were mainly classified into level 1 and level 2, level 2–level 4 and level 5 of the modified learning progression, respectively. These results suggest the feasibility and benefits of using cognitive diagnostic models to develop learning progressions.

19.
This paper proposes two new item selection methods for cognitive diagnostic computerized adaptive testing: the restrictive progressive method and the restrictive threshold method. They are built upon the posterior weighted Kullback‐Leibler (KL) information index but include additional stochastic components either in the item selection index or in the item selection procedure. Simulation studies show that both methods are successful at simultaneously suppressing overexposed items and increasing the usage of underexposed items. Compared to item selection based upon (1) pure KL information and (2) the Sympson‐Hetter method, the two new methods strike a better balance between item exposure control and measurement accuracy. The two new methods are also compared with Barrada et al.'s (2008) progressive method and proportional method.

20.
In this ITEMS module, we provide a didactic overview of the specification, estimation, evaluation, and interpretation steps for diagnostic measurement/classification models (DCMs), which are a promising psychometric modeling approach. These models can provide detailed skill‐ or attribute‐specific feedback to respondents along multiple latent dimensions and hold theoretical and practical appeal for a variety of fields. We use a current unified modeling framework, the log‐linear cognitive diagnosis model (LCDM), as well as a series of quality‐control checklists for data analysts and scientific users to review the foundational concepts, practical steps, and interpretational principles for these models. We demonstrate how the models and checklists can be applied in real‐life data‐analysis contexts. A library of macros and supporting files for Excel, SAS, and Mplus is provided along with video tutorials for key practices.
