Found 20 similar documents (search time: 250 ms)
1.
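Modern test theories, represented by generalizability theory and item response theory, arose from efforts to overcome the shortcomings of classical test theory. Generalizability theory builds on classical test theory by introducing experimental design and analysis-of-variance techniques to decompose and control the various sources of error in a measurement situation; its development has passed through two main stages, univariate and multivariate generalizability theory. At present its applications are concentrated in three areas: evaluation, examinations, and the construction of rating scales. Item response theory developed in response to the variability of item parameters and other indices in classical test theory; its development went through three stages: early theoretical exploration, initial formation of the theory, and gradual refinement. It is mainly used for score equating, for handling test item parameters and analyzing test and item quality, for separating the influence of rater characteristics from test results, for detecting differential item functioning, and for constructing adaptive tests.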
2.
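Reviews the development of educational measurement theory, analyzes the theoretical foundations of classical test theory, generalizability theory, item response theory, and cognitive diagnosis theory together with their strengths and weaknesses in practical application, and then discusses problems with their theoretical assumptions.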
3.
4.
5.
6.
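With the development and spread of a new generation of statistical theories and methods that emerged in the 1970s, Classical Test Theory (CTT), long the most widely applied theory in psychological and educational measurement, and the standardization techniques built on it can no longer meet the diverse needs of modern measurement. As a result, Generalizability Theory (GT), which to a large extent remedies the shortcomings of traditional classical measurement theory, has gradually attracted the attention and favor of many researchers and has become a very popular new measurement theory internationally.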
7.
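Generalizability theory is a distinctive system of measurement theory. It is very helpful for analyzing whether a test's structure is sound and for exploring ways to improve test precision, and in content and scope of application it extends classical measurement theory. After introducing generalizability theory, this paper discusses its role in test design, using the 《兴趣测验》 (Interest Test) developed under the Examination Center of the Ministry of Education (教育部考试中心) as an example.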
8.
9.
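Drawing on the Strengths of Two Measurement Theories to Improve the Quality of Self-Taught Examination Item Banks: Experience from Building the "Logic" Item Bank for the Higher Education Self-Taught Examination
Qi Shuqing, Dai Haiqi, Ding Shuliang, Xie Xusheng
The theory currently prevailing in test construction in China is true score theory. This classical measurement theory developed mainly out of the practice of constructing norm-referenced tests. Its proposals concerning item analysis, reliability estimation, validity verification...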
10.
A Critical Review of the Limitations of Classical Test Theory   Total citations: 1, self-citations: 0, citations by others: 1
Ji Lingkai 《湖北大学成人教育学院学报》 (Journal of Adult Education College of Hubei University) 2005,23(4):64-66
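From the standpoint of practical application, this paper systematically analyzes some shortcomings of classical test theory and points out the direction in which test theory is currently developing.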
11.
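SYSTAT is a statistical software package that combines classical test theory and item response theory. Drawing on a practical study in foreign-language testing, namely an item analysis of the TEM4 grammar and vocabulary items, the article introduces the software's commonly used functions and operating procedures, providing technical support for integrating modern information technology with language testing.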
12.
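Research on Paperless Examinations for Computer Information Technology Courses   Total citations: 1, self-citations: 0, citations by others: 1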
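Introduces the development of test theory from classical measurement to item response theory and points out the inevitability and advantages of computerized testing. Describes in some detail how computer-based examinations can be implemented in a multimedia network laboratory.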
13.
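The gaokao (National College Entrance Examination) is currently China's most influential high-stakes, large-scale educational examination, so studying its quality is of great significance. Over the past decade, Chinese scholars' research on the reliability and validity of the gaokao has mostly been confined to classical test theory. This paper proposes using item response theory to study the reliability and validity of the gaokao further, and explores the possibility of using contemporary educational measurement techniques such as equating and linking to build a large-scale, cross-regional, cross-year gaokao database. Research in these areas can provide more reliable information for gaokao reform and related educational policy decisions.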
14.
Soghra Akbari Chermahini, Marian Hickendorff, Bernhard Hommel 《Thinking Skills and Creativity》2012,7(3):177-186
The Remote Associates Test (RAT) developed by Mednick and Mednick (1967) is known as a valid measure of creative convergent thinking. We developed a 30-item version of the RAT in Dutch with high internal consistency (Cronbach's alpha = 0.85) and applied both Classical Test Theory and Item Response Theory (IRT) to provide measures of item difficulty and discriminability, construct validity, and reliability. IRT was further used to construct a shorter version of the RAT, which comprises 22 items but still shows good reliability and validity, as revealed by its relation to Raven's Advanced Progressive Matrices test, another insight-problem test, and Guilford's Alternative Uses Test.
15.
Silin Wei, Xiufeng Liu, Yuane Jia 《International Journal of Science and Mathematics Education》2014,12(5):1067-1082
Scientific models and modeling play an important role in science, and students’ understanding of scientific models is essential for their understanding of scientific concepts. The measurement instrument of Students’ Understanding of Models in Science (SUMS), developed by Treagust, Chittleborough & Mamiala (International Journal of Science Education, 24(4):357–368, 2002), has commonly been used to measure SUMS. SUMS was developed using the Classical Test Theory (CTT). Considering the limitations of CTT, in this study we applied a Rasch model to validate SUMS further. SUMS was given to 629 students in 18 classes of grades 9 and 10 from six high schools in China. The results present both additional evidence for the validity and reliability of SUMS and specific aspects for further improvement. This approach of validation of a published instrument by Rasch measurement can be applied to other measurement instruments developed using CTT.
16.
This paper describes a procedure for automated test forms assembly based on Classical Test Theory (CTT). The procedure uses stratified random content sampling and test form pre-equating to ensure both content and psychometric equivalence in generating virtually unlimited parallel forms. The procedure extends the usefulness of CTT in automated test form construction, yielding classical item statistics based on representative sample distributions and pre-equated test forms with known psychometric characteristics. A rationale for the procedure is presented followed by an example application and discussion of psychometric considerations related to its use.
17.
Concept inventories hold tremendous promise for promoting the rigorous evaluation of teaching methods that might remedy common student misconceptions and promote deep learning. The measurements from concept inventories can be trusted only if the concept inventories are evaluated both by expert feedback and statistical scrutiny (psychometric evaluation). Classical Test Theory and Item Response Theory provide two psychometric frameworks for evaluating the quality of assessment tools. We discuss how these theories can be applied to assessment tools generally and then apply them to the Digital Logic Concept Inventory (DLCI). We demonstrate that the DLCI is sufficiently reliable for research purposes when used in its entirety and as a post-course assessment of students’ conceptual understanding of digital logic. The DLCI can also discriminate between students across a wide range of ability levels, providing the most information about weaker students’ ability levels.
18.
Carol Van Zile-Tamsen 《Research in higher education》2017,58(8):922-933
The use of surveys, questionnaires, and rating scales to measure important outcomes in higher education is pervasive, but reliability and validity information is often based on problematic Classical Test Theory approaches. Rasch Analysis, based on Item Response Theory, provides a better alternative for examining the psychometric quality of rating scales and informing scale improvements. This paper outlines a six-step process for using Rasch Analysis to review the psychometric properties of a rating scale. The Partial Credit Model and Andrich Rating Scale Model will be described in terms of the psychometric information (i.e., reliability, validity, and item difficulty) and diagnostic indices generated. Further, this approach will be illustrated through the example of authentic data from a university-wide student evaluation of teaching.
19.
Panayiotis Panayides, Colin Robinson, Peter Tymms 《British Educational Research Journal》2010,36(4):611-626
Assessment has been dominated by Classical Test Theory for the last half century although the radically different approach known as Rasch measurement briefly blossomed in England during the 1960s and 1970s. Its open development was stopped dead in the 1980s, whilst some work has continued almost surreptitiously. Elsewhere Rasch has assumed dominance. The purpose of this article is to discuss the major criticisms of the Rasch model, which led to its rejection by some, and to give responses to these criticisms whilst encouraging social scientists to appreciate its strengths. The original breakthrough by Georg Rasch in 1960 has been developed and extended to address every reasonable observational situation in the social sciences.
20.
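Revision and Evolution of the Copenhagen School's Security Complex Theory   Total citations: 2, self-citations: 0, citations by others: 2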
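The Copenhagen School of European security studies, represented by Barry Buzan and Ole Wæver, is an important force in international security studies; having put forward the well-known "security complex theory", it has become the most prominent school in European security studies in recent years. This article first briefly introduces the background against which security complex theory was proposed and the course of its theoretical evolution. It then analyzes the content and characteristics of "classical security complex theory", "security complex theory beyond the classical", and the later "regional security complex theory", in an attempt to outline a fairly complete theoretical framework. The final part comments on and analyzes security complex theory.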