Similar documents
 20 similar documents found
1.
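Modern test theories, represented by generalizability theory and item response theory, arose from efforts to overcome the shortcomings of classical test theory. Generalizability theory builds on classical test theory by introducing experimental design and analysis-of-variance techniques to decompose and control the various sources of error in an assessment setting; its development has passed through two main stages, univariate and multivariate generalizability theory. At present it is applied mainly in three areas: evaluation, examinations, and the construction of rating scales. Item response theory is a modern test theory developed to overcome the variability of indices such as item parameters in classical test theory; its development went through three stages: early theoretical exploration, initial formation of the theory, and gradual refinement. It is mainly used for score equating, for analyzing test and item parameters and the quality of tests and items, for separating out the influence of rater characteristics on test results, and for detecting differential item functioning and constructing adaptive tests.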

2.
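This paper reviews the development of educational measurement theory, analyzes the theoretical foundations and the practical strengths and weaknesses of classical test theory, generalizability theory, item response theory, and cognitive diagnostic theory, and then discusses problems with their theoretical assumptions.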

3.
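In practice, quantitative analysis of tests is an important application area of educational measurement. Classical test theory, as an important method in educational measurement, has long received wide attention from researchers in China and abroad. By introducing the R package CTT, which is based on classical test theory, this paper describes the basic workflow of test analysis and related research progress, and uses subject test data for a worked demonstration, explaining in detail the package's functions for item analysis, reliability analysis, polyserial correlation computation, plotting CTT item characteristic curves, and computing derived scores.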
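The item analysis and reliability analysis mentioned in this abstract can be illustrated with a minimal Python sketch. It does not reproduce the R CTT package or its API; the function names (`item_difficulty`, `corrected_item_total`, `cronbach_alpha`) and the small 0/1 response matrix are hypothetical and chosen only for illustration.

```python
from statistics import mean, pvariance

# Hypothetical data: 5 examinees (rows) x 4 dichotomously scored items (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

def item_difficulty(data):
    """CTT item difficulty: proportion of examinees answering each item correctly."""
    n_items = len(data[0])
    return [mean(row[j] for row in data) for j in range(n_items)]

def corrected_item_total(data, j):
    """Pearson correlation between item j and the total score on the remaining items."""
    item = [row[j] for row in data]
    rest = [sum(row) - row[j] for row in data]
    m_item, m_rest = mean(item), mean(rest)
    cov = sum((a - m_item) * (b - m_rest) for a, b in zip(item, rest))
    ss_item = sum((a - m_item) ** 2 for a in item)
    ss_rest = sum((b - m_rest) ** 2 for b in rest)
    return cov / (ss_item * ss_rest) ** 0.5

def cronbach_alpha(data):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total-score variance)."""
    k = len(data[0])
    item_vars = [pvariance([row[j] for row in data]) for j in range(k)]
    total_var = pvariance([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

print("difficulty:", item_difficulty(responses))
print("item 1 discrimination:", corrected_item_total(responses, 0))
print("alpha:", cronbach_alpha(responses))
```

Population variances are used for both the items and the total score; using sample variances throughout would give the same alpha, because the ratio is unchanged.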

4.
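Item response theory has many advantages over classical test theory in evaluating item quality for large-scale selection examinations, and it is increasingly applied in China and abroad. Following the procedures and methods for analyzing norm-referenced tests, this study applies item response theory to analyze the item-writing quality of the 2011 Guiyang senior-three English mock examination. The results again confirm that item response theory, when used to analyze test quality, offers advantages such as invariance of item parameters across samples and estimates of examinee trait level that do not depend on the particular test items, and that it has significant value and promising applications in item writing for basic education examinations.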

5.
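There are two main approaches to building an item bank: one based on classical test theory, the other on item response theory. The approach based on item response theory is superior in many respects, but it is technically complex and labor-intensive, and it is less intuitive, simple, and easy to master than the classical approach. Each approach has its merits, so both are widely used. In the United States, for example, the Scholastic Aptitude Test (SAT) is based on classical test theory, whereas the familiar TOEFL adopts item response theory.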

6.
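葛都 (Ge Du). 《教育发展研究》, 2004, 24(7): 148-149
With the development and spread of a new generation of statistical theories and methods that emerged in the 1970s, "classical test theory" (Classical Testing Theory, CTT), long the most widely used theory in psychological and educational measurement, and the standardization techniques built on it can no longer meet the diverse needs of modern measurement. As a result, "generalizability theory" (Generalizability Theory, GT), which to a large extent remedies the shortcomings of traditional classical test theory, has gradually attracted the attention and favor of many researchers and has become a new measurement theory that is currently very popular internationally.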

7.
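Generalizability theory is a distinctive system of measurement theory that is very helpful for analyzing whether a test's structure is sound and for exploring ways to improve measurement precision; in content and scope of application it extends classical test theory. After introducing generalizability theory, this paper uses the 《兴趣测验》 (Interest Test), developed under the Examination Center of the Ministry of Education (教育部考试中心), to discuss the role of generalizability theory in test design.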

8.
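Introduction. The author has written a series of three papers on the concepts, procedures, applications, and open problems of test equating and linking; this paper is the second in the series. The series draws on "A Practitioner's Introduction to Equating and Linking: Classical Test Theory and Item Response Theory ..."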

9.
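Drawing on the Strengths of Two Measurement Theories to Improve the Quality of Self-Study Examination Item Banks: Experience from Building the "Logic" Item Bank for the Higher Education Self-Study Examination. 漆书青 (Qi Shuqing), 戴海崎 (Dai Haiqi), 丁树良 (Ding Shuliang), 谢旭升 (Xie Xusheng). The theory currently prevailing in test construction in China is true score theory. This classical test theory developed mainly out of the practice of constructing norm-referenced tests. What it offers on item analysis, reliability estimation, validity verification ...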

10.
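A Critical Review of the Limitations of Classical Test Theory   Times cited: 1 (self-citations: 0; citations by others: 1)
This paper systematically analyzes some shortcomings of classical test theory from the perspective of practical application and points out the current direction of development of test theory.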

11.
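SYSTAT is a statistical software package that integrates classical test theory and item response theory. Using an item analysis of the TEM4 grammar and vocabulary section as an example from foreign-language testing research, this article introduces the software's commonly used functions and operating procedures, providing technical support for integrating modern information technology with language testing.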

12.
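A Study of Paperless Examinations for Computer Information Technology Courses   Times cited: 1 (self-citations: 0; citations by others: 1)
This paper introduces the development of testing theory from classical measurement to item response theory and points out the inevitability and advantages of computerized testing. It describes in some detail how computer-based examinations can be implemented in a multimedia networked laboratory.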

13.
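The Gaokao (national college entrance examination) is currently the most influential high-stakes, large-scale educational examination in China, so research on its quality is of great significance. Over the past decade, Chinese scholars' studies of the reliability and validity of the Gaokao have mostly been confined to classical test theory. This paper proposes using item response theory to further study the reliability and validity of the Gaokao, and explores the possibility of using contemporary educational measurement techniques such as equating and linking to build a large cross-regional, cross-year Gaokao database. Research along these lines can provide more reliable information for Gaokao reform and related educational policy decisions.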

14.
The Remote Associates Test (RAT) developed by Mednick and Mednick (1967) is known as a valid measure of creative convergent thinking. We developed a 30-item version of the RAT in Dutch with high internal consistency (Cronbach's alpha = 0.85) and applied both Classical Test Theory and Item Response Theory (IRT) to provide measures of item difficulty and discriminability, construct validity, and reliability. IRT was further used to construct a shorter version of the RAT, which comprises 22 items but still shows good reliability and validity, as revealed by its relation to Raven's Advanced Progressive Matrices test, another insight-problem test, and Guilford's Alternative Uses Test.

15.
Scientific models and modeling play an important role in science, and students’ understanding of scientific models is essential for their understanding of scientific concepts. The measurement instrument of Students’ Understanding of Models in Science (SUMS), developed by Treagust, Chittleborough & Mamiala (International Journal of Science Education, 24(4):357–368, 2002), has commonly been used to measure SUMS. SUMS was developed using Classical Test Theory (CTT). Considering the limitations of CTT, in this study we applied a Rasch model to validate SUMS further. SUMS was given to 629 students in 18 classes of grades 9 and 10 from six high schools in China. The results present both additional evidence for the validity and reliability of SUMS and specific aspects for further improvement. This approach of validation of a published instrument by Rasch measurement can be applied to other measurement instruments developed using CTT.
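As a rough illustration of what "a Rasch model" means, the sketch below implements only the simplest, dichotomous form, in which the probability of a correct or endorsed response depends on the difference between a person parameter `theta` and an item difficulty `b`. The abstract does not say which Rasch variant the study used, so this is an assumption made for illustration; the function names are hypothetical.

```python
import math

def rasch_probability(theta, b):
    """Dichotomous Rasch model: P(X = 1) = exp(theta - b) / (1 + exp(theta - b))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def log_odds(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

# Hypothetical item difficulties and person abilities, in logits.
easy_item, hard_item = -1.0, 1.5
for theta in (-2.0, 0.0, 2.0):
    p_easy = rasch_probability(theta, easy_item)
    p_hard = rasch_probability(theta, hard_item)
    # The log-odds gap between the two items is the same for every person:
    # it equals hard_item - easy_item regardless of theta.
    gap = log_odds(p_easy) - log_odds(p_hard)
    print(f"theta={theta:+.1f}  P(easy)={p_easy:.3f}  P(hard)={p_hard:.3f}  gap={gap:.3f}")
```

The constant log-odds gap is one way to see why item comparisons under the Rasch model do not depend on which persons are sampled, in contrast to CTT item statistics.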

16.
This paper describes a procedure for automated test form assembly based on Classical Test Theory (CTT). The procedure uses stratified random content sampling and test form pre-equating to ensure both content and psychometric equivalence in generating virtually unlimited parallel forms. The procedure extends the usefulness of CTT in automated test form construction, yielding classical item statistics based on representative sample distributions and pre-equated test forms with known psychometric characteristics. A rationale for the procedure is presented followed by an example application and discussion of psychometric considerations related to its use.

17.
Concept inventories hold tremendous promise for promoting the rigorous evaluation of teaching methods that might remedy common student misconceptions and promote deep learning. The measurements from concept inventories can be trusted only if the concept inventories are evaluated both by expert feedback and statistical scrutiny (psychometric evaluation). Classical Test Theory and Item Response Theory provide two psychometric frameworks for evaluating the quality of assessment tools. We discuss how these theories can be applied to assessment tools generally and then apply them to the Digital Logic Concept Inventory (DLCI). We demonstrate that the DLCI is sufficiently reliable for research purposes when used in its entirety and as a post-course assessment of students’ conceptual understanding of digital logic. The DLCI can also discriminate between students across a wide range of ability levels, providing the most information about weaker students’ ability levels.

18.
The use of surveys, questionnaires, and rating scales to measure important outcomes in higher education is pervasive, but reliability and validity information is often based on problematic Classical Test Theory approaches. Rasch Analysis, based on Item Response Theory, provides a better alternative for examining the psychometric quality of rating scales and informing scale improvements. This paper outlines a six-step process for using Rasch Analysis to review the psychometric properties of a rating scale. The Partial Credit Model and Andrich Rating Scale Model will be described in terms of the psychometric information (i.e., reliability, validity, and item difficulty) and diagnostic indices generated. Further, this approach will be illustrated through the example of authentic data from a university-wide student evaluation of teaching.

19.
Assessment has been dominated by Classical Test Theory for the last half century although the radically different approach known as Rasch measurement briefly blossomed in England during the 1960s and 1970s. Its open development was stopped dead in the 1980s, whilst some work has continued almost surreptitiously. Elsewhere Rasch has assumed dominance. The purpose of this article is to discuss the major criticisms of the Rasch model, which led to its rejection by some, and to give responses to these criticisms whilst encouraging social scientists to appreciate its strengths. The original breakthrough by Georg Rasch in 1960 has been developed and extended to address every reasonable observational situation in the social sciences.

20.
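Revision and Evolution of the Copenhagen School's Security Complex Theory   Times cited: 2 (self-citations: 0; citations by others: 2)
高峻 (Gao Jun). 《教学与研究》, 2005, 5(10): 89-96
The Copenhagen School, the strand of European security studies represented by Barry Buzan and Ole Wæver, is an important force in international security studies; having put forward the well-known "security complex theory", it has become the most prominent school in European security studies in recent years. This paper first briefly introduces the background against which security complex theory was proposed and how the theory evolved. It then analyzes the content and characteristics of "classical security complex theory", "security complex theory beyond the classical version", and the later "regional security complex theory", in an attempt to outline a fairly complete theoretical framework. The final part offers comments and analysis on security complex theory.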
