期刊界 All Journals: a professional journal search engine for finding journals worldwide and disseminating scholarly work

1.

高燕《苏州教育学院学报》2014,(5):82-84

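Cloze tests are widely used in foreign language teaching and testing because they are objective and convenient to write, administer, score, and analyze. However, the great majority of cloze tests now on the market have low validity, mainly because their items target low-level testing points. Following the theory of cloze testing-point levels proposed by Li Xiaoju, the author designed a cloze test and piloted it with students at a university, focusing on correct-response rates and the reasons for lost points. From this empirical evidence the paper derives a way to improve the validity of cloze testing points by raising their level. Teaching should emphasize students' ability at higher-level testing points, thereby improving learners' overall English proficiency.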

2.

The Validity of National Curriculum Assessment (total citations: 3; self-citations: 1; citations by others: 3)

Gordon Stobart 《British Journal of Educational Studies》2001,49(1):26-39

This paper reviews the validity of National Curriculum assessment in England. It works with the concept of 'consequential validity' (Messick, 1989) which incorporates both conventional 'reliability' issues and the use to which any assessment is put. The review uses the eight stage 'threats to validity' model developed by Crooks, Kane and Cohen (1996). The complexity of National Curriculum assessment makes evaluation difficult. These assessments are used for a variety of purposes so that the 'consequential' aspects are compounded. National Curriculum assessment also involves both Teacher Assessment and tests – each of which has strengths and limitations in relation to validity. The main finding is that the validity of National Curriculum assessment hinges on the balance between Teacher Assessment and testing. Between them they can meet Crooks et al.'s requirements of a valid assessment system. The current emphasis on the use of test results for school accountability and as a measure of national standards has undermined Teacher Assessment to a point at which the validity of the system is in question.

3.

A Validity Argument in Support of the Use of College Admissions Test Scores for Federal Accountability

Wayne J. Camara Krista Mattern Michelle Croft Sara Vispoel Paul Nichols 《Educational Measurement》2019,38(4):12-26

In 2018, 26 states administered a college admissions test to all public school juniors. Nearly half of those states proposed to use those scores as their academic achievement indicators for federal accountability under the Every Student Succeeds Act (ESSA); many others are planning to use those scores for other accountability purposes. Accountability encompasses a number of different uses and subsumes a variety of claims. For states proposing to use summative tests for accountability, a validity argument needs to be developed, which entails delineating each specific use of test scores associated with accountability, identifying appropriate evidence, and offering a rebuttal to counterclaims. The aim of this article is to support states in developing a validity argument for use of college admission test scores for accountability by identifying claims that are applicable across states, along with summarizing existing evidence as it relates to each of these claims. As outlined by The Standards for Educational and Psychological Testing, multiple sources of evidence are used to address each claim. A series of threats to the validity argument, including weaker alignment with content standards and potential influences in narrowing teaching, are reviewed. Finally, the article contrasts validity evidence, primarily from research on the ACT, with regulatory requirements from ESSA. The Standards and guidance addressing the use of a “nationally recognized high school academic assessment” (Elementary and Secondary Education Act (ESEA), Negotiated Rulemaking Committee; Department of Education) are the primary sources for the organization of validity evidence.

4.

A Study of the Criterion-Related Validity of Physical Fitness Test Indicators for Deaf-Mute People

程明吉 楼方芳 李文广 《内江师范学院学报》2010,25(10):100-103

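Taking the physical fitness monitoring indicator system of the 2005 National Physical Fitness Measurement Standards (adults) as its basis, and using mathematical statistics and random sampling, this study examines the criterion-related validity of physical fitness test indicators for deaf-mute people. The study covers three aspects: an experiment on the operability of applying the adult indicator system to fitness testing of deaf-mute people; a study of the criterion-related validity of the test indicators; and verification of effectiveness through a small-sample pretest and a large-sample full test. It aims to construct a fitness test indicator system suited to deaf-mute people and to support the establishment, piloting, and full rollout of such a system.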

5.

High-Stakes Testing Accommodations: Validity Versus Disabled Rights

《Applied Measurement in Education》2013,26(2):93-120

Traditionally, measurement specialists have provided testing accommodations for examinees with physical disabilities such as blindness or impaired mobility. Following passage of the Americans with Disabilities Act of 1990, advocates for the disabled have argued that federal law also requires testing accommodations for mental disabilities such as dyslexia and other learning disabilities. Such requested accommodations have included readers, calculators, word processors, and additional time. But these accommodations may affect test validity, requiring measurement specialists to balance the social goal of integrating the disabled against the measurement goal of accurate test score interpretation. Although the courts have provided some guidance regarding testing accommodation requirements for the disabled, they have not yet addressed the issue of where to draw the line on accommodations for mental disabilities. This article explores the measurement problems associated with granting accommodations for mental disabilities, uses existing case law to construct a legal framework for considering such accommodations, and discusses the advantages and disadvantages of alternative strategies for handling testing accommodation requests.

6.

The Evolution of Validity Theory: Public School Testing, the Courts, and Incompatible Interpretations

《Educational Assessment》2013,18(2):149-165

Professional measurement standards have evolved during the past 5 decades, creating a more unitary yet nebulous conception of validation. Concurrently, due to the increase of high-stakes testing in public schools, the courts have been forced to rule on the appropriateness of decisions emanating from tests. However, the courts often have failed to apply current validation theory in rendering decisions, preferring the convenience and clarity of earlier perspectives of validity. This rift between validity theory and judicial interpretation threatens to grow into a chasm as more complex views of validation prevail in the profession. Modern measurement practitioners stand astride this chasm in their efforts to implement test validation procedures that are cost effective, legally defensible, and consistent with state-of-the-art theory.

7.

On the Elements of Law and the Validity of Legal Norms

黄捷 车丽华 《湖南师范大学社会科学学报》2001,30(3):69-73

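The elements of law consist of legal norms, concepts, and principles, of which legal norms are the most important and most basic. Beyond their meaning, logical structure, and types, validity is the key issue that runs through legal norms. The validity of a legal norm has two aspects: validity as it ought to be (normative) and validity as it is (actual). Normative validity is a synthesis of justice and order. As for actual validity, a legal norm is valid (or in force) if it is in essence identical with normative validity; otherwise it is invalid (or lapses). Among the elements of law, ensuring that legal norms are valid requires a sound combination of normative and actual validity, and of validity in substance and in form.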

8.

A Preliminary Discussion of the Construct Validity of the Writing Section of the Gaokao English Test

王晓军《江西广播电视大学学报》2009,(1)

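The National College Entrance Examination (gaokao) is large in scale and has far-reaching social impact, so its importance cannot be overlooked. It is therefore necessary to establish its validity scientifically in order to earn public trust. Drawing on English language testing theory, this paper analyzes the rater reliability, validity, and marking of the writing section of the 2008 Ningxia gaokao English test, in order to check whether the test's validity meets the required standard.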

9.

On the validity of useless tests

Stephen G. Sireci 《Assessment in Education: Principles, Policy & Practice》2016,23(2):226-235

A misconception exists that validity may refer only to the interpretation of test scores and not to the uses of those scores. The development and evolution of validity theory illustrate test score interpretation was a primary focus in the earliest days of modern testing, and that validating interpretations derived from test scores remains essential today. However, test scores are not interpreted and then ignored; rather, their interpretations lead to actions. Thus, a modern definition of validity needs to describe the validation of test score interpretations as a necessary, but insufficient, step en route to validating the uses of test scores for their intended purposes. To ignore test use in defining validity is tantamount to defining validity for ‘useless’ tests. The current definition of validity stipulated in the 2014 version of the Standards for Educational and Psychological Testing properly describes validity in terms of both interpretations and uses, and provides a sufficient starting point for validation.

10.

Validity Issues in Computer-Based Testing

Kristen L. Huff Stephen G. Sireci 《Educational Measurement》2001,20(3):16-25

Advances in technology are stimulating the development of complex, computerized assessments. The prevailing rationales for developing computer-based assessments are improved measurement and increased efficiency. In the midst of this measurement revolution, test developers and evaluators must revisit the notion of validity. In this article, we discuss the potential positive and negative effects computer-based testing could have on validity, review the literature regarding validation perspectives in computer-based testing, and provide suggestions regarding how to evaluate the contributions of computer-based testing to more valid measurement practices. We conclude that computer-based testing shows great promise for enhancing validity, but at this juncture, it remains equivocal whether technological innovations in assessment have led to more valid measurement.

11.

A Brief Discussion of the Reliability and Validity of In-School English Tests

吴文辉《延安教育学院学报》2010,24(5):90-91,93

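Language testing is an important part of language teaching and an important means of measuring students' language acquisition. The key measures of a language test are its reliability and validity, and a good test results from a reasonable balance between the two. The author discusses shortcomings in the reliability and validity of in-school college English tests and proposes corresponding improvements.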

12.

Constructing a Validity-Based In-School Test of Writing Ability for Higher Vocational Business English Majors

许进 王锦霞 《山东教育学院学报》2008,23(6):103-106

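Taking communicative language testing theory as its basis, the paper examines fairly comprehensively the types of validity construction, their importance, and the dynamic process by which validity is achieved. Combining the emphasis that in-school achievement tests place on validity with a needs analysis of the language abilities of higher vocational business English majors, it discusses in detail, from the three perspectives of construct validity, content validity, and criterion-related validity, how to build a high-validity in-school writing test for these students.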

13.

When Assessment Validation Neglects Any Strand of Validity Evidence: An Instructive Example from PISA

David Pepper 《Educational Measurement》2020,39(4):8-20

The Standards for Educational and Psychological Testing identify several strands of validity evidence that may be needed as support for particular interpretations and uses of assessments. Yet assessment validation often does not seem guided by these Standards, with validations lacking a particular strand even when it appears relevant to an assessment. Consequently, the degree to which validity evidence supports the proposed interpretation and use of the assessment may be compromised. Guided by the Standards, this article presents an independent validation of OECD's PISA assessment of mathematical self-efficacy (MSE) as an instructive example of this issue. OECD identifies MSE as one of a number of “factors” explaining student performance in mathematics, thereby serving the “policy orientation” of PISA. However, this independent validation identifies significant shortcomings in the strands of validity evidence available to support this interpretation and use of the assessment. The article therefore demonstrates how the Standards can guide the planning of a validation to ensure it generates the validity evidence relevant to an interpretive argument, particularly for an international large-scale assessment such as PISA. The implication is that assessment validation could yet benefit from the Standards as what Zumbo calls “a global force for testing”.

14.

On Reliability, Validity, and Academic Achievement Testing

包威《黑龙江教育学院学报》2010,29(8):29-30

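Reliability and validity are two quality characteristics of achievement tests, and how to handle the relationship between them is a fundamental question in testing. After introducing the definitions of reliability and validity and the relationship between them, the paper analyzes reliability and validity in achievement testing and explains how to balance the two. It concludes that achievement testing is an effective means of measurement and will surely improve the quality of teaching.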

15.

Research on the Content Validity of the CET-4 Fast Reading Test

徐芝苹 辛苏 《海外英语》2011,(1):70-71,74

Content validity is an important part of language testing. In this paper, the content validity of the CET-4 fast reading test is analyzed in terms of expected response and text input. The results show that the content validity of the fast reading test is high, though some limitations are noted.

16.

New Developments in the Concept of Test Validity

谢小庆《考试研究》2013,(3):56-64

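Before the publication of the 1985 Standards for Educational and Psychological Testing (5th edition), the core concept of validity research was the "criterion," and validity research was seen as a process of using a criterion to verify a test's validity and to give a valid interpretation of test scores. After 1985, the core concept became "evidence," and validity research was seen as a process of accumulating evidence to support a test's validity and to give a reasonable interpretation of test scores. This understanding is most prominent in the 1999 Standards for Educational and Psychological Testing (6th edition). Educational Measurement, compiled jointly by the American Council on Education and the National Council on Measurement in Education, is known in the field as "the Bible of educational measurement." Since the publication of Educational Measurement (4th edition) in 2006, the core concept has evolved into the "warrant," and validity research is seen as a process of constructing "systems of warrants" and "networks of warrants" to build an argument for validity and to give a plausible interpretation of test scores. Drawing on the author's own testing practice, this paper introduces these new developments in the concept of validity.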

17.

A Study of the Validity of Civil Acts with Unlawful Content

周进军《怀化学院学报》2007,(2)

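As one mode of legal regulation, regulation through civil acts offers an important legal route to realizing party autonomy and the full development of the market economy. But the destructive nature of unlawful civil acts compels the law to regulate them, which makes evaluating the validity of such acts an important legal question. On this basis, the paper discusses the roots of validity evaluation for civil acts with unlawful content, a comparative-law exploration of such evaluation, and proposals for rules of validity evaluation.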

18.

A Study of the Validity of the CET-SET

景恒伟 马丽玲 《甘肃高师学报》2012,17(6):97-99

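This empirical study examines the validity of the CET-SET (the College English Test Spoken English Test for Bands 4 and 6). The results show that the construct validity of the CET-SET task types is still imperfect and that the test does not achieve a fit between test purpose and test results (Hughes, 1989), which confirms that the construct validity of the CET-SET is low. In light of these results, the authors propose suggestions and measures for improving the test's validity.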

19.

Evaluating Content-Related Validity Evidence Using a Text-Based Machine Learning Procedure

Daniel Anderson Brock Rowley Sondra Stegenga P. Shawn Irvin Joshua M. Rosenberg 《Educational Measurement》2020,39(4):53-64

Validity evidence based on test content is critical to meaningful interpretation of test scores. Within high-stakes testing and accountability frameworks, content-related validity evidence is typically gathered via alignment studies, with panels of experts providing qualitative judgments on the degree to which test items align with the representative content standards. Various summary statistics are then calculated (e.g., categorical concurrence, balance of representation) to aid in decision-making. In this paper, we propose an alternative approach for gathering content-related validity evidence that capitalizes on the overlap in vocabulary used in test items and the corresponding content standards, which we define as textual congruence. We use a text-based, machine learning model, specifically topic modeling, to identify clusters of related content within the standards. This model then serves as the basis from which items are evaluated. We illustrate our method by building a model from the Next Generation Science Standards, with textual congruence evaluated against items within the Oregon statewide alternate assessment. We discuss the utility of this approach as a source of triangulating and diagnostic information and show how visualizations can be used to evaluate the overall coverage of the content standards across the test items.

20.

An Evaluative Framework for Reviewing Fairness Standards and Practices in Educational Tests

Jessica L. Jonson Pamela Trantham Betty Jean Usher‐Tate 《Educational Measurement》2019,38(3):6-19

One of the substantive changes in the 2014 Standards for Educational and Psychological Testing was the elevation of fairness in testing as a foundational element of practice in addition to validity and reliability. Previous research indicates that testing practices often do not align with professional standards and guidelines. Therefore, to raise awareness of fairness concepts and principles from the 2014 Standards, this study aligned those standards with fairness practices, as documented in test manuals and on websites of 18 intelligence and achievement tests from different test publishers. A content analysis indicated that just under half of the fairness standards are frequently or occasionally practiced and those occurrences differed somewhat across tests but did not differ between intelligence and achievement tests or across publishers. To inform and encourage improvements in the future practice of the fairness standards, an evaluative framework along with example practices and related methodological scholarship is discussed.