1.

Validity issues in standard-setting studies

Hans A. Pant André A. Rupp Simon P. Tiffin-Richards Olaf Köller 《Studies in Educational Evaluation》2009,35(2-3):95-101

Standard-setting procedures are a key component within many large-scale educational assessment systems. They are consensual approaches in which committees of experts set cut-scores on continuous proficiency scales, which facilitate communication of proficiency distributions of students to a wide variety of stakeholders. This communicative function makes standard-setting studies a key gateway for validity concerns at the intersection of evidentiary and consequential aspects of score interpretations. This short review paper describes the conceptual and empirical basis of validity arguments for standard-setting procedures in light of recent research on validity theory. It specifically demonstrates how procedural and internal evidence for the validity of standard-setting procedures can be collected to form part of the consequential basis of validity evidence for test use.

2.

Assessing Learning in a Technology-Supported Genetics Environment: Evidential and Systemic Validity Issues

《Educational Assessment》2013,18(3):155-196

To evaluate student learning in a computer-supported environment known as GenScope™, we developed a system for assessing students' reasoning proficiency in introductory genetics. A critical aspect of the development effort concerned the validity of this assessment system. We used a variety of methods to address traditional evidential validity concerns as well as more contemporary concerns with consequential and systemic validity. Specifically, we examined whether or not our assessment system helped students develop the understanding it was designed to assess. Our inquiry revealed strong evidential validity but only limited consequential validity. In response, we developed a set of formative assessments designed to scaffold student assessment performance without compromising the evidential validity of the assessment system. In addition to documenting and enhancing the validity of the system, these efforts demonstrate the utility of newer interpretive models of validity inquiry and the value of Rasch measurement tools for conducting such inquiry.

3.

Educational and Employment Testing: Changing Concepts in Measurement and Policy

Wayne J. Camara Dianne C. Brown 《Educational Measurement》1995,14(1):5-11

How will the expansion of the concept of construct validity affect validation practice in employment testing? How does the need for consequential validity differ in educational and employment testing? How do the research bases differ for performance assessment in these settings? Are there parallel trends in policies for test use in education and industry?

4.

International large-scale assessments: what uses, what consequences?

Stefan Johansson 《Educational research; a review for teachers and all concerned with progress in education》2016,58(2):139-148

Abstract

Background: International large-scale assessments (ILSAs) are a much-debated phenomenon in education. Increasingly, their outcomes attract considerable media attention and influence educational policies in many jurisdictions worldwide. The relevance, uses and consequences of these assessments are often the focus of research scrutiny. Whilst some argue that the assessment outcomes provide an effective basis for informed policy-making, critics claim that the use of international assessment data can result in a range of unintended consequences, such as the shaping and governing of school systems ‘by numbers’.

Purpose: This article explores and analyses the arguments about the uses and consequences of ILSAs. In particular, the discourse about the assessments’ consequential validity will be discussed and evaluated.

Sources of evidence: Literature relating to the uses and consequences of large-scale assessment was analysed, with a focus on research on the consequential aspects of validity.

Main argument: Much research suggests that ILSAs have unintended consequences that affect and influence educational policy. However, the influences on educational policy are complex and interwoven: for example, it is not clear-cut whether effects such as converging curricula are, necessarily, direct consequences of large-scale assessments. Further, it is suggested that a beneficial consequence of large-scale assessment is the infrastructure they provide for studies in the social sciences, although caution must be applied to causal claims, in particular because of the cross-sectional design of the assessments.

Conclusions: The considerable literature discussing the uses and consequences of large-scale assessments tends to point out potential negative aspects of the studies. However, it is also apparent that large-scale international assessments can be a valuable resource for studying global trends and evolving systems in education. Despite the extensive debates around large-scale assessment outcomes both in the media and in educational policy arenas, empirical educational research all too often appears underused in the discussion.

5.

Consequential Validity From the Test Developer's Perspective

Mark D. Reckase 《Educational Measurement》1998,17(2):13-16

What would a test developer have to do in order to address the consequential aspects of validity? Is it, in fact, possible to meet the requirements implied from a consequential perspective on test validity?

6.

Strategies for Examining the Consequences of Assessment and Accountability Programs

Suzanne Lane Clement A. Stone 《Educational Measurement》2002,21(1):23-30

This article addresses issues in evaluating the consequences of assessment programs that are developed for the purpose of holding schools accountable to state standards. After providing a brief review of research examining consequential evidence, a validation study to obtain consequential evidence for state assessment and accountability programs is proposed. The proposal includes a validity argument, a set of propositions that follow from the validity argument, a delineation of the consequential evidence needed, and a way to model the relationship between performance gains and school, principal, teacher, and student variables.

7.

A review scrutinising the consequential validity of dynamic assessment

Marlous Tiekstra Alexander Minnaert Marco G.P. Hessels 《Educational Psychology》2016,36(1):112-137

This literature review explored whether dynamic assessment procedures in psycho-educational practice might bridge the well-known gap between diagnosis and intervention. Due to a learning phase included in the testing procedure, qualitative information about the child’s learning needs can be revealed by means of dynamic assessment. The question is, however, what the consequential validity, i.e. the extent to which assessment influences instructional and learning processes, of dynamic assessment procedures really is. The review of 31 articles that met the inclusion criteria showed that proximal consequential validity of dynamic assessment is warranted, but distal consequential validity is warranted to a lesser extent (e.g. some guidelines for practice). Furthermore, it can be noticed that motivational aspects never played an explicit role during learning phases. In order to design student-tailored interventions following dynamic assessment, there is a need for more explicitness of learning phases and types of feedback in the development of these instruments.

8.

The Centrality of Test Use and Consequences for Test Validity

Lorrie A. Shepard 《Educational Measurement》1997,16(2):5-24

What are the origins of consequential validity? What is the role of intended test use in validation? Is the study of unintended effects part of validation? What practical problems does this pose?

9.

Assessment of teacher competence using video portfolios: Reliability, construct validity, and consequential validity

Wilfried Admiraal Mark Hoeksma Marie-Thérèse van de Kamp Gee van Duin 《Teaching and Teacher Education》2011,27(6):1019-1028

The richness and complexity of video portfolios endanger both the reliability and validity of the assessment of teacher competencies. In a post-graduate teacher education program, the assessment of video portfolios was evaluated for its reliability, construct validity, and consequential validity. Although video portfolios facilitated a reliable and valid assessment of teacher competencies, procedures to improve assessment quality were also revealed and are therefore discussed: more explicit grounding of assessment results in the data, peer debriefing, prolonged engagement with the assessment data, and cross-checking to find confirmatory or counter examples.

10.

Investigating the Consequential Aspects of Validity: Who Is Responsible and What Should They Do?

Wendy M. Yen 《Educational Measurement》1998,17(2):5-5

Where do we stand today as the concept of the consequential aspect of validity gains maturity? What are the implications for the various stakeholders in the measurement enterprise?

11.

Standardized Teacher Testing Fails Excellence and Validity Tests

John P. Portelli R. Patrick Solomon Sarah Barrett Donatille Mujawamariya 《Teaching Education》2013,24(4):281-295

This paper presents and critically analyzes data from a project that sought teacher candidates' responses to the process and content of the Ontario Teacher Qualifying Test (OTQT), a mandatory, standardized, pencil and paper initial teacher qualification test. The aim of the project, guided by a critical democratic perspective, was to critically assess the success of the OTQT in achieving the government's stated objectives of greater competency and accountability. More specifically, the paper focuses on findings relating to teacher competency, with particular reference to criterion, ecological and consequential validity, and alternative assessment strategies. The teacher candidates' responses raise serious issues about the content and format of the test since they believe that it neither achieves accountability nor secures excellence in teaching. These concerns echo the major concerns found in the literature.

12.

Consequential Validity: A Practitioner's Perspective

Elizabeth Taleporos 《Educational Measurement》1998,17(2):20-23

What happens when philosophical and theoretical propositions meet the harsh realities of the nation's largest school district? How does consequential validity play out in the Big Apple?

13.

Quality standards for new modes of assessment. An exploratory study of the consequential validity of the OverAll Test

Mien Segers Sabine Dierick Filip Dochy 《European Journal of Psychology of Education - EJPE》2001,16(4):569-588

During the past decade, due to societal developments, methods of instruction as well as the assessment of students’ performances have changed to a considerable extent. Two of the elements of this change are the accents on cognitive competencies such as problem solving and on learning in an authentic context. In conjunction with the development of such learning methods, new modes of assessment were implemented. It was expected that this change would have positive feedback effects on learning and teaching. These feedback effects are the central issue of this article. They are discussed in terms of the experiences of the Maastricht School of Economics and Business Administration. This school places the analysis of authentic problems at the core of the curriculum, including the learning process as well as the assessment system. The OverAll Test, a case-based assessment instrument aiming to assess problem solving skills, was implemented as part of this. Different quality issues related to the OverAll Test have been evaluated. This article presents the results of one of the four validity studies conducted: an exploratory study of the consequential validity of the OverAll Test. It starts with an outline of the main features of the new modes of assessment and the OverAll Test as an example. There is then a discussion of how effectively the OverAll Test fits these features as well as the goals and characteristics of problem-based learning. The study of the consequential validity of the OverAll Test is then described in depth. The results of the survey, as well as the results of the semi-structured interviews with staff and students, indicate a friction between the intended characteristics of the learning and assessment environment and the practice of instruction and assessment.

14.

Validity considerations ensuing from examinees’ perceptions about high-stakes national examinations in Cyprus

Michalis P. Michaelides 《Assessment in Education: Principles, Policy & Practice》2014,21(4):427-441

Student examinees are key stakeholders in large-scale, high-stakes, public examination systems. How they perceive the purpose, comprehend the technical characteristics of testing and how they interpret scores influence their response to the system demands and their preparation for the examinations; this information relates to intended and unintended consequences of testing and is a component of an expanded notion of test validity. The research reported in this paper investigates examinees’ perceptions about the secondary school graduation and university-entrance national exams in Cyprus. Interviews with recent examinees reveal the versatility and complexity of their perceptions about the fairness and appropriateness of the system, which are influenced by design features of the exams and by the local context. There are important, mostly unintended, consequences on their in- and out-of-school experience, on school curricula and on instructional practices. Empirical evidence about consequential aspects of examinations contributes to the validity argument needed to support such programmes.

15.

Educational Assessment of the Post-Pandemic Age: Chinese Experiences and Trends Based on Large-Scale Online Learning

Hong Su 《Educational Measurement》2020,39(3):37-40

Owing to the outbreak of the COVID-19 pandemic, students have had to do more of their learning online than offline, and large-scale education assessment programs have had to be suspended or postponed. How could education assessment adapt to large-scale online learning? How could the effect and safety of online assessment be improved? What role should formative assessment play in student admissions? How could different assessment results be linked? Reflections on and trends of the Chinese experiences are presented in this article. Based on cross-cultural comparison research, the recommended measures are as follows: reviewing previous theories, improving existing methods continuously, and developing assessment techniques innovatively according to new application scenarios.

16.

Testing Injustice: Examining the Consequential Validity of edTPA

Nadia Behizadeh Adrian Neely 《Equity & Excellence in Education》2013,46(3-4):242-264

In this case study, we examine the consequential validity of using edTPA in a social justice-oriented, urban teacher preparation program. According to the developers of edTPA, a primary purpose is to support teacher candidate learning, yet our analysis suggests that edTPA does not support learning when used during student teaching. Our 16 participants, who are primarily teacher candidates of color and many first-generation college students, and who all passed edTPA, unanimously indicated that edTPA increased their mental and financial stress, which they linked to design elements including high stakes, standardization, and external scoring. Participants also critiqued the construct of teaching represented in edTPA, arguing that dispositions and a social justice orientation are missing and that edTPA is more about following procedures than supporting candidate learning. Moreover, edTPA encouraged inequitable practices, including focusing on high-achieving classes and selecting curricula based on scoring procedures instead of student need. Overall, our analysis indicates that there is not strong consequential validity evidence to support the use of edTPA as an assessment during student teaching, particularly in social justice-oriented programs, yet suggests edTPA could be a useful tool if stakes and proceduralism are reduced and scoring is conducted locally.

17.

Changes in secondary teachers' perceptions of barriers to portfolio assessment

《Assessing Writing》1999,6(1):85-105

Portfolio assessment has become a popular medium for merging classroom assessment with large-scale testing, but adoption of portfolios in the classroom for external assessment purposes may be difficult because the use of such portfolios may require changes in the curriculum, instruction, and assessments used by teachers. As a result, there are numerous potential barriers to the adoption of portfolios that can be used for large-scale assessment purposes. This study investigates how secondary teachers' perceptions of portfolio implementation barriers changed when teachers participated in a 1-year portfolio implementation effort. Survey results are analyzed with a Rasch rating scale model. Results suggest that teachers' apprehension about portfolio barriers increased slightly, but that this increase can be attributed to teachers with little portfolio experience. Furthermore, teachers' concerns about the amount of time required to develop and score portfolios increased substantially while concerns about the availability of resources and resistance from parents decreased.

18.

Teaching About Performance Assessment

Judy Arter 《Educational Measurement》1999,18(2):30-44

How should we teach prospective teachers about performance assessment? What are the issues and concerns that new teachers will encounter as they begin their teaching careers? How can assessment and instruction be better integrated in classrooms?

19.

Developing and assessing beginning teacher effectiveness: the potential of performance assessments

Linda Darling-Hammond Stephen P. Newton Ruth Chung Wei 《Educational Assessment, Evaluation and Accountability》2013,25(3):179-204

The Performance Assessment for California Teachers (PACT) is an authentic tool for evaluating prospective teachers by examining their abilities to plan, teach, assess, and reflect on instruction in actual classroom practice. The PACT seeks both to measure and develop teacher effectiveness, and this study of its predictive and consequential validity provides information on how well it achieves these goals. The research finds that teacher candidates’ PACT scores are significant predictors of their later teaching effectiveness as measured by their students’ achievement gains in both English language arts (ELA) and mathematics. Several subscales of the PACT are also influential in predicting later effectiveness: These include planning, assessment, and academic language development in ELA, and assessment and reflection in mathematics. In addition, large majorities of PACT candidates report that they acquired additional knowledge and skills for teaching by virtue of completing the assessment. Candidates’ feelings that they learned from the assessment were the strongest when they also felt well-supported by their program in learning to teach and in completing the assessment process.

20.

Assessment of prior learning in higher education: a review from a validity perspective

Tova Stenlund 《Assessment & Evaluation in Higher Education》2010,35(7):783-797

The process of giving official acknowledgment to formal, informal and non‐formal prior learning is commonly labelled as assessment, accreditation or recognition of prior learning (APL), representing a practice that is expanding in higher education in many countries. This paper focuses specifically on the assessment part of APL, which undoubtedly is central to the whole process, through a review of research in this area and an analysis of the reviewed studies from a validity perspective. The research reviewed (published 1990–2007) is categorised into empirical as well as more theoretically oriented publications, with a quantitative dominance of the latter. According to the validity analysis, a majority of the studies conducted in this area relate to the evidential basis of test interpretation and use, primarily providing theoretical rationales and theories for a variety of practices. The consequential basis of test interpretation and use has not been studied to any larger extent, resulting in a lack of both theoretical and empirical studies dealing with this aspect of validity.