Similar articles
1.
This study explored differences in test anxiety on high‐stakes standardized achievement testing and low‐stakes testing among elementary school children. This is the first study to directly examine differences in young students’ reported test anxiety between No Child Left Behind (NCLB) achievement testing and classroom testing. Three hundred thirty‐five students in Grades 3 through 5 participated in the study. Students completed assessments of test anxiety following NCLB testing and typical classroom testing. Students reported significantly more overall test anxiety in relation to high‐stakes testing versus classroom testing on two measures of test anxiety, effect sizes r = −.21 and r = −.10. Students also reported significantly more cognitive (r = −.20) and physiological (r = −.24) symptoms of test anxiety in relation to high‐stakes testing. This study adds to the test anxiety literature by demonstrating that students experience heightened anxiety in response to NCLB testing.

2.
This study analyzed the relationship between benchmark scores from two curriculum‐based measurement probes in mathematics (M‐CBM) and student performance on a state‐mandated high‐stakes test. Participants were 298 students enrolled in grades 7 and 8 in a rural southeastern school. Specifically, we calculated the criterion‐related and predictive validity of benchmark scores from CBM probes measuring math computation and math reasoning skills. Results of this study suggest that math reasoning probes have strong concurrent and predictive validity. The study also provides evidence that calculation skills, while important, do not have strong predictive strength at the secondary level when a state math assessment is the criterion. When reading comprehension skill is taken into account, math reasoning scores explained the greatest amount of variance in the criterion measure. Computation scores explained less than 5% of the variance in the high‐stakes test, suggesting that computation probes may have limitations as a universal screening measure for secondary students.

3.
Curriculum‐Based Measurement silent reading (CBM‐SR) items have been found to be reliable and valid for measuring reading comprehension skills. This generalizability study reports the findings from administration of three CBM‐SR passages to fifth through eighth grade students in one school district. Using Repeated Measures Analyses of Variance (RMANOVA) procedures, the statistical probability of performance on the CBM‐SR task as a differential indicator of reading comprehension skill was found to be significant among students in different grade levels and between students who did and did not receive special education services. Follow‐up analyses were conducted using generalizability theory to estimate the amount of variance in CBM‐SR scores from individual score differences, grade levels, and special education status. The results indicated that on two of the passages, variability in CBM‐SR scores came primarily from grade level differences in scores on the tasks, while on the third passage, the differences were most attributable to individual differences in scores, regardless of grade level or special education services. Implications for the use of CBM‐SR items for routine assessment of students' reading skills are discussed. © 2003 Wiley Periodicals, Inc. Psychol Schs 40: 363–377, 2003.

4.
5.
This study explored the relationships among formative curriculum‐based measures of reading (CBM‐R), student engagement as an extra‐academic indicator of student motivation, and summative performance on a high‐stakes reading assessment. A diverse sample of third‐, fourth‐, and fifth‐grade students and their teachers responded to questionnaires about student engagement in academic tasks. These questionnaires were collected at about the same time as fall CBM‐R oral reading fluency and maze screening tasks. Results indicated that fall student and teacher reports of engagement and a composite score of reading competence derived from CBM‐R screening tests uniquely predicted student performance on year‐end standardized reading tests. Profile analyses indicated that student engagement was associated with better reading performance among students with low competence, suggesting that engagement may be particularly important for increasing the performance of struggling readers. Implications for interventions targeting both student motivation and reading skill development are discussed.

6.
Two hundred and two (n = 202) sixth‐grade students in social studies were administered a weekly vocabulary‐matching curriculum‐based measure (CBM) for 35 weeks. Students were also administered the Scholastic Reading Inventory (SRI), along with the annual state high‐stakes test in Communication Arts. CBM scores were analyzed with respect to alternate form reliability, validity with criterion measures, and student growth over time. Results suggest that the vocabulary‐matching CBM is reliable and valid with the SRI but not with the state test. Students showed an overall linear trend of growth, but this growth was flat in the middle of the semester. Implications for research and practice are discussed.

7.
8.
High‐stakes mathematics assessments require students to write about mathematics, although research suggests students exhibit limited proficiency on such assessments. Students with LD may have difficulties with writing, mathematics, or both. Researchers employed an intervention for teaching students how to organize mathematics writing (MW). Researchers randomly assigned participants (n = 61) in grades 3–5 to receive instruction in MW or information writing. Students receiving MW outperformed control students on a researcher‐developed measure of MW (d = 1.05). Component assessment revealed MW students improved in writing organization (d = 1.49) but not in mathematics content (d = 0.11, ns). Results also indicated MW students outperformed control students on percentage of correct MW sequences (d = 0.82). Future directions for MW intervention development are discussed.

9.
Proponents of performance assessments purport that they allow more options for student choice and autonomy and, therefore, are more motivating and more preferred by students. This study explored the role of stakes and the student’s familiarity with the format in these examination preferences. A survey of 148 college students suggested that: their familiarity with open-ended assessments led students to prefer them without reference to stakes; they tended to prefer closed assessments when the stakes are low but open formats when the stakes are high; and their goal orientation had no relationship with these decisions. In the end, students seemed to be rather pragmatic and protective of their grades regardless of their goal orientations, which is only natural. The students’ goal orientations appear, then, to be desiderata for both them and their instructors, but a graded environment stifles their full operation.

10.
This study examined the potential influence of test‐based accountability policies on school environment and teacher stress among early elementary teachers. Structural equation modeling of data from 541 kindergarten through second grade teachers across three states found that use of student performance on high‐stakes tests to evaluate teachers was indirectly related to teachers’ professional investment via test stress in the environment. Although students in kindergarten through second grade do not take high‐stakes assessments, early elementary teachers reported high levels of stress associated with test‐based accountability policies. This study provides data across multiple states suggesting that test‐based accountability policies may have negative influences on school environment and teacher stress among early elementary teachers. Implications for practice and research are discussed.

11.
This study investigated the accuracy of classroom teachers' judgments of the reading progress of their low‐performing students. Participants were 36 second grade teachers and students in their lowest reading groups (n = 150). Student progress was monitored weekly using reading‐curriculum‐based measurement (R‐CBM) procedures. After 6 weeks, teachers were asked to rate their students' progress. Expert judges later reviewed the teachers' R‐CBM graphs and rated the individual and group progress based on the graphs. Teacher ratings did not correlate with expert ratings or the R‐CBM slope estimates. Expert ratings correlated highly with slope estimates. Teachers' estimates of progress were significantly higher than expert judges' ratings, indicating that teachers may overestimate student progress. Implications for practice and future research are discussed. © 2008 Wiley Periodicals, Inc.

12.
In this study, we examined the reliability and validity of curriculum‐based measures (CBM) in reading for indexing the performance of secondary‐school students. Participants were 236 eighth‐grade students (134 females and 102 males) in the classrooms of 17 English teachers. Students completed 1‐, 2‐, and 3‐minute reading aloud and 2‐, 3‐, and 4‐minute maze selection tasks. The relation between performance on the CBMs and the state reading test was examined. Results revealed that both reading aloud and maze selection were reliable and valid predictors of performance on the state standards tests, with validity coefficients above .70. An exploratory follow‐up study was conducted in which the growth curves produced by the reading‐aloud and maze‐selection measures were compared for a subset of 31 students from the original study. For these 31 students, maze selection reflected change over time whereas reading aloud did not. This pattern of results was found for both lower‐ and higher‐performing students. Results suggest that it is important to consider both performance and progress when examining the technical adequacy of CBMs. Implications for the use of measures with secondary‐level students for progress monitoring are discussed.

13.
Differences in oral reading curriculum‐based measurement (R‐CBM) slopes based on two commonly used progress monitoring practices in field‐based data were compared in this study. Semester‐specific R‐CBM slopes were calculated for 150 Grade 1 and 2 students who completed benchmark (i.e., 3 R‐CBM probes collected 3 times per year) and strategic (i.e., one R‐CBM probe collected monthly) assessments. Slopes based on two adjacent benchmark assessments were positively correlated with slopes based on three monthly strategic assessments in the spring semester of Grade 1 but not in either Grade 2 semester, and significant differences were found between the slopes in all semesters. Consistent with another study showing that slopes are overestimated when single probes are administered per occasion, slopes were larger when based on strategic versus benchmark data in the current study, and the average discrepancies between slopes were greater‐than‐expected growth rates in all semesters. The current findings, based on field‐based data, illustrate the impact of variations in commonly used progress monitoring procedures on the precision of calculated slope estimates.

14.
States use standards‐based English language proficiency (ELP) assessments to inform relatively high‐stakes decisions for English learner (EL) students. Results from these assessments are one of the primary criteria used to determine EL students’ level of ELP and readiness for reclassification. The results are also used to evaluate the effectiveness of and funding allocation to district or school programs that serve EL students. In an effort to provide empirical validity evidence for such important uses of ELP assessments, this study focused on examining the constructs of ELP assessments as a fundamental validity issue. Particularly, the study examined the types of language proficiency measured in three sample states’ ELP assessments and the relationship between each type of language proficiency and content assessment performance. The results revealed notable variation in the presence of academic and social language in the three ELP assessments. A series of hierarchical linear modeling (HLM) analyses also revealed varied relationships among social language proficiency, academic language proficiency, and content assessment performance. The findings highlight the importance of examining the constructs of ELP assessments for making appropriate interpretations and decisions based on the assessment scores for EL students. Implications for policy and practice are discussed.

15.
Science education needs valid, authentic, and efficient assessments. Many typical science assessments primarily measure recall of isolated information. This paper reports on the validation of assessments that measure knowledge integration ability among middle school and high school students. The assessments were administered to 18,729 students in five states. Rasch analyses of the assessments demonstrated satisfactory item fit, item difficulty, test reliability, and person reliability. The study showed that, when appropriately designed, knowledge integration assessments can be balanced between validity and reliability, authenticity and generalizability, and instructional sensitivity and technical quality. Results also showed that, when paired with multiple‐choice items and scored with an effective scoring rubric, constructed‐response items can achieve high reliabilities. Analyses showed that English language learner status and computer use significantly impacted students' science knowledge integration abilities. Students who took the assessment online, which matched the format of content delivery, performed significantly better than students who took the paper‐and‐pencil version. Implications and future directions of research are noted, including refining curriculum materials to meet the needs of diverse students and expanding the range of topics measured by knowledge integration assessments. © 2011 Wiley Periodicals, Inc. J Res Sci Teach 48: 1079–1107, 2011

16.
Changes to federal guidelines for the identification of children with disabilities have supported the use of multi‐tiered models of service delivery. This study investigated the impact of measurement methodology as used across numerous tiers in determining special education eligibility. Four studies were completed using a sample of inner‐city children (N = 150) who were administered a reading screener twice and a reading measure adapted from the state high‐stakes reading test. A sub‐sample of children identified as At‐Risk were administered a comprehensive reading assessment and compared with a randomly selected control group, who were also administered a comprehensive reading assessment (n = 14). A model was developed to estimate the likelihood of special education eligibility based on both theoretical and empirical measurement parameters. Special education eligibility outcomes varied from a low of 0.2% to as high as 11%, depending on the measurement assumptions of the multi‐tiered model, including the type of measure used, the decision‐making criteria used at each tier, and the number of tiers in the model. This study highlights the importance of measurement specification, explicit decision‐making criteria, and empirical investigation to fully understand outcomes associated with the implementation of multi‐tiered models. Implications for special education eligibility policy and practical implications for implementing comprehensive measurement practice in multi‐tiered systems at the school level are discussed. © 2012 Wiley Periodicals, Inc.

17.
The rise of computer‐based testing has brought with it the capability to measure more aspects of a test event than simply the answers selected or constructed by the test taker. One behavior that has drawn much research interest is the time test takers spend responding to individual multiple‐choice items. In particular, very short response time (termed rapid guessing) has been shown to indicate disengaged test taking, regardless of whether it occurs in high‐stakes or low‐stakes testing contexts. This article examines rapid‐guessing behavior: its theoretical conceptualization and underlying assumptions, methods for identifying it, misconceptions regarding its dynamics, and the contextual requirements for its proper interpretation. It is argued that because it does not reflect what a test taker knows and can do, a rapid guess to an item represents a choice by the test taker to momentarily opt out of being measured. As a result, rapid guessing tends to negatively distort scores and thereby diminish validity. Therefore, because rapid guesses do not contribute to measurement, it makes little sense to include them in scoring.

18.
Rater training is an important part of developing and conducting large‐scale constructed‐response assessments. As part of this process, candidate raters have to pass a certification test to confirm that they are able to score consistently and accurately before they begin scoring operationally. Moreover, many assessment programs require raters to pass a calibration test before every scoring shift. To support the high‐stakes decisions made on the basis of rater certification tests, a psychometric approach for their development, analysis, and use is proposed. The circumstances and uses of these tests suggest that they are expected to have relatively low reliability. This expectation is supported by empirical data. Implications for the development and use of these tests to ensure their quality are discussed.

19.
Contemporary educational accountability systems, including state‐level systems prescribed under No Child Left Behind as well as those envisioned under the “Race to the Top” comprehensive assessment competition, rely on school‐level summaries of student test scores. The precision of these score summaries is almost always evaluated using models that ignore the classroom‐level clustering of students within schools. This paper reports balanced and unbalanced generalizability analyses investigating the consequences of ignoring variation at the level of classrooms within schools when analyzing the reliability of such school‐level accountability measures. Results show that the reliability of school means cannot be determined accurately when classroom‐level effects are ignored. Failure to take between‐classroom variance into account biases generalizability (G) coefficient estimates downward and standard errors (SEs) upward if classroom‐level effects are regarded as fixed, and biases G‐coefficient estimates upward and SEs downward if they are regarded as random. These biases become more severe as the difference between the school‐level intraclass correlation (ICC) and the class‐level ICC increases. School‐accountability systems should be designed so that classroom (or teacher) level variation can be taken into consideration when quantifying the precision of school rankings, and statistical models for school mean score reliability should incorporate this information.

20.
The present study focused on CBM written language procedures by conducting an investigation of the developmental, gender, and practical considerations surrounding three categories of CBM written language scoring indices: production‐dependent, production‐independent, and accurate‐production. Students in first through eighth grade generated a three‐minute writing sample in the fall and spring of the school year using standard CBM procedures. The writing samples were scored using all three types of scoring indices to assess trends in the indices for students of varying ages and genders, and to assess the time required to score writing samples using the various indices. With only one exception, older students outperformed younger students on all of the scoring indices. Although at the middle school level students' levels of writing fluency and writing accuracy were not closely associated, at the younger grade levels the CBM indices were significantly related. With regard to gender differences, girls outperformed boys on measures of writing fluency at all grade levels. The average scoring time per writing sample ranged from 1½ to 2½ minutes (depending on grade level). © 2003 Wiley Periodicals, Inc. Psychol Schs 40: 379–390, 2003.
