1.

Grade stability in a criterion‐referenced grading system: the Swedish example

Christina Wikström 《Assessment in Education: Principles, Policy & Practice》2005,12(2):125-144

This study investigates empirically the mechanisms behind the increasing grade point averages in Swedish upper secondary schools. Four hypotheses are presented as plausible explanations: improved student achievements, student selection effects, strategic behaviour in course choices, and lowering of grading standards. The analysis is based on extensive data, and focuses on grades and test scores from upper secondary school graduates over a 6‐year period. The result shows that the increase in grade point averages cannot be explained by better achievements, selection effects or course choices, which means that standards have been lowered; this is interpreted here as grade inflation. The grade inflation is most likely an effect of the leniency in the grading system in combination with pressure for high grading, related to the upper secondary school grades’ function as an instrument for selection to higher education.

2.

THE ASSOCIATION AMONG STUDENT SUCCESS IN COURSES, PLACEMENT TEST SCORES, STUDENT BACKGROUND DATA, AND INSTRUCTOR GRADING PRACTICES

William B. Armstrong 《Community College Journal of Research & Practice》2013,37(8):681-695

Growth in the use of testing to determine student eligibility for community college courses has prompted debate and litigation over the equity, access, and legal implications of these practices. In California, this has resulted in state regulations requiring that community colleges provide predictive validity evidence of test-score-based inferences and course prerequisites. In addition, companion measures that supplement placement test scores must be used for placement purposes. However, for both theoretical and technical reasons the predictive validity coefficients between placement test scores and final grades or retention in a course generally demonstrate a weak relationship. The study discussed in this article examined the predictive validity of placement test scores with course grade and retention in English and mathematics courses. The investigation produced a model to explain variance in course outcomes using test scores, student background data, and instructor differences in grading practices. The model produced suggests that student dispositional characteristics explain a high proportion of variance in the dependent variables. Including instructor grading practices in the model adds significantly to the explanatory power and suggests that grading variations make accurate placement more problematic. This investigation underscores the importance of academic standards as something imposed on students by an institution and not something determined by the entering abilities of students.

3.

Dimensions of Norm-Referenced Compulsory School Grades and their Relative Importance for the Prediction of Upper Secondary School Grades

Cecilia Thorsen 《Scandinavian Journal of Educational Research》2014,58(2):127-146

Irrespective of the grading system, grades are the most valid instrument for predicting educational success. Previous studies have shown that criterion-referenced compulsory school grades are multidimensional, reflecting subject-specific dimensions and a common grade dimension, both of which contribute to the predictive validity of grades. This suggests that in addition to knowledge and skills, grades reflect other aspects which might have importance for the prediction of educational success. The purpose of this study was to investigate, using structural equation modeling, whether norm-referenced compulsory school grades display similar patterns of dimensionality and predictive validity to criterion-referenced grades. Possible differences due to gender and parents' education were considered. Participants were 3855 students born in 1972. The results showed that norm-referenced grades are multidimensional, and that both the subject-specific and common grade dimensions contribute to predicting educational success. In the common grade dimension, girls and students with higher educational backgrounds were favored.

4.

The effects of college grade adjustments on the predictive validity and utility of SAT scores

Dana Keller James Crouse Dale Trusheim 《Research in higher education》1994,35(2):195-208

Regressing adjusted grade-point averages on freshman SAT scores and high school grade-point averages results in large increases in the incremental predictive validity of the SAT. Even so, the SAT still changes no more than a small proportion of admissions decisions and does not result in substantively important increases in freshman grades. The test does, however, change the composition of the freshman class by altering acceptances to some major areas of study and by limiting the access of women and blacks.

5.

Differential Prediction of Study Success Across Academic Programs in the Swedish Context: The Validity of Grades and Tests as Selection Instruments for Higher Education

Christina Cliffordson 《Educational Assessment》2013,18(1):56-75

The purpose of the study is to investigate the predictive validity of criterion- and norm-referenced grades and the Swedish Scholastic Aptitude Test (SweSAT) and, in particular, possible differences in the prediction of achievement in higher education across academic programs. The analyses were based on credit points obtained by 164,106 Swedish students during the years 1993 to 2001. Two-level modeling with randomly varying slopes with academic program as cluster variable was used. The results provide means and variances of the slopes across the different programs. Variability in the slopes because of program subject area was also investigated. The results indicate that the validity of grades, irrespective of grading system, is stronger in comparison with SweSAT scores. The results also indicate considerable differences in predictive power across programs for the SweSAT, whereas there are much smaller differences for norm-referenced grades and relatively modest differences for criterion-referenced grades. The impact of program subject area on the variability of prediction was substantial for SweSAT scores.

6.

Gender Bias in the Prediction of College Course Performance

Robert L. McCornack Mary M. McLeod 《Journal of Educational Measurement》1988,25(4):321-331

Is the relationship of college grades to the traditional predictors of aptitude test scores and high school grades different for men and women? The usual gender bias of underpredicting the grade point averages of women may result from gender-related course selection effects. This study controlled course selection effects by predicting single course grades rather than a composite grade from several courses. In most of the large introductory courses studied, no gender bias was found that would hold up on cross-validation in a subsequent semester. Usually, it was counterproductive to adjust grade predictions according to gender. Grade point average was predicted more accurately than single course grades.

7.

Predicting Freshman Grade‐Point Average from Test Scores: Effects of Variation Within and Between High Schools

D. Koretz M. Langi 《Educational Measurement》2018,37(2):9-19

Most studies predicting college performance from high‐school grade point average (HSGPA) and college admissions test scores use single‐level regression models that conflate relationships within and between high schools. Because grading standards vary among high schools, these relationships are likely to differ within and between schools. We used two‐level regression models to predict freshman grade point average from HSGPA and scores on both college admissions and state tests. When HSGPA and scores are considered together, HSGPA predicts more strongly within high schools than between, as expected in the light of variations in grading standards. In contrast, test scores, particularly mathematics scores, predict more strongly between schools than within. Within‐school variation in mathematics scores has no net predictive value, but between‐school variation is substantially predictive. Whereas other studies have shown that adding test scores to HSGPA yields only a minor improvement in aggregate prediction, our findings suggest that a potentially more important effect of admissions tests is statistical moderation, that is, partially offsetting differences in grading standards across high schools.

8.

Does grade inflation affect the reliability of grades?

Jason Millman Simeon P. Slovacek Edward Kulick Karen J. Mitchell 《Research in higher education》1983,19(4):423-429

Two studies were conducted to examine the effect of grade inflation on the piling up of grades in fewer grade categories and on the reliability of grade point averages (GPAs). In all comparisons, grades were more bunched after grade inflation, which in turn, was associated with only slight, nonsignificant decreases in GPA reliability. As expected, grades were more bunched when the traditional 5-point letter scale was used than when plus and minus grades were also allowed. In the latter case as well, grade inflation seemed to have had very little effect on the reliability of GPAs. GPA reliability began to suffer, however, for graduate programs in which almost all grades were placed into just two categories, A and B.

9.

Predicting College Performance of American Indians: A Large‐Sample Examination of the SAT

Siwen Shu Nathan R. Kuncel Paul R. Sackett 《Educational Measurement》2017,36(2):24-33

Extensive research has examined the validity and fairness of standardized tests in academic admissions. However, due to their underrepresentation in higher education, American Indians have gained much less attention in this research. In the present study, we examined for American Indian students (1) group differences on SAT scores, (2) the predictive and incremental validity of SAT over high school grades, (3) the effect of socioeconomic status on SAT validity, (4) differential prediction in the use of SAT scores, and (5) potential omitted variables that could explain differential prediction for American Indian students. Results provided evidence of predictive and incremental validity of SAT scores, and the validity of SAT scores was largely independent of socioeconomic status. Overprediction was found when using SAT scores to predict college performance and it was reduced when including high school grades as an additional predictor. This study provides substantial evidence of the validity and fairness of SAT scores for American Indians.

10.

Grades and Test Scores: Accounting for Observed Differences

Warren W. Willingham Judith M. Pollack Charles Lewis 《Journal of Educational Measurement》2002,39(1):1-37

Why do grades and test scores often differ? A framework of possible differences is proposed in this article. An approximation of the framework was tested with data on 8,454 high school seniors from the National Education Longitudinal Study. Individual and group differences in grade versus test performance were substantially reduced by focusing the two measures on similar academic subjects, correcting for grading variations and unreliability, and adding teacher ratings and other information about students. Concurrent prediction of high school average was thus increased from 0.62 to 0.90; differential prediction in eight subgroups was reduced to 0.02 letter‐grades. Grading variation was a major source of discrepancy between grades and test scores. Other major sources were teacher ratings and Scholastic Engagement, a promising organizing principle for understanding student achievement. Engagement was defined by three types of observable behavior: employing school skills, demonstrating initiative, and avoiding competing activities. While groups varied in average achievement, group performance was generally similar on grades and tests. Major factors in achievement were similarly constituted and similarly related from group to group. Differences between grades and tests give these measures complementary strengths in high‐stakes assessment. If artifactual differences between the two measures are not corrected, common statistical estimates of validity and fairness are unduly conservative.

11.

POSTDICTION STUDY OF THE GRADUATE RECORD EXAMINATION AND EIGHT SEMESTERS OF COLLEGE GRADES

LLOYD G. HUMPHREYS THOMAS TABER 《Journal of Educational Measurement》1973,10(3):179-184

Data from a postdictive study of the tests of the Graduate Record Examination and the eight semesters of undergraduate grade averages, each semester's average being computed independently of the rest, are presented. Postdictive validities of the aptitude portions of the GRE are essentially similar to predictive validities obtained earlier by the senior author. Both predictive and postdictive validity gradients over the eight semesters are relatively steep, with freshman grades having the highest correlations with the tests. The validity gradient for all advanced tests combined does not follow the pattern for the aptitude tests, but neither does it show the opposite gradient. Advanced test results are most highly correlated with sophomore grades, but the validity gradient over the eight semesters is relatively flat. A small scale extension of this research into post baccalaureate training indicated that senior grades were most predictive of graduate criteria, but a larger scale study is clearly called for. Possible implications for ability theory and for selection of graduate students are discussed.

12.

Middle school students’ attitudes toward physical education

Prithwi Raj Subramaniam Stephen Silverman 《Teaching and Teacher Education》2007

The purpose of this study was to determine middle school students’ attitudes toward physical education using an attitude instrument grounded in attitude theory. In addition, this investigation also sought to ascertain if gender and grade level influence student attitudes toward the subject matter. Participants for this study were 995 students from grades 6 to 8. A previously validated attitude instrument based on a two-component view of attitude with scores that showed evidence of reliability and validity was used. Overall, all students had moderately positive attitudes toward physical education. There was, however, a decline in attitude scores as students progressed in grade level. Higher grades had lower mean scores.

13.

Factors influencing academic success and retention following a 1st-year post-secondary success course

Deborah J. Kennett Maureen J. Reed 《Educational Research and Evaluation》2013,19(2):153-166

Research has found that grades are the most valid instruments for predicting educational success. Why grades have better predictive validity than, for example, standardized tests is not yet fully understood. One possible explanation is that grades reflect not only subject-specific knowledge and skills but also individual differences in other aspects. The purpose was to investigate the relative importance of knowledge and skills and other aspects encapsulated in grades for the predictive validity of compulsory school grades for educational success in upper secondary school. Structural equation modelling was used. Participants were 9th-grade students from 3 birth cohorts, each comprising full populations of approximately 100,000 students. The results showed that the subject-specific factors and an additional common grade factor contributed to the predictive validity. Effects of gender and parents' education were found in the common grade factor, with girls and students with a lower educational background being advantaged.

14.

Monitoring the University Admission Process in Spain

Anna Cuxart I Jardí Nicholas T. Longford 《Higher Education in Europe》1998,23(3):385-396

The examinations taken by graduating high school students in Spain and the role of the results of these examinations in the university admissions process are described. Several issues related to the equity of the system are evoked: reliability of grading, comparability of grades and scores (equating), maintenance of standards, and compilation and use of composite scores. Studies to assess the reliability of graders and the impact of various types of imperfections in the grading system are proposed. Various schemes for score adjustment are reviewed, and the feasibility of their implementation, discussed. The advantages of pretesting items and of empirical checks of the judgments of experts are pointed out. The article concludes with an outline of a planned reorganization of higher education in Spain and with a call for a comprehensive programme of empirical research concurrent with the operation of the examination and scoring system.

15.

Are validity coefficients understated due to correctable defects in the GPA?

John W. Young 《Research in higher education》1990,31(4):319-325

The predictive validity of preadmissions measures such as standardized test scores and high school grades may be understated because of correctable defects in both the freshman year and cumulative grade point average (GPA). Measurement error in the criterion artificially depresses the size of observed validity coefficients. A study was conducted using item response theory (IRT) to develop a more reliable measure of performance, called an IRT-based GPA, and tested in a predictive validity study using data from Stanford University. Results indicate increased predictability when the IRT-based GPA is compared with the usual GPA. This article is based, in part, on the doctoral dissertation of the author, which was completed at the School of Education at Stanford University.
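
The claim in entry 15 that measurement error in the criterion depresses observed validity coefficients can be illustrated with the classical correction for attenuation from classical test theory. This is only a sketch of that textbook formula, not the IRT-based GPA method the article describes; the function name `disattenuate` and the numbers in the usage note are hypothetical.

```python
import math


def disattenuate(r_observed, criterion_reliability, predictor_reliability=1.0):
    """Classical correction for attenuation (illustrative, hypothetical helper).

    r_corrected = r_observed / sqrt(rel_criterion * rel_predictor)

    Leaving predictor_reliability at 1.0 corrects for unreliability
    in the criterion (for example, GPA) only.
    """
    for rel in (criterion_reliability, predictor_reliability):
        if not 0.0 < rel <= 1.0:
            raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / math.sqrt(criterion_reliability * predictor_reliability)
```

For example, with an assumed observed validity of 0.40 and an assumed GPA reliability of 0.64, `disattenuate(0.40, 0.64)` gives roughly 0.50: the less reliable the criterion, the more the observed coefficient understates the underlying relationship.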

16.

Inter-subject comparability of examination standards in GCSE and GCE in England

Qingping He Ian Stockford Michelle Meadows 《Oxford Review of Education》2018,44(4):494-513

Results from Rasch analysis of GCSE and GCE A level data over a period of four years suggest that the standards of examinations in different subjects are not consistent in terms of the levels of the latent trait specified in the Rasch model required to achieve the same grades. Variability in statistical standards between subjects exists at both individual grade level and the overall subject level. Findings from this study are generally consistent with those from previous studies using similar statistical models. It has been demonstrated that the alignment of statistical standards between subjects based on the Rasch model would likely result in substantial change in performance standards of the examinations for some subjects evidenced here by significant changes in grade boundary scores and grade outcomes. It is argued that the defined purposes of GCSE and A level qualifications determine how their results should be interpreted and reported and that the existing grading and results reporting procedures are appropriate for supporting these purposes.

17.

Supplemental Instruction: Understanding Academic Assistance in Underrepresented Groups

Erin M. Buchanan Kathrene D. Valentine Michael L. Frizell 《Journal of Experimental Education》2019,87(2):288-298

Student retention rates are increasingly important in higher education. Higher education institutions have adopted various programs in the hopes of increasing graduation rates and grade point averages (GPAs). One of the most effective attempts at improvement has been the Supplemental Instruction (SI) program. We examined our SI program relative to three facets: attendance, attendance's influence on final scores, and graduation rates for students who had participated in these courses. These questions were also investigated focusing on specific comparison groups, as we looked into how these effects differed for minority students and nontraditional students compared with those of White and traditional peers. Overall, SI attendance led to positive outcomes (increased final course grades and graduation rates) even after adjusting for previous achievement.

18.

Establishing an Early Warning System: Predicting Low Grades in College Students from Survey of Academic Orientations Scores

Hall P. Beck William D. Davidson 《Research in higher education》2001,42(6):709-723

Counselors, faculty, and student personnel specialists are often unaware that college students are experiencing serious academic or adjustment difficulties until it is too late to rectify the problem. Most universities would benefit from an early warning system that detects at-risk students before performance or social problems jeopardize their college careers. This investigation demonstrated that scores from the Survey of Academic Orientations (SAO) were predictive of first-semester freshmen grades. Subsequent analysis showed that the SAO significantly improved the prediction of grade point averages, after taking the effects of Scholastic Assessment Test scores and high school percentage rank into consideration. The SAO gives educators a new early warning device, a way to identify those undergraduates most at risk of receiving poor grades. The next steps in the research process are to: (1) assess the relationship of SAO scores to other important academic indexes, such as retention and student stress, and (2) determine if furnishing counselors and other college personnel with SAO scores is of therapeutic value.

19.

Grading as a Reform Effort: Do Standards‐Based Grades Converge With Test Scores?

Megan E. Welsh Jerome V. D'Agostino Burcu Kaniskan 《Educational Measurement》2013,32(2):26-36

Standards‐based progress reports (SBPRs) require teachers to grade students using the performance levels reported by state tests and are an increasingly popular report card format. They may help to increase teacher familiarity with state standards, encourage teachers to exclude nonacademic factors from grades, and/or improve communication with parents. The current study examines the SBPR grade–state test score correspondence observed across 2 years in 125 third and fifth grade classrooms located in one school district to examine the degree of consistency between grades and state test results. It also examines the grading practices of a subset of 37 teachers to determine whether there is an association between teacher appraisal style and convergence rates. A moderate degree of grade–test score convergence was observed using three agreement estimates (coefficient kappa, tau‐b correlations, and classroom‐level mean differences between grades and test scores). In addition, only small amounts of grade–test score convergence were observed between teachers; a much greater proportion of variance lay within classrooms and subjects. Appraisal style correlated weakly with convergence rates, but was most strongly related to assigning students to the same performance level as the test. Therefore using recommended grading practices may improve the quality of SBPR grades to some extent.

20.

A new science and engineering career interest survey for middle school students

Edward P. Donovan Robert H. Fronk Phillip B. Horton 《Journal of Research in Science Teaching》1985,22(1):19-30

This study describes the development and validation of a science and engineering (S/E) career interest survey (CIS). This 56-question survey was developed to measure the overall S/E career interests of 7th through 9th grade students. In the CIS, a S/E career is characterized as one which requires the completion of at least a four-year college program with a major in science, science education, or engineering. The CIS is divided into four major parts. In Part I (30 questions), students are expected to select from occupational activities, while in Part II (20 questions) they are to select from various occupations. Part III (5 questions) and Part IV together make up the CIS internal verification scale. The CIS test-retest reliability coefficients for one week and eight months were calculated as 0.96 (n = 57, grades 7–9) and 0.78 (n = 1937, grade 8), respectively. The KR-21 estimate for the CIS was calculated as 0.92. Criterion-related validity coefficients were calculated in two ways: (a) CIS scores were correlated with the Kuder GIS science subscale (r = 0.75, n = 45, grades 7–9), and (b) CIS scores were correlated with a CIS internal verification scale (r = 0.59, n = 127, grades 7–9). Evidence to support the construct validity of the CIS was collected by two methods: (a) for students in grades 7–9 (n = 45), the CIS score was found to correlate 0.75 with the scientific subscale and −0.42 with the artistic subscale of the Kuder GIS; (b) the second method compared the scores of known groups. Test results for students in grades 7–9 (n = 127; n = 1937) showed a statistically significant difference between the scores of boys and girls on S/E career interest. The readability of the CIS was seventh grade level.
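
The KR-21 estimate mentioned in entry 20 is the standard Kuder-Richardson Formula 21, which needs only the number of items and the mean and variance of total scores. The sketch below assumes dichotomously scored items of roughly equal difficulty, which is what KR-21 itself assumes; the abstract does not report the CIS mean or variance, so the function name `kr21` and the numbers in the usage note are purely illustrative.

```python
def kr21(n_items, mean_total, var_total):
    """Kuder-Richardson Formula 21 reliability estimate (illustrative helper).

    KR-21 = (k / (k - 1)) * (1 - M * (k - M) / (k * s^2))

    k   : number of dichotomously scored items
    M   : mean of the total scores
    s^2 : variance of the total scores
    """
    if n_items < 2:
        raise ValueError("KR-21 needs at least two items")
    if var_total <= 0:
        raise ValueError("total-score variance must be positive")
    k = n_items
    return (k / (k - 1)) * (1 - mean_total * (k - mean_total) / (k * var_total))
```

With made-up values of 10 items, a mean total of 5 and a total-score variance of 10, `kr21(10, 5, 10)` is about 0.83. A larger total-score variance relative to the number of items pushes the estimate toward 1.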