期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Assessing the reliability of student evaluations of teaching: choosing the right coefficient

Donald Morley 《Assessment & Evaluation in Higher Education》2014,39(2):127-139

Many of the studies used to support the claim that student evaluations of teaching are reliable measures of teaching effectiveness have frequently calculated inappropriate reliability coefficients. This paper points to three coefficients that would be appropriate depending on if student evaluations were used for formative or summative purposes. Results from the present study indicated that students had very low absolute inter-rater reliability, but somewhat higher consistency inter-rater reliability. 相似文献

2.

Long-term stability of students' evaluations: A note on Feldman's “consistency and variability among college students in rating their teachers and courses”

Herbert W. Marsh Dr. J. U. Overall 《Research in higher education》1979,10(2):139-147

Feldman (1977), reviewing research about the reliability of student evaluations, reported that while class average responses were quite reliable (.80s and .90s), single rater reliabilities were typically low (.20s). However, studies he reviewed determined single rater reliability with internal consistency measures which assumed that differences among students in the same class (within-class variance) were completely random—an assumption which Feldman seriously questioned. In the present study, this assumption was tested by collecting evaluations from the same students at the end of each class and again one year after graduation. Single rater reliability based upon an internal consistency approach (agreement among different students in the same class) was similar to that reported by Feldman. However, single rater reliability based upon a stability approach (agreement between end-of-term and follow-up ratings by the same student) was much higher (medianr=.59). These results indicate that individual student evaluations were remarkably stable over time and more reliable than previously assumed. Most important, there was systematic information in individual student ratings—beyond that implied by the class average response—that internal consistency approaches have ignored or assumed to be nonexistent. 相似文献

3.

Student evaluations of teaching effectiveness: a Nigerian investigation

David Watkins Adebowale Akande 《Higher Education》1992,24(4):453-463

An investigation is reported which tests the applicability of two American instruments designed to assess tertiary students' evaluations of teaching effectiveness with 158 Nigerian undergraduates. The scales were found to have generally high internal consistency reliability coefficients, most of the items were seen to be appropriate, and every item was considered of importance by at least some of the students. In addition, all but the Workload/Difficulty items clearly differentiated between good and poor lecturers. Factor analysis found a strong main factor of teaching effectiveness plus a minor factor referring to course workload and difficulty. Further analysis generally supported the convergent and discriminant validity of those scales hypothesized to measure similar or dissimilar components of effective teaching. However, this analysis supported the factor analytic results as more overlap between aspects of teaching skill and enthusiasm was found than has been evident in Western studies. Thus there must be doubt about the cross-cultural validity of a multidimensional model of teaching effectiveness. 相似文献

4.

Student perception of teaching effectiveness: development and validation of the Evaluation of Teaching Competencies Scale (ETCS)

Victor M. Catano Steve Harvey 《Assessment & Evaluation in Higher Education》2011,36(6):701-717

A major criticism of student evaluations of teaching is that they do not reflect student perspectives. Using critical incidents job analysis, students identified nine teaching effectiveness competencies: communication, availability, creativity, individual consideration, social awareness, feedback, professionalism, conscientiousness and problem‐solving. The behaviourally anchored Evaluation of Teaching Competencies Scale is a highly reliable (alpha = .94), unidimensional measure that correlated strongly with an instructor‐related composite of the Students’ Evaluation of Educational Quality (SEEQ, r = .72), but not to a SEEQ composite related to instructor assigned work (r = .04, N = 195). The results are discussed in the context of other measures of teaching effectiveness and transformational leadership theory. 相似文献

5.

How reliable are students’ evaluations of teaching quality? A variance components approach

Daniela Feistauer Tobias Richter 《Assessment & Evaluation in Higher Education》2017,42(8):1263-1279

The inter-rater reliability of university students’ evaluations of teaching quality was examined with cross-classified multilevel models. Students (N = 480) evaluated lectures and seminars over three years with a standardised evaluation questionnaire, yielding 4224 data points. The total variance of these student evaluations was separated into the variance components of courses, teachers, students and the student/teacher interaction. The substantial variance components of teachers and courses suggest reliability. However, a similar proportion of variance was due to students, and the interaction of students and teachers was the strongest source of variance. Students’ individual perceptions of teaching and the fit of these perceptions with the particular teacher greatly influence their evaluations. This casts some doubt on the validity of student evaluations as indicators of teaching quality and suggests that aggregated evaluation scores should be used with caution. 相似文献

6.

Using mid-semester course evaluation as a feedback tool for improving learning and teaching in higher education

E. Murat Sozer Mustafa Kaya 《Assessment & Evaluation in Higher Education》2019,44(7):1003-1016

The way in which mid-semester course evaluations are structured, administered and reported is important for generating rich and high-quality student feedback for the enhancement of learning and teaching. Mid-semester evaluations usually contain open-ended questions that trigger more elaborative feedback about what is going on in a class than that from end-of-semester evaluations with Likert scale-type questions. The anonymity of the process for students and the confidentiality of the process for instructors make the process more reflective for students and less stressful for instructors. This study describes how the mid-semester course evaluation process can be used as a feedback tool for improving the quality of teaching and learning at an institutional level. Through a longitudinal analysis of 341 mid-semester course evaluation reports, positive areas and areas of concern with respect to learning and teaching were identified, and changes in student evaluations over the years were examined meticulously to make an overall evaluation of the quality of learning and teaching at a non-profit Turkish university. This research showed that the value of mid-semester course evaluations can go beyond course-level if we use open-ended questions and are able to gather the reports together to make some comprehensive analysis at university level. 相似文献

7.

SPSS macros for assessing the reliability and agreement of student evaluations of teaching

Donald D. Morley 《Assessment & Evaluation in Higher Education》2009,34(6):659-671

相似文献

8.

Evaluating tertiary teaching: A New Zealand perspective

《Teaching and Teacher Education》1987,3(1):41-53

The purpose of this research was to test the applicability of two American instruments designed to assess tertiary students' evaluations of teaching effectiveness with New Zealand students. The scales were found to have high internal consistency reliability coefficients, most of the items were seen to be appropriate, and every item was considered of importance by at least some of the students. In addition, all but the Workload/Difficulty items clearly differentiated between “good,” “average,” and “poor” lectures. Further analyses generally supported both the factor structure identified in earlier research and the convergent and discriminate validity of the scales from both instruments. This research has provided strong support for the applicability of these American instruments for evaluating effective teaching at a New Zealand university. 相似文献

9.

ASSESSING TEACHING EFFECTIVENESS: AN INDIAN PERSPECTIVE

David Watkins Babu Thomas 《Assessment & Evaluation in Higher Education》1991,16(3):185-198

>An investigation is reported which tests the applicability of two American instruments designed to assess tertiary students’ evaluations of teaching effectiveness with 111 Indian graduate students. The scales were found to have generally high internal consistency reliability coefficients, most of the items were seen to be appropriate, and every item was considered of importance by at least some of the students. In addition, all but the Workload/Difficulty items clearly differentiated between ‘good’ and ‘poor’ lecturers. Further analysis generally supported the convergent and discriminant validity of those scales hypothesised to measure similar or dissimilar components of effective teaching. However, this analysis did indicate more overlap between aspects of teaching skill and enthusiasm than evident in Western studies. Factor analysis confirmed this finding as a strong main factor of teaching effectiveness plus minor factors referring to specific aspects of teaching were obtained. 相似文献

10.

Honesty on student evaluations of teaching: effectiveness,purpose, and timing matter!

Lauren McClain Angelika Gulbis Donald Hays 《Assessment & Evaluation in Higher Education》2018,43(3):369-385

Student evaluations of teaching (SETs) are an important point of assessment for faculty in curriculum development, tenure and promotion decisions, and merit raises. Faculty members utilise SETs to gain feedback on their classes and, hopefully, improve them. The question of the validity of student responses on SETs is a continuing debate in higher education. The current study uses data from two universities (n = 596) to determine whether and under what conditions students are honest on in-class and online SETs, while also assessing their knowledge and attitudes about SETs. Findings reveal that, while students report a high level of honesty on SETs, they are more likely to be honest when they believe that evaluations effectively measure the quality of the course, the results improve teaching and benefit students rather than the administration, and when they are given at the end of the term. Honesty on evaluations is not associated with socio-demographic characteristics. 相似文献

11.

Online student evaluations of teaching: what are we sacrificing for the affordances of technology?

Anglica Risquez Elaine Vaughan Maura Murphy 《Assessment & Evaluation in Higher Education》2015,40(1):120-134

In the context of increased emphasis on quality assurance of teaching, it is crucial that student evaluations of teaching (SET) methods be both reliable and workable in practice. Online SETs particularly tend to raise criticisms with those most reactive to mechanisms of teaching accountability. However, most studies of SET processes have been conducted with convenience, small and cross-sectional samples. Longitudinal studies are rare, as comparison studies on SET methodological approaches are generally pilot studies followed shortly after by implementation. The investigation presented here significantly contributes to the debate by examining the impact of the online administration method of SET on a very large longitudinal sample at the course level rather than attending to the student unit, thus compensating for the inter-dependency of students’ responses according to the instructor variable. It explores the impact of the administration method of SET (paper based in-class vs. out-of-class online collection) on scores, with a longitudinal sample of over 63,000 student responses collected over a total period of 10 years. Having adjusted for the confounding effect of class size, faculty, year of evaluation, years of teaching experience and student performance, it is observed that the actual effect of the administration method exists, but is insignificant. 相似文献

12.

The number of feedbacks needed for reliable evaluation. A multilevel analysis of the reliability,stability and generalisability of students’ evaluation of teaching

Pekka Rantanen 《Assessment & Evaluation in Higher Education》2013,38(2):224-239

A multilevel analysis approach was used to analyse students’ evaluation of teaching (SET). The low value of inter-rater reliability stresses that any solid conclusions on teaching cannot be made on the basis of single feedbacks. To assess a teacher’s general teaching effectiveness, one needs to evaluate four randomly chosen course implementations. Two implementations are needed when one course is evaluated, and if one implementation is evaluated, up to 15 feedbacks are needed. The stability of students’ ratings is very high, which reflects students’ stable rating criteria. There is an obvious rating paradox: from the student’s point of view, each rating is very precise, stable and justifiable, but from the teacher’s point of view a single feedback reflects the quality of teaching to just a moderate extent. Cross-hierarchical analysis reveals that there are large discrepancies between the uses of rating scales; some students are systematically more lenient in their rating whereas others are systematically more severe. The study also reveals that some courses are generally rated more favourably and that some courses are more suitable for certain teachers. Managers can thus improve the quality of teaching by finding the most suitable courses for each teacher. 相似文献

13.

高校学生评教工具的结构与维度

吴瑞林张彦通《高教发展与评估》2011,27(6)

高校教学管理部门自制的评教问卷目前仍是被广泛使用的学生评教工具,这些问卷的结构往往缺少足够的科学性和有效性。通过对已有研究成果的分析和思考,发现学生评教普遍被认为具有多维结构,且在多个维度之上存在总教学效果的二阶因素。国外编制的学生评教工具更关注教学行为本身,主要从演讲者、推动者和管理者三种角色定义教师有效教学的维度。而在传统教育思想的影响下,我国学生评教的维度中还更加注意考察教师的道德和敬业精神等维度。相似文献

14.

Assessing academic program and department effectiveness using student evaluation data

Stephen A. Stumpf 《Research in higher education》1979,11(4):353-363

The use of aggregated student evaluations of their courses and course elements (e.g., subject functionality, affect, difficulty, graded assignments) is suggested as an efficient and useful means of obtaining program and department assessments. Given that the instruments used to collect student evaluations are valid (if they are not, they should not be used for any purpose), then averaging class data is likely to provide a valid and reliable index of program and department effectiveness as evaluated by students.Program and department assessment data are presented and discussed for a large northeastern professional school. Large and significant differences in the ratings of program elements were found. Although many of the elements designed into the program by the administration and faculty were perceived as operational by the students, some discrepancies between the design and student perceptions existed. Substantial departmental differences were also found which indicated areas of strength and weakness both within and across departments. The potential usefulness of the assessment for internal change and development is discussed. 相似文献

15.

Multi-Method Evaluation of Instruction in Engineering Classes

Ganesh Mohanty John Gretes Claudia Flowers Bob Algozzine Fred Spooner 《Journal of Personnel Evaluation in Education》2005,18(2):139-151

Student evaluation of instruction in college and university courses has been a routine and mandatory part of undergraduate and graduate education for some time. A major shortcoming of the process is that it often relies exclusively on the opinions or qualitative judgments of students rather than the learning or transfer of knowledge that takes place in the classroom. To develop a more objective system of assessment, this research focused on a learning-centered approach to course work and teaching evaluation. Standardized testing tools were developed suitable for measuring the content knowledge of students in a representative group of undergraduate courses. Course evaluations were conducted using two systems of assessment: the traditional student questionnaire feedback system and one based on the learning-centered approach using a computer-based question bank and on-line testing. Significant performance differences were evident in pretest/posttest comparisons of student learning. Favorable ratings of instruction are reflected in opinions on student questionnaires. No relationship was demonstrated between learning and traditional course evaluation outcomes. Our hypothesis that the learning-centered approach provides information that is not available using the traditional student feedback system was supported.Support for this research was provided in part by Grant No. P116B981224-00 from the U.S. Department of Education, Fund for the Improvement of Post-Secondary Programs awarded to the University of North Carolina at Charlotte. The opinions expressed do not necessarily reflect the position or policy of the Department of Education, and no official endorsement should be inferred. 相似文献

16.

Good teaching: what matters to university students

Hwee Hoon Lee Grace May Lin Kim Ling Ling Chan 《Asia Pacific Journal of Education》2015,35(1):98-110

Institutions assess teaching effectiveness in various ways, such as classroom observation, peer evaluation and self-assessment. In higher education, student feedback continues to be the main teaching evaluation tool. However, most of such forms include characteristics of good teaching that the institutions deem important and may not adequately reflect what students perceive to be good teaching. This study explored students' understandings of good teaching via a survey with students from two faculties at a Singapore university. Students were asked what characteristics they thought constituted the following categories of teaching: preparation and organization, knowledge, learning and thinking, enthusiasm and delivery. It was found that while distinct characteristics were highlighted for the first four categories, the last saw recurring characteristics of teacher attributes and teaching strategies. These two aspects weigh in significantly in the way students perceive whether the teacher is effective. The study has implications for teacher development programmes and the design of student evaluation forms for more accurate assessments of teacher ability and foci on areas of improvement. This study is potentially useful to teachers, as knowing the characteristics of teaching that matter to students could help teachers determine for themselves how to maintain or improve their performance in the classroom. 相似文献

17.

Student evaluation of teaching: the use of best–worst scaling

Twan Huybers 《Assessment & Evaluation in Higher Education》2014,39(4):496-513

An important purpose of student evaluation of teaching is to inform an educator’s reflection about the strengths and weaknesses of their teaching approaches. Quantitative instruments are one way of obtaining student responses. They have traditionally taken the form of surveys in which students provide their responses to various statements using item-by-item agree/disagree ratings. Previous research has identified shortcomings of such rating scales, including response bias and the associated lack of discrimination amongst the items evaluated. In this paper, best–worst scaling is proposed as a novel method for quantitative teaching evaluation. The way in which best–worst scaling can be used in this context is illustrated in three different applications. Two applications demonstrate how it can be used for evaluations in a small-size classroom environment. The third application is a broader evaluation of university courses on a larger scale. In comparison with conventional rating scales, the best–worst scaling approach enables better highlighting of the differences between evaluation items. In doing so, it can provide enhanced guidance to educators in their reflection about their teaching. Moreover, implementation and analysis of a best–worst scaling evaluation is relatively straightforward, which establishes it a feasible method for teaching practitioners and researchers. 相似文献

18.

Instruments for obtaining student feedback: a review of the literature 总被引：4，自引：4，他引：4

John T. E. Richardson 《Assessment & Evaluation in Higher Education》2005,30(4):387-415

This paper reviews the research evidence concerning the use of formal instruments to measure students’ evaluations of their teachers, students’ satisfaction with their programmes and students’ perceptions of the quality of their programmes. These questionnaires can provide important evidence for assessing the quality of teaching, for supporting attempts to improve the quality of teaching and for informing prospective students about the quality of course units and programmes. The paper concludes by discussing several issues affecting the practical utility of the instruments that can be used to obtain student feedback. Many students and teachers believe that student feedback is useful and informative, but for a number of reasons many teachers and institutions do not take student feedback sufficiently seriously. 相似文献

19.

Advancing text-analysis to tap into the student voice: a proof-of-concept study

Jenny McDonald Adon Christian Michael Moskal Allen Goodchild Sarah Stein Stuart Terry 《Assessment & Evaluation in Higher Education》2020,45(1):154-164

Abstract

Student evaluations of teaching and courses (SETs) are part of the fabric of tertiary education and quantitative ratings derived from SETs are highly valued by tertiary institutions. However, many staff do not engage meaningfully with SETs, especially if the process of analysing student feedback is cumbersome or time-consuming. To address this issue, we describe a proof-of-concept study to automate aspects of analysing student free text responses to questions. Using Quantext text analysis software, we summarise and categorise student free text responses to two questions posed as part of a larger research project which explored student perceptions of SETs. We compare human analysis of student responses with automated methods and identify some key reasons why students do not complete SETs. We conclude that the text analytic tools in Quantext have an important role in assisting teaching staff with the rigorous analysis and interpretation of SETs and that keeping teachers and students at the centre of the evaluation process is key. 相似文献

20.

Correlation between grade point averages and student evaluation of teaching scores: taking a closer look

Tyler J. Griffin Kenneth Plummer Devynne Barret 《Assessment & Evaluation in Higher Education》2014,39(3):339-348

One of the most contentious potential sources of bias is whether instructors who give higher grades receive higher ratings from students. We examined the grade point averages (GPAs) and student ratings across 2073 general education religion courses at a large private university. A moderate correlation was found between GPAs and student evaluations of teaching (SETs); however, this global correlation did not hold true for individual teachers and courses. In fact, there was a large variance in the correlations between GPAs and SETs, including some teachers with a negative correlation and a large variance between courses. 相似文献