期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Automatic,multiple assessment options in undergraduate meteorology education

Jonathan D. W. Kahl 《Assessment & Evaluation in Higher Education》2017,42(8):1319-1325

Since 2008, automatic, multiple assessment options have been utilised in selected undergraduate meteorology courses at the University of Wisconsin–Milwaukee. Motivated by a desire to reduce stress among students, the assessment methodology includes examination-heavy and homework-heavy alternatives, differing by an adjustable 15% of the overall course grade. Students do not need to commit a priori to one particular assessment option, as the more beneficial of two alternative assessment schemes is automatically chosen at the end of the semester. An analysis of assessment score differences reveals that end-of-semester assessments increased by more than 1% for 48% of the students enrolled, a consequential increase which can improve a student’s assessment by a fractional grade. Score differences among the two assessment alternatives tend to be smaller for higher achieving students, and larger for the middle- and lower achieving students. Score differences larger than 3% were rare. Limited survey results indicate that students understand and appreciate the assessment scheme, and feel that it reduces stress and may improve their academic performance. The automatic, multiple assessment methodology presents no risk to either student or instructor, as students can only benefit and virtually no effort is required of the instructor. 相似文献

2.

Career guidance,career assessment,and consultancy

Fred A. Rowe Kathy A. Mauer 《Journal of Career Development》1991,17(3):223-233

Summary Although career guidance activities can positively influence students' career development, the effectiveness of such activities is often impaired because they attempt to provide the same services to all students at the same time. Programs typically do not consider either students' differential developmental patterns or the fact that many students have particular needs due to learning, physical, or emotional handicaps. Therefore, a primary focus of refining programs must be the individualization of services to meet a wide variety of student needs.Referral to career assessment or to professionals trained in career assessment and consultation is a valuable option open to school counselors who have neither the time nor the specialized training to conduct such assessments themselves. Career assessment centers offer comprehensive assessment services, including both more traditional assessments as well as innovative activities. Using a consultation paradigm, the school counselor and career assessment officer can better provide comprehensive, individualized assessment and counseling tailored to the specific needs of students, regardless of their handicaps and their different levels of developmental readiness 相似文献

3.

Fifth graders' science inquiry abilities: A comparative study of students in hands‐on and textbook curricula

Jerome Pine Pamela Aschbacher Ellen Roth Melanie Jones Cameron McPhee Catherine Martin Scott Phelps Tara Kyle Brian Foley 《科学教学研究杂志》2006,43(5):467-484

A large number of American elementary school students are now studying science using the hands‐on inquiry curricula developed in the 1990s: Insights; Full Option Science System (FOSS); and Science and Technology for Children (STC). A goal of these programs, echoed in the National Science Education Standards, is that children should gain “abilities to do scientific inquiry” and “understanding about scientific inquiry.” We have studied the degree to which students can do inquiries by using four hands‐on performance assessments, which required one or three class periods. To be fair, the assessments avoided content that is studied in depth in the hands‐on programs. For a sample of about 1000 fifth grade students, we compared the performance of students in hands‐on curricula with an equal number of students with textbook curricula. The students were from 41 classrooms in nine school districts. The results show little or no curricular effect. There was a strong dependence on students' cognitive ability, as measured with a standard multiple‐choice instrument. There was no significant difference between boys and girls. Also, there was no difference on a multiple‐choice test, which used items released from the Trends in International Mathematics and Science Study (TIMSS). It is not completely clear whether the lack of difference on the performance assessments was a consequence of the assessments, the curricula, and/or the teaching. © 2006 Wiley Periodicals, Inc. J Res Sci Teach 43: 467–484, 2006 相似文献

4.

Psychometric models of student conceptions in science: Reconciling qualitative studies and distractor-driven assessment instruments

Philip M. Sadler 《科学教学研究杂志》1998,35(3):265-296

We stand poised to marry the fruits of qualitative research on children's conceptions with the machinery of psychometrics. This merger allows us to build upon studies of limited groups of subjects to generalize to the larger population of learners. This is accomplished by reformulating multiple choice tests to reflect gains in understanding cognitive development. This study uses psychometric modeling to rank the appeal of a variety of children's astronomical ideas on a single uniform scale. Alternative conceptions are captured in test items with highly attractive multiple choice distractors administered twice to 1250 8th through 12th-grade students at the start and end of their introductory astronomy courses. For such items, an unusual psychometric profile is observed—instruction appears to strengthen support for alternative conceptions before this preference eventually declines. This lends support to the view that such ideas may actually be markers of progress toward scientific understanding and are not impediments to learning. This method of analysis reveals the ages at which certain conceptions are more prevalent than others, aiding developers and practitioners in matching curriculum to student grade levels. This kind of instrument, in which distractors match common student ideas, has a profoundly different psychometric profile from conventional tests and exposes the weakness evident in conventional standardized tests. Distractor-driven multiple choice tests combine the richness of qualitative research with the power of quantitative assessment, measuring conceptual change along a single uniform dimension. © 1998 John Wiley & Sons, Inc. J Res Sci Teach 35: 265–296, 1998. 相似文献

5.

The Predictive Validity of Interim Assessment Scores Based on the Full-Information Bifactor Model for the Prediction of End-of-Grade Test Performance

Jason C. Immekus Ben Atitya 《Educational Assessment》2016,21(3):176-195

Interim tests are a central component of district-wide assessment systems, yet their technical quality to guide decisions (e.g., instructional) has been repeatedly questioned. In response, the study purpose was to investigate the validity of a series of English Language Arts (ELA) interim assessments in terms of dimensionality and prediction of summative test performance, based on Grade 6 student data (N = 4,651) from a larger, urban district. Factor analytic results supported modeling the interim test data in terms of a bifactor model (Gibbons & Hedeker, 1992), with items reporting moderate to high relationships to the primary dimension (i.e., ELA) and varying estimates on the secondary domains. Hierarchical multiple linear regression results indicated that primary ELA scores were the strongest predictors of summative test performance, with subscale scores not improving predictive accuracy. Findings address issues pertaining to investigating the technical quality of test data widely used in district-wide assessment systems. 相似文献

6.

Intelligence assessment: Gardner multiple intelligence theory as an alternative

Leandro S. Almeida Maria Dolores Prieto Aristides I. Ferreira Maria Rosario Bermejo Mercedes Ferrando Carmen Ferrándiz 《Learning and individual differences》2010,20(3):225-230

In the multiple intelligence framework, newer and more contextualized cognitive tasks are suggested as alternative to more traditional psychometric tests. The purpose of this article is to examine whether or not these two types of instruments converge into a general factor of cognitive performance. Thus, the Battery of General and Differential Aptitudes (BADyG: reasoning, memory, verbal aptitude, numerical aptitude and spatial aptitude) and a set of Gardner's multiple intelligence assessment tasks (linguistic, logical, visual/spatial, bodily-kinesthetic, naturalistic and musical intelligences) were administered to 294 children aged 5 to 7. The confirmatory factor analysis points out the absence of a common general factor considering both batteries, indicating instead the existence of two general factors, which gather the tests that encompass them. Also, these two general factors correspond to traditional and multiple intelligence assessments and show a statistically moderate correlation between them. These results challenge Gardner's original position on refusing a general factor of intelligence, especially when considering the cognitive dimensions measured which do not coincide with the more traditional tests of intelligence. 相似文献

7.

Designing effective,contemporary assessment on a flipped educational sciences course

Caroline Fell Kurban 《Interactive Learning Environments》2019,27(8):1143-1159

ABSTRACT

Evidence shows flipped learning increases academic performance and student satisfaction. Yet, often practitioners flip instruction but keep traditional curricula and assessment. Assessment in higher education is often via written exams. But these provide limited feedback and do not ask students to put knowledge into practice. This does not support the tenets of flipped learning. For two years, the author flipped instruction but retained traditional curricula and assessment. However, on the author’s current course, all three aspects were redesigned to better support flipped learning. The aim of this research is to test the effectiveness of this redesign regarding student engagement and satisfaction. Thus, it is asked: How, on this course, can meaningful, continuous assessment be provided as well as effective, personalized feedback, while staying in line with the philosophy of flipped learning? Action research took place from September 2016 to June 2017. Quantitative data from a student survey, and qualitative data from a research diary and student focus group were gathered. What emerged is: a little-and-often assessment approach is effective for learning and engagement; tasks must be authentic and test demonstration of knowledge, not memory; quality, not quantity, is key for student learning; and students desire individualized feedback. 相似文献

8.

Measuring the impact of the flipped anatomy classroom: The importance of categorizing an assessment by Bloom's taxonomy

下载免费PDF全文

David A. Morton Jorie M. Colbert‐Getz 《Anatomical sciences education》2017,10(2):170-175

The flipped classroom (FC) model has emerged as an innovative solution to improve student‐centered learning. However, studies measuring student performance of material in the FC relative to the lecture classroom (LC) have shown mixed results. An aim of this study was to determine if the disparity in results of prior research is due to level of cognition (low or high) needed to perform well on the outcome, or course assessment. This study tested the hypothesis that (1) students in a FC would perform better than students in a LC on an assessment requiring higher cognition and (2) there would be no difference in performance for an assessment requiring lower cognition. To test this hypothesis the performance of 28 multiple choice anatomy items that were part of a final examination were compared between two classes of first year medical students at the University of Utah School of Medicine. Items were categorized as requiring knowledge (low cognition), application, or analysis (high cognition). Thirty hours of anatomy content was delivered in LC format to 101 students in 2013 and in FC format to 104 students in 2014. Mann Whitney tests indicated FC students performed better than LC students on analysis items, U = 4243.00, P = 0.030, r = 0.19, but there were no differences in performance between FC and LC students for knowledge, U = 5002.00, P = 0.720 or application, U = 4990.00, P = 0.700, items. The FC may benefit retention when students are expected to analyze material. Anat Sci Educ 10: 170–175. © 2016 American Association of Anatomists. 相似文献

9.

An alternative method of answering and scoring multiple choice tests

Charles Taylor Paul L. Gardner 《Research in Science Education》1999,29(3):353-363

A simple modification to the method of answering and scoring multiple choice tests allows students to indicate their estimates of the probability of the correctness of the multiple choice options for each question, without affecting the validity of the assessment. A study was conducted using a test that investigated common misconceptions in mechanics. The study showed that for assessment purposes this method gives results that are very similar to results obtained by students who answer in the traditional manner. Year 12 Physics students (N=85) were randomly allocated to two treatment groups: one received a standard format multiple choice test, the other a test format allowing students to select more than one response in a multiple choice test, and to distribute their marks among their chosen optionsl An analysis of the students' uncertainties is used to argue that not only can students appeal to different conceptions in different contexts, but that they can also hold conflicting conceptions with respect to a single context. 相似文献

10.

Differentiating work for statistics modules in sports degrees

《Journal of Further & Higher Education》2012,36(3):295-302

The aims of the study were to use differentiated online learning material for use with a Level 1 statistics module for undergraduate sport students and examine relationships between student performance on differentiated tests and module performance. We developed the differentiated material by writing easy and hard multiple choice tests, with the harder tests having a shorter completion time and more choices. Each multiple choice test related to information available online and immediate feedback was provided on completion of the test. Results indicated that 85% of students accessed the module online, with 26% accessing difficult tests and 22% accessing easy tests. Correlation results indicated that module performance was significantly related to performance on the easy test (r = 0.27, P<0.01) and also on the harder test (r = 0.26, P<0.01). Findings suggest that lecturers should encourage students to engage with interactive material and that future research should explore methods to enhance students' independent learning skills. 相似文献

11.

Assessment of performance in practical science and pupil attributes 总被引：2，自引：0，他引：2

Ros Roberts Richard Gott 《Assessment in Education: Principles, Policy & Practice》2006,13(1):45-67

Performance assessment in the UK science General Certificate of Secondary Education (GCSE) currently relies on pupil reports of their investigations. These are widely criticized. Written tests of procedural understanding could be used as an alternative, but what exactly do they measure? This paper describes small‐scale research in which there was an analysis of assessments of pupils' GCSE scores of substantive ideas, their coursework performance assessment and a novel written evidence test. Results from these different assessments were compared with each other and with baseline data on CAT scores and pupils' attributes. Significant predictors of performance on each of these assessments were determined. The data reported shows that a choice could be made between practical coursework that links to ‘behaviour’ and written evidence tests which link, albeit less strongly, with ‘quickness’. There would be differential effects on pupils. 相似文献

12.

The influence of test‐based accountability policies on early elementary teachers: School climate,environmental stress,and teacher stress

下载免费PDF全文

Elina Saeki Natasha Segool Laura Pendergast Nathaniel von der Embse 《Psychology in the schools》2018,55(4):391-403

This study examined the potential influence of test‐based accountability policies on school environment and teacher stress among early elementary teachers. Structural equation modeling of data from 541 kindergarten through second grade teachers across three states found that use of student performance on high‐stakes tests to evaluate teachers indirectly was related to teachers’ professional investment via test stress in the environment. Although students in kindergarten through second grade do not take high‐stakes assessments, early elementary teachers reported high levels of stress associated with test‐based accountability policies. This study provides data across multiple states that test‐based accountability policies may have negative influences on school environment and teacher stress among early elementary teachers. Implications for practice and research are discussed. 相似文献

13.

Coaching for the PISA test

《Learning and Instruction》2007,17(2):111-122

Coaching is known to improve student performance on tests with high personal relevance (“high-stakes tests”). We investigate whether the same holds for a test that has no personal relevance for the students taking it (“low-stakes test”). More specifically, we explore whether student performance on the reading and mathematics assessments of the OECD's Programme for International Student Assessment (PISA) can be fostered by coaching (and administering a pretest). Coaching and pretest effects were studied for each content domain separately in a pre-/posttest quasi-experimental design. To examine differential effects of academic tracks, samples were drawn from German Hauptschule and Gymnasium schools. Results show that only the combined effects of pretesting and coaching have substantial positive effects on student performance. Implications for the interpretation of large-scale assessment programs are discussed. 相似文献

14.

Assessing students’ performance by measured patterns of perceived strengths: does preference make a difference?

Julie Urda Stephen P. Ramocki 《Assessment & Evaluation in Higher Education》2015,40(1):33-44

This paper is an empirical field study of whether college students’ preferences for assessment type correspond to their performance in assessment that tests that particular strength. For example, if students say they prefer assessment that tests their creativity, do they actually perform better on assessment tasks requiring the use of creativity? Seventy-eight students in three different courses were surveyed to determine their preferences in four types of assessment: memorisation, analysis, creativity and practical application. These preferences were then compared to student grades on corresponding forms of assessment to see if the preferences corresponded to actual performance. The study found that, while students had a clear preference for memorisation, they were not likely to deliver their best performances on memorisation tasks. There was no relationship at all between student preferences in assessment type and their performance in the respective assessments. These results indicate that, while in theory assessing students based on their preferences is reasonable for improved learning, we were not able to find evidence that it actually leads to higher performance. 相似文献

15.

Early prediction of student knowledge in game-based learning with distributed representations of assessment questions

Andrew Emerson Wookhee Min Roger Azevedo James Lester 《British journal of educational technology : journal of the Council for Educational Technology》2023,54(1):40-57

Game-based learning environments hold significant promise for facilitating learning experiences that are both effective and engaging. To support individualised learning and support proactive scaffolding when students are struggling, game-based learning environments should be able to accurately predict student knowledge at early points in students' gameplay. Student knowledge is traditionally assessed prior to and after each student interacts with the learning environment with conventional methods, such as multiple choice content knowledge assessments. While previous student modelling approaches have leveraged machine learning to automatically infer students' knowledge, there is limited work that incorporates the fine-grained content from each question in these types of tests into student models that predict student performance at early junctures in gameplay episodes. This work investigates a predictive student modelling approach that leverages the natural language text of the post-gameplay content knowledge questions and the text of the possible answer choices for early prediction of fine-grained individual student performance in game-based learning environments. With data from a study involving 66 undergraduate students from a large public university interacting with a game-based learning environment for microbiology, Crystal Island , we investigate the accuracy and early prediction capacity of student models that use a combination of gameplay features extracted from student log files as well as distributed representations of post-test content assessment questions. The results demonstrate that by incorporating knowledge about assessment questions, early prediction models are able to outperform competing baselines that only use student game trace data with no question-related information. Furthermore, this approach achieves high generalisation, including predicting the performance of students on unseen questions.

Practitioner notes

What is already known about this topic

A distinctive characteristic of game-based learning environments is their capacity to enable fine-grained student assessment.
Adaptive game-based learning environments offer individualisation based on specific student needs and should be able to assess student competencies using early prediction models of those competencies.
Word embedding approaches from the field of natural language processing show great promise in the ability to encode semantic information that can be leveraged by predictive student models.

What this paper adds

Investigates word embeddings of assessment question content for reliable early prediction of student performance.
Demonstrates the efficacy of distributed word embeddings of assessment questions when used by early prediction models compared to models that use either no assessment information or discrete representations of the questions.
Demonstrates the efficacy and generalisability of word embeddings of assessment questions for predicting the performance of both new students on existing questions and existing students on new questions.

Implications for practice and/or policy

Word embeddings of assessment questions can enhance early prediction models of student knowledge, which can drive adaptive feedback to students who interact with game-based learning environments.
Practitioners should determine if new assessment questions will be developed for their game-based learning environment, and if so, consider using our student modelling framework that incorporates early prediction models pretrained with existing student responses to previous assessment questions and is generalisable to the new assessment questions by leveraging distributed word embedding techniques.
Researchers should consider the most appropriate way to encode the assessment questions in ways that early prediction models are able to infer relationships between the questions and gameplay behaviour to make accurate predictions of student competencies.

相似文献

16.

Analysis of assessment practice and subsequent performance of third year level students in natural sciences

《Africa Education Review》2013,10(4):563-583

Abstract

Summative assessment qualifies the achievement of a student in a particular field of specialization at a given time. Questions should include a range of cognitive levels from Bloom's taxonomy and be consistent with the learning outcomes of the module in question. Furthermore, a holistic approach to assessment, such as the application of the principles of the Herrmann Whole Brain Model, needs to be used to accommodate learning style diversity. The purpose of this study was to analyse, assess and compare the summative assessment of two third year level modules in the Bachelor of Science degree programme, namely Biochemistry and Zoology as part of action research with a view to enhancing the professional development of the lecturers involved. The questions posed in summative assessments were classified in terms of Bloom's differentiation of cognitive levels and the four different learning styles determined by Herrmann. Spearman's non-parametric analysis indicated that no correlation existed in this study between cognitive level and student performance based on achievement. In addition, there was not much difference between the cognitive levels and student performance between the two disciplines. Although the students seemed to do better at application level questions, the authors need to reflect on whether the assessments were valid with respect to the learning outcomes, methods of facilitating learning, and the assessments based on cognitive levels and learning style preferences. We conclude that continuous action research must be taken to improve the formulation of learning outcomes and students' achievement of these outcomes and quality of student learning – the main aim being the successful completion of the modules. 相似文献

17.

Multi-informant assessment of treatment integrity in the classroom

Evan H. Dart Melissa A. Collier-Meek Caitlyn Chambers Ashley Murphy 《Psychology in the schools》2020,57(5):805-822

Assessing the degree to which interventions are implemented in school settings is critical to making decisions about student outcomes. School psychologists may not be available to regularly conduct observations of intervention implementation, however, their data may be used alongside other methods for multi-informant assessment. Teacher self-report is a commonly used and feasible assessment method. Students have been trained to implement interventions with their peers in instances where traditional adult interventionists were unavailable. This exploratory study investigated the accuracy with which classroom teachers and middle and high school students assessed implementation of the Good Behavior Game and the impact of performance feedback on their accuracy. Results indicated that most students and teachers were able to provide accurate assessments of treatment integrity compared to researcher direct observation; however, some required performance feedback to do so. These findings suggest that multi-informant assessment may be a feasible and accurate way for school psychologists to collect formative treatment-integrity data in the classroom. Limitations and future directions are discussed. 相似文献

18.

Correlation between academic and skills-based tests in computer networks

William Buchanan 《British journal of educational technology : journal of the Council for Educational Technology》2006,37(1):69-78

Computing‐related programmes and modules have many problems, especially related to large class sizes, large‐scale plagiarism, module franchising, and an increased requirement from students for increased amounts of hands‐on, practical work. This paper presents a practical computer networks module which uses a mixture of online examinations and a practical skills‐based test to assess student performance. For widespread adoption of practical assessments, there must be a level of checking that the practical assessments are set at a level that examinations are set at. This paper shows that it is possible to set practical tests so that there can be a strong correlation between practical skills‐based tests and examination‐type assessments, but only if the practical assessment are set at a challenging level. This tends to go against the proposition that students who are good academically are not so good in a practice test, and vice versa. The paper shows results which bands students in A, B, C, and FAIL groups based on two online, multiple‐choice tests, and then analyses the average time these students took to complete a practical online test. It shows that there is an increasing average time to complete the test for weaker students. Along with this, the paper shows that female students in the practical test outperform male students by a factor of 25%. 相似文献

19.

Trends in computer applications in science assessment

David D. Kumar Stanley L. Helgeson 《Journal of Science Education and Technology》1995,4(1):29-36

Seven computer applications to science assessment are reviewed. Conventional test administration includes record keeping, grading, and managing test banks. Multiple-choice testing involves forced selection of an answer from a menu, whereas constructed-response testing involves options for students to present their answers within a set standard deviation. Adaptive testing attempts to individualize the test to minimize the number of items and time needed to assess a student's knowledge. Figurai response testing assesses science proficiency in pictorial or graphic mode and requires the student to construct a mental image rather than selecting a response from a multiple choice menu. Simulations have been found useful for performance assessment on a large-scale basis in part because they make it possible to independently specify different aspects of a real experiment. An emerging approach to performance assessment is solution pathway analysis, which permits the analysis of the steps a student takes in solving a problem. Virtually all computer-based testing systems improve the quality and efficiency of record keeping and data analysis. 相似文献

20.

NCME 2008 Presidential Address: The Impact of Anchor Test Configuration on Student Proficiency Rates

Anne R. Fitzpatrick 《Educational Measurement》2008,27(4):34-40

Examined in this study were the effects of reducing anchor test length on student proficiency rates for 12 multiple‐choice tests administered in an annual, large‐scale, high‐stakes assessment. The anchor tests contained 15 items, 10 items, or five items. Five content representative samples of items were drawn at each anchor test length from a small universe of items in order to investigate the stability of equating results over anchor test samples. The operational tests were calibrated using the one‐parameter model and equated using the mean b‐value method. The findings indicated that student proficiency rates could display important variability over anchor test samples when 15 anchor items were used. Notable increases in this variability were found for some tests when shorter anchor tests were used. For these tests, some of the anchor items had parameters that changed somewhat in relative difficulty from one year to the next. It is recommended that anchor sets with more than 15 items be used to mitigate the instability in equating results due to anchor item sampling. Also, the optimal allocation method of stratified sampling should be evaluated as one means of improving the stability and precision of equating results. 相似文献