Similar literature
 20 similar documents found (search time: 31 ms)
1.
Adaptive Comparative Judgement (ACJ) is a modification of Thurstone’s method of comparative judgement that exploits the power of adaptivity, but in scoring rather than testing. Professional judgement by teachers replaces the marking of tests; a judge is asked to compare the work of two students and simply to decide which of them is the better. From many such comparisons a measurement scale is created showing the relative quality of students’ work; this can then be referenced in familiar ways to generate test results. The judges are asked only to make a valid decision about quality, yet ACJ achieves extremely high levels of reliability, often considerably higher than practicable operational marking can achieve. It therefore offers a radical alternative to the pursuit of reliability through detailed marking schemes. ACJ is clearly appropriate for performances like writing or art, and for complex portfolios or reports, but may be useful in other contexts too. ACJ offers a new way to involve all teachers in summative as well as formative assessment. The model provides strong statistical control to ensure quality assessment for individual students. This paper describes the theoretical basis of ACJ, and illustrates it with outcomes from some of our trials.
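
The step in the abstract above, where many pairwise "which is better" decisions become a measurement scale, can be illustrated with a small sketch. This is not the paper's own procedure: it assumes a Bradley-Terry model (a logistic relative of Thurstone's comparative judgement model), fits it with a simple iterative update, and leaves out the adaptive pairing of scripts entirely. The function name `fit_bradley_terry`, its parameters, and the sample judgements are hypothetical, chosen only for illustration.

```python
import math


def fit_bradley_terry(n_items, comparisons, iterations=500):
    """Estimate a quality scale from pairwise judgements.

    comparisons is a list of (winner, loser) index pairs, one per judgement.
    Returns one log-strength per item, centred so the values sum to zero.
    Assumption: a Bradley-Terry model, not the ACJ model of the abstract.
    """
    wins = [0] * n_items
    pair_counts = {}
    for winner, loser in comparisons:
        wins[winner] += 1
        pair = (min(winner, loser), max(winner, loser))
        pair_counts[pair] = pair_counts.get(pair, 0) + 1

    strength = [1.0] * n_items
    for _ in range(iterations):
        # For each item, accumulate n_ij / (p_i + p_j) over its opponents.
        denom = [0.0] * n_items
        for (i, j), count in pair_counts.items():
            share = count / (strength[i] + strength[j])
            denom[i] += share
            denom[j] += share
        # Update p_i = wins_i / denom_i; the tiny floor keeps an item that
        # never wins at a finite (very low) value instead of log(0).
        strength = [
            max(wins[k] / denom[k], 1e-12) if denom[k] > 0 else strength[k]
            for k in range(n_items)
        ]
        # Rescale so the geometric mean is 1 (the scale has no fixed origin).
        log_mean = sum(math.log(s) for s in strength) / n_items
        strength = [s * math.exp(-log_mean) for s in strength]

    return [math.log(s) for s in strength]


# Hypothetical judgements on three scripts: each pair is compared three times.
judgements = [
    (0, 1), (0, 1), (1, 0),
    (1, 2), (1, 2), (2, 1),
    (0, 2), (0, 2), (2, 0),
]
scores = fit_bradley_terry(3, judgements)
ranking = sorted(range(3), key=lambda k: scores[k], reverse=True)
print(ranking, [round(s, 3) for s in scores])
```

The returned values are on a relative logit scale: only differences between scripts are meaningful, which is why the scale still has to be referenced to grades or marks afterwards, as the abstract notes.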

2.
The purpose of this study is to explore the reliability of a potentially more practical approach to direct writing assessment in the context of ESL writing. Traditional rubric rating (RR) is a common yet resource-intensive evaluation practice when performed reliably. This study compared the traditional rubric model of ESL writing assessment and many-facet Rasch modeling (MFRM) to comparative judgment (CJ), the new approach, which shows promising results in terms of reliability. We employed two groups of raters (novice and experienced) and used essays that had been previously double-rated, analyzed with MFRM, and selected with fit statistics. We compared the results of the novice and experienced groups against the initial ratings using raw scores, MFRM, and a modern form of CJ, randomly distributed comparative judgment (RDCJ). Results showed that the CJ approach, though not appropriate for all contexts, can be as reliable as RR while showing promise as a more practical approach. Additionally, CJ is easily transferable to novel assessment tasks while still providing context-specific scores. Results from this study will not only inform future studies but can help guide ESL programs in selecting a rating model best suited to their specific needs.

3.
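This paper argues that a sufficient-condition hypothetical judgment is true only when a true antecedent guarantees a true consequent; when the antecedent is false, whether the judgment is true depends on whether it still satisfies this premise that a true antecedent necessitates a true consequent. In addition, determining the logical value of a sufficient-condition hypothetical judgment also requires taking into account the context in which the judgment is applied.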

4.
Authoritarian personality types possess characteristics that are especially troubling if found among criminal justice (CJ) professionals. Recent research found significantly higher Right-Wing Authoritarianism (RWA) scores in male college students majoring in CJ than in male nonmajors as well as significantly higher scores among lower division students than their upper division counterparts. However, the results of that study were limited because the sample was predominantly Caucasian. Given the growth in African-American CJ professionals and the special salience of race, it is important to examine whether the findings can be generalized to African-Americans. In order to explore that issue the current study replicates that research with a largely African-American sample drawn from a historically black college/university (HBCU). Results indicate that, unlike the findings in the original study, CJ majors at the HBCU did not have statistically higher RWA scores. Theoretical and practical implications are discussed.

5.
6.
7.
《Learning and Instruction》2006,16(4):350-362
The differential effects of four task selection methods on training efficiency and transfer in a computer-based training for Air Traffic Control were investigated. Two personalised conditions were compared with two corresponding yoked control conditions. The hypothesis that personalised adaptive task selection leads to more efficient training than non-adaptive task selection was partially confirmed. However, the hypothesis that adaptive task selection based on personalised efficiency leads to more efficient training than adaptive task selection based on personalised preference was not supported. The results are discussed and suggestions are given for future research.

8.
1 Introduction The orthogonal frequency division multiplexing (OFDM) system is becoming a chosen modulation technique for wireless communications, which can provide large data rates with sufficient robustness to radio channel impairments. Recently, there have been many attempts to further improve the OFDM system performance. Among them, an adaptive OFDM scheme has attracted much attention [1,2], in which the modulation mode of each subcarrier is adaptively changed with the channel quality. However, such impr…

9.
The purpose of the present study was to develop and evaluate a scale to measure adaptive behavior skills in Chinese children with autism spectrum disorder (ASD). Participants were 121 young children (M = 55.18 months, SD = 0.18 months) with a formal diagnosis of ASD (73% male). Psychometric evaluation indicated that the reliability and validity of this scale were good. Furthermore, independent t‐tests revealed that boys demonstrated better adaptive behavior skills than girls. The present findings suggest that the scale is a valid measure of adaptive behavior skills in Chinese children with ASD.

10.
11.
The survey investigated the problems of social desirability (SD), non‐response bias (NRB) and reliability in the Minnesota Multiphasic Personality Inventory – Revised (MMPI‐2) self‐report inventory administered to Brunei student teachers. Bruneians scored higher on all the validity scales than the normative US sample, thereby threatening the internal validity of the study. Of the three validity scales that assess various forms of SD, only the F scale was reliable and its mean score was in the clinical range. In addition, seven of the ten clinical scales had poor reliability. Although Brunei males scored much higher on the K scale than females, both mean scores were below the critical region. Protocols for two respondents with many missing values indicated that the study’s external validity was vulnerable to NRB effects. Altogether, SD, NRB and low reliability had the potential to undermine and depress the overall validity of the MMPI‐2 and call into question the value of using it ‘as is’ in Brunei.

12.
Peer assessment exercises yield varied reliability and validity. To maximise reliability and validity, the literature recommends adopting various design principles including the use of explicit assessment criteria. Counter to this literature, we report a peer assessment exercise in which criteria were deliberately avoided yet acceptable reliability and validity were achieved. Based on this finding, we make two arguments. First, the comparative judgement approach adopted can be applied successfully in different contexts, including higher education and secondary school. Second, the success was due to this approach; an alternative technique based on absolute judgement yielded poor reliability and validity. We conclude that sound outcomes are achievable without assessment criteria, but success depends on how the peer assessment activity is designed.

13.
This study aims to identify an adequate approach for revealing conceptual understanding in higher professional education. Revealing students’ conceptual understanding is an important step towards developing effective curricula, assessment and aligned teaching strategies to enhance conceptual understanding in higher education. Essays and concept maps were used to determine how students’ conceptual understanding of international business can be revealed adequately. To this end, 132 international business students in higher professional education were randomly assigned to four conditions to write essays and to construct concept maps about an international business research topic. The conditions were: essay alone, essay after concept map, concept map alone, and concept map after essay. An assessment rubric was used to assess the breadth and depth of students’ conceptual understanding. Results show essays are the most adequate approach for revealing conceptual understanding of international business. In particular, concept maps revealed fewer facts and less reasoning than essays. Essays written after concept maps were less effective than essays, possibly since students perceived these essays as redundant. Further research is suggested on how educators can foster conceptual understanding.

14.
The World Wide Web is increasingly being used as a vehicle for flexible learning, where learning is seen to be free from time, geographical, and participation constraints. In addition to flexibility, the Web facilitates student-centered approaches, creating a motivating and active learning environment. The purpose of this study is to set up an adaptive learning environment on the Internet and to experiment with the most suitable methods and applications. Our goal is to provide a better solution with regard to the related distance learning research. All the resources and background are from current relevant documents on the theory of asynchronous distance education. We set up an adaptive Internet learning system based on learning theory and related learning models. Our research targets are those students who took the ‘Life Chemistry’ course for the asynchronous distance education environment at Providence University in Taiwan. The students were divided randomly into two groups: the experimental group, which was in an adaptive learning environment, and the control group, which was in a non-adaptive one. We used the American Chemistry Society Test Bank as our research tool and used SPSS to analyse the data we obtained. Results show that the experimental group in the adaptive learning environment out-performs the control group. In addition, those students who are field independent learning types, have higher pre-knowledge, are male, in science departments and have a longer study time span in an adaptive learning environment show much greater achievement levels than those in the opposite situations.

15.
The World Wide Web is increasingly being used as a vehicle for flexible learning, where learning is seen to be free from time, geographical, and participation constraints. In addition to flexibility, the Web facilitates student-centered approaches, creating a motivating and active learning environment. The purpose of this study is to set up an adaptive learning environment on the Internet and to experiment with the most suitable methods and applications. Our goal is to provide a better solution with regard to the related distance learning research. All the resources and background are from current relevant documents on the theory of asynchronous distance education. We set up an adaptive Internet learning system based on learning theory and related learning models. Our research targets are those students who took the ‘life chemistry’ course for the asynchronous distance education environment at Providence University in Taiwan. The students were divided randomly into two groups: the experimental group, which was in an adaptive learning environment, and the control group, which was in a non-adaptive one. We used the American Chemistry Society test bank as our research tool and used SPSS to analyse the data we obtained. Results show that the experimental group in the adaptive learning environment out-performs the control group. In addition, those students who are field independent learning types, have higher pre-knowledge, are male, in science departments and have a longer study time span in an adaptive learning environment show much greater achievement levels than those in the opposite situations.

16.
AN ADAPTIVE CDMA RAKE RECEIVER Xue Guoqiang (薛国强), Cheng Shixin (程时昕) (National Communications Research Laboratory) AN ADAPTIVE CDMA RAKE RECEIVE...

17.
The purpose of this study was to examine the quality assurance issues of a national English writing assessment in Chinese higher education. Specifically, using generalizability theory and rater interviews, this study examined how the current scoring policy of the TEM-4 (Test for English Majors – Band 4, a high-stakes national standardized EFL assessment in China) writing could impact its score variability and reliability. Eighteen argumentative essays written by nine English major undergraduate students were selected as the writing samples. Ten TEM-4 raters were first invited to use the authentic TEM-4 writing scoring rubric to score these essays holistically and analytically (with time intervals in between). They were then interviewed for their views on how the current scoring policy of the TEM-4 writing assessment could affect its overall quality. The quantitative generalizability theory results of this study suggested that the current scoring policy would not yield acceptable reliability coefficients. The qualitative results supported the generalizability theory findings. Policy implications for quality improvement of the TEM-4 writing assessment in China are discussed.

18.
INNLEDNING (Introduction)
Engvik, H., Kvale, S. & Havik, O. E. (1970). Rater Reliability in Evaluation of Essay and Oral Examinations. Scand. J. educ. Res. 14, 195‐220. The rater reliability for the examination system at the Psychological Institute in Oslo was investigated. The essay and oral performances of the candidates are evaluated by an examination committee of three. Significant differences in arithmetic mean were found both among and within the committees. When rating the same essays within a committee a wide variation of reliability coefficients was found, from −.16 to +.90. At the critical boundaries of the scale, such as the Laudabilis boundary for access to further study of psychology, considerable variations between raters were demonstrated. There was a slight but significant trend for female students to improve more than male students at the oral examination. The general rater reliability found is not satisfactory, either with respect to current standards for psychometric tests or with respect to the importance of the marks for the individual students.

19.
We report one teacher’s response to a top-down shift from external examinations to internal teacher assessment for summative purposes in the Republic of Ireland. The teacher adopted a comparative judgement approach to the assessment of secondary students’ understanding of a chemistry experiment. The aims of the research were to investigate whether comparative judgement can produce assessment outcomes that are valid and reliable without producing undue workload for the teachers involved. Comparative judgement outcomes correlated as expected both with test marks and with existing student achievement data, supporting the validity of the approach. Further analysis suggested that teacher judgement privileged scientific understanding, whereas marking privileged factual recall. The estimated reliability of the outcome was acceptably high, but comparative judgement was notably more time-consuming than marking. We consider how validity and efficiency might be improved and the contributions that comparative judgement might offer to summative assessment, moderation of teacher assessment and peer assessment.

20.
A study was undertaken to determine the effects on essay scores of intermingling handwritten and word-processed versions of student essays. A sample of examinees, each of whom had produced both a handwritten and a word-processed essay, was drawn from a larger sample of students who had participated in a pilot study of a new academic skills assessment battery. Students' original handwritten essays were converted to word-processed versions, and their original word-processed essays were converted to handwritten versions. Analyses revealed higher average scores for essays scored in the handwritten mode than for essays scored as word processed, regardless of the mode in which essays were originally produced. Several hypotheses were advanced to explain the discrepancies between scores on handwritten and word-processed essays. The training of essay readers was subsequently modified on the basis of these hypotheses, and the experiment was repeated using the modified training with a new set of readers.
