Similar Literature
 20 similar documents found (search time: 31 ms)
1.
Two conventional scores and a weighted score on a group test of general intelligence were compared for reliability and predictive validity. One conventional score consisted of the number of correct answers an examinee gave in responding to 69 multiple-choice questions; the other was the formula score obtained by subtracting from the number of correct answers a fraction of the number of wrong answers. A weighted score was obtained by assigning weights to all the response alternatives of all the questions and adding the weights associated with the responses, both correct and incorrect, made by the examinee. The weights were derived from degree-of-correctness judgments of the set of response alternatives to each question. Reliability was estimated using a split-half procedure; predictive validity was estimated from the correlation between test scores and mean school achievement. Both conventional scores were found to be significantly less reliable but significantly more valid than the weighted scores. (The formula scores were neither significantly less reliable nor significantly more valid than number-correct scores.)
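
The two conventional scores in entry 1 can be illustrated with a minimal sketch. The abstract says only that "a fraction" of the wrong answers is subtracted; the sketch assumes the common correction-for-guessing fraction 1/(k - 1) for k-option items. The function names, the five-option setting, and the sample responses are hypothetical and do not come from the study.

```python
def number_correct(responses, key):
    """Count answered items whose response matches the key."""
    return sum(1 for r, k in zip(responses, key) if r is not None and r == k)


def formula_score(responses, key, n_options):
    """Number correct minus a fraction of the number wrong.

    Assumption: the fraction is 1 / (n_options - 1); the abstract
    does not state which fraction was used. Omitted items (None)
    count as neither right nor wrong.
    """
    right = number_correct(responses, key)
    wrong = sum(1 for r, k in zip(responses, key) if r is not None and r != k)
    return right - wrong / (n_options - 1)


# Hypothetical five-item test with five options per item.
key = ["A", "B", "C", "D", "A"]
responses = ["A", "B", "D", None, "C"]  # two right, two wrong, one omitted

print(number_correct(responses, key))               # number-correct score
print(formula_score(responses, key, n_options=5))   # formula score
```

Under this assumption an examinee who guesses blindly on every item has an expected formula score of zero, which is the usual motivation for the correction.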

2.
The cognitive thought processes involved in students’ answers to different kinds of teachers’ questions were investigated using data obtained from a previous study. The dimensions examined were (a) the degree of correspondence between the cognitive level of teachers’ questions and the cognitive level of students’ answers, and (b) the relation of that correspondence to the type of cognitive coding system used, grade level, and clarity of the questions and answers. It was found that the chances are about even that there will be a correspondence between the cognitive level of the question asked and the cognitive level of the response that was elicited. The coding system used, grade level of the students, and clarity of the questions each moderated this effect.

3.
High-ability pupils in primary schools often do not achieve their full potential, and teachers seem to have difficulty motivating these pupils. In this study 891 primary school pupils (463 high-ability pupils) were asked about their views on desired characteristics of good teachers by means of an open teacher-spider-questionnaire. The characteristics reported were analysed using the three “basic needs” from the Self-Determination Theory. The answers of high-ability pupils were compared to answers of pupils from regular primary education. For both groups, teaching characteristics fostering relatedness, followed by competence, were mentioned most. Autonomy was mentioned less frequently by both groups. The answers of the two groups of pupils mostly corresponded, although some differences emerged in specific subcategories. High-ability pupils more frequently mentioned characteristics attuning to their needs (understanding) and encouragement (challenge), and mentioned “providing choice” less often. Some differences were also found between the characteristics mentioned by (high-ability) boys and girls.

4.
Recent studies have shown that restricting review and answer change opportunities on computerized adaptive tests (CATs) to items within successive blocks reduces time spent in review, satisfies most examinees' desires for review, and controls against distortion in proficiency estimates resulting from intentional incorrect answering of items prior to review. However, restricting review opportunities on CATs may not prevent examinees from artificially raising proficiency estimates by using judgments of item difficulty to signal when to change previous answers. We evaluated six strategies for using item difficulty judgments to change answers on CATs and compared the results to those from examinees reviewing and changing answers in the usual manner. The strategy conditions varied in terms of when examinees were prompted to consider changing answers and in the information provided about the consistency of the item selection algorithm. We found that examinees fared best on average when they reviewed and changed answers in the usual manner. The best gaming strategy was one in which the examinees knew something about the consistency of the item selection algorithm and were prompted to change responses only when they were unsure about answer correctness and sure about their item difficulty judgments. However, even this strategy did not produce a mean gain in proficiency estimates.

5.
Results obtained from computer-adaptive and self-adaptive tests were compared under conditions in which item review was permitted and not permitted. Comparisons of answers before and after review within the "review" condition showed that a small percentage of answers was changed (5.23%), that more answers were changed from wrong to right than from right to wrong (by a ratio of 2.92:1), that most examinees (66.5%) changed answers to at least some questions, that most examinees who changed answers improved their ability estimates by doing so (by a ratio of 2.55:1), and that review was particularly beneficial to examinees at high ability levels. Comparisons between the "review" and "no-review" conditions yielded no significant differences in ability estimates or in estimated measurement error and provided no trustworthy evidence that test anxiety moderated the effects of review on those indexes. Most examinees desired review, but permitting it increased testing time by 41%.
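
Several entries in this list (5, 14, 15, 19, and 20) report how often changed answers move from wrong to right versus right to wrong. As a rough sketch of how such a tally could be computed from an examinee's initial answers, final answers, and the answer key, consider the following. The function name, the category labels, and the sample data are hypothetical and are not taken from any of the studies.

```python
def tally_changes(initial, final, key):
    """Classify every changed answer by its effect on correctness."""
    counts = {"wrong_to_right": 0, "right_to_wrong": 0, "wrong_to_wrong": 0}
    for first, last, correct in zip(initial, final, key):
        if first == last:
            continue  # answer was not changed
        if last == correct:
            counts["wrong_to_right"] += 1
        elif first == correct:
            counts["right_to_wrong"] += 1
        else:
            counts["wrong_to_wrong"] += 1
    return counts


# Hypothetical five-item record for one examinee.
key     = ["A", "C", "C", "A", "C"]
initial = ["A", "B", "C", "D", "A"]
final   = ["A", "C", "B", "D", "B"]

counts = tally_changes(initial, final, key)
net_gain = counts["wrong_to_right"] - counts["right_to_wrong"]
percent_changed = 100 * sum(counts.values()) / len(key)

print(counts)
print(net_gain, percent_changed)
```

The net gain here is simply wrong-to-right changes minus right-to-wrong changes; wrong-to-wrong changes leave the score unaffected.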

6.
This paper presents the findings from a small-scale experiment investigating the presentation of a synchronous remote electronic examination. It discusses the students' experiences of taking such an examination. The study confirms that the majority of participants found the experience at least as good as a conventional written examination. In addition, typing answers does not prevent students from producing answers in the time available. However, the pressure of time continues to be a major cause of anxiety for students. The paper discusses technical issues, particularly those related to the loss of communications during the 3-hour duration of the exam. Although software processes were available to save and restore students' answers throughout the examination, problems still occurred and more robust software is required.

7.
8.
We report on an investigation of students' ideas about gravity after a semester of instruction in physics at university. There are two aspects to the study, which was concerned with students' answers to a carefully designed qualitative examination question on gravity. The first aspect is a classification of the answers and a comparative study of the ways the problem was tackled by two large groups of students who had different backgrounds in physics and were exposed to different teaching styles. The second aspect is to investigate how students link concepts to solve the problem. We used a phenomenographic analysis of student responses to extract patterns of reasoning and alternative conceptions behind the solutions. We found no differences between the classes of answers given by students in the two courses. Our analysis also identifies a hierarchy in the complexity of the hypothetical reasoning pathways, which we interpret as reflecting the ways in which students may link concepts and resolve conflicts as they solve the problem. The hypothetical reasoning pathways may help educators to develop instructional material or lecture room dialogue in order to tease out key issues. An unexpected finding is a discrepancy between our conclusion that the two groups of answers are similar and the distribution of marks awarded by the examiner, which implies that the quality of the answers is different for the two groups.

9.
This study examines to what extent assessment of text comprehension involves knowledge of the properties of human cognition (theory of mind) and the social context of assessment. The subjects (N=332) were asked to read a text and then assess eight answers to questions about this text. The independent variables were the quality of the answers to requests for paraphrases, the quality of answers to direct questions about the meaning of the text, the order of the paragraphs in the text and the human vs. artificial source attributed to the answers. Results show that answers to requests for paraphrases were thought to be better when the source was artificial rather than human. Conversely, answers to direct questions about the meaning of the text were thought to be better when their source was human. Assessments of answers attributed to a human source were differentiated by a greater integration of contiguous assessments (contrast effect between contiguous assessments). This was noted more particularly for a person than for a machine, poor paraphrasing being followed by a better assessment of answers to questions about the meaning of the text. The assessment of human understanding of a text is hypothesised to be guided by an expectation of answer coherency and a wider and more structured knowledge than the assessment of artificial answers.

10.
Approximately one-half of the fifth through eighth graders in a school district (n = 164) were randomly selected to be administered a group test of disjunctive reasoning containing 48 inclusive and exclusive items varying in content of the premises (symbolic, object, and human), and affirmation or negation of the conclusion. Using an analysis of variance for repeated measures, it was found that performance improved until seventh grade. Eighth graders scored similarly to sixth graders. There was a main effect for negation, with negative conclusions producing more correct answers. Further, there were significant first-order interactions for Disjunctive by Negation, Content by Grade, and Negation by Content. Since “YES” and “NO” were the only correct answers, and “MAYBE” was always wrong, a contrast of the MAYBE responses with other wrong answers revealed an increasing tendency to use MAYBE among older subjects. Implications were discussed in relation to cognitive developmental theory and educational practices.

11.
The aim of the study was to discover the essential characteristics of engineering teachers' pedagogical content knowledge by studying teachers' conceptions of their students' ideas of moment. To compare the conceptions maintained by teachers with those of their students, the most common difficulties experienced by first-year engineering students in understanding the moments of forces were looked at. The data on students' conceptions were collected by means of a questionnaire. In addition, four experienced teachers were given the same questionnaire as the students and then were asked to write what they expected the students' answers to be. The students' answers and the teachers' conceptions of their students' potential answers were compared. It was found that although the teachers originally appeared to be familiar with their students' conceptions, they were rather astonished by the general pattern of the students' thinking. It is planned that the information gathered about the teachers' pedagogical content knowledge will eventually be used to improve engineering teacher training.

12.
The use of content validity as the primary assurance of the measurement accuracy for science assessment examinations is questioned. An alternative accuracy measure, item validity, is proposed. Item validity is based on research using qualitative comparisons between (a) student answers to objective items on the examination, (b) clinical interviews with examinees designed to ascertain their knowledge and understanding of the objective examination items, and (c) student answers to essay examination items prepared as an equivalent to the objective examination items. Calculations of item validity are used to show that selected objective items from the science assessment examination overestimated the actual student understanding of science content. Overestimation occurs when a student correctly answers an examination item, but for a reason other than that needed for an understanding of the content in question. There was little evidence that students incorrectly answered the items studied for the wrong reason, resulting in underestimation of the students' knowledge. The equivalent essay items were found to limit the amount of mismeasurement of the students' knowledge. Specific examples are cited and general suggestions are made on how to improve the measurement accuracy of objective examinations.

13.
14.
Changing a small number of answers to multiple-choice questions reliably improves test scores, although it remains unclear how examinees select which initial answers to change and whether answer-changing behaviour is susceptible to instruction. We tested the effect of an instructional intervention on the number of changes made by examinees on a mock exam in a controlled experimental design. We also examined how examinees' confidence in their initial answers, and their judgement of how difficult each exam question was, predicted their answer-changing behaviour. We found that the number of changes made increased, to a small extent, through instruction, without increasing the rate of errors. The likelihood of changing an initial answer decreased with examinees’ feeling of confidence, and increased with their feeling of difficulty. This is consistent with the theory that examinees use metacognitive experiences to select which initial answers to change on exams.

15.
The hypothesis that it is unwise to change answers to multiple choice questions was tested using the technique of multiple regression analysis. The net number of correct answers as a result of changing responses was regressed against final grade in the course, numeric score on the exam, percent of total answers changed for all questions and for analytical questions, sex of the student, and scope of the exam.
The results show that there are gains to be made by changing responses. The variables which proved to be significant indicated that students who did well on the test changed a large percentage of answers, and that those who were taking a final exam tended to gain more. Final grades, sex of the student, and analytical questions had no significant impact on gains from changing responses. On the basis of the results gathered, the authors reject the hypothesis that changing responses is unwise.

16.
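Teacher-student question-and-answer exchanges are the most common form of communication in teaching. The relationships between teacher and student, and between question and answer, can be studied and understood in terms of psychological effects. Psychological effects can help teachers recognize students' purposes in answering questions and the problems that arise. Corresponding measures should be taken at both the teacher and the student level to help teacher-student questioning proceed smoothly.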

17.
This research documents Kuwaiti eighth grade students’ performance in recognizing reasonable answers and the strategies they used to determine reasonableness. The results from over 200 eighth grade students show they were generally unable to recognize reasonable answers. Students’ performance was consistently low across all three number domains (whole numbers, fractions, and decimals). There was no significant difference in students’ performance on items that focused on the practicality of the answers or on items that focused on the relationships of numbers and the effect of operations, or on both. Interview data revealed that 35% of the students’ strategies were derived from two criteria for judging answers for reasonableness: the relationships of numbers and the effect of operations, and the practicality of the answers. They used strategies such as estimation, numerical benchmarks, real-world benchmarks, and applied their understanding of the meaning of operations. However, over 60% of the students’ strategies were procedurally driven. That is, they relied on algorithmic techniques such as carrying out paper-and-pencil procedures. Additionally, some of the students’ strategies reflected misunderstandings of how and when to apply certain procedures. Given these findings, mathematics education in Kuwait should shift the emphasis from paper-and-pencil procedures and provide systematic attention to the development of number sense and computational estimation so Kuwaiti students will be more adept at recognizing reasonable answers.

18.
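Judgment under Uncertainty and Guessing Strategies for Multiple-Choice Reading Questions   Total citations: 1 (self-citations: 0, citations by others: 1)
This study applies the theory of judgment under uncertainty and uses the think-aloud method to examine how test takers guess at multiple-choice reading questions when they are uncertain. It found that test takers used heuristic strategies such as representativeness, availability, and anchoring and adjustment. Anchoring and adjustment was the most effective; the "anchors" commonly used to adjust guesses included background knowledge, common sense, logical reasoning, and test-taking knowledge. These findings have implications for item writing: they can help improve item quality, lower the success rate of guessing, and increase test validity.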

19.
An attempt was made to extend and clarify prior research which had demonstrated consistently that changed answers to objective test items tend to be correct. First, results extended the basic effect of profiting from changed answers to Air Force personnel responding to multiple-choice questions regarding technical skills; the profit from changes was very similar to that observed in a university group responding to relatively "academic" items. Second, most individuals in both groups profited from changes. Third, individuals with the highest test scores tended to profit more from changes than those with the lowest test scores. Fourth, neither Airman Qualifying Exam scores (for the military personnel) nor Scholastic Aptitude Test scores (for the university students) were related to profit. Finally, a systematic case against the popular belief that one should not change answers on objective tests was made, based on an integration of the research to date.

20.
Six undergraduate and three graduate classes were given multiple-choice tests with subsequent evaluation of answer changes. The 300 students were tested twice, once before and once after instruction on answer changing. After each test, students were asked to complete two forms. The forms evaluated attitude toward answer changing, reasons for changing, and confidence in final answers. Students showed a significant increase in favorability toward answer changing after instruction. No significant change was found in number of answers changed. Psychology students were found to change significantly more items than were business students. Mean gain score did not change significantly after instruction. It was concluded that although instruction does lead to a change in attitude toward answer changing, the number of changes and overall gain due to answer changing do not change. It was also determined that students continue to make significant gains even when their confidence in the final answer is less than 50 on a 100-point scale.
