Similar Literature
20 similar articles found (search time: 31 ms)
1.
Educational Assessment, 2013, 18(2): 165-176
A regression analysis was carried out to assess the contributions of passage and no-passage factors to item variance on the Scholastic Aptitude Test reading comprehension task. Unlike earlier regression studies of multiple-choice reading tasks, no-passage factors were experimentally isolated from passage factors, and passage factors from the multiple-choice context. Results showed that no-passage factors play a larger role than do passage factors, accounting for as much as three fourths of systematic variance in item difficulty and more than half of total variance. The task, therefore, appears largely to reflect the systematic influence of factors having nothing to do with the comprehension of reading passages.

2.
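Applying item response theory, this study analyzes how the writers of multiple-choice items in reading comprehension tests affect test validity, looking at two quantities: examinees' estimated reading ability and the estimated difficulty of the items. In the experimental design, two groups of examinees both took a "reading proficiency test", and the ability estimates derived from its results showed no significant difference between the groups. The two groups were then each given items written by one of two different item writers. Although all of these items were based on the same reading materials and their difficulty values did not differ significantly, the examinees' performance differed significantly. Under the Rasch model, examinee performance is jointly determined by examinee ability and item difficulty. It can therefore be inferred that the items written by different item writers affected examinee performance and, in turn, the validity of reading comprehension tests that use multiple-choice items.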
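
The Rasch relationship mentioned in the abstract above can be sketched in a few lines. This is only an illustration of the standard Rasch (one-parameter logistic) formula, not the study's own analysis code; the function name `rasch_probability` and the ability and difficulty values are hypothetical.

```python
import math


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct response under the Rasch (1PL) model.

    theta: examinee ability, b: item difficulty, both on the logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# Hypothetical abilities, for illustration only (not data from the study).
for ability in (-1.0, 0.0, 1.0):
    print(ability, round(rasch_probability(ability, b=0.0), 3))
```

Because the probability depends only on the difference between ability and difficulty, two groups with equal ability facing items of equal difficulty are expected to perform alike, which is why a significant performance gap points to some other influence such as the item writer.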

3.
This study reports the results of a componential analysis of items comprising Sections A and C of Form Z of the reading comprehension portions of the California Achievement Tests (CAT) (Tiegs & Clark, 1963). A set of problem components or attributes characterizing the test items in terms of manifest content, psychologically salient features, and processing demands was developed, including methods for their quantification. The contributions of these components to task difficulty were then evaluated using linear regression methodology. Item difficulty indices were transformations of the familiar proportion-correct item score, obtained from data gathered during the spring of 1989 from 158 deaf examinees. Variation in the item difficulty values was substantially accounted for in terms of a small number of predictor variables (R² ≥ .90). Implications of the results for construct validity and interpretation of test scores are discussed.

4.
While previous research has identified numerous factors that contribute to item difficulty, studies involving large-scale reading tests have provided mixed results. This study examined five selected-response item types used to measure reading comprehension in the Pearson Test of English Academic: a) multiple-choice (choose one answer), b) multiple-choice (choose multiple answers), c) re-order paragraphs, d) reading (fill-in-the-blanks), and e) reading and writing (fill-in-the-blanks). In a multiple regression approach, the criterion measure consisted of item difficulty scores for 172 items. Eighteen passage, passage-question, and response-format variables served as predictors. Overall, four significant predictors were identified for the entire group (i.e., sentence length, falsifiable distractors, number of correct options, and abstractness of information requested) and five variables were found to be significant for high-performing readers (including the four listed above and passage coherence); only the number of falsifiable distractors was a significant predictor for low-performing readers. Implications for assessing reading comprehension are discussed.

5.
The purpose of this study is to introduce and evaluate a method for generating reading comprehension items using template-based automatic item generation. To begin, we describe a new model for generating reading comprehension items called the text analysis cognitive model assessing inferential skills across different reading passages. Next, the text analysis cognitive model is used to generate reading comprehension items where examinees are required to read a passage and identify the irrelevant sentence. The sentences for the generated passages were created using OpenAI GPT-3.5. Finally, the quality of the generated items was evaluated. The generated items were reviewed by three subject-matter experts. The generated items were also administered to a sample of 1,607 Grade-8 students. The correct options for the generated items produced a similar level of difficulty and yielded strong discrimination power while the incorrect options served as effective distractors. Implications of augmented intelligence for item development are discussed.

6.
A concurrent speaking paradigm was used to assess the importance of subvocalization during the reading of lengthy natural prose passages. Experiment 1 showed that having subjects count aloud while reading interfered with their comprehension and recall of the text's details as well as its gist, but did not affect the durability of the memory trace. Experiment 2 replicated these findings and established the validity of using concurrent speaking as a technique to interfere with speech-specific processes during silent reading. By pitting concurrent speaking against a nonverbal concurrent task, Experiment 2 provided evidence that its detrimental effect on comprehension was due to a competition for speech-related resources rather than a general competition for cognitive resources. Interfering with speech recoding during silent reading led to an average decrement of 10–12% in comprehension performance. However, Experiment 2 also showed that there were substantial individual differences in the magnitude of the speech interference effect and that these differences were systematically related to subjective reports about the concurrent speaking manipulation.

7.
Test items become easier when a representational picture visualizes the text item stem; this is referred to as the multimedia effect in testing. To uncover the processes underlying this effect and to understand how pictures affect students' item-solving behavior, we recorded the eye movements of sixty-two schoolchildren solving multiple-choice (MC) science items either with or without a representational picture. Results show that the time students spent fixating the picture was compensated for by less time spent reading the corresponding text. In text-picture items, students also spent less time fixating incorrect answer options, a behavior that was associated with better test scores in general. Detailed gaze likelihood analyses revealed that the picture received particular attention right after item onset and in the later phase of item solving. Hence, comparable to learning, pictures in tests seemingly boost students' performance because they may serve as mental scaffolds, supporting comprehension and decision making.

8.
The integration of knowledge during reading was tested in 1,109 secondary school students. Reading times for the second sentence in a pair (Jane’s headache went away) were compared in conditions where the first sentence was either causally or temporally related to the second sentence (Jane took an aspirin vs. Jane looked for an aspirin). Mixed-effects explanatory item response models revealed that at higher comprehension levels, sentences were read more quickly in the causal condition. There were no condition-related reading time differences at lower comprehension levels. This interaction held with comprehension- and inference-related factors (working memory, word and world knowledge, and word reading efficiency) in the models. Less skilled comprehenders have difficulty in knowledge-text integration processes that facilitate sentence processing during reading.

9.
This study assessed the effects of curriculum on technical features within curriculum-based measurement in reading. Curriculum was defined as the difficulty of material and the basal series from which students read. Technical features were the criterion validity and developmental growth rates associated with the measurement. Ninety-one students took a commercial, widely used test of reading comprehension and read orally for 1 minute from each of 19 passages, one from each grade level within two reading series. Correlations between the oral reading samples and the test of reading comprehension were similar across difficulty levels and across series. Developmental growth rates also remained strong regardless of difficulty level and series.

10.
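To ensure the quality of language test items and to strengthen item bank construction, this paper uses Gitest Ⅲ, on the basis of classical test theory, to conduct an item analysis of the reading section of a college entrance examination paper. The results show that the difficulty and discrimination of the reading items are fairly satisfactory, but the distribution of item difficulty is not. The paper recommends pretesting any paper assembled from the item bank before it is used, in order to improve the difficulty distribution and the quality of some items' answer options, and thereby raise the reliability and validity of the test.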
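
The two classical test theory statistics named in the abstract above, difficulty and discrimination, can be sketched as follows. This is a minimal illustration under stated assumptions, not a reproduction of Gitest Ⅲ: difficulty is taken as the proportion of correct responses, discrimination as the correlation between an item and the total score on the remaining items, and the response matrix and function names are hypothetical.

```python
from statistics import mean, pstdev


def item_difficulty(responses, item):
    """Proportion of examinees answering the item correctly (CTT p-value)."""
    return mean([row[item] for row in responses])


def item_discrimination(responses, item):
    """Correlation between the item score and the score on all other items."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    sd_item, sd_rest = pstdev(item_scores), pstdev(rest_scores)
    if sd_item == 0 or sd_rest == 0:
        return float("nan")  # undefined when either score has no variance
    m_item, m_rest = mean(item_scores), mean(rest_scores)
    cov = mean([(x - m_item) * (y - m_rest)
                for x, y in zip(item_scores, rest_scores)])
    return cov / (sd_item * sd_rest)


# Hypothetical 0/1 response matrix: rows are examinees, columns are items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
for j in range(3):
    print(j, item_difficulty(responses, j), item_discrimination(responses, j))
```

A pretest run through a routine like this would show whether the difficulty values cluster too tightly, which is the distribution problem the abstract reports.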

11.
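Whether the content of the passages selected for a reading comprehension test is fair to examinee subgroups with different academic backgrounds is important evidence of test validity and an important part of test validation. Taking examinees majoring in Chinese language and literature as the focal group, and examinees majoring in economics and in biomedicine as two reference groups, this study combined criterion measurement with implicational scale analysis to examine whether passage difficulty in the HSK (Advanced) reading comprehension test is fair to the three groups. The results show that although the two reference groups each had their own relative disciplinary advantages, the difficulty ordering they obtained across the six reading passages was identical to that of the focal group. The focal group had no disciplinary advantage beyond knowledge of Chinese, but because the reading materials selected for the HSK involve no specialized requirements beyond language knowledge itself, the test is highly fair to examinees from the three academic backgrounds.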

12.
In this study, 180 Norwegian fifth‐grade students with a mean age of 10.5 years were administered measures of word recognition skills, strategic text processing, reading motivation and working memory. Six months later, the same students were given three different multiple‐choice reading comprehension measures. Based on three forced‐order hierarchical multiple regression analyses, results indicated that the unique contribution of measured skills and processes to performance varied across comprehension tests. In particular, when the test consisted of a longer passage, contained a larger proportion of inferential questions and was answered without access to relevant text passages, the relative importance of word recognition skills seemed to be reduced while working memory emerged as a relatively strong, unique positive predictor of comprehension performance. These findings have important practical implications for the assessment of reading comprehension.

13.
The inference mediation hypothesis (IMH) assumes that individual difference factors that affect reading proficiency have direct and indirect effects on comprehension outcomes, with the indirect effects involving inference processes. The present study tested the IMH in a diverse sample of two- and four-year college students in a task that emphasizes comprehension of the passage (traditional assessment) and a task that emphasizes complex problem solving (scenario-based assessment, SBA). Participants were administered assessments of foundational skills that support reading, inference generation, a traditional assessment of comprehension proficiency, and a scenario-based reading assessment. The results support the IMH. However, the strength of the indirect relationships depended on the type of reading performance assessment. Coherence-building inferences partially mediated the relationship for the traditional assessment. Elaborative inferences partially mediated the relationship for the scenario-based assessment. The results are discussed in terms of theories of purposeful reading and implications for understanding college readiness.

14.
Using a New Statistical Model for Testlets to Score TOEFL   Cited by: 1 (self-citations: 0; citations by others: 1)
Standard item response theory (IRT) models fit to examination responses ignore the fact that sets of items (testlets) often are matched with a single common stimulus (e.g., a reading comprehension passage). In this setting, all items given to an examinee are unlikely to be conditionally independent (given examinee proficiency). Models that assume conditional independence will overestimate the precision with which examinee proficiency is measured. Overstatement of precision may lead to inaccurate inferences as well as prematurely ended examinations in which the stopping rule is based on the estimated standard error of examinee proficiency (e.g., an adaptive test). The standard three-parameter IRT model was modified to include an additional random effect for items nested within the same testlet (Wainer, Bradlow, & Du, 2000). This parameter, γ, characterizes the amount of local dependence in a testlet.
We fit 86 TOEFL testlets (50 reading comprehension and 36 listening comprehension) with the new model, and obtained a value for the variance of γ for each testlet. We compared the standard parameters (discrimination (a), difficulty (b) and guessing (c)) with those obtained through traditional modeling. We found that difficulties were well estimated either way, but estimates of both a and c were biased if conditional independence was incorrectly assumed. Of greater import, we found that test information was substantially over-estimated when conditional independence was incorrectly assumed.

15.
Applications of traditional unidimensional item response theory models to passage-based reading comprehension assessment data have been criticized based on potential violations of local independence. However, simple rules for determining dependency, such as including all items associated with a particular passage, may overestimate the dependency that actually exists among the items. The current study proposed a more refined method based on cognitive principles and substantive theories to determine those items that pose a threat. Specifically, the use of common necessary information from text was examined as a contributor of local dependence. Cognitively similar item pairs, those with connected necessary information, had higher local dependence values than item pairs with no connected necessary information. Results suggest that focusing on necessary information may be useful to some extent for understanding and managing item dependence for passage-based reading comprehension tests.

16.
In this study, we explored the relationship between beginning readers' phonological awareness and other aspects of phonological processing, specifically as manifested in short-term memory and comprehension tasks. The theoretical questions underlying the study were (a) what roles phonological processes play in children's beginning reading, from word identification through sentence comprehension, and (b) whether those roles are sufficiently related that potential difficulties at one level directly affect processing at other levels. Phonologically induced effects were observed for word-list memory and for sentence judgments for both novice readers (at the end of kindergarten) and relatively more experienced readers (end of Grades 1 and 2). For both age groups, correlational analyses revealed relationships among phonological awareness, phonological processing in list memory, and word reading. However, phonological processing in sentence comprehension was not related to other types of phonological processing. These results indicate that although phonology plays a role during comprehension, phonological processing may not be as limiting a factor in comprehension as in word reading.

17.
Low accuracy levels are often obtained when readers are asked to predict test performance over reading materials. Three investigations further explore the information readers use to make predictions during metacomprehension. Our results show that readers’ estimates are influenced by factors such as their initial impression of the reading task, based in part on their perceptions surrounding text genre and test item type. To explain these and other published results, a new framework for investigating metacomprehension using Tversky and Kahneman’s (Science, 185:1124–1131, 1974) anchoring and adjustment heuristic as a guide is proposed. We argue that readers anchor comprehension test performance on factors such as self-perceptions of reading ability and/or perceptions of the reading task and then insufficiently adjust their predictions to reflect the demands of the specific reading task at hand such as text difficulty.

18.

Aims

Speed reading is advertised as a way to increase reading speed without any loss in comprehension. However, research on speed reading has indicated that comprehension suffers as reading speed increases. We were specifically interested in how processes of inference generation were affected by speed reading.

Methods

We examined how reading speed influenced inference generation in typical readers, trained speed readers and participants trained to skim read passages. Passages either strongly or weakly promoted a bridging or predictive inference. After reading, participants performed a lexical decision task on a nonword, a neutral word or an inference‐related word.

Results

Typical readers responded to strong and weak inference words faster than neutral words. There were no statistical differences in reaction time between inference‐related and neutral words for speed and skim readers.

Conclusions

These findings provide no substantive evidence that the appropriate inferences are generated when reading at rapid speeds. Thus, speed reading may be detrimental to normal integrative comprehension processes.

19.
A new type of test item was developed which required Ss to recognize groups of words, i.e., chunks, whose meaning had been changed from that in the original reading or listening passage. In one study involving 52 Ss and 20 test variables, individual differences on the chunked reading test were found to correlate .68 with a multiple-choice alternate form. In another study, the decrease in listening comprehension due to increased speech rate as measured by the chunked items was roughly parallel to the decrease as measured by the multiple-choice questions. These data were interpreted as providing evidence for the validity of the chunked items as measures of comprehension. However, other results suggested that the chunked items may be less dependent upon grammatical and vocabulary knowledge and more sensitive to within-individual changes in comprehension as compared to the traditional multiple-choice question.

20.
We examined the effect of experimenter-controlled incentives and feedback on the calibration of performance. Subjects answered 36 reading comprehension and 8 mathematical multiple-choice questions and rated the accuracy of their responses. Perfect calibration was possible only when true and estimated test performance were approximately equal. Incentives for improved performance (i.e., doubling the credit people received for correct answers) adversely affected performance and calibration compared to the same incentives for improved calibration (i.e., doubling credit for minimizing the error between true and estimated performance). Feedback had no effect on performance or accuracy nor did it interact with the incentive variable. An examination of coefficient α suggested a strong response bias by individuals when calibrating their performance; individuals tended to rate their performance accuracy consistently regardless of item difficulty or whether they answered the item correctly. Educational implications were discussed.
