Similar Articles
20 similar articles found.
1.
聂丹 《考试研究》2009,(4):79-89
Drawing on item data from past administrations of the Practical Chinese Test (C.TEST), this paper offers a comprehensive analysis of the C.TEST phonetic subtest in terms of paper quality, item-type effectiveness, and types of test points. The results show that the phonetic items discriminate well overall but are generally too easy; reading-discrimination items are clearly harder than listening-discrimination items and also discriminate better; and while average discrimination differs little across test-point types, difficulty varies noticeably. The author argues that the most pressing task for the C.TEST phonetic subtest is to raise overall item difficulty, especially for the listening-discrimination items and for items testing initials (声母) and initial-final combinations (声-韵).

2.
程力  柳博 《教育科学》2012,28(3):60-62
Based on empirical test data, this paper uses multiple linear regression to analyze the factors influencing item difficulty in the self-taught higher education examinations. The results show that the multiple linear regression model of item difficulty can broadly explain the relationship between the dependent and independent variables; this offers practical guidance for assigning pre-estimated difficulty values and provides an effective means of controlling item difficulty in item bank construction.
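As a rough illustration of the regression approach this abstract describes, the sketch below fits a multiple linear regression of observed item difficulty on a few item-level predictors. The predictors and data are invented for illustration; the abstract does not list the variables actually used in the study.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical item-level predictors: item writer's pre-estimated difficulty,
# word count of the stem, and number of knowledge points tested.
X = np.array([[0.6, 40, 1], [0.4, 85, 2], [0.7, 30, 1],
              [0.3, 120, 3], [0.5, 60, 2], [0.8, 25, 1]])
y = np.array([0.58, 0.35, 0.72, 0.22, 0.47, 0.81])  # observed pass rates

model = LinearRegression().fit(X, y)
print("coefficients:", np.round(model.coef_, 3))
print("R^2:", round(model.score(X, y), 3))
```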

3.
To help candidates meet the demands of the college entrance examination (gaokao) and raise their scores, in recent years cities or groups of comparable schools have jointly organized two or three mock examinations before the gaokao, yet the results are sometimes unsatisfactory. Among the many possible causes, the most critical is that the difficulty and discrimination of the mock items are not effectively controlled. Using examples from Zhejiang Province's gaokao comprehensive liberal-arts papers together with self-written items, this paper discusses how to control the difficulty and discrimination of gaokao mock items: how to correctly understand difficulty and discrimination, what the gaokao requires of them, and what factors influence them.

4.
A study of empirical difficulty and discrimination indices for self-study examination items (Cited by: 1; self-citations: 0; other citations: 1)
In the statistical analysis of empirical self-study examination data, examinees are divided into four target groups by test score, and item difficulty and discrimination indices are examined for each target group; the discrimination of the whole paper for each group is then computed. The aim is to develop a practical, operational system of difficulty and discrimination indices suited to self-study examinations and to supply data support for assigning item parameters in item bank construction.

5.
Discrimination is an index of an item's power to distinguish examinees of different ability levels (Note 1) and an important measure of item quality. A good paper, and a good item, must discriminate well. For selection-oriented examinations such as the gaokao, discrimination is generally expected to be above 0.4; items with discrimination below 0.3 must be revised or discarded (Note 2). How, then, can item discrimination be improved so as to guarantee test quality, and what kinds of items discriminate well? To clarify this, we made a preliminary analysis of the response data and item characteristics of the 120 multiple-choice items in the three papers (two mock examinations and one formal examination) used in Guangdong Province's 1986 standardized-examination reform. The results are as follows. 1. The relationship between difficulty and discrimination. Difficulty means how easy or hard an item is; by convention the pass rate (number of correct responses / total number of examinees) serves as the difficulty index, so a higher value indicates an easier item. A good item should have…
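A minimal sketch of the classical indices defined in this abstract: difficulty as the pass rate (number correct / total examinees) and discrimination via a high-low extreme-group method, with the 0.3 threshold mentioned above used to flag weak items. The 27% grouping convention and all variable names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def item_difficulty(scores):
    """Difficulty (pass rate): proportion of examinees answering correctly.
    Higher values mean easier items, as defined in the abstract above."""
    return scores.mean(axis=0)

def item_discrimination(scores):
    """Discrimination via the high-low extreme-group method:
    D = p(high group) - p(low group), using the top and bottom 27% by total score."""
    total = scores.sum(axis=1)
    k = max(1, int(round(0.27 * len(total))))  # size of each extreme group
    order = np.argsort(total)
    low, high = scores[order[:k]], scores[order[-k:]]
    return high.mean(axis=0) - low.mean(axis=0)

# Toy example: 6 examinees x 3 items, scored 0/1.
scores = np.array([
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
])
p = item_difficulty(scores)
d = item_discrimination(scores)
# Flag items below the 0.3 discrimination threshold cited above.
for i, (pi, di) in enumerate(zip(p, d), start=1):
    flag = "revise/discard" if di < 0.3 else "ok"
    print(f"item {i}: difficulty={pi:.2f} discrimination={di:.2f} ({flag})")
```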

6.
Many factors affect the quality of an examination paper. This paper evaluates paper quality comprehensively using normality of the score distribution together with the three classical indices (difficulty, discrimination, and reliability), proposes a reasonable and readily operational method for comprehensive evaluation of paper quality, and illustrates it with a case analysis of a class's Discrete Mathematics final examination paper.

7.
Using scores from the fall 2013 mathematics final examination at a middle school in Nanning, this paper examines the paper's reliability, validity, difficulty, discrimination, and score distribution within the framework of classical test theory (CTT). The results show that the paper is highly reliable, moderate in difficulty, and discriminates well, with broad knowledge coverage; overall quality is high.

8.
Taking Tianjin's junior-secondary physics entrance-examination paper as an example, this paper focuses on unifying difficulty and validity in entrance examinations. The main task of an entrance paper is to stream students, which requires a certain level of difficulty and discrimination; yet when difficulty is emphasized, validity is often poorly controlled, undermining the effectiveness of the examination. How to unify difficulty and validity organically is thus a crucial question in entrance-examination item writing. A sampling analysis of Tianjin's 1992 physics entrance paper found that more than half of the items had difficulty indices around 60% (difficulty index = number of correct responses / number of examinees), with one item as low as 29.05%; six items had discrimination above 0.87, and the remaining items all had discrimination above 0.56, showing positive discriminating…

9.
To evaluate the quality of the organic chemistry final-examination paper taken by the 2011 cohort of our biology department, and to provide a basis for improving teaching methods and paper quality, the items were statistically analyzed using principles of educational measurement and educational statistics. Student scores followed a normal distribution, with a mean of 73.03 and a standard deviation of 8.33; average difficulty was 0.71, discrimination 0.74, reliability 0.78, and validity 0.88. Item difficulty was thus moderate and discrimination good; the paper reflected students' mastery of organic chemistry and was reliable and valid.

10.
This paper explains the meaning and calculation of the difficulty, discrimination, reliability, and validity indices used in examination-paper quality analysis, and applies the difficulty, discrimination, and reliability indices to a quantitative analysis of the final examination paper for Enterprise Information Management taken by the school's spring 2012 business administration cohort, providing a basis for scientific paper construction.
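Several abstracts above cite a reliability index without giving its formula; one common CTT choice is Cronbach's alpha, alpha = k/(k-1) * (1 - sum of item variances / variance of total score). The sketch below is a minimal illustration of that formula and is not claimed to be the coefficient used in any of these papers.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an examinees-by-items score matrix.
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)      # per-item variance
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of examinee totals
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Reusing the toy 6x3 response matrix from the earlier sketch:
scores = np.array([
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
])
print(f"alpha = {cronbach_alpha(scores):.2f}")
```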

11.
In this study, the authors explored the importance of item difficulty (equated delta) as a predictor of differential item functioning (DIF) of Black versus matched White examinees for four verbal item types (analogies, antonyms, sentence completions, reading comprehension) using 13 GRE-disclosed forms (988 verbal items) and 11 SAT-disclosed forms (935 verbal items). The average correlation across test forms for each item type (and often the correlation for each individual test form as well) revealed a significant relationship between item difficulty and DIF value for both GRE and SAT. The most important finding indicates that for hard items, Black examinees perform differentially better than matched-ability White examinees for each of the four item types and for both the GRE and SAT tests. The results further suggest that the amount of verbal context is an important determinant of the magnitude of the relationship between item difficulty and differential performance of Black versus matched White examinees. Several hypotheses accounting for this result were explored.

12.
Addressing the limitations of current scoring methods for the subjective reading items on the gaokao Chinese test, this paper proposes a SOLO-taxonomy-based classification scoring method and a construction-integration (CI) model scoring method grounded in the cognitive processes of reading. Authentic responses of 1,019 students to three subjective gaokao reading items were scored with the three methods, and the items were then analyzed psychometrically using item response theory. Compared with the original scoring method, the SOLO and CI methods yielded higher inter-item correlations, better model fit, higher item discrimination, more reasonable difficulty thresholds and step parameters, and greater item information, with the CI method clearly outperforming the SOLO method. The study supports the potential advantage of the CI method for scoring subjective reading items on the gaokao Chinese test.

13.
The present study, carried out in the Nordic countries, examines the characteristics of students' scholastic performance on items containing graphical artefacts, that is, bar graphs, pie charts and line graphs, selected from the Programme for International Student Assessment (PISA) survey test. Graphical analysis of statistical data resulted in the observation of two major categories of performance by the students, and the results of cluster analysis confirmed the two approaches. One approach consists of items perceived as requiring identification, that is, focusing primarily on perceptual elements. The other consists of items requiring a critical-analytical approach, that is, involving evaluation of the graphical system, active interaction with subject-specific operators and forms of expression. The general observation is that the pattern of response is similar for all these countries, with items demanding an identification approach showing comparatively higher scores than items perceived as demanding a critical-analytical approach.

14.
In low-stakes assessments, some students may not reach the end of the test and leave some items unanswered due to various reasons (e.g., lack of test-taking motivation, poor time management, and test speededness). Not-reached items are often treated as incorrect or not-administered in the scoring process. However, when the proportion of not-reached items is high, these traditional approaches may yield biased scores and thereby threaten the validity of test results. In this study, we propose a polytomous scoring approach for handling not-reached items and compare its performance with those of the traditional scoring approaches. Real data from a low-stakes math assessment administered to second and third graders were used. The assessment consisted of 40 short-answer items focusing on addition and subtraction. The students were instructed to answer as many items as possible within 5 minutes. Using the traditional scoring approaches, students' responses for not-reached items were treated as either not-administered or incorrect in the scoring process. With the proposed scoring approach, students' nonmissing responses were scored polytomously based on how accurately and rapidly they responded to the items, to reduce the impact of not-reached items on ability estimation. The traditional and polytomous scoring approaches were compared based on several evaluation criteria, such as model fit indices, test information function, and bias. The results indicated that the polytomous scoring approaches outperformed the traditional approaches. The complete-case simulation corroborated our empirical findings that the scoring approach in which nonmissing items were scored polytomously and not-reached items were considered not-administered performed the best. Implications of the polytomous scoring approach for low-stakes assessments were discussed.

15.
This article reports large item effects in a study of computer-based learning of neuroanatomy. Outcome measures of the efficiency of learning, transfer of learning, and generalization of knowledge diverged by a wide margin across test items, with certain sets of items emerging as particularly difficult to master. In addition, the outcomes of comparisons between instructional methods changed with the difficulty of the items to be learned. More challenging items better differentiated between instructional methods. This set of results is important for two reasons. First, it suggests that instruction may be more efficient if sets of consistently difficult items are the targets of instructional methods particularly suited to them. Second, there is wide variation in the published literature regarding the outcomes of empirical evaluations of computer-based instruction. As a consequence, many questions arise as to the factors that may affect such evaluations. The present article demonstrates that the level of challenge in the material that is presented to learners is an important factor to consider in the evaluation of a computer-based instructional system.

16.
Difficulty is not an inherent property of an item but the result of interaction between examinee factors and item characteristics. Many item analysts tend to attribute high item difficulty solely to students' failure to master the relevant knowledge or skills, overlooking the characteristics of the items themselves. This paper analyzes 60 gaokao English items with difficulty values below 0.6 to explore the sources of their difficulty. The results show that, besides examinee factors, the difficulty of hard or over-hard items also stems from item-writing technique, including problems with the uniqueness and acceptability of answers, tested content beyond the syllabus, and poorly set test points and scoring criteria. Testing agencies are therefore advised to improve item writing and strengthen item quality control, so that large-scale examinations can select talent scientifically.

17.
In today's higher education, high-quality assessments play an important role. Little is known, however, about the degree to which assessments are correctly aimed at the students' levels of competence in relation to the defined learning goals. This article reviews previous research into teachers' and students' perceptions of item difficulty. It focuses on the item difficulty of assessments and students' and teachers' abilities to estimate item difficulty correctly. The review indicates that teachers tend to overestimate the difficulty of easy items and underestimate the difficulty of difficult items. Students seem to be better estimators of item difficulty. The accuracy of the estimates can be improved by: the information the estimators or teachers have about the target group and their earlier assessment results; defining the target group before the estimation process; the possibility of having discussions about the defined target group of students and their corresponding standards during the estimation process; and the amount of training in item construction and estimating. In the subsequent study, the ability and accuracy of teachers and students to estimate the difficulty levels of assessment items was examined. In higher education, results show that teachers are able to estimate the difficulty levels correctly for only a small proportion of the assessment items. They overestimate the difficulty level of most of the assessment items. Students, on the other hand, underestimate their own performances. In addition, the relationships between the students' perceptions of the difficulty levels of the assessment items and their performances on the assessments were investigated. Results provide evidence that the students who performed best on the assessments underestimated their performances the most. Several explanations are discussed and suggestions for additional research are offered.

18.
Based on ten indicators covering body morphology, physiological function, and physical fitness, this study surveys and analyzes the physical condition of primary school students in Ningbo. The results show that their overall physical condition is above the national norms and that urban-rural differences have narrowed, with rural students outperforming urban students on cardiopulmonary-function indicators.

19.
Nonlinear factor analysis is a tool commonly used by measurement specialists to identify both the presence and nature of multidimensionality in a set of test items, an important issue given that standard Item Response Theory models assume a unidimensional latent structure. Results from most factor-analytic algorithms include loading matrices, which are used to link items with factors. Interpretation of the loadings typically occurs after they have been rotated in order to amplify the presence of simple structure. The purpose of this simulation study is to compare two commonly used methods of rotation, Varimax and Promax, in terms of their ability to correctly link items to factors and to identify the presence of simple structure. Results suggest that the two approaches are equally able to recover the underlying factor structure, regardless of the correlations among the factors, though the oblique method is better able to identify the presence of a "simple structure." These results suggest that for identifying which items are associated with which factors, either approach is effective, but that for identifying simple structure when it is present, the oblique method is preferable.
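As a hedged illustration of the kind of simulation this abstract describes, the sketch below generates responses from two correlated factors and compares Varimax (orthogonal) and Promax (oblique) rotations of the loading matrix. It assumes the third-party factor_analyzer package; the toy design (loadings, factor correlation, sample size) is an invented stand-in, not the study's actual design.

```python
import numpy as np
from factor_analyzer import FactorAnalyzer  # third-party; pip install factor_analyzer

rng = np.random.default_rng(0)

# Simulate 500 examinees on 6 items loading on 2 correlated factors.
F = rng.multivariate_normal([0, 0], [[1, 0.3], [0.3, 1]], size=500)
true_loadings = np.array([[0.8, 0], [0.7, 0], [0.6, 0],
                          [0, 0.8], [0, 0.7], [0, 0.6]])
X = F @ true_loadings.T + rng.normal(scale=0.5, size=(500, 6))

# Fit and rotate with an orthogonal (Varimax) and an oblique (Promax) method.
for rotation in ("varimax", "promax"):
    fa = FactorAnalyzer(n_factors=2, rotation=rotation)
    fa.fit(X)
    print(rotation, "loadings:\n", np.round(fa.loadings_, 2))
```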

20.
This Monte Carlo simulation study investigated different strategies for forming product indicators for the unconstrained approach in analyzing latent interaction models when the exogenous factors are measured by unequal numbers of indicators under both normal and nonnormal conditions. Product indicators were created by (a) multiplying parcels of the larger scale by items of the smaller scale, and (b) matching items according to reliability to create several product indicators, ignoring those items with lower reliability. Two scaling approaches were compared where parceling was not involved: (a) fixing the factor variances, and (b) fixing 1 loading to 1 for each factor. The unconstrained approach was compared with the latent moderated structural equations (LMS) approach. Results showed that under normal conditions, the LMS approach was preferred because the biases of its interaction estimates and associated standard errors were generally smaller, and its power was higher than that of the unconstrained approach. Under nonnormal conditions, however, the unconstrained approach was generally more robust than the LMS approach. It is recommended to form product indicators by using items with higher reliability (rather than parceling) in the matching and then to specify the model by fixing 1 loading of each factor to unity when adopting the unconstrained approach.
