1.

Digital ITEMS Module 1: Reliability in Classical Test Theory

Charlie Lewis Michael Chajewski André A. Rupp 《Educational Measurement》2018,37(2):71-72

In this ITEMS module, we provide a two‐part introduction to the topic of reliability from the perspective of classical test theory (CTT). In the first part, which is directed primarily at beginning learners, we review and build on the content presented in the original didactic ITEMS article by Traub and Rowley (1991). Specifically, we discuss the notion of reliability as an intuitive everyday concept to lay the foundation for its formalization as a reliability coefficient via the basic CTT model. We then walk through the step‐by‐step computation of key reliability indices and discuss the data collection conditions under which each is most suitable. In the second part, which is directed primarily at intermediary learners, we present a distribution‐centered perspective on the same content. We discuss the associated assumptions of various CTT models ranging from parallel to congeneric, and review how these affect the choice of reliability statistics. Throughout the module, we use a customized Excel workbook with sample data and basic data manipulation functionalities to illustrate the computation of individual statistics and to allow for structured independent exploration. In addition, we provide quiz questions with diagnostic feedback as well as short videos that walk through sample exercises within the workbook.

2.

Refined empirical line method to calibrate IKONOS imagery (total citations: 1; self-citations: 0; citations by others: 1)

XU Jun-feng HUANG Jing-feng 《Journal of Zhejiang University (Series A, English Edition)》2006,7(4):641-646

INTRODUCTION To extract quantitative biophysical parameters such as leaf biomass and leaf chlorophyll concentration from remotely sensed imagery accurately, the effects of atmospheric scattering and absorption must be removed. Atmospheric effects add to or diminish true ground reflectance. If the atmospheric spectral features are not properly removed, a significant analytical bias could be introduced into data interpretation (Ben-Dor and Levin, 2000). Many approaches have been deve…

3.

A Note on Vulnerability in Graphs of Diameter Five (II)

XU Cheng-de 《Journal of Shanghai University (English Edition)》2005,9(4):306-308

The line persistence of a graph G, ρ1(G), is the minimum number of lines which must be removed to increase the diameter of G. In Ref. [7] (J. Shanghai Univ., 2003, 7(4): 352-357), we gave a characterization of graphs of diameter five with ρ1(G) ≥ 2. In this paper we show that none of the 8 special graphs Xi (i = 1, 2, ..., 8) listed in condition (2) of Theorem 1 in Ref. [7] can be deleted. Therefore the results we obtained in Ref. [7] cannot in general be improved.
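The definition of line persistence above can be checked by brute force on very small graphs. The sketch below is only an illustration, not the method of the cited paper: the function names are hypothetical, vertices are assumed to be labelled 0..n-1, and a disconnected graph is assumed to have infinite diameter, so that disconnecting the graph counts as increasing the diameter.

```python
from collections import deque
from itertools import combinations
import math


def diameter(n, edges):
    """Diameter of a simple graph on vertices 0..n-1 (math.inf if disconnected)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    best = 0
    for start in range(n):
        # Breadth-first search gives shortest path lengths from `start`.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if len(dist) < n:
            return math.inf
        best = max(best, max(dist.values()))
    return best


def line_persistence(n, edges):
    """Smallest number of edges whose removal increases the diameter.

    Exhaustive search over edge subsets, so only usable for tiny graphs.
    Returns None if no removal increases the diameter.
    """
    edges = list(edges)
    d0 = diameter(n, edges)
    for r in range(1, len(edges) + 1):
        for removed in combinations(edges, r):
            kept = [e for e in edges if e not in removed]
            if diameter(n, kept) > d0:
                return r
    return None


# Hypothetical toy graphs: a 5-cycle, and K4 with one edge deleted.
cycle5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
k4_minus_edge = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3)]
```

Under these assumptions, removing any single edge of the 5-cycle leaves a path and lengthens the diameter, whereas K4 minus an edge tolerates any single removal and needs two.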

4.

Assessing Understanding of the Learning Cycle: The ULC

Edmund A. Marek Steven J. Maier Florence McCann 《Journal of Science Teacher Education》2008,19(4):375-389

An 18-item, multiple choice, 2-tiered instrument designed to measure understanding of the learning cycle (ULC) was developed and field-tested from the learning cycle test (LCT) of Odom and Settlage (Journal of Science Teacher Education, 7, 123–142, 1996). All question sets of the LCT were modified to some degree and 5 new sets were added, resulting in the ULC. The ULC measures (a) understandings and misunderstandings of the learning cycle, (b) the learning cycle’s association with Piaget’s (Biology and knowledge theory: An essay on the relations between organic regulations and cognitive processes, 1975) theory of mental functioning, and (c) applications of the learning cycle. The resulting ULC instrument was evaluated for internal consistency with Cronbach’s alpha, yielding a coefficient of .791.
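For readers unfamiliar with the internal-consistency statistic reported above, the following is a minimal sketch of the usual sample formula for Cronbach's alpha, k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and the toy score matrices are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Sample coefficient alpha.

    scores: list of rows, one per respondent, each a list of k item scores.
    """
    k = len(scores[0])
    # Transpose so that each tuple holds one item's scores across respondents.
    items = list(zip(*scores))
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical toy data: three items that agree perfectly for every respondent.
identical_items = [
    [1, 1, 1],
    [2, 2, 2],
    [4, 4, 4],
    [5, 5, 5],
]

# Hypothetical toy data: two items that agree only partly.
partly_agreeing_items = [
    [1, 2],
    [2, 1],
    [3, 4],
    [4, 3],
]
```

With perfectly agreeing items the formula gives an alpha of 1; disagreement between items inflates the item variances relative to the total-score variance and pulls alpha down.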

5.

Relationships between cognitive diagnosis, CTT, and IRT indices: an empirical investigation

Young-Sun Lee Jimmy de la Torre Yoon Soo Park 《Asia Pacific Education Review》2012,13(2):333-345

Cognitive diagnosis models (CDMs) continue to generate interest among researchers and practitioners because they can provide diagnostic information relevant to classroom instruction and student learning. However, its modeling component has outpaced its complementary component, test construction. Thus, most applications of cognitive diagnosis modeling involve retrofitting of CDMs to assessments constructed using classical test theory (CTT) or item response theory (IRT). This study explores the relationship between item statistics used in the CTT, IRT, and CDM frameworks using such an assessment, specifically a large-scale mathematics assessment. Furthermore, by highlighting differences between tests with varying levels of diagnosticity using a measure of item discrimination from a CDM approach, this study empirically uncovers some important CTT and IRT item characteristics. These results can be used to formulate practical guidelines in using IRT- or CTT-constructed assessments for cognitive diagnosis purposes.

6.

Study of factors influencing research productivity of agriculture faculty members in Iran

Yousef Hedjazi Jaleh Behravan 《Higher Education》2011,62(5):635-647

The purpose of this research is to analyze the relationship between individual, institutional and demographic characteristics on one hand and the research productivity of agriculture faculty members on the other. The statistical population of the research comprises 280 academic staff in agricultural faculties all over Tehran Province. The data regarding research productivity and demographic characteristics were extracted from the faculty members’ profiles. Questionnaires were utilized to collect information concerning individual and institutional variables. The reliability of the questionnaire was calculated to be between 0.74 and 0.97 using Cronbach’s alpha. The regression analysis revealed that, among demographic characteristics, two variables, namely academic rank and age ($R_{\text{AD}}^{2} = 0.265$); among individual characteristics, three variables, namely working habits, creativity, and autonomy and commitment ($R_{\text{AD}}^{2} = 0.097$); and among institutional characteristics, four variables, namely network of communication with colleagues, resources of facilities, corporate management and clear research objectives ($R_{\text{AD}}^{2} = 0.151$), were significant predictors of agricultural faculty members’ research productivity.

7.

Edge diameter stability of graphs of diameter five

XU Cheng-de 《Journal of Shanghai University (English Edition)》2003,7(4):352-357

1 Introduction In general, we follow the notation and terminology of [1-5]. In this paper all graphs are simple. Let G be a graph, V(G) the vertex set of G, and E(G) the edge set of G. The distance between two vertices x, y ∈ V(G) is denoted by dG(x, y). The diameter of G is denoted by d(G). A pair of vertices x, y ∈ V(G) such that dG(x, y) = d(G) is called a diametrical pair. For x, y ∈ V(G), a short (x, y) path is an (x, y) path with length ≤ d(G). The length of a path P is denoted by |P|. An edge e ∈ E(G) is called cyclic if there exists a cycle in G containing…

8.

Measuring Student Involvement: A Comparison of Classical Test Theory and Item Response Theory in the Construction of Scales from Student Surveys

Jessica Sharkness Linda DeAngelo 《Research in higher education》2011,52(5):480-507

This study compares the psychometric utility of Classical Test Theory (CTT) and Item Response Theory (IRT) for scale construction with data from higher education student surveys. Using 2008 Your First College Year (YFCY) survey data from the Cooperative Institutional Research Program at the Higher Education Research Institute at UCLA, two scales are built and tested, one measuring social involvement and one measuring academic involvement. Findings indicate that although both CTT and IRT can be used to obtain the same information about the extent to which scale items tap into the latent trait being measured, the two measurement theories provide very different pictures of scale precision. On the whole, IRT provides much richer information about measurement precision as well as a clearer roadmap for scale improvement. The findings support the use of IRT for scale construction and survey development in higher education.

9.

Using Tree Diagrams as an Assessment Tool in Statistics Education

Yue Yin 《Educational Assessment》2013,18(1):22-49

This study examines the potential of the tree diagram, a type of graphic organizer, as an assessment tool to measure students' knowledge structures in statistics education. Students' knowledge structures in statistics have not been sufficiently assessed in statistics, despite their importance. This article first presents the rationale and method for using tree diagrams as assessment tools in statistics education, followed by an empirical study examining the technical quality of tree diagram assessment. Thirty-seven university students enrolled in an introductory statistics course, and four statistics experts participated in the study. The results provide evidence of reliability and validity for the proposed function of using tree diagrams to measure knowledge structures in statistics education. As reliability evidence, the interrater reliability of tree diagram scoring is high (Pearson's r = .96); alpha coefficient of the 21 linked vertex pairs in tree diagrams is .89. As validity evidence, experts performed better than novices on the tree diagram assessment; students' performance with the tree diagram is significantly correlated with their performance on statistics achievement tests (Pearson's correlation coefficient r = .62); tree diagrams are sensitive to the discrepancies in students' knowledge structures, and most students considered tree diagrams helpful to their organization of statistics knowledge.

10.

Anisotropic Open Cosmological Models of Spin Matter with Magnetic Moment

SHEN Li-ming SUN Nai-jiang JI Gui-fang LU Hui-qing 《Journal of Shanghai University (English Edition)》2001,5(3):196-200

We have derived a set of field equations for a Weyssenhoff spin fluid including magnetic interaction among the spinning particles prevailing in spatially homogeneous but anisotropic cosmological models of Bianchi type V based on Einstein-Cartan theory. We analyze the field equations for three different equations of state specified by p=ρ, p=(1/3)ρ and p=0. The analytical solutions found are non-singular provided that the combined energy arising from matter spin and magnetic interaction among particles overcomes the anisotropy energy in the Universe. We have also deduced that the minimum particle numbers for the radiation (p=(1/3)ρ) and matter (p=0) epochs are 10^88 and 10^108 respectively. The minimum particle number for the state p=ρ is 10^96, leading to the conclusion that we must consider the existence of neutrinos and other creation of particles and anti-particles under torsion and strong gravitational field in the early Universe.

11.

On Turán type inequality with doubling weights and A* weights

Yu Dan-sheng Wei Bao-rong 《Journal of Zhejiang University (Series A, English Edition)》2005,6(7):764-768

Let H_n be the set of real algebraic polynomials of degree n whose zeros all lie in the interval [−1,1]. The well-known Turán type inequalities tell us that for f(x) ∈ H_n, it holds that ‖f′‖ ≥ C√n ‖f‖. This note deals with the weighted Turán type inequalities with weights having inner singularities under the L^p norm for 0 < p ≤ ∞. Our results essentially extend the result of Wang and Zhou (2002), and the method used in this paper is simpler and more direct than that of Wang and Zhou (2002). The results and methods have their own value in approximation theory and computation.

12.

The reliability of test scores for the pervasive developmental disorders rating scale

Thomas O. Williams Ronald C. Eaves 《Psychology in the schools》2002,39(6):605-611

The Pervasive Developmental Disorders Rating Scale (PDDRS; Eaves, 1993) is a screening instrument used in the assessment of autistic disorder. In this study, the reliability of test scores for the PDDRS was examined with three samples. The first sample consisted of 456 participants ranging in age from 1 to 12 years old and the second sample consisted of 111 participants in the 13 to 24 year‐old range. Additionally, the test‐retest reliability of scores for the PDDRS was examined with a sample of 40 participants. The results indicated that coefficient alpha for the PDDRS Total Score was adequate for screening purposes (r = .89) for both age groups. The results of the test‐retest study also suggested that PDDRS had adequate test‐retest reliability (r = .92) for the PDDRS Total Score. © 2002 Wiley Periodicals, Inc. Psychol Schs 39: 605–611, 2002.

13.

Regular-singular crossings on nonlinear systems with turning point

Zhang Hanlin 《Journal of Shanghai University (English Edition)》1998,2(3):191-193

1 Introduction Differential equations with turning points exist widely in many problems of mathematical physics, so these problems are very importa...

14.

Performance of the Generalized S‐X² Item Fit Index for Polytomous IRT Models

Taehoon Kang Troy T. Chen 《Journal of Educational Measurement》2008,45(4):391-406

Orlando and Thissen's S‐X² item fit index has performed better than traditional item fit statistics such as Yen's Q₁ and McKinley and Mills's G² for dichotomous item response theory (IRT) models. This study extends the utility of S‐X² to polytomous IRT models, including the generalized partial credit model, partial credit model, and rating scale model. The performance of the generalized S‐X² in assessing item model fit was studied in terms of empirical Type I error rates and power and compared to G². The results suggest that the generalized S‐X² is promising for polytomous items in educational and psychological testing programs.

15.

Relating Unidimensional IRT Parameters to a Multidimensional Response Space: A Review of Two Alternative Projection IRT Models for Scoring Subscales

Nilufer Kahraman Tony Thompson 《Journal of Educational Measurement》2011,48(2):146-164

A practical concern for many existing tests is that subscore test lengths are too short to provide reliable and meaningful measurement. A possible method of improving the subscale reliability and validity would be to make use of collateral information provided by items from other subscales of the same test. To this end, the purpose of this article is to compare two different formulations of an alternative Item Response Theory (IRT) model developed to parameterize unidimensional projections of multidimensional test items: Analytical and Empirical formulations. Two real data applications are provided to illustrate how the projection IRT model can be used in practice, as well as to further examine how ability estimates from the projection IRT model compare to external examinee measures. The results suggest that collateral information extracted by a projection IRT model can be used to improve reliability and validity of subscale scores, which in turn can be used to provide diagnostic information about strength and weaknesses of examinees helping stakeholders to link instruction or curriculum to assessment results.

16.

Strong Convergence of the Coefficient Alpha Estimator for Reliability of Multiple-Component Measuring Instruments

Tenko Raykov 《Structural equation modeling》2019,26(3):430-436

It is shown that in general the popular coefficient alpha estimator for reliability of multi-component measuring instruments converges almost surely to a quantity that is not equal to the population reliability coefficient. This convergence with probability 1 is a stronger statement than convergence in probability (consistency) and convergence in distribution for the alpha estimator, which have been studied in the past. In the special case of congeneric measures with uncorrelated errors and equal loadings on the common true score, the alpha estimator converges almost surely to the population reliability coefficient that equals population alpha, which implies also its consistency as a reliability estimator. When the loadings are unequal but sufficiently high and similar, the alpha estimator converges almost surely to population alpha that is essentially indistinguishable from the population reliability coefficient, which implies alpha’s approximate consistency then. For the general case, the results entail that the alpha estimator is not a consistent estimator of reliability. The findings add to the critical literature on coefficient alpha in the general case, as well as to the justification of its use as a dependable measuring instrument reliability estimator in special cases and settings resulting under appropriate restrictive conditions, and are illustrated using a numerical example.
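The relationship between population alpha and population reliability described above can be illustrated numerically. The sketch below is not the article's numerical example. It assumes a congeneric model X_i = λ_i·T + E_i with Var(T) = 1 and uncorrelated errors, so that the sum score has true variance (Σλ_i)² and error variance Σθ_i; the function names and parameter values are hypothetical.

```python
def population_reliability(loadings, error_vars):
    """Reliability of the sum score under the assumed congeneric model."""
    true_var = sum(loadings) ** 2
    return true_var / (true_var + sum(error_vars))


def population_alpha(loadings, error_vars):
    """Population coefficient alpha under the same assumed model."""
    k = len(loadings)
    total_var = sum(loadings) ** 2 + sum(error_vars)
    # Each item variance is its squared loading plus its error variance.
    item_var_sum = sum(lam * lam for lam in loadings) + sum(error_vars)
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical parameter sets: equal loadings versus clearly unequal loadings.
equal_loadings = [0.7, 0.7, 0.7, 0.7]
unequal_loadings = [0.9, 0.7, 0.5, 0.3]
error_vars = [0.5, 0.5, 0.5, 0.5]

gap_equal = population_reliability(equal_loadings, error_vars) - population_alpha(
    equal_loadings, error_vars
)
gap_unequal = population_reliability(unequal_loadings, error_vars) - population_alpha(
    unequal_loadings, error_vars
)
```

Under these assumptions the gap vanishes for equal loadings and is positive for unequal loadings, which matches the abstract's distinction between the special and the general case.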

17.

Statistical estimates of learners' judgments about knowledge in calibration of achievement

Philip H. Winne Krista Renee Muis 《Metacognition and Learning》2011,6(2):179-193

In theoretical models of self-regulated learning, calibration is one important component in successful learning. Two issues of calibration are explored. First, Nelson (1987) suggested the G (gamma) coefficient is the most appropriate measure of calibration (judgment accuracy) and rejected signal detection theory’s d′ statistic because data commonly challenge distributional assumptions. We empirically examined this issue, comparing G and d′. Second, we examined whether a learner’s calibration varies across three domains of knowledge: general, word, and mathematics. A sample of 266 university students volunteered to participate in the study. Participants were selected from various undergraduate and graduate courses. Participants first answered demographic items. Then they completed three knowledge tests (general, word, and mathematics) and judged correctness for each answer provided. Order of domains was randomly counterbalanced among participants. Results show that d′ is a valid measure of calibration, that assumptions about underlying distributions can be tested, and that preliminary evidence suggests that d′ may be a superior measure of accuracy compared to G. Finally, calibration varied by domain.
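The two calibration statistics compared above can be sketched for a single learner's 2×2 table of judgments against outcomes. This is a minimal illustration under textbook definitions, not the authors' procedure: G is taken as the Goodman-Kruskal gamma, (ad − bc)/(ad + bc), and d′ as z(hit rate) − z(false-alarm rate) under the equal-variance normal model. The function names and cell counts are hypothetical.

```python
from statistics import NormalDist


def gamma_2x2(hits, false_alarms, misses, correct_rejections):
    """Goodman-Kruskal gamma for a 2x2 judgment-by-outcome table."""
    concordant = hits * correct_rejections
    discordant = false_alarms * misses
    return (concordant - discordant) / (concordant + discordant)


def d_prime(hits, false_alarms, misses, correct_rejections):
    """Signal detection d' = z(hit rate) - z(false-alarm rate).

    Rates of exactly 0 or 1 have no finite z-score and are not handled here.
    """
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts for one learner:
# hits = judged correct and correct, false alarms = judged correct but wrong,
# misses = judged wrong but correct, correct rejections = judged wrong and wrong.
table = dict(hits=40, false_alarms=10, misses=10, correct_rejections=40)

g_value = gamma_2x2(**table)
d_value = d_prime(**table)
```

Gamma depends only on the ordering of judgment-outcome pairs, while d′ rests on the normality assumption mentioned in the abstract, which is why the two can rank learners differently.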

18.

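A study of rater reliability based on CTT, GT, and IRT: the case of the women's diving final at one Olympic Games

ZHONG Xiao-ling KANG Chun-hua CHEN Jing 《Examinations Research》2013,(5):41-52

Taking the women's diving final of one Olympic Games as an example, this paper applies the three major measurement theories, CTT, GT, and IRT, to analyze rater reliability and to reveal, from different perspectives, the differences between and within raters. The results show the following. Under CTT, the rater reliability values are 0.981 and 078. Under GT, the generalizability coefficient and the dependability index are 0.8279 and 0.8271, respectively, and the design used in the competition, in which 7 judges each rate the divers' performance over 5 rounds, is a fairly suitable decision. Under IRT, judge 5 is relatively the most severe of the 7 judges and judge 2 the most lenient, but the differences in severity among judges are not significant; judges 1 and 4 show problems with self-consistency; and different judges show bias in rating different divers, dives of different degrees of difficulty, and different rounds, although the bias does not reach significance. The analysis clarifies the characteristics and respective strengths of the three methods of rater reliability analysis and provides useful information for rater training and for improving scoring reliability.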

19.

Left-handed materials in magnetized metallic magnetic thin films

WU Rui-xin XIAO John Q. 《Journal of Zhejiang University (Series A, English Edition)》2006,7(1):71-75

INTRODUCTION Veselago (1968) showed that if both the permittivity ε and the magnetic permeability μ of a material are negative, the propagation direction of an electromagnetic (EM) wave will be opposite to its energy flow direction. Such media are called left-handed materials (LHMs) since the electric field E, magnetic field H, and wave vector k in these media form a left-hand triplet of vectors, instead of a right-hand triplet observed in conventional materials. For a long time, …

20.

Estimating reliability of school-level scores using multilevel and generalizability theory models

Min-Jeong Jeon Guemin Lee Jeong-Won Hwang Sang-Jin Kang 《Asia Pacific Education Review》2009,10(2):149-158

The purpose of this study was to investigate the methods of estimating the reliability of school-level scores using generalizability theory and multilevel models. Two approaches, ‘students within schools’ and ‘students within schools and subject areas,’ were conceptualized and implemented in this study. Four methods resulting from the combination of these two approaches with generalizability theory and multilevel models were compared for both balanced and unbalanced data. The generalizability theory and multilevel models for the ‘students within schools’ approach produced the same variance components and reliability estimates for the balanced data, while failing to do so for the unbalanced data. The different results from the two models can be explained by the fact that they administer different procedures in estimating the variance components used, in turn, to estimate reliability. Among the estimation methods investigated in this study, the generalizability theory model with the ‘students nested within schools crossed with subject areas’ design produced the lowest reliability estimates. Fully nested designs such as (students:schools) or (subject areas:students:schools) would not have any significant impact on reliability estimates of school-level scores. Both methods provide very similar reliability estimates of school-level scores.