Similar literature
1.
Policymakers usually leave decisions about scaling the scores used for accountability to their appointed technical advisory committees and the testing contractors. However, scaling decisions can have an appreciable impact on school ratings. Using middle-school data from New York State, we examined the consistency of school ratings based on two scaling approaches that differed in scaling decisions that are important in high-stakes testing contexts. We found that, depending on subject, grade, and year, a switch in scaling approach led to (1) average absolute shifts in ranks of between 50 and 132 positions (median = 69), which are appreciable shifts for a listing of 1,243 schools; and (2) between 7% and 45% (average = 20%) of schools experiencing shifts in assigned performance bands, depending on the classification scheme. Further, the effect of scaling approach was larger when the raw-score distribution had a more severe ceiling effect, and in these cases, it was driven primarily by the difference in the location of the highest obtainable scale score from the two scaling approaches.
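The "average absolute shift in ranks" reported in this abstract can be illustrated with a minimal sketch. The function names and the four school scores below are hypothetical and are not taken from the study; the abstract does not say how ties were handled, so breaking ties by original order is an assumption.

```python
def ranks(scores):
    """Return 1-based ranks (1 = highest score); ties are broken by original order."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def mean_absolute_rank_shift(scores_a, scores_b):
    """Average absolute change in rank when moving from scaling A to scaling B."""
    ranks_a = ranks(scores_a)
    ranks_b = ranks(scores_b)
    return sum(abs(a - b) for a, b in zip(ranks_a, ranks_b)) / len(ranks_a)


# Hypothetical mean scale scores for four schools under two scaling approaches.
scaling_a = [650.0, 700.0, 620.0, 680.0]
scaling_b = [655.0, 690.0, 640.0, 695.0]
shift = mean_absolute_rank_shift(scaling_a, scaling_b)
```

In this toy listing two of the four schools swap the top two positions, so the average shift is half a rank position.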

2.
Value-added assessment methods have been criticized by researchers and policy makers for a number of reasons. One issue is the sensitivity of model results across different outcome measures. This study examined the utility of incorporating multivariate latent variable approaches within a traditional value-added framework. We evaluated the stability of teacher rankings across univariate and multivariate measurement model structures and scaling metric combinations using a cumulative cross-classified mixed-effects model. Our results showed multivariate models were more stable across modeling conditions than univariate approaches. These findings suggest there is potential utility in incorporating multiple measures with teacher evaluation systems, yet future research will need to evaluate the degree to which models recover known population parameters via Monte Carlo simulation.

3.
This study introduced various nonlinear growth models, including the quadratic conventional polynomial model, the fractional polynomial model, the sigmoid model, the growth model with negative exponential functions, the multidimensional scaling technique, and the unstructured growth curve model. It investigated which growth models effectively describe student growth in math and reading using four-wave longitudinal achievement data. The objective of the study is to provide valuable information to researchers, especially when they consider applying one of the nonlinear models to longitudinal studies. The results showed that the quadratic conventional polynomial model fit the data best. However, this model seemed to overfit the data and made statistical inference problematic concerning parameter estimates. Alternative nonlinear models with fewer parameters adequately fit the data and yielded consistent significance testing results under extreme multicollinearity. This indicates that the alternative, somewhat simpler models would be selected over the conventional polynomial model with more fixed parameters. Other practical issues pertaining to these growth models are also discussed.

4.
Scaling properties of Navier-Stokes turbulence   Total citations: 1, self-citations: 0, citations by others: 1

5.
Scaling laws, as universal results about turbulent scales, are important to research on statistical turbulence. We measured the two-dimensional instantaneous velocity field in the turbulent boundary layer of a flat plate at a momentum thickness Reynolds number of Reθ = 2167. Scaling laws take different forms at different wall distances and scales. We proposed an expected scaling law and compared it with the She-Leveque (SL) scaling law based on wavelet analysis and traditional statistical methods. Results show that the closer to the wall, the more closely the expected scaling law approaches the SL scaling law.

6.
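We studied how the various model parameters of the Gledzer-Ohkitani-Yamada model vary with the external forcing, and computed how the relative scaling exponents of the velocity structure functions of fully developed turbulence in this model vary with the Taylor-microscale Reynolds number. The computed results both agree with the She-Leveque scaling law for velocity structure functions and with experimental results, and identify aspects that require further experimental verification.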

7.
This study examines the use of cross-classified random effects models (CCrem) and cross-classified multiple membership random effects models (CCMMrem) to model rater bias and estimate teacher effectiveness. Effect estimates are compared using classical test theory (CTT) versus item response theory (IRT) scaling methods and three models (i.e., conventional multilevel model, CCrem, CCMMrem). Results indicate that ignoring rater bias can lead to teachers being misclassified within an evaluation system. The best estimates of teacher effectiveness are produced using CCrems regardless of scaling method. Use of CCMMrems to model rater bias cannot be recommended based on the results of this study; combining the use of CCMMrems with an IRT scaling method produced especially unstable results.

8.
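A dynamic finite element model updating method using an equivalent model and a genetic algorithm   Total citations: 3, self-citations: 0, citations by others: 3
To address the low computational efficiency of existing dynamic finite element model updating methods, or their tendency to converge to local optima, a new dynamic finite element model updating method using an equivalent model and a genetic algorithm is proposed. First, within the ranges of the design parameters, and according to the preset order of the polynomial model and the number of independent variables, a design-of-experiments method is used to obtain the optimal sample points needed to fit the response surface model; sample data are obtained by finite element analysis, and the response surface model is obtained by regression analysis, so that it approximates the functional relationship between the structural characteristics and the design parameters. Then, in the fitness evaluation step of the genetic algorithm, the response surface model replaces the finite element model in computing the structural characteristics corresponding to a set of design parameters, and the fitness of each individual is computed; the optimum obtained through evolution gives the updated design parameters. Taking an automobile frame model as an example, finite element analysis and modal testing were performed, and the model was updated with the proposed method. After updating, the mean square value of the modal frequency errors was less than 2%. The predictive ability of the updated finite element model was then checked against test results for the dynamic characteristics of the modified structure; the mean square value of the modal frequency prediction errors was less than 2%.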

9.
Structured means analysis is a very useful approach for testing hypotheses about population means on latent constructs. In such models, a z test is most commonly used for testing the statistical significance of the relevant parameter estimates or of the differences between parameter estimates, where a z value is computed based on the asymptotic standard error estimate associated with the parameter of interest. In the current article, a series of population analyses demonstrate that the z tests for latent mean structure parameters or, more directly, the standard error estimates upon which those z tests are based, are not invariant to how factors are scaled. As such, circumstances exist in which latent mean inference is compromised solely as a result of scaling decisions. This problem is illustrated in the context of between-subjects (i.e., multisample) latent means models and within-subjects latent means models. Recommendations for practice are also offered.

10.
We developed an empirical Bayes (EB) enhancement to Mantel-Haenszel (MH) DIF analysis in which we assume that the MH statistics are normally distributed and that the prior distribution of underlying DIF parameters is also normal. We use the posterior distribution of DIF parameters to make inferences about the item's true DIF status and the posterior predictive distribution to predict the item's future observed status. DIF status is expressed in terms of the probabilities associated with each of the five DIF levels defined by the ETS classification system: C–, B–, A, B+, and C+. The EB methods yield more stable DIF estimates than do conventional methods, especially in small samples, which is advantageous in computer-adaptive testing. The EB approach may also convey information about DIF stability in a more useful way by representing the state of knowledge about an item's DIF status as probabilistic.
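The normal-normal setup described in this abstract has a standard closed-form posterior, sketched below. The symbols are assumptions introduced here for illustration ($\hat{\theta}_i$ for the observed MH statistic of item $i$, $s_i^2$ for its sampling variance, $\mu$ and $\tau^2$ for the prior mean and variance); the abstract does not give the authors' notation or how they estimate $\mu$ and $\tau^2$.

```latex
\begin{align*}
\hat{\theta}_i \mid \theta_i &\sim N(\theta_i,\ s_i^2), \qquad
\theta_i \sim N(\mu,\ \tau^2), \\
\theta_i \mid \hat{\theta}_i &\sim N\bigl(W_i \hat{\theta}_i + (1 - W_i)\,\mu,\ W_i s_i^2\bigr),
\qquad W_i = \frac{\tau^2}{\tau^2 + s_i^2}, \\
\hat{\theta}_i^{\text{new}} \mid \hat{\theta}_i &\sim
N\bigl(W_i \hat{\theta}_i + (1 - W_i)\,\mu,\ W_i s_i^2 + s_i^2\bigr).
\end{align*}
```

The last line is the posterior predictive distribution for a future statistic, under the added assumption that it has the same sampling variance $s_i^2$. When $s_i^2$ is large (small samples), $W_i$ is small and the estimate shrinks toward $\mu$, which is consistent with the stability in small samples that the abstract reports.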

11.
Reading and Mathematics tests of multiple-choice items for grades Kindergarten through 9 were vertically scaled using the three-parameter logistic model and two different scaling procedures: concurrent and separate by grade groups. Item parameters were estimated using Markov chain Monte Carlo methodology while fixing the grade 4 population abilities to have a standard normal distribution. For the separate grade-groups scaling, grade groupings were linked using the Stocking and Lord test characteristic curve procedure. Abilities were estimated using the maximum-likelihood method. In either content area, scatterplots of item difficulty, discrimination, and ability estimates from the two methods showed consistently strong linear relationships. However, as grade deviated from the base grade of four, the best-fit line through the pairs of item discriminations started to rotate away from the identity line. This indicated the discrimination estimates from the separate grade-groups procedure for extreme grades to be, on average, higher than those from the concurrent analysis. The study also observed some systematic change in score variability across grades. In general, the two vertical scaling approaches yielded similar results at more grades in Reading than in Mathematics.
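For reference, the three-parameter logistic model named in this abstract is usually written as below, with $a_i$, $b_i$, and $c_i$ the discrimination, difficulty, and lower-asymptote (guessing) parameters of item $i$ and $\theta_j$ the ability of examinee $j$. The notation is an assumption here, and some presentations include a scaling constant $D = 1.7$ in the exponent; the abstract does not say which form was used.

```latex
\begin{align*}
P(X_{ij} = 1 \mid \theta_j) &= c_i + \frac{1 - c_i}{1 + \exp\bigl(-a_i(\theta_j - b_i)\bigr)}, \\
\theta^{*} = A\,\theta + B, \qquad b_i^{*} &= A\,b_i + B, \qquad a_i^{*} = \frac{a_i}{A}.
\end{align*}
```

The second line shows the linear transformation under which response probabilities are unchanged ($c_i$ is unaffected). Test characteristic curve linking of the Stocking and Lord type chooses the constants $A$ and $B$ so that the test characteristic curves from the two calibrations agree as closely as possible on the common items.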

12.
The conventional approach to scaling up educational reforms considers the development and testing phases to be distinct from the work of implementing at scale. Decades of research suggest that this approach yields inconsistent and often disappointing improvements for schools most in need. More recent scholarship on scaling school improvement suggests that these activities should be integrated into implementation, although this presents challenges in how we evaluate implementation in particular schools. This paper presents a framework to conceptualize implementation when design, implementation, and scaling up are integrated activities.

13.
Equating test forms is an essential activity in standardized testing, and its importance has increased with the accountability systems established under the Adequate Yearly Progress mandate. It is through equating that scores from different test forms become comparable, which allows for the tracking of changes in the performance of students from one year to the next. This study compares three different item response theory scaling methods (fixed common item parameter, Stocking & Lord, and concurrent calibration) with respect to examinee classification into performance categories, and estimation of the ability parameter, when the content of the test form changes slightly from year to year and the examinee ability distribution changes. The results indicate that calibration methods, especially concurrent calibration, produced more stable results than the transformation method.

14.
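This paper discusses a design approach for a servo actuator test system built on virtual instruments. In the software development, test methods and test procedure designs are given for each of the actuator's parameters, and several examples of single-unit parameter tests are presented. Using this system to test the parameter performance of servo actuators comprehensively has practical value and offers theoretical guidance for developing new products.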

15.
In psychological research, available data are often insufficient to estimate item factor analysis (IFA) models using traditional estimation methods, such as maximum likelihood (ML) or limited information estimators. Bayesian estimation with common-sense, moderately informative priors can greatly improve efficiency of parameter estimates and stabilize estimation. There are a variety of methods available to evaluate model fit in a Bayesian framework; however, past work investigating Bayesian model fit assessment for IFA models has assumed flat priors, which have no advantage over ML in limited data settings. In this paper, we evaluated the impact of moderately informative priors on ability to detect model misfit for several candidate indices: posterior predictive checks based on the observed score distribution, leave-one-out cross-validation, and the widely applicable information criterion (WAIC). We found that although Bayesian estimation with moderately informative priors is an excellent aid for estimating challenging IFA models, methods for testing model fit in these circumstances are inadequate.

16.
Computerized adaptive testing in instructional settings   Total citations: 3, self-citations: 0, citations by others: 3
Item response theory (IRT) has most often been used in research on computerized adaptive testing (CAT). Depending on the model used, IRT requires between 200 and 1,000 examinees for estimating item parameters. Thus, it is not practical for instructional designers to develop their own CAT based on the IRT model. Frick improved Wald's sequential probability ratio test (SPRT) by combining it with normative expert systems reasoning, referred to as an EXSPRT-based CAT. While previous studies were based on re-enactments from historical test data, the present study is the first to examine how well these adaptive methods function in a real-time testing situation. Results indicate that the EXSPRT-I significantly reduced test lengths and was highly accurate in predicting mastery. EXSPRT is apparently a viable and practical alternative to IRT for assessing mastery of instructional objectives.
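As background for the abstract above, the following is a minimal sketch of Wald's plain SPRT applied to a mastery decision. It is not Frick's EXSPRT, which additionally uses item-level information; the function name and the default values for `p_master`, `p_nonmaster`, `alpha`, and `beta` are assumptions chosen for illustration.

```python
import math


def sprt_mastery(responses, p_master=0.85, p_nonmaster=0.40, alpha=0.05, beta=0.05):
    """Classify an examinee from a sequence of right/wrong answers.

    responses: iterable of booleans (True = correct answer).
    p_master / p_nonmaster: assumed probability of a correct answer for each group.
    alpha / beta: tolerated error rates, which set Wald's decision thresholds.
    Returns (decision, number_of_items_used).
    """
    upper = math.log((1 - beta) / alpha)   # accept "master" at or above this
    lower = math.log(beta / (1 - alpha))   # accept "nonmaster" at or below this
    log_ratio = 0.0
    count = 0
    for count, correct in enumerate(responses, start=1):
        if correct:
            log_ratio += math.log(p_master / p_nonmaster)
        else:
            log_ratio += math.log((1 - p_master) / (1 - p_nonmaster))
        if log_ratio >= upper:
            return "master", count
        if log_ratio <= lower:
            return "nonmaster", count
    return "undecided", count  # ran out of items before reaching a threshold


decision, items_used = sprt_mastery([True, True, True, True, True, True])
```

With these assumed settings the test stops as soon as the accumulated log likelihood ratio crosses a threshold, which is how sequential testing shortens test length relative to a fixed-length test.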

17.
This study describes a new approach to undergraduate science training that offers an alternate model to the national objective of scaling up scientific research interests and capabilities among undergraduate students. With this new focus, we seek to more effectively bring scientific research methods and experiences to larger numbers of students in non-elite educational circumstances. Our model has been designed and implemented at the John Jay College and the Borough of Manhattan Community College, both of which are part of the City University of New York (CUNY), where we have a majority non-White and economically disadvantaged student body. We have successfully engaged large numbers of undergraduate students by linking multiple classes in the social and behavioral sciences to build collective cross-disciplinary research projects that give every enrolled student an opportunity to receive high-quality research training and create data sets over the years that are cumulative, collaborative, and of professional value.

18.
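Using literature review, scale measurement, and experimental methods, we studied TEC teaching of advanced mathematics with 158 enrolled students at Jiangsu Radio and Television University. The results show that, with TEC teaching, students' mathematical awareness, ability to apply mathematics, mathematical aesthetic ability, mathematical resourcefulness, and mathematical innovation ability all improved to varying degrees compared with students taught under traditional conditions, with mathematical awareness and the ability to apply mathematics improving significantly. The study shows that TEC teaching plays a positive role in comprehensively improving the quality of higher vocational college students, which is consistent with the goals of the TEC teaching approach and indicates, to some extent, that it is suitable for teaching advanced mathematics in higher vocational colleges.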

19.
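The convergence of a family of optimal condition number self-scaling variable metric (OCSSVM) algorithms for unconstrained optimization is investigated. It is proved that this family of algorithms, when used with the Wolfe line search, is globally convergent for general convex functions.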

20.
Confirmatory factor analysis (CFA) is often used in the social sciences to estimate a measurement model in which multiple measurement items are hypothesized to assess a particular latent construct. This article presents the utility of multilevel CFA (MCFA; Muthén, 1991, 1994) and hierarchical linear modeling (HLM; Raudenbush, Rowan, & Kang, 1991) methods in testing measurement models in which the underlying attribute may vary as a function of various levels of observation. An illustrative example using a real dataset is provided in which an unconditional model specification and parameter estimates from the MCFA and HLM are shown. The article demonstrates the comparability of the two methods in estimating measurement parameters of interest (i.e., true variance at the levels at which the measures are used, and measurement errors).
