1.

A comparison of methods for determining dimensionality in Rasch measurement

Richard M. Smith 《Structural equation modeling》2013,20(1):25-40

This study compares the Rasch item fit approach for detecting multidimensionality in response data with principal component analysis without rotation using simulated data. The data in this study were simulated to represent varying degrees of multidimensionality and varying proportions of items representing each dimension. Because the requirement of unidimensionality is necessary to preserve the desirable measurement properties of Rasch models, useful ways of testing this requirement must be developed. The results of the analyses indicate that both the principal component approach and the Rasch item fit approach work in a variety of multidimensional data structures. However, each technique is unable to detect multidimensionality in certain combinations of the level of correlation between the two variables and the proportion of items loading on the two factors. In cases where the intention is to create a unidimensional structure, one would expect few items to load on the second factor and the correlation between the factors to be high. The Rasch item fit approach detects dimensionality more accurately in these situations.

2.

Exploring the intentions and practices of principals regarding inclusive education: an application of the Theory of Planned Behaviour

Zi Yan Kuen-fung Sin 《Cambridge Journal of Education》2015,45(2):205-221

This study aimed at providing explanation and prediction of principals’ inclusive education intentions and practices under the framework of the Theory of Planned Behaviour (TPB). A sample of 209 principals from Hong Kong schools was surveyed using five scales that were developed to assess the five components of TPB: attitude, subjective norm, perceived behaviour control, intention, and behaviour. Rasch analysis was utilised to examine the psychometric quality of the scales and generate principals’ measures, which were subsequently subjected to path analysis to investigate the relationships among the five components. The results revealed a good model–data fit. Principals’ attitude and perceived subjective norm were strong and significant predictors of their intention to implement inclusive education. The predictive power of perceived behaviour control on intention was not significant. Intention and perceived behaviour control were found to have significant predictive power for principals’ reported inclusive practice. The implications of the findings are discussed.

3.

The Impact of Multidimensionality on Extraction of Latent Classes in Mixture Rasch Models

Yoonsun Jang Seock‐Ho Kim Allan S. Cohen 《Journal of Educational Measurement》2018,55(3):403-420

This study investigates the effect of multidimensionality on extraction of latent classes in mixture Rasch models. In this study, two‐dimensional data were generated under varying conditions. The two‐dimensional data sets were analyzed with one‐ to five‐class mixture Rasch models. Results of the simulation study indicate the mixture Rasch model tended to extract more latent classes than the number of dimensions simulated, particularly when the multidimensional structure of the data was more complex. In addition, the number of extracted latent classes decreased as the dimensions were more highly correlated regardless of multidimensional structure. An analysis of the empirical multidimensional data also shows that the number of latent classes extracted by the mixture Rasch model is larger than the number of dimensions measured by the test.

4.

Stability of Rasch Scales Over Time

Catherine S. Taylor Yoonsun Lee 《Applied Measurement in Education》2013,26(1):87-113

Item response theory (IRT) methods are generally used to create score scales for large-scale tests. Research has shown that IRT scales are stable across groups and over time. Most studies have focused on items that are dichotomously scored. Now Rasch and other IRT models are used to create scales for tests that include polytomously scored items. When tests are equated across forms, researchers check for the stability of common items before including them in equating procedures. Stability is usually examined in relation to polytomous items' central “location” on the scale without taking into account the stability of the different item scores (step difficulties). We examined the stability of score scales over a 3–5-year period, considering both stability of location values and stability of step difficulties for common item equating. We also investigated possible changes in the scale measured by the tests and systematic scale drift that might not be evident in year-to-year equating. Results across grades and content areas suggest that equating results are comparable whether or not the stability of step difficulties is taken into account. Results also suggest that there may be systematic scale drift that is not visible using year-to-year common item equating.

5.

Development of an instrument to understand the child protective services decision-making process, with a focus on placement decisions

《Child abuse & neglect》2015

When children come to the attention of the child welfare system, they become involved in a decision-making process in which decisions are made that have a significant effect on their future and well-being. The decision to remove children from their families is particularly complex; yet surprisingly little is understood about this decision-making process. This paper presents the results of a study to develop an instrument to explore, at the caseworker level, the context of the removal decision, with the objective of understanding the influence of the individual and organizational factors on this decision, drawing from the Decision Making Ecology as the underlying rationale for obtaining the measures. The instrument was based on the development of decision-making scales used in prior decision-making studies and administered to child protection caseworkers in several states. Analyses included reliability analyses, principal components analyses, and inter-correlations among the resulting scales. For one scale regarding removal decisions, a principal components analysis resulted in the extraction of two components, jointly identified as caseworkers’ decision-making orientation, described as (1) an internal reference to decision-making and (2) an external reference to decision-making. Reliability analyses demonstrated acceptable to high internal consistency for 9 of the 11 scales. Full details of the reliability analyses, principal components analyses, and inter-correlations among the seven scales are discussed, along with implications for practice and the utility of this instrument to support the understanding of decision-making in child welfare.

6.

Digital Module 10: Rasch Measurement Theory https://ncme.elevate.commpartners.com

Jue Wang George Engelhard 《Educational Measurement》2019,38(4):112-113

In this digital ITEMS module, Dr. Jue Wang and Dr. George Engelhard Jr. describe the Rasch measurement framework for the construction and evaluation of new measures and scales. From a theoretical perspective, they discuss the historical and philosophical perspectives on measurement with a focus on Rasch's concept of specific objectivity and invariant measurement. Specifically, they introduce the origins of Rasch measurement theory, the development of model‐data fit indices, as well as commonly used Rasch measurement models. From an applied perspective, they discuss best practices in constructing, estimating, evaluating, and interpreting a Rasch scale using empirical examples. They provide an overview of a specialized Rasch software program (Winsteps) and an R program embedded within Shiny (Shiny_ERMA) for conducting the Rasch model analyses. The module is designed to be relevant for students, researchers, and data scientists in various disciplines such as psychology, sociology, education, business, health, and other social sciences. It contains audio‐narrated slides, sample data, syntax files, access to Shiny_ERMA program, diagnostic quiz questions, data‐based activities, curated resources, and a glossary.

7.

THE EFFECTS OF THE DELETION OF MISFITTING PERSONS ON VERTICAL EQUATING VIA THE RASCH MODEL

S. E. PHILLIPS 《Journal of Educational Measurement》1986,23(2):107-118

The purpose of the study was to compare Rasch model equatings of multilevel achievement test data before and after the deletion of misfitting persons. The Rasch equatings were also compared with an equating obtained using the equipercentile method. No basis could be found in the results for choosing between the two Rasch equatings. The deletion of misfitting persons produced minor improvements in Rasch model fit to the data. Both Rasch equatings produced results that differed from the results of the equipercentile equating. The Rasch data also indicated that the misfitting persons deleted in the second Rasch equating tended to be from the lower portion of the achievement distribution, suggesting that they may have been guessing.

8.

Using Rasch Analysis to Inform Rating Scale Development

Carol Van Zile-Tamsen 《Research in higher education》2017,58(8):922-933

The use of surveys, questionnaires, and rating scales to measure important outcomes in higher education is pervasive, but reliability and validity information is often based on problematic Classical Test Theory approaches. Rasch Analysis, based on Item Response Theory, provides a better alternative for examining the psychometric quality of rating scales and informing scale improvements. This paper outlines a six-step process for using Rasch Analysis to review the psychometric properties of a rating scale. The Partial Credit Model and Andrich Rating Scale Model will be described in terms of the psychometric information (i.e., reliability, validity, and item difficulty) and diagnostic indices generated. Further, this approach will be illustrated through the example of authentic data from a university-wide student evaluation of teaching.

9.

The Course Experience Questionnaire: a Rasch Measurement Model Analysis

Russell F. Waugh 《Higher Education Research & Development》1998,17(1):45-64

The Course Experience Questionnaire (CEQ) is applied to graduating students of Australian universities. Data from a selected university for graduates from 1994 to 1996 were analysed using a Rasch measurement model. The whole scale and each of the five sub‐scales were analysed for each year separately, to investigate its conceptual design and validity. The results show that, taken together, at least 17 of the 25 items can form a valid scale measuring graduate perceptions of their courses for each of the three data groups. Of the five sub‐scales, Good Teaching and Generic Skills are only moderately valid and reliable for use and interpretation separately from the main scale.

10.

Scale Alignment in Between‐Item Multidimensional Rasch Models

Leah Feuerstahler Mark Wilson 《Journal of Educational Measurement》2019,56(2):280-301

Scores estimated from multidimensional item response theory (IRT) models are not necessarily comparable across dimensions. In this article, the concept of aligned dimensions is formalized in the context of Rasch models, and two methods are described—delta dimensional alignment (DDA) and logistic regression alignment (LRA)—to transform estimated item parameters so that dimensions are aligned. Both the DDA and LRA methods are applied to real and simulated data, and it is demonstrated that both methods are broadly effective for achieving aligned scales. The routine use of scale alignment methods is recommended prior to comparing scores across dimensions.

11.

Measuring Longitudinal Gains in Student Learning: A Comparison of Rasch Scoring and Summative Scoring Approaches

Yue Zhao Jenny M. Y. Huen Y. W. Chan 《Research in higher education》2017,58(6):605-616

This study pioneers a Rasch scoring approach and compares it to a conventional summative approach for measuring longitudinal gains in student learning. In this methodological note, our proposed methodology is demonstrated using an example of rating scales in a student survey as part of a higher education outcome assessment. Such assessments have become increasingly important worldwide for purposes of institutional accreditation and accountability to stakeholders. Data were collected from a longitudinal study by tracking self-reported learning outcomes of individual students in the same cohort who completed the student learning experience questionnaire (SLEQ) in their first and final years. The Rasch model was employed for item calibration and latent trait estimation, together with a scaling procedure of concurrent calibration incorporating a randomly equivalent group design and a single group design to measure the gains in self-reported learning outcomes as yielded by repeated measures. The extent to which Rasch scoring compared to the conventional summative scoring method in its sensitivity to change was quantified by a statistical index, namely relative performance (RP). Findings indicated greater ability to capture learning outcomes gains from Rasch scoring over the conventional summative scoring method, with RP values ranging from 3 to 17% in the cognitive, social, and value domains of the SLEQ. The Rasch scoring approach and the scaling procedure presented in the study can be readily generalised to studies using rating scales to measure change in student learning in the higher education context. The methodological innovations and contributions of this study are discussed.

12.

Hong Kong parents and their children’s music training: measurement properties of the Parental Involvement in Music Training Questionnaire

Dianne M. Tai Shane N. Phillipson 《Educational Psychology》2018,38(5):633-647

Many Hong Kong-Chinese parents are active in their support for their children’s music training. To better understand this support, the Parental Involvement in Music Training Questionnaire (PIMTQ) is designed to measure the variability in parental involvement in their children’s music training. This study begins by exploring the factor structure of the PIMTQ and then establishes its measurement properties using Rasch modelling. Two hundred and ninety-five Hong Kong-Chinese parents completed a Chinese version of the 42-item instrument with principal components analysis of the responses showing seven factors. However, Rasch modelling showed that two of the seven factors (Family Music Background and Family Music Interest) are unable to reliably predict variability in parent responses. We conclude, however, that the remaining five factors (Parental Support Toward Music Training, Parental Expectations, Home Music Environment, Music Programme Support and Attitude Toward Music) of the PIMTQ can be used as subscales to measure the involvement of Hong Kong-Chinese parents in their children’s music training.

13.

Evaluating Instrument Quality in Science Education: Rasch‐based analyses of a Nature of Science test

Irene Neumann Knut Neumann Ross Nehm 《International Journal of Science Education》2013,35(10):1373-1405

Given the central importance of the Nature of Science (NOS) and Scientific Inquiry (SI) in national and international science standards and science learning, empirical support for the theoretical delineation of these constructs is of considerable significance. Furthermore, tests of the effects of varying magnitudes of NOS knowledge on domain‐specific science understanding and belief require the application of instruments validated in accordance with AERA, APA, and NCME assessment standards. Our study explores three interrelated aspects of a recently developed NOS instrument: (1) validity and reliability; (2) instrument dimensionality; and (3) item scales, properties, and qualities within the context of Classical Test Theory and Item Response Theory (Rasch modeling). A construct analysis revealed that the instrument did not match published operationalizations of NOS concepts. Rasch analysis of the original instrument—as well as a reduced item set—indicated that a two‐dimensional Rasch model fit significantly better than a one‐dimensional model in both cases. Thus, our study revealed that NOS and SI are supported as two separate dimensions, corroborating theoretical distinctions in the literature. To identify items with unacceptable fit values, item quality analyses were used. A Wright Map revealed that few items sufficiently distinguished high performers in the sample and excessive numbers of items were present at the low end of the performance scale. Overall, our study outlines an approach for how Rasch modeling may be used to evaluate and improve Likert‐type instruments in science education.

14.

Exploring Secondary Students' Knowledge and Misconceptions about Influenza: Development, validation, and implementation of a multiple-choice influenza knowledge scale

William L. Romine Lloyd H. Barrow William R. Folk 《International Journal of Science Education》2013,35(11):1874-1901

Understanding infectious diseases such as influenza is an important element of health literacy. We present a fully validated knowledge instrument called the Assessment of Knowledge of Influenza (AKI) and use it to evaluate knowledge of influenza, with a focus on misconceptions, in Midwestern United States high-school students. A two-phase validation process was used. In phase 1, an initial factor structure was calculated based on 205 students of grades 9–12 at a rural school. In phase 2, one- and two-dimensional factor structures were analyzed from the perspectives of classical test theory and the Rasch model using structural equation modeling and principal components analysis (PCA) on Rasch residuals, respectively. Rasch knowledge measures were calculated for 410 students from 6 school districts in the Midwest, and misconceptions were verified through the χ² test. Eight items measured knowledge of flu transmission, and seven measured knowledge of flu management. While alpha reliability measures for the subscales were acceptable, Rasch person reliability measures and PCA on residuals advocated for a single-factor scale. Four misconceptions were found, which have not been previously documented in high-school students. The AKI is the first validated influenza knowledge assessment, and can be used by schools and health agencies to provide a quantitative measure of impact of interventions aimed at increasing understanding of influenza. This study also adds significantly to the literature on misconceptions about influenza in high-school students, a necessary step toward strategic development of educational interventions for these students.

15.

Multilevel Rasch Modeling: Does Misfit to the Rasch Model Impact the Regression Model?

Christine E. DeMars 《Journal of Experimental Education》2020,88(4):605-619

Multilevel Rasch models are increasingly used to estimate the relationships between test scores and student and school factors. Response data were generated to follow one-, two-, and three-parameter logistic (1PL, 2PL, 3PL) models, but the Rasch model was used to estimate the latent regression parameters. When the response functions followed 2PL or 3PL models, the proportion of variance explained in test scores by the simulated student or school predictors was estimated accurately with a Rasch model. Proportion of variance within and between schools was also estimated accurately. The regression coefficients were misestimated unless they were rescaled out of logit units. However, item-level parameters, such as DIF effects, were biased when the Rasch model was violated, similar to single-level models.

16.

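A Study of Analytic Rating Criteria for English Speaking Tests

Wu Shujing (武书敬) 《考试研究》 (Examinations Research) 2014,(5):45-53

Based on the many-facet Rasch model, this study analyzes the rating dimensions and the use of the rating scales for two speaking test tasks, passage reading aloud and free conversation, among a random sample of senior high school examinees in one province. The results show that the rating dimensions for both the read-aloud task and the free-conversation task are reasonably designed and reflect examinees' ability fairly accurately. However, the categories of the read-aloud rating scale are not equally spaced, and in the free-conversation task the "communication strategies" dimension differs significantly from the other three dimensions. This information is important for revising and improving the rating scales and their dimensions.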

17.

Comparing Rasch measurement and factor analysis

Benjamin D. Wright 《Structural equation modeling》2013,20(1):3-24

This article illustrates how Rasch measurement is preferable to factor analysis for reducing complex data matrices to unidimensional variables. The two methods: (a) address the same kind of data, but with different interpretations of numerical status; (b) use the same estimation methods, but with different measurement models; and (c) solve the same problems, but with substantially different utility. Factor analysis is faulted for mistaking ordinally labeled stochastic observations for linear measures and for failing to construct linear measurement. The motivation and mathematical basis for Rasch measurement are introduced. How to use Rasch measurement to replace factor analysis is developed for a dichotomy and demonstrated for a rating scale.

18.

An evaluation of the environmental literacy of preservice teachers in Turkey through Rasch analysis

G. Tuncer Teksoz J.W. Boone O. Yilmaz Tuzun C. Oztekin 《Environmental Education Research》2014,20(2):202-227

The purpose of this study was to make use of proposed definitions of environmental literacy to (1) guide the application of Rasch analysis and (2) utilize the developed instrumentation to further inform the work of environmental educators. A total of 2311 preservice teachers attending Faculty of Education departments of four public universities located in the capital city of Turkey provided data for this study. The instrument used included a knowledge scale, an attitude scale, an attitude towards environmental responsibility scale and a concern scale. Rasch analysis revealed that those items which address the environmental knowledge widely broadcast by mass media were also answered correctly by most participants. Generally, instrument items that addressed the understanding of the interrelated nature of environmental knowledge were answered incorrectly by participants. Analysis of attitude and attitude towards environmental responsibility scales indicated that the preservice teachers exhibited the most support for plant and animal rights, environmental protection laws and ecological balance. Results of the concern scale suggested that the preservice teachers were most concerned with regard to issues of poor drinking-water quality. Gender analysis revealed different orientations among females and males in terms of knowledge, attitudes, attitude towards environmental responsibility and concern scales.

19.

A Rasch Analysis of Raven Item Data

《Journal of Experimental Education》2012,80(1):27-32

The Progressive Matrices items require varying degrees of analytical reasoning. Individuals high on the underlying trait measured by the Raven should score high on the test. Latent trait models applied to data of the Raven form provide a useful methodology for examining the tenability of the above hypothesis. In this study the Rasch latent model was applied to investigate the fit of observed performance on Raven items to what was expected by the model for individuals at six different levels of the underlying scale. For the most part the model showed a good fit to the test data. The findings were similar to previous empirical work that has investigated the behavior of Rasch test scores. In three instances, however, the item fit statistic was relatively large. A closer study of the “misfitting” items revealed two items were of extreme difficulty, which is likely to contribute to the misfit. The study raises issues about the use of the Rasch model in instances of small samples. Other issues related to the interpretation of the Rasch model to Raven-type data are discussed.

20.

Behavioral and emotional strength-based assessment of Finnish elementary students: psychometrics of the BERS-2

Erkko Tapio Sointu Hannu Savolainen Matthew C. Lambert Kristiina Lappalainen Michael H. Epstein 《European Journal of Psychology of Education - EJPE》2014,29(1):1-19

When rating scales are used in different countries, thorough investigation of the psychometric properties is needed. We examined the internal structure of the Finnish translated Behavioral and Emotional Rating Scale-2 (BERS-2) using Rasch and confirmatory factor analysis approaches with a sample of youth, parents, and teachers. The results suggested that the Finnish translated BERS-2 has acceptable measurement properties and is suitable for use in Finnish schools. Results highlighted the issue that there is a need to consider cross-cultural aspects when introducing new measures in another culture. Directions for future research are also discussed in light of present findings.