期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

CFI versus RMSEA: A comparison of two fit indexes for structural equation modeling

Edward E. Rigdon 《Structural equation modeling》2013,20(4):369-379

This article compares two structural equation modeling fit indexes—Bentler's ( 1990; Bentler & Bonett, 1980) Confirmatory Fit Index (CFI) and Steiger and Lind's (1980; Browne & Cudeck, 1993) Root Mean Square Error of Approximation (RMSEA). These two fit indexes are both conceptually linked to the noncentral chi‐square distribution, but CFI has seen much wider use in applied research, whereas RMSEA has only recently been gaining attention. The article suggests that use of CFI is problematic because of its baseline model. CFI seems to be appropriate in more exploratory contexts, whereas RMSEA is appropriate in more confirmatory contexts. On the other hand, CFI does have an established parsimony adjustment, although the adjustment included in RMSEA may be inadequate. 相似文献

2.

A sequential Scheffé‐type respecification procedure for controlling type I error in exploratory structural equation model modification

Gregory R. Hancock 《Structural equation modeling》2013,20(2):158-168

Regarding post hoc structural equation modeling modification, Kaplan (1990) noted in his response to Steiger (1990), “As there is currently no analogous Scheffé test, the best we can do is to free those restrictions that have the highest probability of being wrong” (p. 201). This article proposes just such an analog to the Scheffé test to be applied to the exploratory model‐modification scenario. This method is a sequential finite‐intersection multiple comparison procedure, controlling the Type I error rate to a desired alpha level across the set of all possible post hoc model modifications. 相似文献

3.

Point Estimation,Hypothesis Testing,and Interval Estimation Using the RMSEA: Some Comments and a Reply to Hayduk and Glaser

《Structural equation modeling》2013,20(2):149-162

Hayduk and Glaser (2000) asserted that the most commonly used point estimate of the Root Mean Square Error of Approximation index of fit (Steiger & Lind, 1980) has two significant problems: (a) The frequently cited target value of. 05 is not a stable target, but a "sample size adjustment"; and (b) the truncated point estimate R_t = max(R, 0) effectively throws away a substantial part of the sampling distribution of the test statistic with "proper models," rendering it useless a substantial portion of the time. In this article, I demonstrate that both issues discussed by Hayduk and Glaser are actually not problems at all. The first "problem" derives from a false premise by Hayduk and Glaser that Steiger (1995) specifically warned about in an earlier publication. The second so-called problem results from the point estimate satisfying a fundamental property of a good estimator and can be shown to have virtually no negative implications for statistical practice. 相似文献

4.

A note on multiple sample extensions of the RMSEA fit index

James H. Steiger 《Structural equation modeling》2013,20(4):411-419

Generalization of the Steiger‐Lind root mean square error of approximation fit indexes and interval estimation procedure to models based on multiple independent samples is discussed. In this article, we suggest an approach that seems both reasonable and workable, and caution against one that definitely seems inappropriate. 相似文献

5.

Self‐report Case‐studies: an experiment in own classroom data collection by teachers

Trevor Kerry 《International Journal of Research & Method in Education》2013,36(1):103-111

The capacity of Bayesian methods in estimating complex statistical models is undeniable. Bayesian data analysis is seen as having a range of advantages, such as an intuitive probabilistic interpretation of the parameters of interest, the efficient incorporation of prior information to empirical data analysis, model averaging and model selection. As a simplified demonstration, we illustrate (1) how Bayesians test and compare two non‐nested growth curve models using Bayesian estimation with non‐informative prior; (2) how Bayesians model and handle missing outcomes in the context of missing values; and (3) how Bayesians incorporate data‐based evidence from a previous data set, construct informative priors and treat them as extra information while conducting an up‐to‐date analogy analysis. 相似文献

6.

Assessing sources of error in structural equation models: The effects of sample size,reliability, and model misspecification

Deborah L. Bandalos 《Structural equation modeling》2013,20(3):177-192

This study used Monte Carlo methods to investigate the accuracy and utility of estimators of overall error and error due to approximation in structural equation models. The effects of sample size, indicator reliabilities, and degree of misspecification were examined. The rescaled noncentrality parameter (McDonald & Marsh, 1990) was examined as a measure of approximation error, whereas the one‐ and two‐sample cross‐validation indices and a sample estimator of overall error (EFo) proposed by Browne and Cudeck (1989, 1993) were presented as measures of overall error. The rescaled noncentrality parameter and EFo provided extremely accurate estimates of the amounts of approximation and overall error, respectively. However, although models with errors of omission produced larger estimates of approximation and overall error, the presence of errors of inclusion had little or no effect on estimates of either type of error. The cross‐validation indices and sample estimator of overall error reached minimum values for the same model as an empirically derived measure of overall error only for models with large amounts of specification error. Implications for the use of these estimators in choosing among competing models were discussed. 相似文献

7.

Testing Nested Additive,Multiplicative, and General Multitrait-Multimethod Models

《Structural equation modeling》2013,20(2):219-250

The article gives alternatives to Campbell and O'Connell's (1967) definitions of additive and multiplicative method effects in multitrait-multimethod (MTMM) data. The alternative definitions can be formulated by means of constraints in the parameters of the correlated uniqueness (CU) model (Marsh, 1989), which is first reviewed. The definitions have 2 major advantages. First, they allow the researcher to test for additive and multiplicative method effects in a straightforward manner by simply testing the appropriate constraints. An illustration of these tests is given. Second, the alternative definitions are closely linked to other currently used models. The article shows that CU models with additive constraints are equivalent to constrained versions of the confirmatory factor analysis model for MTMM data (Althauser, Heberlein, & Scott, 1971; Werts & Linn, 1970). In addition, Coenders and Saris (1998) showed that, for designs with 3 methods, a CU model with multiplicative constraints is equivalent to the direct product model (Browne, 1984). 相似文献

8.

Browne's composite direct product model for MTMM correlation matrices

Timo M. Bechger 《Structural equation modeling》2013,20(2):191-204

This article is about the analysis of the multitrait‐multimethod (MTMM) correlation matrix with the composite direct product model proposed by Browne (1984). Campbell and Fiske's (1959) criteria for convergent and discriminant validity have a direct interpretation in the composite direct product model. The model has a multiplicative property suggested as appropriate for MTMM correlation matrices by Campbell and O'Connell (1967, 1982), and it can be fitted to the correlation matrix by means of conventional software. 相似文献

9.

Gender,the city and the politics of schooling: towards a collective biography of women ‘doing good’ as public moralists in Victorian London

Jane Martin 《Gender and education》2005,17(2):143-163

This article tells the stories of four middle class, white, English women whose participation in educational policy making is little known: Annie Leigh Browne (1851–1936), Margaret MacDonald (1870–1911), Hilda Miall‐Smith (born 1861) and Honnor Morten (1861–1913). In doing so, it provides a perspective on the circumstances that enabled or encouraged or compelled women’s political mobilization in the socially divided Victorian city. Working through local government, voluntary societies, women’s organizations and settlement houses they operated at the margins of high politics and yet were self‐consciously redrawing the imagined boundaries of political terrain. Combined, their stories suggest the power of female networks to challenge a landscape of male public space within a matrix of specific local circumstances and cultural politics. The paper uses the creation and presentation of stories about the self across a range of social and cultural practices, both public and private, to situate the women as public moralists. They had strong commitments to ‘doing good’, which they combined with a feminist agenda. The author suggests past women’s political initiatives raise contradictory issues that still leave contemporary feminism uncertain and confused. 相似文献

10.

Resampling and Distribution of the Product Methods for Testing Indirect Effects in Complex Models

Jason Williams David P. MacKinnon 《Structural equation modeling》2013,20(1):23-51

Recent advances in testing mediation have found that certain resampling methods and tests based on the mathematical distribution of 2 normal random variables substantially outperform the traditional z test. However, these studies have primarily focused only on models with a single mediator and 2 component paths. To address this limitation, a simulation was conducted to evaluate these alternative methods in a more complex path model with multiple mediators and indirect paths with 2 and 3 paths. Methods for testing contrasts of 2 effects were evaluated also. The simulation included 1 exogenous independent variable, 3 mediators and 2 outcomes and varied sample size, number of paths in the mediated effects, test used to evaluate effects, effect sizes for each path, and the value of the contrast. Confidence intervals were used to evaluate the power and Type I error rate of each method, and were examined for coverage and bias. The bias-corrected bootstrap had the least biased confidence intervals, greatest power to detect nonzero effects and contrasts, and the most accurate overall Type I error. All tests had less power to detect 3-path effects and more inaccurate Type I error compared to 2-path effects. Confidence intervals were biased for mediated effects, as found in previous studies. Results for contrasts did not vary greatly by test, although resampling approaches had somewhat greater power and might be preferable because of ease of use and flexibility. 相似文献

11.

Conducting statistical tests with data from clustered school samples

《International Journal of Research & Method in Education》2013,36(2):113-124

This article discusses issues associated with statistical testing conducted with data from clustered school samples. Empirical researchers often conduct tests of statistical inference on sample data to ascertain the extent to which differences exist within groups in the population. Typically, much school‐related data are collected from students. These data are hierarchical because students are nested within classes within schools. This article studies the influence of this nesting on tests of statistical significance conducted with the student as the unit of analysis. Theory that adjusts F‐test scores for nested data in multi‐group comparisons is presented and applied to a teacher interaction dataset. The article demonstrates the potential impact of data hierarchy on the results of statistical testing if clustering is ignored. Data analysis techniques that recognize the clustering of students in classes are essential, and it is recommended that either multilevel analysis or adjustments to statistical parameters be undertaken in studies involving nested data. 相似文献

12.

‘Because the ego started to grow bigger than the project itself’: a case study of founder’s syndrome on an educational community of practise

Donovon Keith Ceaser 《Ethnography and Education》2018,13(4):459-476

In 2010 I worked at Green Shoots, a nonprofit service-learning urban farming school started by John Browne. Despite an openly egalitarian community of practise ethic, Browne used his leadership to create a hierarchy at the school that eventually led to a walk-out by staff in which he responded by outright firing them. Using the theory of founder’s syndrome, or the tendency of a founder to subvert the aims of their organization, I examine Browne’s leadership and how his attitude destroyed the community of practise at Green Shoots. Findings reveal a character profile of Browne as a charming and inspirational teacher who engaged in a dominating leadership that contradicted the community of practise's ideals, creating further contradictions which resulted in disrespected students, disillusioned staff, and a founder unable to take responsibility for his leadership. Finally, I discuss the importance of addressing the power of founders so that founder’s syndrome does not impede the important work of educational nonprofits. 相似文献

13.

Estimating the Coefficient of Cross-Validity in Multiple Regression: A Comparison of Analytical and Empirical Methods 总被引：1，自引：1，他引：0

Jeffrey D. Kromrey Constance V. Hines 《Journal of Experimental Education》2013,81(3):240-266

In predictive applications of multiple regression, interest centers on the estimation of the population coefficient of cross-validation rather than the population multiple correlation. The accuracy of 3 analytical formulas for shrinkage estimation (Ezekiel, Browne, & Darlington) and 4 empirical techniques (simple cross-validation, multicross-validation, jackknife, and bootstrap) were investigated in a Monte Carlo study. Random samples of size 20 to 200 were drawn from a pseudopopulation of actual field data. Regression models were investigated with population coefficients of determination ranging from .04 to .50 and with numbers of regressors ranging from 2 to 10. For all techniques except the Browne formula and multicross-validation, substantial statistical bias was evident when the shrunken R ² values were used to estimate the coefficient of cross-validation. In addition, none of the techniques examined provided unbiased estimates with sample sizes smaller than 100, regardless of the number of regressors. 相似文献

14.

Evaluating the Wald Test for Item‐Level Comparison of Saturated and Reduced Models in Cognitive Diagnosis

Jimmy de la Torre Young‐Sun Lee 《Journal of Educational Measurement》2013,50(4):355-373

This article used the Wald test to evaluate the item‐level fit of a saturated cognitive diagnosis model (CDM) relative to the fits of the reduced models it subsumes. A simulation study was carried out to examine the Type I error and power of the Wald test in the context of the G‐DINA model. Results show that when the sample size is small and a larger number of attributes are required, the Type I error rate of the Wald test for the DINA and DINO models can be higher than the nominal significance levels, while the Type I error rate of the A‐CDM is closer to the nominal significance levels. However, with larger sample sizes, the Type I error rates for the three models are closer to the nominal significance levels. In addition, the Wald test has excellent statistical power to detect when the true underlying model is none of the reduced models examined even for relatively small sample sizes. The performance of the Wald test was also examined with real data. With an increasing number of CDMs from which to choose, this article provides an important contribution toward advancing the use of CDMs in practical educational settings. 相似文献

15.

The effect of clustering on statistical tests: an illustration using classroom environment data

Jeffrey Paul Dorman 《教育心理学》2008,28(5):583-595

This paper discusses the effect of clustering on statistical tests and illustrates this effect using classroom environment data. Most classroom environment studies involve the collection of data from students nested within classrooms and the hierarchical nature to these data cannot be ignored. In particular, this paper studies the influence of intraclass correlations on tests of statistical significance conducted with the individual as the unit of analysis. Theory that adjusts t‐test scores for nested data in two‐group comparisons is presented and applied to classroom environment data. This paper demonstrates that Type I error rates inflate greatly as the intraclass correlation increases. Data analysis techniques that recognise the clustering of students in classrooms in classroom environment studies are essential, and it is recommended that either multilevel analysis or adjustments to statistical parameters be undertaken in studies involving nested data. 相似文献

16.

The Power of the Test for Treatment Effects in Three-Level Block Randomized Designs

《Journal of research on educational effectiveness》2013,6(4):265-288

Abstract

Experiments that involve nested structures may assign treatment conditions either to subgroups (such as classrooms) or individuals within subgroups (such as students). The design of such experiments requires knowledge of the intraclass correlation structure to compute the sample sizes necessary to achieve adequate power to detect the treatment effect. This study provides methods for computing power in three-level block randomized balanced designs (with two levels of nesting) where, for example, students are nested within classrooms and classrooms are nested within schools. The power computations take into account nesting effects at the second (classroom) and at the third (school) level, sample size effects (e.g., number of level-1, level-2, and level-3 units), and covariate effects (e.g., pretreatment measures). The methods are generalizable to quasi-experimental studies that examine group differences on an outcome. 相似文献

17.

Approximating Test Statistics Using Eigenvalue Block Averaging

Njål Foldnes Steffen Grønneberg 《Structural equation modeling》2018,25(1):101-114

We introduce and evaluate a new class of approximations to common test statistics in structural equation modeling. Such test statistics asymptotically follow the distribution of a weighted sum of i.i.d. chi-square variates, where the weights are eigenvalues of a certain matrix. The proposed eigenvalue block averaging (EBA) method involves creating blocks of these eigenvalues and replacing them within each block with the block average. The Satorra–Bentler scaling procedure is a special case of this framework, using one single block. The proposed procedure applies also to difference testing among nested models. We investigate the EBA procedure both theoretically in the asymptotic case, and with simulation studies for the finite-sample case, under both maximum likelihood and diagonally weighted least squares estimation. Comparison is made with 3 established approximations: Satorra–Bentler, the scaled and shifted, and the scaled F tests. 相似文献

18.

Detection of Test Collusion via Kullback–Leibler Divergence

Dmitry I. Belov 《Journal of Educational Measurement》2013,50(2):141-163

The development of statistical methods for detecting test collusion is a new research direction in the area of test security. Test collusion may be described as large‐scale sharing of test materials, including answers to test items. Current methods of detecting test collusion are based on statistics also used in answer‐copying detection. Therefore, in computerized adaptive testing (CAT) these methods lose power because the actual test varies across examinees. This article addresses that problem by introducing a new approach that works in two stages: in Stage 1, test centers with an unusual distribution of a person‐fit statistic are identified via Kullback–Leibler divergence; in Stage 2, examinees from identified test centers are analyzed further using the person‐fit statistic, where the critical value is computed without data from the identified test centers. The approach is extremely flexible. One can employ any existing person‐fit statistic. The approach can be applied to all major testing programs: paper‐and‐pencil testing (P&P), computer‐based testing (CBT), multiple‐stage testing (MST), and CAT. Also, the definition of test center is not limited by the geographic location (room, class, college) and can be extended to support various relations between examinees (from the same undergraduate college, from the same test‐prep center, from the same group at a social network). The suggested approach was found to be effective in CAT for detecting groups of examinees with item pre‐knowledge, meaning those with access (possibly unknown to us) to one or more subsets of items prior to the exam. 相似文献

19.

Using the friedman method of ranks for model comparison in structural equation modeling

Edward E. Rigdon 《Structural equation modeling》2013,20(3):219-232

Empirical researchers maximize their contribution to theory development when they compare alternative theory‐inspired models under the same conditions. Yet model comparison tools in structural equation modeling—χ² difference tests, information criterion measures, and screening heuristics—have significant limitations. This article explores the use of the Friedman method of ranks as an inferential procedure for evaluating competing models. This approach has attractive properties, including limited reliance on sample size, limited distributional assumptions, an explicit multiple comparison procedure, and applicability to the comparison of nonnested models. However, this use of the Friedman method raises important issues regarding the lack of independence of observations and the power of the test. 相似文献

20.

Assessing Fit of Unidimensional Graded Response Models Using Bayesian Methods

Xiaowen Zhu Clement A. Stone 《Journal of Educational Measurement》2011,48(1):81-97

The posterior predictive model checking method is a flexible Bayesian model‐checking tool and has recently been used to assess fit of dichotomous IRT models. This paper extended previous research to polytomous IRT models. A simulation study was conducted to explore the performance of posterior predictive model checking in evaluating different aspects of fit for unidimensional graded response models. A variety of discrepancy measures (test‐level, item‐level, and pair‐wise measures) that reflected different threats to applications of graded IRT models to performance assessments were considered. Results showed that posterior predictive model checking exhibited adequate power in detecting different aspects of misfit for graded IRT models when appropriate discrepancy measures were used. Pair‐wise measures were found more powerful in detecting violations of the unidimensionality and local independence assumptions. 相似文献