Similar literature
20 similar articles found (search time: 718 ms)
1.
ABSTRACT

Experimental evaluations that involve the educational system usually involve a hierarchical structure (students are nested within classrooms that are nested within schools, etc.). Concerns about contamination, where research subjects receive certain features of an intervention intended for subjects in a different experimental group, have often led researchers to randomize units at a higher level of the educational hierarchy. Existing work on two-level designs suggests that situations where contamination should lead to randomization at a higher level are likely to be rare. This article extends these results to the case of three-level designs. In order to understand the implications of mathematical results, existing information about the size of intracluster correlation coefficients (ICCs) in educational studies with three levels and about the extent of treatment effect heterogeneity across schools is discussed. Better empirical estimates of ICCs, treatment effect heterogeneity, and plausible contamination values are necessary to make full use of the results in this article. However, it seems likely that situations where contamination should lead to randomization at a higher level in three-level designs are rare.

2.
Abstract

This paper and the accompanying tool are intended to complement existing supports for conducting power analysis by offering a tool based on the framework of Minimum Detectable Effect Sizes (MDES) formulae that can be used in determining sample size requirements and in estimating minimum detectable effect sizes for a range of individual- and group-random assignment design studies and for common quasi-experimental design studies. The paper and accompanying tool cover computation of minimum detectable effect sizes under the following study designs: individual random assignment designs, hierarchical random assignment designs (2-4 levels), block random assignment designs (2-4 levels), regression discontinuity designs (6 types), and short interrupted time-series designs. In each case, the discussion and accompanying tool consider the key factors associated with statistical power and minimum detectable effect sizes, including the level at which treatment occurs and the statistical models (e.g., fixed effect and random effect) used in the analysis. The tool also includes a module that estimates, for one- and two-level random assignment design studies, the minimum sample sizes required for studies to attain user-defined minimum detectable effect sizes.
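
To make the MDES idea concrete, here is a minimal sketch for one of the designs listed above: a two-level design in which whole clusters are randomized. It is not taken from the accompanying tool. It uses the textbook large-sample expression with a normal-based multiplier, and the function name and all parameter names (`J` clusters, `n` individuals per cluster, `rho` for the ICC, `P` for the proportion treated, `r2_l1` and `r2_l2` for variance explained by covariates at each level) are assumptions made for illustration. The tool itself may differ, for example by using t-based multipliers with design-specific degrees of freedom.

```python
from math import sqrt
from statistics import NormalDist


def mdes_two_level_cra(J, n, rho, P=0.5, r2_l1=0.0, r2_l2=0.0,
                       alpha=0.05, power=0.80):
    """Approximate MDES (in standard deviation units) for a two-level
    cluster-randomized design, two-tailed test, normal approximation."""
    z = NormalDist()
    # Multiplier: critical value plus the quantile for the desired power
    # (about 2.80 for alpha = .05 and power = .80).
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # Variance of the standardized impact estimate:
    # a between-cluster term plus a within-cluster term.
    between = rho * (1 - r2_l2) / (P * (1 - P) * J)
    within = (1 - rho) * (1 - r2_l1) / (P * (1 - P) * J * n)
    return multiplier * sqrt(between + within)


# Hypothetical inputs: 40 schools, 20 students each, ICC of 0.15.
print(round(mdes_two_level_cra(J=40, n=20, rho=0.15), 3))
```

Under these assumed inputs the between-cluster term dominates, which is why adding clusters usually lowers the MDES more than adding individuals per cluster.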

3.
Abstract

When well implemented, mediation analyses play a critical role in probing theories of action because their results help lay the groundwork for the critical development of a treatment and the iterative advancement of theories that are foundational to a discipline. Despite strong interest in designs that incorporate mediation, few studies have developed effective and efficient strategies to plan experiments examining multilevel mediation. We probe several design strategies for cluster-randomized designs and derive sampling plans that maximize power under cost constraints. The results suggest that among the more durable design strategies for mediation is covariance adjustment on variables predictive of the outcome and optimal sample allocation. The statistical power and optimal sample allocation results are implemented in the R package PowerUpR.

4.
Abstract

This article develops a new approach for calculating appropriate sample sizes for school-based randomized control trials (RCTs) with binary outcomes using logit models with and without baseline covariates. The theoretical analysis develops sample size formulas for clustered designs where random assignment is at the school or teacher level using generalized estimating equation methods. The article focuses on the impact parameter pertaining to rates and proportions rather than to the log odds of response, which has been the focus of the previous literature. The article also compiles intraclass correlations (ICCs) for the clustered design for a range of binary outcomes using data from seven education RCTs. These ICCs and the power formulas are then used to conduct a power analysis using a provided SAS macro; the key finding is that sample sizes of 40 to 60 schools that are typically included in clustered RCTs for student test score or behavioral scale outcomes will often be insufficient for binary outcomes. A key reason is that the potential for precision gains from regression adjustment is likely to be smaller for binary outcomes.

5.
We derive sample-allocation formulas that maximize the power of several mediation tests in two-level group-randomized studies under a linear cost structure and fixed budget. The results suggest that the optimal individual sample size is typically smaller than that associated with the detection of a main effect and is frequently less than 10 under parameter values commonly seen in the literature. However, the optimal sample allocation can be heavily influenced by the group-to-individual cost ratio, the ratio of the treatment-mediator to mediator-outcome path coefficients, and the outcome variance structure. We illustrate these findings with a hypothetical group-randomized trial examining a school-discipline reform policy. To encourage utilization of the sample allocation formulas, we implement them in the R package PowerUpR and the powerupr Shiny application.
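
For orientation, the main-effect benchmark that this abstract compares against can be sketched as follows. This is the classical optimal-allocation result for detecting a main effect in a two-level group-randomized design under a linear cost structure, not the mediation formulas derived in the article (those are not given in the abstract). The function name, the cost symbols `cost_group` and `cost_individual`, and the example values are all assumptions made for illustration.

```python
from math import sqrt


def main_effect_allocation(budget, cost_group, cost_individual, rho):
    """Classical optimal allocation for a main effect, assuming the total
    cost is J * (cost_group + cost_individual * n).

    Returns (n_opt, J): the unrounded optimal number of individuals per
    group and the number of groups the budget then affords.
    """
    # Optimal individuals per group depends only on the cost ratio and ICC.
    n_opt = sqrt((cost_group / cost_individual) * (1 - rho) / rho)
    # Spend the whole budget on groups of that size.
    J = budget / (cost_group + cost_individual * n_opt)
    return n_opt, J


# Hypothetical inputs: a group costs 20 times as much as an individual.
n_opt, J = main_effect_allocation(budget=10_000, cost_group=200,
                                  cost_individual=10, rho=0.2)
print(round(n_opt, 2), round(J, 1))
```

The benchmark shows why the cost ratio matters: the optimal group size grows with the square root of the group-to-individual cost ratio and shrinks as the ICC rises.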

6.
Abstract

This article examines changes in the research design, sample size, and precision between the planning phase and implementation phase of group randomized trials (GRTs) funded by the Institute of Education Sciences. Thirty-eight GRTs funded between 2002 and 2006 were examined. Three studies revealed changes in the experimental design. Ten studies showed decreases in the total number of groups randomized, whereas 18 studies showed increases. In five cases, the decreases in the number of groups randomized were large enough to decrease the precision of the study. However, in the majority of the studies, the precision was relatively unchanged from planning phase to implementation phase. The consistency in the precision between the planning phase and implementation phase highlights the importance of planning adequately powered studies.

7.
Practical considerations in conducting an equating study often require a trade-off between testing time and sample size. A counterbalanced design (Angoff's Design II) is often selected because, as each examinee is administered both test forms and therefore the errors are correlated, sample sizes can be dramatically reduced over those required by a spiraling design (Angoff's Design I), where each examinee is administered only one test form. However, the counterbalanced design may be subject to fatigue, practice, or context effects. This article investigated these two data collection designs (for a given sample size) with equipercentile and IRT equating methodology in the vertical equating of two mathematics achievement tests. Both designs and both methodologies were judged to adequately meet an equivalent expected score criterion; Design II was found to exhibit more stability over different samples.

8.
Multisite trials, which are being used with increasing frequency in education and evaluation research, provide an exciting opportunity for learning about how the effects of interventions or programs are distributed across sites. In particular, these studies can produce rigorous estimates of a cross-site mean effect of program assignment (intent-to-treat), a cross-site standard deviation of the effects of program assignment, and a difference between the cross-site mean effects of program assignment for two subpopulations of sites. However, capitalizing on this opportunity will require adequately powering future trials to estimate these parameters. To help researchers do so, we present a simple approach for computing the minimum detectable values of these parameters for different sample designs. The article then uses this approach to illustrate, for each parameter, the precision trade-off between increasing the number of study sites and increasing site sample size. Findings are presented for multisite trials that randomize individual sample members and for multisite trials that randomize intact groups or clusters of sample members.

9.
ABSTRACT

Although reflection is a key behaviour of expert designers, it is often a challenging task for new designers. In addition, research on the reflectivity of student designers is limited. The purpose of this study is twofold: (1) to identify the levels of reflectivity while designing, and (2) to study the relationship between reflectivity and conceptions of informed design. We collected data from high school students engaged in an engineering design project. We developed a coding protocol to score levels of reflectivity in student reflections at three levels (low, medium, and high), and used the conceptions of design test to assess changes in student understanding of design activities. Using Wilcoxon signed-rank tests, we determined if students tended to select more ‘key’ design activities and fewer ‘distractors’ within each reflection group. We also performed McNemar’s tests to determine which specific design activities were important within each reflection group after the design project. The results show moderately reflective students had higher gains in understanding of informed design activities compared to those with high or low reflectivity. Results also indicate that different design activities became important for students within each of the three reflective groups. Implications from this research indicate that groups of students experience changing conceptions of design in different ways. An understanding of what students deem important while designing would better allow teachers to encourage behaviours that are like those of informed designers.

10.
The purpose of this study was to determine the proportion of empirical studies published in the last 5 years in a sample of special education peer‐reviewed journals that (1) assessed the effects of reading and math interventions with group designs and (2) used random assignment to treatment conditions to test those interventions. A hand search of articles from the Journal of Special Education, Exceptional Children, Learning Disabilities Research & Practice, the Journal of Learning Disabilities, and School Psychology Review yielded 806 relevant articles, of which 5.46 percent tested a reading or math intervention using a group design and 4.22 percent used random assignment. These findings indicate that randomized experimental designs, which offer the highest level of evidence of an intervention's efficacy, are underrepresented in the literature, at least in the area of reading and math interventions.

11.
Background: Education, and information about education, is highly structured: individuals are grouped into classes, which are grouped into schools, which are grouped into local authorities, which are grouped into countries. The degree of similarity among members of a group, such as a school or classroom, is a very important factor in the design and analysis of studies in education.

Purpose: The aim of this article is to provide information on this degree of similarity within schools to enable those involved in carrying out surveys of schools to do so most efficiently in terms of resources and minimum disturbance of schools.

Sources of data: This paper uses data from 13 studies at primary and secondary level conducted by the National Foundation for Educational Research in England and Wales.

Main argument: The degree of similarity among members of a group is measured by two statistics, the intra-cluster correlation and the design effect. The study described here classifies outcomes into a number of categories and estimates intra-cluster correlation and design effect. The relevance of the results to survey design and analysis is discussed, and examples of how to use these are given.

Conclusions: The main findings of this study, rather than conclusions as such, are the intra-cluster correlations for each topic category. However, the paper reaches some tentative conclusions about the degree of clustering by topic. Using Hox's convention (Multilevel analysis: Techniques and applications, Lawrence Erlbaum, London, 2002) for the size of intra-cluster correlations, it was found that the degree of clustering of achievement was high, while ethnic and language variables were highly clustered in secondary but not primary. By contrast, attitudes towards school, educationally relevant home characteristics, and perception of school policies have quite low values of ρ (mean < 0.05), defined as small.
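
The two statistics named in this abstract are linked by a standard relationship, sketched below under the assumption of equal cluster sizes: the design effect equals 1 + (m - 1)ρ, where m is the number of respondents per cluster. The function names and the example numbers are hypothetical and are not drawn from the 13 studies.

```python
def design_effect(cluster_size, rho):
    """Design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1 + (cluster_size - 1) * rho


def effective_sample_size(total_n, cluster_size, rho):
    """Size of a simple random sample giving the same precision
    as the clustered sample."""
    return total_n / design_effect(cluster_size, rho)


# Hypothetical survey: 1,000 pupils sampled as 40 schools of 25 pupils each.
for rho in (0.05, 0.20):
    deff = design_effect(25, rho)
    n_eff = effective_sample_size(1000, 25, rho)
    print(f"rho={rho:.2f}  design effect={deff:.2f}  effective n={n_eff:.0f}")
```

Even an intra-cluster correlation at the 0.05 threshold the abstract calls small more than doubles the variance in this example, so highly clustered outcomes such as achievement call for more schools rather than more pupils per school.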

12.
The power of analysis of covariance (ANCOVA) and 2 types of randomized block designs were compared as a function of the correlation between the concomitant variable and the outcome measure, the number of groups, the number of participants, and nominal power. ANCOVA had a small but consistent advantage over a randomized block design with 1 participant in each Block × Treatment combination (RB1). At correlations of .3 or greater, ANCOVA was superior to a randomized block design with n participants per Block × Treatment combination (RBn), with increasing differences as the correlation increased. RBn was superior to the other 2 designs only when the correlation was .2 or less. At those levels, however, the randomized group analysis of variance ignoring the concomitant variable was equally powerful. The findings held regardless of sample size, number of groups, or nominal power.

13.
Abstract

Experiments that involve nested structures may assign treatment conditions either to subgroups (such as classrooms) or individuals within subgroups (such as students). The design of such experiments requires knowledge of the intraclass correlation structure to compute the sample sizes necessary to achieve adequate power to detect the treatment effect. This study provides methods for computing power in three-level block randomized balanced designs (with two levels of nesting) where, for example, students are nested within classrooms and classrooms are nested within schools. The power computations take into account nesting effects at the second (classroom) and at the third (school) level, sample size effects (e.g., number of level-1, level-2, and level-3 units), and covariate effects (e.g., pretreatment measures). The methods are generalizable to quasi-experimental studies that examine group differences on an outcome.

14.
In 1958, Page conducted a large multiple experiment: 74 teachers gave one class its normal quiz, scored and graded it in the usual way, assigned three comment treatments to students in stratified-random blocks, and then reported scores from the next objective quiz. There was a highly significant effect of comments. Others have borrowed some study features, with results that have appeared mixed. Here, a critical overall analysis shows much agreement with the ordered hypothesis of comments and with specified comments over no comments (p < .01). Despite great variety of designs and subtlety of effect, results broadly support teachers who comment. A typical effect size is demonstrated for ranks, and lessons are taken about the proper strategies for designs and the future of such research.

15.
Abstract

Experiments that involve nested structures often assign entire groups (such as schools) to treatment conditions. Key aspects of the design of such experiments include knowledge of the intraclass correlation structure and the sample sizes necessary to achieve adequate power to detect the treatment effect. This study provides methods for computing power in three-level cluster randomized balanced designs (with two levels of nesting), where, for example, students are nested within classrooms and classrooms are nested within schools and schools are assigned to treatments. The power computations take into account nesting effects at the second (classroom) and at the third (school) level, sample size effects (e.g., number of schools, classrooms, and individuals), and covariate effects (e.g., pretreatment measures). The methods are applicable to quasi-experimental studies that examine group differences in an outcome.
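
A rough sketch of how nesting at both levels enters such a power computation is given below. It is not the article's method, which the abstract does not spell out. It assumes a balanced design without covariates, a standardized effect size `delta`, a normal approximation in place of the t-based test, and hypothetical names throughout (`K` schools, `J` classrooms per school, `n` students per classroom, `rho2` and `rho3` for the classroom-level and school-level intraclass correlations).

```python
from math import sqrt
from statistics import NormalDist


def power_three_level_crt(delta, K, J, n, rho2, rho3, P=0.5, alpha=0.05):
    """Approximate two-tailed power for a balanced three-level
    cluster-randomized design with schools assigned to treatment.

    Normal approximation, no covariates; the negligible probability of
    rejecting in the wrong direction is ignored.
    """
    z = NormalDist()
    # Variance of the standardized effect estimate: school-level,
    # classroom-level, and student-level components.
    variance = (rho3 + rho2 / J + (1 - rho2 - rho3) / (J * n)) / (P * (1 - P) * K)
    standard_error = sqrt(variance)
    return z.cdf(delta / standard_error - z.inv_cdf(1 - alpha / 2))


# Hypothetical design: 40 schools, 4 classrooms each, 20 students per classroom.
print(round(power_three_level_crt(delta=0.25, K=40, J=4, n=20,
                                  rho2=0.05, rho3=0.15), 2))
```

In this sketch the school-level term is divided only by the number of schools, so it dominates whenever `rho3` is not close to zero; adding classrooms or students shrinks only the smaller terms.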

16.
ABSTRACT

This paper presents the experience of the Power Conversion project in teaching students to design a proof-of-principle contactless energy transfer system for the charging of electrical vehicles. Power Conversion is a second-year electrical engineering (EE) project in which students gather and apply EE knowledge to design and test a system. This system is to work with power level and operate independently of an electricity grid. The instructional method used in this project is design-based learning (DBL). As an educational approach, DBL is meant to support students in gathering and applying knowledge in open-ended assignments. The set-up of the project has gone through different modifications and iterations in three consecutive years regarding the organisation and supervision of the students. We have analysed the students’ design products in the past three academic years in order to evaluate whether the project set-up and supervision have influenced students’ designs. Results indicate that the open-ended character of the project has a positive influence on the designs, especially regarding the criteria on efficiency, Maximum Power Point Tracking algorithm and power tracking.

17.
Abstract

We aimed to compare the findings of three research designs to bracket effect estimates of a strongly worded warning letter delivered by certified mail to students on academic probation.

We embedded an experiment within a regression discontinuity design and calculated two achievement estimates, average GPA and percentage of students remaining on probation. Study participants attended a large Midwestern college. Cohen's d experimental effect size was .45. Regression discontinuity design results were validated by our experimental evidence, and outcome measures were generally statistically significant. We provided additional supportive evidence using comparative RD control group design logic. Regression point displacement design results were successfully replicated using a within-study comparison inside the experiment. In the context of probation, a diverse design, replicative approach provided considerable promise for more precise estimation of intervention effectiveness. We found no deleterious impact on reenrollment and concluded that the certified letter represents an inexpensive probation policy.

18.
A meta‐analysis of the relationship between attitudes in reading and achievement in reading was conducted to provide a statistical summary of the observed variability in the magnitude of previously reported effect sizes. A total of 32 studies, with a total sample size of 224,615, were used, and included a total of 118 effect sizes. A multi‐level approach was used in the meta‐analysis to determine if variance in the magnitude of effect sizes could be partitioned to study (level 1) and moderator (level 2) levels by using a mixed model approach. Results from the meta‐analysis indicated that the mean strength of the relationship between reading attitudes and achievement is moderate (Zr=.32), while stronger for students in elementary school (Zr=.44) when compared with middle school students (Zr=.24). Findings related to selected moderator variables are discussed, with suggestions for future research.

19.
Abstract

Recent publications have drawn attention to the idea of utilizing prior information about the correlation structure to improve statistical power in cluster randomized experiments. Because power in cluster randomized designs is a function of many different parameters, it has been difficult for applied researchers to discern a simple rule explaining when prior correlation information will substantially improve power. This article provides bounds on the maximum possible improvement in power as a function of a single parameter, the number of clusters at the highest level of a multilevel experiment. The maximum improvement in power is less than 0.05 unless the number of clusters at the highest level is less than 20. Thus, the utility of using prior correlation information is limited to experiments with very small cluster-level sample sizes. Situations where small cluster-level sample sizes could still result in experiments with good statistical power are discussed, as is the relative utility of prior information about intracluster correlations as compared with covariate information that can explain cluster level variability in the outcome.

20.
This article develops an argument that the type of intervention research most useful for improving science teaching and learning and leading to scalable interventions includes both research to develop and gather evidence of the efficacy of innovations and a different kind of research, design‐based implementation research (DBIR). DBIR in education focuses on what is required to bring interventions and knowledge about learning to all students, wherever they might engage in science learning. This research focuses on implementation, both in the development and initial testing of interventions and in the scaling up process. In contrast to traditional intervention research that focuses principally on one level of educational systems, DBIR designs and tests interventions that cross levels and settings of learning, with the aim of investigating and improving the effective implementation of interventions. The article concludes by outlining four areas of DBIR that may improve the likelihood that new standards for science education will achieve their intended purpose of establishing an effective, equitable, and coherent system of opportunities for science learning in the United States. © 2012 Wiley Periodicals, Inc. J Res Sci Teach 49: 281–304, 2012
