Similar literature
1.
The main issue addressed in this article is that there is much to learn about students’ knowledge and thinking in science from large-scale international quantitative studies beyond overall score measures. Response patterns on individual or groups of items can give valuable diagnostic insight into students’ conceptual understanding, but there is also a danger of drawing conclusions that may be too simple and invalid. We discuss how responses to multiple-choice items could be interpreted, and we also show how responses on constructed-response items can be systematised and analysed. Finally, we study, empirically, interactions between item characteristics and student responses. It is demonstrated that even small changes in the item wording and/or the item format may have a substantial influence on the response pattern. Therefore, we argue that interpretations of results from these kinds of studies should be based on a thorough analysis of the actual items used. We further argue that diagnostic information should be an integrated part of the international research aims of such large-scale studies. Examples of items and student responses presented are taken from the Third International Mathematics and Science Study (TIMSS).

2.
The performance of English language learners (ELLs) has been a concern given the rapidly changing demographics in US K-12 education. This study aimed to examine whether students' English language status has an impact on their inquiry science performance. Differential item functioning (DIF) analysis was conducted with regard to ELL status on an inquiry-based science assessment, using a multifaceted Rasch DIF model. A total of 1,396 seventh- and eighth-grade students took the science test, including 313 ELL students. The results showed that, overall, non-ELLs significantly outperformed ELLs. Of the four items that showed DIF, three favored non-ELLs while one favored ELLs. The item that favored ELLs provided a graphic representation of a science concept within a family context. There is some evidence that constructed-response items may help ELLs articulate scientific reasoning using their own words. Assessment developers and teachers should pay attention to the possible interaction between linguistic challenges and science content when designing assessment for and providing instruction to ELLs.

3.
The assumption that inquiry-based instruction is more effective in influencing student science achievement than traditional didactic teaching has been the driving force of science education reform in recent decades and in many countries. However, the empirical relationship between these two kinds of science teaching and student science performance is not soundly established and merits careful examination. Framed through the theoretical perspectives of inquiry-based instruction and culturally relevant pedagogy, using a two-level hierarchical linear modeling (HLM) approach and simultaneous multiple regression, this study examines the above relationship using the Trends in International Mathematics and Science Study (TIMSS) 2011 8th grade dataset from Singapore, Chinese Taipei, and the US. The study found that for the low-performing students, none of the inquiry-based teaching practice items measured had a significant relationship with the science achievements at any performance levels of students in any country/region except for the case of two inquiry-based teaching practice items that were positively related to Chinese Taipei students’ achievements. No didactic teaching practice items were associated with the Singapore students’ science achievement, three of these practice items were found to be negatively related to Chinese Taipei students’ science achievement, and one traditional didactic teaching practice was negatively related to the science achievement of U.S. students. For medium- and high-performing students, however, none of these inquiry-based or traditional didactic science-teaching practices were found to be positive predictors of science performance in all three countries/regions. In the case of Chinese Taipei, one didactic teaching practice item was negatively related to the medium-performing students’ achievement and two didactic teaching practices were found to hinder high-performing students’ science achievements.

4.
Growing evidence from recent curriculum documents and previous research suggests that reform-oriented science teaching practices promote students’ conceptual understanding, levels of achievement, and motivation to learn, especially when students are actively engaged in constructing their ideas through scientific inquiries. However, it is difficult to identify to what extent science teachers engage students in reform-oriented teaching practices (RTPs) in their science classrooms. To diagnose accurately the current status of science teachers’ implementation of the RTPs, a valid and reliable instrument is needed. The principles of validity and reliability are fundamental cornerstones in developing a robust measurement tool. As such, this study was motivated by the desire to point out the limitations of the existing statistical and psychometric analyses and to further examine the validation of the RTP survey instrument. This paper thus aims at calibrating the items of the RTPs for science teachers using the Rasch model. The survey instrument scale was adapted from the 2012 National Survey of Science and Mathematics Education (NSSME) data. A total of 3701 science teachers from 1403 schools across the USA participated in the NSSME survey. After calibrating the RTP items and persons on the same scale, the RTP instrument well represented the population of US science teachers. Model-data fit determined by Infit and Outfit statistics was within an appropriate range (0.5–1.5), supporting the unidimensional structure of the RTPs. The ordered category thresholds and the probability of the thresholds showed that the five-point rating scale functioned well. The results of this study support the use of the RTP measure from the 2012 NSSME in assessing usage of RTPs.

5.
Students’ attitude towards science (SAS) is often a subject of investigation in science education research. Rating-scale surveys are commonly used in the study of SAS. The present study illustrates how Rasch analysis can be used to provide psychometric information on SAS rating scales. The analyses were conducted on a 20-item SAS scale used in an existing dataset of the Trends in International Mathematics and Science Study (TIMSS) (2011). Data of all the eighth-grade participants from Hong Kong and Singapore (N = 9942) were retrieved for analyses. Additional insights from Rasch analysis that are not commonly available from conventional test and item analyses were discussed, such as invariance measurement of SAS, unidimensionality of the SAS construct, optimum utilization of SAS rating categories, and item difficulty hierarchy in the SAS scale. Recommendations on how TIMSS items on the measurement of SAS can be better designed were discussed. The study also highlights the importance of using Rasch estimates for statistical parametric tests (e.g. ANOVA, t-test) that are common in science education research for group comparisons.

6.
In recent years, students’ test scores have been used to evaluate teachers’ performance. The assumption underlying this practice is that students’ test performance reflects teachers’ instruction. However, this assumption is generally not empirically tested. In this study, we examine the effect of teachers’ instruction on test performance at the item level using a hierarchical differential item functioning approach. The items are from the U.S. TIMSS 2011 4th-grade math test. Specifically, we tested whether students who had received instruction on a given item performed significantly better on that item compared with students who had not received such instruction when their overall math ability was controlled for, whether with or without controlling for student-level and class-level covariates. This study provides preliminary findings regarding why some items show instructional sensitivity and sheds light on how to develop instructionally sensitive items. Implications and directions for further research are also discussed.

7.
Present instructional trends in science indicate a need to reexamine a traditional concern in science education: the readability of science textbooks. An area of reading research not well documented is the effect of color, visuals, and page layout on readability of science materials. Using the cloze readability method, the present study explored the relationships between page format, grade level, sex, content, and elementary school students’ ability to read science material. Significant relationships were found between cloze scores and both grade level and content, and there was a significant interaction effect between grade and sex in favor of older males. No significant relationships could be attributed to page format and sex. In the area of science content, biological materials were most difficult in terms of readability followed by earth science and physical science. Grade level data indicated that grade five materials were more difficult for that level than either grade four or grade six materials were for students at each respective level. In eight of nine cases, the science text materials would be classified at or near the frustration level of readability. The implications for textbook writers and publishers are that science reading materials need to be produced with greater attention to readability and known design principles regarding visual supplements. The implication for teachers is that students need direct instruction in using visual materials to increase their learning from text material. Present visual materials appear to neither help nor hinder the student in gaining information from text material.

8.
This study explored the predictive effects of science self-beliefs on science achievement for 24,680 13-year-old students from Gulf Cooperation Council member countries – Bahrain, Kuwait, Oman, Qatar, Saudi Arabia and the United Arab Emirates – who participated in the Trends in International Mathematics and Science Study (TIMSS) 2007. The performance of adolescent students in Qatar and Saudi Arabia on the TIMSS 2007 science assessment was significantly below the TIMSS scale average. Adolescent students’ science beliefs had both positive and negative predictive effects on science achievement across the Gulf Cooperation Council member countries.

9.
This research examined the impact of the first‐year implementation of an instructional intervention to promote achievement and equity in science and literacy for culturally and linguistically diverse elementary students. The research addressed three areas: (a) overall science and literacy achievement, (b) achievement gaps among demographic subgroups, and (c) comparison with national (NAEP) and international (TIMSS) samples of students. The research involved 1,523 third‐ and fourth‐grade students at six elementary schools in a large urban school district. Significance tests of mean scores between pre‐ and posttests indicate statistically significant increases on all measures of science and literacy at both grade levels. While achievement gaps widened with third graders on some of the measures, the gaps tended to narrow with fourth graders. The results based on item‐by‐item comparisons with NAEP and TIMSS samples of students indicated overall positive performance of the students in the research at the end of the school year. © 2005 Wiley Periodicals, Inc. J Res Sci Teach 42: 857–887, 2005

10.
We report here on a comparative study of middle school students’ attitudes towards science involving three countries: England, Singapore and the U.S.A. Complete attitudinal data sets from TIMSS (Trends in International Mathematics and Science Study) 2011 were used, thus giving a very large sample size (N = 20,246) compared to other studies in the journal literature. The Rasch model was used to analyse the data, and the findings have shed some useful light not only on how the Western and Asian students responded on a comparative basis in the various scales related to attitudes but also on the validity, reliability, and unidimensionality of the attitudes instrument used in TIMSS 2011. There may be a need for TIMSS test developers to consider doing away with negatively phrased items in the attitudes instrument and phrasing these positively, as the Rasch framework shows that response bias is associated with these statements.

11.
Mathematical word problems represent a common item format for assessing student competencies. Automatic item generation (AIG) is an effective way of constructing many items with predictable difficulties, based on a set of predefined task parameters. The current study presents a framework for the automatic generation of probability word problems based on templates that allow for the generation of word problems involving different topics from probability theory. It was tested in a pilot study with N = 146 German university students. The items show a good fit to the Rasch model. Item difficulties can be explained by the Linear Logistic Test Model (LLTM) and by the random-effects LLTM. The practical implications of these findings for future test development in the assessment of probability competencies are also discussed.

12.
With the national move toward competency testing, publishers and educators have become increasingly concerned about test validity, item construction, and item readability. While a major effort is usually made by test developers to control the readability level of the test items, there is currently no validated measure of individual item readability.

It is commonly assumed that oral reading of test items by the teacher would ameliorate the readability problem for poor readers. Over 4,000 fifth-grade students were involved in this study aimed at determining the effect of teacher oral reading of test items to good and poor readers. The findings suggested that having teachers read test items aloud during the administration of standardized examinations yielded, overall, higher scores than having students read the items for themselves. However, this intervention did not benefit poor readers more than good readers. Both of these groups reflected similar gains under the influence of this intervention.

13.
Background: The Trends in International Mathematics and Science Study (TIMSS) assesses the quality of the teaching and learning of science and mathematics among Grades 4 and 8 students across participating countries.

Purpose: This study explored the relationship between positive affect towards science and mathematics and achievement in science and mathematics among Malaysian and Singaporean Grade 8 students.

Sample: In total, 4466 Malaysian students and 4599 Singaporean students from Grade 8 who participated in TIMSS 2007 were involved in this study.

Design and method: Students’ achievement scores on eight items in the survey instrument that were reported in TIMSS 2007 were used as the dependent variable in the analysis. Students’ scores on four items in the TIMSS 2007 survey instrument pertaining to students’ affect towards science and mathematics together with students’ gender, language spoken at home and parental education were used as the independent variables.

Results: Positive affect towards science and mathematics indicated statistically significant predictive effects on achievement in the two subjects for both Malaysian and Singaporean Grade 8 students. There were statistically significant predictive effects on mathematics achievement for the students’ gender, language spoken at home and parental education for both Malaysian and Singaporean students, with R² = 0.18 and 0.21, respectively. However, only parental education showed statistically significant predictive effects on science achievement for both countries. For Singapore, language spoken at home also demonstrated statistically significant predictive effects on science achievement, whereas gender did not. For Malaysia, neither gender nor language spoken at home had statistically significant predictive effects on science achievement.

Conclusions: It is important for educators to consider implementing self-concept enhancement intervention programmes by incorporating ‘affect’ components of academic self-concept in order to develop students’ talents and promote academic excellence in science and mathematics.

14.
A critical aspect of teacher education is gaining pedagogical content knowledge of how to teach science for conceptual understanding. Given the time limitations of college methods courses, it is difficult to touch on more than a fraction of the science topics potentially taught across grades K-8, particularly in the context of relevant pedagogies. This research and development work centers on constructing a formative assessment resource to help expose pre-service teachers to a greater number of science topics within teaching episodes using various modes of instruction. To this end, 100 problem-based, science pedagogy assessment items were developed via expert group discussions and pilot testing. Each item contains a classroom vignette followed by response choices carefully crafted to include four basic pedagogies (didactic direct, active direct, guided inquiry, and open inquiry). The brief but numerous items allow a substantial increase in the number of science topics that pre-service students may consider. The intention is that students and teachers will be able to share and discuss particular responses to individual items, or else record their responses to collections of items and thereby create a snapshot profile of their teaching orientations. Subsets of items were piloted with students in pre-service science methods courses, and the quantitative results of student responses were spread sufficiently to suggest that the items can be effective for their intended purpose.

15.
Science education needs valid, authentic, and efficient assessments. Many typical science assessments primarily measure recall of isolated information. This paper reports on the validation of assessments that measure knowledge integration ability among middle school and high school students. The assessments were administered to 18,729 students in five states. Rasch analyses of the assessments demonstrated satisfactory item fit, item difficulty, test reliability, and person reliability. The study showed that, when appropriately designed, knowledge integration assessments can be balanced between validity and reliability, authenticity and generalizability, and instructional sensitivity and technical quality. Results also showed that, when paired with multiple‐choice items and scored with an effective scoring rubric, constructed‐response items can achieve high reliabilities. Analyses showed that English language learner status and computer use significantly impacted students' science knowledge integration abilities. Students who took the assessment online, which matched the format of content delivery, performed significantly better than students who took the paper‐and‐pencil version. Implications and future directions of research are noted, including refining curriculum materials to meet the needs of diverse students and expanding the range of topics measured by knowledge integration assessments. © 2011 Wiley Periodicals, Inc. J Res Sci Teach 48: 1079–1107, 2011

16.
This research explored the measurement characteristics of two science examinations and the potential to use access arrangements data to investigate how students requiring reading support are affected by features of exam questions. For two science examinations, traditional and Rasch analyses provided estimates of difficulty and information on item functioning. For one examination, the performance of students eligible for support from a reader in exams was compared to a ‘norm’ group. For selected items, a sample of student responses was analysed. A number of factors potentially making questions easier or more difficult, or potentially contributing to problems with item functioning, were identified. A number of features that may particularly influence those requiring reading support were also identified.

17.
The Third International Mathematics and Science Study (TIMSS) involved 47 countries, thousands of students, and their teachers and schools. Included in the battery of tests and other instruments was a Student Questionnaire that was concerned with the personal and school contexts of the students in relation to their mathematics and science learning. Quite late in the planning of this very expensive study, it transpired that no country had considered gathering data on the students’ sense of the relevance of the science topics in the achievement tests, of their science learning, or their metacognitive awareness of this learning. This paper reports one last-minute attempt to collect these data from one group of students in Population 3, the students in the final year of schooling. Like many other aspects of TIMSS, the psychometric dominance in its design meant that this study was logistically very difficult, but some interesting findings are reported.

18.
Given the central importance of the Nature of Science (NOS) and Scientific Inquiry (SI) in national and international science standards and science learning, empirical support for the theoretical delineation of these constructs is of considerable significance. Furthermore, tests of the effects of varying magnitudes of NOS knowledge on domain‐specific science understanding and belief require the application of instruments validated in accordance with AERA, APA, and NCME assessment standards. Our study explores three interrelated aspects of a recently developed NOS instrument: (1) validity and reliability; (2) instrument dimensionality; and (3) item scales, properties, and qualities within the context of Classical Test Theory and Item Response Theory (Rasch modeling). A construct analysis revealed that the instrument did not match published operationalizations of NOS concepts. Rasch analysis of the original instrument, as well as of a reduced item set, indicated that a two‐dimensional Rasch model fit significantly better than a one‐dimensional model in both cases. Thus, our study revealed that NOS and SI are supported as two separate dimensions, corroborating theoretical distinctions in the literature. To identify items with unacceptable fit values, item quality analyses were used. A Wright Map revealed that few items sufficiently distinguished high performers in the sample and excessive numbers of items were present at the low end of the performance scale. Overall, our study outlines an approach for how Rasch modeling may be used to evaluate and improve Likert‐type instruments in science education.

19.
A surprising result of the Third International Mathematics and Science Study (TIMSS) is that computer use was negatively associated with high student achievement in some countries. More specifically, the students from all three countries who indicated that they use computers in the classroom most frequently were those with the lowest achievement on the TIMSS in 1995. For the purpose of this study, a similar comparison was made for 15-year-old U.S.A. students, based on the data from the Program for International Student Assessment (PISA). The results of this study show that it is not computer use itself that has a positive or negative effect on the science achievement of students, but the way in which computers are used. For example, after controlling for the student's socioeconomic status in the United States of America, the results indicated that the students who used computers frequently at home, including for the purpose of writing papers, tended to have higher science achievement. However, the results of this study also show that science achievement was negatively related to the use of certain types of educational software. This indicates a result similar to that found in the TIMSS data, which might reflect the fact that teachers assign the use of the computer and of educational software to the lower achieving students more frequently, so that these students can obtain more personal and direct feedback through educational software.

20.
In response to the demand for sound science assessments, this article presents the development of a latent construct called knowledge integration as an effective measure of science inquiry. Knowledge integration assessments ask students to link, distinguish, evaluate, and organize their ideas about complex scientific topics. The article focuses on assessment topics commonly taught in 6th- through 12th-grade classes. Items from both published standardized tests and previous knowledge integration research were examined in 6 subject-area tests. Results from Rasch partial credit analyses revealed that the tests exhibited satisfactory psychometric properties with respect to internal consistency, item fit, weighted likelihood estimates, discrimination, and differential item functioning. Compared with items coded using dichotomous scoring rubrics, those coded with the knowledge integration rubrics yielded significantly higher discrimination indexes. The knowledge integration assessment tasks, analyzed using knowledge integration scoring rubrics, demonstrate strong promise as effective measures of complex science reasoning in varied science domains.
