Similar literature
 20 similar documents found
1.
Berglund, G. W. (1970). The Effect of four Sets of Test Instructions on Scores in Mental Ability Tests. Scand. J. Educ. Res. 14, 31‐38. Four hundred and eighteen Swedish children (11‐year‐olds) were divided randomly into four experimental groups. Three mental ability tests of the factor type were administered to the groups by means of four different sets of instructions. In the first group the tests were presented as intelligence tests and in the second group as achievement tests. The third group received the original instructions of the tests and the fourth group received routine instructions. It is concluded (a) that the four instructions do not differentiate the groups in power tests, and (b) that the routine instruction does not affect the subjects’ working speed to the same degree as the other instructions.

2.
3.
This essay seeks to establish a metaphor of the professional practice of teaching to the attributes and training of an offensive lineman in the game of American football. Effective classroom instruction does not rely exclusively on a rare set of talents but rather rests on the commitment to the work of teaching. Like the position of offensive lineman, the profession of teaching is one of service. And more, it is one in which the person's performance can blossom through intense determination. An invitation is offered to serve as an effective teacher.

4.
Reliability of Scores From Teacher-Made Tests   Total citations: 1, self-citations: 0, citations by others: 1
Reliability is the property of a set of test scores that indicates the amount of measurement error associated with the scores. Teachers need to know about reliability so that they can use test scores to make appropriate decisions about their students. The level of consistency of a set of scores can be estimated by using the methods of internal analysis to compute a reliability coefficient. This coefficient, which can range between 0.0 and +1.0, usually has values around 0.50 for teacher-made tests and around 0.90 for commercially prepared standardized tests. Its magnitude can be affected by such factors as test length, test-item difficulty and discrimination, time limits, and certain characteristics of the group: the extent of their testwiseness, level of student motivation, and homogeneity in the ability measured by the test.

5.
In the service of educational accountability, student achievement tests are being used to measure constructs quite unlike those envisioned by test developers. Scores are compared to cut points to create classifications like “proficient”; scores are combined over time to measure growth; student scores are aggregated to measure the effectiveness of teachers, schools, and school districts; indices are created to measure college and career readiness. These and other new uses rely on derived scores created to measure new constructs. The field of educational and psychological measurement has largely ignored these significant, consequential measurement applications. The conceptual frameworks and analytical tools of educational and psychological measurement should be used to study such derived scores and the validity of their uses and interpretations.

6.
Abstract

The effectiveness of the game Order Out was investigated, along with differences in achievement when (a) sets of fraction bars, (b) pictorial representations of fraction bars, or (c) neither physical nor pictorial aids were made available during play of the game. The game was played for 20 minutes, twice weekly, for 5 weeks. The subjects were 85 fifth grade students in four intact classes and 177 seventh grade students in eight intact classes. Pre- and posttests of 40 items were given; in each item, the student ordered a pair of proper fractions. The game was an effective way to improve students’ achievement of the game content. There were no significant achievement differences among treatments. Post-hoc analysis of the data revealed sex-related trends suggesting two hypotheses for further study.

7.
Two new methods have been proposed to determine unexpected sum scores on sub-tests (testlets) both for paper-and-pencil tests and computer adaptive tests. A method based on a conservative bound using the hypergeometric distribution, denoted p, was compared with a method where the probability for each score combination was calculated using a highest density region (HDR). Furthermore, these methods were compared with the standardized log-likelihood statistic with and without a correction for the estimated latent trait value (denoted as l*z and lz, respectively). Data were simulated on the basis of the one-parameter logistic model, and both parametric and non-parametric logistic regression was used to obtain estimates of the latent trait. Results showed that it is important to take the trait level into account when comparing subtest scores. In a nonparametric item response theory (IRT) context, an adapted version of the HDR method was a powerful alternative to p. In a parametric IRT context, results showed that l*z had the highest power when the data were simulated conditionally on the estimated latent trait level.

8.
Time limits on some computer-adaptive tests (CATs) are such that many examinees have difficulty finishing, and some examinees may be administered tests with more time-consuming items than others. Results from over 100,000 examinees suggested that about half of the examinees must guess on the final six questions of the analytical section of the Graduate Record Examination if they were to finish before time expires. At the higher-ability levels, even more guessing was required because the questions administered to higher-ability examinees were typically more time consuming. Because the scoring model is not designed to cope with extended strings of guesses, substantial errors in ability estimates can be introduced when CATs have strict time limits. Furthermore, examinees who are administered tests with a disproportionate number of time-consuming items appear to get lower scores than examinees of comparable ability who are administered tests containing items that can be answered more quickly, though the issue is very complex because of the relationship of time and difficulty, and the multidimensionality of the test.

9.
10.
11.
Abstract

The purpose of this study was to investigate the effectiveness of a training program in creative thinking and problem solving on children from varying racial backgrounds and social-class levels. The Ss were 218 fifth and sixth grade students. All Ss were administered the Torrance Tests of Creative Thinking, Form A. The experimental Ss participated in the eight-week Productive Thinking Program and the control Ss in the Gates-Peardon Reading Exercises. At the completion of the Program, all Ss were administered the Torrance Tests of Creative Thinking, Form B. The results indicate that participation in the Productive Thinking Program enabled the students to improve their creative thinking and problem-solving abilities. Neither race nor social-class level affected the child's ability to increase these skills.

12.
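An Analysis of Errors in the Use of Psychological Tests to Evaluate College Students' Mental Health   Total citations: 2, self-citations: 0, citations by others: 2
Ma Jinhua (马锦华), 《天中学刊》, 2001, 16(4): 83-85
A psychological test is a tool for measuring psychological characteristics that has been standardized through a test-construction procedure. Applying psychological tests to the evaluation of college students' mental health helps universities carry out mental health education on a scientific basis. However, several factors cause certain errors in such evaluations: the weakness of psychological test theory, problems in the development of the theoretical foundations of psychology, problems with test scales and evaluation standards, and the cross-cultural limitations of psychological tests.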

13.
To assess the effects of logical support, the Smith, Meux, Coombs, and Nuthall (12) system for the analysis of teaching strategies was used to construct four passages for each of two topics, fluoridation and the use of pesticides. The four passages varied in degree and kind of support or justification for the negative value judgments used. The passages were administered to 303 eleventh grade students. The group which received the passages having the most support for the negative value judgments reacted negatively as much or more than the other three groups, while the group which received the passages having the least support reacted negatively as little or less than the other three groups. There was also a topic-by-passage interaction.

14.
Abstract

Cognitive changes in socially disadvantaged children in Grades 5 to 7 who were participating in a one-to-one tutoring program in Israel were assessed. Tutors were university students who received a partial tuition rebate if they met their child twice a week in 2-hour sessions over a 7-month period. The progress of a sample of tutored children was compared to that of a sample of nontutored children in mathematics, reading (Hebrew), and English. The tutored children were not found to be at an advantage on the tests although other data from tutors, parents, children, and teachers indicated that the project should be having an impact on academic achievement.

15.
This paper illustrates that the psychometric properties of scores and scales that are used with mixed‐format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is on mixed‐format tests in situations for which raw scores are integer‐weighted sums of item scores. Four associated real‐data examples include (a) effects of weights associated with each item type on reliability, (b) comparison of psychometric properties of different scale scores, (c) evaluation of the equity property of equating, and (d) comparison of the use of unidimensional and multidimensional procedures for evaluating psychometric properties. Throughout the paper, and especially in the conclusion section, the examples are related to issues associated with test interpretation and test use.

16.
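Focusing on subject-selection combinations and grade-level scores in the new round of college entrance examination (gaokao) reform, this paper discusses the complex mapping between subject-selection combinations and the subject requirements of admission majors, the conversion of grade-level scores and its application, and the coordinated development of subject-selection combinations and diversified admissions. Its aim is to promote the continued deepening and sound development of research on gaokao reform.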

18.
19.
In this study, a variation of the bookmark standard setting procedure for passage-based tests is proposed in which separate ordered item booklets are created for the items associated with each passage. This variation is compared to the traditional bookmark procedure for a fifth-grade reading test. The results showed that the single-passage bookmark method produced greater consistency among the participants' cutscores, and most participants' bookmark placements did not change after the first round. In addition, participants reported greater understanding of the bookmarking task and greater confidence in their recommended cutscores. Both procedures required approximately the same amount of time to complete, but it is likely that the single-passage bookmark method could be carried out in two, or possibly even one, round of bookmarking rather than the three rounds used in traditional bookmarking. On the other hand, there are several concerns about the single-passage bookmark method that warrant further research. These include floor and ceiling effects, training issues, optimal booklet length, and multiple standards.

20.
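Taking a diagnostic mathematics examination for third-year senior high school students as the observation sample, this study statistically analyzes each question type and derives the regularities relating the distribution of errors to total scores. Teaching strategies are then adjusted according to the characteristics of the error distribution, gradually forming a feedback-and-improvement teaching model.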
