Similar literature
 20 similar documents found
1.
The purpose of this ITEMS module is to provide an introduction to subscores. First, examples of subscores from an operational test are provided. Then, a review of methods that can be used to examine if subscores have adequate psychometric quality is provided. It is demonstrated, using results from operational and simulated data, that subscores have to be based on a sufficient number of items and have to be sufficiently distinct from each other to have adequate psychometric quality. It is also demonstrated that several operationally reported subscores do not have adequate psychometric quality. Recommendations are made for those interested in reporting subscores for educational tests.

2.
Recently, there has been an increasing level of interest in subscores for their potential diagnostic value. Haberman suggested a method based on classical test theory to determine whether subscores have added value over total scores. In this article I first provide a rich collection of results regarding when subscores were found to have added value for several operational data sets. Following that I provide results from a detailed simulation study that examines what properties subscores should possess in order to have added value. The results indicate that subscores have to satisfy strict standards of reliability and correlation to have added value. A weighted average of the subscore and the total score was found to have added value more often.
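
The comparison behind this kind of added-value check can be sketched as follows. This is a minimal illustration, not the article's code. It assumes the usual classical test theory setup: the total score includes the subscore, measurement errors are uncorrelated with true scores and with each other, and the criterion is the proportional reduction in mean squared error (PRMSE) when predicting the true subscore. The function name and arguments are hypothetical.

```python
def haberman_added_value(var_s, var_x, cov_sx, rel_s):
    """Compare how well the observed subscore and the total score
    predict the true subscore (illustrative sketch, CTT assumptions).

    var_s  : variance of the observed subscore
    var_x  : variance of the observed total score (which contains the subscore)
    cov_sx : covariance between observed subscore and observed total score
    rel_s  : reliability of the subscore
    """
    var_true_s = rel_s * var_s            # true-score variance of the subscore
    var_err_s = (1.0 - rel_s) * var_s     # error variance of the subscore
    # The subscore's error is also part of the total score, so it must be
    # removed to get the covariance between the TRUE subscore and the total.
    cov_true_s_x = cov_sx - var_err_s
    prmse_sub = rel_s                     # PRMSE when predicting from the subscore
    prmse_total = cov_true_s_x ** 2 / (var_true_s * var_x)
    return prmse_sub, prmse_total, prmse_sub > prmse_total
```

For example, `haberman_added_value(25.0, 400.0, 80.0, 0.8)` gives a subscore PRMSE of 0.8 against a total-score PRMSE of 0.703125, so under these made-up inputs the subscore would be judged to add value.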

3.
In this digital ITEMS module, Dr. Roy Levy describes Bayesian approaches to psychometric modeling. He discusses how Bayesian inference is a mechanism for reasoning in a probability-modeling framework and is well-suited to core problems in educational measurement: reasoning from student performances on an assessment to make inferences about their capabilities more broadly conceived, as well as fitting models to characterize the psychometric properties of tasks. The approach is first developed in the context of estimating a mean and variance of a normal distribution before turning to the context of unidimensional item response theory (IRT) models for dichotomously scored data. Dr. Levy illustrates the process of fitting Bayesian models using the JAGS software facilitated through the R statistical environment. The module is designed to be relevant for students, researchers, and data scientists in various disciplines such as education, psychology, sociology, political science, business, health, and other social sciences. It contains audio-narrated slides, diagnostic quiz questions, and data-based activities with video solutions as well as curated resources and a glossary.

4.
Recently, there has been an increasing level of interest in subscores for their potential diagnostic value. Haberman (2008b) suggested reporting an augmented subscore that is a linear combination of a subscore and the total score. Sinharay and Haberman (2008) and Sinharay (2010) showed that augmented subscores often lead to more accurate diagnostic information than subscores. In order to report augmented subscores operationally, they should be comparable across the different forms of a test. One way to achieve comparability is to equate them. We suggest several methods for equating augmented subscores. Results from several operational and simulated data sets show that the error in the equating of augmented subscores appears to be small in most practical situations.

5.
In this ITEMS module, we introduce the generalized deterministic inputs, noisy “and” gate (G‐DINA) model, which is a general framework for specifying, estimating, and evaluating a wide variety of cognitive diagnosis models. The module contains a nontechnical introduction to diagnostic measurement, an introductory overview of the G‐DINA model, as well as common special cases, and a review of model‐data fit evaluation practices within this framework. We use the flexible GDINA R package, which is available for free within the R environment and provides a user‐friendly graphical interface in addition to the code‐driven layer. The digital module also contains videos of worked examples, solutions to data activity questions, curated resources, a glossary, and quizzes with diagnostic feedback.

6.
In this ITEMS module, we provide a didactic overview of the specification, estimation, evaluation, and interpretation steps for diagnostic measurement/classification models (DCMs), which are a promising psychometric modeling approach. These models can provide detailed skill‐ or attribute‐specific feedback to respondents along multiple latent dimensions and hold theoretical and practical appeal for a variety of fields. We use a current unified modeling framework, the log‐linear cognitive diagnosis model (LCDM), as well as a series of quality‐control checklists for data analysts and scientific users to review the foundational concepts, practical steps, and interpretational principles for these models. We demonstrate how the models and checklists can be applied in real‐life data‐analysis contexts. A library of macros and supporting files for Excel, SAS, and Mplus is provided along with video tutorials for key practices.

7.
Drawing valid inferences from modern measurement models is contingent upon a good fit of the data to the model. Violations of model‐data fit have numerous consequences, limiting the usefulness and applicability of the model. As Bayesian estimation is becoming more common, understanding the Bayesian approaches for evaluating model‐data fit is critical. In this instructional module, Allison Ames and Aaron Myers provide an overview of Posterior Predictive Model Checking (PPMC), the most common Bayesian model‐data fit approach. Specifically, they review the conceptual foundation of Bayesian inference as well as PPMC and walk through the computational steps of PPMC using real‐life data examples from simple linear regression and item response theory analysis. They provide guidance for how to interpret PPMC results and discuss how to implement PPMC for other model(s) and data. The digital module contains sample data, SAS code, diagnostic quiz questions, data‐based activities, curated resources, and a glossary.
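
The computational steps of PPMC can be sketched in a few lines. The example below is not taken from the module (which supplies SAS code). It is a toy illustration that assumes a normal model with known standard deviation and a flat prior on the mean, so the posterior of the mean is normal with mean equal to the sample mean and standard deviation sigma / sqrt(n). The function name and the choice of discrepancy measure are hypothetical.

```python
import random
import statistics


def ppmc_p_value(y, sigma, discrepancy, n_draws=2000, seed=0):
    """Posterior predictive p-value for a toy normal model (sketch).

    Assumes y_i ~ Normal(mu, sigma) with sigma known and a flat prior
    on mu, so that mu | y ~ Normal(mean(y), sigma / sqrt(n)).
    """
    rng = random.Random(seed)
    n = len(y)
    y_bar = statistics.fmean(y)
    post_sd = sigma / n ** 0.5
    t_obs = discrepancy(y)                    # discrepancy of the observed data
    exceed = 0
    for _ in range(n_draws):
        mu = rng.gauss(y_bar, post_sd)        # 1. draw a parameter from the posterior
        y_rep = [rng.gauss(mu, sigma) for _ in range(n)]  # 2. simulate replicated data
        if discrepancy(y_rep) >= t_obs:       # 3. compare replicated vs. observed
            exceed += 1
    return exceed / n_draws                   # 4. proportion = posterior predictive p-value
```

Calling `ppmc_p_value([1.0, 2.0, 3.0, 4.0, 50.0], 2.0, max)` asks whether the model can reproduce a maximum as large as the observed one. A p-value near 0 or 1 signals that the model does not capture that feature of the data.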

8.
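The digital employment service platform for college graduates was developed to improve the quality of university employment services. The platform builds databases of employers and students, enabling real-time interaction and precise matching between graduates' job needs and the large volume of hiring demand in the market, which addresses the information asymmetry between graduates and employers. Its remote video interview system offers interviews across locations, removing constraints of time and place, making recruitment fairs routine and immediate, and providing convenience for students and employers. The system's statistics module distills and analyzes data on student job seeking and market demand and builds an employment evaluation model for the university, providing decision support for employment guidance and talent development.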

9.
In this digital ITEMS module, Dr. Sue Lottridge, Amy Burkhardt, and Dr. Michelle Boyer provide an overview of automated scoring. Automated scoring is the use of computer algorithms to score unconstrained open-ended test items by mimicking human scoring. The use of automated scoring is increasing in educational assessment programs because it allows scores to be returned faster at lower cost. In the module, they discuss automated scoring from a number of perspectives. First, they discuss benefits and weaknesses of automated scoring, and what psychometricians should know about automated scoring. Next, they describe the overall process of automated scoring, moving from data collection to engine training to operational scoring. Then, they describe how automated scoring systems work, including the basic functions around score prediction as well as other flagging methods. Finally, they conclude with a discussion of the specific validity demands around automated scoring and how they align with the larger validity demands around test scores. Two data activities are provided. The first is an interactive activity that allows the user to train and evaluate a simple automated scoring engine. The second is a worked example that examines the impact of rater error on test scores. The digital module contains a link to an interactive web application as well as its R-Shiny code, diagnostic quiz questions, activities, curated resources, and a glossary.

10.
This study examines the intersection of two key reform ideas in science teacher education: professional teaching standards and the use of case methods. In this article, we track the historical development of what can be called second wave teaching standards and describe how those standards can be exemplified through multimedia web cases of science teaching. We describe a web case development project in which a group of experienced secondary science teachers work together over several months to video their own classes, and assemble video and audio commentaries of their lessons based on a set of science teaching standards. We conclude that the project was a rich professional development experience for those involved. Further, as the teaching standards movement gathers momentum in Australia and elsewhere, we contend that high quality multimedia cases linked to a standards framework show considerable promise as a vehicle to assist science teachers to reflect on their practice.

11.
Although researchers have reported positive effects on teacher learning from observing published video, teachers’ own video, and their colleagues’ video, very few professional development programs have integrated all three types of video to improve teacher learning. In this study, we examined the affordances and challenges of the three types of video when they were used in a Problem-Based Learning professional development program, drawing upon multiple data sources from 26 K-12 science teachers. We present a case study to illustrate how one teacher might learn from each type of video, and conclude with recommendations for using video in professional development.

12.
Recent research has proposed a criterion to evaluate the reportability of subscores. This criterion is a value‐added ratio (VAR), where values greater than 1 suggest that the true subscore is better approximated by the observed subscore than by the total score. This research extends the existing literature by quantifying statistical significance and effect size for using VAR to provide practical guidelines for subscore interpretation and reporting. Findings indicate that subscores with VAR ≥ 1.1 are a minimum requirement for a meaningful contribution to a user's score interpretation; subscores with .9 < VAR < 1.1 are redundant with the total score and subscores with VAR ≤ .9 would be misleading to report. Additionally, we discuss what to do when subscores do not add value, yet must be reported, as well as when VAR ≥ 1.1 may be undesirable.
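
The reporting guideline stated in this abstract amounts to a three-way decision rule on the VAR value. A minimal sketch, assuming the VAR has already been computed elsewhere; the function name and the returned labels are hypothetical and paraphrase the abstract's wording:

```python
def classify_var(var_ratio):
    """Map a value-added ratio (VAR) to the abstract's three categories."""
    if var_ratio >= 1.1:
        return "adds value"                  # VAR >= 1.1: meaningful contribution
    if var_ratio > 0.9:
        return "redundant with total score"  # .9 < VAR < 1.1
    return "misleading to report"            # VAR <= .9
```

Note that the boundaries follow the abstract exactly: a VAR of exactly 1.1 counts as adding value and a VAR of exactly .9 counts as misleading.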

13.
In this digital ITEMS module, Nikole Gregg and Dr. Brian Leventhal discuss strategies to ensure data visualizations achieve graphical excellence. Data visualizations are commonly used by measurement professionals to communicate results to examinees, the public, educators, and other stakeholders. To do so effectively, it is important that these visualizations communicate data efficiently and accurately. These visualizations can achieve graphical excellence when they simultaneously display data effectively, efficiently, and accurately. Unfortunately, measurement and statistical software default graphics typically fail to uphold these standards and are therefore not suitable for publication or presentation to the public. To illustrate best practices, the instructors provide an introduction to the graphical template language in SAS and show how elementary components can be used to make efficient, effective, and accurate graphics for a variety of audiences. The module contains audio-narrated slides, embedded illustrative videos, quiz questions with diagnostic feedback, a glossary, sample SAS code, and other learning resources.

14.
Will subscores provide information beyond what is provided by the total score? Is there a method that can estimate more trustworthy subscores than observed subscores? To answer the first question, this study evaluated whether the true subscore was more accurately predicted by the observed subscore or total score. To answer the second question, three subscore estimation methods (i.e., subscore estimated from the observed subscore, total score, or a combination of both the subscore and total score) were compared. Analyses were conducted using data from six licensure tests. Results indicated that reporting subscores at the examinee level may not be necessary as they did not provide much additional information over what is provided by the total score. However, at the institutional level (for institution size ≥ 30), reporting subscores may not be harmful, although they may be redundant because the subscores were predicted equally well by the observed subscores or total scores. Finally, results indicated that estimating the subscore using a combination of observed subscore and total score resulted in the highest reliability.

15.
This article discusses the application of digital video to multimedia, and looks at the pros and cons of two different approaches. It also considers the standards of equipment for various uses.

16.
In this digital ITEMS module, Dr. Jue Wang and Dr. George Engelhard Jr. describe the Rasch measurement framework for the construction and evaluation of new measures and scales. From a theoretical perspective, they discuss the historical and philosophical perspectives on measurement with a focus on Rasch's concept of specific objectivity and invariant measurement. Specifically, they introduce the origins of Rasch measurement theory, the development of model‐data fit indices, as well as commonly used Rasch measurement models. From an applied perspective, they discuss best practices in constructing, estimating, evaluating, and interpreting a Rasch scale using empirical examples. They provide an overview of a specialized Rasch software program (Winsteps) and an R program embedded within Shiny (Shiny_ERMA) for conducting the Rasch model analyses. The module is designed to be relevant for students, researchers, and data scientists in various disciplines such as psychology, sociology, education, business, health, and other social sciences. It contains audio‐narrated slides, sample data, syntax files, access to the Shiny_ERMA program, diagnostic quiz questions, data‐based activities, curated resources, and a glossary.
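
For readers unfamiliar with the model itself, the dichotomous Rasch model gives the probability of a correct response as a logistic function of the difference between person ability and item difficulty. The snippet below is a minimal sketch of that formula only. It is not part of Winsteps or Shiny_ERMA, and the function and parameter names are hypothetical.

```python
import math


def rasch_probability(theta, b):
    """Probability of a correct response under the dichotomous Rasch model.

    theta : person ability (logits)
    b     : item difficulty (logits)
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))
```

When ability equals difficulty, `rasch_probability(0.0, 0.0)` returns 0.5. The probability depends only on the difference `theta - b`, which is what allows persons and items to be placed on one common scale.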

17.
Digital video is a growing and important presence in student learning. This paper reports the results of a survey of American educators in Michigan (n = 426) conducted in spring 2008. The survey included questions about educators’ attitudes toward the streaming and downloadable video services available to them in their schools. The survey results showed that educators mainly used digital video to introduce and to conclude learning experiences. These educators, who predominantly used the Web‐based unitedstreaming™ digital video service, relished the service’s content breadth, short clips and potential to foster instructional innovation. However, many survey respondents reported concerns about internet connection bandwidth, projection capabilities and lack of end‐user control over on‐demand download and manipulation. Educators viewed school‐level Information and Communications Technology policies, bandwidth capacity and lack of support personnel as the causes of these difficulties. Because little research has been done on the implementation of digital video services in US schools, the conclusions of this study may provide direction for educators, researchers and policy‐makers who use a variety of digital video platforms and services.

18.
This paper presents an efficient, power-optimized VLSI architecture of the context-based adaptive variable length coding (CAVLC) decoder for the H.264/advanced video coding (AVC) standard. In the proposed design, a first-one detector exploits the regularity of the codewords to overcome the low efficiency and high power dissipation of the traditional table-searching method. Because the data used in run_before decoding are interdependent, arithmetic operations are combined with a finite state machine (FSM), which achieves higher decoding efficiency. Following the CAVLC decoding flow, clock gating is applied at both the module level and the register level, which reduces the overall dynamic power dissipation by 43%. The proposed design can decode every syntax element in one clock cycle. When the design is synthesized under a clock constraint of 100 MHz, the synthesis result shows that it costs 11 300 gates in a 0.25 μm CMOS technology, which meets the demand of real-time decoding in the H.264/AVC standard.
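
The idea behind a first-one detector can be illustrated in software. Some CAVLC syntax elements (for example `level_prefix`) are coded as a run of zeros terminated by a single 1, so locating the first 1 yields the value directly without a table search. The sketch below only illustrates that principle on a bit string. It is not the paper's hardware design, and the function name is hypothetical.

```python
def leading_zeros(bits):
    """Return the number of '0' characters before the first '1'.

    bits : string of '0'/'1' characters, most significant bit first.
    Raises ValueError if the string contains no '1'.
    """
    index = bits.find("1")       # position of the first one
    if index == -1:
        raise ValueError("bit string contains no 1")
    return index                 # zeros before the first one
```

For the input `"0001011"` the function returns 3. In hardware the same result is typically produced combinationally, which fits the abstract's statement that each syntax element is decoded in one clock cycle.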

19.
The goal of this study is to develop the professional noticing abilities of prospective elementary school teachers in the context of the Stages of Early Arithmetic Learning. In their mathematics methods course, ninety-four prospective elementary school teachers from three institutions participated in a researcher-developed five-session module that progressively nests the three interrelated components of professional noticing: attending, interpreting, and deciding. The module embeds video excerpts of diagnostic interviews of children doing mathematics (representations of practice) to prepare the prospective teachers for similar work. The module culminates with prospective teachers implementing similar diagnostic interviews (approximations of practice) to gain experience in the three component skills of professional noticing. A pre- and post-assessment was administered to measure prospective teachers’ change in the three components. A Wilcoxon signed ranks test was conducted and found the prospective elementary school teachers demonstrated significant growth in all three components. Selected prospective elementary school teacher responses on the pre- and post-assessment are provided to illustrate sample growth in the prospective teachers’ abilities to professionally notice. These results, the first in an ongoing study, indicate the potential that prospective teachers can develop professional noticing skills through this module. Continued data collection and analysis from the ongoing study by these authors and future, longer-term emphasis on professional noticing for prospective teachers should be studied.

20.
Brennan noted that users of test scores often want (indeed, demand) that subscores be reported, along with total test scores, for diagnostic purposes. Haberman suggested a method based on classical test theory (CTT) to determine if subscores have added value over the total score. One way to interpret the method is that a subscore has added value only if it has a better agreement than the total score with the corresponding subscore on a parallel form. The focus of this article is on classification of the examinees into “pass” and “fail” (or master and nonmaster) categories based on subscores. A new CTT‐based method is suggested to assess whether classification based on a subscore is in better agreement, than classification based on the total score, with classification based on the corresponding subscore on a parallel form. The method can be considered as an assessment of the added value of subscores with respect to classification. The suggested method is applied to data from several operational tests. The added value of subscores with respect to classification is found to be very similar, except at extreme cutscores, to their added value from a value‐added analysis of Haberman.
