Similar Literature
20 similar records found.
1.
This paper proposes a class of lack-of-fit tests for fitting a linear regression model when some response variables are missing at random. The tests are based on a class of minimum integrated square distances between a kernel-type estimator of the regression function and the parametric regression function being fitted. They are shown to be consistent against a large class of fixed alternatives, and the corresponding test statistics are shown to have asymptotically normal distributions under the null hypothesis and under a class of nonparametric local alternatives. Some simulation results are also presented.
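The central quantity behind these tests is the integrated squared distance between a kernel regression estimate and the fitted parametric regression function. The sketch below computes such a distance for a simple linear fit on fully observed data; it is only an illustration of the distance itself, not the paper's test statistic, which additionally accounts for responses missing at random, and the Gaussian kernel, bandwidth and grid are illustrative choices.

```python
import numpy as np

def nadaraya_watson(x_grid, x, y, h):
    """Gaussian-kernel regression estimate of E[Y | X] on x_grid."""
    w = np.exp(-0.5 * ((x_grid[:, None] - x[None, :]) / h) ** 2)
    return (w @ y) / w.sum(axis=1)

def integrated_sq_distance(x, y, h=0.3, n_grid=200):
    """Integrated squared distance between a kernel fit and the OLS line."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # parametric (linear) fit
    grid = np.linspace(x.min(), x.max(), n_grid)
    m_hat = nadaraya_watson(grid, x, y, h)        # nonparametric fit
    m_par = beta[0] + beta[1] * grid
    return np.trapz((m_hat - m_par) ** 2, grid)   # numerical integral over the grid

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = 1.0 + 2.0 * x + rng.normal(0, 0.2, 200)       # data generated from the null model
print(integrated_sq_distance(x, y))               # small under the null, large otherwise
```

Large values of the distance, calibrated against its asymptotic null distribution, would indicate lack of fit.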

2.
A phenomenon that I call “adaptive percolation” commonly arises in biology, business, economics, defense, finance, manufacturing, and the social sciences. One wishes to select a handful of entities from a large pool via a process of screening through a hierarchy of sieves, not unlike the percolation of a liquid through a porous medium. The probability model developed here is based on a nested and adaptive Bayesian approach that results in a product of beta-binomial distributions with common parameters; the common parameters happen to be the observed data. I call this the percolated beta-binomial distribution. The model turns out to be a slight generalization of the probabilistic model used in percolation theory, a consequence of using a subjectively specified likelihood function to construct a probability model. The notion of using likelihoods to construct probability models is not part of the conventional toolkit of applied probabilists, and to the best of my knowledge the use of a product of beta-binomial distributions as a probability model for Bernoulli trials is new. The development of the material in this article is illustrated with data from the 2009 astronaut selection program, which motivated this work.
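To fix ideas, the screening process can be simulated forward as a hierarchy of sieves in which the number of candidates surviving each sieve, given the number entering it, is beta-binomial. This is only an illustrative simulation with made-up stage parameters (using scipy's `betabinom`), not the author's percolated beta-binomial model, whose common parameters are the observed data themselves.

```python
import numpy as np
from scipy.stats import betabinom

rng = np.random.default_rng(2009)

def simulate_screening(n_applicants, stage_params):
    """Counts surviving a hierarchy of sieves: at each stage the number
    retained, given the number entering, is BetaBinomial(n_in, a, b)."""
    counts = [n_applicants]
    for a, b in stage_params:           # stage_params: list of (a, b) pairs
        n_in = counts[-1]
        n_out = int(betabinom.rvs(n_in, a, b, random_state=rng)) if n_in > 0 else 0
        counts.append(n_out)
    return counts

# Hypothetical sieve parameters; each stage is progressively more selective.
stages = [(2.0, 8.0), (1.5, 10.0), (1.0, 12.0)]
print(simulate_screening(3500, stages))   # counts entering/surviving each stage
```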

3.
We consider the problem of estimation of a finite population variance related to a sensitive character under a randomized response model and prove (i) the admissibility of an estimator for a given sampling design in a class of quadratic unbiased estimators and (ii) the admissibility of a sampling strategy in a class of comparable quadratic unbiased strategies.

4.
The author is concerned with log-linear estimators of the size N of a population in a capture–recapture experiment featuring heterogeneity in the individual capture probabilities and a time effect. He also considers models where the first capture influences the probability of subsequent captures. He derives several results from a new inequality associated with a dispersive ordering for discrete random variables. He shows that in a log-linear model with inter-individual heterogeneity, the estimator of N is an increasing function of the heterogeneity parameter. He also shows that the inclusion of a time effect in the capture probabilities decreases the estimate of N in models without heterogeneity. He further argues that a model featuring heterogeneity can accommodate a time effect through a small change in the heterogeneity parameter. He demonstrates these results using an inequality for the estimators of the heterogeneity parameters and illustrates them in a Monte Carlo experiment.

5.
A composition is a vector of positive components summing to a constant. The sample space of a composition is the simplex, and the sample space of two compositions, a bicomposition, is the Cartesian product of two simplices. We present a way of generating random variates from a bicompositional Dirichlet distribution defined on the Cartesian product of two simplices using the rejection method. We derive a general solution for finding a dominating density function and a rejection constant, and compare this solution to using a uniform dominating density function. Finally, some examples of generated bicompositional random variates, with varying numbers of components, are presented.
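The rejection step itself is short once a dominating density and rejection constant are available. The sketch below uses the simpler of the two options compared in the paper, a uniform dominating density on the product of the two simplices (uniform draws are Dirichlet(1, …, 1) on each factor); the target density `f` and the constant `c` are user-supplied, and the toy target here is just a product of ordinary Dirichlet densities, not the bicompositional Dirichlet density itself.

```python
import numpy as np
from math import factorial
from scipy.stats import dirichlet

rng = np.random.default_rng(1)

def sample_bicomposition(f, c, d1, d2, n_samples):
    """Rejection sampling on the product of a (d1-1)- and a (d2-1)-simplex,
    with a uniform dominating density g and a constant c such that f <= c * g."""
    g = factorial(d1 - 1) * factorial(d2 - 1)    # uniform density on the product
    out = []
    while len(out) < n_samples:
        x = rng.dirichlet(np.ones(d1))           # uniform proposal on first simplex
        y = rng.dirichlet(np.ones(d2))           # uniform proposal on second simplex
        if rng.uniform() <= f(x, y) / (c * g):   # accept with probability f / (c g)
            out.append((x, y))
    return out

# Toy target for illustration: independent Dirichlet densities on each simplex.
f = lambda x, y: dirichlet.pdf(x, [2, 3, 4]) * dirichlet.pdf(y, [3, 2])
samples = sample_bicomposition(f, c=10.0, d1=3, d2=2, n_samples=5)
```

The expected acceptance rate is 1/c, which is why an optimized dominating density can beat the uniform one.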

6.
In estimating the shape parameter of a two-parameter Weibull distribution from a failure-censored sample, a recently popular procedure is to employ a testimator, a shrinkage estimator based on a preliminary hypothesis test for a guessed value of the parameter. Such an adaptive testimator is a linear compound of the guessed value and a statistic. A new compounding coefficient is numerically shown to yield higher efficiency in many situations compared with some of the existing ones.
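The generic structure of such a testimator can be sketched as follows: a preliminary test of the guessed shape value decides whether the maximum likelihood estimate is shrunk toward the guess. The sketch uses a complete (uncensored) sample, a likelihood-ratio preliminary test and an arbitrary fixed shrinkage weight `k`, all of which are simplifying assumptions; the article's data-dependent compounding coefficient and the failure-censored likelihood are not reproduced here.

```python
import numpy as np
from scipy import stats

def weibull_shape_testimator(data, shape_guess, alpha=0.05, k=0.5):
    """Shrinkage testimator for the Weibull shape parameter (illustrative)."""
    # Unrestricted MLE with the location fixed at 0.
    c_hat, _, scale_hat = stats.weibull_min.fit(data, floc=0)
    ll_full = stats.weibull_min.logpdf(data, c_hat, scale=scale_hat).sum()

    # Restricted MLE with the shape fixed at the guessed value.
    _, _, scale_0 = stats.weibull_min.fit(data, fc=shape_guess, floc=0)
    ll_null = stats.weibull_min.logpdf(data, shape_guess, scale=scale_0).sum()

    lr_stat = 2.0 * (ll_full - ll_null)
    if lr_stat <= stats.chi2.ppf(1 - alpha, df=1):
        # Preliminary test does not reject: shrink the MLE toward the guess.
        return k * shape_guess + (1.0 - k) * c_hat
    return c_hat   # preliminary test rejects: keep the MLE

rng = np.random.default_rng(7)
sample = stats.weibull_min.rvs(1.8, scale=2.0, size=60, random_state=rng)
print(weibull_shape_testimator(sample, shape_guess=2.0))
```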

7.
We present a mathematical theory of objective, frequentist chance phenomena that uses as a model a set of probability measures. In this work, sets of measures are not viewed as a statistical compound hypothesis or as a tool for modeling imprecise subjective behavior. Instead we use sets of measures to model stable (although not stationary in the traditional stochastic sense) physical sources of finite time series data that have highly irregular behavior. Such models give a coarse-grained picture of the phenomena, keeping track of the range of the possible probabilities of the events. We present methods to simulate finite data sequences coming from a source modeled by a set of probability measures, and to estimate the model from finite time series data. The estimation of the set of probability measures is based on the analysis of a set of relative frequencies of events taken along subsequences selected by a collection of rules. In particular, we provide a universal methodology for finding a family of subsequence selection rules that can estimate any set of probability measures with high probability.
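The estimation idea, examining relative frequencies along subsequences picked out by different selection rules and reporting their range, can be illustrated with a few toy rules. The rules below (take every third observation, take observations following a 1, following a 0) are simple stand-ins for the universal family of selection rules constructed in the paper.

```python
import numpy as np

def relative_frequencies(bits, rules):
    """Relative frequency of 1s along each rule-selected subsequence."""
    freqs = {}
    for name, rule in rules.items():
        idx = [t for t in range(len(bits)) if rule(t, bits)]
        if idx:
            freqs[name] = float(np.mean([bits[t] for t in idx]))
    return freqs

# Toy selection rules: each looks only at the index t and the past bits[:t].
rules = {
    "all":       lambda t, b: True,
    "every 3rd": lambda t, b: t % 3 == 0,
    "after a 1": lambda t, b: t > 0 and b[t - 1] == 1,
    "after a 0": lambda t, b: t > 0 and b[t - 1] == 0,
}

# A stable but nonstationary source: the success probability alternates in a band.
rng = np.random.default_rng(42)
p_t = np.where(np.sin(np.linspace(0, 40, 5000)) > 0, 0.6, 0.4)
bits = (rng.uniform(size=5000) < p_t).astype(int)

freqs = relative_frequencies(bits, rules)
print(freqs)
print("estimated probability interval:", (min(freqs.values()), max(freqs.values())))
```

The spread of these relative frequencies gives a coarse-grained picture of the set of probabilities compatible with the source.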

8.
Non-coding deoxyribonucleic acid (DNA) can typically be modelled by a sequence of Bernoulli random variables by coding one base, e.g. T, as 1 and the other bases as 0. If a segment of a sequence is functionally important, the probability of a 1 will differ in this changed segment from that in the surrounding DNA. It is important to be able to detect whether such a segment occurs in a particular DNA sequence and to pinpoint it so that a molecular biologist can investigate its possible function. Here we discuss methods for testing for the occurrence of such a changed segment and for estimating its end points. Maximum-likelihood-based methods are not very tractable, so a nonparametric method based on the approach of Pettitt has been developed. The problem and its solution are illustrated by a specific DNA example.
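As a concrete, if naive, illustration of locating a changed segment, one can scan all candidate segments and score each by a Bernoulli likelihood ratio comparing the proportion of 1s inside and outside the segment. This brute-force scan is not the Pettitt-type nonparametric procedure discussed above; it only shows the shape of the problem, and the minimum segment length is an arbitrary choice.

```python
import numpy as np

def bernoulli_loglik(k, n):
    """Maximized Bernoulli log-likelihood for k successes in n trials."""
    if n == 0 or k == 0 or k == n:
        return 0.0
    p = k / n
    return k * np.log(p) + (n - k) * np.log(1 - p)

def best_changed_segment(x, min_len=20):
    """Return the segment [i, j) maximizing the likelihood-ratio score for
    'the success probability inside differs from the one outside'."""
    x = np.asarray(x)
    n = len(x)
    cum = np.concatenate([[0], np.cumsum(x)])
    base = bernoulli_loglik(cum[-1], n)            # fit with one common probability
    best_seg, best_score = None, -np.inf
    for i in range(n - min_len):
        for j in range(i + min_len, n + 1):
            k_in, n_in = cum[j] - cum[i], j - i
            k_out, n_out = cum[-1] - k_in, n - n_in
            score = bernoulli_loglik(k_in, n_in) + bernoulli_loglik(k_out, n_out) - base
            if score > best_score:
                best_seg, best_score = (i, j), score
    return best_seg, best_score

rng = np.random.default_rng(3)
seq = rng.binomial(1, 0.25, 600)                   # background T-frequency 0.25
seq[250:330] = rng.binomial(1, 0.60, 80)           # planted T-rich segment
print(best_changed_segment(seq))                   # roughly recovers (250, 330)
```

A significance assessment would then compare the maximal score with its null distribution, e.g. by simulation.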

9.
Standard goodness-of-fit tests for a parametric regression model against a series of nonparametric alternatives are based on residuals arising from a fitted model. When a parametric regression model is compared with a nonparametric model, goodness-of-fit testing can be naturally approached by evaluating the likelihood of the parametric model within a nonparametric framework. We employ the empirical likelihood for an α-mixing process to formulate a test statistic that measures the goodness of fit of a parametric regression model. The technique is based on a comparison with kernel smoothing estimators. The empirical likelihood formulation of the test has two attractive features. One is its automatic consideration of the variation that is associated with the nonparametric fit, due to empirical likelihood's ability to Studentize internally. The other is that the asymptotic distribution of the test statistic is free of unknown parameters, avoiding plug-in estimation. We apply the test to a discretized diffusion model which has recently been considered in financial market analysis.

10.
Recursive partitioning algorithms separate a feature space into a set of disjoint rectangles, and then, usually, a constant is fitted within each partition. While this is a simple and intuitive approach, it may lack interpretability as to how a specific relationship between dependent and independent variables looks. Alternatively, a certain model may be assumed or of interest, together with a number of candidate variables that may non-linearly give rise to different model parameter values. We present an approach that combines generalized linear models (GLMs) with recursive partitioning, offering enhanced interpretability compared with classical trees as well as an explorative way to assess a candidate variable's influence on a parametric model. The method conducts recursive partitioning of a GLM by (1) fitting the model to the data set, (2) testing for parameter instability over a set of partitioning variables, and (3) splitting the data set with respect to the variable associated with the highest instability. The outcome is a tree in which each terminal node is associated with a GLM. We show the method's versatility and its suitability for gaining additional insight into the relationship between dependent and independent variables with two examples, modelling voting behaviour and a failure model for debt amortization, and compare it with alternative approaches.
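A stripped-down sketch of this fit–test–split cycle is given below for an ordinary linear model rather than a full GLM, with parameter instability assessed by a crude split-at-the-median comparison of residual sums of squares. The real procedure fits GLMs and uses score-based parameter-instability tests; the function names, the instability measure and the stopping thresholds here are illustrative.

```python
import numpy as np

def fit_rss(X, y):
    """Least-squares fit; returns coefficients and residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, float(np.sum((y - X @ beta) ** 2))

def instability_score(X, y, z):
    """Crude instability measure for a partitioning variable z: the RSS
    reduction obtained by refitting the model on both sides of median(z)."""
    _, rss_all = fit_rss(X, y)
    left = z <= np.median(z)
    if left.sum() <= X.shape[1] or (~left).sum() <= X.shape[1]:
        return 0.0
    return rss_all - (fit_rss(X[left], y[left])[1] + fit_rss(X[~left], y[~left])[1])

def model_tree(X, y, Z, depth=0, max_depth=2, min_gain=1.0):
    """Recursive partitioning of a linear model over partitioning variables Z."""
    gains = [instability_score(X, y, Z[:, j]) for j in range(Z.shape[1])]
    j = int(np.argmax(gains))
    if depth >= max_depth or gains[j] < min_gain:          # stop: terminal model
        return {"leaf": True, "coef": fit_rss(X, y)[0]}
    split = float(np.median(Z[:, j]))
    left = Z[:, j] <= split
    return {"leaf": False, "var": j, "split": split,
            "left": model_tree(X[left], y[left], Z[left], depth + 1, max_depth, min_gain),
            "right": model_tree(X[~left], y[~left], Z[~left], depth + 1, max_depth, min_gain)}

rng = np.random.default_rng(0)
n = 400
Z = rng.uniform(0, 1, (n, 2))                     # candidate partitioning variables
x = rng.normal(size=n)
slope = np.where(Z[:, 0] > 0.5, 3.0, -1.0)        # regression slope depends on Z[:, 0]
y = 1.0 + slope * x + rng.normal(0, 0.5, n)
X = np.column_stack([np.ones(n), x])
print(model_tree(X, y, Z))
```

Each terminal node of the returned tree carries its own fitted coefficient vector, mirroring the tree-of-GLMs output described above.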

11.
A doubly nonstationary cylinder-based model is built to describe the dispersal of a population from a point source. In this model, each cylinder represents a fraction of the population, i.e., a group. Two contexts are considered: the dispersal can occur in a uniform habitat or in a fragmented habitat described by a conditional Boolean model. After the construction of the models, we investigate their properties: the first- and second-order moments, the probability that the population vanishes, and the distribution of the spatial extent of the population.

12.
This paper considers the problem of selecting a robust threshold for wavelet shrinkage. Previous approaches reported in the literature for handling the presence of outliers mainly focus on developing a robust procedure for a given threshold, which amounts to solving a nontrivial optimization problem. The drawback of this approach is that the selection of a robust threshold, which is crucial for the resulting fit, is ignored. This paper points out that the best fit can be achieved by a robust wavelet shrinkage with a robust threshold. We propose data-driven selection methods for a robust threshold, based on coupling classical wavelet thresholding rules with pseudo data. The concept of pseudo data has influenced the implementation of the proposed methods and provides a fast and efficient algorithm. Results from a simulation study and a real example demonstrate the promising empirical properties of the proposed approaches.
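For orientation, the classical (non-robust) baseline can be written in a few lines with PyWavelets: soft thresholding at the universal threshold, with the noise scale estimated by the median absolute deviation of the finest-level detail coefficients. The pseudo-data coupling that makes both the fit and the threshold selection robust to outliers is not shown; this block only fixes notation for what a threshold selection rule is.

```python
import numpy as np
import pywt

def universal_soft_shrink(signal, wavelet="db4", level=4):
    """Soft thresholding at the universal threshold sigma * sqrt(2 log n),
    with sigma estimated by the MAD of the finest detail coefficients."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745          # robust noise scale
    thresh = sigma * np.sqrt(2.0 * np.log(len(signal)))
    shrunk = [coeffs[0]] + [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(shrunk, wavelet)

rng = np.random.default_rng(5)
t = np.linspace(0, 1, 1024)
clean = np.sin(6 * np.pi * t) + (t > 0.5)                      # smooth curve plus a jump
noisy = clean + rng.normal(0, 0.3, t.size)
noisy[::97] += rng.choice([-4.0, 4.0], size=noisy[::97].size)  # a sprinkle of outliers
denoised = universal_soft_shrink(noisy)
```

With gross outliers present, classical shrinkage carries them into the fit, which is exactly the situation the proposed robust threshold selection and robust fitting address.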

13.

A statistical test can be seen as a procedure for producing a decision based on observed data, where some decisions consist of rejecting a hypothesis (yielding a significant result) and some do not, and where one controls the probability of making a wrong rejection at some prespecified significance level. Whereas traditional hypothesis testing involves only two possible decisions (to reject or not reject a null hypothesis), Kaiser’s directional two-sided test, as well as the more recently introduced testing procedure of Jones and Tukey, each equivalent to running two one-sided tests, involves three possible decisions when inferring the value of a unidimensional parameter. The latter procedure assumes that a point null hypothesis is impossible (e.g., that two treatments cannot have exactly the same effect), allowing a gain in statistical power. There are, however, situations where a point hypothesis is indeed plausible, for example when considering hypotheses derived from Einstein’s theories. In this article, we introduce a five-decision testing procedure, equivalent to running a traditional two-sided test in addition to two one-sided tests, which combines the advantages of the procedures of Kaiser (no assumption that a point hypothesis is impossible) and of Jones and Tukey (higher power). Compared with the traditional approach, it allows a nonnegligible (typically 20%) reduction of the sample size needed to reach a given statistical power.
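To make the three layers of testing concrete, the sketch below runs a two-sided one-sample t-test together with the two corresponding one-sided tests and maps the outcomes to five verbal conclusions about the sign of θ − θ0. The choice of the t-test, the nominal levels and the decision labels are illustrative assumptions about one plausible implementation, not the article's exact rule.

```python
import numpy as np
from scipy import stats

def five_decision_test(x, theta0=0.0, alpha=0.05):
    """One plausible five-decision mapping from a two-sided t-test and two
    one-sided t-tests (illustrative; see the article for the exact rule)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    t = (x.mean() - theta0) / (x.std(ddof=1) / np.sqrt(n))
    df = n - 1
    p_greater = stats.t.sf(t, df)     # one-sided p-value for H1: theta > theta0
    p_less = stats.t.cdf(t, df)       # one-sided p-value for H1: theta < theta0

    if p_greater < alpha / 2:         # two-sided and upper one-sided test reject
        return "conclude theta > theta0"
    if p_greater < alpha:             # only the upper one-sided test rejects
        return "conclude theta >= theta0"
    if p_less < alpha / 2:            # two-sided and lower one-sided test reject
        return "conclude theta < theta0"
    if p_less < alpha:                # only the lower one-sided test rejects
        return "conclude theta <= theta0"
    return "no decision"

rng = np.random.default_rng(11)
print(five_decision_test(rng.normal(0.3, 1.0, 50), theta0=0.0))
```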

14.
This article studies the dispatch of consolidated shipments. Orders, following a batch Markovian arrival process, are received in discrete quantities by a depot at discrete time epochs. Instead of being dispatched immediately, all outstanding orders are consolidated and shipped together at a later time. The decision of when to send out the consolidated shipment is made according to a “dispatch policy,” which is a function of the system state and/or the costs associated with that state. First, a tree-structured Markov chain is constructed to record specific information about the consolidation process; the effectiveness of any dispatch policy can then be assessed by a set of long-run performance measures. Next, the effect on shipment consolidation of varying the order-arrival process is demonstrated through numerical examples and proved mathematically under some conditions. Finally, a heuristic algorithm is developed to determine a favorable parameter of a special set of dispatch policies, and the algorithm is proved to yield the overall optimal policy under certain conditions.
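The flavour of judging a dispatch policy by long-run performance measures can be conveyed by a toy discrete-time simulation in which orders arrive in batches and a quantity-or-age threshold policy decides when to ship. The Poisson batch sizes, the cost figures and the particular threshold policy below are placeholders; the article works with batch Markovian arrivals and evaluates policies exactly through a tree-structured Markov chain rather than by simulation.

```python
import numpy as np

def simulate_dispatch(T=100_000, q_max=25, age_max=5,
                      holding_cost=1.0, dispatch_cost=40.0, seed=0):
    """Long-run average cost per period of the policy 'ship when the outstanding
    quantity reaches q_max or the oldest order reaches age age_max'."""
    rng = np.random.default_rng(seed)
    outstanding, age, total_cost, shipments = 0, 0, 0.0, 0
    for _ in range(T):
        outstanding += rng.poisson(3)             # toy stand-in for a BMAP batch
        if outstanding > 0:
            age += 1
        total_cost += holding_cost * outstanding  # per-period holding cost
        if outstanding >= q_max or age >= age_max:
            total_cost += dispatch_cost           # fixed cost of a shipment
            shipments += 1
            outstanding, age = 0, 0
    return total_cost / T, shipments / T

for q in (10, 20, 30):
    avg_cost, ship_rate = simulate_dispatch(q_max=q)
    print(f"q_max={q}: cost/period={avg_cost:.2f}, shipments/period={ship_rate:.3f}")
```

Sweeping the policy parameter, as in the loop above, is the simulation analogue of the heuristic parameter search described in the article.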

15.
A Markov property associates a set of conditional independencies to a graph. Two alternative Markov properties are available for chain graphs (CGs), the Lauritzen–Wermuth–Frydenberg (LWF) and the Andersson–Madigan–Perlman (AMP) Markov properties, which are different in general but coincide for the subclass of CGs with no flags. Markov equivalence induces a partition of the class of CGs into equivalence classes, and every equivalence class contains a, possibly empty, subclass of CGs with no flags, itself containing a, possibly empty, subclass of directed acyclic graphs (DAGs). LWF-Markov equivalence classes of CGs can be naturally characterized by means of the so-called largest CGs, whereas a graphical characterization of equivalence classes of DAGs is provided by the essential graphs. In this paper, we show the existence of largest CGs with no flags that provide a natural characterization of equivalence classes of CGs of this kind, with respect to both the LWF- and the AMP-Markov properties. We propose a procedure for the construction of the largest CGs, the largest CGs with no flags and the essential graphs, thereby providing a unified approach to the problem. As by-products we obtain a characterization of graphs that are largest CGs with no flags and an alternative characterization of graphs that are largest CGs. Furthermore, a known characterization of the essential graphs is shown to be a special case of our more general framework. The three graphical characterizations have a common structure: they use two versions of a locally verifiable graphical rule. Moreover, in the case of DAGs, an immediate comparison of the three characterizing graphs is possible.

16.
We introduce and study a class of rank-based estimators for the linear model. The estimate may be roughly described as being calculated in the same manner as a generalized M-estimate, but with the residual being replaced by a function of its signed rank. The influence function can thus be bounded, both as a function of the residual and as a function of the carriers. Subject to such a bound, the efficiency at a particular model distribution can be optimized by appropriate choices of rank scores and carrier weights. Such choices are given, with respect to a variety of optimality criteria. We compare our estimates with several others, in a Monte Carlo study and on a real data set from the literature.

17.
Flexible Class of Skew-Symmetric Distributions
We propose a flexible class of skew-symmetric distributions for which the probability density function has the form of a product of a symmetric density and a skewing function. By constructing an enumerable dense subset of skewing functions on a compact set, we are able to consider a family of distributions which can capture skewness, heavy tails and multimodality systematically. We present three illustrative examples for the fibreglass data, simulated data from a mixture of two normal distributions, and the Swiss bills data.
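A convenient fact about densities of the form 2 f0(x) π(x), with f0 symmetric about zero and π(x) + π(−x) = 1, is the sign-flip sampling representation: draw Z from f0 and return Z with probability π(Z), otherwise −Z. The sketch below uses a standard normal base density and a logistic function of an odd polynomial as the skewing function; this is an illustrative member of such a flexible family, not the specific construction of the article.

```python
import numpy as np
from scipy.special import expit      # logistic function
from scipy.stats import norm

def sample_skew_symmetric(n, skew=lambda z: expit(2 * z + 0.8 * z**3), seed=0):
    """Draw from the density 2 * phi(x) * pi(x) via the sign-flip trick:
    Z ~ N(0, 1); keep Z with probability pi(Z), otherwise return -Z."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    keep = rng.uniform(size=n) < skew(z)
    return np.where(keep, z, -z)

x = sample_skew_symmetric(10_000)

# Numerical check: the empirical mean should match the mean of 2*phi*pi.
grid = np.linspace(-4, 4, 2001)
dens = 2 * norm.pdf(grid) * expit(2 * grid + 0.8 * grid**3)
print(x.mean(), np.trapz(grid * dens, grid))
```

Heavier-tailed base densities and richer skewing functions yield the heavy tails and multimodality mentioned above.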

18.
Extreme Value Theory (EVT) studies the tails of probability distributions in order to measure and quantify extreme maxima and minima. In river flow data, an extreme level of a river may be related to the level of a neighboring river that flows into it. In this type of data it is very common for flooding at a location to have been caused by a very large flow from a tributary river tens or hundreds of kilometers away from that location. In this sense, an interesting approach is to consider a conditional model for the estimation of a multivariate model. Inspired by this idea, we propose a Bayesian model to describe the dependence of exceedances between rivers, assuming a conditionally independent structure. In this model, the dependence between rivers is captured by modeling the excesses of one river, marginally, as linear functions of the excesses of the other rivers. The results show a strong, positive connection between the excesses in one river and the excesses of the other rivers.

19.
A data base that provides a multivariate statistical history for each of a number of individual entities is called a pooled cross-sectional and time series data base in the econometrics literature. In the marketing and survey literature the terms panel data or longitudinal data are often used. In management science a convenient term might be management data base. Such a data base provides a particularly rich environment for statistical analysis. This article reviews methods for estimating multivariate relationships particular to each individual entity and for summarizing these relationships for a number of individuals. Inference to a larger population when the data base is viewed as a sample is also considered.

20.
A dynamic treatment regime is a sequence of decision rules for assigning treatment based on a patient’s current need for treatment. Dynamic regimes are viewed by many as a natural way of treating patients with chronic diseases, that is, with adaptive, complex, longitudinal treatment regimens. In developing dynamic treatment strategies, treatment-competing events may play an important role in the overall treatment strategy, and their effects on subsequent treatment decisions and the eventual outcome should be considered. Treatment-competing events may be defined generally as patient-specific, random events which interrupt the ongoing treatment decision process in a dynamic regime. They censor later treatment decisions that would otherwise be made under a particular dynamic treatment regime had the competing events not occurred. For example, in therapeutic studies of HIV, physicians may assign treatment based on a patient’s current level of HIV-1 RNA; this defines a treatment assignment rule. However, the presence of opportunistic infections or severe adverse events may preclude strict adherence to the treatment assignment rule. In other contexts, the “censoring”-by-death phenomenon may be viewed as an example of a treatment-competing event for a particular dynamic treatment regime. Treatment-competing events can be built into the dynamic treatment regime framework, and counting processes are a natural mechanism to facilitate this development. In this paper, we develop treatment-competing events in a dynamic infusion policy, a random dynamic treatment regime where multiple infusion treatments are initiated simultaneously and given continuously over time subject to the presence or absence of a treatment-competing event. We illustrate how our methodology may be used to suggest an estimator for a particular causal estimand of recent interest. Finally, we apply our methods to a recent study of patients undergoing coronary stent implantation.
