Similar Literature
20 similar documents found (search time: 562 ms)
1.
Goldsmith, Judy; Sloan, Robert H.; Turán, György. Machine Learning (2002) 47(2-3): 257-295
The theory revision, or concept revision, problem is to correct a given, roughly correct concept. This problem is considered here in the model of learning with equivalence and membership queries. A revision algorithm is considered efficient if the number of queries it makes is polynomial in the revision distance between the initial theory and the target theory, and polylogarithmic in the number of variables and the size of the initial theory. The revision distance is the minimal number of syntactic revision operations, such as the deletion or addition of literals, needed to obtain the target theory from the initial theory. Efficient revision algorithms are given for three classes of disjunctive normal form expressions: monotone k-DNF, monotone m-term DNF and unate two-term DNF. A negative result shows that some monotone DNF formulas are hard to revise.
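To make the revision distance concrete, here is a minimal Python sketch that counts literal deletions and additions between two monotone DNF formulas, under the simplifying assumptions that both formulas have the same number of terms and that terms may be aligned one-to-one (the paper's revision operators are more general); all names and the representation are illustrative.

```python
from itertools import permutations

def revision_distance(initial, target):
    """Minimal number of literal deletions/additions needed to turn
    `initial` into `target`, assuming equally many terms and a best-case
    one-to-one alignment of terms (a simplification of the paper's
    syntactic revision distance)."""
    assert len(initial) == len(target)
    best = float("inf")
    for perm in permutations(target):
        # cost of one alignment: symmetric difference of each term pair
        cost = sum(len(t ^ u) for t, u in zip(initial, perm))
        best = min(best, cost)
    return best

# Two monotone DNF formulas over x1..x4, each term a set of variables:
initial = [frozenset({"x1", "x2"}), frozenset({"x3"})]
target  = [frozenset({"x1"}),       frozenset({"x3", "x4"})]
print(revision_distance(initial, target))  # 2: delete x2, add x4
```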

2.
We introduce a new fault-tolerant model of algorithmic learning using an equivalence oracle and an incomplete membership oracle, in which the answers to a random subset of the learner's membership queries may be missing. We demonstrate that, with high probability, it is still possible to learn monotone DNF formulas in polynomial time, provided that the fraction of missing answers is bounded by some constant less than one. Even when half the membership queries are expected to yield no information, our algorithm will exactly identify m-term, n-variable monotone DNF formulas with an expected O(mn^2) queries. The same task has been shown to require exponential time using equivalence queries alone. We extend the algorithm to handle some one-sided errors, and discuss several other possible error models. It is hoped that this work may lead to a better understanding of the power of membership queries and the effects of faulty teachers on query models of concept learning.
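A minimal sketch of the incomplete membership oracle described above, under the assumption that an answer, once lost, stays lost on repeated queries (memoized below); the function names are illustrative, not from the paper.

```python
import random

def incomplete_membership_oracle(target, p_missing, rng=random.Random(0)):
    """Wrap a membership oracle so that each distinct query independently
    gets no answer (None) with probability p_missing; repeated queries
    return the same response.  A sketch of the incomplete-MQ model."""
    memo = {}
    def oracle(x):
        if x not in memo:
            lost = rng.random() < p_missing
            memo[x] = None if lost else target(x)
        return memo[x]
    return oracle

# Hidden monotone DNF over 3 bits: f = x0 x1 OR x2
f = lambda x: (x[0] and x[1]) or x[2]
ask = incomplete_membership_oracle(f, p_missing=0.5)
print(ask((1, 1, 0)), ask((0, 0, 0)))  # each is the true value or None
```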

3.
We show the following: (a) For any ε > 0, log^(3+ε) n-term DNF cannot be polynomial-query learned with membership and strongly proper equivalence queries. (b) For sufficiently large t, t-term DNF formulas cannot be polynomial-query learned with membership and equivalence queries that use t^(1+ε)-term DNF formulas as hypotheses, for some ε < 1. (c) Read-thrice DNF formulas are not polynomial-query learnable with membership and proper equivalence queries. (d) log n-term DNF formulas can be polynomial-query learned with membership and proper equivalence queries. (This complements a result of Bshouty, Goldman, Hancock, and Matar that O(log n)-term DNF can be so learned in polynomial time.) Versions of (a)-(c) were known previously, but the previous versions applied to polynomial-time learning and used complexity-theoretic assumptions. In contrast, (a)-(c) apply to polynomial-query learning, imply the results for polynomial-time learning, and do not use any complexity-theoretic assumptions.

4.
We consider two issues in polynomial-time exact learning of concepts using membership and equivalence queries: (1) errors or omissions in answers to membership queries, and (2) learning finite variants of concepts drawn from a learnable class.

To study (1), we introduce two new kinds of membership queries: limited membership queries and malicious membership queries. Each is allowed to give incorrect responses on a maliciously chosen set of strings in the domain. Instead of answering correctly about a string, a limited membership query may give a special "I don't know" answer, while a malicious membership query may give the wrong answer. A new parameter L is used to bound the length of an encoding of the set of strings that receive such incorrect answers. Equivalence queries are answered correctly, and learning algorithms are allowed time polynomial in the usual parameters and L. Any class of concepts learnable in polynomial time using equivalence and malicious membership queries is learnable in polynomial time using equivalence and limited membership queries; the converse is an open problem. For the classes of monotone monomials and monotone k-term DNF formulas, we present polynomial-time learning algorithms using limited membership queries alone. We present polynomial-time learning algorithms for the class of monotone DNF formulas using equivalence and limited membership queries, and using equivalence and malicious membership queries.

To study (2), we consider classes of concepts that are polynomially closed under finite exceptions and a natural operation to add exception tables to a class of concepts. Applying this operation, we obtain the class of monotone DNF formulas with finite exceptions. We give a polynomial-time algorithm to learn the class of monotone DNF formulas with finite exceptions using equivalence and membership queries. We also give a general transformation showing that any class of concepts that is polynomially closed under finite exceptions and is learnable in polynomial time using standard membership and equivalence queries is also polynomial-time learnable using malicious membership and equivalence queries. Corollaries include the polynomial-time learnability of the following classes using malicious membership and equivalence queries: deterministic finite acceptors, Boolean decision trees, and monotone DNF formulas with finite exceptions.
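The two faulty oracles are easy to state in code. Below is a hedged sketch contrasting a limited membership query (which may answer "I don't know", here None) with a malicious one (which may lie), both on an adversarially chosen set; names and representation are illustrative.

```python
def limited_mq(target, bad_set):
    """Limited membership oracle: answers "I don't know" (None) on a
    fixed adversarial set of strings, correctly elsewhere (sketch)."""
    return lambda x: None if x in bad_set else target(x)

def malicious_mq(target, bad_set):
    """Malicious membership oracle: answers incorrectly on the
    adversarial set, correctly elsewhere (sketch)."""
    return lambda x: (not target(x)) if x in bad_set else target(x)

f = lambda x: x[0] and x[1]          # a monotone monomial x0 AND x1
bad = {(1, 1)}                       # the adversary corrupts this string
print(limited_mq(f, bad)((1, 1)))    # None  ("I don't know")
print(malicious_mq(f, bad)((1, 1)))  # False (a lie)
```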

5.
A central topic in query learning is to determine which classes of Boolean formulas are efficiently learnable with membership and equivalence queries. We consider the class ∧_k consisting of conjunctions of k unate DNF formulas. This class generalizes the class of k-clause CNF formulas and the class of unate DNF formulas, both of which are known to be learnable in polynomial time with membership and equivalence queries. We prove that ∧_2 can be properly learned with a polynomial number of polynomial-size membership and equivalence queries, but can be properly learned in polynomial time with such queries if and only if P = NP. Thus the barrier to properly learning ∧_2 with membership and equivalence queries is computational rather than informational. Few results of this type are known. In our proofs, we use recent results of Hellerstein et al. (1997, J. Assoc. Comput. Mach. 43(5), 840-862), characterizing the classes that are polynomial-query learnable, together with work of Bshouty on the monotone dimension of Boolean functions. We extend some of our results to ∧_k and pose open questions on learning DNF formulas of small monotone dimension. We also prove structural results for ∧_k. We construct, for any fixed k ≥ 2, a class of functions f that cannot be represented by any formula in ∧_k, but which cannot be "easily" shown to have this property. More precisely, for any function f on n variables in the class, the value of f on any polynomial-size set of points in its domain is not a witness that f cannot be represented by a formula in ∧_k. Our construction is based on BCH codes.
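For concreteness, a small sketch of evaluating a function in the class ∧_k, i.e. the conjunction of k unate DNF formulas; the representation (each term a map from variable index to required value) and all names are illustrative assumptions, not the paper's.

```python
def eval_dnf(terms, x):
    """Evaluate a DNF given as a list of terms; each term maps a
    variable index to the required value (unate: one sign per variable)."""
    return any(all(x[i] == v for i, v in term.items()) for term in terms)

def eval_conjunction_of_dnfs(dnfs, x):
    """A function in the class ∧_k: the AND of k unate DNF formulas."""
    return all(eval_dnf(d, x) for d in dnfs)

# f = (x0 OR x1 x2) AND (NOT x2 OR x0): two unate DNFs, so f is in ∧_2
dnfs = [[{0: 1}, {1: 1, 2: 1}], [{2: 0}, {0: 1}]]
print(eval_conjunction_of_dnfs(dnfs, (1, 0, 1)))  # True
```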

6.
Fast Algorithms for max independent set
We first propose a method, called the "bottom-up method", that, informally, "propagates" improvements of the worst-case complexity for "sparse" instances to "denser" ones, and we show an easy though non-trivial application of it to the min set cover problem. We then tackle max independent set. Here, we propagate improvements of worst-case complexity from graphs of average degree d to graphs of average degree greater than d. Indeed, using algorithms for max independent set in graphs of average degree 3, we successively solve max independent set in graphs of average degree 4, 5 and 6. Then, we combine the bottom-up technique with measure-and-conquer techniques to get improved running times for graphs of maximum degree 5 and 6 but also for general graphs. The bounds obtained for max independent set are O*(1.1571^n), O*(1.1895^n) and O*(1.2050^n) for graphs of maximum (or, more generally, average) degree 4, 5 and 6 respectively, and O*(1.2114^n) for general graphs. These results improve upon the best known results for these cases for polynomial-space algorithms.
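The abstract's algorithms refine a standard branching skeleton: pick a vertex and either exclude it, or include it and delete its closed neighborhood. A bare-bones exponential-time sketch of that skeleton, without the measure-and-conquer refinements and not the paper's algorithm:

```python
def mis_size(adj):
    """Exact max independent set size by branching on a maximum-degree
    vertex.  `adj` maps each vertex to its set of neighbors (sketch)."""
    if not adj:
        return 0
    v = max(adj, key=lambda u: len(adj[u]))
    if not adj[v]:                # max degree 0: the graph is edgeless
        return len(adj)
    def without(vertices):        # delete `vertices` from the graph
        return {u: nbrs - vertices
                for u, nbrs in adj.items() if u not in vertices}
    return max(mis_size(without({v})),               # branch 1: v excluded
               1 + mis_size(without({v} | adj[v])))  # branch 2: v included

# 5-cycle: maximum independent set has size 2
cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(mis_size(cycle5))  # 2
```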

7.
Higher-order neurons with k monomials in n variables are shown to have Vapnik-Chervonenkis (VC) dimension at least nk + 1. This result supersedes the previously known lower bound obtained via k-term monotone disjunctive normal form (DNF) formulas. Moreover, it implies that the VC dimension of higher-order neurons with k monomials is strictly larger than the VC dimension of k-term monotone DNF. The result is achieved by introducing an exponential approach that employs Gaussian radial basis function neural networks for obtaining classifications of points in terms of higher-order neurons.
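As a point of reference, a higher-order neuron with k monomials thresholds a weighted sum of products of inputs. A minimal sketch with an illustrative representation (weights keyed by tuples of variable indices); this is only the definition of the model, not anything from the paper's proof.

```python
def higher_order_neuron(weights, threshold, x):
    """A higher-order neuron with k monomials: output 1 iff the weighted
    sum of the given monomials (products of the inputs) reaches the
    threshold.  `weights` maps index tuples to real weights (sketch)."""
    s = sum(w * all(x[i] for i in mono) for mono, w in weights.items())
    return int(s >= threshold)

# k = 2 monomials over n = 3 Boolean inputs: 1.5*x0*x1 - 1.0*x2 >= 1
w = {(0, 1): 1.5, (2,): -1.0}
print(higher_order_neuron(w, 1.0, (1, 1, 0)))  # 1
print(higher_order_neuron(w, 1.0, (1, 1, 1)))  # 0
```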

8.
We introduce a combinatorial dimension that characterizes the number of queries needed to exactly (or approximately) learn concept classes in various models. Our general dimension provides tight upper and lower bounds on the query complexity for all sorts of queries, not only for example-based queries as in previous works. As an application we show that for learning DNF formulas, unspecified attribute value membership and equivalence queries are not more powerful than standard membership and equivalence queries. Further, in the approximate learning setting, we use the general dimension to characterize the query complexity in the statistical query as well as the learning by distances model. Moreover, we derive close bounds on the number of statistical queries needed to approximately learn DNF formulas.

9.
In his paper "On a Boolean matrix", Nechiporuk gave an explicit example of a set of n homogeneous monotone Boolean functions of the first degree in n variables that require Ω(n^(3/2)) two-input gates in any monotone Boolean network computing them. In this note we show how this can be extended to Ω(n^(5/3)) two-input gates.

10.
We prove a separation between monotone and general arithmetic formulas for polynomials of constant degree. We give an example of a polynomial C in n variables and degree k which is computable by a homogeneous arithmetic formula of size O(k^2 n^2), but every monotone formula computing C requires size (n/k^c)^Ω(log k), with c ∈ (0,1). Since the upper bound is achieved by a homogeneous arithmetic formula, we also obtain a separation between monotone and homogeneous formulas for polynomials of constant degree.

11.
We study heuristic learnability of classes of Boolean formulas, a model proposed by Pitt and Valiant. In this type of example-based learning of a concept class C by a hypothesis class H, the learner seeks a hypothesis h ∈ H that agrees with all of the negative (resp. positive) examples, and a maximum number of positive (resp. negative) examples. This learning is equivalent to the problem of maximizing agreement with a training sample, with the constraint that the misclassifications be limited to examples with positive (resp. negative) labels. Several recent papers have studied the more general problem of maximizing agreements without this one-sided error constraint. We show that for many classes (though not all), the maximum agreement problem with one-sided error is more difficult than the general maximum agreement problem. We then provide lower bounds on the approximability of these one-sided error problems for many concept classes, including Halfspaces, Decision Lists, XOR, k-term DNF, and neural nets.
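The one-sided objective is easy to state in code: a hypothesis is feasible only if it misclassifies no negative example, and feasible hypotheses are ranked by how many positives they get right. A small illustrative scorer (names and representation are assumptions):

```python
def one_sided_agreement(h, sample):
    """Score a hypothesis under the one-sided error constraint: it must
    agree with *all* negative examples; subject to that, count the
    positive examples it classifies correctly.  Returns -1 if the hard
    constraint is violated (sketch)."""
    if any(h(x) for x, label in sample if not label):
        return -1
    return sum(1 for x, label in sample if label and h(x))

sample = [((0, 1), True), ((1, 1), True), ((1, 0), False)]
h = lambda x: x[1] == 1                # a one-variable hypothesis
print(one_sided_agreement(h, sample))  # 2: both positives, no negative hit
```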

12.
We derive an upper and a lower bound on the sample size needed for PAC-learning a concept class in the presence of one-sided classification noise. The upper bound is achieved by the strategy "Minimum One-sided Disagreement". It matches the lower bound (which holds for any learning strategy) up to a logarithmic factor. Although "Minimum One-sided Disagreement" often leads to NP-hard combinatorial problems, we show that it can be implemented quite efficiently for some simple concept classes such as unions of intervals, axis-parallel rectangles, and TREE(2,n,2,k), a broad subclass of 2-level decision trees. For the first class, there is an easy algorithm with time bound O(m log m). For the second (resp. the third), we design an algorithm that applies the well-known UNION-FIND data structure and has an almost quadratic time bound (resp. time bound O(n^2 m log m)).
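The UNION-FIND structure the abstract relies on is standard; a compact sketch with path halving and union by size (the data structure only, not the paper's learning algorithm):

```python
class UnionFind:
    """Disjoint-set forest with path halving and union by size."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:   # attach smaller tree to larger
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]

uf = UnionFind(5)
uf.union(0, 1); uf.union(3, 4)
print(uf.find(0) == uf.find(1), uf.find(2) == uf.find(3))  # True False
```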

13.
Aizenstein, Howard; Pitt, Leonard. Machine Learning (1995) 19(3): 183-208
We present two related results about the learnability of disjunctive normal form (DNF) formulas. First we show that a common approach for learning arbitrary DNF formulas requires exponential time. We then contrast this with a polynomial-time algorithm for learning most (rather than all) DNF formulas. A natural approach for learning Boolean functions involves greedily collecting the prime implicants of the hidden function. In a seminal paper of learning theory, Valiant demonstrated the efficacy of this approach for learning monotone DNF, and suggested this approach for learning DNF. Here we show that no algorithm using such an approach can learn DNF in polynomial time. We show this by constructing a counterexample DNF formula which would force such an algorithm to take exponential time. This counterexample seems to capture much of what makes DNF hard to learn, and thus is useful to consider when evaluating the run-time of a proposed DNF learning algorithm. This hardness result, as well as other hardness results for learning DNF, relies on the construction of particular hard-to-learn formulas, formulas that appear to be relatively rare. This raises the question of whether most DNF formulas are learnable. For certain natural definitions of "most DNF formulas", we answer this question affirmatively.
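The greedy prime-implicant approach is easiest to see in the monotone case: take a positive example and flip 1-bits to 0 while membership is preserved; the surviving 1-bits form a prime implicant. A minimal sketch using a membership oracle (names are illustrative):

```python
def minimize_positive_example(member, x):
    """Shrink a positive example of a *monotone* function to a minimal
    one by greedily flipping 1-bits to 0 while membership still holds;
    the surviving 1-bits form a prime implicant (sketch of the
    prime-implicant-collecting step)."""
    x = list(x)
    for i in range(len(x)):
        if x[i] == 1:
            x[i] = 0
            if not member(tuple(x)):  # flipping bit i broke membership:
                x[i] = 1              # variable i is needed, restore it
    return tuple(x)

# Hidden monotone DNF: f = x0 x1 OR x2
f = lambda x: (x[0] and x[1]) or x[2]
print(minimize_positive_example(f, (1, 1, 1)))  # (0, 0, 1): implicant x2
```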

14.
We apply a DNA-based massively parallel exhaustive search to solving the computational learning problems of DNF (disjunctive normal form) Boolean formulae. Learning DNF formulae from examples is one of the most important open problems in computational learning theory, and the problem of learning 3-term DNF formulae is known to be intractable if RP ≠ NP. We propose new methods to encode any k-term DNF formula to a DNA strand, evaluate the encoded DNF formula for a truth-value assignment by using hybridization and primer extension with DNA polymerase, and find a DNF formula consistent with the given examples. By employing these methods, we show that the class of k-term DNF formulae (for any constant k) and the class of general DNF formulae are efficiently learnable on a DNA computer.

Second, in order for the DNA-based learning algorithm to be robust against errors in the data, we implement the weighted majority algorithm on DNA computers, called the DNA-based majority algorithm via amplification (DNAMA), which takes the strategy of "amplifying" the consistent (correct) DNA strands. We give a theoretical analysis of the mistake bound of the DNA-based majority algorithm via amplification, and show that amplification to "double the volumes" of the correct DNA strands in the test tube works well.
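For comparison with the DNA-based implementation, the classic weighted majority algorithm predicts by weighted vote and multiplies the weight of every erring expert by a factor β. A standard sketch of that algorithm (not the DNAMA protocol itself; names are illustrative):

```python
def weighted_majority(experts, stream, beta=0.5):
    """Classic weighted majority: predict by weighted vote, then
    multiply the weight of every wrong expert by beta.  Returns the
    number of mistakes made on the stream (sketch)."""
    weights = [1.0] * len(experts)
    mistakes = 0
    for x, label in stream:
        votes = [e(x) for e in experts]
        yes = sum(w for w, v in zip(weights, votes) if v)
        no = sum(w for w, v in zip(weights, votes) if not v)
        if (yes >= no) != label:            # weighted-vote prediction
            mistakes += 1
        weights = [w * (beta if bool(v) != label else 1.0)
                   for w, v in zip(weights, votes)]
    return mistakes

experts = [lambda x: x[0], lambda x: x[1], lambda x: True]
stream = [((1, 0), True), ((0, 0), False), ((0, 1), True)]
print(weighted_majority(experts, stream))  # 0 on this toy stream
```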

15.
We consider the problem of on-line prediction competitive with a benchmark class of continuous but highly irregular prediction rules. It is known that if the benchmark class is a reproducing kernel Hilbert space, there exists a prediction algorithm whose average loss over the first N examples does not exceed the average loss of any prediction rule in the class plus a "regret term" of O(N^(-1/2)). The elements of some natural benchmark classes, however, are so irregular that these classes are not Hilbert spaces. In this paper we develop Banach-space methods to construct a prediction algorithm with a regret term of O(N^(-1/p)), where p ∈ [2,∞) and p − 2 reflects the degree to which the benchmark class fails to be a Hilbert space. Only the square loss function is considered.
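Written out, the guarantee sketched above says that for every prediction rule F in the benchmark class (square loss only; constants and the dependence on F's norm are omitted here as an illustration):

```latex
% Regret guarantee from the abstract, constants omitted;
% p = 2 recovers the Hilbert-space (RKHS) case with regret O(N^{-1/2}).
\[
  \frac{1}{N}\sum_{n=1}^{N} \bigl(y_n - \hat{y}_n\bigr)^2
  \;\le\;
  \frac{1}{N}\sum_{n=1}^{N} \bigl(y_n - F(x_n)\bigr)^2
  + O\!\left(N^{-1/p}\right),
  \qquad p \in [2, \infty).
\]
```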

16.
We introduce the NP-hard graph-based data clustering problem s-Plex Cluster Vertex Deletion, where the task is to delete at most k vertices from a graph so that the connected components of the resulting graph are s-plexes. In an s-plex, every vertex has an edge to all but at most s − 1 other vertices; cliques are 1-plexes. We propose a new method based on "approximation and tidying" for kernelizing vertex deletion problems whose goal graphs can be characterized by forbidden induced subgraphs. The method exploits polynomial-time approximation results and thus provides a useful link between approximation and kernelization. Employing "approximation and tidying", we develop data reduction rules that, in O(ksn^2) time, transform an s-Plex Cluster Vertex Deletion instance with n vertices into an equivalent instance with O(k^2 s^3) vertices, yielding a problem kernel. To this end, we also show how to exploit structural properties of the specific problem in order to significantly improve the running time of the proposed kernelization method.
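A quick verifier makes the goal graphs concrete: a graph qualifies iff, within every connected component, each vertex is adjacent to all but at most s − 1 of the other component vertices. A sketch (adjacency as a dict of sets; illustrative, not the paper's kernelization):

```python
from collections import deque

def is_s_plex_cluster_graph(adj, s):
    """Check that every connected component of `adj` is an s-plex, i.e.
    every vertex has degree >= |component| - s inside its component."""
    seen = set()
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:                       # BFS to collect the component
            u = queue.popleft()
            for v in adj[u] - comp:
                comp.add(v)
                queue.append(v)
        seen |= comp
        if any(len(adj[u] & comp) < len(comp) - s for u in comp):
            return False
    return True

triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}  # a clique is a 1-plex
print(is_s_plex_cluster_graph(triangle, 1))   # True
path3 = {0: {1}, 1: {0, 2}, 2: {1}}
print(is_s_plex_cluster_graph(path3, 1))      # False (not a clique)
```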

17.
18.
Local iteration methods for the solution of nonlinear operator equations T(x) = 0 can be "globalized" by the method of embedding. This means that the initial value x_0 need not be in the neighbourhood of a solution x̄. We restrict ourselves to a stepwise calculation (see [17]) where one solves the chain of problems T(x, s_i) = 0, 0 = s_0 < s_1 < ... < s_N = 1, by a local iteration method, with the solution x_{i−1} of T(x, s_{i−1}) = 0 used as the initial vector for T(x, s_i) = 0. In this paper we examine methods of minimal-residue type. Results on minimizing the total effort by selecting an "optimal" step width are obtained. Furthermore, a global convergence acceleration can be achieved by suitable adaptation of (linear) interpolation to the step width. The results obtained can be transferred without difficulty to any other local iteration method that is based on the contraction principle.
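A toy numerical sketch of the embedding idea in the scalar case, with a plain Newton iteration as the local method; the example problem, step widths, and iteration count are all illustrative assumptions (the paper studies minimal-residue-type methods and optimal step-width selection, not this specific setup).

```python
def continuation_solve(T, dT, x0, steps):
    """Globalized solution of T(x, 1) = 0 by embedding: solve the chain
    T(x, s_i) = 0 for 0 = s_0 < ... < s_N = 1 with a local Newton
    iteration, warm-started at the previous step's solution (sketch)."""
    x = x0
    for s in steps:
        for _ in range(20):          # local Newton iteration at level s
            x = x - T(x, s) / dT(x, s)
    return x

# Embedding for x^2 - 4 = 0: T(x, s) = x^2 - (1 + 3s), so s = 0 gives
# the trivial problem x^2 = 1 and s = 1 gives the target problem.
T  = lambda x, s: x**2 - (1 + 3 * s)
dT = lambda x, s: 2 * x
print(continuation_solve(T, dT, x0=1.0,
                         steps=[0.0, 0.25, 0.5, 0.75, 1.0]))  # ~2.0
```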

19.
In this paper we exhibit several new classes of hash functions with certain desirable properties, and introduce two novel applications for hashing which make use of these functions. One class contains a small number of functions, yet is almost universal_2. If the functions hash n-bit long names into m-bit indices, then specifying a member of the class requires only O((m + log_2 log_2 n) · log_2 n) bits, as compared to O(n) bits for earlier techniques. For long names, this is about a factor of m larger than the lower bound of m + log_2 n − log_2 m bits. An application of this class is a provably secure authentication technique for sending messages over insecure lines. A second class of functions satisfies a much stronger property than universal_2. We present the application of testing sets for equality.

The authentication technique allows the receiver to be certain that a message is genuine. An "enemy", even one with infinite computer resources, cannot forge or modify a message without detection. The set-equality technique allows operations including "add member to set", "delete member from set" and "test two sets for equality" to be performed in expected constant time and with less than a specified probability of error.
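The paper's point is a far more succinct family, but the universal_2 property itself is illustrated by the textbook modular family below; this is a standard example of the property, not the paper's construction.

```python
import random

def make_universal_hash(m, p=2_147_483_647, rng=random.Random(0)):
    """Draw a member of the classic universal_2 family
    h(x) = ((a*x + b) mod p) mod m, where p is a prime exceeding every
    key: any two distinct keys collide with probability <= 1/m."""
    a = rng.randrange(1, p)   # a is nonzero mod p
    b = rng.randrange(p)
    return lambda x: ((a * x + b) % p) % m

h = make_universal_hash(m=1024)       # hash names into 10-bit indices
print(h(42), h(43))                   # two keys, likely distinct buckets
```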

20.
Based on the uniform-distribution PAC learning model, we investigate the learnability of the class of monotone disjunctive normal form formulas with at most O(log n) terms, denoted O(log n)-term MDNF. Using the technique of restriction, we give an algorithm that learns O(log n)-term MDNF from examples in polynomial time.
