
1.

Accelerating 2D orthogonal matching pursuit algorithm on GPU

Yuan Dai Dongjian He Yong Fang Long Yang 《The Journal of supercomputing》2014,69(3):1363-1381

The two-dimensional orthogonal matching pursuit (2D-OMP) algorithm is an extension of one-dimensional OMP (1D-OMP) whose complexity and memory usage are lower than those of 1D-OMP when applied to 2D sparse signal recovery. However, the major shortcoming of 2D-OMP remains its long computing time. To overcome this disadvantage, we develop a novel parallel design strategy for the 2D-OMP algorithm on a graphics processing unit (GPU). We first analyze the complexity of 2D-OMP and point out that the bottlenecks lie in matrix inversion and projection. After adopting a matrix inverse update strategy, which outperforms traditional methods, to reduce the complexity of the original matrix inversion, projection becomes the most time-consuming module. Hence, a parallel matrix–matrix multiplication based on a tiling strategy is used to accelerate the projection computation on the GPU. Moreover, a fast matrix–vector multiplication, a parallel reduction algorithm, and some other parallel techniques are also exploited to further boost the performance of 2D-OMP on the GPU. For a sensing matrix of size $128 \times 256$ ($176 \times 256$, resp.) and a $256 \times 256$ image, experimental results show that the parallel 2D-OMP achieves a $17\times$ to $41\times$ ($24\times$ to $62\times$, resp.) speedup over the original C code compiled with the O2 optimization option. Higher speedups would be obtained when recovering larger images.
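The tiling idea mentioned in the abstract can be illustrated with a minimal CPU-side sketch. This is not the authors' GPU kernel: the function name `tiled_matmul` and the tile size are assumptions made for illustration, and the loop nest only mimics how a GPU thread block would reuse one pair of tiles from shared memory.

```python
def tiled_matmul(a, b, tile=2):
    """Multiply a (n x k) by b (k x m), visiting the operands tile by tile.

    Hypothetical helper for illustration; `tile` is an assumed block size.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):          # tile row of the result
        for j0 in range(0, m, tile):      # tile column of the result
            for p0 in range(0, k, tile):  # walk along the shared dimension
                # On a GPU, the two operand tiles touched below would be
                # loaded into shared memory once and reused by every
                # thread of the block; here we simply loop over them.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c
```

The result is identical to a plain triple loop; only the order of memory accesses changes, which is what makes the scheme attractive on hardware with a small fast memory.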

2.

Another step to the full GPU implementation of the weather research and forecasting model

Juan Pablo Silva José Hagopian Marcel Burdiat Ernesto Dufrechou Martín Pedemonte Alejandro Gutiérrez Gabriel Cazes Pablo Ezzatti 《The Journal of supercomputing》2014,70(2):746-755

Uruguay is currently undergoing a gradual process of including wind energy in its electric power generation mix. In this context, a computational tool has been developed to predict the electrical power that will be injected into the grid. The tool is based on the Weather Research and Forecasting (WRF) numerical model, which is the performance bottleneck of the application. For this reason, and in line with several successful efforts by other researchers, this article presents advances in porting the WRF to GPU. In particular, we present the implementation of the sintb and bdy_interp1 routines on GPU and the integration of these routines with previous efforts from other authors. The speedups obtained for the newly ported routines on an NVIDIA GeForce GTX 480 GPU are up to $33.9\times$ when compared with the sequential WRF and $9.2\times$ when compared with the four-threaded WRF. The integration of the newly ported routines with previous work reduces the total runtime by more than 30% for the multi-core four-threaded WRF and by more than 50% for the single-threaded version.

3.

Aircraft noise scattering prediction using different accelerator architectures

M. López-Portugués J. A. López-Fernández N. Díaz-Gracia R. G. Ayestarán José Ranilla 《The Journal of supercomputing》2014,70(2):612-622

In this work, we present a tool that exploits heterogeneous computing to calculate the noise scattered by an object from the pressure distribution over its surface and its normal derivative. The method mainly deals with a large matrix–vector product where the matrix elements must be calculated on the fly so that the problem fits in main memory. To assess the performance of the heterogeneous implementations, the tool is tested using one NVIDIA K20c GPU, one Intel Xeon Phi 5110P, and two Intel Xeon E5-2650 CPUs. The speedup of the accelerated implementations ranges from $3\times$ (Xeon Phi) to $8\times$ (Xeon Phi + K20c) when compared to our parallel CPU code with 32 threads. This work, combined with the authors’ previous works on the computation of the acoustic pressure over the obstacle surface, results in a valuable toolset for noise control applications during aircraft design.

4.

Quantum discord in spin systems with dipole–dipole interaction

E. I. Kuznetsova M. A. Yurischev 《Quantum Information Processing》2013,12(11):3587-3605

The behavior of total quantum correlations (discord) in dimers consisting of dipolar-coupled spins 1/2 is studied. We found that the discord $Q=0$ at absolute zero temperature. As the temperature $T$ increases, the quantum correlations in the system first increase from zero to their maximum and then decrease to zero according to the asymptotic law $T^{-2}$. It is also shown that in the absence of an external magnetic field $B$, the classical correlations $C$ at $T\rightarrow 0$ are, on the contrary, maximal. Our calculations predict that in crystalline gypsum $\mathrm{CaSO_4 \cdot 2H_2O}$ the value of the natural ($B=0$) quantum discord between nuclear spins of hydrogen atoms is maximal at a temperature of 0.644 $\mu$K, and for 1,2-dichloroethane $\mathrm{H_2ClC{-}CH_2Cl}$ the discord reaches its largest value at $T=0.517$ $\mu$K. In both cases, the discord equals $Q\approx 0.083$ bit/dimer, which is $8.3\,\%$ of its upper limit in two-qubit systems. We also estimate that for gypsum at room temperature $Q\sim 10^{-18}$ bit/dimer, and for 1,2-dichloroethane at $T=90$ K the discord is $Q\sim 10^{-17}$ bit/dimer.

5.

Finding peculiar compositions of two frequent strings with background texts

Daisuke Ikeda Einoshin Suzuki 《Knowledge and Information Systems》2014,41(2):499-530

We consider mining unusual patterns from a set $T$ of target texts. A typical method outputs unusual patterns if their observed frequencies are far from the expectation estimated under an assumed probabilistic model. However, such a method has difficulty dealing with zero frequencies and thus suffers from data sparseness. We employ another set $B$ of background texts and define a composition $xy$ to be peculiar if both $x$ and $y$ are more frequent in $B$ than in $T$ and, conversely, $xy$ is more frequent in $T$. Such an $xy$ is unusual because $x$ and $y$ are infrequent in $T$ while $xy$ is unexpectedly frequent compared to $xy$ in $B$. To find frequent subpatterns and infrequent patterns simultaneously, we develop a fast algorithm using the suffix tree and show that it scales almost linearly under practical parameter settings. Experiments using DNA sequences show that the peculiar compositions found appear mainly in rRNA, while patterns found by an existing method do not seem to relate to specific biological functions. We also show that the discovered patterns have similar lengths, at which the distribution of frequencies of fixed-length substrings begins to skew. This fact explains why our method can find long peculiar compositions.
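The definition of a peculiar composition can be checked directly on small inputs. The sketch below is a naive brute-force test, not the suffix-tree algorithm of the paper; the helper names and the choice to compare occurrences per available position (rather than raw counts) are assumptions made for illustration.

```python
def count(pattern, texts):
    """Count occurrences (overlaps allowed) of pattern across all texts."""
    total = 0
    for text in texts:
        start = text.find(pattern)
        while start != -1:
            total += 1
            start = text.find(pattern, start + 1)
    return total


def rel_freq(pattern, texts):
    """Occurrences per available start position (assumed normalization)."""
    positions = sum(max(len(t) - len(pattern) + 1, 0) for t in texts)
    return count(pattern, texts) / positions if positions else 0.0


def is_peculiar(x, y, target, background):
    """True if x and y are more frequent in B than in T, but xy is more frequent in T."""
    return (rel_freq(x, background) > rel_freq(x, target)
            and rel_freq(y, background) > rel_freq(y, target)
            and rel_freq(x + y, target) > rel_freq(x + y, background))


# Toy data: "AC" and "GT" are common in the background but never adjacent there.
target = ["ACGTTTTT"]
background = ["ACACAC", "GTGTGT"]
print(is_peculiar("AC", "GT", target, background))
```

Because the composition "ACGT" never occurs in the background, a model-based method would face a zero frequency there, whereas the comparison above handles it without special treatment.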

6.

The Nonconvergence of h-Refinement in Prolate Elements

John P. Boyd Gregor Gassner Burhan A. Sadiq 《Journal of scientific computing》2013,57(2):372-389

Prolate elements are a “plug-compatible” modification of spectral elements in which Legendre polynomials are replaced by prolate spheroidal wave functions of order zero. Prolate functions contain a “bandwidth parameter” $c \ge 0$ whose value is crucial to numerical performance; the prolate functions reduce to Legendre polynomials for $c = 0$. We show that the optimal bandwidth parameter $c$ depends not only on the number of prolate modes per element $N$, but also on the element width $h$. We prove that prolate elements lack $h$-convergence for fixed $c$ in the sense that the error does not go to zero as the element size $h$ is made smaller and smaller. Furthermore, the theoretical predictions that Chebyshev and Legendre polynomials require $\pi$ degrees of freedom per wavelength to resolve sinusoidal functions while prolate series need only 2 degrees of freedom per wavelength are asymptotic limits as $N \rightarrow \infty$; we investigate the rather different behavior when $N \sim O(4\text{--}10)$, as appropriate for spectral elements and prolate elements. On the other hand, our investigations show that there are certain combinations of $N$, $h$ and $c>0$ where a prolate basis clearly outperforms the Legendre polynomial approximation.

7.

A new method for solving an arbitrary fully fuzzy linear system

S. Moloudzadeh T. Allahviranloo P. Darabi 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2013,17(9):1725-1731

In this paper, we propose a simple and practical method (that works only for triangular fuzzy numbers) to solve an arbitrary fully fuzzy linear system (FFLS) of the form $\widetilde{A}\otimes \widetilde{x}=\widetilde{b}$, where $\widetilde{A}_{n \times n}$ is a fuzzy matrix and $\widetilde{x}$ and $\widetilde{b}$ are $n \times 1$ fuzzy vectors. The presented method is based on extending the 0-cut and 1-cut solutions of the original FFLS. We also define a fuzzy solution of an FFLS and establish necessary and sufficient conditions for the uniqueness of a fuzzy solution.

8.

Fault tolerance in distributed systems using fused state machines

Bharath Balasubramanian Vijay K. Garg 《Distributed Computing》2014,27(4):287-311

Replication is a standard technique for fault tolerance in distributed systems modeled as deterministic finite state machines (DFSMs or machines). To correct $f$ crash or $\lfloor f/2 \rfloor$ Byzantine faults among $n$ different machines, replication requires $nf$ backup machines. We present a solution called fusion that requires just $f$ backup machines. First, we build a framework for fault tolerance in DFSMs based on the notion of Hamming distances. We introduce the concept of an $(f, m)$-fusion, which is a set of $m$ backup machines that can correct $f$ crash faults or $\lfloor f/2 \rfloor$ Byzantine faults among a given set of machines. Second, we present an algorithm to generate an $(f, f)$-fusion for a given set of machines. We ensure that our backups are efficient in terms of the size of their state and event sets. Third, we use locality sensitive hashing for the detection and correction of faults, which incurs almost the same overhead as that for replication. We detect Byzantine faults with time complexity $O(nf)$ on average, while we correct crash and Byzantine faults with time complexity $O(n \rho f)$ with high probability, where $\rho$ is the average state reduction achieved by fusion. Finally, our evaluation of fusion on the widely used MCNC’91 benchmarks for DFSMs shows that the average state space savings in fusion (over replication) is 38% (range 0–99%). To demonstrate the practical use of fusion, we describe its potential application to two areas: sensor networks and the MapReduce framework. In the case of sensor networks, a fusion-based solution can lead to significantly fewer sensor nodes than a replication-based solution. For the MapReduce framework, fusion can reduce the number of map tasks compared to replication. Hence, fusion results in considerable savings in state space and other resources such as the power needed to run the backups.

9.

A Speculative Parallel DFA Membership Test for Multicore, SIMD and Cloud Computing Environments

Yousun Ko Minyoung Jung Yo-Sub Han Bernd Burgstaller 《International journal of parallel programming》2014,42(3):456-489

We present techniques to parallelize membership tests for Deterministic Finite Automata (DFAs). Our method searches arbitrary regular expressions by matching multiple bytes in parallel using speculation. We partition the input string into chunks, match chunks in parallel, and combine the matching results. Our parallel matching algorithm exploits structural DFA properties to minimize the speculative overhead. Unlike previous approaches, our speculation is failure-free, i.e., (1) sequential semantics are maintained, and (2) speed-downs are avoided altogether. On architectures with a SIMD gather operation for indexed memory loads, our matching operation is fully vectorized. The proposed load-balancing scheme uses an off-line profiling step to determine the matching capacity of each participating processor. Based on matching capacities, DFA matches are load-balanced on inhomogeneous parallel architectures such as cloud computing environments. We evaluated our speculative DFA membership test for a representative set of benchmarks from the Perl-compatible Regular Expression (PCRE) library and the PROSITE protein database. Evaluation was conducted on a 4-CPU (40 cores) shared-memory node of the Intel Academic Program Manycore Testing Lab (Intel MTL), on the Intel AVX2 SDE simulator for 8-way fully vectorized SIMD execution, and on a 20-node (288 cores) cluster on the Amazon EC2 computing cloud. Obtained speedups are on the order of $\mathcal{O}\left(1+\frac{|P|-1}{|Q|\cdot \gamma}\right)$, where $|P|$ denotes the number of processors or SIMD units, $|Q|$ denotes the number of DFA states, and $0<\gamma \le 1$ represents a statically computed DFA property. For all observed cases, we found that $0.02<\gamma <0.47$. Actual speedups range from $2.3\times$ to $38.8\times$ for up to 512 DFA states for PCRE, and from $1.3\times$ to $19.9\times$ for up to 1,288 DFA states for PROSITE on a 40-core MTL node. Speedups on the EC2 computing cloud range from $5.0\times$ to $65.8\times$ for PCRE, and from $5.0\times$ to $138.5\times$ for PROSITE. Speedups of our C-based DFA matcher over the Perl-based ScanProsite scan tool range from $559.3\times$ to $15079.7\times$ on a 40-core MTL node. We show the scalability of our approach for input sizes of up to 10 GB.
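The partition, match and combine steps described above can be sketched as follows. This is a simplified illustration under stated assumptions: the chunks are processed one after another (each call to `run_chunk` is independent and could run on its own core), and every chunk after the first is speculatively matched from all DFA states, whereas the paper exploits structural DFA properties to shrink that set. All names are hypothetical.

```python
def run_chunk(delta, chunk, start_states):
    """Map each candidate start state to the state reached after the chunk."""
    result = {}
    for start in start_states:
        state = start
        for symbol in chunk:
            state = delta[state][symbol]
        result[start] = state
    return result


def speculative_match(delta, q0, accepting, text, n_chunks):
    """DFA membership test that matches chunks independently, then combines."""
    size = max(1, -(-len(text) // n_chunks))  # ceiling division
    chunks = [text[i:i + size] for i in range(0, len(text), size)]
    all_states = list(delta)
    # The first chunk starts in the known initial state; the others do not
    # know their start state yet, so they are matched from every state.
    maps = [run_chunk(delta, chunk, [q0] if i == 0 else all_states)
            for i, chunk in enumerate(chunks)]
    # Combine: thread the real state through the per-chunk mappings.
    state = q0
    for mapping in maps:
        state = mapping[state]
    return state in accepting


# DFA over {a, b} accepting strings with an even number of 'a'.
delta = {0: {"a": 1, "b": 0}, 1: {"a": 0, "b": 1}}
print(speculative_match(delta, 0, {0}, "abba", n_chunks=3))
```

Because the combine step only follows mappings that were computed for every possible start state, the result always agrees with a sequential run, which is the sense in which the speculation cannot fail. The price is the extra work of matching from $|Q|$ states per chunk.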

10.

A note on the growth factor in Gaussian elimination for accretive-dissipative matrices

Minghua Lin 《Calcolo》2014,51(3):363-366

This short note proves that if $A$ is accretive-dissipative, then the growth factor for such $A$ in Gaussian elimination is less than $4$. If $A$ is a Higham matrix, i.e., the accretive-dissipative matrix $A$ is complex symmetric, then the growth factor is less than $2\sqrt{2}$. The result obtained improves those of George et al. [Numer. Linear Algebra Appl. 9, 107–114 (2002)] and is one step closer to the final solution of Higham’s conjecture.
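To make the quantity concrete, the growth factor can be computed numerically for a small example. The sketch below assumes the usual definitions, which the abstract does not spell out: the growth factor is the largest modulus of any entry arising during Gaussian elimination without pivoting, divided by the largest modulus in the original matrix, and $A = B + iC$ is accretive-dissipative when $B$ and $C$ are Hermitian positive definite. The function name and the test matrix are illustrative choices.

```python
def growth_factor(a):
    """Growth factor of Gaussian elimination without pivoting (assumed definition)."""
    n = len(a)
    m = [row[:] for row in a]              # work on a copy
    max_initial = max(abs(v) for row in m for v in row)
    max_seen = max_initial
    for k in range(n - 1):
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]     # no pivoting: m[k][k] must be nonzero
            for j in range(k, n):
                m[i][j] -= factor * m[k][j]
        # Track the largest entry over all intermediate matrices.
        max_seen = max(max_seen, max(abs(v) for row in m for v in row))
    return max_seen / max_initial


# A = B + iC with B = [[2, 1], [1, 2]] and C = [[2, -1], [-1, 2]],
# both symmetric positive definite; A is also complex symmetric.
A = [[2 + 2j, 1 - 1j],
     [1 - 1j, 2 + 2j]]
print(growth_factor(A))
```

For this matrix the only new entry is $2.5 + 2.5i$, so the growth factor is $2.5\sqrt{2} / 2\sqrt{2} = 1.25$, comfortably below the bound $2\sqrt{2} \approx 2.83$ stated for Higham matrices.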

11.

Greedy routing in small-world networks with power-law degrees

Pierre Fraigniaud George Giakkoupis 《Distributed Computing》2014,27(4):231-253

In this paper we study decentralized routing in small-world networks that combine a wide variation in node degrees with a notion of spatial embedding. Specifically, we consider a variant of J. Kleinberg’s grid-based small-world model in which (1) the number of long-range edges of each node is not fixed, but is drawn from a power-law probability distribution with exponent parameter $\alpha \ge 0$ and constant mean, and (2) the long-range edges are considered to be bidirectional for the purposes of routing. This model is motivated by empirical observations indicating that several real networks have degrees that follow a power-law distribution. The measured power-law exponent $\alpha$ for these networks is often in the range between 2 and 3. For the small-world model we consider, we show that when $2 < \alpha < 3$ the standard greedy routing algorithm, in which a node forwards the message to its neighbor that is closest to the target in the grid, finishes in an expected number of $O(\log^{\alpha -1} n\cdot \log \log n)$ steps, for any source–target pair. This is asymptotically smaller than the $O(\log^2 n)$ steps needed in Kleinberg’s original model with the same average degree, and approaches $O(\log n)$ as $\alpha$ approaches 2. Further, we show that when $0\le \alpha < 2$ or $\alpha \ge 3$ the expected number of steps is $O(\log^2 n)$, while for $\alpha = 2$ it is $O(\log^{4/3} n)$. We complement these results with lower bounds that match the upper bounds within at most a $\log \log n$ factor.
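The greedy rule itself is simple to state in code. The sketch below assumes a two-dimensional $n \times n$ grid with Manhattan distance and a hand-picked list of bidirectional long-range edges; it does not sample degrees from a power law, and the helper names are hypothetical.

```python
def manhattan(u, v):
    """Grid distance between two nodes given as (x, y) pairs."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def make_neighbors(n, long_links):
    """Return a neighbor function for an n x n grid plus bidirectional long links."""
    extra = {}
    for u, v in long_links:
        extra.setdefault(u, []).append(v)
        extra.setdefault(v, []).append(u)   # long-range edges work both ways

    def neighbors(u):
        x, y = u
        grid = [(x + dx, y + dy)
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= x + dx < n and 0 <= y + dy < n]
        return grid + extra.get(u, [])

    return neighbors


def greedy_route(neighbors, source, target):
    """Forward to the neighbor closest to the target until it is reached."""
    path = [source]
    current = source
    while current != target:
        current = min(neighbors(current), key=lambda v: manhattan(v, target))
        path.append(current)
    return path


nb = make_neighbors(5, [((0, 0), (4, 3))])
print(greedy_route(nb, (0, 0), (4, 4)))
```

On a full grid some grid neighbor is always strictly closer to the target, so the loop terminates; long-range edges can only shorten the route, as the single shortcut in the example does.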

12.

Multi-dimensional color image storage and retrieval for a normal arbitrary quantum superposition state

Hai-Sheng Li Qingxin Zhu Ri-Gui Zhou Lan Song Xing-jiang Yang 《Quantum Information Processing》2014,13(4):991-1011

Multi-dimensional color image processing faces two difficulties. One is that a large number of bits are needed to store multi-dimensional color images; for example, a three-dimensional color image of size $1024 \times 1024 \times 1024$ needs $1024 \times 1024 \times 1024 \times 24$ bits. The other is that the efficiency or accuracy of image segmentation is not high enough for some images to be used in content-based image search. To solve these problems, this paper proposes a new representation for multi-dimensional color images, called an $(n+1)$-qubit normal arbitrary quantum superposition state (NAQSS), where $n$ qubits represent the colors and coordinates of $2^n$ pixels (e.g., a three-dimensional color image of size $1024 \times 1024 \times 1024$ is represented using only 30 qubits), and the remaining qubit represents image segmentation information to improve the accuracy of image segmentation. We then design a general quantum circuit to create the NAQSS state in order to store a multi-dimensional color image in a quantum system, and propose a quantum circuit simplification algorithm to reduce the number of quantum gates in the general circuit. Finally, we study different strategies to retrieve a whole image or a target sub-image from a quantum system, including Monte Carlo sampling and an improved Grover’s algorithm that can find a coordinate of a target sub-image running in only $O(\sqrt{N/r})$, where $N$ and $r$ are the numbers of pixels of the image and the target sub-image, respectively.
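The storage figures quoted in the abstract follow from simple counting, which the short sketch below reproduces. The function names are hypothetical, and the 24 bits per pixel default is taken from the example in the abstract.

```python
def classical_bits(dims, bits_per_pixel=24):
    """Bits needed to store the image classically, one color value per pixel."""
    pixels = 1
    for d in dims:
        pixels *= d
    return pixels * bits_per_pixel


def naqss_qubits(dims):
    """Qubits in the (n + 1)-qubit state: n index 2**n pixels, 1 holds segmentation."""
    pixels = 1
    for d in dims:
        pixels *= d
    n = (pixels - 1).bit_length()   # smallest n with 2**n >= pixels
    return n + 1


dims = (1024, 1024, 1024)
print(classical_bits(dims))   # 1024 * 1024 * 1024 * 24 bits
print(naqss_qubits(dims))     # 30 qubits for the pixels plus 1 for segmentation
```

For the $1024 \times 1024 \times 1024$ example, $2^{30}$ pixels need $n = 30$ qubits, so the full NAQSS state uses 31 qubits against $25{,}769{,}803{,}776$ classical bits.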

13.

Self-Referential Justifications in Epistemic Logic

Roman Kuznets 《Theory of Computing Systems》2010,46(4):636-661

This paper is devoted to the study of self-referential proofs and/or justifications, i.e., valid proofs that prove statements about these same proofs. The goal is to investigate whether such self-referential justifications are present in the reasoning described by standard modal epistemic logics such as $\mathsf{S4}$. We argue that the modal language by itself is too coarse to capture this concept of self-referentiality and that the language of justification logic can serve as an adequate refinement. We consider well-known modal logics of knowledge/belief and show, using explicit justifications, that $\mathsf{S4}$, $\mathsf{D4}$, $\mathsf{K4}$, and $\mathsf{T}$ with their respective justification counterparts $\mathsf{LP}$, $\mathsf{JD4}$, $\mathsf{J4}$, and $\mathsf{JT}$ describe knowledge that is self-referential in some strong sense. We also demonstrate that self-referentiality can be avoided for $\mathsf{K}$ and $\mathsf{D}$. In order to prove the former result, we develop a machinery of minimal evidence functions used to effectively build models for justification logics. We observe that the calculus used to construct the minimal functions axiomatizes the reflected fragments of justification logics. We also discuss difficulties that result from an introduction of negative introspection.

14.

A new set of generators and a physical interpretation for the SU(3) finite subgroup D(9,1,1;2,1,1)

Bela Bauer Claire Levaillant 《Quantum Information Processing》2013,12(7):2509-2521

After 100 years of effort, the classification of all the finite subgroups of $SU(3)$ is still incomplete. The most recently updated list can be found in Ludl (J Phys A Math Theor 44:255204, 2011), where the structure of the series $(C)$ and $(D)$ of $SU(3)$-subgroups is studied. We provide a minimal set of generators for one of these groups, which has order $162$. These generators appear up to phase as the image of an irreducible unitary braid group representation issued from the Jones–Kauffman version of $SU(2)$ Chern–Simons theory at level $4$. In light of these new generators, we study the structure of the group in detail and recover the fact that it is isomorphic to the semidirect product $\mathbb{Z}_9\times \mathbb{Z}_3\rtimes S_3$ with respect to conjugation.

15.

Better Size Estimation for Sparse Matrix Products

Rasmus Resen Amossen Andrea Campagna Rasmus Pagh 《Algorithmica》2014,69(3):741-757

We consider the problem of doing fast and reliable estimation of the number $z$ of non-zero entries in a sparse boolean matrix product. This problem has applications in databases and computer algebra. Let $n$ denote the total number of non-zero entries in the input matrices. We show how to compute a $1 \pm \varepsilon$ approximation of $z$ (with small probability of error) in expected time for any $\varepsilon > 4/\sqrt[4]{z}$. The previously best estimation algorithm, due to Cohen (J. Comput. Syst. Sci. 53(3):441–453, 1997), uses time . We also present a variant using I/Os in expectation in the cache-oblivious model. In contrast to these results, the currently best algorithms for computing a sparse boolean matrix product use time $\omega(n^{4/3})$ (resp. $\omega(n^{4/3}/B)$ I/Os), even if the result matrix is restricted to nonzero entries. Our algorithm combines the size estimation technique of Bar-Yossef et al. (Proceedings of the 6th International Workshop on Randomization and Approximation Techniques (RANDOM ’02), pp. 1–10, 2002) with a particular class of pairwise independent hash functions that allows the sketch of a set of the form to be computed in expected time and I/Os. We then describe how sampling can be used to maintain (independent) sketches of matrices that allow estimation to be performed in time $o(n)$ if $z$ is sufficiently large. This gives a simpler alternative to the sketching technique of Ganguly et al. (Proceedings of the 24th ACM Symposium on Principles of Database Systems (PODS ’05), pp. 259–270, 2005), and matches a space lower bound shown in that paper. Finally, we present experiments on real-world data sets that show the accuracy of both our methods to be significantly better than the worst-case analysis predicts.
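The flavor of sketch-based size estimation can be shown with a k-minimum-values estimator for the number of distinct output pairs. This is only an illustration under strong assumptions: it enumerates every candidate pair of the product, which is exactly the cost the paper avoids, and it uses SHA-256 as a stand-in for the pairwise independent hash functions mentioned in the abstract. All names are hypothetical.

```python
import hashlib


def unit_hash(item):
    """Hash an item to a float in [0, 1); SHA-256 is an assumed stand-in."""
    digest = hashlib.sha256(repr(item).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2 ** 64


def kmv_estimate(items, k=256):
    """Estimate the number of distinct items from the k smallest hash values."""
    smallest = sorted({unit_hash(x) for x in items})[:k]
    if len(smallest) < k:
        return len(smallest)           # fewer than k distinct items: exact count
    return (k - 1) / smallest[-1]      # standard k-minimum-values estimator


def product_pairs(a_rows, b_rows):
    """Yield (i, j) for every witness k with A[i][k] = B[k][j] = 1 (duplicates included)."""
    for i, ks in a_rows.items():
        for k in ks:
            for j in b_rows.get(k, ()):
                yield (i, j)


# Sparse boolean matrices as {row index: set of column indices}.
a_rows = {0: {0, 1}, 1: {1}}
b_rows = {0: {0}, 1: {0, 1}}
print(kmv_estimate(product_pairs(a_rows, b_rows)))
```

The estimator needs only the $k$ smallest hash values, so its accuracy is governed by $k$ rather than by the number $z$ of distinct pairs. A practical implementation would maintain those $k$ values incrementally instead of hashing and sorting everything as this sketch does.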

16.

Water film washers and mixers: their rotational modes and electro-hydrodynamical flows induced by square-wave electric fields

Zhong-Qiang Liu Ying-Jun Li Kong-Yin Gan Su-Rong Jiang Guang-Cai Zhang 《Microfluidics and nanofluidics》2013,14(1-2):319-328

Our study indicates that suspended water films, driven by crossed square-wave electric fields, may act as “washers”, “centrifuges” and “mixers”. Based on our recent model, which successfully describes experimentally observed electro-hydrodynamical flows in water films, we derive conditions for generating specific rotational flows. Our main findings, which are advantageous for microfluidic devices and basic research, are as follows: (1) the film’s rotational patterns depend on the phase difference $\Delta\varphi$ and the frequency $f$ of the applied fields. (2) For $\Delta\varphi=\pi/2$ and $f \sim 10^{-1}$–$10^{0}$ Hz, the film exhibits symmetrical reciprocating rotations, i.e. it constitutes a washer. (3) For $\Delta\varphi=0$ and any $f$, it exhibits an ordinary anticlockwise (or clockwise) rotation, i.e. it constitutes a centrifuge or motor. (4) For other values of $\Delta\varphi$ and $f$ below 1 Hz, it exhibits asymmetrical reciprocating rotations, i.e. it constitutes an asymmetrical mixer.

17.

Optimized Parameter Search for Large Datasets of the Regularization Parameter and Feature Selection for Ridge Regression

Pieter Buteneers Ken Caluwaerts Joni Dambre David Verstraeten Benjamin Schrauwen 《Neural Processing Letters》2013,38(3):403-416

In this paper we propose mathematical optimizations to select the optimal regularization parameter for ridge regression using cross-validation. The resulting algorithm is suited for large datasets, and its computational cost does not depend on the size of the training set. We extend this algorithm to forward or backward feature selection in which the optimal regularization parameter is selected for each possible feature set. These feature selection algorithms yield solutions with a sparse weight matrix using a quadratic cost on the norm of the weights. A naive approach to optimizing the ridge regression parameter has a computational complexity of the order $O(R K N^{2} M)$, with $R$ the number of applied regularization parameters, $K$ the number of folds in the validation set, $N$ the number of input features and $M$ the number of data samples in the training set. Our implementation has a computational complexity of the order $O(KN^3)$. This computational cost is smaller than that of regression without regularization, $O(N^2M)$, for large datasets and is independent of the number of applied regularization parameters and the size of the training set. Combined with a feature selection algorithm, the algorithm has complexity $O(RKNN_s^3)$ and $O(RKN^3N_r)$ for forward and backward feature selection respectively, with $N_s$ the number of selected features and $N_r$ the number of removed features. This is a factor of $M$ faster than the $O(RKNN_s^3M)$ and $O(RKN^3N_rM)$ of the naive implementation, with $N \ll M$ for large datasets. To show the performance and the reduction in computational cost, we apply this technique to train recurrent neural networks using the reservoir computing approach, windowed ridge regression, least-squares support vector machines (LS-SVMs) in primal space using the fixed-size LS-SVM approximation, and extreme learning machines.

18.

ENAMeL: A Language for Binary Correlation Matrix Memories

Nathan Burles Simon O’Keefe James Austin Stephen Hobson 《Neural Processing Letters》2014,40(1):1-23

Despite their relative simplicity, correlation matrix memories (CMMs) are an active area of research, as they can be integrated into more complex architectures such as the Associative Rule Chaining Architecture (ARCA) (Austin et al., International conference on artificial neural networks, pp 49–56, 2012). In this architecture, CMMs are used to reduce the time complexity of a tree search from $O(b^d)$ to $O(d)$, where $b$ is the branching factor and $d$ is the depth of the tree. This paper introduces the Extended Neural Associative Memory Language (ENAMeL), a domain-specific language developed to ease the development of applications using CMMs. We discuss various considerations that arose while developing the language, and techniques used to reduce the memory requirements of CMM-based applications. Finally, we show that the memory requirements of ARCA when using the ENAMeL interpreter compare favourably to our original results (Austin et al., International conference on artificial neural networks, pp 49–56, 2012), which were run in MATLAB.

19.

Collaborative filtering with information-rich and information-sparse entities

Kai Zhu Rui Wu Lei Ying R. Srikant 《Machine Learning》2014,97(1-2):177-203

In this paper, we consider a popular model for collaborative filtering in recommender systems. In particular, we consider both the clustering model, where only users (or items) are clustered, and the co-clustering model, where both users and items are clustered, and further, we assume that some users rate many items (information-rich users) and some users rate only a few items (information-sparse users). When users (or items) are clustered, our algorithm can recover the rating matrix with $\omega(MK \log M)$ noisy entries while $MK$ entries are necessary, where $K$ is the number of clusters and $M$ is the number of items. In the case of co-clustering, we prove that $K^2$ entries are necessary for recovering the rating matrix, and our algorithm achieves this lower bound within a logarithmic factor when $K$ is sufficiently large. Extensive simulations on Netflix and MovieLens data show that our algorithm outperforms the alternating minimization and the popularity-among-friends algorithm. The performance difference increases even more when noise is added to the datasets.

20.

Parameterized Complexity of Induced Graph Matching on Claw-Free Graphs

Danny Hermelin Matthias Mnich Erik Jan van Leeuwen 《Algorithmica》2014,70(3):513-560

The Induced Graph Matching problem asks to find $k$ disjoint induced subgraphs isomorphic to a given graph $H$ in a given graph $G$ such that there are no edges between vertices of different subgraphs. This problem generalizes the classical Independent Set and Induced Matching problems, among several other problems. We show that Induced Graph Matching is fixed-parameter tractable in $k$ on claw-free graphs when $H$ is a fixed connected graph, and even admits a polynomial kernel when $H$ is a complete graph. Both results rely on a new, strong, and generic algorithmic structure theorem for claw-free graphs. Complementing the above positive results, we prove $\mathsf{W}[1]$-hardness of Induced Graph Matching on graphs excluding $K_{1,4}$ as an induced subgraph, for any fixed complete graph $H$. In particular, we show that Independent Set is $\mathsf{W}[1]$-hard on $K_{1,4}$-free graphs. Finally, we consider the complexity of Induced Graph Matching on a large subclass of claw-free graphs, namely on proper circular-arc graphs. We show that the problem is either polynomial-time solvable or $\mathsf{NP}$-complete, depending on the connectivity of $H$ and the structure of $G$.