Traditionally, in supervised machine learning, a significant part of the available data (usually 50%-80%) is used for training and the rest for validation. In many problems, however, the data are highly imbalanced across classes or do not cover the feasible data space well, which in turn creates problems in the validation and usage phases. In this paper, we propose a technique for synthesizing feasible and likely data to help balance the classes and to boost performance, both per class (as measured by the confusion matrix) and overall. The idea, in a nutshell, is to synthesize data samples in close vicinity to the actual data samples, specifically for the less represented (minority) classes. This also has implications for the so-called fairness of machine learning. The proposed method balances the classes and boosts performance, especially for the minority classes; it is generic and can be applied to different base algorithms, for example, support vector machines, k-nearest neighbour classifiers, deep neural networks, rule-based classifiers, decision trees, and so forth. The results demonstrate that (a) significantly more balanced (and fair) classification results can be achieved and (b) the overall performance, as well as the performance per class measured by the confusion matrix, can be boosted. In addition, this approach can be very valuable when the amount of available labelled data is small, which is itself one of the problems of contemporary machine learning.
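The core idea above, synthesizing samples close to real minority-class samples until the classes are balanced, can be sketched as follows. This is a minimal illustration only: the abstract does not specify the synthesis rule, so the Gaussian jitter, the function name `synthesize_near_minority`, and the `noise_scale` parameter are all assumptions made here, not the authors' method.

```python
import random
from collections import Counter


def synthesize_near_minority(samples, labels, noise_scale=0.05, seed=0):
    """Balance classes by adding jittered copies of under-represented samples.

    Assumption: small Gaussian jitter around a real sample stands in for
    "close vicinity"; the paper's actual synthesis rule is not given here.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    new_samples, new_labels = list(samples), list(labels)
    for cls, count in counts.items():
        # Real samples of this class serve as anchors for synthesis.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - count):
            base = rng.choice(pool)
            new_samples.append([x + rng.gauss(0.0, noise_scale) for x in base])
            new_labels.append(cls)
    return new_samples, new_labels


# Toy data: four samples of class 0, one sample of class 1.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [0.3, 0.1], [2.0, 2.1]]
y = [0, 0, 0, 0, 1]
X_bal, y_bal = synthesize_near_minority(X, y)
balanced_counts = Counter(y_bal)
```

The balanced set can then be passed to any base classifier, which matches the abstract's claim that the approach is independent of the underlying algorithm. The original samples are kept unchanged at the front of the returned list.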
Nano Research - Insufficient intratumoral penetration greatly hinders the anticancer performance of nanomedicine. To realize highly efficient tumor penetration in a precisely and spatiotemporally...
Reconstructing gene regulatory networks (GRNs) plays an important role in identifying complicated regulatory relationships, uncovering regulatory patterns in cells, and gaining a systematic view of biological processes. To reconstruct large-scale GRNs accurately, in this paper we first use fuzzy cognitive maps (FCMs), a kind of cognitive fuzzy influence graph based on fuzzy logic and neural networks, to model GRNs. Then, a novel hybrid method, labeled MANNFCM-GRN, is proposed to reconstruct GRNs from time series expression profiles using a memetic algorithm (MA) combined with a neural network (NN). In MANNFCM-GRN, the MA is used to determine the regulatory connections in GRNs and the NN is used to determine the interaction strength of those connections. In the experiments, the performance of MANNFCM-GRN is validated on both synthetic data and the benchmark datasets DREAM3 and DREAM4. The experimental results demonstrate the efficacy of MANNFCM-GRN and show that it can reconstruct GRNs with high accuracy without expert knowledge. The comparison with existing algorithms also shows that MANNFCM-GRN outperforms ant colony optimization, non-linear Hebbian learning, and real-coded genetic algorithms.
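To make the FCM model of a GRN concrete, the sketch below simulates a small map forward in time. It assumes the commonly used FCM update, in which each node's next state is a sigmoid of the weighted sum of all node states; the abstract does not state the exact update rule, and the names `fcm_step`, `simulate`, and `steepness`, as well as the 3-gene weight matrix, are hypothetical. In the abstract's terms, the pattern of nonzero weights would correspond to what the MA searches for and the weight magnitudes to what the NN fits; neither learning step is shown here.

```python
import math


def fcm_step(state, weights, steepness=1.0):
    """One FCM update (assumed standard form).

    weights[j][i] is the influence of gene j on gene i, in [-1, 1].
    Each gene's next expression level is a sigmoid of its weighted inputs.
    """
    n = len(state)
    next_state = []
    for i in range(n):
        total = sum(weights[j][i] * state[j] for j in range(n))
        next_state.append(1.0 / (1.0 + math.exp(-steepness * total)))
    return next_state


def simulate(state, weights, steps):
    """Return the trajectory of states, including the initial one."""
    trajectory = [list(state)]
    for _ in range(steps):
        state = fcm_step(state, weights)
        trajectory.append(state)
    return trajectory


# Hypothetical 3-gene network: gene 0 activates gene 1,
# gene 1 represses gene 2, gene 2 activates gene 0.
weights = [
    [0.0, 0.8, 0.0],
    [0.0, 0.0, -0.6],
    [0.5, 0.0, 0.0],
]
trajectory = simulate([0.2, 0.5, 0.9], weights, steps=5)
```

Reconstruction is the inverse problem: given an observed trajectory such as `trajectory`, find the weight matrix whose simulated time series matches it.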
World Wide Web - The widespread use of positioning and photographing devices gives rise to a deluge of traffic trajectory data (e.g., vehicle passage records and taxi trajectory data), with each...
Semiconductor-particle-doped Al2O3 coatings were prepared by cathode plasma electrolytic deposition in Al(NO3)3 electrolyte dispersed with SiC micro- and nano-particles (average particle sizes of 0.5–1.7 µm and 40 nm, respectively). The effects of the concentration and particle size of the SiC on the microstructure and tribological performance of the composite coatings were studied. Compared with SiC microparticles, SiC nanoparticles were dispersed more uniformly in the coatings. When the concentration of SiC nanoparticles was 5 g/L, the surface roughness of the composite coating was reduced by 63% compared with that of the unmodified coating. Friction results demonstrated that the addition of 5 g/L SiC nanoparticles reduced the friction coefficient from 0.60 to 0.38 and decreased the wear volume under dry friction. The current density and bath voltage were measured to analyze the effects of SiC particles on the deposition process. The results showed that the SiC particles could alter the electrical behavior of the coatings during the deposition process, weaken the bombardment of the plasma, and improve the structures of the coatings.