

Learning criteria for training neural network classifiers
Authors: P. Zhou, J. Austin
Affiliation: Advanced Computer Architecture Group, Department of Computer Science, University of York, Heslington, York YO10 5DD, UK
Abstract: This paper presents a study of two learning criteria and two approaches to using them for training neural network classifiers, specifically Multi-Layer Perceptron (MLP) and Radial Basis Function (RBF) networks. The first approach, which is the traditional one, relies on one of two popular learning criteria, i.e. learning by minimising a Mean Squared Error (MSE) function or a Cross Entropy (CE) function. It is shown that the two criteria have different characteristics in learning speed and outlier effects, and that this approach does not necessarily result in a minimal classification error. In our second approach, which is intended to suit classification tasks better, an empirical classification criterion is introduced for the testing process, while the MSE or CE function is still used for training. Experimental results on several benchmarks indicate that the second approach, compared with the first, leads to improved generalisation performance, and that the CE function, compared with the MSE function, gives faster training and improved or equal generalisation performance.

Nomenclature:
- $x$: random input vector with $d$ real-number components $[x_1 \ldots x_d]$
- $t$: random target vector with $c$ binary components $[t_1 \ldots t_c]$
- $y(\cdot)$: neural network function or output vector
- $\theta$: parameters of a neural model
- $\eta$: learning rate
- $\alpha$: momentum
- $\gamma$: decay factor
- $O$: objective function
- $E$: mean sum-of-squares error function
- $L$: cross entropy function
- $n$: $n$th training pattern
- $N$: number of training patterns
- $\varphi(\cdot)$: transfer function in a neural unit
- $z_j$: output of hidden unit-$j$
- $a_i$: activation of unit-$i$
- $W_{ij}$: weight from hidden unit-$j$ to output unit-$i$
- $W_{jl}^{0}$: weight from input unit-$l$ to hidden unit-$j$
- $\mu_j$: centre vector $[\mu_{j1} \ldots \mu_{jd}]$ of RBF unit-$j$
- $\sigma_j$: width vector $[\sigma_{j1} \ldots \sigma_{jd}]$ of RBF unit-$j$
- $p(\cdot \mid \cdot)$: conditional probability function
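The abstract's claim that minimising MSE or CE does not necessarily minimise classification error can be illustrated with a small sketch. This is not the authors' code. The normalisation of the two criteria, the winner-takes-all decision rule standing in for the empirical classification criterion, the function names, and the toy outputs are all assumptions made for illustration, since the abstract does not give the exact definitions.

```python
import math

def mse(outputs, targets):
    """Mean sum-of-squares error: (1/N) * sum_n sum_k (y_nk - t_nk)^2 (assumed form)."""
    total = sum((y - t) ** 2
                for ys, ts in zip(outputs, targets)
                for y, t in zip(ys, ts))
    return total / len(outputs)

def cross_entropy(outputs, targets, eps=1e-12):
    """Cross entropy for 1-of-c targets: -(1/N) * sum_n sum_k t_nk * ln(y_nk) (assumed form)."""
    total = -sum(t * math.log(max(y, eps))
                 for ys, ts in zip(outputs, targets)
                 for y, t in zip(ys, ts))
    return total / len(outputs)

def classification_error(outputs, targets):
    """Fraction of patterns whose largest output is not the target class
    (winner-takes-all rule, assumed here as the empirical criterion)."""
    def argmax(v):
        return max(range(len(v)), key=lambda k: v[k])
    wrong = sum(argmax(ys) != argmax(ts) for ys, ts in zip(outputs, targets))
    return wrong / len(outputs)

# Hypothetical toy data: four patterns, two classes, 1-of-c binary targets.
targets = [[1, 0], [1, 0], [0, 1], [0, 1]]

# Model A: every pattern is classified correctly, but only marginally.
outputs_a = [[0.55, 0.45], [0.55, 0.45], [0.45, 0.55], [0.45, 0.55]]

# Model B: three confident correct outputs and one misclassified pattern.
outputs_b = [[0.99, 0.01], [0.99, 0.01], [0.01, 0.99], [0.55, 0.45]]

for name, outputs in (("A", outputs_a), ("B", outputs_b)):
    print(name,
          "MSE:", round(mse(outputs, targets), 4),
          "CE:", round(cross_entropy(outputs, targets), 4),
          "error:", classification_error(outputs, targets))
```

On this toy data model B scores lower than model A on both MSE and CE yet misclassifies one pattern in four, while model A misclassifies none. Selecting a model by the training criterion alone would therefore pick the worse classifier here.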
Keywords: Benchmarks; Learning criteria; Multilayer perceptron networks; Pattern classification; Radial basis function networks; Training methods
This article is indexed in SpringerLink and other databases.