
Learning criteria for training neural network classifiers

Authors:	P. Zhou, J. Austin

Affiliation:	(1) Advanced Computer Architecture Group, Department of Computer Science, University of York, YO10 5DD Heslington, York, UK

Abstract:	This paper presents a study of two learning criteria and two approaches to using them for training neural network classifiers, specifically Multi-Layer Perceptron (MLP) and Radial Basis Function (RBF) networks. The first approach, which is the traditional one, relies on two popular learning criteria, i.e. learning by minimising a Mean Squared Error (MSE) function or a Cross Entropy (CE) function. It is shown that the two criteria have different characteristics in learning speed and outlier effects, and that this approach does not necessarily result in a minimal classification error. To better suit classification tasks, our second approach introduces an empirical classification criterion for the testing process while still using the MSE or CE function for training. Experimental results on several benchmarks indicate that the second approach, compared with the first, leads to improved generalisation performance, and that the use of the CE function, compared with the MSE function, gives a faster training speed and improved or equal generalisation performance.

Nomenclature:
x: random input vector with d real number components [x₁ ... x_d]
t: random target vector with c binary components [t₁ ... t_c]
y(·): neural network function or output vector
[…]: parameters of a neural model
[…]: learning rate
[…]: momentum
[…]: decay factor
O: objective function
E: mean sum-of-squares error function
L: cross entropy function
n: nth training pattern
N: number of training patterns
[…](·): transfer function in a neural unit
z_j: output of hidden unit-j
a_i: activation of unit-j
W_ij: weight from hidden unit-j to output unit-i
W_jl⁰: weight from input unit-l to hidden unit-j
[…]_j: centre vector [[…]_j1 ... […]_jd] of RBF unit-j
[…]_j: width vector [[…]_j1 ... […]_jd] of RBF unit-j
p(·|·): conditional probability function
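
The abstract's remark that minimising MSE or CE does not necessarily minimise classification error can be illustrated with a small sketch. The exact definitions used in the paper are not given on this page, so the code below assumes the standard textbook forms of the two criteria (sum-of-squares averaged over patterns, and cross entropy for 1-of-c targets). The function names and the toy outputs of two hypothetical classifiers, A and B, are invented for illustration only.

```python
import math

def mean_squared_error(outputs, targets):
    """Sum of squared output errors, averaged over patterns (assumed standard form)."""
    total = sum((y - t) ** 2
                for ys, ts in zip(outputs, targets)
                for y, t in zip(ys, ts))
    return total / len(outputs)

def cross_entropy(outputs, targets, eps=1e-12):
    """-sum_i t_i * log(y_i), averaged over patterns (assumed standard form)."""
    total = sum(t * math.log(max(y, eps))
                for ys, ts in zip(outputs, targets)
                for y, t in zip(ys, ts))
    return -total / len(outputs)

def classification_error(outputs, targets):
    """Fraction of patterns whose largest output is not the target class."""
    wrong = sum(1 for ys, ts in zip(outputs, targets)
                if ys.index(max(ys)) != ts.index(max(ts)))
    return wrong / len(outputs)

# Toy data (hypothetical): three patterns, two classes, 1-of-c targets.
targets = [[1, 0], [1, 0], [0, 1]]

# Classifier A: very confident on two patterns, wrong on the third.
outputs_a = [[0.99, 0.01], [0.99, 0.01], [0.55, 0.45]]

# Classifier B: only mildly confident, but correct on every pattern.
outputs_b = [[0.6, 0.4], [0.6, 0.4], [0.4, 0.6]]

for name, outputs in (("A", outputs_a), ("B", outputs_b)):
    print(name,
          "MSE:", round(mean_squared_error(outputs, targets), 4),
          "CE:", round(cross_entropy(outputs, targets), 4),
          "error rate:", round(classification_error(outputs, targets), 4))
```

In this toy case classifier A scores better than B on both MSE and CE yet misclassifies one pattern in three, while B misclassifies none. This is only an illustration of why a separate classification criterion can be useful at test time, not a reproduction of the criterion proposed in the paper.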

Keywords:	Benchmarks Learning criteria Multilayer perceptron networks Pattern classification Radial basis function networks Training methods
This article is indexed in SpringerLink and other databases.
