Theoretical results suggest that in order to learn the kind of complicated functions that can represent high-level abstractions (e.g. in vision, language, and other AI-level tasks), one may need deep architectures. Deep architectures are composed of multiple levels of non-linear operations, such as in neural nets with many hidden layers or in complicated propositional formulae re-using many sub-formulae. Searching the parameter space of deep architectures is a difficult task, but learning algorithms such as those for Deep Belief Networks have recently been proposed to tackle this problem with notable success, beating the state-of-the-art in certain areas. This paper discusses the motivations and principles regarding learning algorithms for deep architectures, in particular those exploiting as building blocks unsupervised learning of single-layer models such as Restricted Boltzmann Machines, used to construct deeper models such as Deep Belief Networks.
Sequence recognition is a crucial element in many applications in the fields of speech analysis, control, and modeling. This book applies the techniques of neural networks and hidden Markov models to the problems of sequence recognition, and as such will prove valuable to researchers and graduate students alike.
"This thesis studies the introduction of a priori structure into the design of learning systems based on artificial neural networks applied to sequence recognition, in particular to phoneme recognition in continuous speech. Because we are interested in sequence analysis, algorithms for training recurrent networks are studied and an original algorithm for constrained recurrent networks is proposed and test results are reported. We also discuss the integration of connectionist models with other analysis tools that have been shown to be useful for sequences, such as dynamic programming and hidden Markov models. We introduce an original algorithm to perform global optimization of a neural network/hidden Markov model hybrid, and show how to perform such a global optimization on all the parameters of the system. Finally, we consider some alternatives to sigmoid networks: Radial Basis Functions, and a method for searching for better learning rules using a priori knowledge and optimization algorithms." --