
Introduction

RNNLIB is a recurrent neural network library for sequence labelling problems, such as speech and handwriting recognition. It implements the Long Short-Term Memory (LSTM) architecture [1], as well as more traditional neural network structures, such as multilayer perceptrons and standard recurrent networks with nonlinear hidden units. Its most important features are:

- bidirectional Long Short-Term Memory hidden layers [2]
- Connectionist Temporal Classification output layers [3]
- multidimensional recurrent networks [4]

All of these are explained in more detail in my Ph.D. thesis [5]. The library also implements the multilayer, subsampling structure developed for offline Arabic handwriting recognition [6]. This structure allows the network to efficiently label high-resolution data such as raw images and speech waveforms.

Taken together, the above components make RNNLIB a generic system for labelling and classifying data with one or more spatiotemporal dimensions. Perhaps its greatest strength is its flexibility: as well as speech and handwriting recognition [7], it has so far been applied (with varying degrees of success) to image classification, object recognition, facial expression recognition, EEG and fMRI classification, motion capture labelling, robot localisation, wind turbine energy prediction, signature verification, image compression and touch sensor classification. RNNLIB is also able to accept a wide variety of different input representations for the same task, e.g. raw sensor data or hand-crafted features (as shown for online handwriting [8]). See my homepage for more publications.

Installation

RNNLIB is written in C++ and should compile on any platform. However, it is currently tested only on Linux and OS X.

Building it requires a C++ compiler, the Boost C++ libraries and the netCDF library.

In addition, several Python packages are needed for the auxiliary scripts in the ‘utilities’ directory.

Further packages are needed to create and manipulate netCDF data files with Python, and to run the experiments in the ‘examples’ directory.

To build RNNLIB, first download the source, then enter the root directory and type

./configure
make

This should create the binary file ‘rnnlib’ in the ‘src’ directory. Note that on most Linux systems the default installation directory for the Boost headers is ‘/usr/local/include/boost-VERSION_NUMBER’, which is not on the standard include path. In this case type

CXXFLAGS=-I/usr/local/include/boost-VERSION_NUMBER/ ./configure
make

If you wish to install the binary type:

make install

By default this will use ‘/usr’ as the installation root (for which you will usually need administrator privileges). You can change the install path with the --prefix option of the configure script (use ./configure --help to see the other options).

It is recommended that you add the directory containing the ‘rnnlib’ binary to your path, as otherwise the tools in the ‘utilities’ directory will not work.
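For example, on Linux or OS X you could add a line like the following to your shell startup file (the path is illustrative; use the directory where your ‘rnnlib’ binary actually lives):

export PATH=$PATH:/path/to/rnnlib/src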

Project files for several integrated development environments are provided in the ‘ide’ directory.

Usage

RNNLIB can be run from the command line as follows:

Usage: rnnlib [config_options] config_file
config_options syntax: --<variable_name>=<variable_value>
whitespace not allowed in variable names or values
all config_file variables overwritten by config_options
setting <variable_value> = "" removes the variable from the config
repeated variables overwritten by last specified

All the parameters determining the network structure, experimental setup etc. can be specified either in the config file or on the command line.

The main parameters are as follows:

RNNLIB parameters

Parameter | Type | Allowed values | Default | Comment
autosave | boolean | true, false | false | see below
batchLearn | boolean | true, false | true if rprop is used, false otherwise | false => gradient descent updates at the end of each sequence; true => updates at the end of each epoch only
dataFraction | real | 0-1 | 1 | Fraction of the data to load
hiddenBlock | list of integer lists | all >= 1 | | Hidden layer block dimensions
hiddenSize | integer list | all >= 1 | | Sizes of the hidden layers
hiddenType | string | tanh, linear, logistic, lstm, linear_lstm, softsign | lstm | Type of units in the hidden layers
inputBlock | integer list | all >= 1 | | Input layer block dimensions
maxTestsNoBest | integer | >= 0 | 20 | Number of error tests without improvement on the validation set before early stopping
optimiser | string | steepest, rprop | steepest | Optimisation algorithm
learnRate | real | 0-1 | 1e-4 | Learning rate (steepest descent optimiser only)
momentum | real | 0-1 | 0.9 | Momentum (steepest descent optimiser only)
subsampleSize | integer list | all >= 1 | | Sizes of the hidden subsample layers
task | string | classification, sequence_classification, transcription | | Network task; sequence_* => one target for the whole sequence (not one for each point in the sequence); transcription => unsegmented sequence labelling with CTC
trainFile | string list | | | netCDF files used for training. All datasets can consist of multiple files; during each training epoch the files are cycled through in random order, with the sequences cycled randomly within each file
valFile | string list | | | netCDF files used for validation / early stopping
testFile | string list | | | netCDF files used for testing
verbose | boolean | true, false | false | Verbose console output

Parameter names and values are separated by whitespace, and must themselves contain no whitespace. Lists are comma separated, e.g.:

trainFile a.nc,b.nc,c.nc

and lists of lists are semicolon separated, e.g.:

hiddenBlock 3,3;4,4

See the ‘examples’ directory for examples of config files.
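As an illustration, a minimal config file for a classification task might look like the following (the filenames and layer size are placeholders, not taken from any distributed example):

task classification
hiddenType lstm
hiddenSize 100
learnRate 1e-4
momentum 0.9
trainFile train.nc
valFile val.nc
testFile test.nc

Each line pairs a parameter name from the table above with a value, separated by whitespace.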

To override parameters at the command line, the syntax is:

rnnlib --OPTION_NAME=VALUE CONFIG_FILE

so e.g.

rnnlib --learnRate=1e-5 CONFIG_FILE

will override the learnRate set in the config file.

Autosave

If the 'autosave' option is true, the system will store all dynamic information (e.g. network weights) as it runs. Without this there is no way to resume an interrupted experiment (e.g. after a computer crash), and the final trained system will not be saved. If autosave is activated, timestamped config files with the dynamic information appended are saved after each training epoch, and whenever one of the error measures for the given task improves. In addition, a timestamped log file containing all the console output is saved. For example, for a classification task, the command

rnnlib --autosave=true classification.config

will create a set of timestamped save and log files as described above.

Data File Format

All RNNLIB data files (for training, testing and validation) are in netCDF format, a binary file format designed for large scientific datasets.

A netCDF file has the following basic structure:

Dimensions:
  …
Variables:
  …
Data:
  …

Following the statement ‘Variables’, the variables whose values will be listed in the ‘Data’ section are declared. For example

float foo[ 3 ]

would declare an array of three floats. For variable-sized arrays, the size can be declared as a named dimension under ‘Dimensions’. The example would then look like:

Dimensions:
fooSize = 3;
Variables:
float foo[ fooSize ];

Following ‘Data’, the actual values are stored:

Data:
foo = 1,2,3;

The data format for RNNLIB is specified below. The codes at the start determine which tasks the dimension/variable is required for:

Dimensions:

Variables:

The netCDF Operators package provides several tools for creating, manipulating and displaying netCDF files, and is recommended for anyone wanting to make their own datasets. In particular, the tools ncgen and ncdump convert ASCII text files to and from netCDF format.
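For example, assuming an existing data file ‘train.nc’ (the filename is illustrative):

ncdump train.nc > train.cdl
ncgen -o train.nc train.cdl

The first command dumps the binary file to a readable text (CDL) file, which can be inspected or edited by hand; the second rebuilds a binary netCDF file from the text.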

Examples

The ‘examples’ directory provides example experiments that can be run with RNNLIB. To run the experiments, the ‘utilities’ directory must be added to your Python path (as shown below), and the Python packages mentioned under Installation must be installed.
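For example (the path is illustrative):

export PYTHONPATH=$PYTHONPATH:/path/to/rnnlib/utilities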

In each subdirectory type

./build_netcdf

to build the netcdf datasets, then

rnnlib SAMPLE_NAME.config

to run the experiments. Note that some directories may contain more than one config file, since different tasks may be defined for the same data.

The results of these experiments will not correspond to published results, because only a fraction of the complete dataset is used in each case (to keep the size of the distribution down). In addition, early stopping is not used, because no validation files are created. However, the same scripts can be used to build realistic experiments, given more data.

If you want to adapt the Python scripts to create netCDF files for your own experiments, there are useful tutorials on using netCDF with Python.
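As a rough sketch of what such a script might look like, the following uses scipy's netcdf_file to write a small classification dataset. The dimension and variable names (numSeqs, numTimesteps, inputPattSize, inputs, seqLengths, targetClasses) are assumptions for illustration only; check the build_netcdf scripts in the ‘examples’ directory for the names your task actually requires.

import numpy as np
from scipy.io import netcdf_file

# Two toy sequences of 2-dimensional input patterns, one class label each.
# All values are made up for illustration.
rng = np.random.default_rng(0)
seqs = [rng.standard_normal((5, 2)), rng.standard_normal((7, 2))]
labels = [0, 1]

f = netcdf_file('train.nc', 'w')  # scipy's netcdf_file writes classic-format netCDF

# NOTE: these dimension/variable names are assumptions, not a verified spec.
f.createDimension('numSeqs', len(seqs))
f.createDimension('numTimesteps', sum(len(s) for s in seqs))
f.createDimension('inputPattSize', seqs[0].shape[1])

inputs = f.createVariable('inputs', 'f', ('numTimesteps', 'inputPattSize'))
inputs[:] = np.concatenate(seqs).astype('float32')  # sequences concatenated along time

seqLengths = f.createVariable('seqLengths', 'i', ('numSeqs',))
seqLengths[:] = np.array([len(s) for s in seqs], dtype='int32')

targetClasses = f.createVariable('targetClasses', 'i', ('numSeqs',))
targetClasses[:] = np.array(labels, dtype='int32')

f.close()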

Utilities

The ‘utilities’ directory provides a range of auxiliary tools for RNNLIB. In order for these to work, the directory containing the ‘rnnlib’ binary must be added to your path; the ‘utilities’ directory itself must be added to your Python path for the experiments in the ‘examples’ directory to work.

All tools print a list of their arguments if called with no arguments. The Python scripts also list optional arguments, defaults, etc. if called with the single argument ‘-h’. Several Python libraries are required for some of the scripts (see Installation).

Experimental Results

RNNLIB’s best results so far have been in speech and handwriting recognition. It has matched the best recorded performance in phoneme recognition on the TIMIT database [9], and recently won three handwriting recognition competitions at the ICDAR 2009 conference, for offline French [10], offline Arabic [11] and offline Farsi character classification [12]. Unlike the competing systems, RNNLIB worked entirely on raw inputs, and therefore did not require any preprocessing or alphabet-specific feature extraction. It also has among the best published results on the IAM Online and IAM Offline English handwriting databases [7].

Citations

If you use RNNLIB for your research, please cite it with the following reference:

@misc{rnnlib,
  author = {Alex Graves},
  title = {{RNNLIB}: A recurrent neural network library for sequence learning problems},
  howpublished = {\url{http://sourceforge.net/projects/rnnl/}}
}

Support

PLEASE NOTE: all support questions should be posted in the help forum. I will not respond to support emails.

References

[1] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735-1780, 1997.

[2] Alex Graves and Jürgen Schmidhuber. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks, 18(5-6):602-610, June 2005.

[3] Alex Graves, Santiago Fernández, Faustino Gomez and Jürgen Schmidhuber. Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks. International Conference on Machine Learning, June 2006, Pittsburgh.

[4] Alex Graves, Santiago Fernández and Jürgen Schmidhuber. Multidimensional recurrent neural networks. International Conference on Artificial Neural Networks, September 2007, Porto.

[5] Alex Graves. Supervised Sequence Labelling with Recurrent Neural Networks. PhD thesis, Technische Universität München, July 2008.

[6] Alex Graves and Jürgen Schmidhuber. Offline handwriting recognition with multidimensional recurrent neural networks. Advances in Neural Information Processing Systems, December 2008, Vancouver.

[7] Alex Graves, Marcus Liwicki, Santiago Fernández, Roman Bertolami, Horst Bunke and Jürgen Schmidhuber. A novel connectionist system for unconstrained handwriting recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(5):855-868, May 2009.

[8] Alex Graves, Santiago Fernández, Marcus Liwicki, Horst Bunke and Jürgen Schmidhuber. Unconstrained online handwriting recognition with recurrent neural networks. Advances in Neural Information Processing Systems, December 2007, Vancouver.

[9] Santiago Fernández, Alex Graves and Jürgen Schmidhuber. Phoneme recognition in TIMIT with BLSTM-CTC. Technical Report IDSIA-04-08, IDSIA, April 2008.

[10] E. Grosicki and H. El Abed. ICDAR 2009 handwriting recognition competition. International Conference on Document Analysis and Recognition, July 2009, Barcelona.

[11] V. Märgner and H. El Abed. ICDAR 2009 Arabic handwriting recognition competition. International Conference on Document Analysis and Recognition, July 2009, Barcelona.

[12] S. Mozaffari and H. Soltanizadeh. ICDAR 2009 handwritten Farsi/Arabic character recognition competition. International Conference on Document Analysis and Recognition, July 2009, Barcelona.
