keras

SimpleRNN

keras.layers.recurrent.SimpleRNN(input_dim, output_dim, 
        init='glorot_uniform', inner_init='orthogonal', activation='sigmoid', weights=None,
        truncate_gradient=-1, return_sequences=False)

Fully connected RNN where the output is fed back to the input. Not a particularly useful model; included for demonstration purposes.

  • Input shape: 3D tensor with shape: (nb_samples, timesteps, input_dim).

  • Output shape:

    • if return_sequences: 3D tensor with shape: (nb_samples, timesteps, output_dim).
    • else: 2D tensor with shape: (nb_samples, output_dim).
  • Arguments:

    • input_dim: dimension of the input.
    • output_dim: dimension of the internal projections and the final output.
    • init: weight initialization function. Can be the name of an existing function (str), or a Theano function (see: initializations).
    • inner_init: initialization function of the inner (recurrent) weights.
    • activation: activation function. Can be the name of an existing function (str), or a Theano function (see: activations).
    • weights: list of numpy arrays to set as initial weights. The list should have 3 elements, of shapes: [(input_dim, output_dim), (output_dim, output_dim), (output_dim,)].
    • truncate_gradient: Number of steps to use in truncated BPTT. See: Theano "scan".
    • return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.
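
The recurrence this layer computes can be sketched in plain numpy (illustrative weight names; the three arrays match the shapes listed under `weights`, and plain sigmoid is used as in the default activation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def simple_rnn(X, W, U, b, return_sequences=False):
    # X: (timesteps, input_dim); W: (input_dim, output_dim)
    # U: (output_dim, output_dim); b: (output_dim,)
    h = np.zeros(b.shape)
    outputs = []
    for x_t in X:
        # h_t = activation(W.x_t + U.h_tm1 + b)
        h = sigmoid(x_t @ W + h @ U + b)
        outputs.append(h)
    return np.stack(outputs) if return_sequences else outputs[-1]

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))   # 5 timesteps, input_dim=3
W = rng.normal(size=(3, 4))   # output_dim=4
U = rng.normal(size=(4, 4))
b = np.zeros(4)
print(simple_rnn(X, W, U, b).shape)                          # (4,)
print(simple_rnn(X, W, U, b, return_sequences=True).shape)   # (5, 4)
```

Note that the last row of the `return_sequences=True` output equals the `return_sequences=False` output, which is exactly the distinction the argument controls.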

SimpleDeepRNN

keras.layers.recurrent.SimpleDeepRNN(input_dim, output_dim, depth=3,
        init='glorot_uniform', inner_init='orthogonal', 
        activation='sigmoid', inner_activation='hard_sigmoid',
        weights=None, truncate_gradient=-1, return_sequences=False)

Fully connected RNN where the output of multiple timesteps (up to "depth" steps in the past) is fed back to the input:

output = activation( W.x_t + b + inner_activation(U_1.h_tm1) + inner_activation(U_2.h_tm2) + ... )

Not a particularly useful model, included for demonstration purposes.

  • Input shape: 3D tensor with shape: (nb_samples, timesteps, input_dim).

  • Output shape:

    • if return_sequences: 3D tensor with shape: (nb_samples, timesteps, output_dim).
    • else: 2D tensor with shape: (nb_samples, output_dim).
  • Arguments:

    • input_dim: dimension of the input.
    • output_dim: dimension of the internal projections and the final output.
    • depth: int >= 1. Lookback depth (e.g. depth=1 is equivalent to SimpleRNN).
    • init: weight initialization function for the output cell. Can be the name of an existing function (str), or a Theano function (see: initializations).
    • inner_init: weight initialization function for the inner cells.
    • activation: activation function for the output. Can be the name of an existing function (str), or a Theano function (see: activations).
    • inner_activation: activation function for the inner cells.
    • weights: list of numpy arrays to set as initial weights. The list should have depth+2 elements.
    • truncate_gradient: Number of steps to use in truncated BPTT. See: Theano "scan".
    • return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.
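
The depth-k recurrence above can be sketched in numpy (illustrative names; plain sigmoid stands in for both activation and inner_activation to keep the sketch short):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def simple_deep_rnn(X, W, Us, b, return_sequences=False):
    # X: (timesteps, input_dim); W: (input_dim, output_dim)
    # Us: one (output_dim, output_dim) matrix per lookback step
    # b: (output_dim,)
    depth = len(Us)
    history = [np.zeros(b.shape) for _ in range(depth)]  # h_tm1, h_tm2, ...
    outputs = []
    for x_t in X:
        # output = activation(W.x_t + b + sum_k inner_activation(U_k.h_tmk))
        acc = x_t @ W + b
        for U_k, h_past in zip(Us, history):
            acc = acc + sigmoid(h_past @ U_k)
        h = sigmoid(acc)
        history = [h] + history[:-1]   # shift the lookback window
        outputs.append(h)
    return np.stack(outputs) if return_sequences else outputs[-1]

rng = np.random.default_rng(2)
X = rng.normal(size=(5, 3))                        # 5 timesteps, input_dim=3
W = rng.normal(size=(3, 4))                        # output_dim=4
Us = [rng.normal(size=(4, 4)) for _ in range(3)]   # depth=3
b = np.zeros(4)
print(simple_deep_rnn(X, W, Us, b).shape)                          # (4,)
print(simple_deep_rnn(X, W, Us, b, return_sequences=True).shape)   # (5, 4)
```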

GRU

keras.layers.recurrent.GRU(input_dim, output_dim=128, 
        init='glorot_uniform', inner_init='orthogonal',
        activation='sigmoid', inner_activation='hard_sigmoid',
        weights=None, truncate_gradient=-1, return_sequences=False)

Gated Recurrent Unit - Cho et al. 2014.

  • Input shape: 3D tensor with shape: (nb_samples, timesteps, input_dim).

  • Output shape:

    • if return_sequences: 3D tensor with shape: (nb_samples, timesteps, output_dim).
    • else: 2D tensor with shape: (nb_samples, output_dim).
  • Arguments:

    • input_dim: dimension of the input.
    • output_dim: dimension of the internal projections and the final output.
    • init: weight initialization function for the output cell. Can be the name of an existing function (str), or a Theano function (see: initializations).
    • inner_init: weight initialization function for the inner cells.
    • activation: activation function for the output. Can be the name of an existing function (str), or a Theano function (see: activations).
    • inner_activation: activation function for the inner cells.
    • weights: list of numpy arrays to set as initial weights. The list should have 9 elements.
    • truncate_gradient: Number of steps to use in truncated BPTT. See: Theano "scan".
    • return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.
  • References:

    • On the Properties of Neural Machine Translation: Encoder-Decoder Approaches (Cho et al. 2014)
    • Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling (Chung et al. 2014)

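A minimal numpy sketch of the GRU recurrence (not the library's implementation: plain sigmoid stands in for hard_sigmoid, and the grouping of the 9 weight arrays into z/r/h triples is illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru(X, weights, return_sequences=False):
    # weights: 9 arrays, grouped here as [W_z, U_z, b_z,
    #                                     W_r, U_r, b_r,
    #                                     W_h, U_h, b_h]
    W_z, U_z, b_z, W_r, U_r, b_r, W_h, U_h, b_h = weights
    h = np.zeros(b_z.shape)
    outputs = []
    for x_t in X:
        z = sigmoid(x_t @ W_z + h @ U_z + b_z)          # update gate
        r = sigmoid(x_t @ W_r + h @ U_r + b_r)          # reset gate
        hh = sigmoid(x_t @ W_h + (r * h) @ U_h + b_h)   # candidate state
        h = z * h + (1.0 - z) * hh                      # interpolate old/new
        outputs.append(h)
    return np.stack(outputs) if return_sequences else outputs[-1]

rng = np.random.default_rng(1)
# input_dim=3, output_dim=4: each gate needs a (3,4), a (4,4) and a (4,) array
weights = [rng.normal(size=s) for s in [(3, 4), (4, 4), (4,)] * 3]
X = rng.normal(size=(6, 3))   # 6 timesteps
print(gru(X, weights).shape)                          # (4,)
print(gru(X, weights, return_sequences=True).shape)   # (6, 4)
```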
LSTM

keras.layers.recurrent.LSTM(input_dim, output_dim=128, 
        init='glorot_uniform', inner_init='orthogonal', 
        activation='tanh', inner_activation='hard_sigmoid',
        weights=None, truncate_gradient=-1, return_sequences=False)

Long Short-Term Memory unit - Hochreiter & Schmidhuber 1997.

  • Input shape: 3D tensor with shape: (nb_samples, timesteps, input_dim).

  • Output shape:

    • if return_sequences: 3D tensor with shape: (nb_samples, timesteps, output_dim).
    • else: 2D tensor with shape: (nb_samples, output_dim).
  • Arguments:

    • input_dim: dimension of the input.
    • output_dim: dimension of the internal projections and the final output.
    • init: weight initialization function for the output cell. Can be the name of an existing function (str), or a Theano function (see: initializations).
    • inner_init: weight initialization function for the inner cells.
    • activation: activation function for the output. Can be the name of an existing function (str), or a Theano function (see: activations).
    • inner_activation: activation function for the inner cells.
    • weights: list of numpy arrays to set as initial weights. The list should have 12 elements.
    • truncate_gradient: Number of steps to use in truncated BPTT. See: Theano "scan".
    • return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.
  • References:

    • Long short-term memory (Hochreiter & Schmidhuber 1997)
    • Learning to forget: Continual prediction with LSTM (Gers et al. 2000)
    • Supervised sequence labelling with recurrent neural networks (Graves 2012)
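
A minimal numpy sketch of the LSTM recurrence (again not the library's implementation: plain sigmoid stands in for hard_sigmoid, and the grouping of the 12 weight arrays into i/c/f/o quadruples is illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm(X, weights, return_sequences=False):
    # weights: 12 arrays, grouped here as [W_i, U_i, b_i,  W_c, U_c, b_c,
    #                                      W_f, U_f, b_f,  W_o, U_o, b_o]
    (W_i, U_i, b_i, W_c, U_c, b_c,
     W_f, U_f, b_f, W_o, U_o, b_o) = weights
    h = np.zeros(b_i.shape)   # hidden state
    c = np.zeros(b_i.shape)   # cell state
    outputs = []
    for x_t in X:
        i = sigmoid(x_t @ W_i + h @ U_i + b_i)               # input gate
        f = sigmoid(x_t @ W_f + h @ U_f + b_f)               # forget gate
        c = f * c + i * np.tanh(x_t @ W_c + h @ U_c + b_c)   # cell update
        o = sigmoid(x_t @ W_o + h @ U_o + b_o)               # output gate
        h = o * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs) if return_sequences else outputs[-1]

rng = np.random.default_rng(3)
# input_dim=3, output_dim=4: each gate needs a (3,4), a (4,4) and a (4,) array
weights = [rng.normal(size=s) for s in [(3, 4), (4, 4), (4,)] * 4]
X = rng.normal(size=(6, 3))   # 6 timesteps
print(lstm(X, weights).shape)                          # (4,)
print(lstm(X, weights, return_sequences=True).shape)   # (6, 4)
```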