keras

SimpleRNN

keras.layers.recurrent.SimpleRNN(input_dim, output_dim, 
        init='glorot_uniform', inner_init='orthogonal', activation='sigmoid', weights=None,
        truncate_gradient=-1, return_sequences=False)

Fully connected RNN where the output is fed back to the input. Not a particularly useful model; included for demonstration purposes.

  • Input shape: 3D tensor with shape: (nb_samples, timesteps, input_dim).

  • Output shape:

    • if return_sequences: 3D tensor with shape: (nb_samples, timesteps, output_dim).
    • else: 2D tensor with shape: (nb_samples, output_dim).
  • Arguments:

    • input_dim: dimension of the input.
    • output_dim: dimension of the internal projections and the final output.
    • init: weight initialization function. Can be the name of an existing function (str), or a Theano function (see: initializations).
    • inner_init: initialization function of the inner (recurrent) weights.
    • activation: activation function. Can be the name of an existing function (str), or a Theano function (see: activations).
    • weights: list of numpy arrays to set as initial weights. The list should have 3 elements, of shapes: [(input_dim, output_dim), (output_dim, output_dim), (output_dim,)].
    • truncate_gradient: Number of steps to use in truncated BPTT. See: Theano "scan".
    • return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.
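
The recurrence this layer computes can be sketched in plain numpy (illustrative weight names; the three arrays match the shapes listed under `weights`, and plain sigmoid is used as in the default activation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def simple_rnn(X, W, U, b, return_sequences=False):
    # X: (timesteps, input_dim); W: (input_dim, output_dim)
    # U: (output_dim, output_dim); b: (output_dim,)
    h = np.zeros(b.shape)
    outputs = []
    for x_t in X:
        # h_t = activation(W.x_t + U.h_tm1 + b)
        h = sigmoid(x_t @ W + h @ U + b)
        outputs.append(h)
    return np.stack(outputs) if return_sequences else outputs[-1]

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))   # 5 timesteps, input_dim=3
W = rng.normal(size=(3, 4))   # output_dim=4
U = rng.normal(size=(4, 4))
b = np.zeros(4)
print(simple_rnn(X, W, U, b).shape)                          # (4,)
print(simple_rnn(X, W, U, b, return_sequences=True).shape)   # (5, 4)
```

Note that the last row of the `return_sequences=True` output equals the `return_sequences=False` output, which is exactly the distinction the argument controls.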

SimpleDeepRNN

keras.layers.recurrent.SimpleDeepRNN(input_dim, output_dim, depth=3,
        init='glorot_uniform', inner_init='orthogonal', 
        activation='sigmoid', inner_activation='hard_sigmoid',
        weights=None, truncate_gradient=-1, return_sequences=False)

Fully connected RNN where the output of multiple timesteps (up to "depth" steps in the past) is fed back to the input:

output = activation( W.x_t + b + inner_activation(U_1.h_tm1) + inner_activation(U_2.h_tm2) + ... )

Not a particularly useful model, included for demonstration purposes.

  • Input shape: 3D tensor with shape: (nb_samples, timesteps, input_dim).

  • Output shape:

    • if return_sequences: 3D tensor with shape: (nb_samples, timesteps, output_dim).
    • else: 2D tensor with shape: (nb_samples, output_dim).
  • Arguments:

    • input_dim: dimension of the input.
    • output_dim: dimension of the internal projections and the final output.
    • depth: int >= 1. Lookback depth (e.g. depth=1 is equivalent to SimpleRNN).
    • init: weight initialization function for the output cell. Can be the name of an existing function (str), or a Theano function (see: initializations).
    • inner_init: weight initialization function for the inner cells.
    • activation: activation function for the output. Can be the name of an existing function (str), or a Theano function (see: activations).
    • inner_activation: activation function for the inner cells.
    • weights: list of numpy arrays to set as initial weights. The list should have depth+2 elements.
    • truncate_gradient: Number of steps to use in truncated BPTT. See: Theano "scan".
    • return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.
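
The depth-k recurrence above can be sketched in numpy (illustrative names; plain sigmoid stands in for both activation and inner_activation to keep the sketch short):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def simple_deep_rnn(X, W, Us, b, return_sequences=False):
    # X: (timesteps, input_dim); W: (input_dim, output_dim)
    # Us: one (output_dim, output_dim) matrix per lookback step
    # b: (output_dim,)
    depth = len(Us)
    history = [np.zeros(b.shape) for _ in range(depth)]  # h_tm1, h_tm2, ...
    outputs = []
    for x_t in X:
        # output = activation(W.x_t + b + sum_k inner_activation(U_k.h_tmk))
        acc = x_t @ W + b
        for U_k, h_past in zip(Us, history):
            acc = acc + sigmoid(h_past @ U_k)
        h = sigmoid(acc)
        history = [h] + history[:-1]   # shift the lookback window
        outputs.append(h)
    return np.stack(outputs) if return_sequences else outputs[-1]

rng = np.random.default_rng(2)
X = rng.normal(size=(5, 3))                        # 5 timesteps, input_dim=3
W = rng.normal(size=(3, 4))                        # output_dim=4
Us = [rng.normal(size=(4, 4)) for _ in range(3)]   # depth=3
b = np.zeros(4)
print(simple_deep_rnn(X, W, Us, b).shape)                          # (4,)
print(simple_deep_rnn(X, W, Us, b, return_sequences=True).shape)   # (5, 4)
```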

GRU

keras.layers.recurrent.GRU(input_dim, output_dim=128, 
        init='glorot_uniform', inner_init='orthogonal',
        activation='sigmoid', inner_activation='hard_sigmoid',
        weights=None, truncate_gradient=-1, return_sequences=False)

Gated Recurrent Unit - Cho et al. 2014.

  • Input shape: 3D tensor with shape: (nb_samples, timesteps, input_dim).

  • Output shape:

    • if return_sequences: 3D tensor with shape: (nb_samples, timesteps, output_dim).
    • else: 2D tensor with shape: (nb_samples, output_dim).
  • Arguments:

    • input_dim: dimension of the input.
    • output_dim: dimension of the internal projections and the final output.
    • init: weight initialization function for the output cell. Can be the name of an existing function (str), or a Theano function (see: initializations).
    • inner_init: weight initialization function for the inner cells.
    • activation: activation function for the output. Can be the name of an existing function (str), or a Theano function (see: activations).
    • inner_activation: activation function for the inner cells.
    • weights: list of numpy arrays to set as initial weights. The list should have 9 elements.
    • truncate_gradient: Number of steps to use in truncated BPTT. See: Theano "scan".
    • return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.
  • References:

    • On the Properties of Neural Machine Translation: Encoder-Decoder Approaches (Cho et al. 2014)
    • Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling (Chung et al. 2014)

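A minimal numpy sketch of the GRU recurrence (not the library's implementation: plain sigmoid stands in for hard_sigmoid, and the grouping of the 9 weight arrays into z/r/h triples is illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru(X, weights, return_sequences=False):
    # weights: 9 arrays, grouped here as [W_z, U_z, b_z,
    #                                     W_r, U_r, b_r,
    #                                     W_h, U_h, b_h]
    W_z, U_z, b_z, W_r, U_r, b_r, W_h, U_h, b_h = weights
    h = np.zeros(b_z.shape)
    outputs = []
    for x_t in X:
        z = sigmoid(x_t @ W_z + h @ U_z + b_z)          # update gate
        r = sigmoid(x_t @ W_r + h @ U_r + b_r)          # reset gate
        hh = sigmoid(x_t @ W_h + (r * h) @ U_h + b_h)   # candidate state
        h = z * h + (1.0 - z) * hh                      # interpolate old/new
        outputs.append(h)
    return np.stack(outputs) if return_sequences else outputs[-1]

rng = np.random.default_rng(1)
# input_dim=3, output_dim=4: each gate needs a (3,4), a (4,4) and a (4,) array
weights = [rng.normal(size=s) for s in [(3, 4), (4, 4), (4,)] * 3]
X = rng.normal(size=(6, 3))   # 6 timesteps
print(gru(X, weights).shape)                          # (4,)
print(gru(X, weights, return_sequences=True).shape)   # (6, 4)
```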
LSTM

keras.layers.recurrent.LSTM(input_dim, output_dim=128, 
        init='glorot_uniform', inner_init='orthogonal', 
        activation='tanh', inner_activation='hard_sigmoid',
        weights=None, truncate_gradient=-1, return_sequences=False)

Long Short-Term Memory unit - Hochreiter & Schmidhuber 1997.

  • Input shape: 3D tensor with shape: (nb_samples, timesteps, input_dim).

  • Output shape:

    • if return_sequences: 3D tensor with shape: (nb_samples, timesteps, output_dim).
    • else: 2D tensor with shape: (nb_samples, output_dim).
  • Arguments:

    • input_dim: dimension of the input.
    • output_dim: dimension of the internal projections and the final output.
    • init: weight initialization function for the output cell. Can be the name of an existing function (str), or a Theano function (see: initializations).
    • inner_init: weight initialization function for the inner cells.
    • activation: activation function for the output. Can be the name of an existing function (str), or a Theano function (see: activations).
    • inner_activation: activation function for the inner cells.
    • weights: list of numpy arrays to set as initial weights. The list should have 12 elements.
    • truncate_gradient: Number of steps to use in truncated BPTT. See: Theano "scan".
    • return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.
  • References:

    • Long short-term memory (Hochreiter & Schmidhuber 1997)
    • Learning to forget: Continual prediction with LSTM (Gers et al. 2000)
    • Supervised sequence labelling with recurrent neural networks (Graves 2012)
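
A minimal numpy sketch of the LSTM recurrence (again not the library's implementation: plain sigmoid stands in for hard_sigmoid, and the grouping of the 12 weight arrays into i/c/f/o quadruples is illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm(X, weights, return_sequences=False):
    # weights: 12 arrays, grouped here as [W_i, U_i, b_i,  W_c, U_c, b_c,
    #                                      W_f, U_f, b_f,  W_o, U_o, b_o]
    (W_i, U_i, b_i, W_c, U_c, b_c,
     W_f, U_f, b_f, W_o, U_o, b_o) = weights
    h = np.zeros(b_i.shape)   # hidden state
    c = np.zeros(b_i.shape)   # cell state
    outputs = []
    for x_t in X:
        i = sigmoid(x_t @ W_i + h @ U_i + b_i)               # input gate
        f = sigmoid(x_t @ W_f + h @ U_f + b_f)               # forget gate
        c = f * c + i * np.tanh(x_t @ W_c + h @ U_c + b_c)   # cell update
        o = sigmoid(x_t @ W_o + h @ U_o + b_o)               # output gate
        h = o * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs) if return_sequences else outputs[-1]

rng = np.random.default_rng(3)
# input_dim=3, output_dim=4: each gate needs a (3,4), a (4,4) and a (4,) array
weights = [rng.normal(size=s) for s in [(3, 4), (4, 4), (4,)] * 4]
X = rng.normal(size=(6, 3))   # 6 timesteps
print(lstm(X, weights).shape)                          # (4,)
print(lstm(X, weights, return_sequences=True).shape)   # (6, 4)
```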