Best use of LSTM for within sequence event prediction

Assume the following 1 dimensional sequence:

A, B, C, Z, B, B, #, C, C, C, V, $, W, A, % ...

Letters A, B, C, .. here represent ‘ordinary’ events.

Symbols #, $, %, ... here represent ‘special’ events

The temporal spacing between events is non-uniform (anything from seconds to days), though the further in the past an event is, the less likely it is to influence future events. Ideally I could take these time delays into account explicitly.

There are on the order of 10,000 ordinary event types and on the order of 100 special event types. The number of ordinary events preceding a special event varies, but it is unlikely to be more than 100-300.

Fundamentally I’m interested in looking for patterns in the ordinary event sequence that end up being predictive for the special events.

Now you can approach this in different ways: creating feature vectors + standard classification, association rule learning, HMMs, etc.

In this case I’m curious how an LSTM-based network would fit best. The straightforward approach would be to do something like Karpathy’s char-rnn and predict the next event given a history. Then for a new sequence

C, Z, Q, V, V, ... , V, W

You could run it through the model and see which special event is most probable to come next. But it does not quite feel like the right fit.
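To make the char-rnn style framing concrete, here is a minimal sketch of the data side only (no model). It assumes a rolling history capped at `max_history` ordinary events that is not reset after a special event, and that the trained model returns a probability per event type as a dict. The names `make_examples` and `rank_special` are hypothetical, not from any library.

```python
from collections import deque


def make_examples(events, special, max_history=300):
    """Build (history, target) pairs, one per special event.

    history: up to max_history ordinary events seen before the special event.
    Assumption: the history keeps rolling across special events.
    """
    history = deque(maxlen=max_history)
    examples = []
    for ev in events:
        if ev in special:
            if history:  # skip special events with no ordinary history yet
                examples.append((list(history), ev))
        else:
            history.append(ev)
    return examples


def rank_special(probs, special):
    """Renormalize next-event probabilities over special events only,
    and return them sorted from most to least probable."""
    mass = sum(p for ev, p in probs.items() if ev in special)
    if mass <= 0:
        return []
    ranked = [(ev, p / mass) for ev, p in probs.items() if ev in special]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


special = {"#", "$", "%"}
events = list("ABCZBB#CCCV$WA%")
examples = make_examples(events, special, max_history=4)
```

Restricting the ranking to special events means the many ordinary event types do not compete for probability mass when the only question is which special event comes next.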

Since this is a temporal classification problem, though, it seems the proper thing to do is to use Connectionist Temporal Classification (CTC) as described by Alex Graves.

However, before investing too much, I’m looking for something easier and quicker to experiment with, to get a feel for how well LSTMs would fit here. TensorFlow will get a CTC example at some point, but not yet.

So my (sub) questions are:

  1. Given the problem above, and that I’d like to experiment with LSTMs, is it worth trying the char-rnn type approach, should I bite the bullet and get to grips with CTC, or is there a better place to start?
  2. How would you explicitly incorporate inter-event timing information? Using a fixed clock with no-op events obviously works but seems ugly.
  3. Assuming I manage to train an LSTM, is there a way to inspect the model to see what kind of event ‘motifs’ it has picked up (i.e., analogous to the filters in convnets)?
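For the timing question, one possible alternative to a fixed clock with no-op events is to attach the gap since the previous event to each event as an extra input feature. The sketch below assumes timestamps are given in seconds and compresses the seconds-to-days range with a logarithm. The function name and the `scale` parameter are hypothetical choices for illustration.

```python
import math


def time_delta_features(timestamps, scale=60.0):
    """Return one log-scaled gap per event.

    timestamps: event times in seconds, in sequence order (assumption).
    scale: gap size (in seconds) that maps to log1p(1); 60.0 is an arbitrary choice.
    The first event has no predecessor and gets a gap of 0.
    """
    features = []
    previous = None
    for t in timestamps:
        gap = 0.0 if previous is None else max(t - previous, 0.0)
        features.append(math.log1p(gap / scale))
        previous = t
    return features


events = ["A", "B", "#"]
timestamps = [0.0, 60.0, 86460.0]  # one minute, then one day later
inputs = list(zip(events, time_delta_features(timestamps)))
```

Each `(event, gap)` pair could then be fed to the network as the event's embedding concatenated with the scalar gap, which keeps the sequence length equal to the number of real events.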

Any sample code (Python preferred) is always helpful.

Edit: Just to add that there is some noise in the sequence. Some events can be safely ignored, but it is not always possible to say up front exactly which ones. So ideally the model (and the motifs derived from it) is robust against this.


Your data seems to be just sequences of tokens. Try building an LSTM autoencoder: let the encoder learn a fixed representation of the first part of your sequence, and have the decoder predict the remainder.

These representations would be your motifs.
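As a rough sketch of the data preparation this implies (the LSTM encoder and decoder themselves are omitted), each sequence can be integer-encoded and cut into an encoder input and a decoder target. The special tokens `<pad>`, `<sos>`, `<eos>`, the 50/50 split, and both function names are assumptions made for illustration.

```python
def build_vocab(sequences):
    """Map every event symbol to an integer id, reserving ids for helper tokens."""
    vocab = {"<pad>": 0, "<sos>": 1, "<eos>": 2}
    for seq in sequences:
        for ev in seq:
            vocab.setdefault(ev, len(vocab))
    return vocab


def split_for_seq2seq(seq, vocab, split=0.5):
    """Cut one sequence into encoder input, decoder input and decoder target.

    The encoder reads the first part; the decoder is trained to emit the rest.
    decoder_in is the target shifted right by one step (teacher forcing).
    """
    ids = [vocab[ev] for ev in seq]
    cut = max(1, int(len(ids) * split))
    encoder_in = ids[:cut]
    decoder_in = [vocab["<sos>"]] + ids[cut:]
    decoder_target = ids[cut:] + [vocab["<eos>"]]
    return encoder_in, decoder_in, decoder_target


sequence = list("ABCZBB#C")
vocab = build_vocab([sequence])
encoder_in, decoder_in, decoder_target = split_for_seq2seq(sequence, vocab)
```

After training, the encoder's final hidden state for a given prefix is the fixed representation referred to above.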


Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.

Srivastava, N., Mansimov, E., & Salakhutdinov, R. (2015). Unsupervised learning of video representations using LSTMs. arXiv preprint arXiv:1502.04681.

Source : Link , Question Author : dgorissen , Answer Author : horaceT
