Semi-supervised learning, active learning and deep learning for classification

Final edit with all resources updated:

For a project, I am applying machine learning algorithms for classification.

Challenge:
Quite limited labeled data and much more unlabeled data.

Goals:

  1. Apply semi-supervised classification
  2. Apply some form of semi-supervised labeling process (known as active learning)

I’ve found a lot of information in research papers, such as applying EM, Transductive SVM or S3VM (Semi-Supervised SVM), or somehow using LDA. There are even a few books on this topic.

Question:
Where are the implementations and practical sources?


Final update (based on help provided by mpiktas, bayer, and Dikran Marsupial)

Semi-supervised learning:
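
To make the idea concrete, here is a minimal sketch of self-training, one of the simplest semi-supervised schemes. Note that this is not EM or S3VM as mentioned in the question; it is a simpler stand-in. The nearest-centroid classifier, the function names, the `min_margin` threshold and the toy data are all assumptions made for illustration.

```python
import math

def centroids(points, labels):
    """Mean of the points belonging to each class."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        s = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            s[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def predict_with_margin(cents, p):
    """Return (nearest class, margin), where margin is the distance gap
    between the closest and the second-closest centroid."""
    dists = sorted((math.dist(p, c), y) for y, c in cents.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin

def self_train(labeled, labels, unlabeled, min_margin=1.0, max_rounds=10):
    """Repeatedly pseudo-label the unlabeled points the model is confident about."""
    points, ys, pool = list(labeled), list(labels), list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(points, ys)
        confident, rest = [], []
        for p in pool:
            y, margin = predict_with_margin(cents, p)
            (confident if margin >= min_margin else rest).append((p, y))
        if not confident:
            break  # nothing new to add, so the model cannot change any more
        for p, y in confident:
            points.append(p)
            ys.append(y)
        pool = [p for p, _ in rest]
    return centroids(points, ys), pool

# Toy data (assumed): two labeled points, five unlabeled ones.
labeled = [(0.0, 0.0), (4.0, 4.0)]
labels = ["a", "b"]
unlabeled = [(0.5, 0.2), (0.1, 0.9), (3.6, 4.2), (4.4, 3.7), (2.0, 2.1)]

model, leftover = self_train(labeled, labels, unlabeled)
print(model)     # class centroids after pseudo-labeling
print(leftover)  # points that never became confident enough to pseudo-label
```

The threshold controls the usual trade-off: a low `min_margin` absorbs more unlabeled data but risks reinforcing early mistakes, while a high one leaves ambiguous points unlabeled.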

Active learning:

  • Dualist: an implementation of active learning with source code on text classification
  • This webpage gives a wonderful overview of active learning.
  • An experimental design workshop: here.
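
For readers who want to see the basic loop before digging into those resources, here is a rough sketch of pool-based active learning with uncertainty sampling. It is not taken from Dualist or any of the linked material. The nearest-centroid model, the `oracle` function (a stand-in for a human annotator), the `budget` parameter and the toy data are all assumptions.

```python
import math

def fit_centroids(points, labels):
    """Mean of the points belonging to each class."""
    groups = {}
    for p, y in zip(points, labels):
        groups.setdefault(y, []).append(p)
    return {y: [sum(c) / len(ps) for c in zip(*ps)] for y, ps in groups.items()}

def margin(cents, p):
    """Distance gap between the two closest centroids; small means uncertain."""
    dists = sorted(math.dist(p, c) for c in cents.values())
    return dists[1] - dists[0] if len(dists) > 1 else float("inf")

def active_learn(labeled, labels, pool, oracle, budget):
    """Spend `budget` label requests on the points the model is least sure about."""
    points, ys, pool = list(labeled), list(labels), list(pool)
    queried = []
    for _ in range(min(budget, len(pool))):
        cents = fit_centroids(points, ys)
        # Uncertainty sampling: query the pool point with the smallest margin.
        query = min(pool, key=lambda p: margin(cents, p))
        pool.remove(query)
        points.append(query)
        ys.append(oracle(query))  # in practice a human provides this label
        queried.append(query)
    return fit_centroids(points, ys), queried

# Hypothetical annotator: the true rule is hidden from the learner.
def oracle(p):
    return "pos" if p[0] + p[1] > 4 else "neg"

labeled = [(0.0, 0.0), (4.0, 4.0)]
labels = ["neg", "pos"]
pool = [(0.5, 0.5), (3.5, 3.8), (2.2, 2.1), (1.0, 0.2), (1.7, 1.9)]

model, queried = active_learn(labeled, labels, pool, oracle, budget=2)
print(queried)  # the points that were sent to the annotator
print(model)
```

The point of the design is that the label budget goes to points near the current decision boundary rather than to points the model already classifies with a large margin.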

Deep learning:

Answer

It seems as if deep learning might be very interesting for you. This is a very recent field of deep connectionist models which are pretrained in an unsupervised way and fine-tuned afterwards with supervision. The fine-tuning requires far fewer samples than the pretraining.
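
As a toy illustration of that two-phase recipe, here is a sketch in plain Python. It is deliberately not deep: a one-unit linear autoencoder stands in for the pretrained network, and a logistic output is added for fine-tuning. The data, function names, learning rates and step counts are all assumptions chosen so the example runs quickly.

```python
import math
import random

rng = random.Random(0)

def sample(center, n):
    """Noisy 2-D points around a center."""
    return [(center[0] + rng.gauss(0, 0.3), center[1] + rng.gauss(0, 0.3))
            for _ in range(n)]

# Plenty of unlabeled data, only four labeled points.
unlabeled = sample((-2.0, -2.0), 100) + sample((2.0, 2.0), 100)
labeled = [((-2.0, -2.0), 0), ((-1.8, -2.2), 0), ((2.0, 2.0), 1), ((2.1, 1.7), 1)]

def reconstruction_error(w, v, data):
    """Mean squared error of x_hat = v * (w . x)."""
    err = 0.0
    for x in data:
        z = w[0] * x[0] + w[1] * x[1]
        err += (v[0] * z - x[0]) ** 2 + (v[1] * z - x[1]) ** 2
    return err / len(data)

def pretrain(data, steps=300, lr=0.01):
    """Unsupervised phase: fit encoder w and decoder v by gradient descent."""
    w, v = [0.1, 0.0], [0.1, 0.0]
    n = len(data)
    for _ in range(steps):
        gw, gv = [0.0, 0.0], [0.0, 0.0]
        for x in data:
            z = w[0] * x[0] + w[1] * x[1]
            r = (v[0] * z - x[0], v[1] * z - x[1])  # reconstruction residual
            vr = v[0] * r[0] + v[1] * r[1]
            for i in range(2):
                gw[i] += 2 * vr * x[i] / n
                gv[i] += 2 * r[i] * z / n
        for i in range(2):
            w[i] -= lr * gw[i]
            v[i] -= lr * gv[i]
    return w, v

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def fine_tune(w, data, steps=200, lr=0.1):
    """Supervised phase: add a logistic output on the code and update all weights."""
    w, a, b = list(w), 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        gw, ga, gb = [0.0, 0.0], 0.0, 0.0
        for x, y in data:
            z = w[0] * x[0] + w[1] * x[1]
            d = sigmoid(a * z + b) - y  # gradient of the log loss w.r.t. the logit
            ga += d * z / n
            gb += d / n
            for i in range(2):
                gw[i] += d * a * x[i] / n
        a -= lr * ga
        b -= lr * gb
        for i in range(2):
            w[i] -= lr * gw[i]
    return w, a, b

err_before = reconstruction_error([0.1, 0.0], [0.1, 0.0], unlabeled)
w0, v0 = pretrain(unlabeled)
err_after = reconstruction_error(w0, v0, unlabeled)
w1, a, b = fine_tune(w0, labeled)

def predict(x):
    return 1 if sigmoid(a * (w1[0] * x[0] + w1[1] * x[1]) + b) >= 0.5 else 0

print(err_before, err_after)
print(predict((2.0, 2.0)), predict((-2.0, -2.0)))
```

The unlabeled data does the heavy lifting of finding a useful representation, so the supervised phase only has to adjust a handful of parameters with four labeled points. Real deep models stack many nonlinear layers, but the division of labor is the same.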

To whet your appetite, I recommend Semantic Hashing by Salakhutdinov and Hinton. Have a look at the codes it finds for distinct documents of the Reuters corpus (unsupervised!):

[Figure: codes found for documents of the Reuters corpus]

If you need some code implemented, check out deeplearning.net. I don’t believe there are out-of-the-box solutions, though.

Attribution
Source: Link, Question Author: Flake, Answer Author: bayerj
