I am looking for a good tutorial on clustering data in R using the hierarchical Dirichlet process (HDP), one of the recent and popular nonparametric Bayesian methods.

For nonparametric Bayesian analysis in R there is DPpackage (IMHO, the most comprehensive of the available packages), but I cannot follow the examples in R News or in the package reference manual well enough to code an HDP myself.

Any help or pointer is appreciated.

A C++ implementation of HDP for topic modeling is available here (please look at the bottom for C++ code)


Here are some online resources I found interesting, though I have not gone through them in detail (and I am not a specialist in this topic):

The definitive reference seems to be

As for R, there seem to be some other packages worth exploring if DPpackage does not suit your needs, e.g. dpmixsim, BHC, or mbsc found on

