R: An R wrapper for the Mallet topic modeling package
mallet-package
R Documentation
Description
This package provides an interface to the Java implementation of latent Dirichlet allocation in the Mallet machine learning package. Mallet has many functions; this wrapper focuses on the topic modeling sub-package written by David Mimno. The package uses the rJava package to connect to a JVM.
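Because the package talks to a JVM through rJava, JVM settings such as the maximum heap size have to be in place before the JVM starts. A minimal sketch, assuming the standard rJava `java.parameters` option; the 2 GB value is an arbitrary illustration, not a recommendation from this package:

```r
# Assumption: "-Xmx2g" is an illustrative heap size; choose one that fits
# your corpus and machine. The option must be set before rJava starts the
# JVM, so it comes before library(mallet).
options(java.parameters = "-Xmx2g")

library(mallet)
```

Setting the option after the JVM is already running in the session has no effect on the heap size, so it is usually placed at the top of a script.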
Details
Package: mallet
Type: Package
Version: 1.0
Date: 2013-08-08
License: MIT
Create a topic model trainer: MalletLDA
Load documents from disk and import them: mallet.read.dir, mallet.import
Get info about word frequencies: mallet.word.freqs
Get trained model parameters: mallet.doc.topics, mallet.topic.words, mallet.subset.topic.words
Reports on topic words: mallet.top.words, mallet.topic.labels
Clustering of topics: mallet.topic.hclust
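The functions listed above are typically used in sequence. The following is a minimal sketch of that workflow, assuming the package and a working Java installation are available. The toy corpus, the stopword list, and the parameter values (two topics, 100 sampling iterations, a balance of 0.3) are illustrative assumptions, not recommended settings.

```r
library(mallet)

# Toy corpus (hypothetical data, for illustration only).
docs <- data.frame(
  id = c("d1", "d2", "d3", "d4"),
  text = c("the cat sat on the mat with another cat",
           "dogs and cats are common household pets",
           "stock markets fell as investors sold shares",
           "investors bought shares when markets rose"),
  stringsAsFactors = FALSE
)

# mallet.import reads stopwords from a file, one word per line.
stoplist <- tempfile(fileext = ".txt")
writeLines(c("the", "on", "with", "and", "are", "as", "when"), stoplist)

# Convert the ids and texts into a Mallet instance list.
instances <- mallet.import(docs$id, docs$text, stoplist)

# Create a trainer with two topics and attach the documents.
topic.model <- MalletLDA(num.topics = 2)
topic.model$loadDocuments(instances)

# Vocabulary with term and document frequencies.
word.freqs <- mallet.word.freqs(topic.model)

# Run 100 iterations of Gibbs sampling (an arbitrary, small number).
topic.model$train(100)

# Documents-by-topics and topics-by-words matrices, smoothed and
# normalized so that each row is a probability distribution.
doc.topics <- mallet.doc.topics(topic.model, smoothed = TRUE, normalized = TRUE)
topic.words <- mallet.topic.words(topic.model, smoothed = TRUE, normalized = TRUE)

# Most probable words for the first topic, and short labels for all topics.
top.words <- mallet.top.words(topic.model, topic.words[1, ], 5)
topic.labels <- mallet.topic.labels(topic.model, topic.words, 3)

# Hierarchical clustering of topics, weighting word-level and
# document-level similarity with a balance of 0.3.
topic.clusters <- mallet.topic.hclust(doc.topics, topic.words, 0.3)
```

Gibbs sampling is stochastic, so the topics found will vary between runs. On a corpus this small the result is not meaningful; the sketch only shows how the pieces fit together.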
Author(s)
Maintainer: David Mimno
References
The model, Latent Dirichlet allocation (LDA):
David M Blei, Andrew Ng, Michael Jordan. Latent Dirichlet Allocation. J. of Machine Learning Research, 2003.
The Java toolkit:
Andrew Kachites McCallum. The Mallet Toolkit. 2002.
Details of the fast sparse Gibbs sampling algorithm:
Limin Yao, David Mimno, Andrew McCallum. Efficient Methods for Topic Model Inference on Streaming Document Collections. KDD, 2009.
Hyperparameter optimization:
Hanna Wallach, David Mimno, Andrew McCallum. Rethinking LDA: Why Priors Matter. NIPS, 2009.