Extended Usage Example
To show how text analysis might work in practice, we'll work with a corpus of political speeches: the State of the Union addresses delivered by American presidents.
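using TextAnalysis, MultivariateStats, Clustering

# Build a corpus from the files under the "sotu" directory
crps = DirectoryCorpus("sotu")
# Convert all documents to a single type so they can be processed uniformly
standardize!(crps, StringDocument)
# Keep only the first 30 documents
crps = Corpus(crps[1:30])

# Normalize the text: lowercase everything and strip punctuation
remove_case!(crps)
prepare!(crps, strip_punctuation)

# Build the lexicon and inverse index, then look up the documents containing "freedom"
update_lexicon!(crps)
update_inverse_index!(crps)
crps["freedom"]

# Build the document-term matrix, convert it to a dense matrix, and apply TF-IDF weighting
m = DocumentTermMatrix(crps)
D = dtm(m, :dense)
T = tf_idf(D)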
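# Run k-means with 5 clusters. Clustering.jl treats each column as an observation,
# so this call groups terms; pass permutedims(T) instead to group documents.
cl = kmeans(T, 5)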
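
To make the tf_idf step above more concrete, here is a minimal sketch in plain Julia of one common textbook formulation: term frequency is a term's share of its document, and inverse document frequency is log(N / number of documents containing the term). This is an assumption for illustration only; the weighting that TextAnalysis applies may differ in its details, and simple_tf_idf is a hypothetical helper, not part of the package.

```julia
# Hypothetical helper, not part of TextAnalysis: textbook TF-IDF on a
# documents-by-terms count matrix (rows are documents, columns are terms).
function simple_tf_idf(counts::AbstractMatrix{<:Real})
    n_docs = size(counts, 1)
    doc_lengths = sum(counts, dims=2)        # total tokens per document
    tf = counts ./ max.(doc_lengths, 1)      # share of each document taken up by each term
    doc_freq = sum(counts .> 0, dims=1)      # number of documents containing each term
    idf = log.(n_docs ./ max.(doc_freq, 1))  # rarer terms get larger weights
    return tf .* idf                         # idf row is broadcast across all documents
end

# Toy data: 3 documents, 3 terms
counts = [2 1 0;
          1 0 3;
          1 0 0]
weights = simple_tf_idf(counts)
```

In this toy matrix the first term occurs in every document, so its weight is zero throughout, while the terms confined to a single document keep a positive weight.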