Classifier
TextAnalysis currently offers a Naive Bayes Classifier for text classification.
To load the Naive Bayes Classifier, use the following command:
using TextAnalysis: NaiveBayesClassifier, fit!, predict
Basic Usage
The classifier is used in three steps:
- Create an instance of the Naive Bayes Classifier model:
TextAnalysis.NaiveBayesClassifier — Type
NaiveBayesClassifier([dict, ]classes)
A Naive Bayes Classifier for classifying documents.
Arguments
classes: Array of possible classes that the data could belong to.
dict: (Optional) Array of possible tokens (words). This is automatically updated if a new token is detected during training or prediction.
Example
julia> using TextAnalysis: NaiveBayesClassifier, fit!, predict
julia> m = NaiveBayesClassifier([:spam, :non_spam])
NaiveBayesClassifier{Symbol}(String[], [:spam, :non_spam], Matrix{Int64}(undef, 0, 2))
julia> fit!(m, "this is spam", :spam)
NaiveBayesClassifier{Symbol}(["this", "is", "spam"], [:spam, :non_spam], [2 1; 2 1; 2 1])
julia> fit!(m, "this is not spam", :non_spam)
NaiveBayesClassifier{Symbol}(["this", "is", "spam", "not"], [:spam, :non_spam], [2 2; 2 2; 2 2; 1 2])
julia> predict(m, "is this a spam")
Dict{Symbol, Float64} with 2 entries:
  :spam     => 0.59883
  :non_spam => 0.40117
- Fit the model weights on training data:
TextAnalysis.fit! — Function
fit!(model::NaiveBayesClassifier, str, class)
fit!(model::NaiveBayesClassifier, ::Features, class)
fit!(model::NaiveBayesClassifier, ::StringDocument, class)
Fit the weights for the model on the input data.
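The weight matrices printed in the spam example above are consistent with simple per-class token counting in which every count starts at 1. The following is a minimal sketch of that idea in plain Julia, not the library's implementation: `CountModel`, `tokens`, and `sketch_fit!` are hypothetical names introduced for illustration, and the whitespace tokenizer is a simplifying assumption.

```julia
# Hypothetical stand-in for the classifier's state; not the library's type.
mutable struct CountModel
    vocab::Vector{String}     # known tokens, one per row of `weights`
    classes::Vector{Symbol}   # class labels, one per column of `weights`
    weights::Matrix{Int}      # token-by-class counts, each starting at 1
end

# Start with an empty vocabulary and a 0-row count matrix.
CountModel(classes) = CountModel(String[], classes, ones(Int, 0, length(classes)))

# Simplified tokenizer (assumption): lowercase, then split on whitespace.
tokens(text) = split(lowercase(text))

function sketch_fit!(m::CountModel, text::AbstractString, class::Symbol)
    col = findfirst(==(class), m.classes)
    col === nothing && throw(ArgumentError("unknown class: $class"))
    for tok in tokens(text)
        row = findfirst(==(tok), m.vocab)
        if row === nothing
            # New token: append a row of ones, one initial count per class.
            push!(m.vocab, String(tok))
            m.weights = vcat(m.weights, ones(Int, 1, length(m.classes)))
            row = length(m.vocab)
        end
        # Count this occurrence of the token for the given class.
        m.weights[row, col] += 1
    end
    return m
end

m = CountModel([:spam, :non_spam])
sketch_fit!(m, "this is spam", :spam)
sketch_fit!(m, "this is not spam", :non_spam)
println(m.vocab)
println(m.weights)
```

Under these assumptions, the two calls leave the sketch with the same vocabulary order and count matrix as the `fit!` calls in the spam example.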
- Make predictions on new data:
TextAnalysis.predict — Function
predict(::NaiveBayesClassifier, str)
predict(::NaiveBayesClassifier, ::Features)
predict(::NaiveBayesClassifier, ::StringDocument)
Predict probabilities for each class on the input Features or String.
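To see where numbers such as 0.59883 can come from, the sketch below recomputes the spam example's prediction directly from its printed weight matrix. It is an illustration under stated assumptions rather than the library's code: `sketch_predict` is a hypothetical helper, tokens are split on whitespace, tokens missing from the vocabulary are ignored, and no class prior is applied.

```julia
# Count matrix copied from the spam example above:
# rows are "this", "is", "spam", "not"; columns are :spam, :non_spam.
vocab   = ["this", "is", "spam", "not"]
classes = [:spam, :non_spam]
weights = [2 2; 2 2; 2 2; 1 2]

function sketch_predict(vocab, classes, weights, text)
    toks = split(lowercase(text))
    # Occurrences of each known token in the input; unknown tokens are ignored.
    counts = [count(==(w), toks) for w in vocab]
    # Per-class token probabilities: normalise each column to sum to 1.
    probs = weights ./ sum(weights, dims = 1)
    # Class scores: product of token probabilities raised to their counts.
    scores = vec(prod(probs .^ counts, dims = 1))
    # Normalise the scores so they sum to 1 across classes.
    scores ./= sum(scores)
    return Dict(classes[i] => scores[i] for i in eachindex(classes))
end

result = sketch_predict(vocab, classes, weights, "is this a spam")
println(result)
```

With these inputs the `:spam` score is (2/7)^3 and the `:non_spam` score is (2/8)^3 before normalisation, which gives roughly 0.59883 and 0.40117, matching the output shown in the example.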
Example
julia> using TextAnalysis
julia> m = NaiveBayesClassifier([:legal, :financial])
NaiveBayesClassifier{Symbol}(String[], [:legal, :financial], Matrix{Int64}(undef, 0, 2))
julia> fit!(m, "this is financial doc", :financial)
NaiveBayesClassifier{Symbol}(["financial", "this", "is", "doc"], [:legal, :financial], [1 2; 1 2; 1 2; 1 2])
julia> fit!(m, "this is legal doc", :legal)
NaiveBayesClassifier{Symbol}(["financial", "this", "is", "doc", "legal"], [:legal, :financial], [1 2; 2 2; … ; 2 2; 2 1])
julia> predict(m, "this should be predicted as a legal document")
Dict{Symbol, Float64} with 2 entries:
  :legal     => 0.666667
  :financial => 0.333333
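Since `predict` returns a dictionary of class probabilities, a single label can be obtained by picking the entry with the largest value. A minimal sketch using only Base Julia, with the probabilities copied literally from the output above so that it runs without the package (`probs`, `best_prob`, and `best_class` are illustrative names):

```julia
# Probabilities copied from the example output above.
probs = Dict(:legal => 0.666667, :financial => 0.333333)

# findmax on a dictionary returns the largest value together with its key.
best_prob, best_class = findmax(probs)

println(best_class, " (p = ", best_prob, ")")
```

Ties and a minimum confidence threshold are left to the caller in this sketch.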