Gaussian Naive Bayes

by Lucy X. Shi, on November 1, 2022.

Introduction

Gaussian Naive Bayes is a supervised learning algorithm based on applying Bayes’ theorem with the “naive” assumption of conditional independence between every pair of features given the value of the class variable. It is typically used for classification.

Probabilistic Derivation

Naive Bayes

Bayes’ theorem states the following relationship, given class variable $y$ and dependent feature vector $x_1$ through $x_n$:

$$P(y \mid x_1, \dots, x_n) = \frac{P(y) P(x_1, \dots, x_n \mid y)}{P(x_1, \dots, x_n)}$$

Using the naive conditional independence assumption that

$$P(x_i \mid y, x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) = P(x_i \mid y)$$

for all $i$, this relationship simplifies to

$$P(y \mid x_1, \dots, x_n) = \frac{P(y) \prod_{i=1}^{n} P(x_i \mid y)}{P(x_1, \dots, x_n)}$$

Since $P(x_1, \dots, x_n)$ is constant given the input, we can use the following classification rule:

$$P(y \mid x_1, \dots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y) \quad \Rightarrow \quad \hat{y} = \arg\max_y P(y) \prod_{i=1}^{n} P(x_i \mid y),$$

and we can use the relative frequency of class $y$ in the training set to estimate $P(y)$: if $N_c$ of the $N$ training points belong to class $c$, then $\hat{P}(y = c) = N_c / N$.

Gaussian Naive Bayes

The likelihood of the features $P(x_i \mid y)$ is assumed to be Gaussian:

$$P(x_i \mid y) = \frac{1}{\sqrt{2 \pi \sigma_y^2}} \exp \left( -\frac{(x_i - \mu_y)^2}{2 \sigma_y^2} \right)$$

The parameters $\sigma_y$ and $\mu_y$ are estimated using maximum likelihood.
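
Concretely, the maximum likelihood estimates are the per-class sample mean and the biased sample variance, applied feature-wise:

$$\hat{\mu}_y = \frac{1}{N_y} \sum_{k:\, y^{(k)} = y} x^{(k)}, \qquad \hat{\sigma}_y^2 = \frac{1}{N_y} \sum_{k:\, y^{(k)} = y} \left( x^{(k)} - \hat{\mu}_y \right)^2,$$

where the sums run over the $N_y$ training points $x^{(k)}$ labeled $y$. The division by $N_y$ rather than $N_y - 1$ is why the implementation below passes unbiased=False.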

In practice, we compute the log likelihood instead of the likelihood: multiplying many per-feature probabilities quickly underflows floating-point precision, whereas summing their logarithms is numerically stable.
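
Taking the log of the Gaussian density above gives the per-feature term the implementation evaluates:

$$\log P(x_i \mid y) = -\frac{1}{2} \log \left( 2 \pi \sigma_y^2 \right) - \frac{(x_i - \mu_y)^2}{2 \sigma_y^2}$$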

Implementation

We first compute the mean and variance of each feature over all data points in the same class (X[i] denotes the data points of the $i$th class):

# per-class feature means and maximum likelihood (biased) variances
mu = torch.mean(X[i], dim=0)
var = torch.var(X[i], dim=0, unbiased=False)
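
Inside fit, these per-class statistics are stacked into tensors of shape (n_classes, n_features). Here is a minimal sketch of such a fit step, not the library's exact code (self.class_prior is a name introduced here for the estimated $P(y)$):

import torch

def fit(self, X, y):
    self.classes = torch.unique(y)
    groups = [X[y == c] for c in self.classes]
    # per-class means and ML variances, stacked to (n_classes, n_features)
    self.theta = torch.stack([torch.mean(g, dim=0) for g in groups])
    self.var = torch.stack([torch.var(g, dim=0, unbiased=False) for g in groups])
    # relative class frequencies as the estimate of P(y)
    self.class_prior = torch.tensor([float(len(g)) for g in groups]) / len(X)
    return self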

Then we compute the Gaussian log likelihood of each given data point under the $i$th class, summed over features:

# log density summed over features; out-of-place subtraction so the
# (1,)-shaped first term broadcasts against the (n_samples, 1) second term
n_ij = -0.5 * torch.sum(torch.log(2.0 * math.pi * self.var[i, :]), 0, True)
n_ij = n_ij - 0.5 * torch.sum(((X - self.theta[i, :]) ** 2) / (self.var[i, :]), 1, True)
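
Adding the class log prior $\log P(y)$ gives the joint log likelihood; a sketch, assuming the relative class frequencies were stored as self.class_prior during fitting:

# joint log likelihood for class i: log P(y = i) + sum_j log P(x_j | y = i)
joint_log_likelihood_i = torch.log(self.class_prior[i]) + n_ij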

Finally, we take an argmax of the joint log likelihood over all classes:

# for each sample, pick the class with the highest joint log likelihood
return self.classes[joint_log_likelihood.argmax(1)]
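
Putting the pieces together, a predict method consistent with the snippets above might look like the following sketch (again using the hypothetical self.class_prior from the fit sketch):

import math
import torch

def predict(self, X):
    scores = []
    for i in range(len(self.classes)):
        # Gaussian log likelihood of every sample under class i, summed over features
        n_ij = -0.5 * torch.sum(torch.log(2.0 * math.pi * self.var[i, :]), 0, True)
        n_ij = n_ij - 0.5 * torch.sum(((X - self.theta[i, :]) ** 2) / (self.var[i, :]), 1, True)
        # add the class log prior to obtain the joint log likelihood
        scores.append(torch.log(self.class_prior[i]) + n_ij)
    # shape (n_samples, n_classes); best class per sample
    joint_log_likelihood = torch.cat(scores, dim=1)
    return self.classes[joint_log_likelihood.argmax(1)]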

The torchml Interface

First fit the classifier on the training data, then use it to make predictions.

# the import path is an assumption and may differ across torchml versions
from torchml.naive_bayes import GaussianNB

clf = GaussianNB()
clf.fit(X_train, y_train)
pred = clf.predict(X_test)
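
As a quick end-to-end sanity check, here is a minimal sketch on two well-separated Gaussian blobs (synthetic data invented for illustration):

import torch
from torchml.naive_bayes import GaussianNB  # assumed import path, as above

# two 2-D Gaussian blobs centered at (-2, -2) and (+2, +2)
X_train = torch.cat([torch.randn(50, 2) - 2.0, torch.randn(50, 2) + 2.0])
y_train = torch.cat([torch.zeros(50), torch.ones(50)])
X_test = torch.tensor([[-2.0, -2.0], [2.0, 2.0]])

clf = GaussianNB()
clf.fit(X_train, y_train)
print(clf.predict(X_test))  # should recover the classes of the two blob centers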
