Building a Spam Filter with Naive Bayes

In this project, we’ll build a spam filter for SMS messages using the multinomial Naive Bayes algorithm. Our goal is a program that classifies new messages with an accuracy greater than 80%, meaning that more than 80% of new messages should be correctly labeled as spam or ham (non-spam). To train the algorithm, we’ll use a dataset of 5,572 SMS messages that humans have already classified.
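
To make the goal concrete, here is a minimal sketch of the idea in Python. It is not the project's actual implementation. The toy messages, the function names (`tokenize`, `train`, `classify`, `accuracy`), and the choice of Laplace smoothing with `alpha=1.0` are assumptions made for illustration, and words not seen in training are simply skipped.

```python
import math
import re
from collections import Counter

# Hypothetical toy training data, NOT taken from the 5,572-message dataset.
TRAIN = [
    ("spam", "win a free prize now"),
    ("spam", "free cash prize claim now"),
    ("ham", "are we meeting for lunch today"),
    ("ham", "see you at home tonight"),
]

LABELS = ("spam", "ham")


def tokenize(message):
    """Lowercase the message and extract alphabetic word tokens."""
    return re.findall(r"[a-z]+", message.lower())


def train(examples, alpha=1.0):
    """Collect the counts that multinomial Naive Bayes needs."""
    word_counts = {label: Counter() for label in LABELS}
    label_counts = Counter()
    for label, message in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(message))
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    return {
        "alpha": alpha,  # Laplace smoothing parameter (assumed value)
        "vocabulary": vocabulary,
        "word_counts": word_counts,
        "label_counts": label_counts,
        "n_examples": len(examples),
    }


def classify(model, message):
    """Return the label with the highest log-posterior score."""
    alpha = model["alpha"]
    vocab_size = len(model["vocabulary"])
    scores = {}
    for label in LABELS:
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        # Start from the log prior P(label).
        score = math.log(model["label_counts"][label] / model["n_examples"])
        for word in tokenize(message):
            if word not in model["vocabulary"]:
                continue  # assumption: ignore words never seen in training
            # Smoothed log likelihood P(word | label).
            score += math.log(
                (counts[word] + alpha) / (total_words + alpha * vocab_size)
            )
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(model, examples):
    """Fraction of (label, message) pairs that are classified correctly."""
    correct = sum(classify(model, msg) == label for label, msg in examples)
    return correct / len(examples)


model = train(TRAIN)
```

With a model like this, the 80% target would be checked by calling `accuracy(model, test_examples)` on labeled messages held out from training and comparing the result against `0.8`.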