Finding the Best ‘k’ in a KNN-Classifier

Dataset: Penguin Dataset: EDA, Classification, and Clustering

Background: My task is to build a KNN classification function and find the best ‘k’ value (the number of nearest neighbors), defined as the value that produces the highest F1 score. The classifier is trained and tested on two features, Culmen Depth and Flipper Length, to predict which island a penguin is located on.
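
To make the idea concrete, here is a minimal sketch of what a KNN classification function can look like. It is not the function used in this project: the name knn_predict, the use of Euclidean distance, and the sample measurements are all assumptions made for illustration.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point (an assumed metric)
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Keep only the k closest points
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The most common island among those neighbors is the prediction
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up (culmen depth mm, flipper length mm) rows, for illustration only
train_X = [(18.7, 181.0), (17.4, 186.0), (18.0, 195.0),
           (14.5, 215.0), (15.0, 220.0), (14.2, 217.0)]
train_y = ["Torgersen", "Torgersen", "Torgersen", "Biscoe", "Biscoe", "Biscoe"]

print(knn_predict(train_X, train_y, (14.8, 216.0), k=3))
```

Because flipper length is measured on a much larger scale than culmen depth, it dominates the raw distance here; standardizing both features before computing distances is a common choice.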

Goals: I hope to build a successful classifier, which I define as one with an F1 score around 0.70.

Methods: To achieve my goal, I used Spyder to build a KNN classification function and to calculate the F1 score of each K-fold. I also used Excel to create a visual representation of the highest F1 scores.
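
The search for the best ‘k’ can be sketched roughly as follows. This is a hedged illustration, not the project's actual code: it assumes that "K-fold" refers to cross-validation folds, that the F1 score is macro-averaged across islands, and that rows are already shuffled. The names macro_f1 and best_k_by_f1, and the predict callable they accept, are hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Average the per-class F1 scores, giving each class equal weight."""
    scores = []
    for cls in sorted(set(y_true)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is never zero
        # because every class considered here appears in y_true
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)

def best_k_by_f1(X, y, k_values, n_folds, predict):
    """Return (best_k, mean_f1) using simple K-fold cross-validation.

    `predict(train_X, train_y, query, k)` is a hypothetical classifier callable.
    """
    results = {}
    for k in k_values:
        fold_scores = []
        for fold in range(n_folds):
            # Every n_folds-th row (offset by `fold`) forms the held-out fold
            test_idx = set(range(fold, len(X), n_folds))
            train_X = [x for i, x in enumerate(X) if i not in test_idx]
            train_y = [lab for i, lab in enumerate(y) if i not in test_idx]
            test_X = [X[i] for i in sorted(test_idx)]
            test_y = [y[i] for i in sorted(test_idx)]
            preds = [predict(train_X, train_y, q, k) for q in test_X]
            fold_scores.append(macro_f1(test_y, preds))
        # Mean F1 across folds is the score for this k
        results[k] = sum(fold_scores) / n_folds
    best_k = max(results, key=results.get)
    return best_k, results[best_k]
```

Any classifier with the signature predict(train_X, train_y, query, k) can be passed in, and k_values could be, for example, the odd numbers in some range so that two-way votes cannot tie.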

Results and Analysis: The ‘k’ value that produced the highest F1 score was 31 nearest neighbors. Its F1 score was 0.697, which is close enough to my 0.70 target that this classifier meets my definition of successful.

Future Directions: In the future, I would be interested in evaluating a KNN classifier based on accuracy and precision.





