Building an Eight-Feature Classifier to Predict Academic Success

Dataset: Predict Dropout or Academic Success

Background: This assignment was to build a classifier that predicts whether a student will drop out or graduate.

Goal: I hope to become more familiar with the sklearn library and to build a successful eight-feature classifier.

Methods: I used the Spyder IDE and the sklearn library to train and test classifiers on different sets of features. The eight features I chose to test were marital status, gender, debt, scholarship, educational special needs, displacement, course, and nationality. I first ran the classifier on one feature at a time and kept the strongest feature (based on F1 score). I then ran it with two features, again kept the strongest combination, and continued adding one feature at a time until the classifier used all eight.
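The selection procedure described above can be sketched roughly as follows. This is a minimal illustration, not the exact code used for the assignment: the decision tree model, the 70/30 train/test split, the weighted F1 averaging, the helper name forward_select, and the synthetic stand-in data are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def forward_select(X, y, max_features, seed=0):
    """Greedily add the feature that gives the highest weighted F1 at each step."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y
    )
    selected, history = [], []
    remaining = list(X.columns)
    while remaining and len(selected) < max_features:
        scores = {}
        for feature in remaining:
            cols = selected + [feature]
            # Assumed model; any sklearn classifier could be swapped in here.
            model = DecisionTreeClassifier(random_state=seed)
            model.fit(X_train[cols], y_train)
            scores[feature] = f1_score(
                y_test, model.predict(X_test[cols]), average="weighted"
            )
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
        history.append((list(selected), scores[best]))
    return history


# Synthetic stand-in data (assumption): only "course" carries signal here.
rng = np.random.default_rng(0)
n = 600
X = pd.DataFrame({
    "course": rng.integers(0, 5, n),
    "scholarship": rng.integers(0, 2, n),
    "debt": rng.integers(0, 2, n),
    "gender": rng.integers(0, 2, n),
})
base = (X["course"] >= 3).astype(int).to_numpy()
flip = rng.random(n) < 0.1  # 10% label noise
y = np.where(flip, 1 - base, base)

history = forward_select(X, y, max_features=4)
for features, score in history:
    print(f"{len(features)} feature(s): {features} -> weighted F1 = {score:.3f}")
```

Each entry in history holds the best feature set of that size and its F1 score, which is the same information the per-size results in this post report.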

Results and Analysis:

The tables above show the accuracy, precision, recall, and F1 score for each of the feature combinations. The conditional formatting of the F1 column provides a visual reference for the F1 scores of each classifier model. The most successful one-feature classifier used course, the most successful two-feature classifier used course and scholarship, and so on, until all eight features were included. The final classifier used course, scholarship, debt, nationality, gender, marital status, educational special needs, and displacement. This model has an accuracy of 60.6%, a precision of 55.2%, a recall of 60.6%, and an F1 score of 54.7%.
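For reference, here is a small sketch of how these four metrics can be computed with sklearn. The labels below are made up for illustration, and the weighted averaging is an assumption on my part. It would be consistent with the recall matching the accuracy in the results, since weighted recall always equals accuracy.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Made-up labels for illustration only (not from the dataset).
y_true = ["Graduate", "Dropout", "Graduate", "Graduate", "Dropout", "Graduate"]
y_pred = ["Graduate", "Graduate", "Graduate", "Dropout", "Dropout", "Graduate"]

accuracy = accuracy_score(y_true, y_pred)

# average="weighted" weights each class's score by its number of true samples.
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0
)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```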

Course

The one-feature classifier shows a 55.8% accuracy and a 0.484 F1 score, making it the best of the tested one-feature classifiers.

Course, Scholarship

The two-feature classifier shows a 56.8% accuracy and a 0.516 F1 score, showing improvement from the one-feature classifier and making it the best of the two-feature classifiers.

Course, Scholarship, Debt

The three-feature classifier shows a 59.6% accuracy and a 0.540 F1 score, making it the best of the three-feature classifiers.

Course, Scholarship, Debt, Nationality

The four-feature classifier shows a 59.7% accuracy and a 0.542 F1 score, making it the best of the four-feature classifiers.

Course, Scholarship, Debt, Nationality, Gender

The five-feature classifier shows a 60.7% accuracy and a 0.552 F1 score, making it the best of the five-feature classifiers.

Course, Scholarship, Debt, Nationality, Gender, Marital Status

The six-feature classifier shows a 59.5% accuracy and a 0.537 F1 score, making it the best of the six-feature classifiers.

Course, Scholarship, Debt, Nationality, Gender, Marital Status, Educational Special Needs

The seven-feature classifier shows a 58.8% accuracy and a 0.531 F1 score, making it the best of the seven-feature classifiers.

Course, Scholarship, Debt, Nationality, Gender, Marital Status, Educational Special Needs, Displacement

The eight-feature classifier, which uses all eight of the tested features, shows a 60.6% accuracy and a 0.547 F1 score. Overall it has the second-highest accuracy and the second-highest F1 score. The five-feature classifier scores better on both measures, making it the best classifier overall.
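The comparison can be double-checked with a few lines of Python. The numbers are the accuracy and F1 values reported above; the variable names are only illustrative.

```python
# (number of features, accuracy in %, F1 score) as reported in this post
results = [
    (1, 55.8, 0.484),
    (2, 56.8, 0.516),
    (3, 59.6, 0.540),
    (4, 59.7, 0.542),
    (5, 60.7, 0.552),
    (6, 59.5, 0.537),
    (7, 58.8, 0.531),
    (8, 60.6, 0.547),
]

# Rank the classifiers by F1 score, highest first.
ranked = sorted(results, key=lambda r: r[2], reverse=True)
best, runner_up = ranked[0], ranked[1]
print(f"Best: {best[0]} features (F1 {best[2]}), "
      f"runner-up: {runner_up[0]} features (F1 {runner_up[2]})")

# Change in F1 from adding each successive feature.
deltas = [
    (cur[0], round(cur[2] - prev[2], 3))
    for prev, cur in zip(results, results[1:])
]
for n_features, delta in deltas:
    print(f"{n_features} features: F1 change {delta:+.3f}")
```

The step-by-step changes show that adding the sixth and seventh features lowered the F1 score, which is why the five-feature classifier comes out ahead.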

Future Directions:

I would be interested in exploring more features and studying their impact on the accuracy of the classifier.

