Project
Comparative Analysis of NLP Models
This independent project compared classical and neural approaches to sentiment classification on IMDb movie reviews. The goal was not only to maximize accuracy but also to understand how different modeling choices affect performance and interpretability.
Key contributions
- Built a preprocessing pipeline with Pandas and Gensim for cleaning, tokenization, and stemming.
- Reproduced and evaluated Logistic Regression, Random Forest, and neural network approaches.
- Reached 89.3% peak accuracy with a tuned TF-IDF plus Logistic Regression setup.
- Produced t-SNE visualizations and word clouds, and reported AUROC, AUPRC, and F1-score alongside accuracy.
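For readers less familiar with the evaluation metrics named above (F1-score, AUROC, AUPRC), the sketch below shows what each one measures on a binary sentiment task. It is a minimal pure-Python illustration, not the project's evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions invented for this example, and AUPRC is approximated with the step-wise average-precision estimate.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for illustration but slow on large datasets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise AUPRC estimate: mean precision at the rank of each positive.

    Examples with tied scores are ordered arbitrarily by the sort.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("average precision needs at least one positive")
    return precision_sum / hits


# Toy data, invented for illustration (1 = positive review, 0 = negative).
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]  # hypothetical model probabilities
toy_preds = [int(s >= 0.5) for s in toy_scores]  # assumed 0.5 threshold

toy_f1 = f1_score(toy_labels, toy_preds)
toy_auroc = auroc(toy_labels, toy_scores)
toy_auprc = average_precision(toy_labels, toy_scores)
```

F1 depends on the chosen threshold, while AUROC and AUPRC depend only on how the scores rank the examples, which is one reason to report them together. In practice a library such as scikit-learn provides equivalents (`f1_score`, `roc_auc_score`, `average_precision_score` in `sklearn.metrics`).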
What to add later
- Model comparison plots across different metrics.
- Short explanation of the experimental setup and hyperparameter tuning process.
- Selected visualizations exported from the notebook or report.