@luistatera
No description provided.

… training, and evaluation

- Added TextPreprocessor class for text cleaning and lemmatization
- Integrated TfidfVectorizer and LogisticRegression into a scikit-learn Pipeline
- Implemented caching mechanism for cleaned training and test data
- Evaluated model performance on training and test sets with classification reports and confusion matrices
- Added baseline evaluation using DummyClassifier
- Processed validation data and generated predictions
- Saved validation predictions to CSV and entire pipeline to a pickle file
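
The pipeline and baseline steps listed above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `TextPreprocessor` body, the toy texts and labels, and all parameter choices are assumptions. In particular, the lemmatization step is left out so the sketch runs without downloading NLTK or spaCy resources, and caching and CSV export are not shown.

```python
import re

from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.dummy import DummyClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.pipeline import Pipeline


class TextPreprocessor(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for the PR's class.

    Lowercases, replaces non-letters with spaces, and collapses whitespace.
    The real class also lemmatizes; that step is omitted here (assumption).
    """

    def fit(self, X, y=None):
        # Stateless transformer: nothing to learn.
        return self

    def transform(self, X):
        cleaned = []
        for doc in X:
            letters_only = re.sub(r"[^a-z\s]", " ", str(doc).lower())
            cleaned.append(" ".join(letters_only.split()))
        return cleaned


# Toy data invented for illustration only.
train_texts = [
    "Great product, loved it!",
    "Terrible, broke in a day.",
    "Really great value",
    "Awful and terrible support",
]
train_labels = [1, 0, 1, 0]

# Cleaning, vectorization, and classification chained in one Pipeline.
pipeline = Pipeline(
    [
        ("clean", TextPreprocessor()),
        ("tfidf", TfidfVectorizer()),
        ("clf", LogisticRegression(max_iter=1000)),
    ]
)
pipeline.fit(train_texts, train_labels)
train_pred = pipeline.predict(train_texts)

print(classification_report(train_labels, train_pred, zero_division=0))
print(confusion_matrix(train_labels, train_pred))

# Baseline that ignores the text and always predicts the most frequent class.
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(train_texts, train_labels)
baseline_pred = baseline.predict(train_texts)
print(classification_report(train_labels, baseline_pred, zero_division=0))
```

If a pipeline like this is saved with `pickle`, the custom `TextPreprocessor` class has to be importable under the same module path when the file is loaded, because pickle stores a reference to the class rather than its code.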