The Wafer Anomaly Detection project uses machine learning in Python to detect and classify faulty silicon wafers in semiconductor manufacturing.
The system analyzes sensor data collected during the manufacturing process and flags each wafer as working or faulty. This helps maintain the quality of semiconductor products and reduce production defects.
The dataset contains readings from over 500 sensors for each wafer. Based on these readings, the system labels each wafer as functioning properly (+1) or faulty (-1).
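To make the data layout concrete, here is a minimal sketch of a layout check for one CSV file. It is only an illustration: the column names (`Wafer`, `Sensor-1`, ..., `Output`), the wafer IDs, and the `check_layout` helper are hypothetical and not taken from the project, and the demo uses 3 sensors instead of 500+ to stay readable.

```python
import csv
import io

VALID_LABELS = {"+1", "1", "-1"}


def check_layout(csv_text: str, n_sensors: int, label_column: str = "Output") -> list[str]:
    """Return a list of problems found; an empty list means the layout looks valid."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    problems = []

    # Hypothetical naming convention: every sensor column starts with "Sensor-".
    sensor_columns = [c for c in columns if c.startswith("Sensor-")]
    if len(sensor_columns) != n_sensors:
        problems.append(f"expected {n_sensors} sensor columns, found {len(sensor_columns)}")
    if label_column not in columns:
        problems.append(f"missing label column {label_column!r}")

    # Line 1 is the header, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        for name in sensor_columns:
            value = row[name] or ""
            if value == "":
                continue  # missing reading; left for imputation later
            try:
                float(value)
            except ValueError:
                problems.append(f"line {line_no}: {name} is not numeric: {value!r}")
        if label_column in columns and row[label_column] not in VALID_LABELS:
            problems.append(f"line {line_no}: label must be +1 or -1, got {row[label_column]!r}")
    return problems


good = "Wafer,Sensor-1,Sensor-2,Sensor-3,Output\nW-001,0.5,,1.2,+1\nW-002,0.7,3.1,1.1,-1\n"
bad = "Wafer,Sensor-1,Sensor-2,Output\nW-003,abc,2.0,0\n"

print(check_layout(good, n_sensors=3))
for problem in check_layout(bad, n_sensors=3):
    print(problem)
```

Empty sensor cells are accepted here on purpose, since missing readings are filled in later during preprocessing.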
- Validates the filename and presence of all columns
- Checks the name and data type of each column
- Creates and connects to the database
- Creates tables in the database
- Inserts files into the tables
- Exports data from the database as CSV files for model training
- Data Preprocessing
- Checks for null values and uses KNN Imputer to fill them
- Removes columns with a standard deviation of 0
- Uses K-means to cluster preprocessed data
- Trains separate models for each cluster to achieve better accuracy
- Selects the better of the Random Forest and XGBoost models for each cluster
- Tunes each model's hyperparameters with grid search cross-validation (GridSearchCV)
- At prediction time, validates the data, inserts it into the database, preprocesses it, and performs clustering
- Loads the appropriate model based on the cluster group and makes predictions
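
The preprocessing, clustering, training, and prediction steps listed above can be sketched with scikit-learn as follows. This is a rough illustration under stated assumptions, not the project's actual code: the data is synthetic, the `Sensor-N` column names, the `train` and `predict` helpers, the choice of 2 clusters, and the parameter grid are all made up for the example, and only Random Forest is tuned so that the sketch does not depend on XGBoost.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import KNNImputer
from sklearn.model_selection import GridSearchCV


def train(features: pd.DataFrame, labels: np.ndarray, n_clusters: int = 2, seed: int = 0) -> dict:
    """Fit the imputer, K-means, and one tuned classifier per cluster."""
    # KNNImputer discards columns with no observed values, so remove them up front.
    features = features.dropna(axis=1, how="all")
    imputer = KNNImputer(n_neighbors=3)
    X = pd.DataFrame(imputer.fit_transform(features), columns=features.columns)
    # Drop sensors whose readings never vary (standard deviation of 0).
    kept = list(X.columns[X.std() > 0])
    X = X[kept]

    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X)
    models = {}
    for cluster in range(n_clusters):
        mask = kmeans.labels_ == cluster
        search = GridSearchCV(
            RandomForestClassifier(random_state=seed),
            param_grid={"n_estimators": [10, 50], "max_depth": [2, None]},  # illustrative grid
            cv=3,
        )
        search.fit(X[mask], labels[mask])
        models[cluster] = search.best_estimator_
    return {"inputs": list(features.columns), "imputer": imputer,
            "kept": kept, "kmeans": kmeans, "models": models}


def predict(features: pd.DataFrame, fitted: dict) -> np.ndarray:
    """Impute, keep the same columns, find each row's cluster, use that cluster's model."""
    inputs = features[fitted["inputs"]]
    X = pd.DataFrame(fitted["imputer"].transform(inputs), columns=fitted["inputs"])[fitted["kept"]]
    clusters = fitted["kmeans"].predict(X)
    predictions = np.empty(len(X), dtype=int)
    for cluster, model in fitted["models"].items():
        mask = clusters == cluster
        if mask.any():
            predictions[mask] = model.predict(X[mask])
    return predictions


# Synthetic stand-in for the wafer data (assumption: 6 sensors instead of 500+).
rng = np.random.default_rng(0)
n = 120
base = rng.normal(size=(n, 6))
y = np.where(base[:, 1] > 0, 1, -1)   # +1 working, -1 faulty
base[: n // 2] += 8.0                  # two well-separated groups for K-means
frame = pd.DataFrame(base, columns=[f"Sensor-{i + 1}" for i in range(6)])
frame["Sensor-7"] = 1.0                # constant sensor, should be dropped
frame.iloc[::10, 2] = np.nan           # a few missing readings

fitted = train(frame, y)
predictions = predict(frame, fitted)
print(fitted["kept"])
print((predictions == y).mean())       # accuracy on the training data, for illustration only
```

An XGBoost candidate could be tuned in the same loop and compared against the Random Forest by cross-validation score. Note that recent XGBoost versions appear to expect class labels encoded as 0 and 1, so the -1/+1 labels would likely need remapping first.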
Feel free to reach out if you have any questions or suggestions.