The goal of this project is to build a model that can automatically identify the language of a given text. Language identification is essential for various applications, including machine translation, multilingual document tracking, and electronic devices (e.g., mobile phones, laptops).
LingualSense is a deep learning project for classifying text languages. This README provides step-by-step instructions from data analysis to deployment.
- Perform EDA to analyze your dataset.
- Check the distribution of languages and clean any irregularities in the dataset.
- Tokenize the text data and pad sequences to a uniform length for model compatibility.
- Save the tokenizer and label encoder for future use in the app.
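The preprocessing steps above can be sketched in plain Python. This is a minimal illustration of checking the language distribution and of tokenizing/padding; the real pipeline would typically use pandas' `value_counts`, Keras' `Tokenizer`/`pad_sequences`, and joblib for persistence, and the `langs`/`texts` sample data below are hypothetical:

```python
from collections import Counter

# --- EDA: distribution of languages (hypothetical sample data) ---
langs = ["English", "French", "English", "Spanish", "English"]
print(Counter(langs))  # reveals class imbalance, if any

# --- Tokenize and pad: a pure-Python stand-in for Keras' Tokenizer
#     and pad_sequences; id 0 is reserved for padding ---
def build_vocab(texts):
    """Map each character to a positive integer id."""
    vocab = {}
    for text in texts:
        for ch in text:
            vocab.setdefault(ch, len(vocab) + 1)
    return vocab

def texts_to_padded(texts, vocab, maxlen):
    """Encode texts as id sequences, then pad/truncate to maxlen."""
    seqs = [[vocab.get(ch, 0) for ch in t][:maxlen] for t in texts]
    return [s + [0] * (maxlen - len(s)) for s in seqs]

texts = ["hello", "bonjour"]
vocab = build_vocab(texts)
padded = texts_to_padded(texts, vocab, maxlen=10)
# every row of `padded` now has length 10, ready for an embedding layer
```

In the actual pipeline the fitted tokenizer and label encoder would then be saved (e.g., with `joblib.dump`) so `app.py` can reuse them at prediction time.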
- Use a GRU-based model for text classification.
- Train the model using tokenized and padded sequences.
- Save the trained model as `gru_model.h5`.
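A GRU classifier of this kind can be sketched with `tensorflow.keras`; the layer sizes, `vocab_size`, `maxlen`, and `num_classes` below are illustrative assumptions, not the project's actual hyperparameters:

```python
import tensorflow as tf

def build_gru_model(vocab_size=5000, maxlen=100, num_classes=17):
    """A minimal GRU text classifier (illustrative hyperparameters)."""
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, 64),          # token ids -> vectors
        tf.keras.layers.GRU(64),                            # recurrent encoder
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_gru_model()
# after training on the padded sequences, e.g. model.fit(X, y, ...):
# model.save("gru_model.h5")
```

Training would use the tokenized, padded sequences from the preprocessing step, with integer class labels from the label encoder.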
- Create a Streamlit app for real-time predictions.
- Include an input text area, model loading, and prediction functionality.
- Add a styled user interface for better interaction.
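The app's prediction step ultimately reduces to picking the class with the highest probability and mapping it back to a language name. A minimal sketch of that decoding logic (in `app.py`, the probability vector would come from the loaded `gru_model.h5` and the label names from `label_encoder.joblib`; the label set below is illustrative):

```python
def decode_prediction(probs, labels):
    """Return the label whose predicted probability is highest."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return labels[best]

labels = ["English", "French", "Spanish"]           # illustrative label set
print(decode_prediction([0.1, 0.7, 0.2], labels))  # -> French
```

In the Streamlit app this function would be called on the model's output after the user submits text, and the result displayed on the page.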
- Clone the repository:
  ```shell
  git clone https://github.com/Springboard429/LingualSense_Infosys_Internship_Oct2024.git
  cd LingualSense
  ```
- Create a virtual environment:
  - Windows:
    ```shell
    python -m venv lingualsense_env
    lingualsense_env\Scripts\activate
    ```
  - Mac/Linux:
    ```shell
    python -m venv lingualsense_env
    source lingualsense_env/bin/activate
    ```
- Install dependencies:
  ```shell
  pip install -r requirements.txt
  ```
- Place the following files in the project directory: `gru_model.h5`, `tokenizer.joblib`, `label_encoder.joblib`.
- Execute the following command:
  ```shell
  streamlit run app.py
  ```
- Open the local URL (e.g., http://localhost:XXXX) to access the app.
- Input text in the text area.
- Click "Detect Languages" to get the predicted language of the text.