SIT-SandBox/thai-bad-words

🔍 Thai Bad Words Detection Library

📖 Overview

A TypeScript library for detecting inappropriate Thai words in text content. It is intended for content moderation, chat filters, and similar use cases.

✨ Key Features

  • 🎯 Smart detection combining prefixes and root words
  • 🚫 Customizable ignore list for false positives
  • 🔄 Dynamic updates to word lists
  • 🛠️ Easy to integrate and configure

📦 Installation

Choose your preferred package manager:

# Using npm
npm install @sit-sandbox/thai-bad-words

# Using yarn
yarn add @sit-sandbox/thai-bad-words

🛠️ API Reference

Core Functions

🔍 scanBadWords(input: Record<string,any>): void

// Throws an error if bad words are found
scanBadWords("some text");
scanBadWords(["some text"]);
scanBadWords({"key":"some text"});
// Nested objects and arrays are accepted as well
scanBadWords({
  level1: {
    key1: "some text",
    key2: {
      level2: [
        {
          keyA: "some text",
          keyB: {
            level3: ["some text", "some text", "some text"],
            // ... deeper levels follow the same pattern
          },
        },
      ],
    },
  },
});
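The calls above pass strings, arrays, and nested objects to the same function. One way such input could be handled is to walk the structure recursively and gather every string value before checking it. The sketch below is only an illustration of that idea: `collectStrings` is a hypothetical helper, not part of this library's API, and the library's actual traversal may differ.

```typescript
// Hypothetical helper (not part of the library): gathers every string found
// in a string, an array, or a nested object, in traversal order.
function collectStrings(input: unknown, out: string[] = []): string[] {
  if (typeof input === "string") {
    out.push(input);
  } else if (Array.isArray(input)) {
    for (const item of input) collectStrings(item, out);
  } else if (input !== null && typeof input === "object") {
    const record = input as Record<string, unknown>;
    for (const key of Object.keys(record)) collectStrings(record[key], out);
  }
  // Numbers, booleans, null and undefined are skipped.
  return out;
}

const strings = collectStrings({
  level1: { key1: "some text", key2: { level2: ["more text", 42] } },
});
console.log(strings.length); // number of string values that were collected
```

This version does not guard against circular references; a production traversal would likely track visited objects.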

addBadWords(newBadWords: string[]): void

addBadWords(["word1", "word2"]);

🚫 addIgnoreList(newIgnoreWords: string[]): void

addIgnoreList(["false_positive1", "false_positive2"]);

addPrefixes(newPrefixes: string[]): void

addPrefixes(["prefix1", "prefix2"]);

removeBadWords(wordsToRemove: string[]): void

removeBadWords(["word1"]);

📋 getBadWords(): string[]

const badWords = getBadWords();
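The add, remove, and get functions above suggest a word list that can change at runtime. As a rough sketch of how such a list might be kept, the class below stores words in a `Set` so duplicates are dropped. `WordList` and its method names are assumptions made for illustration; the library's internal storage is not documented here.

```typescript
// Hypothetical illustration of a mutable word list; not the library's code.
class WordList {
  private readonly words = new Set<string>();

  constructor(initial: string[] = []) {
    this.add(initial);
  }

  // Adds words, ignoring blanks and duplicates.
  add(newWords: string[]): void {
    for (const word of newWords) {
      const trimmed = word.trim();
      if (trimmed.length > 0) this.words.add(trimmed);
    }
  }

  // Removes words; unknown words are silently ignored.
  remove(wordsToRemove: string[]): void {
    for (const word of wordsToRemove) this.words.delete(word.trim());
  }

  // Returns a copy so callers cannot mutate the internal set.
  list(): string[] {
    return Array.from(this.words);
  }
}
```

Returning a copy from `list()` is a design choice in this sketch: it keeps all changes flowing through `add` and `remove`.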

🌟 Usage Example

import { scanBadWords, addBadWords, addIgnoreList } from "@sit-sandbox/thai-bad-words";

// Add words to ignore
addIgnoreList(["หีบ", "สัสดี"]);

// Add new bad words
addBadWords(["โง่", "บ้า"]);

// Check text
try {
  scanBadWords("some text to check");
} catch (error) {
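  // `error` is typed `unknown` under strict TypeScript settings, so narrow it first
  console.log("❌ Bad word detected:", error instanceof Error ? error.message : error);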
}
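Because `scanBadWords` reports a match by throwing, callers that prefer a return value may want a thin wrapper. The sketch below shows one possible shape. `checkInput` and `ScanResult` are hypothetical names, and the scanner is passed in as a parameter so the snippet runs on its own; in real use you would pass the imported `scanBadWords`.

```typescript
// Hypothetical wrapper: converts a throwing scanner into a result object.
interface ScanResult {
  ok: boolean;
  reason?: string; // message of the thrown error, if any
}

function checkInput<T>(scan: (input: T) => void, input: T): ScanResult {
  try {
    scan(input);
    return { ok: true };
  } catch (error) {
    // Narrow `unknown` before reading `.message`.
    const reason = error instanceof Error ? error.message : String(error);
    return { ok: false, reason };
  }
}

// Stand-in scanner used only for this demo; replace with scanBadWords.
const demoScan = (text: string): void => {
  if (text.includes("forbidden")) throw new Error("demo match: forbidden");
};

const result = checkInput(demoScan, "some text to check");
console.log(result.ok ? "✅ Clean" : `❌ ${result.reason}`);
```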

📝 Default Configuration

🔤 Prefixes

Common prefixes used for word combinations:

["กู", "มึง", "ไอ้", "อี", "ไอ", "ผม", "คุณ", "กระผม", "เธอ", "พ่อ", "แม่", "นาย"];

🚫 Ignore List

Words that should be skipped during detection:

["หีบ", "สัสดี", "หน้าหีบ", "ตด"];

📋 Root Words

Base inappropriate words (shortened for README):

["ควย", "เหี้ย", "หี", "สัส", "เชี่ย" /* ... and more ... */];
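To show how the three lists could work together, here is a minimal sketch of one plausible matching strategy. It assumes that ignored words are masked out first, that a root word on its own counts as a match, and that a prefix directly followed by a root is reported as the combined form. These are assumptions for illustration; `findBadWords` and `DetectorConfig` are hypothetical names and the library's real algorithm may differ.

```typescript
// Hypothetical sketch of prefix + root matching with an ignore list.
interface DetectorConfig {
  prefixes: string[];
  rootWords: string[];
  ignoreList: string[];
}

function findBadWords(text: string, config: DetectorConfig): string[] {
  // Mask ignored words (longest first) so roots inside them cannot match.
  let masked = text;
  const ignores = [...config.ignoreList].sort((a, b) => b.length - a.length);
  for (const ignored of ignores) {
    if (ignored.length === 0) continue;
    masked = masked.split(ignored).join(" ".repeat(ignored.length));
  }

  const found: string[] = [];
  for (const root of config.rootWords) {
    if (root.length === 0 || !masked.includes(root)) continue;
    // Prefer reporting the prefixed form when one is present.
    const prefixed = config.prefixes
      .map((prefix) => prefix + root)
      .filter((candidate) => masked.includes(candidate));
    for (const match of prefixed.length > 0 ? prefixed : [root]) {
      if (!found.includes(match)) found.push(match);
    }
  }
  return found;
}

// "สัสดี" is on the ignore list, so the root inside it is not reported.
const demoConfig: DetectorConfig = {
  prefixes: ["ไอ้", "อี"],
  rootWords: ["สัส"],
  ignoreList: ["สัสดี"],
};
console.log(findBadWords("สัสดี", demoConfig).length === 0);
```

Plain substring matching is the simplest option but can still over-match inside longer legitimate words, which is why an ignore list is useful.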

🤝 Contributing

Contributions are welcome! Feel free to:

  • 🐛 Report bugs
  • 💡 Suggest new features
  • 📝 Improve documentation
  • 🔧 Submit pull requests

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

💬 Support

If you have any questions or need support, please:

  • 📫 Open an issue
  • 🌟 Star the repository if you find it helpful