diff --git a/README.md b/README.md
new file mode 100644
index 0000000..8dd6763
--- /dev/null
+++ b/README.md
@@ -0,0 +1,193 @@
+# Smart Code Autocomplete Engine (C++)
+
+The Smart Code Autocomplete Engine is a C++ project designed to suggest intelligent code completions in real time, similar to how modern IDEs like VS Code or IntelliJ offer autocomplete suggestions.
+It applies core Data Structures and Algorithms (DSA) concepts such as Tries, Heaps, and LRU (Least Recently Used) caching to efficiently predict the next most probable code tokens based on user input frequency and context.
+
+## Idea
+When a programmer starts typing part of a keyword or function name, the engine quickly:
+Searches through a Trie (prefix tree) for all words starting with that prefix.
+Uses a Heap to rank suggestions by frequency or relevance.
+Employs an LRU Cache to prioritize recently used or selected completions, making the system adaptive over time.
+
+This results in fast, memory-efficient, and intelligent autocomplete suggestions that simulate the logic behind real-world code editors, but built purely from scratch using fundamental DSA concepts.
+
+
+## DSA Concepts used
+
+### 1. Trie (Prefix Tree)
+
+- Files: tst.h, tst.cpp, tst_test.cpp
+- Used for storing and retrieving words efficiently based on their prefixes.
+- Enables O(L) time complexity lookups (where L = length of prefix).
+- Supports real-time suggestions as the user types each character.
+
+🔹 Concepts used: String manipulation, recursion, tree traversal, prefix-based searching.
+
+### 2. Min-Heap / Max-Heap
+
+- Files: minheap.h, minheap.cpp, heap_test.cpp
+- Maintains the top N most frequent or relevant words efficiently.
+- Provides constant-time access to the best-ranked suggestion.
+- Used during ranking and sorting of autocomplete results.
+
+🔹 Concepts used: Binary heap operations, priority queue logic, partial sorting.
+
+### 3. LRU (Least Recently Used) Cache
+- Files: lru.h, lru.cpp, lru_test.cpp
+- Stores recently used suggestions for quick access.
+- Improves responsiveness by avoiding repetitive Trie lookups.
+- Implemented using a combination of doubly linked list + hash map.
+
+🔹 Concepts used: Linked lists, hashing, cache eviction policy.
+
+### 4. KMP (Knuth–Morris–Pratt) Algorithm
+- Files: kmp.h, kmp.cpp
+- Used for efficient substring pattern matching between typed input and stored code tokens.
+- Ensures fast lookup of partial matches even in large word lists.
+
+🔹 Concepts used: Prefix table computation, linear-time pattern searching.
+
+### 5. Graph Data Structure
+
+- Files: graph.h, graph.cpp
+- Represents relationships between tokens or code components.
+- Can model transitions between function calls or variable dependencies for context-aware suggestions.
+
+🔹 Concepts used: Adjacency list representation, graph traversal (BFS/DFS).
+
+### 6. Stack
+
+- Files: stack.h, stack.cpp
+- Used internally for recursive operations, backtracking, or maintaining function call hierarchies.
+- Simplifies control flow during traversal or undo operations in text editing logic.
+
+🔹 Concepts used: LIFO operations, template-based generic stack implementation.
+
+### 7. Ranking System
+- Files: ranker.h, ranker.cpp
+- Combines frequency and recency scores from Trie, Heap, and LRU cache to rank autocomplete suggestions.
+- Implements a weighted scoring system for realistic, adaptive predictions.
+
+🔹 Concepts used: Comparator functions, dynamic sorting, frequency-based ranking.
+
+### 8. Frequency Storage
+- Files: freq_store.h, freq_store.cpp, frequency.txt
+- Keeps track of how often each word is used.
+- Updates dynamically after every suggestion selection, making the model “learn” over time.
+
+🔹 Concepts used: File handling, hash mapping, frequency analysis.
+
+---
+
+## Features
+- Insert code keywords or phrases
+- Autocomplete suggestions based on prefix
+- Suggestions ranked by frequency
+- Snippet support (e.g., `fori` → `for (int i = 0; i < n; i++)`)
+- Practical demonstration of Trie + Hash Map + Heap working together
+
+## Tech Stack Used
+
+| Category | Technologies / Tools | Description |
+|-----------|----------------------|--------------|
+| **Language** | C++ (C++17 Standard) | Core implementation of all modules, data structures, and algorithms. |
+| **Build System** | GNU Make (Makefile) | Used for compiling multiple source files and linking into a single executable. |
+| **Compiler** | GCC / G++ | To compile and build the C++ source files efficiently. |
+| **Version Control** | Git + GitHub | For code management, versioning, and collaboration among team members. |
+| **Testing Framework** | Custom test files (`tests/`) | Unit testing for core modules like Trie, LRU, and Heap. |
+| **Data Storage** | Text files (`data/words.txt`, `data/frequency.txt`) | Stores training words and their frequency for suggestion ranking. |
+| **Editor / IDE** | VS Code | Primary development environment for coding, debugging, and project organization. |
+
+---
+
+
+## How It Works
+1. User enters keywords (e.g., `print`, `printf`, `private`, etc.)
+2. Trie stores all words for fast prefix lookup
+3. Hash map tracks how often each word is used
+4. Heap finds the most frequent matches for a prefix
+5. Snippets expand small abbreviations into full code blocks
+
+## Example Usage
+- Type: `pri`
+- Suggestions: `print`, `printf`, `private` (ranked)
+- Type: `fori`
+- Expanded snippet: `for (int i = 0; i < n; i++)`
+
+## Setup Instructions
+---
+### 1. Clone the repository
+- git clone https://github.com/maahi271005/Smart-Code-Autocomplete-Engine-DSA-Project
+- cd smart_autocomplete
+
+---
+
+
+### 2. (Optional) Create and activate virtual environment
+
+If your project uses Python utilities or scripts (e.g., preprocessing):
+
+- python3 -m venv dsavenv
+- source dsavenv/bin/activate
+
+---
+
+### 3. Install build tools (for Linux/Ubuntu)
+- sudo apt update
+- sudo apt install build-essential
+
+---
+### 4. Build the project
+
+Use the provided Makefile:
+
+- make clean
+- make
+
+
+This will generate the executable:
+
+- ./smart_autocomplete
+
+---
+### 5. Run the program
+./smart_autocomplete
+
+
+If you encounter GLIBCXX_3.4.32 not found, simply rebuild the project on your machine (make clean && make) to link against your local C++ standard library.
+
+---
+## Running Tests
+
+To verify components:
+
+- g++ tests/heap_test.cpp -o heap_test && ./heap_test
+- g++ tests/lru_test.cpp -o lru_test && ./lru_test
+- g++ tests/tst_test.cpp -o tst_test && ./tst_test
+
+---
+## Applications
+- Code Editors (VS Code, JetBrains)
+- Search Engines
+- Chatbots
+- AI-assisted development tools
+
+---
+
+
+## Contributors
+| Name | GitHub |
+|------|---------|
+| Tanisha Ray | [![GitHub](https://img.shields.io/badge/-@tanisharay-181717?logo=github&style=flat)](https://github.com/coderTanisha22) |
+| Maahi Ratanpara | [![GitHub](https://img.shields.io/badge/-@maahiratanpara-181717?logo=github&style=flat)](https://github.com/maahi271005) |
+| Anika Sharma | [![GitHub](https://img.shields.io/badge/-@anikasharma-181717?logo=github&style=flat)](https://github.com/Anika438) |
+| Akshita Maheshwari | [![GitHub](https://img.shields.io/badge/-@akshitamaheshwari-181717?logo=github&style=flat)](https://github.com/AkshitaM1234) |
+
+
+---
+## Educational Purpose
+This project demonstrates the application of DSA in a real-world scenario, showing how core structures like tries, heaps, and caches can combine to form an intelligent system used in everyday developer tools.
+//test
+
+
+
diff --git a/setup_cpp.sh b/setup_cpp.sh
deleted file mode 100755
index a66ca3e..0000000
--- a/setup_cpp.sh
+++ /dev/null
@@ -1,40 +0,0 @@
-#!/bin/bash
-
-# === Smart Autocomplete (C++) Project Scaffolder ===
-# Run: bash setup_cpp.sh
-
-ROOT="smart_autocomplete"
-echo "📁 Creating C++ project: $ROOT"
-mkdir -p $ROOT/{include,src,data,tests}
-
-# --- Header files ---
-touch $ROOT/include/{tst.h,minheap.h,lru.h,kmp.h,stack.h,graph.h,freq_store.h,ranker.h}
-
-# --- Source files ---
-touch $ROOT/src/{main.cpp,tst.cpp,minheap.cpp,lru.cpp,kmp.cpp,stack.cpp,graph.cpp,freq_store.cpp,ranker.cpp}
-
-# --- Data + Tests ---
-touch $ROOT/data/seeds.txt
-touch $ROOT/tests/{tst_test.cpp,lru_test.cpp,heap_test.cpp}
-
-# --- README + Makefile ---
-touch $ROOT/README.md
-cat << 'EOF' > $ROOT/Makefile
-CXX = g++
-CXXFLAGS = -std=c++17 -O2 -Wall -Iinclude
-SRC = $(wildcard src/*.cpp)
-OBJ = $(SRC:.cpp=.o)
-TARGET = smart_autocomplete
-
-all: \$(TARGET)
-
-\$(TARGET): \$(OBJ)
-	\$(CXX) \$(CXXFLAGS) -o \$@ \$(OBJ)
-
-clean:
-	rm -f \$(OBJ) \$(TARGET)
-EOF
-
-echo "✅ C++ Smart Autocomplete project scaffold created!"
-tree $ROOT
-
diff --git a/smart_autocomplete/.gitignore b/smart_autocomplete/.gitignore
new file mode 100644
index 0000000..8d89da1
--- /dev/null
+++ b/smart_autocomplete/.gitignore
@@ -0,0 +1,10 @@
+#virtual environments
+venv/
+env/
+dsavenv/
+
+# Ignore object files and binaries
+*.o
+*.out
+*.exe
+smart_autocomplete
diff --git a/smart_autocomplete/Makefile b/smart_autocomplete/Makefile
index bc55677..2eaf783 100644
--- a/smart_autocomplete/Makefile
+++ b/smart_autocomplete/Makefile
@@ -4,10 +4,10 @@ SRC = $(wildcard src/*.cpp)
 OBJ = $(SRC:.cpp=.o)
 TARGET = smart_autocomplete
 
-all: \$(TARGET)
+all: $(TARGET)
 
-\$(TARGET): \$(OBJ)
-	\$(CXX) \$(CXXFLAGS) -o \$@ \$(OBJ)
+$(TARGET): $(OBJ)
+	$(CXX) $(CXXFLAGS) -o $@ $(OBJ)
 
 clean:
-	rm -f \$(OBJ) \$(TARGET)
+	rm -f $(OBJ) $(TARGET)
diff --git a/smart_autocomplete/README.md b/smart_autocomplete/README.md
deleted file mode 100644
index e69de29..0000000
diff --git a/smart_autocomplete/data/frequency.txt b/smart_autocomplete/data/frequency.txt
new file mode 100644
index 0000000..decef56
--- /dev/null
+++ b/smart_autocomplete/data/frequency.txt
@@ -0,0 +1,4 @@
+for 1
+const_cast 1
+continue 1
+delete 1
diff --git a/smart_autocomplete/data/seeds.txt b/smart_autocomplete/data/seeds.txt
deleted file mode 100644
index e69de29..0000000
diff --git a/smart_autocomplete/data/words.txt b/smart_autocomplete/data/words.txt
new file mode 100644
index 0000000..c6c8503
--- /dev/null
+++ b/smart_autocomplete/data/words.txt
@@ -0,0 +1,88 @@
+for
+while
+if
+else
+switch
+case
+break
+continue
+return
+int
+float
+double
+char
+bool
+void
+string
+vector
+map
+set
+list
+queue
+stack
+auto
+const
+static
+class
+struct
+namespace
+public
+private
+protected
+virtual
+override
+template
+typename
+include
+iostream
+std
+cout
+cin
+endl
+printf
+scanf
+main
+new
+delete
+try
+catch
+throw
+this
+nullptr
+true
+false
+array
+deque
+unordered_map
+unordered_set
+algorithm
+sort
+find
+push_back
+pop_back
+size
+empty
+clear
+begin
+end
+iterator +function +lambda +constexpr +inline +extern +typedef +using +enum +union +operator +friend +mutable +volatile +register +sizeof +typeid +dynamic_cast +static_cast +reinterpret_cast +const_cast \ No newline at end of file diff --git a/smart_autocomplete/include/freq_store.h b/smart_autocomplete/include/freq_store.h index e69de29..ee3b13b 100644 --- a/smart_autocomplete/include/freq_store.h +++ b/smart_autocomplete/include/freq_store.h @@ -0,0 +1,21 @@ +#ifndef FREQ_STORE_H +#define FREQ_STORE_H + +#include +#include + +class FreqStore { +private: + std::unordered_map frequencies; + std::string filePath; + +public: + FreqStore(const std::string& path); + void load(); + void save(); + int get(const std::string& token); + void bump(const std::string& token, int amount = 1); + void set(const std::string& token, int freq); +}; + +#endif \ No newline at end of file diff --git a/smart_autocomplete/include/graph.h b/smart_autocomplete/include/graph.h index e69de29..16f9414 100644 --- a/smart_autocomplete/include/graph.h +++ b/smart_autocomplete/include/graph.h @@ -0,0 +1,19 @@ +#ifndef GRAPH_H +#define GRAPH_H + +#include +#include +#include + +class CooccurrenceGraph { +private: + std::unordered_map> adjacencyList; + +public: + void addEdge(const std::string& from, const std::string& to); + double getBoost(const std::string& from, const std::string& to); + void display(); + int getEdgeWeight(const std::string& from, const std::string& to); +}; + +#endif \ No newline at end of file diff --git a/smart_autocomplete/include/kmp.h b/smart_autocomplete/include/kmp.h index e69de29..51a7cfb 100644 --- a/smart_autocomplete/include/kmp.h +++ b/smart_autocomplete/include/kmp.h @@ -0,0 +1,17 @@ +#ifndef KMP_H +#define KMP_H + +#include +#include + +using namespace std; + +class KMP { +private: + static vector computeLPS(const string& pattern); +public: + static bool contains(const string& text, const string& pattern); + static vector findAll(const string& text, const 
string& pattern); +}; + +#endif diff --git a/smart_autocomplete/include/lru.h b/smart_autocomplete/include/lru.h index e69de29..eb7078c 100644 --- a/smart_autocomplete/include/lru.h +++ b/smart_autocomplete/include/lru.h @@ -0,0 +1,45 @@ +#ifndef LRU_H +#define LRU_H + +#include +#include +#include +using namespace std; + +struct Node{ + string key; + vector val; + Node* prev; + Node* next; + + Node(string k, vectorv){ + key=k; + val=v; + prev=nullptr; + next=nullptr; + } +}; + +class lru_cache{ +private: + int cap; + unordered_mapcacheMap; + Node* head; + Node* tail; + + void addNodeToFront(Node* node); + void removeNode(Node* node); + void moveNodeToFront(Node* node); + void removeLRUNode(); + +public: + lru_cache(int cap); + vectorget(const string& key); + void put(const string& key, const vector& val); + bool exists(const string& key); + void clear(); +}; + +using LRUCache = lru_cache; + +#endif diff --git a/smart_autocomplete/include/minheap.h b/smart_autocomplete/include/minheap.h index e69de29..4b1133f 100644 --- a/smart_autocomplete/include/minheap.h +++ b/smart_autocomplete/include/minheap.h @@ -0,0 +1,45 @@ +#ifndef MINHEAP_H +#define MINHEAP_H + +#include +#include +#include + +class MinHeap { +private: + std::vector> heap; + int maxSize; + + void heapifyUp(int index); + void heapifyDown(int index); + int parent(int i){ + return (i - 1) / 2; + } + int leftChild(int i){ + return 2 * i + 1; + } + int rightChild(int i){ + return 2 * i + 2; + } + + public: + MinHeap(int k); + void insert(double score, const std::string& word); + + std::pair extractMin(); + std::pair getMin(); + int size() const{ + return heap.size(); + } + bool isEmpty() const { + return heap.empty(); + } + bool isFull() const { + return heap.size() >= maxSize; + } + std::vector> getAll(); + void clear(); +}; + +#endif + diff --git a/smart_autocomplete/include/ranker.h b/smart_autocomplete/include/ranker.h index e69de29..b08a397 100644 --- a/smart_autocomplete/include/ranker.h +++ 
b/smart_autocomplete/include/ranker.h @@ -0,0 +1,24 @@ +#ifndef RANKER_H +#define RANKER_H + +#include +#include +#include +#include "freq_store.h" +#include "graph.h" + +class Ranker { + private: + FreqStore* freqStore; + CooccurrenceGraph* graph; + std::string lastToken; + + public: + Ranker(FreqStore *fs,CooccurrenceGraph *g); + void setLastToken(const std::string &token); + double computeScore(const std::string &token); + std::vector> rankResults(const std::vector &candidates,int k); +}; + +#endif + diff --git a/smart_autocomplete/include/stack.h b/smart_autocomplete/include/stack.h index e69de29..66c00ed 100644 --- a/smart_autocomplete/include/stack.h +++ b/smart_autocomplete/include/stack.h @@ -0,0 +1,22 @@ +#ifndef STACK_H +#define STACK_H + +#include +#include +#include + +class UndoRedoStack { +private: + std::stack> undoStack; + std::stack> redoStack; + +public: + void pushInsert(int position, const std::string& text); + std::pair undo(); + std::pair redo(); + bool canUndo() const { return !undoStack.empty(); } + bool canRedo() const { return !redoStack.empty(); } + void clearRedo(); +}; + +#endif \ No newline at end of file diff --git a/smart_autocomplete/include/tst.h b/smart_autocomplete/include/tst.h index e69de29..6d58560 100644 --- a/smart_autocomplete/include/tst.h +++ b/smart_autocomplete/include/tst.h @@ -0,0 +1,39 @@ +#ifndef TST_H +#define TST_H + +#include +#include +#include + +struct TSTNode { + char data; + bool isEndOfString; + std::shared_ptr left; + std::shared_ptr eq; + std::shared_ptr right; + + TSTNode(char c) : data(c), isEndOfString(false), + left(nullptr), eq(nullptr), right(nullptr) {} +}; + +class TST { +private: + std::shared_ptr root; + + std::shared_ptr insertUtil(std::shared_ptr node, + const std::string& word, int index); + + void collectWords(std::shared_ptr node, + std::string prefix, + std::vector& results); + + std::shared_ptr searchPrefix(const std::string& prefix); + +public: + TST(); + void insert(const std::string& 
word); + std::vector prefixSearch(const std::string& prefix, int k = 10); + bool search(const std::string& word); + void getAllWords(std::vector& results); +}; +#endif \ No newline at end of file diff --git a/smart_autocomplete/src/freq_store.cpp b/smart_autocomplete/src/freq_store.cpp index e69de29..ae0e641 100644 --- a/smart_autocomplete/src/freq_store.cpp +++ b/smart_autocomplete/src/freq_store.cpp @@ -0,0 +1,57 @@ +#include "../include/freq_store.h" +#include +#include + +FreqStore::FreqStore(const std::string& path) : filePath(path) { + load(); +} + +void FreqStore::load() { + std::ifstream file(filePath); + + if (!file.is_open()) { + return; + } + + std::string line; + while (std::getline(file, line)) { + std::istringstream iss(line); + std::string token; + int freq; + + if (iss >> token >> freq) { + frequencies[token] = freq; + } + } + + file.close(); +} + +void FreqStore::save() { + std::ofstream file(filePath); + + if (!file.is_open()) { + return; + } + + for (const auto& [token, freq] : frequencies) { + file << token << " " << freq << "\n"; + } + + file.close(); +} + +int FreqStore::get(const std::string& token) { + auto it = frequencies.find(token); + return (it != frequencies.end()) ? 
it->second : 0; +} + +void FreqStore::bump(const std::string& token, int amount) { + frequencies[token] += amount; + save(); +} + +void FreqStore::set(const std::string& token, int freq) { + frequencies[token] = freq; + save(); +} \ No newline at end of file diff --git a/smart_autocomplete/src/graph.cpp b/smart_autocomplete/src/graph.cpp index e69de29..c8cf135 100644 --- a/smart_autocomplete/src/graph.cpp +++ b/smart_autocomplete/src/graph.cpp @@ -0,0 +1,48 @@ +#include "../include/graph.h" +#include +#include + +void CooccurrenceGraph::addEdge(const std::string& from, const std::string& to) { + adjacencyList[from][to]++; +} + +double CooccurrenceGraph::getBoost(const std::string& from, const std::string& to) { + if (adjacencyList.find(from) == adjacencyList.end()) { + return 0.0; + } + + auto& neighbors = adjacencyList[from]; + + if (neighbors.find(to) == neighbors.end()) { + return 0.0; + } + + int weight = neighbors[to]; + return std::log(1 + weight) * 0.5; +} + +int CooccurrenceGraph::getEdgeWeight(const std::string& from, const std::string& to) { + if (adjacencyList.find(from) == adjacencyList.end()) { + return 0; + } + + auto& neighbors = adjacencyList[from]; + + if (neighbors.find(to) == neighbors.end()) { + return 0; + } + + return neighbors[to]; +} + +void CooccurrenceGraph::display() { + std::cout << "\n=== Co-occurrence Graph ===" << std::endl; + + for (const auto& [from, neighbors] : adjacencyList) { + std::cout << from << " -> "; + for (const auto& [to, weight] : neighbors) { + std::cout << to << "(" << weight << ") "; + } + std::cout << std::endl; + } +} \ No newline at end of file diff --git a/smart_autocomplete/src/kmp.cpp b/smart_autocomplete/src/kmp.cpp index e69de29..dc88c99 100644 --- a/smart_autocomplete/src/kmp.cpp +++ b/smart_autocomplete/src/kmp.cpp @@ -0,0 +1,83 @@ +#include "../include/kmp.h" +using namespace std; + +vector KMP::computeLPS (const string& pattern){ + int n = pattern.length(); + vector lps(n,0); + + int len = 0; + int i=1; + 
+ while(i lps = computeLPS(pattern); + + while(i KMP::findAll (const string& text, const string& pattern){ + vectorpos; + + int n = text.length(); + int m = pattern.length(); + + vector lps = computeLPS(pattern); + + int i=0; + int j=0; + + while(icap = cap; + head = nullptr; + tail = nullptr; +} + +void lru_cache::addNodeToFront(Node* node){ + if(node == nullptr) return; + node->next = head; + node->prev = nullptr; + + if(head != nullptr) head->prev = node; + head = node; + + if(tail == nullptr) tail = node; +} + +void lru_cache::removeNode(Node* node){ + if(node == nullptr) return; + if(node == head){ + head = node->next; + if(head) head->prev = nullptr; + } + else if(node == tail){ + tail = node->prev; + if(tail) tail->next = nullptr; + } + else{ + node->prev->next = node->next; + node->next->prev = node->prev; + } + node->prev = nullptr; + node->next = nullptr; +} + +void lru_cache::moveNodeToFront(Node* node){ + removeNode(node); + addNodeToFront(node); +} + +void lru_cache::removeLRUNode(){ + if(tail == nullptr) return; + + cacheMap.erase(tail->key); + Node* old_tail = tail; + removeNode(old_tail); + delete old_tail; +} + +vector lru_cache::get(const string& key){ + if(cacheMap.find(key) == cacheMap.end()) return {}; + Node* node = cacheMap[key]; + moveNodeToFront(node); + return node->val; +} + +void lru_cache::put(const string& key, const vector& val){ + if(cacheMap.find(key) != cacheMap.end()){ + Node* node = cacheMap[key]; + node->val = val; + moveNodeToFront(node); + } + else{ + Node* new_node = new Node(key,val); + addNodeToFront(new_node); + cacheMap[key] = new_node; + + if((int)cacheMap.size()>cap) removeLRUNode(); + } +} + +bool lru_cache::exists(const string& key) { + return cacheMap.find(key) != cacheMap.end(); +} + +void lru_cache::clear() { + Node* curr = head; + while (curr != nullptr) { + Node* next = curr->next; + delete curr; + curr = next; + } + + head = nullptr; + tail = nullptr; + cacheMap.clear(); +} diff --git 
a/smart_autocomplete/src/main.cpp b/smart_autocomplete/src/main.cpp index e69de29..ecaedf5 100644 --- a/smart_autocomplete/src/main.cpp +++ b/smart_autocomplete/src/main.cpp @@ -0,0 +1,271 @@ +#include +#include +#include +#include +#include +#include +#include "../include/tst.h" +#include "../include/minheap.h" +#include "../include/lru.h" +#include "../include/kmp.h" +#include "../include/stack.h" +#include "../include/graph.h" +#include "../include/freq_store.h" +#include "../include/ranker.h" + +class AutocompleteEngine { +private: + TST tst; + LRUCache cache; + FreqStore freqStore; + CooccurrenceGraph graph; + Ranker ranker; + UndoRedoStack undoRedo; + std::string lastAccepted; + bool useSubstringSearch; + + void loadSeeds(const std::string& filename) { + std::ifstream file(filename); + if (!file.is_open()){ + std::cerr << "Warning: Could not open " << filename << std::endl; + return; + } + + std::string word; + int count = 0; + while (file >> word) { + if (!word.empty()) { + tst.insert(word); + count++; + } + } + file.close(); + std::cout << "Loaded " << count << " tokens from seed file." 
<< std::endl; + } + + std::vector substringSearch(const std::string& prefix){ + std::vector allWords; + tst.getAllWords(allWords); + + std::vector results; + for (const auto& word : allWords) { + if (KMP::contains(word, prefix)) { + results.push_back(word); + } + } + + return results; + } + + +public: + AutocompleteEngine() + : cache(50), + freqStore("data/frequency.txt"), + ranker(&freqStore, &graph), + useSubstringSearch(false){ + + loadSeeds("data/words.txt"); + } + + std::vector> getSuggestions(const std::string& prefix, int k = 5){ + if (prefix.empty()) { + return std::vector>(); + } + + if (cache.exists(prefix)) { + auto cached = cache.get(prefix); + std::vector> result; + for (const auto& token : cached) { + result.push_back({token, freqStore.get(token)}); + } + return result; + } + + std::vector candidates = tst.prefixSearch(prefix, k * 2); + + if (candidates.size()<3 && useSubstringSearch) { + auto substringResults = substringSearch(prefix); + candidates.insert(candidates.end(), substringResults.begin(), substringResults.end()); + + std::sort(candidates.begin(), candidates.end()); + candidates.erase(std::unique(candidates.begin(), candidates.end()), candidates.end()); + } + + ranker.setLastToken(lastAccepted); + auto ranked = ranker.rankResults(candidates, k); + + std::vector toCache; + for (const auto& [token, score] : ranked){ + toCache.push_back(token); + } + cache.put(prefix, toCache); + + return ranked; + } + + void acceptSuggestion(const std::string& token){ + freqStore.bump(token, 1); + + if (!lastAccepted.empty()){ + graph.addEdge(lastAccepted, token); + } + + undoRedo.pushInsert(0, token); + lastAccepted = token; + + std::cout << "Accepted: "<< token << std::endl; + } + + void bumpToken(const std::string& token) { + freqStore.bump(token, 5); + std::cout << "Bumped frequency of '" << token << "' by 5" << std::endl; + } + + void performUndo() { + if (undoRedo.canUndo()) { + auto [pos, token] = undoRedo.undo(); + std::cout << "Undo: Removed '" << 
token << "'" << std::endl; + } else { + std::cout << "Nothing to undo" << std::endl; + } + } + + void performRedo() { + if (undoRedo.canRedo()) { + auto [pos, token] = undoRedo.redo(); + std::cout << "Redo: Restored '" << token << "'" << std::endl; + } else { + std::cout << "Nothing to redo" << std::endl; + } + } + + void toggleSubstringSearch() { + useSubstringSearch = !useSubstringSearch; + std::cout << "Substring search: " << (useSubstringSearch ? "ON" : "OFF") << std::endl; + } + + void displayGraph() { + graph.display(); + } + + void showHelp() { + std::cout << "\n=== Smart Autocomplete Engine ===" << std::endl; + std::cout << "\nCommands:" << std::endl; + std::cout << ":help - Show this help message" << std::endl; + std::cout << ":exit or :q - Exit the program" << std::endl; + std::cout << " :bump - Increase frequency of a token" << std::endl; + std::cout << ":undo - Undo last accepted token" << std::endl; + std::cout << ":redo - Redo last undone token" << std::endl; + std::cout << ":toggle_contains- Toggle substring search" << std::endl; + std::cout << ":graph - Display co-occurrence graph" << std::endl; + std::cout << "\nUsage:" << std::endl; + std::cout << " - Type a prefix to get suggestions" << std::endl; + std::cout << " - Select by number or type the full token" << std::endl; + std::cout << std::endl; + } +}; + +int main() { + AutocompleteEngine engine; + + std::cout << "\n=== Smart Autocomplete Engine ===" << std::endl; + std::cout << "Type ':help' for commands" << std::endl; + std::cout << "Type ':exit' or ':q' to quit\n" << std::endl; + + std::string input; + + while (true) { + std::cout << "> "; + std::getline(std::cin, input); + + if (input.empty()) { + continue; + } + + if (input == ":exit" || input == ":q") { + std::cout << "Goodbye!" 
<< std::endl; + break; + } + + + if (input == ":help") { + engine.showHelp(); + continue; + } + + if (input == ":undo") { + engine.performUndo(); + continue; + } + + if (input == ":redo") { + engine.performRedo(); + continue; + } + + if (input == ":toggle_contains") { + engine.toggleSubstringSearch(); + continue; + } + + if (input == ":graph") { + engine.displayGraph(); + continue; + } + + if (input.substr(0, 6) == ":bump ") { + std::string token = input.substr(6); + engine.bumpToken(token); + continue; + } + + auto suggestions = engine.getSuggestions(input, 5); + + if (suggestions.empty()) { + std::cout << "No suggestions found for '" << input << "'" << std::endl; + continue; + } + + std::cout << "\nSuggestions:" << std::endl; + for (size_t i = 0; i < suggestions.size(); i++) { + std::cout << " " << (i + 1) << ". " << suggestions[i].first + << " (score=" << suggestions[i].second << ")" << std::endl; + } + + std::cout << "\nAccept by number or token (or press Enter to skip): "; + std::string choice; + std::getline(std::cin, choice); + + if (choice.empty()) { + continue; + } + + if (isdigit(choice[0]) && choice.length() == 1) { + int num = choice[0] - '0'; + if (num >= 1 && num <= suggestions.size()) { + engine.acceptSuggestion(suggestions[num - 1].first); + } + else{ + std::cout << "Invalid selection" << std::endl; + } + } + else{ + bool found = false; + for (const auto& [token, score] : suggestions) { + if (token == choice) { + engine.acceptSuggestion(token); + found = true; + break; + } + } + if (!found) { + std::cout << "Token not in suggestions" << std::endl; + } + } + + std::cout << std::endl; + } + + return 0; +} \ No newline at end of file diff --git a/smart_autocomplete/src/minheap.cpp b/smart_autocomplete/src/minheap.cpp index e69de29..5b6c4c1 100644 --- a/smart_autocomplete/src/minheap.cpp +++ b/smart_autocomplete/src/minheap.cpp @@ -0,0 +1,68 @@ +#include"../include/minheap.h" +#include +#include +using namespace std; +MinHeap::MinHeap(int 
k):maxSize(k){}
+void MinHeap::heapifyUp(int index){
+    while(index>0 && heap[parent(index)].first>heap[index].first){
+        swap(heap[parent(index)],heap[index]);
+        index=parent(index);
+    }
+}
+void MinHeap::heapifyDown(int index) {
+    int smallest=index;
+    int left=leftChild(index);
+    int right=rightChild(index);
+    if(leftheap[0].first){
+        heap[0]={score,word};
+        heapifyDown(0);
+    }
+}
+pair<double,string> MinHeap::extractMin(){
+    if(heap.empty()){
+        throw runtime_error("Heap is empty");
+    }
+    auto minElement=heap[0];
+    heap[0]=heap.back();
+    heap.pop_back();
+    if(!heap.empty()){
+        heapifyDown(0);
+    }
+    return minElement;
+}
+pair<double,string> MinHeap::getMin(){
+    if (heap.empty()){
+        throw runtime_error("Heap is empty");
+    }
+    return heap[0];
+}
+struct Comparator{
+    bool operator()(const pair<double,string>& a,const pair<double,string>& b) const{
+        return a.first>b.first;
+    }
+};
+vector<pair<double,string>> MinHeap::getAll(){
+    vector<pair<double,string>> result=heap;
+    sort(result.begin(),result.end(),Comparator());
+    return result;
+}
+void MinHeap::clear() {
+    heap.clear();
+}
+
diff --git a/smart_autocomplete/src/ranker.cpp b/smart_autocomplete/src/ranker.cpp
index e69de29..b2e41b2 100644
--- a/smart_autocomplete/src/ranker.cpp
+++ b/smart_autocomplete/src/ranker.cpp
@@ -0,0 +1,45 @@
+#include "../include/ranker.h"
+#include "../include/minheap.h"
+#include
+
+Ranker::Ranker(FreqStore* fs, CooccurrenceGraph* g)
+    : freqStore(fs), graph(g), lastToken("") {}
+
+void Ranker::setLastToken(const std::string& token) {
+    lastToken = token;
+}
+
+double Ranker::computeScore(const std::string& token){
+    double freqScore = freqStore->get(token);
+    double graphBoost = 0.0;
+
+    if (!lastToken.empty()){
+        graphBoost = graph->getBoost(lastToken, token);
+    }
+
+    return freqScore + graphBoost;
+}
+
+std::vector<std::pair<std::string, double>> Ranker::rankResults(
+    const std::vector<std::string>& candidates, int k){
+
+    if (candidates.empty()){
+        return std::vector<std::pair<std::string, double>>();
+    }
+
+    MinHeap heap(k);
+
+    for (const auto& token : candidates){
+        double score = computeScore(token);
+        heap.insert(score, token);
+    }
+
+    auto sortedResults = heap.getAll();
+
+    std::vector<std::pair<std::string, double>> result;
+    for (const auto& [score, token] : sortedResults){
+        result.push_back({token, score});
+    }
+
+    return result;
+}
\ No newline at end of file
diff --git a/smart_autocomplete/src/stack.cpp b/smart_autocomplete/src/stack.cpp
index e69de29..7755bc1 100644
--- a/smart_autocomplete/src/stack.cpp
+++ b/smart_autocomplete/src/stack.cpp
@@ -0,0 +1,39 @@
+#include "../include/stack.h"
+#include <stdexcept>
+
+void UndoRedoStack::pushInsert(int position, const std::string& text) {
+    undoStack.push({position, text});
+    while (!redoStack.empty()) {
+        redoStack.pop();
+    }
+}
+
+std::pair<int, std::string> UndoRedoStack::undo() {
+    if (undoStack.empty()) {
+        throw std::runtime_error("Nothing to undo");
+    }
+
+    auto action = undoStack.top();
+    undoStack.pop();
+    redoStack.push(action);
+
+    return action;
+}
+
+std::pair<int, std::string> UndoRedoStack::redo() {
+    if (redoStack.empty()) {
+        throw std::runtime_error("Nothing to redo");
+    }
+
+    auto action = redoStack.top();
+    redoStack.pop();
+    undoStack.push(action);
+
+    return action;
+}
+
+void UndoRedoStack::clearRedo() {
+    while (!redoStack.empty()) {
+        redoStack.pop();
+    }
+}
\ No newline at end of file
diff --git a/smart_autocomplete/src/tst.cpp b/smart_autocomplete/src/tst.cpp
index e69de29..c3ad6aa 100644
--- a/smart_autocomplete/src/tst.cpp
+++ b/smart_autocomplete/src/tst.cpp
@@ -0,0 +1,130 @@
+#include "../include/tst.h"
+#include
+
+TST::TST() : root(nullptr) {}
+
+std::shared_ptr TST::insertUtil(std::shared_ptr node,
+                                const std::string& word, int index) {
+    if (node == nullptr) {
+        node = std::make_shared(word[index]);
+    }
+
+    if (word[index] < node->data) {
+        node->left = insertUtil(node->left, word, index);
+    } else if (word[index] > node->data) {
+        node->right = insertUtil(node->right, word, index);
+    } else {
+        if (index < word.length() - 1) {
+            node->eq = insertUtil(node->eq, word, index + 1);
+        } else {
+            node->isEndOfString = true;
+        }
+    }
+
+    return node;
+}
+
+void TST::insert(const std::string& word) {
+    if (word.empty()) return;
+    root = insertUtil(root, word, 0);
+}
+
+std::shared_ptr TST::searchPrefix(const std::string& prefix) {
+    if (prefix.empty()) return root;
+
+    auto node = root;
+    int i = 0;
+
+    while (node != nullptr && i < prefix.length()) {
+        if (prefix[i] < node->data) {
+            node = node->left;
+        } else if (prefix[i] > node->data) {
+            node = node->right;
+        } else {
+            i++;
+            if (i < prefix.length()) {
+                node = node->eq;
+            }
+        }
+    }
+
+    return (i == prefix.length()) ? node : nullptr;
+}
+
+void TST::collectWords(std::shared_ptr node,
+                       std::string prefix,
+                       std::vector<std::string>& results) {
+    if (node == nullptr) return;
+
+    collectWords(node->left, prefix, results);
+
+    std::string current = prefix + node->data;
+
+    if (node->isEndOfString) {
+        results.push_back(current);
+    }
+
+    collectWords(node->eq, current, results);
+    collectWords(node->right, prefix, results);
+}
+
+std::vector<std::string> TST::prefixSearch(const std::string& prefix, int k) {
+    std::vector<std::string> results;
+
+    if (prefix.empty()) {
+        getAllWords(results);
+        if (results.size() > k) {
+            results.resize(k);
+        }
+        return results;
+    }
+
+    auto node = searchPrefix(prefix);
+
+    if (node == nullptr) {
+        return results;
+    }
+
+    std::string base = prefix.substr(0, prefix.length() - 1);
+
+    if (node->isEndOfString) {
+        results.push_back(prefix);
+    }
+
+    collectWords(node->eq, prefix, results);
+
+    if (results.size() > k) {
+        results.resize(k);
+    }
+
+    return results;
+}
+
+bool TST::search(const std::string& word) {
+    if (word.empty()) return false;
+
+    auto node = root;
+    int i = 0;
+
+    while (node != nullptr && i < word.length()) {
+        if (word[i] < node->data) {
+            node = node->left;
+        } else if (word[i] > node->data) {
+            node = node->right;
+        } else {
+            i++;
+            if (i < word.length()) {
+                node = node->eq;
+            }
+        }
+    }
+
+    return (node != nullptr && i == word.length() && node->isEndOfString);
+}
+
+
+void TST::getAllWords(std::vector<std::string>& results) {
+    collectWords(root, "", results);
+}
+
+
diff --git a/smart_autocomplete/tests/heap_test.cpp b/smart_autocomplete/tests/heap_test.cpp
index e69de29..7579440 100644
--- a/smart_autocomplete/tests/heap_test.cpp
+++ b/smart_autocomplete/tests/heap_test.cpp
@@ -0,0 +1,68 @@
+#include <cassert>
+#include <iostream>
+#include "../include/minheap.h"
+using namespace std;
+
+void test1(){
+    MinHeap heap(5);
+    heap.insert(3.0,"three");
+    heap.insert(1.0,"one");
+    heap.insert(5.0,"five");
+    heap.insert(2.0,"two");
+    assert(heap.size()==4);
+    auto minele=heap.getMin();
+    assert(minele.first==1.0);
+    assert(minele.second=="one");
+    cout<<"Heap insert and extract tests passed"<=all[1].first);
+    assert(all[1].first>=all[2].first);
+    assert(all[2].first>=all[3].first);
+    cout << "Heap getall tests passed"<
+#include
+#include "../include/lru.h"
+
+void testLRU() {
+    lru_cache cache(3);
+
+    cache.put("a", {"apple", "apricot"});
+    cache.put("b", {"banana", "berry"});
+    cache.put("c", {"cherry", "coconut"});
+
+    assert(cache.exists("a"));
+    assert(cache.exists("b"));
+    assert(cache.exists("c"));
+
+    auto result = cache.get("a");
+    assert(result.size() == 2);
+    assert(result[0] == "apple");
+
+    std::cout << "LRU Basic Operations tests passed" << std::endl;
+}
+
+void testLRUEviction() {
+    lru_cache cache(2);
+
+    cache.put("a", {"apple"});
+    cache.put("b", {"banana"});
+    cache.put("c", {"cherry"});
+
+    assert(!cache.exists("a"));
+    assert(cache.exists("b"));
+    assert(cache.exists("c"));
+
+    std::cout << "LRU Eviction tests passed" << std::endl;
+}
+
+void testLRUUpdate() {
+    lru_cache cache(3);
+
+    cache.put("a", {"apple"});
+    cache.put("b", {"banana"});
+    cache.put("c", {"cherry"});
+
+    cache.get("a");
+
+    cache.put("d", {"date"});
+
+    assert(cache.exists("a"));
+    assert(!cache.exists("b"));
+    assert(cache.exists("c"));
+    assert(cache.exists("d"));
+
+    std::cout << "LRU Update tests passed" << std::endl;
+}
+
+void testLRUEmpty() {
+    lru_cache cache(5);
+
+    auto result = cache.get("nonexistent");
+    assert(result.empty());
+    assert(!cache.exists("test"));
+
+    std::cout << "LRU Empty tests passed" << std::endl;
+}
+
+int main() {
+    std::cout << "\nRunning LRU Cache Tests...\n" << std::endl;
+
+    testLRU();
+    testLRUEviction();
+    testLRUUpdate();
+    testLRUEmpty();
+
+    std::cout << "\n All LRU Cache tests passed!\n" << std::endl;
+
+    return 0;
+}
diff --git a/smart_autocomplete/tests/tst_test.cpp b/smart_autocomplete/tests/tst_test.cpp
index e69de29..25dd7c2 100644
--- a/smart_autocomplete/tests/tst_test.cpp
+++ b/smart_autocomplete/tests/tst_test.cpp
@@ -0,0 +1,81 @@
+#include <cassert>
+#include <iostream>
+#include "../include/tst.h"
+
+void testTSTInsertAndSearch() {
+    TST tst;
+
+    tst.insert("hello");
+    tst.insert("world");
+    tst.insert("help");
+    tst.insert("hell");
+
+    assert(tst.search("hello")==true);
+    assert(tst.search("world")==true);
+    assert(tst.search("help")==true);
+    assert(tst.search("hell")==true);
+    assert(tst.search("he")==false);
+    assert(tst.search("worlds")==false);
+
+    std::cout << "✓ TST Insert and Search tests passed" << std::endl;
+}
+
+void testTSTPrefixSearch() {
+    TST tst;
+
+    tst.insert("print");
+    tst.insert("printf");
+    tst.insert("println");
+    tst.insert("private");
+    tst.insert("protected");
+
+    auto results = tst.prefixSearch("pri", 10);
+
+    assert(results.size() >= 3);
+    bool foundPrint = false;
+    bool foundPrintf = false;
+    bool foundPrivate = false;
+
+
+    for (const auto& word : results) {
+        if (word=="print"){
+            foundPrint = true;
+        }
+        if (word=="printf"){
+            foundPrintf = true;
+        }
+        if (word=="private"){
+            foundPrivate = true;
+        }
+    }
+
+    assert(foundPrint && foundPrintf && foundPrivate);
+
+    std::cout << "✓ TST Prefix Search tests passed" << std::endl;
+}
+
+void testTSTEmptyCases(){
+    TST tst;
+
+    assert(tst.search("") == false);
+    auto results = tst.prefixSearch("test", 5);
+    assert(results.empty());
+
+    tst.insert("test");
+    assert(tst.search("test") == true);
+
+    std::cout << "✓ TST Empty Cases tests passed" << std::endl;
+}
+
+int main() {
+    std::cout << "\nRunning TST Tests...\n" << std::endl;
+
+    testTSTInsertAndSearch();
+    testTSTPrefixSearch();
+    testTSTEmptyCases();
+
+    std::cout << "\n All TST tests passed!\n" << std::endl;
+
+    return 0;
+}
+