A compiler frontend for a custom C-subset programming language with advanced array support. Features include N-dimensional arrays, dynamic resizing, and array operations such as broadcasting, slicing, map, and reduce.

Lexer and Parser
- ssc.l: Flex-based lexical analyzer
- ssc.y: Bison-based parser
- ssc_types.h: Core type definitions

Abstract Syntax Tree (AST)
- ast.hpp: AST node definitions and visitor interfaces
- ast.cpp: Implementation of AST node operations

LLVM Code Generation
- llvmcodegen.hpp: LLVM code generator interface
- llvmcodegen.cpp: Implementation of LLVM IR generation
- llvmruntime.cpp: Runtime support for LLVM-generated code
- CodeGen.h: Code generation utilities and helpers

Intermediate Representation
- IR.h: Symbol table and array operation definitions

Support Files
- compile.sh: Build automation script
- Makefile: Project build configuration
- Various test files (*.ssc): Example programs and test cases

Requirements:
- LLVM development libraries (version 14.0 or later)
- Flex (Fast Lexical Analyzer)
- Bison (Parser Generator)
- C++ compiler with C++17 support
- Make build system

Install dependencies:

    # Ubuntu/Debian
    sudo apt-get install llvm-dev flex bison build-essential
    # macOS
    brew install llvm flex bison

Build the compiler:

    # Build the compiler with make
    make clean
    make
    # Then run the compile script, which uses the compiler to generate C++ and LLVM IR code
    chmod +x compile.sh
    ./compile.sh test.ssc
    ./a.out

The build process will generate:
- ssc_compiler: The main compiler executable
- Various intermediate files (lex.yy.c, ssc.tab.c, etc.)

Features:
- N-dimensional array support
- Dynamic resizing
- Broadcasting
- Array slicing
- Map and reduce operations
- Built-in functions (sum, max, min, average)
- Static type checking
- Support for integers, doubles, and strings
- Array type inference
- Dimension checking for array operations
- Efficient code generation using LLVM IR
- Optimization passes
- Cross-platform support
- JIT compilation capability
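
The built-in functions listed above (sum, max, min, average) can be pictured as simple reductions over an array's elements. The following is a minimal sketch in C++, not the project's runtime code: the function names `array_sum`, `array_max`, `array_min`, and `array_average` are hypothetical, and the choice to reject empty arrays with an exception is an assumption.

```cpp
#include <algorithm>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical helpers; names and signatures are illustrative only.
double array_sum(const std::vector<double>& data) {
    return std::accumulate(data.begin(), data.end(), 0.0);
}

double array_max(const std::vector<double>& data) {
    if (data.empty()) throw std::invalid_argument("max of empty array");
    return *std::max_element(data.begin(), data.end());
}

double array_min(const std::vector<double>& data) {
    if (data.empty()) throw std::invalid_argument("min of empty array");
    return *std::min_element(data.begin(), data.end());
}

double array_average(const std::vector<double>& data) {
    // Assumption: averaging an empty array is an error rather than NaN.
    if (data.empty()) throw std::invalid_argument("average of empty array");
    return array_sum(data) / static_cast<double>(data.size());
}
```

Because these helpers take a flat sequence of elements, they would apply to an array of any rank once its elements are laid out contiguously.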

Parsing: Source code → AST
- Lexical analysis (ssc.l)
- Syntax analysis (ssc.y)
- AST construction (ast.cpp)

Analysis:
- Type checking
- Dimension validation
- Symbol resolution

LLVM IR Generation:
- AST traversal
- LLVM IR instruction generation
- Runtime function integration

Optimization and Output:
- LLVM optimization passes
- Machine code generation
- Object file or executable output
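
To make the "AST traversal" and "LLVM IR instruction generation" stages more concrete, here is a small sketch of a visitor that walks an expression tree and emits IR-like text. Everything in it is an assumption for illustration: the `Node`, `Number`, `Add`, `Visitor`, and `TextEmitter` types are hypothetical and much simpler than whatever ast.hpp and llvmcodegen.cpp define, and it writes plain strings where a real generator would most likely go through the LLVM C++ API.

```cpp
#include <memory>
#include <sstream>
#include <string>
#include <utility>

struct Number;
struct Add;

// Hypothetical visitor interface: one visit() overload per node type.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(const Number& n) = 0;
    virtual void visit(const Add& n) = 0;
};

struct Node {
    virtual ~Node() = default;
    virtual void accept(Visitor& v) const = 0;
};

struct Number : Node {
    int value;
    explicit Number(int v) : value(v) {}
    void accept(Visitor& v) const override { v.visit(*this); }
};

struct Add : Node {
    std::unique_ptr<Node> lhs, rhs;
    Add(std::unique_ptr<Node> l, std::unique_ptr<Node> r)
        : lhs(std::move(l)), rhs(std::move(r)) {}
    void accept(Visitor& v) const override { v.visit(*this); }
};

// Emits one IR-like line per addition, visiting operands first.
struct TextEmitter : Visitor {
    std::ostringstream out;
    std::string last;   // literal or temporary holding the latest result
    int next_tmp = 0;

    void visit(const Number& n) override { last = std::to_string(n.value); }

    void visit(const Add& n) override {
        n.lhs->accept(*this);
        std::string l = last;
        n.rhs->accept(*this);
        std::string r = last;
        last = "%t" + std::to_string(next_tmp++);
        out << last << " = add i32 " << l << ", " << r << "\n";
    }
};

std::string emit(const Node& root) {
    TextEmitter e;
    root.accept(e);
    return e.out.str();
}
```

For the tree representing 1 + (2 + 3), `emit` produces two lines: the inner addition first, then the outer one that consumes its temporary.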

Create a source file (e.g., test.ssc):

    array int a[3][3] = {{1,2,3},{4,5,6},{7,8,9}};
    array int b = map(a, SQUARE);
    int sum = reduce(b, ADD);
    print(sum);

Compile and run:

    make
    ./ssc_compiler test.ssc
    ./a.out
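
For readers who want to check what the example program computes, the sketch below mirrors it in plain C++. It assumes that `map(a, SQUARE)` squares each element and that `reduce(b, ADD)` sums all elements starting from zero; the README does not spell out these semantics, and the helper names `map_array` and `reduce_array` are hypothetical.

```cpp
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-in for map(a, OP): applies f to every element.
std::vector<int> map_array(const std::vector<int>& in,
                           const std::function<int(int)>& f) {
    std::vector<int> out;
    out.reserve(in.size());
    for (int x : in) out.push_back(f(x));
    return out;
}

// Hypothetical stand-in for reduce(b, OP): folds the elements left to right.
int reduce_array(const std::vector<int>& in, int init,
                 const std::function<int(int, int)>& op) {
    return std::accumulate(in.begin(), in.end(), init, op);
}

int example_sum_of_squares() {
    // The 3x3 array from test.ssc, written out row by row.
    std::vector<int> a = {1, 2, 3, 4, 5, 6, 7, 8, 9};
    std::vector<int> b = map_array(a, [](int x) { return x * x; });
    return reduce_array(b, 0, [](int x, int y) { return x + y; });
}
```

Under these assumptions the result is 1 + 4 + 9 + ... + 81 = 285, which is the value the example program would be expected to print.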

- Use array-test.ssc for array operation testing
- input.ssc provides basic functionality tests
- Debug output can be enabled in both the lexer and the parser
- Generated C++ is saved in output.cpp
- LLVM IR output is saved in output.ll
This project is open source and available under the MIT License.

Language Features:
- N-dimensional array support
- Dynamic array resizing
- Array operations (sum, max, min, average)
- Broadcasting
- Mapping and reduction
- Array slicing
- Serialization and deserialization
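
Broadcasting is listed above without a definition. One common interpretation, assumed here and not confirmed by this README, is the NumPy-style rule in which a smaller operand is repeated along the missing dimension. The sketch below adds a row vector to every row of a matrix stored as a flat, row-major vector; the function name `broadcast_add_row` is hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Adds `row` (length cols) to each row of a rows x cols matrix stored
// row by row in a flat vector. Shapes that do not line up are rejected.
std::vector<int> broadcast_add_row(const std::vector<int>& matrix,
                                   std::size_t rows, std::size_t cols,
                                   const std::vector<int>& row) {
    if (matrix.size() != rows * cols || row.size() != cols) {
        throw std::invalid_argument("shape mismatch in broadcast");
    }
    std::vector<int> out(matrix.size());
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            out[r * cols + c] = matrix[r * cols + c] + row[c];
        }
    }
    return out;
}
```

The shape check doubles as a simple form of dimension checking: an operand whose length does not match the column count is reported instead of being silently truncated.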

Build Process:
- Use Flex to generate the lexer (flex ssc.l)
- Use Bison to generate the parser (bison -d ssc.y)
- Compile the generated C files along with your main program

Execution:
- The main function in ssc.y serves as the entry point
- It can read input from a file (if provided as an argument) or from stdin
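
The file-or-stdin behavior described above can be sketched as a small helper. This is an illustration rather than the code in ssc.y: the function name `open_input` is hypothetical. In a Flex-based scanner, the returned stream would typically be assigned to `yyin` before calling `yyparse()`.

```cpp
#include <cstdio>

// Returns the stream to parse: the file named by the first argument if one
// is given, otherwise standard input. Returns nullptr if the file cannot
// be opened, so the caller can report the error and exit.
std::FILE* open_input(int argc, char** argv) {
    if (argc > 1) {
        return std::fopen(argv[1], "r");
    }
    return stdin;
}
```

A caller would check for `nullptr`, print a message naming the file that failed to open, and return a nonzero exit code.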

Symbol Table:
- Implemented using std::map in C++
- Separate tables for different data types (double, int, string)
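
A layout with one `std::map` per data type might look like the sketch below. The struct name `SymbolTables`, its member names, and the lookup helpers are assumptions made for illustration; the actual definitions in the project may be organized differently.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical container: one table per value type, keyed by variable name.
struct SymbolTables {
    std::map<std::string, int> ints;
    std::map<std::string, double> doubles;
    std::map<std::string, std::string> strings;

    // Returns the stored int, or std::nullopt if no int has that name.
    std::optional<int> lookup_int(const std::string& name) const {
        auto it = ints.find(name);
        if (it == ints.end()) return std::nullopt;
        return it->second;
    }

    // True if the name is declared in any of the three tables.
    bool is_declared(const std::string& name) const {
        return ints.count(name) > 0 || doubles.count(name) > 0 ||
               strings.count(name) > 0;
    }
};
```

One consequence of separate tables is that the table a name is found in also identifies its type, at the cost of checking several maps when the type is not yet known.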

Array Implementation:
- Multi-dimensional arrays are flattened into 1D vectors for storage
- A separate map (arrayDimensions) keeps track of the dimensions
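
The flattening scheme can be illustrated with an index calculation. The sketch assumes row-major order (last index varies fastest), which this README does not state explicitly. The map name `arrayDimensions` comes from the text above, but its element type, the companion map `arrayData`, and the functions `flat_index` and `read_element` are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Flat element storage per array name (hypothetical name and type).
std::map<std::string, std::vector<int>> arrayData;
// Dimensions per array name (name from the README, type assumed).
std::map<std::string, std::vector<std::size_t>> arrayDimensions;

// Converts an N-dimensional index into an offset in the flat vector,
// assuming row-major layout. Throws on rank mismatch or out-of-bounds index.
std::size_t flat_index(const std::vector<std::size_t>& dims,
                       const std::vector<std::size_t>& idx) {
    if (dims.size() != idx.size()) throw std::invalid_argument("rank mismatch");
    std::size_t offset = 0;
    for (std::size_t k = 0; k < dims.size(); ++k) {
        if (idx[k] >= dims[k]) throw std::out_of_range("index out of bounds");
        offset = offset * dims[k] + idx[k];
    }
    return offset;
}

int read_element(const std::string& name, const std::vector<std::size_t>& idx) {
    return arrayData.at(name)[flat_index(arrayDimensions.at(name), idx)];
}
```

For a 3x3 array, element [1][2] lands at offset 1 * 3 + 2 = 5 in the flat vector.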

Error Handling:
- The yyerror function is used to report parsing errors
- Runtime errors (e.g., out-of-bounds access) are handled using C++ exceptions
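
The two error paths described above could be sketched as follows. `yyerror` is the conventional name of the function a Bison-generated parser calls on a syntax error; the body shown here, the `parse_error_count` counter, and the helpers `checked_get` and `try_get` are assumptions for illustration, not the project's code.

```cpp
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>
#include <vector>

int parse_error_count = 0;  // hypothetical counter, useful for the exit code

// Called by the parser with a description of the syntax error.
void yyerror(const char* msg) {
    ++parse_error_count;
    std::fprintf(stderr, "Parse error: %s\n", msg);
}

// Runtime access check: throws instead of reading past the end.
int checked_get(const std::vector<int>& data, std::size_t i) {
    if (i >= data.size()) {
        throw std::out_of_range("index " + std::to_string(i) +
                                " out of bounds for size " +
                                std::to_string(data.size()));
    }
    return data[i];
}

// Turns the exception into a diagnostic; returns false on failure.
bool try_get(const std::vector<int>& data, std::size_t i, int& out) {
    try {
        out = checked_get(data, i);
        return true;
    } catch (const std::out_of_range& e) {
        std::fprintf(stderr, "Runtime error: %s\n", e.what());
        return false;
    }
}
```

Keeping the throwing function separate from the one that catches lets internal code propagate errors freely while the outermost layer decides how to report them.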

Debugging:
- Debug macros are provided in both the lexer and parser
- Can be enabled by defining DEBUGSSC and DEBUGBISON, respectively
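
A compile-time debug switch of this kind is commonly written as a macro that expands to a print call when the flag is defined and to nothing otherwise. The sketch below assumes that pattern. The flag name `DEBUGSSC` comes from the text above, while the macro name `LEX_DEBUG` and the function `scan_digit` are hypothetical.

```cpp
#include <cstdio>

// Compile with -DDEBUGSSC to turn tracing on; otherwise the macro
// expands to a no-op and its arguments are never evaluated.
#ifdef DEBUGSSC
#define LEX_DEBUG(...) std::fprintf(stderr, __VA_ARGS__)
#else
#define LEX_DEBUG(...) ((void)0)
#endif

// Example call site: returns the digit value of c, or -1 if c is not a digit.
int scan_digit(char c) {
    LEX_DEBUG("lexer: saw '%c'\n", c);
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}
```

The same pattern with `DEBUGBISON` would cover the parser. Since disabled trace calls are removed by the preprocessor, they add no cost to a normal build.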
To compile and run the project:
- Generate the lexer: flex ssc.l
- Generate the parser: bison -d ssc.y
- Compile the generated files along with your main program:

      make clean
      make
      chmod +x compile.sh
      ./compile.sh test.ssc

- Run the compiler:
  - With input file: ./ssc_compiler input_file.ssc
  - Interactive mode: ./ssc_compiler

The project includes a Makefile with the following commands:
make - Compile the SSC compiler
make run - Run the SSC compiler with the test file
make clean - Remove compiled and intermediate files
make distclean - Remove all build files and output
make help - Display this help message
To use these commands, simply type make followed by the desired target. For example:
make # Compile the compiler
make run # Run the compiler with the test file
make clean # Clean up build files
Use make help to see a list of available commands and their descriptions.

Planned improvements:
- Implement more array operations and functions
- Optimize memory usage for large arrays
- Add support for user-defined functions in map and reduce operations
- Enhance error reporting and recovery mechanisms
- Implement LLVM code generation
One of the major planned improvements for this project is the implementation of LLVM code generation. This will involve:
- LLVM Integration: Integrating the LLVM libraries into the project to enable code generation.
- IR to LLVM IR Translation: Developing a module to translate our custom Intermediate Representation (IR) into LLVM IR.
- Optimization Passes: Implementing LLVM optimization passes to improve the generated code's performance.
- Code Emission: Generating executable machine code from the LLVM IR.
- Array Operation Optimization: Utilizing LLVM's vector operations to optimize array manipulations.
- JIT Compilation: Exploring the possibility of Just-In-Time (JIT) compilation for improved runtime performance.
- Cross-platform Support: Leveraging LLVM's cross-platform capabilities to generate code for multiple target architectures.

The addition of LLVM code generation will significantly enhance the project by:
- Improving execution speed of the compiled programs
- Enabling more advanced optimizations
- Providing better cross-platform support
- Allowing for potential future features like runtime code generation
This feature will require additional dependencies (LLVM libraries) and will likely introduce new build steps and make commands, which will be documented once implemented.