This project implements a compiler for a subset of Python called MiniPython, using the SableCC compiler generator.
The compiler supports:
- Lexical Analysis
- Syntax Analysis
- Abstract Syntax Tree (AST) generation
- Semantic Analysis
- Type Checking
The goal of the project is to simulate a real compiler pipeline and apply core concepts from:
- Compiler design
- Formal languages
- Static analysis
The compiler is built in phases:
- Lexical Analysis (Lexer)
- Syntax Analysis (Parser)
- Abstract Syntax Tree (AST)
- Symbol Table construction
- Semantic Analysis
- Type Checking
- Java
- SableCC
- Visitor Design Pattern
The lexical analyzer is implemented using SableCC.
Place the following files:
compiler/
│
├── lib/
│ └── sablecc.jar
│
├── sablecc.bat
├── minipython.grammar
├── LexerTest1.java
The sablecc.bat file contains:
java -jar lib\sablecc.jar %1 %2 %3 %4 %5 %6 %7 %8 %9
Run:
sablecc minipython.grammar
This generates:
minipython/
├── analysis/
├── lexer/
├── node/
└── parser/
javac LexerTest1.java
java LexerTest1 example.py
Output:
- Prints all tokens of the input program
Semantic analysis is implemented using the Visitor pattern provided by SableCC.
The system uses two-pass analysis:
Responsible for:
- Building the Symbol Table
- Variable declaration checks
- Function declaration tracking
- Argument validation
Handles:
- Variable scopes (global + function scope)
- Function calls (including forward references)
Responsible for:
- Type validation in expressions
- Function return type checking
- Operation compatibility
Focuses only on type correctness after structure is validated.
MiniPython allows forward references (calling functions before declaration).
To handle this correctly:
-
VisitorA
- Scans entire program
- Collects all declarations
- Builds symbol table
-
VisitorB
- Performs type checking
- Assumes valid structure
This ensures:
- Correct order of analysis
- Cleaner separation of concerns
- Checks if variables are declared before use
- Tracks:
- Global variables
- Function scope variables
- Stores function calls temporarily
- Validates after full scan
- Supports forward declarations
- Compares:
- Required arguments
- Provided arguments
- Supports default parameters
Examples:
- Integer + String
- String - Integer
Nonecannot be used in arithmetic expressions
- Checks return type compatibility
- Simulates function execution to determine return type
- Detects overlapping function signatures
- Prevents ambiguous calls
- Abstract Syntax Trees (AST)
- Symbol Tables
- Static Type Checking
- Scope Management
- Visitor Pattern
- Compiler Pipeline
- Generate parser:
sablecc minipython.grammar
- Compile:
javac *.java minipython/**/*.java
- Run:
java Main input.py
def add(x, y=2):
return x + y
print(add(3))Joanna Papadakaki