Joannapapad/minipython-compiler-sablecc

MiniPython Compiler (SableCC, Java)

Overview

This project implements a compiler for a subset of Python called MiniPython, using the SableCC compiler generator.

The compiler supports:

  • Lexical Analysis
  • Syntax Analysis
  • Abstract Syntax Tree (AST) generation
  • Semantic Analysis
  • Type Checking

The goal of the project is to simulate a real compiler pipeline and apply core concepts from:

  • Compiler design
  • Formal languages
  • Static analysis

Project Structure

The compiler is built in phases:

Phase A

  • Lexical Analysis (Lexer)
  • Syntax Analysis (Parser)

Phase B

  • Abstract Syntax Tree (AST)
  • Symbol Table construction
  • Semantic Analysis
  • Type Checking

Technologies Used

  • Java
  • SableCC
  • Visitor Design Pattern

Lexical Analysis

The lexical analyzer is implemented using SableCC.

Setup

Arrange the project files as follows:


compiler/
│
├── lib/
│   └── sablecc.jar
│
├── sablecc.bat
├── minipython.grammar
└── LexerTest1.java

The sablecc.bat file contains:


java -jar lib\sablecc.jar %1 %2 %3 %4 %5 %6 %7 %8 %9


Generate Lexer & Parser

Run:


sablecc minipython.grammar

This generates:


minipython/
├── analysis/
├── lexer/
├── node/
└── parser/


Compile and Run


javac LexerTest1.java
java LexerTest1 example.py

Output:

  • Prints all tokens of the input program

Semantic Analysis

Semantic analysis is implemented using the Visitor pattern provided by SableCC.

The analysis runs in two passes, each implemented as a separate visitor:


VisitorA – Symbol Table & Declarations

Responsible for:

  • Building the Symbol Table
  • Variable declaration checks
  • Function declaration tracking
  • Argument validation

Handles:

  • Variable scopes (global + function scope)
  • Function calls (including forward references)

VisitorB – Type Checking

Responsible for:

  • Type validation in expressions
  • Function return type checking
  • Operation compatibility

Focuses only on type correctness after structure is validated.


MiniPython allows forward references (calling functions before declaration).

To handle this correctly:

  1. VisitorA

    • Scans entire program
    • Collects all declarations
    • Builds symbol table
  2. VisitorB

    • Performs type checking
    • Assumes valid structure

This ensures:

  • Correct order of analysis
  • Cleaner separation of concerns

Semantic Rules Implemented

1. Use of Undeclared Variables

  • Checks if variables are declared before use
  • Tracks:
    • Global variables
    • Function scope variables
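
The scope tracking described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class `ScopeStack` and its method names are hypothetical, and it assumes a function body can read global names as well as its own.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: a stack of scopes with the global scope at the bottom.
class ScopeStack {
    private final Deque<Set<String>> scopes = new ArrayDeque<>();

    ScopeStack() {
        scopes.push(new HashSet<>()); // the global scope always exists
    }

    // Called when the visitor enters a function definition.
    void enterFunction() {
        scopes.push(new HashSet<>());
    }

    // Called when the visitor leaves a function definition.
    void exitFunction() {
        if (scopes.size() > 1) {
            scopes.pop(); // never pop the global scope
        }
    }

    // Records a name (assignment target or parameter) in the innermost scope.
    void declare(String name) {
        scopes.peek().add(name);
    }

    // A name is usable if the innermost scope or any enclosing scope declares it.
    boolean isDeclared(String name) {
        for (Set<String> scope : scopes) {
            if (scope.contains(name)) {
                return true;
            }
        }
        return false;
    }
}
```

A visitor would call `declare` on assignments and parameters, and report an error whenever an identifier is read while `isDeclared` returns false.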

2. Call to Undeclared Function

  • Stores function calls temporarily
  • Validates after full scan
  • Supports forward references (calls that appear before the function's declaration)
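
One way to defer call validation until the whole program has been scanned is sketched below. The class `CallResolver` and its methods are illustrative names, not taken from the project, and the sketch matches functions by name only.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: collects declarations and calls during the scan,
// then checks the calls once every declaration is known.
class CallResolver {
    private final Set<String> declared = new HashSet<>();
    private final List<String> pendingCalls = new ArrayList<>();

    // Called for each function definition the visitor encounters.
    void onFunctionDeclaration(String name) {
        declared.add(name);
    }

    // Called for each function call; no validation happens yet.
    void onFunctionCall(String name) {
        pendingCalls.add(name);
    }

    // Run once after the full scan: returns the names of calls
    // that never matched a declaration, in the order they appeared.
    List<String> undeclaredCalls() {
        List<String> errors = new ArrayList<>();
        for (String name : pendingCalls) {
            if (!declared.contains(name)) {
                errors.add(name);
            }
        }
        return errors;
    }
}
```

Because the check runs after the scan, a call that precedes its function's definition is accepted, while a call to a function that is never defined is still reported.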

3. Incorrect Number of Function Arguments

  • Compares:
    • Required arguments
    • Provided arguments
  • Supports default parameters
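
The argument-count rule can be expressed as a range check: a call is valid when the number of provided arguments lies between the number of parameters without defaults and the total number of parameters. The sketch below assumes this interpretation; `FunctionSignature` is a hypothetical class, not part of the project.

```java
// Hypothetical record of a function's parameter counts.
class FunctionSignature {
    final String name;
    final int requiredParams; // parameters without a default value
    final int totalParams;    // all parameters, with or without defaults

    FunctionSignature(String name, int requiredParams, int totalParams) {
        this.name = name;
        this.requiredParams = requiredParams;
        this.totalParams = totalParams;
    }

    // True when a call with `provided` arguments is acceptable.
    boolean accepts(int provided) {
        return provided >= requiredParams && provided <= totalParams;
    }
}
```

For `def add(x, y=2)`, the signature has one required parameter out of two, so `add(3)` and `add(3, 4)` pass the check while `add()` and `add(1, 2, 3)` do not.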

4. Type Mismatch in Expressions

Examples:

  • Integer + String
  • String - Integer

5. Invalid Use of None

  • None cannot be used in arithmetic expressions
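
Rules 4 and 5 could be checked together when the type checker visits an arithmetic node. The following is a simplified sketch under stated assumptions: the `ExprTypes` class and its `Type` enum are hypothetical, only three types are modeled, and operands are required to have the same type regardless of the operator.

```java
import java.util.Optional;

// Hypothetical operand-type check for a binary arithmetic expression.
class ExprTypes {
    enum Type { INTEGER, STRING, NONE }

    // Returns an error message if the operands are incompatible,
    // or an empty Optional if the expression passes both rules.
    static Optional<String> checkArithmetic(Type left, Type right) {
        if (left == Type.NONE || right == Type.NONE) {
            return Optional.of("None cannot be used in arithmetic expressions");
        }
        if (left != right) {
            return Optional.of("Type mismatch: " + left + " and " + right);
        }
        return Optional.empty();
    }
}
```

With this check, `1 + "a"` and `"a" - 1` are reported as mismatches, and any arithmetic involving `None` is rejected before the operand types are compared.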

6. Invalid Function Usage

  • Checks return type compatibility
  • Simulates function execution to determine return type

7. Conflicting Function Declarations

  • Detects overlapping function signatures
  • Prevents ambiguous calls
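
One plausible reading of "overlapping signatures" is that two functions with the same name conflict when some argument count would be accepted by both. The sketch below assumes that reading; `SignatureConflicts` is a hypothetical helper, and the project's actual rule may differ.

```java
// Hypothetical overlap test for two same-named function declarations.
// Each declaration accepts any argument count in [required, total].
class SignatureConflicts {
    static boolean conflicts(int requiredA, int totalA, int requiredB, int totalB) {
        // Two closed integer ranges intersect when each starts
        // no later than the other ends.
        return requiredA <= totalB && requiredB <= totalA;
    }
}
```

Under this assumption, `def f(x)` and `def f(x, y=2)` conflict because a one-argument call matches both, whereas `def f(x)` and `def f(x, y)` do not.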

Key Concepts

  • Abstract Syntax Trees (AST)
  • Symbol Tables
  • Static Type Checking
  • Scope Management
  • Visitor Pattern
  • Compiler Pipeline

How to Run Full Compiler

  1. Generate parser:

sablecc minipython.grammar

  2. Compile:

javac *.java minipython/analysis/*.java minipython/lexer/*.java minipython/node/*.java minipython/parser/*.java

  3. Run:

java Main input.py


Example Input

def add(x, y=2):
    return x + y

print(add(3))

Author

Joanna Papadakaki

About

A compiler for a subset of Python (miniPython) built with SableCC, featuring lexical analysis, parsing, AST generation, symbol table construction, and semantic/type checking using the Visitor pattern.
