
Academic project implementing a Uni-C compiler in C using FSMs, FLEX, and BISON, covering lexical and syntactic analysis with a complete compiler pipeline for a subset of the C language. Includes FSM design, token recognition, grammar parsing, and executable testing (Compilers, UNIWA).


Compilers-aka-Uniwa/Compiler-Uni-C


UNIWA

UNIVERSITY OF WEST ATTICA
SCHOOL OF ENGINEERING
DEPARTMENT OF COMPUTER ENGINEERING AND INFORMATICS


Compilers

Design and Implementation of a Compiler for Uni-C

Vasileios Evangelos Athanasiou
Student ID: 19390005

GitHub · LinkedIn

Georgios Theocharis
Student ID: 19390283

GitHub

Ioannis Iliou
Student ID: 19390066

GitHub · LinkedIn

Pantelis Tatsis
Student ID: 20390226

GitHub · LinkedIn

Vasileios Dominaris
Student ID: 21390055

GitHub

Supervisor: Michalis Iordanakis, Special Technical Laboratory Staff

UNIWA Profile

Athens, May 2024


Project Overview

This project involves the development of a compiler for Uni-C, a subset of the C programming language. The implementation was completed in three distinct phases, covering the fundamental stages of compiler construction:

  1. Finite State Machine (FSM) Encoding
    Design and simulation of automata for recognizing lexical units.

  2. Lexical Analysis (FLEX)
    Development of a lexical analyzer that identifies tokens using regular expressions.

  3. Syntactic Analysis (BISON)
    Construction of a parser that validates program structure based on predefined grammar rules.


Table of Contents

| Section | Folder | Description |
|---|---|---|
| 1 | A-FLEX/ | Lexical analysis phase using Finite State Machines and FLEX |
| 1.1 | A-FLEX/A2-FSM/ | FSM design and implementation for Uni-C tokens |
| 1.1.1 | A-FLEX/A2-FSM/docs/ | FSM theory notes, transition tables, and documentation (PDF/XLSX) |
| 1.1.2 | A-FLEX/A2-FSM/src/ | FSM source files for identifiers, strings, numbers, comments, and whitespace |
| 1.2 | A-FLEX/A3-FLEX/ | FLEX-based lexical analyzer implementation |
| 1.2.1 | A-FLEX/A3-FLEX/docs/ | FLEX code documentation |
| 1.2.2 | A-FLEX/A3-FLEX/src/ | FLEX source code, Makefile, input/output samples |
| 1.3 | A-FLEX/assign/ | Assignment descriptions for Part A (FSM & FLEX) |
| 2 | B-BISON/ | Syntax analysis phase using BISON |
| 2.1 | B-BISON/assign/ | Assignment descriptions for Part B (BISON) |
| 2.2 | B-BISON/B2-FLEX-BISON/ | Combined FLEX & BISON parser implementation |
| 2.2.1 | B-BISON/B2-FLEX-BISON/src/ | Integrated lexer/parser source code and build files |
| 2.3 | B-BISON/B3-COMPILER/ | Final compiler stage |
| 2.3.1 | B-BISON/B3-COMPILER/docs/ | BISON grammar documentation |
| 2.3.2 | B-BISON/B3-COMPILER/src/ | Final Uni-C compiler source code |
| 3 | Uni-C/ | Language specification and usage guide for Uni-C |

Technical Specifications

1. Lexical Analysis (Tokens)

The compiler recognizes the following categories of tokens:

  • Identifiers
    Names for variables and functions

    • Pattern: [a-zA-Z_][a-zA-Z0-9_]{0,31}
  • Keywords
    Reserved words such as:

    • if, else, while, int, return, func
  • Constants
    Supported constant types include:

    • Integers (decimal, octal, hexadecimal)
    • Floating-point numbers
    • Strings
  • Operators

    • Arithmetic: +, -, *, /
    • Relational: >, <, ==
    • Logical: &&, ||
  • Delimiters

    • Characters such as ; that terminate statements
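
As an illustration of the identifier and keyword rules above, the following C sketch checks a word against the pattern [a-zA-Z_][a-zA-Z0-9_]{0,31} and against the keywords listed in this README. It is not taken from the project's FLEX source: the names lexeme_class, matches_identifier_pattern, and classify_word are hypothetical, and the keyword array holds only the examples named here, so the real Uni-C keyword set may be larger.

```c
#include <stddef.h>
#include <string.h>

/* Classes used only in this sketch (hypothetical names). */
enum lexeme_class { LEX_KEYWORD, LEX_IDENTIFIER, LEX_INVALID };

/* Keywords named in this README; the full Uni-C list may differ. */
static const char *const KEYWORDS[] = {
    "if", "else", "while", "int", "return", "func"
};

static int is_letter_or_underscore(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

static int is_digit(char c) {
    return c >= '0' && c <= '9';
}

/* Matches [a-zA-Z_][a-zA-Z0-9_]{0,31}: between 1 and 32 characters. */
int matches_identifier_pattern(const char *s) {
    size_t len = strlen(s);
    if (len < 1 || len > 32 || !is_letter_or_underscore(s[0])) {
        return 0;
    }
    for (size_t i = 1; i < len; i++) {
        if (!is_letter_or_underscore(s[i]) && !is_digit(s[i])) {
            return 0;
        }
    }
    return 1;
}

/* Keywords are checked first, since they also match the identifier pattern. */
enum lexeme_class classify_word(const char *s) {
    for (size_t i = 0; i < sizeof KEYWORDS / sizeof KEYWORDS[0]; i++) {
        if (strcmp(s, KEYWORDS[i]) == 0) {
            return LEX_KEYWORD;
        }
    }
    return matches_identifier_pattern(s) ? LEX_IDENTIFIER : LEX_INVALID;
}
```

With these definitions, classify_word("while") yields LEX_KEYWORD, classify_word("_count1") yields LEX_IDENTIFIER, and classify_word("9lives") yields LEX_INVALID. Checking keywords before the identifier pattern mirrors the usual lexer convention of listing keyword rules ahead of the general identifier rule.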

2. Finite State Machine (FSM)

For each token category, a Finite State Automaton (FSA) was designed.

Example – Identifiers:

  • Starts at an initial state (SZ)
  • Transitions to a middle-character state (SMCH) upon receiving a letter or underscore
  • Reaches a GOOD exit state upon encountering a newline, provided the identifier is valid
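
The identifier automaton described above can be sketched in C as a small transition function. This is an illustrative reading, not the project's fsm.c or its .fsm encoding: the state names SZ, SMCH, and GOOD come from this README, while the BAD reject state, the function names, and the assumption that digits keep the machine in SMCH are hypothetical. The 32-character length limit is not enforced in this sketch.

```c
/* SZ, SMCH and GOOD are named in the README; BAD is a hypothetical reject state. */
enum fsm_state { SZ, SMCH, GOOD, BAD };

/* Returns the state reached from `state` after reading character `c`. */
enum fsm_state next_state(enum fsm_state state, char c) {
    int letter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
    int digit = c >= '0' && c <= '9';

    switch (state) {
    case SZ:
        /* An identifier must start with a letter or underscore. */
        return letter ? SMCH : BAD;
    case SMCH:
        /* Assumption: letters, digits and underscores continue the identifier. */
        if (letter || digit) {
            return SMCH;
        }
        /* A newline ends a valid identifier. */
        return c == '\n' ? GOOD : BAD;
    default:
        /* GOOD and BAD accept no further input in this sketch. */
        return BAD;
    }
}

/* Feeds the whole input string to the automaton and returns the final state. */
enum fsm_state run_identifier_fsm(const char *input) {
    enum fsm_state state = SZ;
    for (const char *p = input; *p != '\0' && state != BAD; p++) {
        state = next_state(state, *p);
    }
    return state;
}
```

Under these assumptions, run_identifier_fsm("count_1\n") ends in GOOD, run_identifier_fsm("1count\n") ends in BAD, and an input without a trailing newline such as "count" stops in SMCH, which is not an accepting state.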

3. Syntactic Analysis (BISON)

The BISON parser generator is used to define and enforce grammar rules for Uni-C programs:

  • Variable Declarations

    • Support for simple data types and arrays
  • Functions

    • Recognition of both built-in and user-defined functions
  • Expressions

    • Handling of simple and compound expressions
  • Error Handling

    • Detection and reporting of syntax errors
    • Handling of invalid tokens (TOKEN ERROR)
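
To make the error-handling point concrete, here is a minimal C sketch of an error-reporting hook. Bison-generated C parsers report syntax errors by calling a user-supplied yyerror function that takes the message as a string; everything else below is an assumption made for illustration, including the line_number and error_count variables, the format_error helper, and the message layout. The project's own simple-bison-code.y may report errors differently.

```c
#include <stdio.h>

/* Hypothetical: a lexer would increment this on every newline it reads. */
int line_number = 1;

/* Hypothetical running total of reported syntax errors. */
int error_count = 0;

/* Writes "line N: message" into buf; returns the length snprintf reports. */
int format_error(char *buf, size_t size, int line, const char *msg) {
    return snprintf(buf, size, "line %d: %s", line, msg);
}

/* Called by the Bison-generated parser when it detects a syntax error. */
void yyerror(const char *msg) {
    char buf[256];
    format_error(buf, sizeof buf, line_number, msg);
    fprintf(stderr, "%s\n", buf);
    error_count++;
}
```

Keeping the formatting in a separate helper makes the message layout easy to check without capturing stderr. Counting errors rather than stopping at the first one lets a driver print a summary after parsing.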

Project Files

  • 1_identifiers.fsm
    FSM encoding for identifier recognition

  • simple-flex-code.l
    FLEX source file containing regular expressions and token definitions

  • token.h
    Header file defining numeric constants for tokens

  • simple-bison-code.y
    BISON source file containing grammar and syntax rules
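
For readers unfamiliar with how a header of numeric token constants is typically laid out, the following C sketch shows one possible shape. It is not the project's token.h: every macro name and value below is hypothetical, as is the token_name helper. The only convention relied on is that a FLEX scanner's yylex() returns 0 at end of input, which is why 0 is reserved here.

```c
/* Illustrative token header; names and values are hypothetical. */
#ifndef TOKEN_SKETCH_H
#define TOKEN_SKETCH_H

#define TK_EOF        0  /* yylex() returns 0 at end of input */
#define TK_IDENTIFIER 1
#define TK_KEYWORD    2
#define TK_INTEGER    3
#define TK_FLOAT      4
#define TK_STRING     5
#define TK_OPERATOR   6
#define TK_DELIMITER  7
#define TK_ERROR      8  /* unrecognized input */

/* Maps a token code to a printable name, e.g. for lexer output listings. */
static inline const char *token_name(int code) {
    switch (code) {
    case TK_EOF:        return "EOF";
    case TK_IDENTIFIER: return "IDENTIFIER";
    case TK_KEYWORD:    return "KEYWORD";
    case TK_INTEGER:    return "INTEGER";
    case TK_FLOAT:      return "FLOAT";
    case TK_STRING:     return "STRING";
    case TK_OPERATOR:   return "OPERATOR";
    case TK_DELIMITER:  return "DELIMITER";
    case TK_ERROR:      return "TOKEN ERROR";
    default:            return "UNKNOWN";
    }
}

#endif /* TOKEN_SKETCH_H */
```

A lexer action would then return one of these codes for each recognized lexeme, and a driver loop could print token_name(code) alongside the matched text.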


Installation & Run Guide

Prerequisites

Before compiling, ensure the required tools are installed.

Required Packages

sudo apt update
sudo apt install gcc flex bison make

Verify Installation

gcc --version
flex --version
bison --version

Install

Clone the repository

git clone https://github.com/Compilers-aka-Uniwa/Compiler-Uni-C.git

Navigate to the source directory of the stage you want to test. For the final version of the compiler:

cd Compiler-Uni-C/B-BISON/B3-COMPILER/src

For the FSM stage:

cd Compiler-Uni-C/A-FLEX/A2-FSM/src

For the FLEX stage:

cd Compiler-Uni-C/A-FLEX/A3-FLEX/src

For the combined FLEX and BISON stage:

cd Compiler-Uni-C/B-BISON/B2-FLEX-BISON/src

A2 – Finite State Machines (FSM)

Directory

A-FLEX/A2-FSM/src

Compile

cd A-FLEX/A2-FSM/src
gcc fsm.c -o fsm

Run

./fsm

Notes

  • FSM definitions are loaded from .fsm files (e.g. 1_identifiers.fsm, Final.fsm).
  • Transition tables are documented in the accompanying PDF and Excel files.

Open the Documentation

  1. Navigate to the A-FLEX/A2-FSM/docs/ directory
  2. Open the report corresponding to your preferred language:
    • English: Finite-State-Machines.pdf
    • Greek: Πεπερασμένα-Αυτόματα.pdf

A3 – FLEX (Lexical Analyzer)

Directory

A-FLEX/A3-FLEX/src

Compile (using Makefile)

cd A-FLEX/A3-FLEX/src
make

Compile (manual)

flex simple-flex-code.l
gcc lex.yy.c -o flex_app

Run

./flex_app < input.txt

Output

output.txt

Open the Documentation

  1. Navigate to the A-FLEX/A3-FLEX/docs/ directory
  2. Open the report corresponding to your preferred language:
    • English: Flex-Code.pdf
    • Greek: Κώδικας-Flex.pdf

B3 – BISON (Syntax Analyzer)

Directory

B-BISON/B3-COMPILER/src

Compile (using Makefile)

cd B-BISON/B3-COMPILER/src
make

Compile (manual)

bison -d simple-bison-code.y
flex simple-flex-code.l
gcc simple-bison-code.tab.c lex.yy.c -o bison_app

Run

./bison_app < input.txt

Output

output.txt

Open the Documentation

  1. Navigate to the B-BISON/B3-COMPILER/docs/ directory
  2. Open the report corresponding to your preferred language:
    • English: Bison-Code.pdf
    • Greek: Κώδικας-Bison.pdf

B2 – FLEX + BISON (Complete Compiler Pipeline)

Directory

B-BISON/B2-FLEX-BISON/src

Compile (using Makefile)

cd B-BISON/B2-FLEX-BISON/src
make

Compile (manual)

bison -d simple-bison-code.y
flex simple-flex-code.l
gcc simple-bison-code.tab.c lex.yy.c -o compiler

Run (test input)

./compiler < input-test.txt

Run (final input)

./compiler < input-final.txt

Output

output.txt

General Notes

  • Each module provides its own Makefile for convenience.
  • If execution permission is missing:
    chmod +x <executable>
  • The project has been tested on Linux (Ubuntu).
  • For clean builds (when supported):
    make clean

Troubleshooting

  • command not found: flex / bison
    Ensure the required packages are installed.

  • Linker errors
    Run make clean, then rebuild the project.

  • Unexpected output
    Verify that the correct input file is used and that it matches the grammar rules.
