chenxiaoyu233/CompilerExp

Why Did I Start This Project?

I use this project to learn how to implement a compiler. I want to approach the task in a more theoretical way: that is, I avoid traditional tools such as Lex and Yacc, and instead build everything from the theory I learned directly from books and papers.

What Have I Already Done?

A Lex Analyzer

It reads regular expressions and converts them into a deterministic finite automaton (DFA). This automaton is then used to capture tokens from the input stream.
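The repository builds its DFA from regular expressions; as a minimal, self-contained sketch of only the final step (running a DFA with maximal munch to capture tokens), something like the following works. The DFA and the two token classes here are hand-coded and purely illustrative, not taken from the repository:

```cpp
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// A hand-built DFA (the kind a regex-to-DFA conversion would produce)
// recognizing two illustrative token classes:
//   IDENT  = [a-zA-Z][a-zA-Z0-9]*
//   NUMBER = [0-9]+
enum State { START, IN_IDENT, IN_NUMBER, DEAD };

State step(State s, char c) {
    switch (s) {
    case START:
        if (std::isalpha((unsigned char)c)) return IN_IDENT;
        if (std::isdigit((unsigned char)c)) return IN_NUMBER;
        return DEAD;
    case IN_IDENT:  return std::isalnum((unsigned char)c) ? IN_IDENT : DEAD;
    case IN_NUMBER: return std::isdigit((unsigned char)c) ? IN_NUMBER : DEAD;
    default:        return DEAD;
    }
}

bool accepting(State s) { return s == IN_IDENT || s == IN_NUMBER; }

// Capture tokens from the input stream: run the DFA as far as possible
// and emit the longest accepted prefix (maximal munch), then restart.
std::vector<std::pair<std::string, std::string>> tokenize(const std::string& in) {
    std::vector<std::pair<std::string, std::string>> tokens;
    size_t pos = 0;
    while (pos < in.size()) {
        if (std::isspace((unsigned char)in[pos])) { ++pos; continue; }
        State s = START;
        size_t lastAccept = pos;     // end of the longest accepted prefix
        State lastState = DEAD;
        for (size_t i = pos; i < in.size(); ++i) {
            s = step(s, in[i]);
            if (s == DEAD) break;
            if (accepting(s)) { lastAccept = i + 1; lastState = s; }
        }
        if (lastState == DEAD) { ++pos; continue; }  // skip unrecognized char
        tokens.push_back({lastState == IN_IDENT ? "IDENT" : "NUMBER",
                          in.substr(pos, lastAccept - pos)});
        pos = lastAccept;
    }
    return tokens;
}
```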

A LR(k) Parser

It takes a grammar $G$ and builds a parse tree for each finite input string. I learned this algorithm from "On the Translation of Languages from Left to Right" by Donald E. Knuth.
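A full LR(k) table construction is far beyond a short example, but the table-driven shift-reduce driver at the heart of Knuth's method can be sketched with hand-computed LR(0) tables for a toy grammar, E -> E + n | n. All names and tables below are illustrative and not taken from the repository:

```cpp
#include <string>
#include <vector>

// Shift-reduce driver for the toy grammar
//   (1) E -> E + n
//   (2) E -> n
// using hand-computed LR(0) tables. Input is a string over the
// terminals 'n' and '+', with '$' appended as the end marker.
enum ActType { SHIFT, REDUCE, ACCEPT, ERROR };
struct Act { ActType type; int arg; };  // arg: target state or rule number

Act action(int state, char a) {
    switch (state) {
    case 0: if (a == 'n') return {SHIFT, 2}; break;
    case 1: if (a == '+') return {SHIFT, 3};
            if (a == '$') return {ACCEPT, 0}; break;
    case 2: return {REDUCE, 2};               // reduce by E -> n
    case 3: if (a == 'n') return {SHIFT, 4}; break;
    case 4: return {REDUCE, 1};               // reduce by E -> E + n
    }
    return {ERROR, 0};
}

int gotoE(int state) { return state == 0 ? 1 : -1; }  // GOTO[state][E]

bool parse(std::string input) {
    input += '$';
    std::vector<int> stack{0};   // stack of LR states
    size_t pos = 0;
    while (true) {
        Act a = action(stack.back(), input[pos]);
        switch (a.type) {
        case SHIFT:  stack.push_back(a.arg); ++pos; break;
        case REDUCE: {
            int rhsLen = (a.arg == 1) ? 3 : 1;    // |E + n| = 3, |n| = 1
            stack.resize(stack.size() - rhsLen);  // pop the handle
            stack.push_back(gotoE(stack.back())); // push GOTO on E
            break;
        }
        case ACCEPT: return true;
        case ERROR:  return false;
        }
    }
}
```

With LR(k), the `action` table would be indexed by the next k tokens of lookahead instead of a single character; the driver loop itself stays the same.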

A complete FrontEnd framework for writing a compiler

It contains a main class named FrontEnd. The interface of this class looks like this:

class FrontEnd {
protected:
    string context;
    map<string, int> s2i;
    map<int, string> i2s;
    HumanGrammer hg;
    LR::Grammer g;
    LR::String sentence;
    LR::ParseTree* tree;
    vector<string> logContent;
    vector<LexicalAnalyzer::LexicalItemInfo> lexResult;
    vector<MCodeBase*> semantic;
    LexicalAnalyzer *lex;
    int k;

    FrontEnd(string context);
    ~FrontEnd();
    
    /* AUX functions (you should not touch these functions) */
    /* aux function used to construct logContent when we want to output the parse tree */
    string handleSpecialCharacter(string s);
    /* aux function that assigns integer ids to the symbols of a human-readable grammar */
    void indexSymbols(HumanGrammer hg, map<string, int> &s2i, map<int, string> &i2s);
    /* Convert a HumanGrammer hg to a Grammer which is recognized by the parser */
    LR::Grammer HG2G(HumanGrammer hg, map<string, int> &s2i, map<int, string> &i2s);
    /* delete the parse tree */
    void deleteTree(LR::ParseTree *rt);

    // functions that the user needs to implement
    /* define your lexical rules here */
    virtual void lexDefinition() = 0;
    /* handle your lexical errors here */
    virtual void lexErrorHandler(LexicalAnalyzer::LexicalErrorInfo errInfo) = 0;
    /* define your grammar and its semantic actions here */
    virtual void grammerDefinition() = 0;
    /* handle your grammar errors here */
    virtual bool grammerErrorHandler(int errorAt);
    
    /* stages of translation */
    // Lex Stage
    void LexDefinition();
    void LexProcess();
    virtual bool AfterLex() {return true;}
    // Grammar Stage
    void GrammerDefinition();
    bool GrammerProcess();
    virtual bool AfterGrammer() {return true;}
    // Semantic Stage
    MCodeBase* SemanticAnalysis(LR::ParseTree *rt, int &cnt, int &errorCnt, int &warnCnt);
    
public:
    // end-to-end translation
    // @param int k: perform LR(k) parsing
    // @param string start: the start symbol of the grammar
    // @ret MCodeBase*: a pointer representing the intermediate code
    // @note: the FrontEnd class will not free this pointer.
    MCodeBase* EndToEnd(int k, string start);
    
    // loggers
    // log the lexer's DFA
    void LogDFA();
    // log the parse tree generated by the LR parser
    void LogParseTree();
};

When you use this framework to write your own language, you only need to care about the grammar productions and the process used to generate your target code.

To make this convenient, I provide a macro. Here is an example of how to use it:

PE("var -> ID [ expression ]",
     /* check if the symbol exists */
     if (!symbolExists(ch(0))) {
         ErrorReport(context).Report(
             "error", "the symbol \'" + ch(0) + "\' does not exist",
             ret -> begin, ret -> end
         );
         ret -> errorCnt += 1;
     }
     /* check that this var is used correctly */
     if (ret -> errorCnt == 0 && findSymbol(ch(0)).len == 0) {
         fprintf(stderr, "the symbol %s is not an array, it should not be followed by []\n", ch(0).c_str());
         exit(233);
     }
     /* generate the code */
     ret -> include(child[2]);
     string *s = new string();
     *s = ch(0) + "[" + ch(2) + "]";
     (ret -> info)["var"] = s;
 );
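The PE macro itself is defined elsewhere in the repository. As a rough, hypothetical illustration of the pattern it embodies (registering a production string together with a semantic-action body), one might write something like the following. The `Node` type stands in for the parse-tree node `ret`, and `ch` is modeled as a vector of child values rather than a function; none of these names come from the repository:

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch, NOT the repository's implementation:
// a registry mapping production strings to semantic actions.
struct Node {
    std::map<std::string, std::string> info;  // attributes computed for this node
    int errorCnt = 0;
};

using Action = std::function<void(Node&, const std::vector<std::string>&)>;
std::map<std::string, Action> rules;  // production string -> semantic action

// The variadic macro wraps the action body in a lambda, so the body may
// contain arbitrary statements (including commas).
#define PE(prod, ...) \
    rules[prod] = [](Node& ret, const std::vector<std::string>& ch) { __VA_ARGS__ }

void registerRules() {
    PE("var -> ID [ expression ]",
        /* generate the code for an array access */
        ret.info["var"] = ch[0] + "[" + ch[2] + "]";
    );
}
```

The design point is that each production carries its action as ordinary C++ code, so adding a grammar rule and its translation is a single statement.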

A CMinus Language

Using the FrontEnd framework, I built a $C^-$ language.

How to deploy the CMinus language

There are several options for compiling the whole project.

option(TestLex "compile to test Lexical Analyzer" OFF)
option(TestParser "compile to test Parser" OFF)
option(TestCMinusFront "compile to test CMinus FrontEnd" OFF)
option(TestCMinus "compile to test CMinus" OFF)
option(NDEBUG "turn off debug" OFF)

If you only want to compile the CMinus language, you can use the following commands.

cmake -DTestCMinus=ON .
make && make install

Then the executable target TestCMinus will be compiled and installed to CMinus/test/Debug.

Usage of TestCMinus

command                       meaning
./TestCMinus XXX.c lex        perform the lex stage on XXX.c
./TestCMinus XXX.c mcode      generate intermediate (middle-stage) code for XXX.c
./TestCMinus XXX.c mtree      generate the grammar (parse) tree for XXX.c
./TestCMinus XXX.c            generate target code for XXX.c

About

a toy meta compiler
