How to resolve the algorithm Compiler/code generator step by step in the ALGOL W programming language

Published on 12 May 2024 09:40 PM

How to resolve the algorithm Compiler/code generator step by step in the ALGOL W programming language

Table of Contents

Problem Statement

A code generator translates the output of the syntax analyzer and/or semantic analyzer into lower level code, either assembly, object, or virtual. Take the output of the Syntax analyzer task - which is a flattened Abstract Syntax Tree (AST) - and convert it to virtual machine code, that can be run by the Virtual machine interpreter. The output is in text format, and represents virtual assembly code. The program should read input from a file and/or stdin, and write output to a file and/or stdout. As shown in the table, above, the output from the syntax analyzer is a flattened AST. In the AST, Identifier, Integer, and String, are terminal nodes, e.g, they do not have child nodes. Loading this data into an internal parse tree should be as simple as: 32-bit integers and strings Each instruction is one byte. The following instructions also have a 32-bit integer operand: where index is an index into the data array. where index is an index into the data array. where value is a 32-bit integer that will be pushed onto the stack. where (n) is a 32-bit integer specifying the distance between the current location and the desired location. addr is an unsigned value of the actual code address. where (n) is a 32-bit integer specifying the distance between the current location and the desired location. addr is an unsigned value of the actual code address. The following instructions do not have an operand. They perform their operation directly against the stack: For the following instructions, the operation is performed against the top two entries in the stack: For the following instructions, the operation is performed against the top entry in the stack: Print the word at stack top as a character. Print the word at stack top as an integer. Stack top points to an index into the string pool. Print that entry. Unconditional stop. Your solution should pass all the test cases above and the additional tests found Here. The C and Python versions can be considered reference implementations.

Let's start with the solution:

Step by Step solution about How to resolve the algorithm Compiler/code generator step by step in the ALGOL W programming language

Source code in the algol programming language

begin % code generator %
    % parse tree nodes %
    record node( integer         type
               ; reference(node) left, right
               ; integer         iValue % nString/nIndentifier number or nInteger value %
               );
    integer    nIdentifier, nString, nInteger, nSequence, nIf,   nPrtc, nPrts
          ,    nPrti,       nWhile,  nAssign,  nNegate,   nNot,  nMultiply
          ,    nDivide,     nMod,    nAdd,     nSubtract, nLess, nLessEqual
          ,    nGreater,    nGreaterEqual,     nEqual,    nNotEqual,    nAnd, nOr
          ;
    string(14) array ndName ( 1 :: 25 );
    integer    array nOp    ( 1 :: 25 );
    integer    MAX_NODE_TYPE;
    % string literals and identifiers - uses a linked list - a hash table might be better... %
    string(1)  array text ( 0 :: 4095 );
    integer    textNext, TEXT_MAX;
    record textElement ( integer start, length; reference(textElement) next );
    reference(textElement) idList, stList;
    % op codes %
    integer    oFetch, oStore, oPush
          ,    oAdd,   oSub,   oMul, oDiv, oMod, oLt, oGt,   oLe,   oGe,   oEq,  oNe
          ,    oAnd,   oOr,    oNeg, oNot, oJmp, oJz, oPrtc, oPrts, oPrti, oHalt
          ;
    string(6)  array opName ( 1 :: 24 );
    % code - although this is intended to be byte code, as we are going to output    %
    %        an assembler source, we use integers for convenience                    %
    % labelLocations are: - ( referencing location + 1 ) if they have been referenced but not defined yet, %
    %                     zero     if they are unreferenced and undefined,                                 %
    %                     ( referencing location + 1 )   if they are defined                               %
    integer    array byteCode ( 0 :: 4095 );
    integer    array labelLocation( 1 :: 4096 );
    integer    nextLocation, MAX_LOCATION, nextLabelNumber, MAX_LABEL_NUMBER;

    % returns a new node with left and right branches %
    reference(node) procedure opNode ( integer value opType; reference(node) value opLeft, opRight ) ; begin
        node( opType, opLeft, opRight, 0 )
    end opNode ;

    % returns a new operand node %
    reference(node) procedure operandNode ( integer value opType, opValue ) ; begin
        node( opType, null, null, opValue )
    end operandNode ;

    % reports an error and stops %
    procedure genError( string(80) value message ); begin
        integer errorPos;
        write( s_w := 0, "**** Code generation error: " );
        errorPos := 0;
        while errorPos < 80 and message( errorPos // 1 ) not = "." do begin
            writeon( s_w := 0, message( errorPos // 1 ) );
            errorPos := errorPos + 1
        end while_not_at_end_of_message ;
        writeon( s_w := 0, "." );
        assert( false )
    end genError ;

    % reads a node from standard input %
    reference(node) procedure readNode ; begin
        reference(node) resultNode;

        % parses a string from line and stores it in a string in the text array %
        % - if it is not already present in the specified textElement list.     %
        % returns the position of the string in the text array                  %
        integer procedure readString ( reference(textElement) value result txList; string(1) value terminator ) ; begin
            string(256) str;
            integer     sLen, sPos, ePos;
            logical     found;
            reference(textElement) txPos, txLastPos;
            % get the text of the string %
            str  := " ";
            sLen := 0;
            str( sLen // 1 ) := line( lPos // 1 );
            sLen := sLen + 1;
            lPos := lPos + 1;
            while lPos <= 255 and line( lPos // 1 ) not = terminator do begin
                str( sLen // 1 ) := line( lPos // 1 );
                sLen := sLen + 1;
                lPos := lPos + 1
            end while_more_string ;
            if lPos > 255 then genError( "Unterminated String in node file." );
            % attempt to find the text in the list of strings/identifiers %
            txLastPos := txPos := txList;
            found := false;
            ePos := 0;
            while not found and txPos not = null do begin
                ePos  := ePos + 1;
                found := ( length(txPos) = sLen );
                sPos  := 0;
                while found and sPos < sLen do begin
                    found := str( sPos // 1 ) = text( start(txPos) + sPos );
                    sPos  := sPos + 1
                end while_not_found ;
                txLastPos := txPos;
                if not found then txPos := next(txPos)
            end while_string_not_found ;
            if not found then begin
                % the string/identifier is not in the list - add it %
                ePos := ePos + 1;
                if txList = null then txList := textElement( textNext, sLen, null )
                                 else next(txLastPos) := textElement( textNext, sLen, null );
                if textNext + sLen > TEXT_MAX then genError( "Text space exhausted." )
                else begin
                    for cPos := 0 until sLen - 1 do begin
                        text( textNext ) := str( cPos // 1 );
                        textNext := textNext + 1
                    end for_cPos
                end
            end if_not_found ;
            ePos
        end readString ;

        % gets an integer from the line - no checks for valid digits %
        integer procedure readInteger ; begin
            integer n;
            n := 0;
            while line( lPos // 1 ) not = " " do begin
                n    := ( n * 10 ) + ( decode( line( lPos // 1 ) ) - decode( "0" ) );
                lPos := lPos + 1
            end while_not_end_of_integer ;
            n
        end readInteger ;

        string(256) line;
        string(16)  name;
        integer     lPos, tPos, ndType;
        tPos := lPos := 0;
        readcard( line );
        % get the node type name %
        while line( lPos // 1 ) = " " do lPos := lPos + 1;
        name := "";
        while lPos < 256 and line( lPos // 1 ) not = " " do begin
            name( tPos // 1 ) := line( lPos // 1 );
            lPos := lPos + 1;
            tPos := tPos + 1
        end  while_more_name ;
        % determine the node type %
        ndType         := 1;
        resultNode     := null;
        if name not = ";" then begin
            % not a null node %
            while ndType <= MAX_NODE_TYPE and name not = ndName( ndType ) do ndType := ndType + 1;
            if ndType > MAX_NODE_TYPE then genError( "Malformed node." );
            % handle the additional parameter for identifier/string/integer, or sub-nodes for operator nodes %
            if ndType = nInteger or ndType = nIdentifier or ndType = nString then begin
                while line( lPos // 1 ) = " " do lPos := lPos + 1;
                if      ndType = nInteger    then resultNode := operandNode( ndType, readInteger )
                else if ndType = nIdentifier then resultNode := operandNode( ndType, readString( idList, " "  ) )
                else  % ndType = nString     %    resultNode := operandNode( ndType, readString( stList, """" ) )
                end
            else begin
                % operator node %
                reference(node) leftNode;
                leftNode   := readNode;
                resultNode := opNode( ndType, leftNode, readNode )
            end
        end if_non_null_node ;
        resultNode
    end readNode ;

    % returns the next free label number %
    integer procedure newLabel ; begin
        nextLabelNumber := nextLabelNumber + 1;
        if nextLabelNumber > MAX_LABEL_NUMBER then genError( "Program too complex" );
        nextLabelNumber
    end newLabel ;

    % defines the specified label to be at the next location %
    procedure defineLabel ( integer value labelNumber ) ; begin
        if labelLocation( labelNumber ) > 0 then genError( "Label already defined" )
        else begin
            % this is the first definition of the label, define it and if it has already been referenced, fill in the reference %
            integer currValue;
            currValue := labelLocation( labelNumber );
            labelLocation( labelNumber ) := nextLocation + 1; % we store pc + 1 to ensure the label location is positive %
            if currValue < 0 then % already referenced % byteCode( - ( currValue + 1 ) ) := labelLocation( labelNumber )
        end
    end defineLabel ;

    % stores a byte in the code %
    procedure genByte ( integer value byteValue ) ; begin
        if nextLocation > MAX_LOCATION then genError( "Program too large" );
        byteCode( nextLocation ) := byteValue;
        nextLocation := nextLocation + 1
    end genByte ;

    % stores an integer in the code %
    procedure genInteger ( integer value integerValue ) ; begin
        % we are storing the bytes of the code in separate integers for convenience %
        genByte( integerValue ); genByte( 0 ); genByte( 0 ); genByte( 0 )
    end genInteger ;

    % generates an operation acting on an address %
    procedure genDataOp ( integer value opCode, address ) ; begin
        genByte( opCode );
        genInteger( address )
    end genDataOp ;

    % generates a nullary operation %
    procedure genOp0  ( integer value opCode ) ; begin
        genByte( opCode )
    end genOp0 ;

    % generates a unary/binary operation %
    procedure genOp ( reference(node) value n ) ; begin
        gen(  left(n) );
        gen( right(n) ); % right will be null for a unary op so no code will be generated %
        genByte( nOp( type(n) ) )
    end genOp ;

    % generates a jump operation %
    procedure genJump   ( integer value opCode, labelNumber ) ; begin
        genByte( opCode );
        % if the label is not defined yet - set it's location to the negative of the referencing location %
        % so it can be resolved later %
        if labelLocation( labelNumber ) = 0 then labelLocation( labelNumber ) := - ( nextLocation + 1 );
        genInteger( labelLocation( labelNumber ) )
    end genJump ;

    % generates code for the node n %
    procedure gen ( reference(node) value n ) ; begin

        if           n  = null        then % empty node % begin end
        else if type(n) = nIdentifier then genDataOp( oFetch, iValue(n) )
        else if type(n) = nString     then genDataOp( oPush,  iValue(n) - 1 )
        else if type(n) = nInteger    then genDataOp( oPush,  iValue(n) )
        else if type(n) = nSequence   then begin
            gen(  left(n) );
            gen( right(n) )
            end
        else if type(n) = nIf         then % if-else         % begin
            integer elseLabel;
            elseLabel := newLabel;
            gen( left(n) );
            genJump( oJz, elseLabel );
            gen( left( right(n) ) );
            if right(right(n)) = null then % no "else" part % defineLabel( elseLabel )
            else begin
                % have an "else" part %
                integer endIfLabel;
                endIfLabel := newLabel;
                genJump( oJmp, endIfLabel );
                defineLabel( elseLabel );
                gen( right(right(n)) );
                defineLabel( endIfLabel )
            end
            end
        else if type(n) = nWhile      then % while-loop      % begin
            integer loopLabel, exitLabel;
            loopLabel := newLabel;
            exitLabel := newLabel;
            defineLabel( loopLabel );
            gen(  left(n) );
            genJump( oJz,  exitLabel );
            gen( right(n) );
            genJump( oJmp, loopLabel );
            defineLabel( exitLabel )
            end
        else if type(n) = nAssign     then % assignment      % begin
            gen( right( n ) );
            genDataOp( oStore, iValue(left(n)) )
            end
        else genOp( n )
    end gen ;

    % outputs the generated code to standard output %
    procedure emitCode ; begin

        % counts the number of elements in a text element list %
        integer procedure countElements ( reference(textElement) value txHead ) ; begin
            integer count;
            reference(textElement) txPos;
            count := 0;
            txPos := txHead;
            while txPos not = null do begin
                count := count + 1;
                txPos := next(txPos)
            end while_txPos_not_null ;
            count
        end countElements ;

        integer pc, op;
        reference(textElement) txPos;

        % code header %
        write( i_w := 1, s_w := 0
             , "Datasize: ", countElements( idList )
             , " Strings: ", countElements( stList )
             );
        % output the string literals %
        txPos := stList;
        while txPos not = null do begin
            integer cPos;
            write( """" );
            cPos := 1; % start from 1 to skip over the leading " %
            while cPos < length(txPos) do begin
                writeon( s_w := 0, text( start(txPos) + cPos ) );
                cPos := cPos + 1
            end while_not_end_of_string ;
            writeon( s_w := 0, """" );
            txPos := next(txPos)
        end while_not_at_end_of_literals ;

        % code body %
        pc := 0;
        while pc < nextLocation do begin
            op := byteCode( pc );
            write( i_w := 4, s_w := 0, pc, " ", opName( op ) );
            pc := pc + 1;
            if      op = oFetch or op = oStore then begin
                % data load/store - add the address in square brackets %
                writeon( i_w := 1, s_w := 0, "[", byteCode( pc ) - 1, "]" );
                pc := pc + 4
                end
            else if op = oPush                 then begin
                % push constant - add the constant %
                writeon( i_w := 1, s_w := 0, byteCode( pc ) );
                pc := pc + 4
                end
            else if op = oJmp   or op = oJz    then begin
                % jump - show the relative address in brackets and the absolute address %
                writeon( i_w := 1, s_w := 0, "(", ( byteCode( pc ) - 1 ) - pc, ") ", byteCode( pc ) - 1 );
                pc := pc + 4
            end
        end while_pc_lt_nextLocation
    end emitCode ;

    oFetch :=  1; opName( oFetch ) := "fetch"; oStore :=  2; opName( oStore ) := "store"; oPush :=  3; opName( oPush ) := "push";
    oAdd   :=  4; opName( oAdd   ) := "add";   oSub   :=  5; opName( oSub   ) := "sub";   oMul  :=  6; opName( oMul  ) := "mul";
    oDiv   :=  7; opName( oDiv   ) := "div";   oMod   :=  8; opName( oMod   ) := "mod";   oLt   :=  9; opName( oLt   ) := "lt";
    oGt    := 10; opName( oGt    ) := "gt";    oLe    := 11; opName( oLe    ) := "le";    oGe   := 12; opName( oGe   ) := "ge";
    oEq    := 13; opName( oEq    ) := "eq";    oNe    := 14; opName( oNe    ) := "ne";    oAnd  := 15; opName( oAnd  ) := "and";
    oOr    := 16; opName( oOr    ) := "or";    oNeg   := 17; opName( oNeg   ) := "neg";   oNot  := 18; opName( oNot  ) := "not";
    oJmp   := 19; opName( oJmp   ) := "jmp";   oJz    := 20; opName( oJz    ) := "jz";    oPrtc := 21; opName( oPrtc ) := "prtc";
    oPrts  := 22; opName( oPrts  ) := "prts";  oPrti  := 23; opName( oPrti  ) := "prti";  oHalt := 24; opName( oHalt ) := "halt";

    nIdentifier      :=  1; ndName( nIdentifier   ) := "Identifier";   nString   :=  2; ndName( nString   ) := "String";
    nInteger         :=  3; ndName( nInteger      ) := "Integer";      nSequence :=  4; ndName( nSequence ) := "Sequence";
    nIf              :=  5; ndName( nIf           ) := "If";           nPrtc     :=  6; ndName( nPrtc     ) := "Prtc";
    nPrts            :=  7; ndName( nPrts         ) := "Prts";         nPrti     :=  8; ndName( nPrti     ) := "Prti";
    nWhile           :=  9; ndName( nWhile        ) := "While";        nAssign   := 10; ndName( nAssign   ) := "Assign";
    nNegate          := 11; ndName( nNegate       ) := "Negate";       nNot      := 12; ndName( nNot      ) := "Not";
    nMultiply        := 13; ndName( nMultiply     ) := "Multiply";     nDivide   := 14; ndName( nDivide   ) := "Divide";
    nMod             := 15; ndName( nMod          ) := "Mod";          nAdd      := 16; ndName( nAdd      ) := "Add";
    nSubtract        := 17; ndName( nSubtract     ) := "Subtract";     nLess     := 18; ndName( nLess     ) := "Less";
    nLessEqual       := 19; ndName( nLessEqual    ) := "LessEqual";    nGreater  := 20; ndName( nGreater  ) := "Greater";
    nGreaterEqual    := 21; ndName( nGreaterEqual ) := "GreaterEqual"; nEqual    := 22; ndName( nEqual    ) := "Equal";
    nNotEqual        := 23; ndName( nNotEqual     ) := "NotEqual";     nAnd      := 24; ndName( nAnd      ) := "And";
    nOr              := 25; ndName( nOr           ) := "Or";
    MAX_NODE_TYPE    := 25; TEXT_MAX := 4095; textNext := 0;
    stList := idList := null;
    for nPos := 1 until MAX_NODE_TYPE do nOp( nPos ) := -1;
    nOp( nPrtc     ) := oPrtc; nOp( nPrts      ) := oPrts; nOp( nPrti    ) := oPrti; nOp( nNegate       ) := oNeg; nOp( nNot      ) := oNot;
    nOp( nMultiply ) := oMul;  nOp( nDivide    ) := oDiv;  nOp( nMod     ) := oMod;  nOp( nAdd          ) := oAdd; nOp( nSubtract ) := oSub;
    nOp( nLess     ) := oLt;   nOp( nLessEqual ) := oLe;   nOp( nGreater ) := oGt;   nOp( nGreaterEqual ) := oGe;  nOp( nEqual    ) := oEq;
    nOp( nNotEqual ) := oNe;   nOp( nAnd       ) := oAnd;  nOp( nOr      ) := oOr;
    nextLocation     := 0; MAX_LOCATION := 4095;
    for pc := 0 until MAX_LOCATION do byteCode( pc ) := 0;
    nextLabelNumber := 0; MAX_LABEL_NUMBER := 4096;
    for lPos := 1 until MAX_LABEL_NUMBER do labelLocation( lPos ) := 0;

    % parse the output from the syntax analyser and generate code from the parse tree %
    gen( readNode );
    genOp0( oHalt );
    emitCode
end.

  

You may also check:How to resolve the algorithm Real constants and functions step by step in the Plain English programming language
You may also check:How to resolve the algorithm Create an HTML table step by step in the 360 Assembly programming language
You may also check:How to resolve the algorithm History variables step by step in the AutoHotkey programming language
You may also check:How to resolve the algorithm Parsing/RPN calculator algorithm step by step in the AutoHotkey programming language
You may also check:How to resolve the algorithm Trabb Pardo–Knuth algorithm step by step in the Java programming language