Thursday 30 April 2015

How To Program to Eliminate Left Factoring in Compiler Design

Post By - Tanmay | 4/30/2015 11:40:00 am
In LL(1) Parser in Compiler Design, Even if a context-free grammar is unambiguous and non-left-recursion it still can not be a LL(1) Parser. That is because of Left Factoring.

What is  Left Factoring ?

Consider a part of regular grammar,

E->aE+bcD
E->aE+cBD

Here, grammar is non-left recursive, and unambiguous but there is left factoring.

How to resolve ?

E=aB | aC | aD | ............

then,

E=aX
X=B | C | D |...........

So, the above grammar will be as :

E=aE+X
X=bcD | cBD

Program :

1:  #include<stdio.h>  
2:  #include<string.h>  
3:  int main()  
4:  {  
5:       char gram[20],part1[20],part2[20],modifiedGram[20],newGram[20],tempGram[20];  
6:       int i,j=0,k=0,l=0,pos;  
7:       printf("Enter Production : A->");  
8:       gets(gram);  
9:       for(i=0;gram[i]!='|';i++,j++)  
10:            part1[j]=gram[i];  
11:       part1[j]='\0';  
12:       for(j=++i,i=0;gram[j]!='\0';j++,i++)  
13:            part2[i]=gram[j];  
14:       part2[i]='\0';  
15:       for(i=0;i<strlen(part1)||i<strlen(part2);i++)  
16:       {  
17:            if(part1[i]==part2[i])  
18:            {  
19:                 modifiedGram[k]=part1[i];  
20:                 k++;  
21:                 pos=i+1;  
22:            }  
23:       }  
24:       for(i=pos,j=0;part1[i]!='\0';i++,j++){  
25:            newGram[j]=part1[i];  
26:       }  
27:       newGram[j++]='|';  
28:       for(i=pos;part2[i]!='\0';i++,j++){  
29:            newGram[j]=part2[i];  
30:       }  
31:       modifiedGram[k]='X';  
32:       modifiedGram[++k]='\0';  
33:       newGram[j]='\0';  
34:       printf("\n A->%s",modifiedGram);  
35:       printf("\n X->%s\n",newGram);  
36:  }  

 Output :

How To Program Compiler Design Symbol Table Generator in C

Post By - Tanmay | 4/30/2015 10:32:00 am
Symbol table is a data structure used by a language translator such as a compiler or interpreter, where each identifier in a program's source code is associated with information relating to its declaration or appearance in the source, such as its type, scope level and sometimes its location.

--- Source : Wikipedia 

In this tutorial we will learn to build symbol table generator based on identifiers and keywords.

Test Case : Suppose in a program you encounter with some data types as

int a; float b; char c; double g,h;

Symbol table will be as follow :

INTEGER : a
FLOAT : b
CHAR : c
DOUBLE : g
DOUBLE : h

Problem :

YYSTYPE in y.tab.c and y.tab.h is by default INTEGER type, since we are using extern char *yylval it will say a warning and gives an error "error: conflicting types for ‘yylval’ In file included" along with it. To resolve it
just open your generated y.tab.c and y.tab.h and steps :

1: Search for typedef int YYSTYPE
2: Replace int with char*

Perform this in both of the file i.e. y.tab.c and y.tab.h
Program :

LEX File : Symbol.l
1:  %{  
2:  #include"y.tab.h"  
3:  extern char *yylval;  
4:  int x=0;  
5:  %}  
6:  %%  
7:  "int" {x++;return INT;}  
8:  "float" {x++;return FLOAT;}  
9:  "double" {x++;return DOUBLE;}  
10:  "char" {x++;return CHAR;;}  
11:  [a-z]+ {yylval=yytext; if(x>0)return ID; return O;}  
12:  [\n] {return NL;}  
13:  "," {return C;}  
14:  ";" {x--;return SE;}  
15:  . {}  
16:  %%  

YACC File : Symbol.y

1:  %{  
2:  #include<stdio.h>  
3:  #include<string.h>  
4:  int fl=0,i=0,type[100],j=0,error_flag=0;  
5:  char symbol[100][100],temp[100];  
6:  %}  
7:  %token INT FLOAT C DOUBLE CHAR ID NL SE O  
8:  %%  
9:  START:S1 NL {return;}  
10:  ;  
11:  S1:S NL S1  
12:  |S NL  
13:  ;  
14:  S:INT L1 E  
15:  |FLOAT L2 E  
16:  |DOUBLE L3 E  
17:  |CHAR L4 E  
18:  |INT L1 E S  
19:  |FLOAT L2 E S  
20:  |DOUBLE L3 E S  
21:  |CHAR L4 E S  
22:  |O  
23:  ;  
24:  L1:L1 C ID {strcpy(temp,(char *)$3);insert(0);}  
25:  |ID {strcpy(temp,(char *)$1);insert(0);}  
26:  ;  
27:  L2:L2 C ID {strcpy(temp,(char *)$3);insert(1);}  
28:  |ID {strcpy(temp,(char *)$1);insert(1);}  
29:  ;  
30:  L3:L3 C ID {strcpy(temp,(char *)$3);insert(2);}  
31:  |ID {strcpy(temp,(char *)$1);insert(2);}  
32:  ;  
33:  L4:L4 C ID {strcpy(temp,(char *)$3);insert(3);}  
34:  |ID {strcpy(temp,(char *)$1);insert(3);}  
35:  ;  
36:  E:SE  
37:  ;  
38:  %%  
39:  int main()  
40:  {  
41:       yyparse();  
42:       if(error_flag==0)  
43:       for(j=0;j<i;j++)  
44:       {  
45:            if(type[j]==0)  
46:                 printf(" INT - ");  
47:            if(type[j]==1)  
48:                 printf(" FLOAT - ");  
49:            if(type[j]==2)  
50:                 printf(" DOUBLE - ");  
51:            if(type[j]==3)  
52:                 printf(" CHAR - ");  
53:            printf(" %s\n",symbol[j]);  
54:       }  
55:  }  
56:  int yyerror(char *ch)  
57:  {  
58:       return 1;  
59:  }  
60:  int yywrap(){  
61:       return 1;  
62:  }  
63:  int insert(int type1)  
64:  {  
65:       fl=0;  
66:       for(j=0;j<fl;j++)  
67:       if(strcmp(temp,symbol[j])==0)  
68:       {  
69:            if(type[i]==type1)  
70:                 printf("REDECLARATION OF %s\n",temp);  
71:            else  
72:            {  
73:                 printf("MULTIPLE DECLARATION OF %s\n",temp);  
74:                 error_flag=1;  
75:            }  
76:            fl=1;  
77:       }  
78:       if(fl==0)  
79:       {  
80:            strcpy(symbol[i],temp);  
81:            type[i]=type1;  
82:            i++;  
83:       }  
84:  }  

Output :

How To Find Left Recursion and Remove it Using C Program

Post By - Tanmay | 4/30/2015 10:13:00 am
In this tutorial you will learn to develop a program in which you'll find and remove left recursion.

What is left recursion ?

Left Recursion:

Consider,

E->E+T
E=a
T=b

In it's parse tree E will grow left indefinitely, so to remove it

E=Ea | b

we take as

E=bE'
E'= aE'|E

Program :

1:  #include<stdio.h>  
2:  #include<string.h>  
3:  #define SIZE 10  
4:  int main () {  
5:       char non_terminal;  
6:       char beta,alpha;  
7:       int num;  
8:       char production[10][SIZE];  
9:       int index=3; /* starting of the string following "->" */  
10:       printf("Enter Number of Production : ");  
11:       scanf("%d",&num);  
12:       printf("Enter the grammar as E->E-A :\n");  
13:       for(int i=0;i<num;i++){  
14:            scanf("%s",production[i]);  
15:       }  
16:       for(int i=0;i<num;i++){  
17:            printf("\nGRAMMAR : : : %s",production[i]);  
18:            non_terminal=production[i][0];  
19:            if(non_terminal==production[i][index]) {  
20:                 alpha=production[i][index+1];  
21:                 printf(" is left recursive.\n");  
22:                 while(production[i][index]!=0 && production[i][index]!='|')  
23:                      index++;  
24:                 if(production[i][index]!=0) {  
25:                      beta=production[i][index+1];  
26:                      printf("Grammar without left recursion:\n");  
27:                      printf("%c->%c%c\'",non_terminal,beta,non_terminal);  
28:                      printf("\n%c\'->%c%c\'|E\n",non_terminal,alpha,non_terminal);  
29:                 }  
30:                 else  
31:                      printf(" can't be reduced\n");  
32:            }  
33:            else  
34:                 printf(" is not left recursive.\n");  
35:            index=3;  
36:       }  
37:  }   

Output :



Next : Symbol Table Generator

How to Program to Print First and Follow of Dynamic Regular Grammar

Post By - Tanmay | 4/30/2015 09:38:00 am
In this tutorial,you'll learn to program the algorithm to find the First of Dynamic Grammar. However, you can also build your own code for static grammar.

What is First of a Grammar ?

Consider, ('$' is for NULL)

E=E+T
E=T
T=T-E
T=F
T=id
F=lex
F=$
E=$

So, first of any grammar is defined by where a non terminal is heading to get a terminal.
In above case,

For E,
E goes to '$' and 'T', which can be included in First{E}={'$' and First{T}}

For T,
T heads to 'id' and 'F', which can be included in First{t}={'id' and First{F}}


For F,
F heads to 'lex' and '$', which can be included in First{t}={'$','lex'}

so,
First{F}={'$','lex'}
First{T}={'$','lex','id'}
First{E}={'$','lex','id'}

Program :

1:  #include"ctype.h"  
2:  #include"string.h"  
3:  #include"stdio.h"  
4:  char gram[10][10],vFirst[5];  
5:  int elem[10],size,fPt,k=0;  
6:  int getGram(){  
7:       char ch; int i,j,k;       
8:       printf("\nEnter Number of Rule : ");  
9:       scanf("%d",&size);  
10:       printf("\nEnter Grammar as E=E+B \n");  
11:       for(i=0;i<size;i++){  
12:            scanf("%s%c",gram[i],&ch);  
13:            elem[i]=strlen(gram[i]);  
14:       }  
15:       printf("\nGrammar is :\n");  
16:       for(i=0;i<size;i++)  
17:            printf("\n%s",gram[i]);  
18:  }  
19:  int funcFirst(char victim){  
20:       int j,i;  
21:       if(!(isupper(victim)))  
22:            vFirst[k++]=victim;  
23:       else  
24:       for(j=0;j<size;j++){  
25:            if(gram[j][0]==victim){  
26:                 if(gram[j][2]=='$')  
27:                      vFirst[k++]='$';  
28:                 else if(islower(gram[j][2]))  
29:                      vFirst[k++]=gram[j][2];  
30:                 else  
31:                      funcFirst(gram[j][2]);  
32:            }  
33:       }  
34:  }  
35:  int main(){  
36:       int i,j,k;  
37:       getGram();  
38:       printf("\nEnter the Non Terminal : ");  
39:       char nt;  
40:       scanf("%c",&nt);  
41:       funcFirst(nt);  
42:       printf("\n{ ");  
43:       for(i=0;i<strlen(vFirst);i++)  
44:            printf(" %c",vFirst[i]);  
45:       printf(" }\n");  
46:  }  

Output :




The Code for Follow is also added, but it's something fishy. Go figure it out, find the bug and get a chance to get featured with one program on C#ODE STUDIO and CS-BEANS.

1:  #include"stdio.h"  
2:  #include"string.h"  
3:  #include"ctype.h"  
4:  int n=0,m=0,i=0,j=0,k=0;  
5:  char vGram[10][10], vFirst[5], vFollow[10];  
6:  int funcFirst(char);  
7:  int funcFirst_F(char);  
8:  int funcFollow(char);  
9:  int funcFirst(char victim){  
10:       int j,i;  
11:       if(!(isupper(victim)))  
12:            vFirst[k++]=victim;  
13:       for(j=0;j<n;j++){  
14:            if(vGram[j][0]==victim){  
15:                 if(vGram[j][2]=='$')  
16:                      vFirst[k++]='$';  
17:                 else if(islower(vGram[j][2]))  
18:                      vFirst[k++]=vGram[j][2];  
19:                 else  
20:                      funcFirst(vGram[j][2]);  
21:            }  
22:       }  
23:       printf("\n{ ");  
24:       for(i=0;i<strlen(vFirst);i++)  
25:            printf(" %c",vFirst[i]);  
26:       printf(" }\n");  
27:  }  
28:  int funcFirst_F(char victim){  
29:       int j;  
30:       if(!(isupper(victim)))  
31:            vFollow[k++]=victim;  
32:       for(j=0;j<n;j++){  
33:            if(vGram[j][0]==victim){  
34:                 if(vGram[j][2]=='$')  
35:                      funcFollow(vGram[j][0]);  
36:                 else if(islower(vGram[j][2]))  
37:                      vFollow[k++]=vGram[j][2];  
38:                 else  
39:                      funcFirst_F(vGram[j][2]);  
40:            }  
41:       }  
42:  }  
43:  int funcFollow(char victim){  
44:       int i=0,g=0;  
45:       if(vGram[0][0]==victim)  
46:            vFollow[m++]='$';  
47:       for(i=0;i<n;i++){  
48:            for(j=2;j<strlen(vGram[i]); j++){  
49:                 if(vGram[i][j]==victim){  
50:                      if(vGram[i][j+1]!='\0')  
51:                           funcFirst_F(vGram[i][j+1]);  
52:                      if(vGram[i][j+1]=='\0' && victim!=vGram[1][0])  
53:                           funcFollow(vGram[i][0]);  
54:                 }  
55:            }  
56:       }  
57:       printf("\n{ ");  
58:       for(g=0;g<strlen(vFollow);g++)  
59:            printf(" %c",vFollow[g]);  
60:       printf(" }\n");  
61:  }  
62:  int main(){  
63:       printf("\nProgram contains bug in Follow Module. ");  
64:       printf("\nVisit http://www.csbeans.com/ to submit resolved one.\n\n");  
65:       int i,z,choice,cont=1;  
66:       char ch, c;  
67:       printf("Enter the number of production : ");  
68:       scanf("%d",&n);  
69:       printf("Enter production as 'E=AB' and '$' for null\n");  
70:       for(i=0;i<n;i++)  
71:            scanf("%s%c",vGram[i],&ch);  
72:       while(cont==1){  
73:            printf("Enter Victim Character : ");  
74:            scanf("%c",&c);  
75:            printf("Enter Choice : 1:First 2:Follow : ");  
76:            scanf("%d",&choice);  
77:            switch(choice){  
78:                 case 1: funcFirst(c);  
79:                           break;  
80:                 case 2: funcFollow(c);  
81:                           break;  
82:            }  
83:            printf("\n 1:continue");  
84:            scanf("%d%c",&cont,&ch);  
85:       }  
86:       printf("\n\n\t------PROGRAMMED AT C#ODE STUDIO------");  
87:  } 

So, get open up your IDE and start debugging.

Wednesday 29 April 2015

How to Program to Perform Shift Reduce Steps in Compiler Design

Post By - Tanmay | 4/29/2015 07:33:00 am
The Source Code for Shift and Reduce for Dynamic Input is below. However, you're required to build for static input.

For Shift and Reduce algorithm, we will use C - Language. You can also use Turbo-C or any IDE of your own choice also.

Saved program as : shiftr.c

1:  #include"stdio.h"  
2:  #include"ctype.h"  
3:  #include"string.h"  
4:  char stack[10], gram[10][10], input[10];  
5:  int size, elem[5], sizei, sizes;  
6:  int getGram(){  
7:       int i,j,k;  
8:       char ch;  
9:       printf("\nEnter Number of Rules : ");  
10:       scanf("%d",&size);  
11:       printf("\nEnter Rule as 'E=E+T' : ");  
12:       for(i=0;i<size;i++){  
13:                 scanf("%s%c",gram[i],&ch);  
14:       }  
15:       for(i=0;i<size;i++)  
16:            printf("\n%s",gram[i]);  
17:  }  
18:  int getInput(){  
19:       char ch;  
20:       printf("\nEnter size of input by element : ");  
21:       scanf("%d",&sizei);  
22:       sizei+=1;  
23:       printf("\nEnter Input as 'a+b' followed by '$' : ");  
24:       scanf("%s%c",input,&ch);  
25:       printf("\nInput is %s",input);  
26:  }  
27:  char validate(){  
28:       int i,j,flag=0,pos;  
29:       char ch='q';  
30:       printf("\n\tStackTop : %c\n\tStack : %s",stack[sizes-1],stack);  
31:       for(i=0;i<size;i++){  
32:            if(stack[sizes-1]==gram[i][2]){  
33:                 flag=1;  
34:                 pos=i;  
35:                 break;  
36:            }  
37:       }  
38:       if(flag==1){  
39:            ch=gram[pos][0];  
40:            printf("\nReduce %s",gram[pos]);  
41:       }  
42:       return ch;  
43:  }  
44:  int applyOp(){  
45:       char ch;  
46:       int i;  
47:       for(i=0;i<sizei;i++){  
48:            if(input[i]!='$'){  
49:                 stack[sizes]=input[i];  
50:                 printf("\nShift %c",input[i]);  
51:                 input[i]=' ';  
52:                 sizes=strlen(stack);  
53:            }  
54:            /*if(stack[i]=='a'||'b')  
55:                 stack[i]='E';  
56:            else */  
57:            ch=validate();  
58:            if(ch!='q'){  
59:                 stack[sizes-1]=ch;  
60:            }  
61:       }  
62:       for(i=0;i<size;i++){  
63:            if(gram[i][3]==stack[2])  
64:                 printf("\nReduce %s\n",gram[i]);  
65:       }  
66:  }  
67:  int main(){  
68:       stack[0]='$';  
69:       sizes=strlen(stack);  
70:       getGram();  
71:       getInput();  
72:       applyOp();  
73:  }  

Compile it using : #gcc shiftr.c

Output:

 

How to Make a Desk Calculator using Flex/Lex and Yacc/Bison

Post By - Tanmay | 4/29/2015 05:38:00 am
In this tutorial you'll come to know about making a desk calculator using Lex and Yacc.

1:  statement:NAME'='expression {printf("x=%d",$2);};   
2:  |expression {printf("=%d",$1);};  
3:  expression:expression'-'NUMBER {$$=$1-$3;};  
4:  |expression'+'NUMBER {$$=$1+$3;};  
5:  |expression'*'NUMBER {$$=$1*$3;};  
6:  |expression'/'NUMBER {if($3==0) printf("0 Encountered"); else $$=$1/$3;};  
7:  |'('expression')' {$$=$2;}  
8:  |NUMBER {$$=$1;};  

Above grammar rule will be used in program. If you haven't installed the Flex and Bison in Windows, read the tutorial here.

Now, Lex File : calc.l

1:  %{  
2:  #include"y.tab.h"  
3:  extern int yylval;  
4:  %}  
5:  %%  
6:  [0-9]+ {yylval= atoi (yytext);printf("Number is Enc : %d",yylval); return NUMBER;}  
7:  [a-z] {printf("Character : %c",yytext[0]); return NAME;}  
8:  . {printf(""); return yytext[0];}  
9:  %%  

Yacc File : calc.y
1:  %{  
2:  #include"stdio.h"  
3:  int yyerror(char *str){  
4:        return fprintf(stderr,str);  
5:  }  
6:  int yywrap(){  
7:       return 1;  
8:  }  
9:  int main(){  
10:       yyparse();  
11:       return 1;  
12:  }  
13:  %}  
14:  %token NAME NUMBER  
15:  %%  
16:  statement:NAME'='expression {printf("x=%d",$2);};   
17:  |expression {printf("=%d",$1);};  
18:  expression:expression'-'NUMBER {$$=$1-$3;};  
19:  |expression'+'NUMBER {$$=$1+$3;};  
20:  |expression'*'NUMBER {$$=$1*$3;};  
21:  |expression'/'NUMBER {if($3==0) printf("0 Encountered"); else $$=$1/$3;};  
22:  |'('expression')' {$$=$2;}  
23:  |NUMBER {$$=$1;};  
24:  %%  

Compile & Link them using "gcc" to achieve the goal.

Output :


 

Tuesday 28 April 2015

How-To Make an Compiler to Check IF and ELSE Counts

Post By - Tanmay | 4/28/2015 09:29:00 am
This program is a basic way to calculate the "Nested If-Else" in a program. This takes input from Console and puts that into a buffer, however you can take input from a file as : yyin();

Initial : Lex and Yacc are pre-installed in Linux. If you haven't install Flex and Bison in Windows, refer to This Tutorial - How to Compile Lex and Yacc

Step -1: Make a file named as if_else.l. Copy the below code and paste it in that file.

IF ELSE.L
1:  %{  
2:  #include"y.tab.h"  
3:  %}  
4:  %%  
5:  "if" {return IF;}  //Token for If statements
6:  "else" {return ELSE;}  //Token for Else Statements
7:  [sS][0-9]* {return S;}  //Statement symbol token
8:  "<"|">"|"="|"!="|"<="|">=" {return RELOP;}  //Relational Operator
9:  [0-9]+ {return NUMBER;}  //Number
10:  [a-zA-Z][a-zA-Z0-9_]* {return ID;}   //Ids
11:  \n {;}  
12:  . {return yytext[0];}  //Character Not Desfined
13:  %%  

Step-2: After completing step 1, you've to create a file in yacc readable format i.e. if_else.y and copy and paste the below code.

IF ELSE.Y
1:  %token IF RELOP S NUMBER ID ELSE  
2:  %{  
3:  int count=0;  
4:  %}  
5:  %%  
6:  stmt:if_stmt {printf("Nested : %d\n",count);};  
7:  if_stmt:IF'('cond')'if_stmt {count++;}  
8:  |IF'('cond')'S' 'ELSE' 'if_stmt {count++;}  
9:  |IF'('cond')'ELSE' 'if_stmt {count++;}  
10:  |S;  
11:  cond:x RELOP x;  
12:  x:ID  
13:  |NUMBER  
14:  ;  
15:  %%  
16:  int yywrap(){return 1;}  
17:  int yyerror(char *ch)  
18:  {  
19:  return 1;  
20:  }  
21:  int main(){  
22:  printf("Enter the Statement : \n");  
23:  yyparse();  
24:  return 1;  
25:  }  

Now as you've got lex and yacc files, let's compile a compiler.

root@user~:#lex if_else.l
root@user~:#yacc -d if_else.y
root@user~:#gcc lex.yy.c y.tab.c
root@user~:#./a.out

These code will help you in achieving result.




Enter the input as:

if(----)S
if(----)S else S
if(----)S else if(----)S
...........and so on.

How it works:

Consider an input:

if(----)S else if(----)S

now,

we have :

1:stmt:if_stmt {printf("Nested : %d\n",count);};  
2:if_stmt:IF'('cond')'if_stmt {count++;}  
3:  |IF'('cond')'S' 'ELSE' 'if_stmt {count++;}  
4:  |IF'('cond')'ELSE' 'if_stmt {count++;}  
5:  |S;  
6:  cond:x RELOP x;  
7:  x:ID  
8:  |NUMBER  

Starting from "stmt":

stmt                                                                      //start symbol
if_stmt                                                                  //rule 1
IF'('cond')'S' 'ELSE' 'if_stmt                                //rule 3
IF'('cond')'S' 'ELSE' 'IF'('cond')'if_stmt               //rule 1
IF'('cond')'S' 'ELSE' 'IF'('cond')'S                        //rule 5
IF'('x RELOP x')'S' 'ELSE' 'IF'('x RELOP x')'S      //rule 6
IF'('ID RELOP ID')'S' 'ELSE' 'IF'('ID RELOP ID')'S  //rule 7

Now, we have got tokens at the input places, that means our derivation is correct.