next up previous contents index
Next: 9.2 A few words Up: 9. Compiler internals Previous: 9. Compiler internals   Contents   Index


9.1 The anatomy of the compiler

This is an excerpt from an article published in Circuit Cellar Magazine in August 2000. It's a little outdated (the compiler is much more efficient now and user/developer friendly), but pretty well exposes the guts of it all.

The current version of SDCC can generate code for Intel 8051 and Z80 MCU. It is fairly easy to retarget for other 8-bit MCU. Here we take a look at some of the internals of the compiler.


Parsing

Parsing the input source file and creating an AST (Annotated Syntax Tree). This phase also involves propagating types (annotating each node of the parse tree with type information) and semantic analysis. There are some MCU specific parsing rules. For example the storage classes, the extended storage classes are MCU specific while there may be a xdata storage class for 8051 there is no such storage class for z80 or Atmel AVR. SDCC allows MCU specific storage class extensions, i.e. xdata will be treated as a storage class specifier when parsing 8051 C code but will be treated as a C identifier when parsing z80 or ATMEL AVR C code.


Generating iCode

Intermediate code generation. In this phase the AST is broken down into three-operand form (iCode). These three operand forms are represented as doubly linked lists. ICode is the term given to the intermediate form generated by the compiler. ICode example section shows some examples of iCode generated for some simple C source functions.


Optimizations.

Bulk of the target independent optimizations is performed in this phase. The optimizations include constant propagation, common sub-expression elimination, loop invariant code movement, strength reduction of loop induction variables and dead-code elimination.


Live range analysis

During intermediate code generation phase, the compiler assumes the target machine has infinite number of registers and generates a lot of temporary variables. The live range computation determines the lifetime of each of these compiler-generated temporaries. A picture speaks a thousand words. ICode example sections show the live range annotations for each of the operand. It is important to note here, each iCode is assigned a number in the order of its execution in the function. The live ranges are computed in terms of these numbers. The from number is the number of the iCode which first defines the operand and the to number signifies the iCode which uses this operand last.


Register Allocation

The register allocation determines the type and number of registers needed by each operand. In most MCUs only a few registers can be used for indirect addressing. In case of 8051 for example the registers R0 & R1 can be used to indirectly address the internal ram and DPTR to indirectly address the external ram. The compiler will try to allocate the appropriate register to pointer variables if it can. ICode example section shows the operands annotated with the registers assigned to them. The compiler will try to keep operands in registers as much as possible; there are several schemes the compiler uses to do achieve this. When the compiler runs out of registers the compiler will check to see if there are any live operands which is not used or defined in the current basic block being processed, if there are any found then it will push that operand and use the registers in this block, the operand will then be popped at the end of the basic block.

There are other MCU specific considerations in this phase. Some MCUs have an accumulator; very short-lived operands could be assigned to the accumulator instead of a general-purpose register.

Code generation

Figure II gives a table of iCode operations supported by the compiler. The code generation involves translating these operations into corresponding assembly code for the processor. This sounds overly simple but that is the essence of code generation. Some of the iCode operations are generated on a MCU specific manner for example, the z80 port does not use registers to pass parameters so the SEND and RECV iCode operations will not be generated, and it also does not support JUMPTABLES.

Figure II
iCode Operands Description C Equivalent
       
'!' IC_LEFT() IC_RESULT() NOT operation IC_RESULT = ! IC_LEFT;
'~' IC_LEFT() IC_RESULT() Bitwise complement of IC_RESULT = ~IC_LEFT;
RRC IC_LEFT() IC_RESULT() Rotate right with carry IC_RESULT = (IC_LEFT « 1) | (IC_LEFT » (sizeof(IC_LEFT)*8-1));
RLC IC_LEFT() IC_RESULT() Rotate left with carry IC_RESULT = (IC_LEFT « (sizeof(LC_LEFT)*8-1) ) | (IC_LEFT » 1);
GETHBIT IC_LEFT() IC_RESULT() Get the highest order bit of IC_LEFT IC_RESULT = (IC_LEFT » (sizeof(IC_LEFT)*8 -1));
UNARYMINUS IC_LEFT() IC_RESULT() Unary minus IC_RESULT = - IC_LEFT;
IPUSH IC_LEFT() Push the operand into stack NONE
IPOP IC_LEFT() Pop the operand from the stack NONE
CALL IC_LEFT() IC_RESULT() Call the function represented by IC_LEFT IC_RESULT = IC_LEFT();
PCALL IC_LEFT() IC_RESULT() Call via function pointer IC_RESULT = (*IC_LEFT)();
RETURN IC_LEFT() Return the value in operand IC_LEFT return IC_LEFT;
LABEL IC_LABEL() Label IC_LABEL:
GOTO IC_LABEL() Goto label goto IC_LABEL();
'+' IC_LEFT() IC_RIGHT() IC_RESULT() Addition IC_RESULT = IC_LEFT + IC_RIGHT
'-' IC_LEFT() IC_RIGHT() IC_RESULT() Subtraction IC_RESULT = IC_LEFT - IC_RIGHT
'*' IC_LEFT() IC_RIGHT() IC_RESULT() Multiplication IC_RESULT = IC_LEFT * IC_RIGHT;
'/' IC_LEFT() IC_RIGHT() IC_RESULT() Division IC_RESULT = IC_LEFT / IC_RIGHT;
'%' IC_LEFT() IC_RIGHT() IC_RESULT() Modulus IC_RESULT = IC_LEFT % IC_RIGHT;
'<' IC_LEFT() IC_RIGHT() IC_RESULT() Less than IC_RESULT = IC_LEFT < IC_RIGHT;
'>' IC_LEFT() IC_RIGHT() IC_RESULT() Greater than IC_RESULT = IC_LEFT > IC_RIGHT;
EQ_OP IC_LEFT() IC_RIGHT() IC_RESULT() Equal to IC_RESULT = IC_LEFT == IC_RIGHT;
AND_OP IC_LEFT() IC_RIGHT() IC_RESULT() Logical and operation IC_RESULT = IC_LEFT && IC_RIGHT;
OR_OP IC_LEFT() IC_RIGHT() IC_RESULT() Logical or operation IC_RESULT = IC_LEFT || IC_RIGHT;
'^' IC_LEFT() IC_RIGHT() IC_RESULT() Exclusive OR IC_RESULT = IC_LEFT ^ IC_RIGHT;
'|' IC_LEFT() IC_RIGHT() IC_RESULT() Bitwise OR IC_RESULT = IC_LEFT | IC_RIGHT;
BITWISEAND IC_LEFT() IC_RIGHT() IC_RESULT() Bitwise AND IC_RESULT = IC_LEFT & IC_RIGHT;
LEFT_OP IC_LEFT() IC_RIGHT() IC_RESULT() Left shift IC_RESULT = IC_LEFT « IC_RIGHT
RIGHT_OP IC_LEFT() IC_RIGHT() IC_RESULT() Right shift IC_RESULT = IC_LEFT » IC_RIGHT
GET_VALUE_
AT_ ADDRESS
IC_LEFT() IC_RESULT() Indirect fetch IC_RESULT = (*IC_LEFT);
POINTER_SET IC_RIGHT() IC_RESULT() Indirect set (*IC_RESULT) = IC_RIGHT;
'=' IC_RIGHT() IC_RESULT() Assignment IC_RESULT = IC_RIGHT;
IFX IC_COND IC_TRUE IC_LABEL Conditional jump. If true label is present then jump to true label if condition is true else jump to false label if condition is false if (IC_COND) goto IC_TRUE;
  Or
If (!IC_COND) goto IC_FALSE;
ADDRESS_OF IC_LEFT() IC_RESULT() Address of IC_RESULT = &IC_LEFT();
JUMPTABLE IC_JTCOND IC_JTLABELS Jump to list of labels depending on the value of JTCOND Switch statement
CAST IC_RIGHT() IC_LEFT() IC_RESULT() Cast types IC_RESULT = (typeof IC_LEFT) IC_RIGHT;
SEND IC_LEFT() This is used for passing parameters in registers;
move IC_LEFT to the next available parameter register.
None
RECV IC_RESULT() This is used for receiving parameters passed in registers;
Move the values in the next parameter register to IC_RESULT
None
(some more have been added)     see f.e. gen51Code() in src/mcs51/gen.c


ICode Example

This section shows some details of iCode. The example C code does not do anything useful; it is used as an example to illustrate the intermediate code generated by the compiler.

1. xdata int * p; 
2. int gint; 
3. /* This function does nothing useful. It is used 
4.    for the purpose of explaining iCode */ 
5. short function (data int *x) 
6. { 
7.   short i=10;  /* dead initialization eliminated */ 
8.   short sum=10; /* dead initialization eliminated */ 
9.   short mul; 
10.  int j ; 
11.  while (*x) *x++ = *p++;  
12.    sum = 0 ;  
13.  mul = 0; 
14.  /* compiler detects i,j to be induction variables */ 
15.  for (i = 0, j = 10 ; i < 10 ; i++, j--) { 
16.    sum += i; 
17.    mul += i * 3;  /* this multiplication remains */ 
18.    gint += j * 3;  /* this multiplication changed to addition */ 
19.  } 
20.  return sum+mul; 
21. }
In addition to the operands each iCode contains information about the filename and line it corresponds to in the source file. The first field in the listing should be interpreted as follows:
Filename(linenumber: iCode Execution sequence number : ICode hash table key : loop depth of the iCode).
Then follows the human readable form of the ICode operation. Each operand of this triplet form can be of three basic types a) compiler generated temporary b) user defined variable c) a constant value. Note that local variables and parameters are replaced by compiler generated temporaries. Live ranges are computed only for temporaries (i.e. live ranges are not computed for global variables). Registers are allocated for temporaries only. Operands are formatted in the following manner:
Operand Name [lr live-from : live-to ] { type information } [ registers allocated ].
As mentioned earlier the live ranges are computed in terms of the execution sequence number of the iCodes, for example
the iTemp0 is live from (i.e. first defined in iCode with execution sequence number 3, and is last used in the iCode with sequence number 5). For induction variables such as iTemp21 the live range computation extends the lifetime from the start to the end of the loop.
The register allocator used the live range information to allocate registers, the same registers may be used for different temporaries if their live ranges do not overlap, for example r0 is allocated to both iTemp6 and to iTemp17 since their live ranges do not overlap. In addition the allocator also takes into consideration the type and usage of a temporary, for example itemp6 is a pointer to near space and is used as to fetch data from (i.e. used in GET_VALUE_AT_ADDRESS) so it is allocated a pointer register (r0). Some short lived temporaries are allocated to special registers which have meaning to the code generator e.g. iTemp13 is allocated to a pseudo register CC which tells the back end that the temporary is used only for a conditional jump the code generation makes use of this information to optimize a compare and jump ICode.
There are several loop optimizations performed by the compiler. It can detect induction variables iTemp21(i) and iTemp23(j). Also note the compiler does selective strength reduction, i.e. the multiplication of an induction variable in line 18 (gint = j * 3) is changed to addition, a new temporary iTemp17 is allocated and assigned a initial value, a constant 3 is then added for each iteration of the loop. The compiler does not change the multiplication in line 17 however since the processor does support an 8 * 8 bit multiplication.
Note the dead code elimination optimization eliminated the dead assignments in line 7 & 8 to I and sum respectively.

Sample.c (5:1:0:0) _entry($9) :

Sample.c(5:2:1:0) proc _function [lr0:0]{function short}

Sample.c(11:3:2:0) iTemp0 [lr3:5]{_near * int}[r2] = recv

Sample.c(11:4:53:0) preHeaderLbl0($11) :

Sample.c(11:5:55:0) iTemp6 [lr5:16]{_near * int}[r0] := iTemp0 [lr3:5]{_near * int}[r2]

Sample.c(11:6:5:1) _whilecontinue_0($1) :

Sample.c(11:7:7:1) iTemp4 [lr7:8]{int}[r2 r3] = @[iTemp6 [lr5:16]{_near * int}[r0]]

Sample.c(11:8:8:1) if iTemp4 [lr7:8]{int}[r2 r3] == 0 goto _whilebreak_0($3)

Sample.c(11:9:14:1) iTemp7 [lr9:13]{_far * int}[DPTR] := _p [lr0:0]{_far * int}

Sample.c(11:10:15:1) _p [lr0:0]{_far * int} = _p [lr0:0]{_far * int} + 0x2 {short}

Sample.c(11:13:18:1) iTemp10 [lr13:14]{int}[r2 r3] = @[iTemp7 [lr9:13]{_far * int}[DPTR]]

Sample.c(11:14:19:1) *(iTemp6 [lr5:16]{_near * int}[r0]) := iTemp10 [lr13:14]{int}[r2 r3]

Sample.c(11:15:12:1) iTemp6 [lr5:16]{_near * int}[r0] = iTemp6 [lr5:16]{_near * int}[r0] + 0x2 {short}

Sample.c(11:16:20:1) goto _whilecontinue_0($1)

Sample.c(11:17:21:0)_whilebreak_0($3) :

Sample.c(12:18:22:0) iTemp2 [lr18:40]{short}[r2] := 0x0 {short}

Sample.c(13:19:23:0) iTemp11 [lr19:40]{short}[r3] := 0x0 {short}

Sample.c(15:20:54:0)preHeaderLbl1($13) :

Sample.c(15:21:56:0) iTemp21 [lr21:38]{short}[r4] := 0x0 {short}

Sample.c(15:22:57:0) iTemp23 [lr22:38]{int}[r5 r6] := 0xa {int}

Sample.c(15:23:58:0) iTemp17 [lr23:38]{int}[r7 r0] := 0x1e {int}

Sample.c(15:24:26:1)_forcond_0($4) :

Sample.c(15:25:27:1) iTemp13 [lr25:26]{char}[CC] = iTemp21 [lr21:38]{short}[r4] < 0xa {short}

Sample.c(15:26:28:1) if iTemp13 [lr25:26]{char}[CC] == 0 goto _forbreak_0($7)

Sample.c(16:27:31:1) iTemp2 [lr18:40]{short}[r2] = iTemp2 [lr18:40]{short}[r2] + ITemp21 [lr21:38]{short}[r4]

Sample.c(17:29:33:1) iTemp15 [lr29:30]{short}[r1] = iTemp21 [lr21:38]{short}[r4] * 0x3 {short}

Sample.c(17:30:34:1) iTemp11 [lr19:40]{short}[r3] = iTemp11 [lr19:40]{short}[r3] + iTemp15 [lr29:30]{short}[r1]

Sample.c(18:32:36:1:1) iTemp17 [lr23:38]{int}[r7 r0]= iTemp17 [lr23:38]{int}[r7 r0]- 0x3 {short}

Sample.c(18:33:37:1) _gint [lr0:0]{int} = _gint [lr0:0]{int} + iTemp17 [lr23:38]{int}[r7 r0]

Sample.c(15:36:42:1) iTemp21 [lr21:38]{short}[r4] = iTemp21 [lr21:38]{short}[r4] + 0x1 {short}

Sample.c(15:37:45:1) iTemp23 [lr22:38]{int}[r5 r6]= iTemp23 [lr22:38]{int}[r5 r6]- 0x1 {short}

Sample.c(19:38:47:1) goto _forcond_0($4)

Sample.c(19:39:48:0)_forbreak_0($7) :

Sample.c(20:40:49:0) iTemp24 [lr40:41]{short}[DPTR] = iTemp2 [lr18:40]{short}[r2] + ITemp11 [lr19:40]{short}[r3]

Sample.c(20:41:50:0) ret iTemp24 [lr40:41]{short}

Sample.c(20:42:51:0)_return($8) :

Sample.c(20:43:52:0) eproc _function [lr0:0]{ ia0 re0 rm0}{function short}

Finally the code generated for this function:

.area DSEG (DATA)

_p::

  .ds 2

_gint::

  .ds 2

; sample.c 5

; -----------------------

; function function

; -----------------------

_function:

; iTemp0 [lr3:5]{_near * int}[r2] = recv

  mov r2,dpl

; iTemp6 [lr5:16]{_near * int}[r0] := iTemp0 [lr3:5]{_near * int}[r2]

  mov ar0,r2

;_whilecontinue_0($1) :

00101$:

; iTemp4 [lr7:8]{int}[r2 r3] = @[iTemp6 [lr5:16]{_near * int}[r0]]

; if iTemp4 [lr7:8]{int}[r2 r3] == 0 goto _whilebreak_0($3)

  mov ar2,@r0

  inc r0

  mov ar3,@r0

  dec r0

  mov a,r2

  orl a,r3

  jz 00103$

00114$:

; iTemp7 [lr9:13]{_far * int}[DPTR] := _p [lr0:0]{_far * int}

  mov dpl,_p

  mov dph,(_p + 1)

; _p [lr0:0]{_far * int} = _p [lr0:0]{_far * int} + 0x2 {short}

  mov a,#0x02

  add a,_p

  mov _p,a

  clr a

  addc a,(_p + 1)

  mov (_p + 1),a

; iTemp10 [lr13:14]{int}[r2 r3] = @[iTemp7 [lr9:13]{_far * int}[DPTR]]

  movx a,@dptr

  mov r2,a

  inc dptr

  movx a,@dptr

  mov r3,a

; *(iTemp6 [lr5:16]{_near * int}[r0]) := iTemp10 [lr13:14]{int}[r2 r3]

  mov @r0,ar2

  inc r0

  mov @r0,ar3

; iTemp6 [lr5:16]{_near * int}[r0] =

; iTemp6 [lr5:16]{_near * int}[r0] +

; 0x2 {short}

  inc r0

; goto _whilecontinue_0($1)

  sjmp 00101$

; _whilebreak_0($3) :

00103$:

; iTemp2 [lr18:40]{short}[r2] := 0x0 {short}

  mov r2,#0x00

; iTemp11 [lr19:40]{short}[r3] := 0x0 {short}

  mov r3,#0x00

; iTemp21 [lr21:38]{short}[r4] := 0x0 {short}

  mov r4,#0x00

; iTemp23 [lr22:38]{int}[r5 r6] := 0xa {int}

  mov r5,#0x0A

  mov r6,#0x00

; iTemp17 [lr23:38]{int}[r7 r0] := 0x1e {int}

  mov r7,#0x1E

  mov r0,#0x00

; _forcond_0($4) :

00104$:

; iTemp13 [lr25:26]{char}[CC] = iTemp21 [lr21:38]{short}[r4] < 0xa {short}

; if iTemp13 [lr25:26]{char}[CC] == 0 goto _forbreak_0($7)

  clr c

  mov a,r4

  xrl a,#0x80

  subb a,#0x8a

  jnc 00107$

00115$:

; iTemp2 [lr18:40]{short}[r2] = iTemp2 [lr18:40]{short}[r2] +

; iTemp21 [lr21:38]{short}[r4]

  mov a,r4

  add a,r2

  mov r2,a

; iTemp15 [lr29:30]{short}[r1] = iTemp21 [lr21:38]{short}[r4] * 0x3 {short}

  mov b,#0x03

  mov a,r4

  mul ab

  mov r1,a

; iTemp11 [lr19:40]{short}[r3] = iTemp11 [lr19:40]{short}[r3] +

; iTemp15 [lr29:30]{short}[r1]

  add a,r3

  mov r3,a

; iTemp17 [lr23:38]{int}[r7 r0]= iTemp17 [lr23:38]{int}[r7 r0]- 0x3 {short}

  mov a,r7

  add a,#0xfd

  mov r7,a

  mov a,r0

  addc a,#0xff

  mov r0,a

; _gint [lr0:0]{int} = _gint [lr0:0]{int} + iTemp17 [lr23:38]{int}[r7 r0]

  mov a,r7

  add a,_gint

  mov _gint,a

  mov a,r0

  addc a,(_gint + 1)

  mov (_gint + 1),a

; iTemp21 [lr21:38]{short}[r4] = iTemp21 [lr21:38]{short}[r4] + 0x1 {short}

  inc r4

; iTemp23 [lr22:38]{int}[r5 r6]= iTemp23 [lr22:38]{int}[r5 r6]- 0x1 {short}

  dec r5

  cjne r5,#0xff,00104$

  dec r6

; goto _forcond_0($4)

  sjmp 00104$

; _forbreak_0($7) :

00107$:

; ret iTemp24 [lr40:41]{short}

  mov a,r3

  add a,r2

  mov dpl,a

; _return($8) :

00108$:

  ret



Subsections
next up previous contents index
Next: 9.2 A few words Up: 9. Compiler internals Previous: 9. Compiler internals   Contents   Index
2008-12-05