[FrontPage] [TitleIndex] [WordIndex

Note: You are looking at a static copy of the former PineWiki site, used for class notes by James Aspnes from 2003 to 2012. Many mathematical formulas are broken, and there are likely to be other bugs as well. These will most likely not be fixed. You may be able to find more up-to-date versions of some of these notes at http://www.cs.yale.edu/homes/aspnes/#classes.

C is generally compiled to assembly language first, and then an assembler compiles the assembly language to actual machine code. Typically the intermediate assembly language code is deleted.

If you are very curious about what the compiler does to your code, you can run gcc with the -S option to tell it to stop after creating the assembly file; for example, the command gcc -S program.c will create a file program.s containing the compiled code. What this code will look like depends both on what your C code looks like and what other options you give the C compiler. You can learn how to read (or write) assembly for the Intel i386 architecture by following this link (but see these notes for an explanation of the variant of i386 assembly language supported by gas, the GNU assembler used by gcc), or you can just jump in and guess what is going on by trying to expand the keywords (hint: movl stands for "move long," addl for "add long," etc., numbers preceded by dollar signs are constants, and things like %ebx are the names of registers, high-speed memory locations built in to the CPU).

For example, here's a short C program that implements a simple loop in two different ways, the first using the standard for loop construct and the second building a for loop by hand using goto (don't do this).

   1 #include <stdio.h>
   2 
   3 int
   4 main(int argc, char **argv)
   5 {
   6     int i;
   7 
   8     /* normal loop */
   9     for(i = 0; i < 10; i++) {
  10         printf("%d\n", i);
  11     }
  12 
  13     /* non-standard implementation using if and goto */
  14     i = 0;
  15   start:
  16     if(i < 10) {
  17         printf("%d\n", i);
  18         i++;
  19         goto start;
  20     }
  21     
  22     return 0;
  23 }
loops.c

and here is the output of the compiler using gcc -S loops.c:

        .file   "loops.c"
        .section        .rodata
.LC0:
        .string "%d\n"
        .text
.globl main
        .type   main, @function
main:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $24, %esp
        andl    $-16, %esp
        movl    $0, %eax
        subl    %eax, %esp
        movl    $0, -4(%ebp)
.L2:
        cmpl    $9, -4(%ebp)
        jle     .L5
        jmp     .L3
.L5:
        movl    -4(%ebp), %eax
        movl    %eax, 4(%esp)
        movl    $.LC0, (%esp)
        call    printf
        leal    -4(%ebp), %eax
        incl    (%eax)
        jmp     .L2
.L3:
        movl    $0, -4(%ebp)
.L6:
        cmpl    $9, -4(%ebp)
        jg      .L7
        movl    -4(%ebp), %eax
        movl    %eax, 4(%esp)
        movl    $.LC0, (%esp)
        call    printf
        leal    -4(%ebp), %eax
        incl    (%eax)
        jmp     .L6
.L7:
        movl    $0, %eax
        leave
        ret
        .size   main, .-main
        .section        .note.GNU-stack,"",@progbits
        .ident  "GCC: (GNU) 3.3.4 (Debian)"

Note that even though the two loops are doing the same thing, the structure is different. The first uses jle (jump if less than or equal to) to jump over the jmp that skips the body of the loop if it shouldn't be executed, while the second uses jg (jump if greater than) to skip it directly, which is closer to what the C code says to do. This is not unusual; for built-in control structures the compiler will often build odd-looking implementations that still work, for subtle and mysterious reasons of its own.

We can make them even more different by turning on the optimizer, e.g. with gcc -S -O3 -funroll-loops loops.c:

        .file   "loops.c"
        .section        .rodata.str1.1,"aMS",@progbits,1
.LC0:
        .string "%d\n"
        .text
        .p2align 4,,15
.globl main
        .type   main, @function
main:
        pushl   %ebp
        xorl    %ecx, %ecx
        movl    %esp, %ebp
        pushl   %ebx
        subl    $20, %esp
        movl    $2, %ebx
        andl    $-16, %esp
        movl    %ecx, 4(%esp)
        movl    $.LC0, (%esp)
        call    printf
        movl    $.LC0, (%esp)
        movl    $1, %edx
        movl    %edx, 4(%esp)
        call    printf
        movl    %ebx, 4(%esp)
        movl    $5, %ebx
        movl    $.LC0, (%esp)
        call    printf
        movl    $.LC0, (%esp)
        movl    $3, %ecx
        movl    %ecx, 4(%esp)
        call    printf
        movl    $.LC0, (%esp)
        movl    $4, %edx
        movl    %edx, 4(%esp)
        call    printf
        movl    %ebx, 4(%esp)
        xorl    %ebx, %ebx
        movl    $.LC0, (%esp)
        call    printf
        movl    $.LC0, (%esp)
        movl    $6, %ecx
        movl    %ecx, 4(%esp)
        call    printf
        movl    $.LC0, (%esp)
        movl    $7, %edx
        movl    %edx, 4(%esp)
        call    printf
        movl    $.LC0, (%esp)
        movl    $8, %eax
        movl    %eax, 4(%esp)
        call    printf
        movl    $.LC0, (%esp)
        movl    $9, %edx
        movl    %edx, 4(%esp)
        .p2align 4,,15
.L31:
        call    printf
.L7:
        cmpl    $9, %ebx
        jg      .L8
        movl    %ebx, 4(%esp)
        incl    %ebx
        movl    $.LC0, (%esp)
        jmp     .L31
.L8:
        movl    -4(%ebp), %ebx
        xorl    %eax, %eax
        leave
        ret
        .size   main, .-main
        .section        .note.GNU-stack,"",@progbits
        .ident  "GCC: (GNU) 3.3.4 (Debian)"

Now the first loop is gone, replaced by ten separate calls to printf. Some of the arguments are loaded up in odd places; for example, 2 and 5 are stored in their registers well before they are used. Again, the compiler moves in mysterious ways, but guarantees that the resulting code will do what you asked. The second loop is left pretty much the same as it was, since the C compiler doesn't have enough information to recognize it as a loop.

The moral of the story: if you write your code using standard idioms, it's more likely that the C compiler will understand what you want and be able to speed it up for you. The odds are that using a non-standard idiom will cost you more in time by confusing the compiler than it will gain you in slight code improvements.


CategoryProgrammingNotes


2014-06-17 11:57