How can I see the parse tree, intermediate code, optimized code and assembly code during compilation?
S

3

32

I am taking a Compilers course. Compilation of a program follows the steps below:

  1. Lexical analysis
  2. Syntax analysis
  3. Semantic analysis
  4. Intermediate code generation
  5. Code optimization
  6. Target code generation.

How can I see the output of each step? For example, I want to see the parse tree after syntax analysis.

I am compiling the program on a Linux machine with GCC.

We can see the assembly code of a program by using the -S option in gcc (or get an assembler listing by passing options through -Wa). Are there similar options to see the tokens, parse tree and intermediate code?

Schuman answered 30/9, 2009 at 7:0 Comment(0)
T
23

While you can use the -fdump-tree-all and -fdump-rtl-all options in gcc, I don't think that their output is very useful to a compiler student. FWIW, I started working on gcc as part of my PhD studies, having already completed two undergraduate courses, and I found gcc and its debug files to be opaque and hard to follow.

In addition, gcc doesn't really follow the textbook design of compilers. No one does, really, because it doesn't work well that way. I'm pretty sure gcc doesn't produce a parse tree, or an abstract syntax tree. It does build an IR (called GIMPLE) on which it performs its high-level optimizations.

I would suggest trying LLVM instead, which has a reputation for being well designed and easy to follow. Another alternative is to download the code from a textbook, especially the Appel book, assuming it's available.

Another suggestion, if I may recommend my own for a moment, is to use phc. With phc, you can see the parse tree as an image, and view the AST and the source code after every single pass in the compiler. Here is a comparison of parts of the AST and the parse tree. They are generated trivially using phc. You can see the compiler IRs, the CFG, SSA form, and debug output of type inference and alias analysis. You can also turn optimizations and passes on and off to see the effect that they have.

I think this could be useful for you.

Thundery answered 1/10, 2009 at 20:45 Comment(3)
Oh no way, that's your site? That site's explanation of virtual inheritance is fantastic. After reading that I really got the "why" of virtual and its foreshadowing in C object systems. Are your thoughts on good pedagogical compilers any different today? Soilure
But phc seems to be a compiler on its own, i.e. it is not an application to ease GCC's bytecode understanding. Traject
"No-one [follows the textbook design] because it doesn't work well" Not in C++ anyway. The language is ambiguous in such a way as to encourage implementations where semantic information might travel all the way back to lexical analysis, e.g. to figure out whether W<X<Y>>Z; is a variable declaration or an expression. Crew
B
12

You can see the preprocessor output with -E. -fdump-tree-* dumps the tree internal representation, e.g. -fdump-tree-all. Various -d options exist to dump the RTL intermediate representations, e.g. -fdump-rtl-all (see the manual for the individual passes that you get dumps of); in addition, -dD dumps all macro definitions.

Blanca answered 1/10, 2009 at 20:31 Comment(1)
A note for readers: with this option the compiler dumps the tree to a file named after the source file, with a suffix like .optimized. This isn't obvious at all: I spent ≈20 minutes reading the documentation and searching for cases where gcc doesn't produce the dump before I happened to notice the new file (which wasn't easy, since I ran my tests in /tmp/, which is pretty cluttered). Traject
S
3

With the clang compiler, you cannot see each and every intermediate output that the compiler generates, because clang works differently from other compilers.

Lexical analysis

The tokens can be emitted through:

clang test.c -Xclang -dump-tokens
clang test.c -Xclang -dump-raw-tokens

Intermediate code generation

The LLVM IR (in textual form) can be emitted through:

clang test.c -S -emit-llvm

Semantic analysis

The semantic analysis is simultaneously performed while the AST is being generated. The AST can be emitted through:

clang test.c -Xclang -ast-dump
clang test.c -Xclang -ast-view (this generates a graph for the textual AST)

Code optimization

You can inspect the optimizations by printing the IR after each pass of the pipeline as it is applied to the C code:

clang test.c -S -mllvm -print-after-all

Target code generation

The generated code (i.e. the assembly output) can be viewed through:

clang test.c -S

Bonus

You can also see the complete pipeline that clang invokes for a program. For example, the pipeline for emitting an object file can be viewed through:

clang -ccc-print-phases test.c -c

The output generated on the terminal is:

0: input, "test.c", c
1: preprocessor, {0}, cpp-output
2: compiler, {1}, ir
3: backend, {2}, assembler
4: assembler, {3}, object
Sonata answered 27/2, 2019 at 14:40 Comment(0)
