Radare2 Summer of Code 2019 - decompiler in Rust

Hello everyone! Since we weren't accepted into GSoC, we organized our own Summer of Code, and as usual we have a slot for improving our decompiler, Radeco.

Radeco (based on radeco-lib) is a radare2-based static binary analysis framework. Currently, radeco is stable enough and has several analysis passes built in. We believe that this RSoC is a good opportunity to push radeco further and implement our very own decompiler within radare2!

See more information about RSoC'19.

Here are the Radeco RSoC ideas:

Memory SSA and platforms

This task involves completion of a decompiler backend using the analysis in radeco. Once the preliminary results are obtained, students are expected to continue working on improving the quality of decompiled code.

Task

  • Implement Memory SSA
  • Complete the VSA (Value Set Analysis)
  • Expand the supported architectures list
  • Improve the pseudocode output and add more tests (comparing against the output of Hex-Rays)
  • Use Godbolt to produce binaries with different compilers and optimization levels for tests
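To give applicants a feel for the kind of abstract domain that Value Set Analysis works with, here is a minimal sketch of a plain interval domain. It is an illustration only, not radeco-lib code: the `Interval` type and its methods are hypothetical names, and a real VSA would typically use strided intervals per memory region and model machine-word wrap-around, which this sketch ignores.

```rust
/// A closed integer interval [lo, hi]. Hypothetical type for illustration;
/// a deliberately simplified stand-in for the strided intervals of full VSA.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Interval {
    pub lo: i64,
    pub hi: i64,
}

impl Interval {
    /// Build an interval, normalising the bounds so that lo <= hi.
    pub fn new(a: i64, b: i64) -> Self {
        Interval { lo: a.min(b), hi: a.max(b) }
    }

    /// Interval containing a single constant.
    pub fn constant(v: i64) -> Self {
        Interval { lo: v, hi: v }
    }

    /// Join (least upper bound): the smallest interval covering both inputs.
    /// This is what an analysis would apply where control-flow paths merge.
    pub fn join(self, other: Self) -> Self {
        Interval {
            lo: self.lo.min(other.lo),
            hi: self.hi.max(other.hi),
        }
    }

    /// Abstract addition. Saturates at the i64 limits instead of
    /// modelling wrap-around (an assumption made to keep the sketch short).
    pub fn add(self, other: Self) -> Self {
        Interval {
            lo: self.lo.saturating_add(other.lo),
            hi: self.hi.saturating_add(other.hi),
        }
    }

    /// Check whether a concrete value is covered by the interval.
    pub fn contains(self, v: i64) -> bool {
        self.lo <= v && v <= self.hi
    }
}
```

For example, joining `Interval::constant(4)` with `Interval::new(8, 16)` yields the interval from 4 to 16, which over-approximates both inputs. That loss of precision is the price paid for a cheap representation.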

Skills

The student should be familiar with Rust and decompilation basics or be able to learn them quickly.

Difficulty

Advanced

Benefits for the student

The student will learn decompilation theory and perform complex graph transformations, as well as learn the specifics of particular compiler optimization passes.

Benefits for the project

Successful completion of this task will mean the first release of radeco which can generate readable and optimized C code.

Mentors

  • xvilka
  • deroad

Assess requirements for midterm/final evaluation

  • 1st term: Implementing Memory SSA and VSA.
  • 2nd term: Supporting architectures: x86, amd64, ARMv7, ARMv8, PowerPC, MIPS, V850, and implementing regression tests for them.
  • Final term: Refining C output, finishing integration with radare2 and Cutter, writing regression and unit tests, updating documentation (including r2book).

Links/Resources

  • Radeco
  • Radeco-lib
  • Memory SSA - A Unified Approach for Sparsely Representing Memory Operations: hxxp://www.airs.com/dnovillo/Papers/mem-ssa.pdf
  • Effective Representation of Aliases and Indirect Memory Operations in SSA Form: hxxp://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.33.6974&rep=rep1&type=pdf
  • Papers about decompilation: hxxps://drive.google.com/drive/folders/0B1X32SwXTZPuYWwxWW5BNi1oWDA?usp=sharing

Type System

This task is about improving the results of decompilation by recovering types (char, char *, structures, unions, classes, etc). Apart from being able to infer them by analyzing data flow, radeco should be able to exchange this information with radare2 and Cutter: initially loading types from them, then synchronizing the refined results back.

Task

  • Define and implement type system
  • Implement type inference techniques
  • Support loading and inference of structural types
  • Support for constrained types
  • Implement IR writer/reader with type information
  • Implement a backend to convert the IR to C AST with type information
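As a rough idea of what "type inference techniques" can look like, below is a minimal sketch of constraint solving by unification over a toy type language. Everything in it (`Ty`, `Subst`, `unify`, `resolve`) is a hypothetical name chosen for illustration, not an existing radeco-lib API, and the design that the project ends up with may differ substantially (for instance, it may need subtyping and structural types, which plain unification does not handle).

```rust
use std::collections::HashMap;

/// Toy type language (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Ty {
    Var(u32),     // unknown type still to be inferred
    Int(u8),      // integer of the given bit width
    Ptr(Box<Ty>), // pointer to another type
}

/// Bindings from type variables to types, built up while solving.
#[derive(Debug, Default)]
pub struct Subst {
    map: HashMap<u32, Ty>,
}

impl Subst {
    /// Replace every bound variable in `t` by what it is bound to.
    pub fn resolve(&self, t: &Ty) -> Ty {
        match t {
            Ty::Var(v) => match self.map.get(v) {
                Some(bound) => self.resolve(bound),
                None => t.clone(),
            },
            Ty::Ptr(inner) => Ty::Ptr(Box::new(self.resolve(inner))),
            Ty::Int(_) => t.clone(),
        }
    }

    /// Occurs check: does variable `v` appear inside `t`?
    fn occurs(&self, v: u32, t: &Ty) -> bool {
        match self.resolve(t) {
            Ty::Var(w) => v == w,
            Ty::Ptr(inner) => self.occurs(v, &inner),
            Ty::Int(_) => false,
        }
    }

    /// Record the constraint "a and b are the same type".
    pub fn unify(&mut self, a: &Ty, b: &Ty) -> Result<(), String> {
        let (a, b) = (self.resolve(a), self.resolve(b));
        match (a, b) {
            (Ty::Var(x), Ty::Var(y)) if x == y => Ok(()),
            (Ty::Var(x), t) | (t, Ty::Var(x)) => {
                // Reject infinite types such as t = ptr(t).
                if self.occurs(x, &t) {
                    return Err(format!("recursive type for variable {}", x));
                }
                self.map.insert(x, t);
                Ok(())
            }
            (Ty::Int(m), Ty::Int(n)) if m == n => Ok(()),
            (Ty::Ptr(p), Ty::Ptr(q)) => self.unify(&p, &q),
            (a, b) => Err(format!("cannot unify {:?} with {:?}", a, b)),
        }
    }
}
```

In this sketch, a load through a register would produce the constraint "register type = pointer to loaded value type", and a later 32-bit arithmetic use of the loaded value would pin the pointee to `Ty::Int(32)`. Conflicting constraints surface as an `Err`, which a real engine would have to resolve with casts or unions rather than simply failing.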

Skills

The student should be familiar with Rust and decompilation basics or be able to learn them quickly.

Difficulty

Advanced

Benefits for the student

The student will learn decompilation theory and work with the type system.

Benefits for the project

This task will make the IR/C output more readable.

Mentors

  • xvilka

Assess requirements for midterm/final evaluation

  • 1st term: Support for basic and structured types in the IR, propagated through all stages of radeco
  • 2nd term: Type inference engine
  • Final term: Integration with radare2 and Cutter, regression tests, inference of complex types, radare2 book documentation

Links/Resources

  • Commands and API for setting/changing types of the variables - Issue #183
  • Constrained types support in Radeco - Issue #232
  • Value limits support and analysis - Issue #91 hxxps://github.com/radareorg/radeco-lib/issues/91
  • Radare2 types issues - hxxps://github.com/radare/radare2/labels/types
  • HexRaysCodeXplorer - hxxps://github.com/REhints/HexRaysCodeXplorer
  • Virtuailor - hxxps://github.com/0xgalz/Virtuailor - vtables reconstruction based on runtime information
  • Pharos - type inference with Prolog - hxxps://github.com/cmu-sei/pharos

To make sure students can qualify for this internship, each applicant is required to complete one of the microtasks (or one of the simple issues from the repositories):

radeco

  • Implementing a command for reporting bugs #40
  • Ability to cache intermediate analyses to a file #49

radeco-lib

These tasks are big enough to be split up, so any small part of one can be picked as a microtask.

  • Implementing simple type system #118
  • Simplify the conditions #216
  • Use basic block information from radare2 #207
  • Use SDB to determine the number of arguments of a given function #213

Feel free to ask questions or provide feedback here.

Sorry for splitting this into multiple parts, but new users are restricted to no more than 2 links per post. Because of that, even after the split I had to skip many useful reference links, sigh. Moderators/Admins - please merge the parts into one message, thank you, and sorry for the inconvenience.
I also wrote some links as hxxp:// to bypass the limitation.

Very cool to see that the Radare project is going to implement some parts in Rust.
