dump program in LLVM IR #575

Open
ivg opened this issue Oct 10, 2016 · 13 comments

Comments

@ivg
Member

ivg commented Oct 10, 2016

Motivation

Dumping a project as LLVM IR would open the door to many interesting projects, e.g., JIT compilation, running LLVM analyses, creating binaries, lifter verification, etc.

This can be a nice toy project for someone who would like to learn BAP, and it is a great way to learn both the LLVM and BAP intermediate representations.

Implementation

Since BAP IR is quite close to LLVM IR, a direct transformation should be easy. The proper place to hook it in is a pretty printer for the project data structure, registered as a project writer. Here is a skeleton setup.

Initial setup

  1. create a folder bir_to_llvm.
  2. create file bir_to_llvm.ml with the following initial contents:
open Core_kernel.Std
open Bap.Std
open Regular.Std
open Format

let pp_nop ppf t =
  fprintf ppf "%%r%a = add i1 0, 0@\n" Tid.pp (Term.tid t)

let pp_ret ppf = fprintf ppf "ret void@\n"

let pp_phi = pp_nop
let pp_def = pp_nop
let pp_jmp = pp_nop

let pp_elts ppf elts =
  Seq.iter elts ~f:(function
      | `Phi phi -> fprintf ppf "%a" pp_phi phi
      | `Def def -> fprintf ppf "%a" pp_def def
      | `Jmp jmp -> fprintf ppf "%a" pp_jmp jmp)

let pp_args ppf sub = ()

let pp_body ppf blks =
  Seq.iter blks ~f:(fun blk ->
      fprintf ppf "\n@[bb_%a:@\n%a@\n%t@]@\n"
        Tid.pp (Term.tid blk)
        pp_elts (Blk.elts blk)
        pp_ret)


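(* Prints the function's return type. Named pp_ret_type so that it does
   not shadow pp_ret above, which prints the `ret` terminator. *)
let pp_ret_type ppf _args =
  fprintf ppf "void"

let pp_sub ppf sub =
  let args = Term.enum arg_t sub in
  let blks = Term.enum blk_t sub in
  fprintf ppf "@[<2>define %a @%s(%a) {@\n%a@]@\n}"
    pp_ret_type args (Sub.name sub) pp_args args pp_body blks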

let pp_prog ppf prog =
  Term.enum sub_t prog |>
  Seq.iter ~f:(fprintf ppf "@[%a@]@\n" pp_sub)

let pp ppf proj =
  fprintf ppf "@[%a@]" pp_prog (Project.program proj)

let () =
  let writer = Data.Write.create ~pp () in
  Project.add_writer ~desc:"print program in LLVM IR"
    ~ver:"0.1" "llvm" writer
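
As a rough illustration of how a stub such as pp_def could later be fleshed out, here is a minimal sketch that flattens a nested expression into LLVM instructions. It deliberately uses only the OCaml standard library: the `exp` type, `emit`, and `llvm_of_def` are hypothetical stand-ins invented for this example, not BAP's `Exp.t` or any BAP API, and every value is assumed to be an i32.

```ocaml
(* Toy stand-in for BIR expressions; NOT BAP's Exp.t. *)
type exp =
  | Var of string          (* an already defined SSA value, e.g. "x" *)
  | Int of int             (* a constant, assumed to fit in i32 *)
  | Add of exp * exp
  | Sub of exp * exp

(* Emit the instructions that compute [e] into [buf] and return the
   LLVM operand (a constant or a %name) holding its value.
   [fresh] yields a new temporary name on every call. *)
let rec emit buf fresh e =
  match e with
  | Var v -> "%" ^ v
  | Int n -> string_of_int n
  | Add (a, b) -> binop buf fresh "add" a b
  | Sub (a, b) -> binop buf fresh "sub" a b

and binop buf fresh op a b =
  (* Operands are emitted left to right, then the operation itself. *)
  let x = emit buf fresh a in
  let y = emit buf fresh b in
  let t = fresh () in
  Buffer.add_string buf (Printf.sprintf "  %s = %s i32 %s, %s\n" t op x y);
  t

(* Translate a definition [lhs := rhs] into a sequence of instructions. *)
let llvm_of_def lhs rhs =
  let buf = Buffer.create 64 in
  let n = ref 0 in
  let fresh () = incr n; Printf.sprintf "%%t%d" !n in
  let v = emit buf fresh rhs in
  (* LLVM has no plain copy instruction, so bind lhs with "add v, 0",
     the same trick the skeleton's pp_nop uses. *)
  Buffer.add_string buf (Printf.sprintf "  %%%s = add i32 %s, 0\n" lhs v);
  Buffer.contents buf

let () =
  print_string (llvm_of_def "r" (Add (Var "x", Sub (Int 5, Var "y"))))
```

A real printer would additionally have to handle bit widths, memory loads and stores, and the remaining BIR operators, none of which this sketch attempts.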

Building and running

  1. build with bapbuild bir_to_llvm.plugin
  2. install with bapbundle install bir_to_llvm.plugin
  3. run with bap /bin/true -dllvm

or as a one liner:

bapbuild bir_to_llvm.plugin && bapbundle install bir_to_llvm.plugin && bap /bin/true -dllvm

Testing

The generated code should be accepted by llc:

bap /bin/true -dllvm > true.ll
llc true.ll

The command will emit a true.s file containing the assembly representation.

Alternative implementation

It would be even nicer to use Term.visitor to implement the printer. However, it relies on the object system, which may raise the bar for newcomers.
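
To give a feel for what an object-based printer looks like, here is a minimal standard-library sketch of the visitor pattern with OCaml classes. The `term` type and the `visitor` and `printer` classes are toy constructs invented for this example; they only mimic the general shape of a folding visitor and are not BAP's Term.visitor or its method signatures.

```ocaml
(* Toy stand-in for BIR terms; NOT BAP's term types. *)
type term =
  | Def of string
  | Jmp of string
  | Blk of string * term list

(* The base visitor folds an accumulator over the tree and leaves it
   unchanged at every node; subclasses override only the hooks they need. *)
class ['a] visitor = object (self)
  method enter_def (_ : string) (acc : 'a) = acc
  method enter_jmp (_ : string) (acc : 'a) = acc
  method enter_blk (_ : string) (acc : 'a) = acc
  method run (t : term) (acc : 'a) : 'a =
    match t with
    | Def d -> self#enter_def d acc
    | Jmp j -> self#enter_jmp j acc
    | Blk (name, ts) ->
      List.fold_left (fun acc t -> self#run t acc)
        (self#enter_blk name acc) ts
end

(* A printer that collects output lines in reverse order. *)
class printer = object
  inherit [string list] visitor
  method! enter_blk name acc = (name ^ ":") :: acc
  method! enter_def d acc = ("  ; def " ^ d) :: acc
  method! enter_jmp j acc = ("  br label %" ^ j) :: acc
end

let print_term t =
  String.concat "\n" (List.rev ((new printer)#run t []))

let () =
  print_endline (print_term (Blk ("bb_1", [Def "x"; Jmp "bb_2"])))
```

The appeal of this style is that the traversal lives in one place and a printer overrides only a few hooks; the cost is that the reader has to be comfortable with classes, inheritance, and `method!` overrides.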

@dnivra

dnivra commented Feb 26, 2017

Is BAP 0.8 available for download? bap.ece.cmu.edu doesn't seem to host it. I believe it had an LLVM code generator too. Perhaps someone might find it useful until this issue is resolved, either to use directly or as a reference for writing the LLVM IR translator.

@ivg
Member Author

ivg commented Feb 26, 2017

There are quite a few forks of the legacy BAP available around the Hub. You can use GitHub's search to find them. The first that comes to my mind is https://github.com/0day1day/bap

@dbrumley
Contributor

BAP 0.8 may be available someplace. I would warn that an LLVM IR translator for binaries has been tried by many, and often does not get you what you're looking for. Imagine LLVM IR with a single 1 MB function that uses only gotos. LLVM IR isn't designed for that. You can translate per function, but you still end up with lots of design choices, e.g., how to represent the stack (and shared stack frames).

Just my opinion, so take it for what it's worth: LLVM IR is the wrong thing for binary analysis. It's great for a compiler, but the right data structures for analyzing binaries (even though binaries are the result of compilation) are different from those for compilation itself.

The current BAP is what we think is the best approach.

@issue-sh

issue-sh bot commented Nov 9, 2017

ivg set pipeline to Icebox

@issue-sh issue-sh bot added the Icebox label Nov 9, 2017
@yuedeji

yuedeji commented Jan 30, 2018

Great work! I have a question about the BAP IR. Is it a "high-level" or a "low-level" IR? By "high-level" I mean an IR in its original form, without optimizations (i.e., no O1~O3). By "low-level" I mean an IR that is a direct translation from assembly code.

@ivg
Member Author

ivg commented Jan 30, 2018

It is low level, as it expands instructions down to the CPU microcode level, so it's lower than assembly or machine code.

@yueyuep

yueyuep commented Mar 26, 2019

Hello,
I have installed BAP and want to use it to transform an executable program into LLVM IR.
Do I need to follow these steps? Thank you very much.

@ivg
Member Author

ivg commented Mar 26, 2019

This issue basically says that dumping BIR into LLVM IR is not implemented, and it suggests a course of action for anyone who would like to implement it. Note that it is not trivial, so do not expect an easy trip. A few of us went down this road with no success :)

@yueyuep

yueyuep commented Mar 26, 2019

Thank you for your answer.
I don't understand BAP well, and I want to transform a binary program into LLVM IR (I read a paper that used the BAP tool). I have read about the BAP commands, but couldn't figure it out.

@ivg
Member Author

ivg commented Mar 26, 2019

It is not possible in modern BAP; that's why this issue is open.

@yueyuep

yueyuep commented Mar 26, 2019

OK, thanks.

@XVilka
Contributor

XVilka commented Apr 1, 2020

Curiously, with the modern move of BAP to the KB and CT, implementing something like this might be easier (or might not, depending on some conversion peculiarities).

@zyt755

zyt755 commented Jun 27, 2021

Excuse me, this issue is still open. Does that mean that dumping BIR into LLVM IR is not implemented yet?
No one has gone down this road with success =.=

7 participants