Skip to content

Initial support for AArch32 #987

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Draft
wants to merge 23 commits into
base: master
Choose a base branch
from
Draft

Initial support for AArch32 #987

wants to merge 23 commits into from

Conversation

chameco
Copy link
Contributor

@chameco chameco commented Jan 4, 2021

This PR adds support for 32-bit ARM verification using LLVM specifications, introducing a new top-level function llvm_verify_aarch32 with an interface similar to llvm_verify_x86.

I've marked this as a draft since it's very hacky and fragile right now.
A few things that I'd like to fix before merging:

  • Generalize the utility functions copied from SAWScript.X86 and SAWScript.X86Spec, and use them in both SAWScript.Crucible.LLVM.X86 and SAWScript.Crucible.LLVM.AArch32.
  • Share as much code as possible between the x86_64 and AArch32 support. Ideally, there would be a common interface (at least internally) to all Macaw backends, parameterized by architecture information, ABI, etc.
  • Potentially reuse more code from asl-translator.
  • Add integration tests.

There's also a problem of compile time. Some of the new dependencies take significantly longer to build, which could negatively impact the CI experience.

In the long run, I'd like to make these low-level backends less of a "walled garden". Right now, the llvm_verify_<arch> commands have two (tightly-coupled) steps: translate an LLVM spec to an <arch> spec according to the ABI, and then verify that <arch> spec. It might be nice to decouple these steps a bit to make future development cleaner.

Example usage! I used Zig for a convenient binary because it's really easy to cross compile.

// test.zig
export fn add(a: i32, b: i32) i32 { return a + b; }
pub fn main() void {}

zig build-exe test.zig -target arm-linux --release-fast will build a 32-bit ARM ELF with

add:
    add r0, r1, r0
    bx lr

To test:

// test.c
extern int add(int a, int b);
int main() { return add(1, 0); }

clang -m32 -c -emit-llvm test.c will build some reasonable bitcode that uses add.

// test.saw
enable_experimental;
m <- llvm_load_module "test.bc";
let spec = do {
  a <- crucible_fresh_var "a" (llvm_int 32);
  b <- crucible_fresh_var "b" (llvm_int 32);
  crucible_execute_func [crucible_term a, crucible_term b];
  crucible_return (crucible_term {{ a + b : [32] }});
};
llvm_verify_aarch32 m "test" "add" [] false spec w4;

@andreistefanescu
Copy link
Contributor

cc @travitch

@travitch
Copy link
Contributor

travitch commented Jan 4, 2021

Neat! This is exciting. The compile times are a huge problem generally - I'd love to find some ways to improve them.

Is there anything in particular I can do to support this? It seems like a good opportunity for us to take stock of what shared infrastructure is lacking around the binary analysis symbolic execution backends.

@sauclovian-g
Copy link
Contributor

This is not going to be buildable until someone updates asl-parser to use text >= 2; currently it demands text < 2 and that's inconsistent with things it depends on.

@RyanGlScott
Copy link
Contributor

This is not going to be buildable until someone updates asl-parser to use text >= 2; currently it demands text < 2 and that's inconsistent with things it depends on.

This surprises me, as there is no upper bound on text in asl-parser.cabal at the given submodule commit. Is some other submodule causing the build plan to reject text >= 2? (My bet would be that the dismantle, asl-translator, and/or semmc submodules would also need to be bumped, not just macaw.)

@sauclovian-g
Copy link
Contributor

Erm. It's asl-translator, not asl-parser. Also it occurs to me I didn't think to check whether those submodule references were anything like current and I'm sure they aren't.

@RyanGlScott
Copy link
Contributor

The compile times are a huge problem generally - I'd love to find some ways to improve them.

I've opened GaloisInc/macaw#480 to discuss this.

@RyanGlScott
Copy link
Contributor

Regarding the submodules, it almost suffices to just bump asl-translator and friends. The catch is that saw-script is using a newer crucible submodule commit than what asl-translator and friends are using, and some minor code changes are required to make them build against the newer crucible commit. I've opened GaloisInc/asl-translator#57 and GaloisInc/macaw#481 to do this.

@RyanGlScott
Copy link
Contributor

Now that the submodule bumps above have landed, I tried to build this branch using them, but I discovered that the code in SAWScript.Crucible.LLVM.AArch32 has bitrotted quite a bit. As one example, it relies on functionality from crucible-saw-core, which is a package that no longer exists. Removing that reveals quite a few other imports that have moved around or otherwise no longer exist. I imagine that it would take some doing to fix up everything.

(I merged macaw master with the small aarch32 commit in macaw and this
is the result)

(There was an earlier version of this commit that references a broken
version of the macaw commit; that commit got reconned and so did the
one referencing it, and if you're seeing this both should have
disappeared. This note is to prevent confusion in case one of them
comes back from the dead again at some point.)
(merge branch 'master' into aarch32)
(the new git submodules on this branch, aarch32, are exactly the ones
updated in this commit, since all were quite out of date)
macaw-loader-aarch32 is now needed by the other aarch32 bits;
what4-serialize was merged into what4 upstream.
This will be SAWScript.Crucible.CrucibleSAW.SAWCore in the module
namespace, at least for now. (The old module path was
Lang.Crucible.Backend.SAWCore.)

This is a verbatim copy of the file from crucible-saw as it existed
when it was deleted from the crucible tree two months ago. It won't
build without updating its module header, but I'm going to commit that
separately for ease of possible later rearrangements.
(not clear why, not worth arguing about)
Interface changes have assimilated some of the code that was here into
the Crucible OnlineBackend interface. So we don't need that any
more. However, those changes also make the initialization we need to
do circular. The only viable way to break that cycle without upstream
interface changes is to set the initial online state to Nothing and
then update it after we've constructed the objects. This means it will
always be Just at runtime, so fake out the prior field accessor to
unwrap the Maybe and panic if it's Nothing, which it should never be.

XXX: This is using the wrong panic interface; clean that up later if
XXX: this code sticks around.

XXX: This is a hack. If this code sticks around, we should look into
XXX: upstream interface changes to avoid the cyclic initialization.
XXX: this can almost certainly be done better
@sauclovian-g
Copy link
Contributor

I took a look at it and concluded the path of least resistance was to resurrect crucible-saw-core temporarily, so I did that, and got it probably 3/4 of the way to building. And probably spent longer on it than I should have, and it may not have been the right idea anyway, and some of the changes should be polished further if they're wanted (e.g. actually removing obsolete stuff instead of commenting it out). However, having done so I figured it was better to push what I had for later use.

If it was a bad idea we can always prune it and start over at c8509bc.

@sauclovian-g
Copy link
Contributor

The CI is still failing on deps, but it was getting past that for me, I expect because I was test-building without applying one of the freeze files. FWIW

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants