Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Double ampersand in asm constraints #382

Closed
chrysn opened this issue Apr 24, 2022 · 5 comments
Closed

Double ampersand in asm constraints #382

chrysn opened this issue Apr 24, 2022 · 5 comments
Assignees

Comments

@chrysn
Copy link
Contributor

chrysn commented Apr 24, 2022

This assembly snippet causes a panic when transpiled using the current master (9c1600a) -- for testing, note it's ARM assembly as per the target in the compile commands:

/* test.c */
#include <stdint.h>
uint32_t __STREXB(uint8_t value, volatile uint8_t *addr)
{
   uint32_t result;

   asm volatile ("strexb %0, %2, %1" : "=&r" (result), "=Q" (*addr) : "r" ((uint32_t)value) );
   return(result);
}
[
{
  "arguments": [
    "clang",
    "-target",
    "arm-none-eabi",
    "test.c"
  ],
  "directory": "/tmp/strexb-test",
  "file": "/tmp/strexb-test/test.c",
  "output": "/tmp/strexb-test/test.o"
}
]
$ c2rust transpile test.json  --fail-on-error --overwrite-existing
Error while trying to load a compilation database:
Could not auto-detect compilation database from directory "/tmp/strexb-test/test.json"
No compilation database found in /tmp/strexb-test/test.json or any parent directory
fixed-compilation-database: Error while opening fixed database: Not a directory
json-compilation-database: Error while opening JSON database: Not a directory
Running without flags.
Transpiling test.c
thread 'main' panicked at '"&r" is not a valid Ident', /home/chrysn/.cargo/registry/src/github.heygears.com-1ecc6299db9ec823/proc-macro2-1.0.37/src/fallback.rs:701:9
stack backtrace:
   0: rust_begin_unwind
             at /rustc/de1bc0008be096cf7ed67b93402250d3b3e480d0/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/de1bc0008be096cf7ed67b93402250d3b3e480d0/library/core/src/panicking.rs:142:14
   2: proc_macro2::fallback::Ident::_new
   3: proc_macro2::Ident::new
   4: <alloc::string::String as c2rust_ast_builder::builder::Make<proc_macro2::Ident>>::make
   5: <alloc::vec::Vec<S> as c2rust_ast_builder::builder::Make<syn::path::Path>>::make
   6: c2rust_ast_builder::builder::Builder::path_expr
   7: c2rust_transpile::translator::assembly::<impl c2rust_transpile::translator::Translation>::convert_asm
   8: c2rust_transpile::cfg::CfgBuilder::convert_stmt_help
   9: c2rust_transpile::cfg::CfgBuilder::convert_stmts_help
  10: c2rust_transpile::translator::Translation::with_scope
  11: c2rust_transpile::cfg::Cfg<c2rust_transpile::cfg::Label,c2rust_transpile::cfg::StmtOrDecl>::from_stmts
  12: c2rust_transpile::translator::Translation::convert_function_body
  13: c2rust_transpile::translator::Translation::convert_function
  14: c2rust_transpile::translator::Translation::convert_decl
  15: c2rust_transpile::translator::translate
  16: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::fold
  17: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
  18: c2rust_transpile::transpile
  19: c2rust_transpile::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

I didn't track down far enough to see how the && came to be, but I've tracked things down to a crude workaround and stopgap:

diff --git a/c2rust-transpile/src/translator/assembly.rs b/c2rust-transpile/src/translator/assembly.rs
index f5cd5f32..8776a63d 100644
--- a/c2rust-transpile/src/translator/assembly.rs
+++ b/c2rust-transpile/src/translator/assembly.rs
@@ -61,7 +61,7 @@ fn parse_constraints(mut constraints: &str) ->
     };
 
     let early_clobber = if constraints.starts_with('&') {
-        constraints = &constraints[1..];
+        constraints = constraints.trim_start_matches('&');
         true
     } else {
         false
@@ -590,6 +590,7 @@ impl<'c> Translation<'c> {
             let constraints_ident = if is_regname_or_int {
                 mk().lit_expr(operand.constraints.trim_matches('"'))
             } else {
+                assert!(operand.constraints.chars().next() != Some('&'), "Constraints {:?} would fail later", operand.constraints);
                 mk().ident_expr(operand.constraints)
             };

I'm not even PR'ing this, because given that there is no double ampersand in the C constraint, my impression is that something is adding the ampersand in without checking whether there is already one present. Also, I couldn't test whether plain head trimming of all "&" characters even does the right thing (as the assembly snippet in question is unused in the RIOT project which I'm transpiling, but I can't easily remove it either as it's part of a larger vendor import).

@fw-immunant fw-immunant self-assigned this Apr 25, 2022
fw-immunant added a commit that referenced this issue Apr 25, 2022
not doing so could lead to a double-ampersand being emitted, which is not correct

see #382
@fw-immunant
Copy link
Contributor

I did some debugging and it turns out that the constraint that c2rust-transpile/src/translator/assembly.rs actually sees here is =&&r. This is because the C AST exporter does its own processing of inline assembly (to try to make use of Clang's understanding of it) and for some reason it doesn't correctly handle this scenario, even though I believe similar constraints are present in the test suite.

After a quick-and-dirty patch to fix double ampersand emission, a number of other things didn't work in your example, because c2rust was (as noted here) assuming that all inline assembly was x86_64, and in fact generally does not consider the target architecture when doing translation.

I've added logic to attempt to extract target architecture information from Clang flags (though not currently from the flags in the compilation db) and woven this info through our assembly translation; the result is in the fw/target-dependent-asm branch. I'd appreciate if you can try that branch and see if it works any better.

For the example you provided, it generates this output:

#[no_mangle]
pub unsafe extern "C" fn rust___STREXB(
    mut value: uint8_t,
    mut addr: *mut uint8_t,
) -> uint32_t {
    let mut result: uint32_t = 0;
    asm!(
        "strexb {0}, [{2}], {1}", out(reg) result, in (reg) & mut * addr, inlateout(reg)
        value as uint32_t => _, options(preserves_flags)
    );
    return result;
}

However, playing around with this code at godbolt (I don't have an ARM Rust toolchain locally), it seems we may not be generating the right templates for the assembly operands here--it complains of "invalid instruction", which only seems to be fixed if I manually change which argument is marked as a memory reference (strexb {0}, {2}, [{1}]).

@fw-immunant
Copy link
Contributor

I believe I've fixed the cause of the incorrect operand being marked as a memory reference in fead6b0.

@chrysn
Copy link
Contributor Author

chrysn commented Apr 26, 2022

Thank you. The fix does allow C2Rust to get through the translation, similar to my previous crude hack (but more properly).

Like with that, I wind up with tons of other errors later, but one at a time I guess.

fw-immunant added a commit that referenced this issue Apr 29, 2022
not doing so could lead to a double-ampersand being emitted, which is not correct

see #382
@chrysn
Copy link
Contributor Author

chrysn commented May 2, 2022

The current branch fw/target-dependent-asm (505fe05) transpiles the RIOT assembly well as far as I can tell. The only change I had to apply is:

diff --git a/c2rust-transpile/src/translator/assembly.rs b/c2rust-transpile/src/translator/assembly.rs
index 6f0a37de..66844808 100644
--- a/c2rust-transpile/src/translator/assembly.rs
+++ b/c2rust-transpile/src/translator/assembly.rs
@@ -62,7 +62,8 @@ fn parse_arch(target_tuple: &str) -> Option<Arch> {
     target_tuple.starts_with("armv8") ||
     target_tuple.starts_with("arm64") {
         Some(Arch::Aarch64)
-    } else if target_tuple.starts_with("arm") {
+    } else if target_tuple.starts_with("arm") ||
+    target_tuple.starts_with("thumbv") {
         Some(Arch::Arm)
     } else if target_tuple.starts_with("riscv") {
         Some(Arch::Riscv)

Judging from GCC's documentation on machine constraints, thumb and arm are the same for these purposes. (Generally they are different encodings of the same instruction set as far as I understand, so that fits).

Once this is merged (and maybe there is a release, even alpha/beta as in #388), I'd love to switch over to using this.

fw-immunant added a commit that referenced this issue May 3, 2022
not doing so could lead to a double-ampersand being emitted, which is not correct

see #382
@fw-immunant
Copy link
Contributor

Fixed in #383.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants