Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

A few discrepancies in X86-64 Instruction Semantics #376

Closed
sdasgup3 opened this issue Nov 5, 2019 · 5 comments
Closed

A few discrepancies in X86-64 Instruction Semantics #376

sdasgup3 opened this issue Nov 5, 2019 · 5 comments
Assignees
Labels
bug x86 Related to x86/x86-64/AMD64 lifting support

Comments

@sdasgup3
Copy link

sdasgup3 commented Nov 5, 2019

Hello Team,
I was validating McSema's semantics of various x86-64 instructions against the formal sematics using solver checks and found the following discrepancies.

Example Instruction Potential Reason Affected Variants
xaddq %rax, %rax R4 (5) xaddq_r64_r64 xaddb_r8_r8 xaddb_rh_rh xaddl_r32_r32 xaddw_r16_r16
andnps %xmm2, %xmm1 R1 (20) andnps_xmm_xmm andnq_r64_r64_r64 pandn_xmm_xmm vandnpd_xmm_xmm_xmm vandnpd_ymm_ymm_ymm vandnps_xmm_xmm_xmm vandnps_ymm_ymm_ymm vpandn_xmm_xmm_xmm vpandn_ymm_ymm_ymm andnl_r32_r32_m32 andnpd_xmm_m128 andnps_xmm_m128 andnq_r64_r64_m64 pandn_xmm_m128 vandnpd_xmm_xmm_m128 vandnpd_ymm_ymm_m256 vandnps_xmm_xmm_m128 vandnps_ymm_ymm_m256 vpandn_xmm_xmm_m128 vpandn_ymm_ymm_m256
pmuludq %xmm2, %xmm1 R2 (2) pmuludq_xmm_xmm pmuludq_xmm_m128
cmpxchgl %ecx, %ebx R3 (1) cmpxchgl_r32_r32
cmpxchgb %ah, %al R5 (1) cmpxchgb_r8_rh

Reasons

  • R4
xaddq %rax, %rbx expects the operations (1)
temp ← %rax + %rbx, (2) %rax ← %rbx, and (3) %rbx
← temp, in that order. McSema performs the same operation
differently as (A) old_rbx = %rbx, (B) temp ← %rax +
%rbx, (C) %rbx ← temp, and (D) %rax ← old_rbx. This
will fail to work when the operands are the same registers.
  • R1
    The Intel Manual says the implementation should be DEST←NOT(DEST) AND SRC, whereas McSema performs DEST←NOT(SRC) AND DEST

  • R2
    The Intel Manual says the implementation should be

 DEST[63:0] ← DEST[31:0] ∗ SRC[31:0];
 DEST[127:64] ← DEST[95:64] ∗ SRC[95:64];

OTOH McSema performs DEST[63:0] ← DEST[31:0] ∗ SRC[31:0];

  • R3 and R5
    As per the Manual, the semantics should be
TEMP ← DEST

IF accumulator = TEMP
    THEN
        ZF ← 1;
        DEST ← SRC;
    ELSE
        ZF ← 0;
        accumulator ← TEMP;
        DEST ← TEMP;
FI;

For cmpxchgl %ecx, %ebx
However, McSema compares the entire 64'DEST, which is TEMP as per the above pseudocode, against the accumulator Concat(32'0, RAX[31:0])

For cmpxchgb %ah, %al,
The control should get into the THEN part which must lead to DEST (al) <- SRC(ah). However, McSema keeps the lower 8 bits of RAX unchanged.

Please note that all the bugs are double-checked by looking into the lifted IR that McSema generated for these cases. We hope that this information might be useful to you.
Let me know your opinion.

@sdasgup3 sdasgup3 changed the title Bugs: Instruction Semantics Discrepancies in Instruction Semantics Nov 5, 2019
@sdasgup3 sdasgup3 changed the title Discrepancies in Instruction Semantics Discrepancies in X86-64 Instruction Semantics Nov 5, 2019
@sdasgup3 sdasgup3 changed the title Discrepancies in X86-64 Instruction Semantics A Few discrepancies in X86-64 Instruction Semantics Nov 5, 2019
@sdasgup3 sdasgup3 changed the title A Few discrepancies in X86-64 Instruction Semantics A few discrepancies in X86-64 Instruction Semantics Nov 5, 2019
@pgoodman pgoodman transferred this issue from lifting-bits/mcsema Nov 13, 2019
@pgoodman pgoodman added bug x86 Related to x86/x86-64/AMD64 lifting support labels Nov 13, 2019
@pgoodman
Copy link
Contributor

@sdasgup3 Can you verify if the fixes are acceptable?

@sdasgup3
Copy link
Author

Sure, I will.

@sdasgup3
Copy link
Author

sdasgup3 commented Dec 14, 2019

Thanks, @kumarak & @pgoodman for the fixes.

Most of the semantics got fixed except for the followings. Also, I have attached the artfacts against which I am comparing.

  • cmpxchgl %ecx, %ebx
    In the event of %ebx != %eax, the accumulator (%eax) should be updated with dest. But the higher 32 bits of %rax needs to be zeroed out, which seems to be missing.

The X86 semantics generates the following summary for the %rax register (please refer to artifacts/cmpxchgl_r32_r32/Output/test-z3.py)

xvar = (V_R == z3.If((z3.Extract(31, 0, VX_RAX) == z3.Extract(31, 0, VX_RBX)), VX_RAX, z3.Concat(z3.BitVecVal(0, 32), z3.Extract(31, 0, VX_RBX))))

and symbolically executing the LLVM IR that remill generates gives the following summary (with embedded comments to highlight the potential discrepenacy)

lvar = z3.And(

        # Case: Accumulator != dest
	z3.Implies(
		(z3.If((z3.Concat(z3.Extract(31, 24, VL_RBX),z3.Extract(23, 16, VL_RBX),z3.Extract(15, 8, VL_RBX),z3.Extract(7, 0, VL_RBX),) == z3.Concat(z3.Extract(31, 24, VL_RAX),z3.Extract(23, 16, VL_RAX),z3.Extract(15, 8, VL_RAX),z3.Extract(7, 0, VL_RAX),)), z3.BitVecVal(1, 8), z3.BitVecVal(0, 8)) == z3.BitVecVal(0, 8)), 

		V_R == z3.Concat(z3.Extract(63, 56, VL_RAX), z3.Extract(55, 48, VL_RAX), z3.Extract(47, 40, VL_RAX), z3.Extract(39, 32, VL_RAX), # <-- Most significans 8 bytes must be zero'ed.
                  z3.Extract(31, 24, z3.Concat(z3.Extract(31, 24, VL_RBX),z3.Extract(23, 16, VL_RBX),z3.Extract(15, 8, VL_RBX),z3.Extract(7, 0, VL_RBX),)),
                  z3.Extract(23, 16, z3.Concat(z3.Extract(31, 24, VL_RBX),z3.Extract(23, 16, VL_RBX),z3.Extract(15, 8, VL_RBX),z3.Extract(7, 0, VL_RBX),)),
                  z3.Extract(15, 8, z3.Concat(z3.Extract(31, 24, VL_RBX),z3.Extract(23, 16, VL_RBX),z3.Extract(15, 8, VL_RBX),z3.Extract(7, 0, VL_RBX),)),
                  z3.Extract(7, 0, z3.Concat(z3.Extract(31, 24, VL_RBX),z3.Extract(23, 16, VL_RBX),z3.Extract(15, 8, VL_RBX),z3.Extract(7, 0, VL_RBX),)))
	), 
  ............... omitted for brevity .....................
)

You may reproduce the error using

cd artifacts/cmpxchgl_r32_r32/
make provez3
  • cmpxchgb %ah, %al
    Discrepancy with the update of rax register

  • xaddb %ah, %ah & xaddw %ax, %ax
    The mcsema generated files does not seem to get updated since last time the bugs where files
    . (Attached the respective ll files)

artifacts.tar.gz

@pgoodman
Copy link
Contributor

@sdasgup3 does the issue_376 branch resolve these issues?

@sdasgup3
Copy link
Author

Hi @pgoodman
Thanks for the update. Here is my test outcome.

  1. cmpxchgl %ecx, %ebx
    Now we are correctly handling case %ebx != %eax
    However, in the event of %ebx == %eax, the accumulator (%eax), along with its upper 32 bits, should remain unaffected. Whereas, the current implementation zero-extends in either cases. For example, the curent implementation says
rax = Concat(32'0, Extract(32, 0, %rax)) if %ebx == %eax // zero extension not needed
         Concat(32'0, Extract(32, 0, %rbx)), Otherwise
  1. cmpxchgb %ah, %al
    The following explains the difference.
// Current Implementaton
lvar = (V_R == z3.Concat(
      z3.Extract(63, 56, VL_RAX),
      z3.Extract(55, 48, VL_RAX),
      z3.Extract(47, 40, VL_RAX),
      z3.Extract(39, 32, VL_RAX),
      z3.Extract(31, 24, VL_RAX),
      z3.Extract(23, 16, VL_RAX),
      z3.Extract(15, 8, VL_RAX),
      z3.Extract(7, 0, z3.Extract(7, 0, VL_RAX))))

// Seemngly correct mplemenntation
xvar = (V_R == z3.Concat(z3.Extract(63, 8, VX_RAX), z3.Extract(15, 8, VX_RAX)))

Let me know if I need to provide any other information.

pgoodman added a commit that referenced this issue Nov 5, 2020
@pgoodman pgoodman closed this as completed Nov 5, 2020
pgoodman pushed a commit that referenced this issue Nov 5, 2020
* New x86 instructions

* Add some isels

* Fixes Issue #376

* Fixes Issue #433. Thanks @adahsuzixin for the semantics and tests

* Fixes Issue #374

* Minor fix to the semantics for VINSERTF128, it should only look at the low bit of imm8

* Minor fixes for sparc isel naming
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
bug x86 Related to x86/x86-64/AMD64 lifting support
Projects
None yet
Development

No branches or pull requests

4 participants