A segmentation fault occurs when running a simple ELF file #461

Closed
kjhhgt76 opened this issue Jun 17, 2024 · 9 comments · Fixed by #462

Comments

@kjhhgt76

kjhhgt76 commented Jun 17, 2024

hello.elf runs fine.
But when I tried to run my own mul.elf, a segmentation fault occurred.
I tried some simple gdb debugging to find out why, but gave up because I am unfamiliar with the emulator.
Can someone explain the cause?

Emulator build configuration:
Every ENABLE_* option in the Makefile is disabled.

$ cc --version      
cc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)     
Copyright (C) 2015 Free Software Foundation, Inc.    
This is free software; see the source for copying conditions.  There is NO    
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.     

hello.elf result:

$ ./rv32emu hello.elf 
Hello World!
Hello World!
Hello World!
Hello World!
Hello World!
inferior exit code 0

mul.elf result:

$ ./rv32emu mul.elf
Segmentation fault (core dumped)

The objdump of mul.elf:
mul_objdump.txt

c source code:

int mul(int a, int b)
{
    a = 0x12345678;
    b = 0x87654321;
    return a * b;
}

It is just an integer multiply function.

@jserv
Contributor

jserv commented Jun 17, 2024

Avoid uploading screenshots that contain only text. Instead, use Markdown formatting to present command lists, source code, and any text-based content. This approach is more accessible for individuals with visual impairments.

Please update the issue with steps to reproduce the problem you're encountering, including a minimal piece of source code that illustrates the issue. Why are you using GCC-4.8.5?

@kjhhgt76
Author

kjhhgt76 commented Jun 17, 2024

> Avoid uploading screenshots that contain only text. Instead, use Markdown formatting to present command lists, source code, and any text-based content. This approach is more accessible for individuals with visual impairments.
>
> Please update the issue with steps to reproduce the problem you're encountering, including a minimal piece of source code that illustrates the issue. Why are you using GCC-4.8.5?

This is the default GCC version on my system, which runs CentOS 7. Do you think this old GCC version causes the issue?

@jserv
Contributor

jserv commented Jun 17, 2024

I attempted to reproduce with the following steps:

  1. Prepare C code (mul.c)
int mul(int a, int b)
{
    a = 0x12345678;
    b = 0x87654321;
    return a * b;
}

int main(int argc, char *argv[])
{
    return mul(argc, argc + 1) % 128;
}
  2. Compile with GNU Toolchain for RISC-V, built by xPack
$ riscv-none-elf-gcc -o mul.elf -march=rv32i -mabi=ilp32 mul.c
  3. Run via rv32emu
$ build/rv32emu mul.elf
inferior exit code 120
  4. Validate the result with Python
$ python3 -c "print(0x12345678 * 0x87654321 % 128)"
120

So far, everything works as expected.

@visitorckw
Collaborator

> c source code:
>
> int mul(int a, int b)
> {
>     a = 0x12345678;
>     b = 0x87654321;
>     return a * b;
> }
>
> It is just an integer multiply function.

Although this is probably unrelated to the segmentation fault, this code overflows a signed integer in a * b, which is undefined behavior.
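One way to sidestep the overflow, as a minimal sketch: do the multiplication on unsigned 32-bit values, where wraparound modulo 2^32 is well defined. The helper name mul_wrap is hypothetical and is not part of the reporter's code.

```c
#include <stdint.h>

/* Hypothetical helper (not from the issue): multiply two 32-bit values
 * with well-defined wraparound. Unsigned arithmetic is reduced modulo
 * 2^32, so there is no undefined behavior. This assumes int is at most
 * 32 bits wide, so that uint32_t is not promoted to a signed type. */
static uint32_t mul_wrap(uint32_t a, uint32_t b)
{
    return a * b;
}
```

Because 128 divides 2^32, mul_wrap(0x12345678u, 0x87654321u) % 128 agrees with the Python check earlier in the thread.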

@kjhhgt76
Author

kjhhgt76 commented Jun 17, 2024

My ELF generation differs from yours: my C code has no main(), and the start point is set by a linker script.

  1. Prepare mul.c and mul.o, using riscv-none-embed-gcc (xPack GNU RISC-V Embedded GCC x86_64) 10.2.0
int mul(int a, int b)
{
    a = 0x12345678;
    b = 0x87654321;
    return a * b;
}

riscv-none-embed-gcc -Wall -nostdlib -nostartfiles -ffreestanding -march=rv32i -c mul.c -o mul.o

  2. Prepare setup.s and setup.o, using GNU assembler (xPack GNU RISC-V Embedded GCC x86_64) 2.35
sp_setup: li sp, 32768   # for setting up the stack pointer

riscv-none-embed-as --warn --fatal-warnings -march=rv32i setup.s -o setup.o

  3. Prepare link_script.ld
OUTPUT_ARCH("riscv")
SECTIONS
{
    . = 0x0;
    .text : { setup.o (.text)
                *(.text)
            }
}
  4. Link the objects and produce mul.elf, using riscv-none-embed-gcc (xPack GNU RISC-V Embedded GCC x86_64) 10.2.0
riscv-none-embed-gcc -nostartfiles -march=rv32i -T link_script.ld -o mul.elf mul.o setup.o

riscv-none-embed-objdump -D mul.elf:

mul.elf:     file format elf32-littleriscv


Disassembly of section .text:

00000000 <sp_setup>:
   0:	00008137          	lui	sp,0x8

00000004 <mul>:
   4:	fe010113          	addi	sp,sp,-32 # 7fe0 <__mulsi3+0x7f88>
   8:	00112e23          	sw	ra,28(sp)
   c:	00812c23          	sw	s0,24(sp)
  10:	02010413          	addi	s0,sp,32
  14:	fea42623          	sw	a0,-20(s0)
  18:	feb42423          	sw	a1,-24(s0)
  1c:	123457b7          	lui	a5,0x12345
  20:	67878793          	addi	a5,a5,1656 # 12345678 <__mulsi3+0x12345620>
  24:	fef42623          	sw	a5,-20(s0)
  28:	876547b7          	lui	a5,0x87654
  2c:	32178793          	addi	a5,a5,801 # 87654321 <__mulsi3+0x876542c9>
  30:	fef42423          	sw	a5,-24(s0)
  34:	fe842583          	lw	a1,-24(s0)
  38:	fec42503          	lw	a0,-20(s0)
  3c:	01c000ef          	jal	ra,58 <__mulsi3>
  40:	00050793          	mv	a5,a0
  44:	00078513          	mv	a0,a5
  48:	01c12083          	lw	ra,28(sp)
  4c:	01812403          	lw	s0,24(sp)
  50:	02010113          	addi	sp,sp,32
  54:	00008067          	ret

00000058 <__mulsi3>:
  58:	00050613          	mv	a2,a0
  5c:	00000513          	li	a0,0
  60:	0015f693          	andi	a3,a1,1
  64:	00068463          	beqz	a3,6c <__mulsi3+0x14>
  68:	00c50533          	add	a0,a0,a2
  6c:	0015d593          	srli	a1,a1,0x1
  70:	00161613          	slli	a2,a2,0x1
  74:	fe0596e3          	bnez	a1,60 <__mulsi3+0x8>
  78:	00008067          	ret

Disassembly of section .comment:

00000000 <.comment>:
   0:	3a434347          	fmsub.d	ft6,ft6,ft4,ft7,rmm
   4:	2820                	fld	fs0,80(s0)
   6:	5078                	lw	a4,100(s0)
   8:	6361                	lui	t1,0x18
   a:	4e47206b          	0x4e47206b
   e:	2055                	jal	b2 <__mulsi3+0x5a>
  10:	4952                	lw	s2,20(sp)
  12:	562d4353          	0x562d4353
  16:	4520                	lw	s0,72(a0)
  18:	626d                	lui	tp,0x1b
  1a:	6465                	lui	s0,0x19
  1c:	6564                	flw	fs1,76(a0)
  1e:	2064                	fld	fs1,192(s0)
  20:	20434347          	fmsub.s	ft6,ft6,ft4,ft4,rmm
  24:	3878                	fld	fa4,240(s0)
  26:	5f36                	lw	t5,108(sp)
  28:	3436                	fld	fs0,360(sp)
  2a:	2029                	jal	34 <mul+0x30>
  2c:	3031                	jal	fffff838 <__mulsi3+0xfffff7e0>
  2e:	322e                	fld	ft4,232(sp)
  30:	302e                	fld	ft0,232(sp)
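
For readers following the disassembly: -march=rv32i has no hardware multiply instruction (that belongs to the M extension), so the compiler calls the library routine __mulsi3 instead. The sketch below restates the __mulsi3 loop from the dump in C. The function name mulsi3_sketch is made up for illustration, and this is a reading of the listing rather than the library's actual source.

```c
#include <stdint.h>

/* Shift-and-add multiply that mirrors the __mulsi3 loop in the dump.
 * The name mulsi3_sketch is hypothetical. */
static uint32_t mulsi3_sketch(uint32_t a, uint32_t b)
{
    uint32_t multiplicand = a;      /* mv   a2,a0 */
    uint32_t result = 0;            /* li   a0,0 */

    do {
        if (b & 1)                  /* andi a3,a1,1 ; beqz a3,... */
            result += multiplicand; /* add  a0,a0,a2 */
        b >>= 1;                    /* srli a1,a1,0x1 */
        multiplicand <<= 1;         /* slli a2,a2,0x1 */
    } while (b != 0);               /* bnez a1,... */

    return result;                  /* ret */
}
```

The result equals the product reduced modulo 2^32, which is the same value a hardware mul would leave in the destination register.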

@jserv
Contributor

jserv commented Jun 17, 2024

> My ELF generation differs from yours: my C code has no main(), and the start point is set by a linker script.

You should specify the entry point that can jump into the mul function with the proper calling convention.
See https://github.com/sysprog21/rv32emu/blob/master/tests/asm-hello/hello.S

@kjhhgt76
Author

kjhhgt76 commented Jun 17, 2024

> > My ELF generation differs from yours: my C code has no main(), and the start point is set by a linker script.
>
> You should specify the entry point that can jump into the mul function with the proper calling convention. See https://github.com/sysprog21/rv32emu/blob/master/tests/asm-hello/hello.S

I did specify the entry point: it is 0x0. mul() starts at 0x4, and 0x0 holds just one instruction, which loads sp with the value 32768. So that shouldn't cause the problem.

@ChinYikMing
Collaborator

I tried it out and got a segmentation fault as well. After investigating, I found that the fault arises while emulating the ret instruction. ret is a pseudo-instruction for jalr, and the jalr implementation maintains a branch history table via LOOKUP_OR_UPDATE_BRANCH_HISTORY_TABLE. Inside LOOKUP_OR_UPDATE_BRANCH_HISTORY_TABLE:

...
if (ir->branch_table->PC[i] == PC) {
    MUST_TAIL return ir->branch_table->target[i]->impl(
        rv, ir->branch_table->target[i], cycle, PC);
}
...

This if statement misfires when PC is 0x0, because every ir->branch_table->PC[i] (i = 0, 1, 2, 3, ..., HISTORY_SIZE-1) starts out as 0x0: the table is allocated with calloc, which zero-initializes the memory.

The trivial solution is to initialize ir->branch_table->PC[i] (i = 0, 1, 2, 3, ..., HISTORY_SIZE-1) with a value other than 0x0, for example -1 converted to an unsigned integer. Since HISTORY_SIZE is pretty small (16), I believe the performance cost is negligible.
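
The failure mode and the proposed fix can be sketched in isolation. Everything below is a simplified, hypothetical stand-in: branch_table_t, block_t, table_new, and table_lookup are illustrative names, not the emulator's actual definitions. Only HISTORY_SIZE (16), the calloc zero-initialization, and the -1 sentinel come from the discussion above.

```c
#include <stdint.h>
#include <stdlib.h>

#define HISTORY_SIZE 16       /* size quoted in the comment above */
#define INVALID_PC UINT32_MAX /* -1 as an unsigned 32-bit value */

/* Hypothetical, simplified stand-ins for the emulator's structures. */
typedef struct {
    int id;
} block_t;

typedef struct {
    uint32_t PC[HISTORY_SIZE];
    block_t *target[HISTORY_SIZE];
} branch_table_t;

/* Allocate a table. With use_sentinel == 0 this mimics plain calloc:
 * every PC slot is 0x0 and, on mainstream platforms, every target
 * pointer is NULL. */
static branch_table_t *table_new(int use_sentinel)
{
    branch_table_t *t = calloc(1, sizeof(*t));
    if (t && use_sentinel) {
        for (int i = 0; i < HISTORY_SIZE; i++)
            t->PC[i] = INVALID_PC;
    }
    return t;
}

/* Return 1 if pc matches a slot and store that slot's target in *out;
 * return 0 on a miss. */
static int table_lookup(const branch_table_t *t, uint32_t pc, block_t **out)
{
    for (int i = 0; i < HISTORY_SIZE; i++) {
        if (t->PC[i] == pc) {
            *out = t->target[i];
            return 1;
        }
    }
    return 0;
}
```

With a zeroed table, table_lookup(t, 0x0, &b) reports a hit and hands back a NULL target, which is the pointer the emulator would then dereference. With the sentinel, the same lookup misses. One reason 0xFFFFFFFF seems a safe sentinel is that it is odd, while RISC-V instruction addresses are at least 2-byte aligned.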

ChinYikMing added a commit to ChinYikMing/rv32emu that referenced this issue Jun 17, 2024
If the ra (return address) is 0x0, the
LOOKUP_OR_UPDATE_BRANCH_HISTORY_TABLE will behave abnormally since
calloc initializes ir->branch_table->PC[i] to 0x0. The 0x0 address might
not yet be translated to a valid block, thus ir->branch_table->target[i]
might be NULL, and calling a NULL function pointer causes a segmentation
fault. It can be solved by initializing ir->branch_table->PC with a value
other than 0x0. Here, I choose unsigned integer of -1.

Close sysprog21#461
@ChinYikMing
Collaborator

ChinYikMing commented Jun 17, 2024

> My ELF generation differs from yours: my C code has no main(), and the start point is set by a linker script.

Note that your linker script traps the program in an infinite loop even when no segmentation fault occurs.
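
A plausible reading of this remark, sketched under assumptions: nothing calls mul, so ra still holds 0x0 when mul runs (the case the referenced commit message describes), and the final ret jumps back to address 0x0, where sp_setup and mul start over. The toy model below follows only the control flow of the disassembly posted earlier. count_restarts is a hypothetical helper, ra == 0 at startup is an assumption, and the __mulsi3 loop is treated as a single pass.

```c
#include <stdint.h>

/* Hypothetical model: count how often execution reaches address 0x0
 * within max_steps instructions. Only instructions that touch ra or
 * redirect control flow are modelled; everything else falls through
 * to the next 4-byte instruction. */
static int count_restarts(int max_steps)
{
    uint32_t pc = 0x0;       /* entry point from the linker script */
    uint32_t ra = 0x0;       /* assumption: ra is zero at startup */
    uint32_t saved_ra = 0x0; /* stack slot written by sw ra,28(sp) */
    int restarts = 0;

    for (int step = 0; step < max_steps; step++) {
        if (pc == 0x0)
            restarts++;
        switch (pc) {
        case 0x08: /* sw ra,28(sp) */
            saved_ra = ra;
            pc += 4;
            break;
        case 0x3c: /* jal ra,58 <__mulsi3> */
            ra = pc + 4;
            pc = 0x58;
            break;
        case 0x48: /* lw ra,28(sp) */
            ra = saved_ra;
            pc += 4;
            break;
        case 0x54: /* ret from mul */
        case 0x78: /* ret from __mulsi3 */
            pc = ra;
            break;
        default:
            pc += 4;
            break;
        }
    }
    return restarts;
}
```

In this model one pass takes 31 instructions, after which the program counter is back at 0x0, so count_restarts grows without bound as max_steps increases. An entry stub that calls mul and then exits, as suggested earlier in the thread, would break the cycle.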

jserv changed the title from "segmentation fault happens when running a simple elf" to "A segmentation fault occurs when running a simple ELF file" on Jun 17, 2024
ChinYikMing added a commit to ChinYikMing/rv32emu that referenced this issue Jun 17, 2024
If the ra (return address) is 0x0, the
LOOKUP_OR_UPDATE_BRANCH_HISTORY_TABLE will behave abnormally since
calloc initializes ir->branch_table->PC[i] to 0x0. The address 0x0 might
not yet be translated to a valid block, thus ir->branch_table->target[i]
might be NULL, and accessing a NULL pointer causes a segmentation fault.
It can be solved by initializing ir->branch_table->PC with a value other
than 0x0. Here, I choose unsigned integer of -1.

Close sysprog21#461
jserv added this to the release-2024.1 milestone Jun 18, 2024
vestata pushed a commit to vestata/rv32emu that referenced this issue Jan 24, 2025