Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Add a variant of .definelabel that avoids getting incremented when used from an object file, even when defined in THUMB mode #248

Merged
merged 1 commit into from
Feb 2, 2025

Conversation

Zeturic
Copy link
Contributor

@Zeturic Zeturic commented Jan 26, 2025

The situation this is trying to help with is basically this:

Assume this is test.c and it gets compiled with -mthumb:

extern unsigned gLinkStatus;

unsigned getGLinkStatus() {
    return gLinkStatus;
}

And then this is test.asm:

.gba
.thumb

.open "rom.gba", "test.gba", 0x08000000

.org 0x0871A240
.importobj "test.o"

.close

.definelabel gLinkStatus, 0x03003F20

It assembles without complaint, but it's actually broken because the code inside the object file will actually try to read from 0x03003F21. This is because gLinkStatus was .definelabeled in THUMB mode, and even though it's not a function, it still gets incremented when used by object files.

You can fix it by doing

.arm
.definelabel gLinkStatus, 0x03003F20

But that's easy to forget and there's no error or warning if you do so.

This implements (among other things) a .defineobj/.defineobject that works the way .definelabel works in ARM mode, even if you're actually in THUMB mode. So you could just do .defineobj gLinkStatus, 0x03003F20 and it works.

Actually, this implements a few more variants of .definelabel: .definefunc, .definearmfunc, and .definethumbfunc (the first is an alias of .definelabel and the other two work the way you would probably expect from the names). I'm mostly interested in .defineobj, but they were easy to throw in.

While this works, I'm not really sure if this is really a reasonable way to do it. And even if it is generally sound, ArmInfoMode is a terrible name and it definitely doesn't belong in Commands/CAssemblerLabel.h. But I couldn't figure out a better name or place for it.

If you have any suggestions or concerns, please let me know.

@Kingcom
Copy link
Owner

Kingcom commented Jan 26, 2025

Thanks for the contribution! I agree that .definelabel is a bit error prone for ARM. Some thoughts on the changes:

  • a possible alternative to ArmInfoMode might be passing an std::optional<int> info to the constructor of the label, which overrides the default behavior if set
  • .define(arm/thumb)function is a misnomer at this point, as functions in armips have a size that can be written to .sym files - which these directives do not take
  • similarly, I find .defineobject confusing. I understand where you are coming from, but armips does not know the notion of an object - only functions and labels -, and an object would also want a size

I would probably prefer to just stick to labels for now, and name the directives accordingly - so only .definearmlabel/.definethumblabel. It would be great if you could add them to the readme, too!

@Zeturic
Copy link
Contributor Author

Zeturic commented Jan 26, 2025

First off, thank you for looking at this so quickly.

The std::optional suggestion is great. I don't know why I didn't think of that.

As far as the "object" naming, that mostly come out the way obdjump's output (and presumably ELF) distinguishes between "functions", and "objects", but yeah, they should technically have a size, so not quite accurate here.

Really, I was trying to come up with a name that wasn't so ARM-specific. What was bothering me about it, really, is that it's strange to have to do .definearmlabel gLinkStatus, 0x03003F20 when gLinkStatus isn't ARM in any meaningful sense - it's just data, and the C code it's originally from was even compiled in THUMB mode. Would you be okay with, in addition to the ARM-specific .definearmlabel and .definethumblabel, if I added an architecture-agnostic .definedatalabel (that, yes, for non ARM-achitectures would behave identically to .definelabel and for ARM architectures would behave identically to .definearmlabel)? If not, that's fine too.

As far as the README, I intended to do that with the first version and it completely slipped my mind.

@Zeturic Zeturic changed the title Add a variant of .definelabel that avoids getting incremented when used from an object file, even when defined in THUMB mode Add a variant of .definelabel that avoids getting incremented when used from an object file, even when defined in THUMB mode Add a variant of .definelabel that avoids getting incremented when used from an object file, even when defined in THUMB mode Jan 26, 2025
@Zeturic
Copy link
Contributor Author

Zeturic commented Jan 27, 2025

I made the requested changes, though I did go ahead and implement the .definedatalabel. Again, I can delete it if you prefer.

As far as the README, I'm not actually certain if .importobj is the only context in which ARMIPS cares about which mode a symbol was .definelabeled in, so it might need to be rewritten (or simply rewritten for clarity).

@Kingcom
Copy link
Owner

Kingcom commented Feb 2, 2025

Sorry for the delay. Looks good, just a small suggestion for the wording in the readme.

Readme.md Outdated
Comment on lines 912 to 914
For ARM architectures, the way `.definelabel` interacts with `.importobj` is subtly different depending on which mode - ARM or THUMB - was active at the time of the `.definelabel`. Because calling a THUMB function involves setting the least significant bit, within the code imported from the ELF object file, any extern labels referenced will have their LSB set if that label was `.definelabel`ed under THUMB mode (and will be left as-is if it was `.definelabel`ed under ARM mode). This behavior holds even if the label isn't even to a function, and thus can cause unaligned reads when attempting to access data at a specific address.

These two variants of `.definelabel` are for defining labels in a way that isn't dependent on the current ARM/THUMB mode. `.definearmlabel` always works the way that `.definelabel` works under ARM mode, and `.definethumblabel` always works the way `.definelabel` works under THUMB mode.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small wording suggestion:

Identical to `.definelabel`, but explicitly createy an ARM or THUMB label regardless of the current mode.

Only relevant when linking external code through `.importobj`. ARM uses the least significant bit in addresses to signify whether the code at the target address is using ARM or THUMB mode when setting the program counter through instructions such as `jr` or `blx`.  armips automatically remembers the current mode when labels are defined, but this mode may be incorrect when using `.definelabel`. Note that using `.definethumblabel` for data may result in incorrect addresses.

@Kingcom Kingcom merged commit a0d5e2d into Kingcom:master Feb 2, 2025
5 checks passed
@Kingcom
Copy link
Owner

Kingcom commented Feb 2, 2025

Thanks!

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants