
Adds ccmp logic into emitter backend. #112153

Merged
merged 3 commits into from
Feb 15, 2025


anthonycanino
Contributor

Overview


This PR adds the new APX ccmp instruction to the X86 backend.

Design

For reference, ccmp has its own extended EVEX encoding:

[image: extended EVEX encoding for ccmp]

In this encoding, SC0 - SC3 encode the condition under which ccmp performs its compare (please see SDM Vol 1, Appendix B). If the current status flags do not satisfy the condition encoded by SC0 - SC3, no compare is performed, and the OF, SF, ZF, and CF flags are set to the values of the default flag value (DFV) fields of, sf, zf, and cf.
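The behaviour described above can be sketched as a small model. This is an illustration only, not code from this PR: `Flags`, `CompareFlags`, and `ConditionalCompare` are hypothetical names, only the four flags named above are modeled, and the source condition is reduced to a boolean that the caller has already evaluated against the current flags.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the four status flags discussed above.
struct Flags
{
    bool of;
    bool sf;
    bool zf;
    bool cf;
};

// Flags that a plain 32-bit "cmp a, b" (computing a - b) would produce.
inline Flags CompareFlags(uint32_t a, uint32_t b)
{
    const uint32_t r = a - b;
    Flags f;
    f.zf = (r == 0);
    f.sf = (r >> 31) != 0;
    f.cf = a < b;                                 // unsigned borrow
    f.of = (((a ^ b) & (a ^ r)) >> 31) != 0;      // signed overflow of a - b
    return f;
}

// If the source condition holds, behave like cmp; otherwise load the
// default flag values (DFV) and skip the compare entirely.
inline Flags ConditionalCompare(bool conditionHolds, uint32_t a, uint32_t b, Flags dfv)
{
    return conditionHolds ? CompareFlags(a, b) : dfv;
}
```

With a DFV that sets only cf, a failed condition yields cf = 1 and zf = 0 regardless of the operands, while a satisfied condition on equal operands yields zf = 1.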

Testing

Note: The testing plan for APX work was discussed in #106557; please refer to that PR for details. Only results and comments are posted in this PR.

Results

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 4, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Feb 4, 2025
@anthonycanino
Contributor Author

1. Emitter unit tests

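The left image is the output from JitDisasm; the right image is the output from JitLateDisasm.

[images: emitter unit test disassembly, JitDisasm (left) and JitLateDisasm (right)]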

2. Intel SDE testing

Test run with SDE:

base

3. SuperPMI results

Diffs are based on 2,623,457 contexts (1,043,127 MinOpts, 1,580,330 FullOpts).

MISSED contexts: 2,983 (0.11%)

Base JIT options: JitBypassApxCheck=1

Diff JIT options: JitBypassApxCheck=1

No diffs found.

Details

Context information

| Collection | Diffed contexts | MinOpts | FullOpts | Missed, base | Missed, diff |
| --- | --- | --- | --- | --- | --- |
| aspnet.run.windows.x64.checked.mch | 126,540 | 63,098 | 63,442 | 2,665 (2.06%) | 2,665 (2.06%) |
| benchmarks.run.windows.x64.checked.mch | 28,757 | 4 | 28,753 | 0 (0.00%) | 0 (0.00%) |
| benchmarks.run_pgo.windows.x64.checked.mch | 105,618 | 52,679 | 52,939 | 0 (0.00%) | 0 (0.00%) |
| benchmarks.run_tiered.windows.x64.checked.mch | 55,912 | 38,403 | 17,509 | 0 (0.00%) | 0 (0.00%) |
| coreclr_tests.run.windows.x64.checked.mch | 582,221 | 349,625 | 232,596 | 0 (0.00%) | 0 (0.00%) |
| libraries.crossgen2.windows.x64.checked.mch | 280,377 | 16 | 280,361 | 0 (0.00%) | 0 (0.00%) |
| libraries.pmi.windows.x64.checked.mch | 295,086 | 6 | 295,080 | 0 (0.00%) | 0 (0.00%) |
| libraries_tests.run.windows.x64.Release.mch | 751,895 | 517,237 | 234,658 | 0 (0.00%) | 0 (0.00%) |
| libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch | 342,818 | 22,045 | 320,773 | 0 (0.00%) | 0 (0.00%) |
| realworld.run.windows.x64.checked.mch | 24,824 | 2 | 24,822 | 0 (0.00%) | 0 (0.00%) |
| smoke_tests.nativeaot.windows.x64.checked.mch | 29,409 | 12 | 29,397 | 318 (1.07%) | 318 (1.07%) |
| | 2,623,457 | 1,043,127 | 1,580,330 | 2,983 (0.11%) | 2,983 (0.11%) |
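As a reading aid for the missed-context percentages: the figures in the table appear consistent with dividing the missed count by the sum of diffed and missed contexts. That formula is an inference from the numbers, not something stated in this PR, and `MissedPercent` below is a hypothetical helper rather than part of SuperPMI.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: missed contexts as a percentage of all contexts
// in a collection, where "all" is assumed to be diffed + missed.
inline double MissedPercent(uint64_t missed, uint64_t diffed)
{
    const uint64_t total = diffed + missed;
    if (total == 0)
    {
        return 0.0;
    }
    return 100.0 * static_cast<double>(missed) / static_cast<double>(total);
}
```

For the aspnet collection this gives 2,665 / (126,540 + 2,665), about 2.06%, which matches the reported value; dividing by the diffed count alone would give about 2.11%.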

Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@anthonycanino
Contributor Author

I believe the failures are related to #112163.

@anthonycanino
Contributor Author

@dotnet/jit-contrib could we get a review on this to see if any changes are needed?

@JulieLeeMSFT
Member

CC @amanasifkhalid and @EgorBo for code review.

@EgorBo
Member

EgorBo commented Feb 12, 2025

Do you have any idea why it shows up to a 0.36% TP regression? Presumably, it's supposed to be a zero-diff change?

@amanasifkhalid
Member

amanasifkhalid commented Feb 13, 2025

> Do you have any idea why it shows up to a 0.36% TP regression? Presumably, it's supposed to be a zero-diff change?

The insOpts enum now has entries that are larger than a byte -- perhaps the native compilers were previously compacting the enum (though that doesn't sound like a valid transformation)? Nothing else really sticks out to me.
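To make the size point concrete, here is a minimal sketch with made-up enums (these are not the real insOpts definitions, and the values are invented). With a fixed underlying type, an enumerator above 0xFF cannot live in a one-byte enum, so adding one forces a wider type. Without a fixed underlying type the size is implementation-defined, which is why the effect on a given native compiler is hard to predict from the source alone.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical option enums, for illustration only.
enum NarrowOptsSketch : uint8_t
{
    NARROW_NONE = 0,
    NARROW_LAST = 0x80, // still fits in one byte
};

enum WideOptsSketch : uint16_t
{
    WIDE_NONE      = 0,
    WIDE_BYTE_LAST = 0x80,
    WIDE_EXTRA     = 0x100, // needs more than one byte
};

static_assert(sizeof(NarrowOptsSketch) == 1, "one-byte enum");
static_assert(sizeof(WideOptsSketch) == 2, "two-byte enum");

// True when a value can be stored in a single byte without truncation.
inline bool FitsInByte(unsigned value)
{
    return value <= 0xFF;
}
```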

@anthonycanino
Contributor Author

anthonycanino commented Feb 13, 2025

There are a few places where SetDFVIfNeeded is called, particularly in emitInsRR, that now have an additional if-statement check. That's probably what it is.

We could adjust it so it only triggers for TARGET_AMD64 and not TARGET_X86. It currently follows the same precedent as SetNFIfNeeded etc.

Member

@BruceForstall BruceForstall left a comment


It would be nice to reduce the TP impact if possible. If this doesn't apply to x86, definitely make SetDFVIfNeeded() ifdef'ed for AMD64 (maybe create an empty function for x86 so call sites don't change). The full TP impact is only on MinOpts, fwiw.
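The suggestion above could look roughly like the following. This is a sketch under stated assumptions, not the change that was merged: `InstrDescSketch`, `SetDFVIfNeededSketch`, and `kSketchDfvShift` are hypothetical, the real SetDFVIfNeeded signature is not shown in this conversation, and the bit position is a placeholder.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the emitter's types.
using code_t = uint64_t;

struct InstrDescSketch
{
    uint8_t dfv;    // 4-bit default flag value
    bool    hasDfv; // whether this instruction carries a DFV
};

#if defined(TARGET_AMD64)
// Placeholder bit position; the real encoding position is not taken from the PR.
constexpr unsigned kSketchDfvShift = 11;

// AMD64: patch the DFV bits into the encoding when the instruction needs them.
inline code_t SetDFVIfNeededSketch(const InstrDescSketch& id, code_t code)
{
    if (!id.hasDfv)
    {
        return code;
    }
    return code | (static_cast<code_t>(id.dfv & 0xF) << kSketchDfvShift);
}
#else
// x86: empty stub so call sites stay unchanged and the check compiles away.
inline code_t SetDFVIfNeededSketch(const InstrDescSketch&, code_t code)
{
    return code;
}
#endif
```

Keeping the same function name on both targets means callers such as emitInsRR would not need their own ifdefs.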

_idCustom2 = ((value >> 1) & 1);
_idCustom3 = ((value >> 2) & 1);
_idCustom4 = ((value >> 3) & 1);
}
Member


Consider adding:

assert(value == idGetEvexDFV());
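A round-trip check of this kind might look like the sketch below. Only the `_idCustom2` to `_idCustom4` assignments appear in the snippet under review; the `_idCustom1` line, the getter body, and the struct name `DfvFieldsSketch` are assumptions made to keep the example self-contained.

```cpp
#include <cassert>

// Hypothetical container for the four one-bit fields used to store the DFV.
struct DfvFieldsSketch
{
    unsigned _idCustom1 : 1;
    unsigned _idCustom2 : 1;
    unsigned _idCustom3 : 1;
    unsigned _idCustom4 : 1;

    // Assumed getter: reassembles the 4-bit value from the individual bits.
    unsigned idGetEvexDFV() const
    {
        return (static_cast<unsigned>(_idCustom4) << 3) |
               (static_cast<unsigned>(_idCustom3) << 2) |
               (static_cast<unsigned>(_idCustom2) << 1) |
               static_cast<unsigned>(_idCustom1);
    }

    void idSetEvexDFV(unsigned value)
    {
        _idCustom1 = ((value >> 0) & 1); // assumed; not shown in the reviewed snippet
        _idCustom2 = ((value >> 1) & 1);
        _idCustom3 = ((value >> 2) & 1);
        _idCustom4 = ((value >> 3) & 1);

        // Round-trip check: also fires if value has bits set above bit 3.
        assert(value == idGetEvexDFV());
    }
};
```

Besides catching a mismatch between the setter and getter, the assert rejects inputs wider than four bits in debug builds, since the extra bits would be silently dropped.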

emitter* theEmitter = GetEmitter();
genDefineTempLabel(genCreateTempLabel());

// #ifdef COMMENTOUT
Member


remove

CORINFO_FIELD_HANDLE hnd = theEmitter->emitFltOrDblConst(1.0f, EA_4BYTE);
theEmitter->emitIns_R_C(INS_ccmpe, EA_4BYTE, REG_RAX, hnd, 0, INS_OPTS_EVEX_dfv_cf);
theEmitter->emitIns_R_C(INS_ccmpe, EA_4BYTE, REG_RAX, hnd, 4, INS_OPTS_EVEX_dfv_cf);
// #endif
Member


remove

@anthonycanino
Contributor Author

@BruceForstall BruceForstall merged commit 01e4d44 into dotnet:main Feb 15, 2025
112 checks passed
@BruceForstall BruceForstall added the apx Related to the Intel Advanced Performance Extensions (APX) label Feb 15, 2025
grendello added a commit to grendello/runtime that referenced this pull request Feb 18, 2025
* main: (71 commits)
  Update dependencies from https://github.com/dotnet/source-build-reference-packages build 20250212.3 (dotnet#112626)
  JIT: Unify struct arg morphing (dotnet#112612)
  Enable `SA1015`: Closing generic bracket should not be followed by a space (dotnet#112597)
  Clean up normalizeLocale for mono browser target (dotnet#112575)
  SPMI: Ensure proper zero extension for isObjectImmutable and friends (dotnet#112617)
  Quote --version-scripts path (dotnet#112603)
  Remove incompatible API from PKCS netstandard2.0 lib
  [main] Update dependencies from dotnet/emsdk (dotnet#112393)
  Avoid `Unsafe.As` in `RangeCharSearchValues` (dotnet#112606)
  Fixed the issue of incorrect return value of PalVirtualAlloc (dotnet#112579)
  Fix size used for vectorization check in BitArray (dotnet#111558) (dotnet#111564)
  Fix build of windows arm64 crossdac (dotnet#112553)
  Simplify `ShuffleTakeIterator.GetCount` (dotnet#112593)
  Fix VS div-by-0 in DacEnumerableHashTable code (dotnet#112542)
  R2RDump: normalize GC info totalInterruptibleLength (dotnet#112003)
  Fix alignment padding and add test for saving managed resources (dotnet#110915)
  Adds `ccmp` logic into emitter backend. (dotnet#112153)
  Disable AVX10.2 by default (dotnet#112572)
  Outbox AesGcm in to Microsoft.Bcl.Cryptography
  Make test `IUnknown` conforming (dotnet#112566)
  ...
Labels
  apx: Related to the Intel Advanced Performance Extensions (APX)
  area-CodeGen-coreclr: CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
  community-contribution: Indicates that the PR has been added by a community member
5 participants