Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

[mono] Re-enable HAS_CUSTOM_BLOCKS for non-amd64 Mono #107358

Conversation

matouskozak
Copy link
Member

We've seen a bunch of regressions on WASM, AOT-arm64 and Interp-arm64 microbenchmark after #106764. Rather than reverting the PR and loosing the gains on x64 MonoJIT, we can try limiting the change to x64.

Regressions:

Copy link
Contributor

Tagging subscribers to this area: @steveisok, @lambdageek
See info in area-owners.md if you want to be subscribed.

@EgorBo

This comment was marked as resolved.

@EgorBo
Copy link
Member

EgorBo commented Sep 4, 2024

oops, class must be public

@EgorBot -mono -arm64

using BenchmarkDotNet.Attributes;

public class Bencha
{
    byte[] data = new byte[512];

    [Benchmark]
    public void Clear() => data.AsSpan().Clear();
}

@matouskozak matouskozak added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Sep 4, 2024
@EgorBo
Copy link
Member

EgorBo commented Sep 4, 2024

Seems like it regresses Jit-arm64 EgorBot/runtime-utils#65 (comment)

@EgorBo
Copy link
Member

EgorBo commented Sep 4, 2024

Let me try the interp:

@EgorBot -mono -arm64 -x64 --envvars MONO_ENV_OPTIONS:--interpreter

using BenchmarkDotNet.Attributes;

public class Bencha
{
    byte[] data = new byte[512];

    [Benchmark]
    public void Clear() => data.AsSpan().Clear();
}

@matouskozak
Copy link
Member Author

Seems like it regresses Jit-arm64 EgorBot/runtime-utils#65 (comment)

Indeed, Thank you for running the measurements. I'm trying to get some local measurements as well and I'll investigate more. I added NO-MERGE label until we figure if this fix works or not.

@EgorBo
Copy link
Member

EgorBo commented Sep 4, 2024

I suspect that indeed can help interpeter (because 1 block copy is faster than 8 scalars) and maybe LLVM is not able to merge 8 scalars into SIMD (SLP) due to some lack of aliasing/alignment info, at least on x64

@matouskozak
Copy link
Member Author

matouskozak commented Sep 4, 2024

I've checked locally and this fixes interp arm64 regression on new ReadOnlySpan<Int32>(_array).CopyTo(new Span<Int32>(_destination));

However, I don't like that it regressed MonoJIT-arm64, e.g.:

Seems like it regresses Jit-arm64 EgorBot/runtime-utils#65 (comment)

I will have to investigate more where the block optimizations are made.

@xtqqczze
Copy link
Contributor

Superseded by #107558.

@matouskozak
Copy link
Member Author

Superseded by #107558.

Yes, sorry I forgot to close this PR.

We decided to do full revert based on the investigation in #107308 (comment). We will address the regression in #106822.

@matouskozak matouskozak deleted the renable-HAS_CUSTOM_BLOCKS-for-some-mono branch October 3, 2024 13:13
@github-actions github-actions bot locked and limited conversation to collaborators Nov 3, 2024
# for free to subscribe to this conversation on GitHub. Already have an account? #.
Labels
arch-arm64 arch-wasm WebAssembly architecture area-Codegen-meta-mono NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants