[X86] llvm-mca reports wrong data for vpscatterqd, vscatterqps with -mcpu=skylake-avx512 #105675

Closed
mikolaj-pirog opened this issue Aug 22, 2024 · 13 comments

Comments

@mikolaj-pirog
Contributor

llvm-mca with -mcpu=skx (or skylake-avx512 or common-avx512) reports wrong data (latency == uops == throughput == 1) for these asm instructions:

vpscatterqd ymmword ptr [rsp + zmm0 + 16] {k1}, ymm1
vscatterqps ymmword ptr [rsp + zmm0 + 16] {k1}, ymm1

For their xmmword and qword variants, the reported numbers are fine. For other, newer -mcpu options the numbers are also fine. The llvm-mca version is 20.0.0git.
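
A minimal sketch of how the report might be reproduced. The exact command line is not given above, so the invocation is an assumption: the instructions are written to a scratch file (the `repro` variable is a hypothetical name), a `.intel_syntax noprefix` directive is added so the Intel-style operands parse, and `-instruction-tables` is used to print the per-instruction uops, latency and reciprocal throughput from the scheduling model.

```shell
#!/bin/sh
# Write the two instructions from the report to a scratch file.
# Assumption: Intel syntax is selected with an assembler directive.
repro=$(mktemp)
cat > "$repro" <<'EOF'
.intel_syntax noprefix
vpscatterqd ymmword ptr [rsp + zmm0 + 16] {k1}, ymm1
vscatterqps ymmword ptr [rsp + zmm0 + 16] {k1}, ymm1
EOF

# Run llvm-mca only if it is installed; otherwise just keep the file.
if command -v llvm-mca >/dev/null 2>&1; then
  llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=skylake-avx512 \
    -instruction-tables "$repro" || echo "llvm-mca failed" >&2
else
  echo "llvm-mca not found on PATH; skipping" >&2
fi
```

Rerunning the same file with a newer `-mcpu` value should show whether the numbers differ only for the skylake-avx512 model, as described above.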

@mikolaj-pirog mikolaj-pirog changed the title [X86] llvm-mca reports wrong data for vpscatterqd, vscatterqq with -mcpu=skylake-avx512 [X86] llvm-mca reports wrong data for vpscatterqd, vscatterqps with -mcpu=skylake-avx512 Aug 22, 2024
@llvmbot
Member

llvmbot commented Aug 22, 2024

@llvm/issue-subscribers-tools-llvm-mca

Author: Mikołaj Piróg (mikolaj-pirog)

@llvmbot
Member

llvmbot commented Aug 22, 2024

@llvm/issue-subscribers-backend-x86

Author: Mikołaj Piróg (mikolaj-pirog)

@boomanaiden154 boomanaiden154 added the backend:X86 Scheduler Models label Aug 22, 2024
@boomanaiden154
Contributor

Seems like the information is just missing from the scheduling model, which should be a relatively simple fix. uops.info has (seemingly) correct port data that should not be too difficult to move into the LLVM skylake-avx512 scheduling model.

@RKSimon RKSimon self-assigned this Aug 22, 2024
@RKSimon
Collaborator

RKSimon commented Aug 22, 2024

We don't have much llvm-mca test coverage for scatter; I'll get that added first.

RKSimon added a commit that referenced this issue Aug 22, 2024
Missed IceLakeServer when I updated the other CPUs in 6ec4c9c
@goldsteinn
Contributor

goldsteinn commented Aug 22, 2024

I think this is causing the https://lab.llvm.org/buildbot/#/builders/155/builds/1675 failure, @RKSimon

@cjappl
Contributor

cjappl commented Aug 22, 2024

Put a revert up for review to try to unblock some of the build machines:

#105716

@cjappl cjappl reopened this Aug 22, 2024
@cjappl
Contributor

cjappl commented Aug 22, 2024

Re-opened because I submitted that reversion, see: e738c81

@RKSimon
Collaborator

RKSimon commented Aug 22, 2024

Sorry for the breakage, I'll get it fixed properly when I'm back in the office.

kutemeikito added a commit to kutemeikito/llvm-project that referenced this issue Aug 23, 2024
* 'main' of https://github.com/llvm/llvm-project: (1385 commits)
  [C++23] Fix infinite recursion (Clang 19.x regression) (#104829)
  AMDGPU/NewPM: Start implementing addCodeGenPrepare (#102816)
  [AMDGPU][Docs] DWARF aspace-aware base types
  Pre-commit AMDGPU tests for masked load/store/scatter/gather (#104645)
  [ADT] Add a missing call to a unique_function destructor after move (#98747)
  [ADT] Minor code cleanup in STLExtras.h (#104808)
  [libc++abi] Remove unnecessary dependency on std::unique_ptr (#73277)
  [clang] Increase the default expression nesting limit (#104717)
  [mlir][spirv] Fix incorrect metadata in SPIR-V Header (#104242)
  [ADT] Fix alignment check in unique_function constructor (#99403)
  LSV: fix style after cursory reading (NFC) (#104793)
  Revert "[BPF] introduce `__attribute__((bpf_fastcall))` (#101228)"
  [NFC][asan] Don't `cd` after `split-file` (#104727)
  [NFC][Instrumentation] Use `Twine` in `createPrivateGlobalForString` (#104726)
  [mlir][spirv] Add `GroupNonUniformBallotFindLSB` and `GroupNonUniformBallotFindMSB` ops (#104791)
  [GlobalISel] Bail out early for big-endian (#103310)
  [compiler-rt][nsan] Add more tests for shadow memory (#100906)
  [Flang] Fix test case for AIX(big-endian) system for issuing an extra message. (#104792)
  [asan] Change Apple back to fixed allocator base address (#104818)
  [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (#102969)
  [RISCV] Improve BCLRITwoBitsMaskHigh SDNodeXForm. NFC
  [clang][dataflow] Collect local variables referenced within a functio… (#104459)
  [AMDGPU][GlobalISel] Save a copy in one case of addrspacecast (#104789)
  [AMDGPU] Simplify, fix and improve known bits for mbcnt (#104768)
  [TableGen] Detect invalid -D arguments and fail (#102813)
  [DirectX] Disentangle DXIL.td's op types from LLVMType. NFC
  [Clang] Check constraints for an explicit instantiation of a member function (#104438)
  [DirectX] Differentiate between 0/1 overloads in the OpBuilder. NFC
  [docs] Add note about "Re-request review" (#104735)
  [lld][ELF] Combine uniqued small data sections (#104485)
  [BPF] introduce `__attribute__((bpf_fastcall))` (#101228)
  [SmallPtrSet] Optimize find/erase
  [PowerPC] Fix codegen for transparent_union function params (#101738)
  [llvm-mca] Add bottle-neck analysis to JSON output. (#90056)
  [lldb][Python] Silence GCC warning for modules error workaround
  [gn build] Port a56663591573
  [gn build] Port a449b857241d
  [clang][bytecode] Discard NullToPointer cast SubExpr (#104782)
  [lldb] PopulatePrpsInfoTest can fail due to hardcoded priority value (#104617)
  [mlir][spirv] Add support for math.log2 and math.log10 to GLSL/OpenCL SPIRV Backends (#104608)
  [lldb][test] Fix GCC warnings in TestGetControlFlowKindX86.cpp
  [TableGen] Resolve References at top level (#104578)
  [LLVM] [X86] Fix integer overflows in frame layout for huge frames (#101840)
  [lldb][ASTUtils] Remove unused SemaSourceWithPriorities::addSource API
  [lldb][test] Fix cast dropping const warning in TestBreakpointSetCallback.cpp
  [SimplifyCFG] Add tests for hoisting of commutative instructions (NFC)
  [AMDGPU][R600] Move R600CodeGenPassBuilder into R600TargetMachine(NFC). (#103721)
  Revert "[clang][ExtractAPI] Stop dropping fields of nested anonymous record types when they aren't attached to variable declaration (#104600)"
  MathExtras: template'ize alignToPowerOf2 (#97814)
  [AMDGPU] Move AMDGPUCodeGenPassBuilder into AMDGPUTargetMachine(NFC) (#103720)
  [clang][ExtractAPI] Stop dropping fields of nested anonymous record types when they aren't attached to variable declaration (#104600)
  [Clang][NFC] Fix potential null dereference in encodeTypeForFunctionPointerAuth (#104737)
  [DebugInfo] Make tests SimplifyCFG-independent (NFC)
  [mlir][ArmSME] Remove XFAILs (#104758)
  [RISCV] Add vector and vector crypto to SiFiveP400 scheduler model (#102155)
  [clang][OpenMP] Diagnose badly-formed collapsed imperfect loop nests (#60678) (#101305)
  Require !windows instead of XFAIL'ing ubsan/TestCases/Integer/bit-int.c
  [clang][bytecode] Fix member pointers to IndirectFieldDecls (#104756)
  [AArch64] Add fneg(fmul) and fmul(fneg) tests. NFC
  [clang][bytecode] Use first FieldDecl instead of asserting (#104760)
  [DataLayout] Refactor parsing of i/f/v/a specifications (#104699)
  [X86] LowerABD - simplify i32/i64 to use sub+sub+cmov instead of repeating nodes via abs (#102174)
  [docs] Update a filename, fix indentation (#103018)
  [CostModel][X86] Add cost tests for scmp/ucmp intrinsics
  [NFC][SLP] Remove useless code of the schedule (#104697)
  [VPlan] Rename getBestPlanFor -> getPlanFor (NFC).
  [InstCombine] Fold `(x < y) ? -1 : zext(x != y)` into `u/scmp(x,y)` (#101049)
  [VPlan] Emit note when UserVF > MaxUserVF (NFCI).
  [LLVM][NewPM] Add C API for running the pipeline on a single function. (#103773)
  [mlir][vector] Populate sink patterns in apply_patterns.vector.reduction_to_contract (#104754)
  [lld][MachO] Fix a suspicious assert in SyntheticSections.cpp
  [PowerPC] Support -mno-red-zone option (#94581)
  [PAC][ELF][AArch64] Encode several ptrauth features in PAuth core info (#102508)
  [VPlan] Rename getBestVF -> computeBestVF (NFC).
  [MLIR][LLVM] Improve the noalias propagation during inlining (#104750)
  [LoongArch] Fix the assertion for atomic store with 'ptr' type
  [AArch64][SME] Return false from produceCompactUnwindFrame if VG save required. (#104588)
  [X86] Cleanup lowerShuffleWithUNPCK/PACK signatures to match (most) other lowerShuffle* methods. NFC.
  [X86] VPERM2*128 instructions aren't microcoded on znver1
  [X86] VPERM2*128 instructions aren't microcoded on znver2
  [VPlan] Move some LoopVectorizationPlanner helpers to VPlan.cpp (NFC).
  [mlir][docs] Update Bytecode documentation (#99854)
  [SimplifyCFG] Don't block sinking for allocas if no phi created (#104579)
  [LoongArch] Merge base and offset for LSX/LASX memory accesses (#104452)
  [RISCV] Make extension names lower case in RISCVISAInfo::checkDependency() error messages.
  [RISCV] Add helper functions to exploit similarity of some RISCVISAInfo::checkDependency() error strings. NFC
  [RISCV] Merge some ISA error reporting together and make some errors more precise.
  [RISCV] Simplify reserve fixed regs (#104736)
  [RISCV] Add more tests for RISCVISAInfo::checkDependency(). NFC
  [Sparc] Add errata workaround pass for GR712RC and UT700 (#103843)
  [TableGen] Print Error and not crash on dumping non-string values (#104568)
  [RISCV][MC] Support experimental extensions Zvbc32e and Zvkgs (#103709)
  Revert "[CodeGenPrepare] Folding `urem` with loop invariant value"
  [SelectionDAG][X86] Preserve unpredictable metadata for conditional branches in SelectionDAG, as well as JCCs generated by X86 backend. (#102101)
  [MLIR][Python] enhance python api for tensor.empty (#103087)
  [AMDGPU][NFC] Fix preload-kernarg.ll test after attributor move (#98840)
  [CodeGenPrepare] Folding `urem` with loop invariant value
  [CodeGenPrepare][X86] Add tests for folding `urem` with loop invariant value; NFC
  [MC] Remove ELFRelocationEntry::OriginalAddend
  [TLI] Add support for inferring attr `cold`/`noreturn` on `std::terminate` and `__cxa_throw`
  [DAG][PatternMatch] Add support for matchers with flags; NFC
  Update Clang version from 19 to 20 in scan-build.1.
  [clang-format] Change GNU style language standard to LS_Latest (#104669)
  [MIPS] Remove expensive LLVM_DEBUG relocation dump
  [MC] Add test that requires multiple relaxation steps
  [libc][gpu] Add Atan2 Benchmarks (#104708)
  [libc] Add single threaded kernel attributes to AMDGPU startup utility (#104651)
  [HIP] search fatbin symbols for libs passed by -l (#104638)
  [gn build] Port 0d150db214e2
  [llvm][clang] Move RewriterBuffer to ADT. (#99770)
  [Clang] Do not allow `[[clang::lifetimebound]]` on explicit object member functions (#96113)
  [clang][OpenMP] Change /* ParamName */ to /*ParamName=*/, NFC
  [clang-tidy] Support member functions with modernize-use-std-print/format (#104675)
  [clang] fix divide by zero in ComplexExprEvaluator (#104666)
  [clang][OpenMP] Avoid multiple calls to getCurrentDirective in DSAChecker, NFC
  [clang][bytecode] Only booleans can be inverted
  [Flang]: Use actual endianness for Integer<80> (#103928)
  [libc++][docs] Fixing hyperlink for mathematical special function documentation (#104444)
  [InstSimplify] Simplify `uadd.sat(X, Y) u>= X + Y` and `usub.sat(X, Y) u<= X, Y` (#104698)
  [LV] Don't cost branches and conditions to empty blocks.
  [clang][test] Remove bytecode interpreter RUN line from test
  [Clang] warn on discarded [[nodiscard]] function results after casting in C (#104677)
  [GlobalISel] Add and use an Opcode variable and update match-table-cxx.td checks. NFC
  [Clang] `constexpr` builtin floating point classification / comparison functions (#94118)
  [clang][bytecode] IntPointer::atOffset() should append (#104686)
  [clang][bytecode][NFC] Improve Pointer::print()
  [RISCV] Remove unused tablegen classes from unratified Zbp instructions. NFC
  [PowerPC] Use MathExtras helpers to simplify code. NFC (#104691)
  [clang-tidy] Correct typo in ReleaseNotes.rst (#104674)
  [APInt] Replace enum with static constexpr member variables. NFC
  [MLIR][OpenMP] Fix MLIR->LLVM value matching in privatization logic (#103718)
  [VE] Use SelectionDAG::getSignedConstant/getAllOnesConstant.
  [gn build] Port 27a62ec72aed
  [LSR] Split the -lsr-term-fold transformation into its own pass (#104234)
  [AArch64] Use SelectionDAG::getSignedConstant/getAllOnesConstant.
  [ARM] Use SelectionDAG::getSignedConstant.
  [SelectionDAG] Use getAllOnesConstant.
  [LLD] [MinGW] Recognize the -rpath option (#102886)
  [clang][bytecode] Fix shifting negative values (#104663)
  [flang] Handle Hollerith in data statement initialization in big endian (#103451)
  [clang][bytecode] Classify 1-bit unsigned integers as bool (#104662)
  [RISCV][MC] Make error message of CSR with wrong extension more detailed (#104424)
  [X86] Don't save/restore fp around longjmp instructions (#102556)
  AMDGPU: Add tonearest and towardzero roundings for intrinsic llvm.fptrunc.round (#104486)
  [libc] Fix type signature for strlcpy and strlcat (#104643)
  [AArch64] Add a check for invalid default features (#104435)
  [clang][NFC] Clean up `Sema` headers
  [NFC] Cleanup in ADT and Analysis headers. (#104484)
  [InstCombine] Avoid infinite loop when negating phi nodes (#104581)
  Add non-temporal support for LLVM masked loads (#104598)
  [AMDGPU] Disable inline constants for pseudo scalar transcendentals (#104395)
  [mlir][Transforms] Dialect conversion: Fix bug in `computeNecessaryMaterializations` (#104630)
  [RISCV] Use getAllOnesConstant/getSignedConstant.
  [SelectionDAG] Use getSignedConstant/getAllOnesConstant.
  [NFC][asan] Make 'Module &M' class member
  [AMDGPU][NFC] Remove duplicate code by using getAddressableLocalMemorySize (#104604)
  [CodeGen][asan] Use `%t` instead of `cd` in test
  Revert "[asan] Catch `initialization-order-fiasco` in modules without globals" (#104665)
  [SelectionDAG][X86] Use getAllOnesConstant. NFC (#104640)
  [LLVM][NVPTX] Add support for brkpt instruction (#104470)
  [asan] Catch `initialization-order-fiasco` in modules without globals (#104621)
  [RISCV] Remove feature implication from Zvknhb.
  [clang-format] Adjust requires clause wrapping (#101550) (#102078)
  [MC,AArch64] Remove unneeded STT_NOTYPE/STB_LOCAL code for mapping symbols and improve tests
  [NFC][DXIL] move replace/erase in DXIL intrinsic expansion to caller (#104626)
  [flang] Allow flexible name in llvm.ident (NFC) (#104543)
  [SandboxIR] Implement SwitchInst (#104641)
  [Clang] Fix sema checks thinking kernels aren't kernels (#104460)
  [asan] Pre-commit test with global constructor without any global (#104620)
  [clang-doc] add support for enums comments in html generation (#101282)
  Revert "[AArch64] Fold more load.x into load.i with large offset"
  [NFC][cxxabi] Apply `cp-to-llvm.sh` (#101970)
  [Clang] fix crash by avoiding invalidation of extern main declaration during strictness checks (#104594)
  [Mips] Fix fast isel for i16 bswap. (#103398)
  [libc] Add missing math definitions for round and scal for GPU (#104636)
  [ScalarizeMaskedMemIntr] Optimize splat non-constant masks (#104537)
  [SandboxIR] Implement ConstantInt (#104639)
  [SLP]Fix PR104637: do not create new nodes for fully overlapped non-schedulable nodes
  [DataLayout] Refactor parsing of "p" specification (#104583)
  [flang][cuda] Remove run line
  Reland "[flang][cuda][driver] Make sure flang does not switch to cc1 (#104613)"
  Revert "Reland "[flang][cuda][driver] Make sure flang does not switch to cc1 (#104613)""
  [SandboxIR][Tracker][NFC] GenericSetterWithIdx (#104615)
  Reland "[flang][cuda][driver] Make sure flang does not switch to cc1 (#104613)"
  [MC] Drop whitespace padding in AMDGPU combined asm/disasm tests. (#104433)
  [gn build] Port 7ff377ba60bf
  [InstrProf] Support conditional counter updates (#102542)
  [Analysis] Fix null ptr dereference when using WriteGraph without branch probability info (#104102)
  [DirectX] Revert specialized createOp methods part of #101250
  [VPlan] Compute cost for most opcodes in VPWidenRecipe (NFCI). (#98764)
  [PowerPC] Do not merge TLS constants within PPCMergeStringPool.cpp (#94059)
  Revert "[flang][cuda][driver] Make sure flang does not switch to cc1" (#104632)
  [AArch64][MachO] Encode @AUTH to ARM64_RELOC_AUTHENTICATED_POINTER.
  [flang][cuda][driver] Make sure flang does not switch to cc1 (#104613)
  AMDGPU: Rename type helper functions in atomic handling
  [libc] Fix generated header definitions in cmake (#104628)
  [libcxx][fix] Rename incorrect filename variable
  [SDAG] Read-only intrinsics must have WillReturn and !Throws attributes to be treated as loads (#99999)
  Re-Apply "[DXIL][Analysis] Implement enough of DXILResourceAnalysis for buffers" (#104517)
  [SelectionDAGISel] Use getSignedConstant for OPC_EmitInteger.
  [DirectX] Add missing Analysis usage to DXILResourceMDWrapper
  [AArch64] Remove apple-a7-sysreg. (#102709)
  Revert "[libc] Disable old headergen checks unless enabled" (#104627)
  [LLD, MachO] Default objc_relative_method_lists on MacOS10.16+/iOS14+ (#104519)
  [Clang][OMPX] Add the code generation for multi-dim `thread_limit` clause (#102717)
  [lldb][test] Mark gtest cases as XFAIL if the test suite is XFAIL (#102986)
  [APINotes] Support fields of C/C++ structs
  [Attributor] Enable `AAAddressSpace` in `OpenMPOpt` (#104363)
  [HLSL] Change default linkage of HLSL functions to internal (#95331)
  [bazel] Fix cyclic dependencies for macos (#104528)
  [libc] Disable old headergen checks unless enabled (#104522)
  [SandboxIR] Implement AtomicRMWInst (#104529)
  [RISCV] Move vmv.v.v peephole from SelectionDAG to RISCVVectorPeephole (#100367)
  [nfc] Improve testability of PGOInstrumentationGen (#104490)
  [test] Prevent generation of the bigendian code inside clang test CodeGen/bit-int-ubsan.c (#104607)
  [TableGen] Refactor Intrinsic handling in TableGen (#103980)
  [mlir][emitc] Add 'emitc.switch' op to the dialect (#102331)
  [SelectionDAG][X86] Add SelectionDAG::getSignedConstant and use it in a few places. (#104555)
  [mlir][AMDGPU] Implement AMDGPU DPP operation in MLIR. (#89233)
  [RISCV] Allow YAML file to control multilib selection (#98856)
  [mlir][vector] Group re-order patterns together (#102856)
  [lldb] Add Populate Methods for ELFLinuxPrPsInfo and ELFLinuxPrStatus (#104109)
  [HLSL] Flesh out basic type typedefs (#104479)
  [mlir][vector] Add more tests for ConvertVectorToLLVM (4/n) (#103391)
  [TableGen] Sign extend constants based on size for EmitIntegerMatcher. (#104550)
  [gn] Port AST/ByteCode #104552
  [DAGCombiner] Remove TRUNCATE_(S/U)SAT_(S/U) from an assert that isn't tested. NFC (#104466)
  [RISCV] Don't support TRUNCATE_SSAT_U. (#104468)
  [Hexagon] Use range-based for loops (NFC) (#104538)
  [CodeGen] Use range-based for loops (NFC) (#104536)
  [Bazel] Port AST/ByteCode #104552
  [mlir][linalg] Implement TilingInterface for winograd operators (#96184)
  [libc++][math] Fix acceptance of convertible types in `std::isnan()` and `std::isinf()` (#98952)
  [clang] Rename all AST/Interp stuff to AST/ByteCode (#104552)
  [mlir] [tosa] Bug fixes in shape inference pass (#104146)
  [libc++] Fix rejects-valid in std::span copy construction (#104500)
  [InstCombine] Handle commuted variant of sqrt transform
  [InstCombine] Thwart complexity-based canonicalization in sqrt test (NFC)
  [InstCombine] Preserve nsw in A + -B fold
  [InstCombine] Add nsw tests for A + -B fold (NFC)
  [include-cleaner] fix 32-bit buildbots after a426ffdee1ca7814f2684b6
  [PhaseOrdering] Regenerate test checks (NFC)
  [InstCombine] Regenerate test checks (NFC)
  [X86] Fold extract_subvector(int_to_fp(x)) vXi32/vXf32 cases to match existing fp_to_int folds
  [InstCombine] Regenerate test checks (NFC)
  [mlir][spirv] Update documentation. NFC (#104584)
  [GlobalIsel] Revisit ext of ext. (#102769)
  [libc++] Fix backslash as root dir breaks lexically_relative, lexically_proximate and hash_value on Windows (#99780)
  [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners
  [SLP][REVEC] Fix CreateInsertElement does not use the correct result if MinBWs applied. (#104558)
  Add FPMR register and update dependencies of FP8 instructions (#102910)
  [InstCombine] Fix incorrect zero ext in select of lshr/ashr fold
  [InstCombine] Add i128 test for select of lshr/ashr transform (NFC)
  [llvm-c] Add non-cstring versions of LLVMGetNamedFunction and LLVMGetNamedGlobal (#103396)
  [InstCombine] Fold an unsigned icmp of ucmp/scmp with a constant to an icmp of the original arguments (#104471)
  [clang][Interp] Fix classifying enum types (#104582)
  [clang] Add a new test for CWG2091 (#104573)
  [mlir][ArmSME][docs] Fix broken link (NFC)
  [compiler-rt] Stop using x86 builtin on AArch64 with GCC (#93890)
  [DataLayout] Refactor parsing of "ni" specification (#104546)
  [X86] SimplifyDemandedVectorEltsForTargetNode - reduce width of X86 conversions nodes when upper elements are not demanded. (#102882)
  [include-cleaner] Add handling for new/delete expressions (#104033)
  InferAddressSpaces: Convert test to generated checks
  [LAA] Use computeConstantDifference() (#103725)
  [SimplifyCFG] Add test for #104567 (NFC)
  [bazel] Port for 75cb9edf09fdc091e5bc0f3d46a96c2877735a39
  [AMDGPU][NFC] AMDGPUUsage.rst: document corefile format (#104419)
  [lldb][NFC] Moved FindSchemeByProtocol() from Acceptor to Socket (#104439)
  [X86] lowerShuffleAsDecomposedShuffleMerge - don't lower to unpack+permute if either source is zero.
  [X86] Add shuffle tests for #104482
  [clang][Interp][NFC] Remove Function::Loc
  [clang][NFC] Update `cxx_dr_status.html`
  [MLIR][GPU-LLVM] Add GPU to LLVM-SPV address space mapping (#102621)
  [DAG] SD Pattern Match: Operands patterns with VP Context (#103308)
  Revert "[clang][driver] Fix -print-target-triple OS version for apple targets" (#104563)
  [NFC][X86] Refactor: merge avx512_binop_all2 into avx512_binop_all (#104561)
  [RISCV] Merge bitrotate crash test into shuffle reverse tests. NFC
  [Passes] clang-format initialization files (NFC)
  [mlir][IR] Fix `checkFoldResult` error message (#104559)
  [RISCV] Merge shuffle reverse tests. NFC
  [RISCV] Use shufflevector in shuffle reverse tests. NFC
  [RISCV] Remove -riscv-v-vector-bits-max from reverse tests. NFC
  [flang][stack-arrays] Collect analysis results for OMP ws loops (#103590)
  [clang][Interp] Add scopes to conditional operator subexpressions (#104418)
  [RISCV] Simplify (srl (and X, Mask), Const) to TH_EXTU (#102802)
  [RISCV][NFC] Fix typo: "wererenamed" to "were renamed" (#104530)
  [RISCV] Lower fixed reverse vector_shuffles through vector_reverse (#104461)
  [asan] Fix build breakage from report_globals change
  [MLIR][test] Run SVE and SME Integration tests using qemu-aarch64 (#101568)
  [DAGCombiner] Don't let scalarizeBinOpOfSplats create illegal scalar MULHS/MULHU (#104518)
  [flang][cuda] Add version in libCufRuntime name (#104506)
  [mlir][tosa] Add missing check for new_shape of `tosa.reshape` (#104394)
  [Bitcode] Use range-based for loops (NFC) (#104534)
  [HLSL] update default validator version to 1.8. (#104040)
  [ScalarizeMaskedMemIntr] Pre-commit tests for splat optimizations (#104527)
  [Sparc] Remove dead code (NFC) (#104264)
  [Clang] [Sema] Error on reference types inside a union with msvc 1900+ (#102851)
  [Driver] Reject -Wa,-mrelax-relocations= for non-ELF
  [Analysis] Use a range-based for loop (NFC) (#104445)
  [llvm] Use llvm::any_of (NFC) (#104443)
  [PowerPC] Use range-based for loops (NFC) (#104410)
  [CodeGen] Use a range-based for loop (NFC) (#104408)
  [ORC] Gate testcase for 3e1d4ec671c on x86-64 and aarch64 target support.
  [builtins] Only try to use getauxval on Linux (#104047)
  [ORC] Add missing dependence on BinaryFormat library.
  [flang] Inline minval/maxval over elemental/designate (#103503)
  [Driver] Correctly handle -Wa,--crel -Wa,--no-crel
  [lldb] Correctly fix a usage of `PATH_MAX`, and fix unit tests (#104502)
  [gn build] Port 3e1d4ec671c5
  [asan] Remove debug tracing from `report_globals` (#104404)
  [workflows] Add a new workflow for checking commit access qualifications (#93301)
  [Driver] Improve error message for -Wa,-x=unknown
  [SandboxIR] Implement UnaryOperator (#104509)
  [ORC] loadRelocatableObject: universal binary support, clearer errors (#104406)
  [RISCV] Use significant bits helpers in narrowing of build vectors [nfc] (#104511)
  [LLDB] Reapply #100443 SBSaveCore Thread list (#104497)
  [Driver] Reject -Wa,-mrelax-relocations= for non-x86
  [docs] Stress out the branch naming scheme for Graphite. (#104499)
  [NFC][sanitizer] Use `UNLIKELY` in VReport/VPrintf (#104403)
  [asan] Reduce priority of "contiguous_container:" VPrintf (#104402)
  [libc] Make sure we have RISC-V f or d extension before using it (#104476)
  [Driver] Make CodeGenOptions name match MCTargetOptions names
  [Attributor][FIX] Ensure we do not use stale references (#104495)
  [libclang/python] Expose `clang_isBeforeInTranslationUnit` for `SourceRange.__contains__`
  [Clang] Add target triple to fix failing test (#104513)
  [clang][NFC] Fix table of contents in `Sema.h`
  [-Wunsafe-buffer-usage] Fix warning after #102953
  [flang] Make sure range is valid (#104281)
  [MC] Replace hasAltEntry() with isMachO()
  MCAsmInfo: Replace some Mach-O specific check with isMachO(). NFC
  [asan] De-prioritize VReport `DTLS_Find` (#104401)
  Revert "[DXIL][Analysis] Implement enough of DXILResourceAnalysis for buffers" (#104504)
  [ubsan] Limit _BitInt ubsan tests to x86-64 platform only (#104494)
  Update load intrinsic attributes (#101562)
  [MC] Replace HasAggressiveSymbolFolding with SetDirectiveSuppressesReloc. NFC
  [SandboxIR] Implement BinaryOperator (#104121)
  [RISCV][GISel] Support nxv16p0 for RV32. (#101573)
  [nfc][ctx_prof] Remove the need for `PassBuilder` to know about `UseCtxProfile` (#104492)
  [Clang] [NFC] Rewrite constexpr vectors test to use element access (#102757)
  (lldb) Fix PATH_MAX for Windows (#104493)
  [libc] Add definition for `atan2l` on 64-bit long double platforms (#104489)
  Revert "[sanitizer] Remove GetCurrentThread nullness checks from Allocate"
  Reapply "Fix prctl to handle PR_GET_PDEATHSIG. (#101749)" (#104469)
  [-Wunsafe-buffer-usage] Fix a small bug recently found (#102953)
  [TargetLowering] Don't call SelectionDAG::getTargetLoweringInfo() from TargetLowering methods. NFC (#104197)
  [PowerPC][GlobalMerge] Enable GlobalMerge by default on AIX (#101226)
  [Clang] Implement C++26’s P2893R3 ‘Variadic friends’ (#101448)
  clang/AMDGPU: Emit atomicrmw for __builtin_amdgcn_global_atomic_fadd_{f32|f64} (#96872)
  [llvm-objdump] Fix a warning
  [bazel] Port 47721d46187f89c12a13d07b5857496301cf5d6e (#104481)
  [libc++] Remove the allocator<const T> extension (#102655)
  [Clang] handle both gnu and cpp11 attributes to ensure correct parsing inside extern block (#102864)
  [gn build] Port 47721d46187f
  [lldb] Realpath symlinks for breakpoints (#102223)
  llvm-objdump: ensure a MachO symbol isn't STAB before looking up section (#86667)
  [test]Fix test error due to CRT dependency (#104462)
  [clang][Interp] Call move function for certain primitive types (#104437)
  [llvm-objdump] Print out xcoff file header for xcoff object file with option private-headers (#96350)
  [Clang] prevent null explicit object argument from being deduced (#104328)
  Revert "[Clang] Overflow Pattern Exclusions (#100272)"
  [flang][OpenMP] Fix 2 more regressions after #101009 (#101538)
  [InstCombine] Fold `ucmp/scmp(x, y) >> N` to `zext/sext(x < y)` when N is one less than the width of the result of `ucmp/scmp` (#104009)
  [bazel] Enable more lit self tests (#104285)
  Fix single thread stepping timeout race condition (#104195)
  [SPARC][Utilities] Add names for SPARC ELF flags in LLVM binary utilities (#102843)
  [SPARC][Driver] Add -m(no-)v8plus flags handling (#98713)
  [OpenMP] Add support for pause with omp_pause_stop_tool (#97100)
  Revert "[SLP][NFC]Remove unused using declarations, reduce mem usage in containers, NFC"
  [ValueTracking] Fix f16 fptosi range for large integers
  [InstSimplify] Add tests for f16 to i128 range (NFC)
  Revert "[Object][x86-64] Add support for `R_X86_64_GLOB_DAT` relocations. (#103029)" (#103497)
  [NFC] Fix spelling of "definitely". (#104455)
  [InstCombine][NFC] Add tests for shifts of constants by common factor (#103471)
  [OpenMP] Miscellaneous small code improvements (#95603)
  [clang][ExtractAPI] Emit environment component of target triple in SGF (#103273)
  [RISCV] Narrow indices to e16 for LMUL > 1 when lowering vector_reverse (#104427)
  [NFC] Fix code line exceeding 80 columns (#104428)
  [SLP][NFC]Remove unused using declarations, reduce mem usage in containers, NFC
  [Clang] Check explicit object parameter for defaulted operators properly (#100419)
  [LegalizeTypes][AMDGPU]: Allow for scalarization of insert_subvector (#104236)
  Allow optimization of __size_returning_new variants. (#102258)
  [SLP]Fix PR104422: Wrong value truncation
  [GlobalISel] Combiner: Fix warning after #102163
  [SLP][NFC]Add a test with incorrect minbitwidth analysis for reduced operands
  [ubsan] Display correct runtime messages for negative _BitInt (#96240)
  Revert "[SLP][NFC]Remove unused using declarations, reduce mem usage in containers, NFC"
  [DataLayout] Extract loop body into a function to reduce nesting (NFC) (#104420)
  [clang][ExtractAPI] Compute inherited availability information (#103040)
  [CodeGen] Fix -Wcovered-switch-default in Combiner.cpp (NFC)
  [CompilerRT][Tests] Fix profile/darwin-proof-of-concept.c (#104237)
  [mlir][gpu] Fix typo in test filename (#104053)
  [LoongArch] Pre-commit tests for validating the merge base offset in vectors. NFC
  [AArch64] optimise SVE prefetch intrinsics with no active lanes (#103052)
  [AMDGPU] MCExpr printing helper with KnownBits support (#95951)
  [GlobalISel] Combiner: Observer-based DCE and retrying of combines
  [libcxx] Use `aligned_alloc` for testing instead of `posix_memalign` (#101748)
  [VPlan] Run VPlan optimizations on plans in native path.
  [clang][Interp] Use first field decl for Record field lookup (#104412)
  InferAddressSpaces: Restore non-instruction user check
  [AMDGPU][llvm-split] Fix another division by zero (#104421)
  Reapply "[lldb] Tolerate multiple compile units with the same DWO ID (#100577)" (#104041)
  [lldb-dap] Expose log path in extension settings (#103482)
  [clang][Interp] Pass callee decl to null_callee diagnostics (#104426)
  [llvm][CodeGen] Resolve issues when updating live intervals in window scheduler (#101945)
  [DataLayout] Add helper predicates to sort specifications (NFC) (#104417)
  InferAddressSpaces: Make getPredicatedAddrSpace less confusing (#104052)
  [AArch64] Fold more load.x into load.i with large offset
  [AArch64] merge index address with large offset into base address
  [AArch64] Add verification for MemOp immediate ranges (#97561)
  Revert "[Clang] [AST] Fix placeholder return type name mangling for MSVC 1920+ / VS2019+ (#102848)"
  [analyzer] Do not reason about locations passed as inline asm input (#103714)
  [NFC][mlir][scf] Fix misspelling of replace (#101683)
  Revert "Remove empty line."
  [mlir][Transforms] Dialect conversion: Build unresolved materialization for replaced ops (#101514)
  Remove empty line.
  [DirectX] Use a more consistent pass name for DXILTranslateMetadata
  [Flang][OpenMP] Move assert for wrapper syms and block args to genLoopNestOp (#103731)
  [clang][driver] Fix -print-target-triple OS version for apple targets (#104037)
  [bazel] Port for 141536544f4ec1d1bf24256157f4ff1a3bc07dae
  [DAG] Adding m_FPToUI and m_FPToSI to SDPatternMatch.h (#104044)
  [llvm][Docs] `_or_null` -> `_if_present` in Programmer's Manual (#98586)
  [MLIR][LLVM]: Add an IR utility to perform slice walking (#103053)
  [lldb][test] Mark sys_info zdump test unsupported on 32 bit Arm Linux
  [flang][test] Run Driver/fveclib-codegen.f90 for aarch64 and x86_64 (#103730)
  [lldb] Remove Phabricator usernames from Code Owners file (#102590)
  [DataLayout] Move '*AlignElem' structs and enum inside DataLayout (NFC) (#103723)
  [flang][test] Fix Lower/default-initialization-globals.f90 on SPARC (#103722)
  [mlir][test] XFAIL little-endian-only tests on SPARC (#103726)
  [UnitTests] Convert some data layout parsing tests to GTest (#104346)
  Fix warnings in #102848 [-Wunused-but-set-variable]
  [VPlan] Move VPWidenStoreRecipe::execute to VPlanRecipes.cpp (NFC).
  [include-cleaner] Remove two commented-out lines of code.
  [mlir][tosa] Add verifier for `tosa.table` (#103708)
  [X86][MC] Remove CMPCCXADD's CondCode flavor. (#103898)
  [ctx_prof] Remove an unneeded include in CtxProfAnalysis.cpp
  Intrinsic: introduce minimumnum and maximumnum for IR and SelectionDAG (#96649)
  Remove failing test until it can be fixed properly.
  [Clang][NFC] Move FindCountedByField into FieldDecl (#104235)
  Fix testcases. Use -emit-llvm and not -S. Use LABEL checking.
  [Clang] [AST] Fix placeholder return type name mangling for MSVC 1920+ / VS2019+ (#102848)
  [LLDB][OSX] Removed semi colon generating a warning during build (#104398)
  [OpenMP] Use range-based for loops (NFC) (#103511)
  [RISCV] Implement RISCVTTIImpl::shouldConsiderAddressTypePromotion for RISCV (#102560)
  [lld-macho] Fix crash: ObjC category merge + relative method lists (#104081)
  [ELF][NFC] Allow non-GotSection for addAddendOnlyRelocIfNonPreemptible (#104228)
  [ctx_prof] CtxProfAnalysis: populate module data (#102930)
  [sanitizer] Remove GetCurrentThread nullness checks from Allocate
  Remove '-emit-llvm' and use '-triple'
  Use clang_cc1 and specify the target explicitly.
  utils/git: Add linkify script.
  [mlir][MemRef] Add more ops to narrow type support, strided metadata expansion (#102228)
  [Clang] Overflow Pattern Exclusions (#100272)
  [Clang] Error on extraneous template headers by default. (#104046)
  [Sanitizers] Disable prctl test on Android.
  [RISCV] Don't combine (sext_inreg (fmv_x_anyexth X), i16) with Zhinx.
  Remove unused variable, and unneeded extract element instruction (#103489)
  [bazel] Port 4bac8fd8904904bc7d502f39851eef50b5afff73 (#104278)
  Reland "[flang][cuda] Use cuda runtime API #103488"
  [Clang] Add `__CLANG_GPU_DISABLE_MATH_WRAPPERS` macro for offloading math (#98234)
  [llvm-lit] Fix Unhashable TypeError when using lit's internal shell (#101590)
  [llvm-lit][test][NFC] Moved cat command tests into separate lit test file (#102366)
  [RISCV] Add signext attribute to return of fmv_x_w test in float-convert.ll. NFC
  [DXIL][Analysis] Implement enough of DXILResourceAnalysis for buffers
  Reapply "[Attributor][AMDGPU] Enable AAIndirectCallInfo for AMDAttributor (#100952)"
  [DXIL][Analysis] Boilerplate for DXILResourceAnalysis pass
  [mlir] Add bubbling patterns for non intersecting reshapes (#103401)
  Revert "[flang][cuda] Use cuda runtime API" (#104232)
  [libc++] Remove non-existent LWG issue from the .csv files
  [RISCV][GISel] Remove support for s32 G_VAARG on RV64. (#102533)
  [NVPTX] Add idp2a, idp4a intrinsics (#102763)
  [X86] Check if an invoked function clobbers fp or bp (#103446)
  [flang][cuda] Use cuda runtime API (#103488)
  [SLP][NFC]Remove unused using declarations, reduce mem usage in containers, NFC
  [TargetLowering] Remove unncessary null check. NFC
  [OpenMP] Fix buildbot failing on allocator test
  [clang] Turn -Wenum-constexpr-conversion into a hard error (#102364)
  [libcxx] Adjust inline assembly constraints for the AMDGPU target (#101747)
  [lld-macho] Make relative method lists work on x86-64 (#103905)
  [libcxx] Disable invalid `__start/__stop` reference on NVPTX (#99381)
  [libcxx] Add fallback to standard C when `unistd` is unavailable (#102005)
  [Clang] Fix 'nvlink-wrapper' not ignoring `-plugin` like lld does (#104056)
  [OpenMP] Implement 'omp_alloc' on the device (#102526)
  [vscode-mlir] Added per-LSP-server executable arguments (#79671)
  [flang] Read the extra field from the in box when doing reboxing (#102992)
  [HLSL] Split out the ROV attribute from the resource attribute, make it a new spellable attribute. (#102414)
  [libc++] Fix ambiguous constructors for std::complex and std::optional (#103409)
  AMDGPU: Avoid manually reconstructing atomicrmw (#103769)
  [libc] Fix 'float type' incorrectly being used as the return type
  [Clang] Adjust concept definition locus (#103867)
  [SandboxIR] Implement Instruction flags (#103343)
  [AArch64] Add some uxtw peephole tests. NFC
  AMDGPU: Stop promoting allocas with addrspacecast users (#104051)
  [NVPTX] Fix typo causing GCC warning (#103045)
  [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (#101585)
  [RISCV][GISel] Support G_SEXT_INREG for Zbb. (#102682)
  [SystemZ][z/OS] Continuation of __ptr32 support (#103393)
  [X86] concat(permv3(x0,m0,y0),permv3(x0,m1,y0)) -> permv3(concat(x0,u),m3,concat(y0,u))
  [X86] Add test coverage for #103564
  [X86] combineEXTRACT_SUBVECTOR - treat oneuse extractions from loads as free
  [libcxx] Set `_LIBCPP_HAS_CLOCK_GETTIME` for GPU targets (#99243)
  Fix bazel build (#104054)
  CodeGen/NewPM: Add ExpandLarge* passes to isel IR passes (#102815)
  AMDGPU/NewPM: Fill out addPreISelPasses (#102814)
  [libc++] Add mechanical update to CxxPapers.rst to git-blame-ignore-revs
  [libc++] Mechanical adjustments for the C++14 Paper status files
  [LLDB][OSX] Add a fallback support exe directory (#103458)
  [TextAPI] Use range-based for loops (NFC) (#103530)
  [mlir][vector] Add tests for `populateSinkVectorBroadcastPatterns` (1/n) (#102286)
  [libc++] Remove duplicate C++17 LWG issues from the CSVs
  [clang] Implement `__builtin_is_implicit_lifetime()` (#101807)
  Fix prctl test to execute all test cases if the first condition fails. (#102987)
  Revert "[scudo] Separated committed and decommitted entries." (#104045)
  [SelectionDAG] Scalarize binary ops of splats be…
@stefson

stefson commented Aug 23, 2024

Could you backport the fix to the 19.x branch?

@RKSimon
Collaborator

RKSimon commented Aug 23, 2024

/cherry-pick 6ec4c9c 7faf2c9 cf6cd1f

@RKSimon RKSimon reopened this Aug 23, 2024
@llvmbot
Member

llvmbot commented Aug 23, 2024

Failed to cherry-pick: 6ec4c9c

https://github.com/llvm/llvm-project/actions/runs/10524207495

Please manually backport the fix and push it to your GitHub fork. Once this is done, please create a pull request.

@RKSimon
Collaborator

RKSimon commented Aug 23, 2024

/cherry-pick 3276ee3 6ec4c9c 7faf2c9 cf6cd1f

llvmbot pushed a commit to llvmbot/llvm-project that referenced this issue Aug 23, 2024
llvmbot pushed a commit to llvmbot/llvm-project that referenced this issue Aug 23, 2024
Missed IceLakeServer when I updated the other CPUs in 6ec4c9c

(cherry picked from commit 7faf2c9)
llvmbot pushed a commit to llvmbot/llvm-project that referenced this issue Aug 23, 2024
…(REAPPLIED)

This doesn't match uops.info yet - but it matches the existing vpscatterdq/vscatterqpd entries like uops.info says it should

Reapplied with codegen fix for scatter-schedule.ll

Fixes llvm#105675

(cherry picked from commit cf6cd1f)
@llvmbot llvmbot closed this as completed Aug 23, 2024
@github-project-automation github-project-automation bot moved this from Needs Triage to Done in LLVM Release Status Aug 23, 2024
@llvmbot
Member

llvmbot commented Aug 23, 2024

/pull-request #105815

cjdb pushed a commit to cjdb/llvm-project that referenced this issue Aug 23, 2024
cjdb pushed a commit to cjdb/llvm-project that referenced this issue Aug 23, 2024
Missed IceLakeServer when I updated the other CPUs in 6ec4c9c
cjdb pushed a commit to cjdb/llvm-project that referenced this issue Aug 23, 2024
This doesn't match uops.info yet - but it matches the existing vpscatterdq/vscatterqpd entries like uops.info says it should

Fixes llvm#105675
cjdb pushed a commit to cjdb/llvm-project that referenced this issue Aug 23, 2024
…(REAPPLIED)

This doesn't match uops.info yet - but it matches the existing vpscatterdq/vscatterqpd entries like uops.info says it should

Reapplied with codegen fix for scatter-schedule.ll

Fixes llvm#105675
tru pushed a commit to llvmbot/llvm-project that referenced this issue Aug 26, 2024
…(REAPPLIED)

This doesn't match uops.info yet - but it matches the existing vpscatterdq/vscatterqpd entries like uops.info says it should

Reapplied with codegen fix for scatter-schedule.ll

Fixes llvm#105675

(cherry picked from commit cf6cd1f)