Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* Configure build system to compile arch-specific extension for Arm NEON
* Add NEON implementation for `util/simd/vector.h`
* Add `tranpose16x16` implementation with NEON
* Fix wrong macro used in dynamic dispatch with NEON enabled
* Add NEON implementation of score vectors in `score_vector_int8.h`
* Enable `benchmark_transpose` when compiled with NEON support
* Add NEON implementation in `score_vector_int16.h`
* Implement `SwipeProfile` with NEON in `swipe.h`
* Add remaining unimplemented code in `score_vector.h`
* Enable SIMD swipe and benchmarks when compiling for NEON
* Add guards to compile NEON code on Armv7 platforms
* Add NEON horizontal sum implementations using `vpadalq` cascades
* Add fallback `ScoreVector::cmp_mask` implementation for Armv7 NEON
* Add NEON support to `finger_print.h` and `hash_set.h`
* Enable NEON architecture in `banded_3frame_swipe.cpp`
* Add remaining benchmarks for which a NEON implementation exists
* List NEON in compile-time and runtime feature lists
* Update define macros for compilation of NEON architecture library
* Fix some define guards in `util/simd` headers
* Fix define guards in `ungapped_simd` when compiling for Aarch64
* Fix `cmp_mask` implementations for NEON `ScoreVector` instantiations
* Rewrite NEON `transpose` without Aarch64-specific instructions
* Use Arm NEON include guards instead of architecture for banded swipe code
* Add `expand_from_8bit` implementation for Armv7 NEON
* Rewrite 16x16 NEON transpose without `vst1q_s8_x4` for Armv7 compatibility
* Benchmark scalar 16x16 transpose to compare to vectorized code
* Fix include guards in `swipe_wrapper.cpp` preventing the use of NEON vectors
* Remove `Deque::Iterator::operator-` when compiling for Arm for `armv7` compatibility
* Fix `benchmark_transpose` not using a valid scalar implementation
* Refactor `cmp_mask` implementation into common inline function for NEON
* Add Arm NEON implementation to `sse_dist.h`
* Add Arm NEON implementation of `BitVector::one_count` using `vcntq_u8` intrinsic
* Fix `reduce_seq_aarch64` definition in `sse_dist.h`
* Implement `ScoreVector<int16_t>(unsigned, Register)` for NEON
* Fix potential overflow in NEON code of `BitVector::one_count`
* Fix `CMakeLists.txt` and `simd.h` to only build NEON on `armv7` with build support
* Set up runtime detection of `NEON` for `armv7` in `simd.cpp`
* Set up dispatch for NEON in `dispatch.h`
* Fix scope of NEON helpers in `simd.h`
* Change NEON architecture ID in `CMakeLists.txt` to avoid conflict with AVX512
* Fix remaining use of `::SIMD` namespace in NEON code
* Fix NEON `letter_mask` not disabling sequence mask based on macros
* Restore commented code in `deque.h`
* Fix iteration in `hauser_correction.cpp`
* Use compile-time macro to avoid `Deque::Iterator::operator-` overload
* Fix benchmarks not running properly for NEON
* Fix `armv7l` implementation of `vmaskq_s8`
* Fix `table.h` for 32-bit platforms
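Several entries in the commit message concern counting set bits with the `vcntq_u8` intrinsic, avoiding overflow while summing the per-byte counts, and guarding NEON code so it still compiles elsewhere. The following is a minimal sketch of how those three ideas can fit together. It is not the code from this commit: `count_set_bits` is a hypothetical helper name, the `__ARM_NEON` guard is an assumption about which macro is tested, and the reduction uses `vpaddlq` widening steps where the commit message mentions `vpadalq` cascades.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical helper (not the project's BitVector::one_count):
// counts the set bits in a byte buffer of length n.
inline std::uint64_t count_set_bits(const std::uint8_t* data, std::size_t n) {
    std::uint64_t total = 0;
    std::size_t i = 0;

#if defined(__ARM_NEON)
    // NEON path, using only intrinsics that are also available on Armv7.
    for (; i + 16 <= n; i += 16) {
        uint8x16_t bytes  = vld1q_u8(data + i);
        uint8x16_t counts = vcntq_u8(bytes);   // per-byte popcount, each 0..8
        // Widen at every pairwise-add step so no lane can overflow.
        uint16x8_t s16 = vpaddlq_u8(counts);
        uint32x4_t s32 = vpaddlq_u16(s16);
        uint64x2_t s64 = vpaddlq_u32(s32);
        total += vgetq_lane_u64(s64, 0) + vgetq_lane_u64(s64, 1);
    }
#endif

    // Scalar fallback: handles the tail, or the whole buffer without NEON.
    for (; i < n; ++i) {
        std::uint8_t b = data[i];
        while (b != 0) {
            b = static_cast<std::uint8_t>(b & (b - 1));  // clear lowest set bit
            ++total;
        }
    }
    return total;
}
```

Because each 16-byte block is reduced into a 64-bit scalar before the next block is loaded, no vector lane ever holds more than one block's worth of counts. Accumulating in narrow lanes across many blocks would be faster but needs a bound on the number of iterations between reductions, which is the kind of overflow the commit message alludes to.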
Showing 27 changed files with 1,011 additions and 102 deletions.