Fixing an Integer Overflow Error During Index Construction #80

Merged
merged 8 commits into from
Jan 5, 2025

BlaiseMuhirwa
Owner

@BlaiseMuhirwa BlaiseMuhirwa commented Jan 4, 2025

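This addresses the recall issue seen at larger scales, which traces back to an integer overflow during index construction. The AddressSanitizer trace below shows the resulting heap-buffer-overflow read. The fix is to use 64-bit integers when computing byte offsets for memory accesses and the index's memory size.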

==978673==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7ffff31ff800 at pc 0x55555579f7f8 bp 0x7ff665c97b20 sp 0x7ff665c97b10
READ of size 4 at 0x7ffff31ff800 thread T16
    #0 0x55555579f7f7 in defaultSquaredL2<float> /scratch0/brc7/flatnav/flatnav/distances/L2DistanceDispatcher.h:13
    #1 0x55555579f7f7 in float flatnav::distances::L2DistanceDispatcher::dispatch<float>(float const*, float const*, unsigned long const&) /scratch0/brc7/flatnav/flatnav/distances/L2DistanceDispatcher.h:124
    #2 0x55555579f7f7 in flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>::distanceImpl(void const*, void const*, bool) const /scratch0/brc7/flatnav/flatnav/distances/SquaredL2Distance.h:41
    #3 0x55555579f7f7 in flatnav::distances::DistanceInterface<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9> >::distance(void const*, void const*, bool) /scratch0/brc7/flatnav/flatnav/distances/DistanceInterface.h:27
    #4 0x55555579f7f7 in flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::initializeSearch(void const*, int) /scratch0/brc7/flatnav/flatnav/index/Index.h:1029
    #5 0x55555579f7f7 in flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::add(void*, int&, int, int) /scratch0/brc7/flatnav/flatnav/index/Index.h:371
    #6 0x55555579f7f7 in flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1}::operator()(unsigned int) const /scratch0/brc7/flatnav/flatnav/index/Index.h:336
    #7 0x55555579f7f7 in void std::__invoke_impl<void, flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1}&, unsigned int>(std::__invoke_other, flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1}&, unsigned int&&) /usr/include/c++/9/bits/invoke.h:60
    #8 0x55555579f7f7 in std::__invoke_result<flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1}&, unsigned int>::type std::__invoke<flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1}&, unsigned int>(std::__invoke_result&&, (flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1}&)...) /usr/include/c++/9/bits/invoke.h:95
    #9 0x55555579f7f7 in decltype(auto) std::__apply_impl<flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1}&, std::tuple<unsigned int>, 0ul>(flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1}&, std::tuple<unsigned int>&&, std::integer_sequence<unsigned long, 0ul>) /usr/include/c++/9/tuple:1684
    #10 0x55555579f7f7 in decltype(auto) std::apply<flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1}&, std::tuple<unsigned int> >(flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1}&, std::tuple<unsigned int>&&) /usr/include/c++/9/tuple:1694
    #11 0x55555579f7f7 in flatnav::executeInParallel<flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1}>(unsigned int, unsigned int, unsigned int, flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1})::{lambda()#1}::operator()() const /scratch0/brc7/flatnav/flatnav/util/Multithreading.h:37
    #12 0x55555579f7f7 in flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1} std::__invoke_impl<void, flatnav::executeInParallel<flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1}>(unsigned int, unsigned int, unsigned int, flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1})::{lambda()#1}>(std::__invoke_other, flatnav::executeInParallel<flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1}>(unsigned int, unsigned int, unsigned int, flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1})::{lambda()#1}&&) /usr/include/c++/9/bits/invoke.h:60
    #13 0x55555579f7f7 in _ZSt8__invokeIZN7flatnav17executeInParallelIZNS0_5IndexINS0_9distances17SquaredL2DistanceILNS0_4util8DataTypeE9EEEiE8addBatchIfEEvPvRSt6vectorIiSaIiEEiiEUljE_JEEEvjjjT_DpT0_EUlvE_JEENSt15__invoke_resultISG_JSI_EE4typeEOSG_DpOSH_ /usr/include/c++/9/bits/invoke.h:95
    #14 0x55555579f7f7 in void std::thread::_Invoker<std::tuple<flatnav::executeInParallel<flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1}>(unsigned int, unsigned int, unsigned int, flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1})::{lambda()#1}> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) /usr/include/c++/9/thread:244
    #15 0x55555579f7f7 in std::thread::_Invoker<std::tuple<flatnav::executeInParallel<flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1}>(unsigned int, unsigned int, unsigned int, flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1})::{lambda()#1}> >::operator()() /usr/include/c++/9/thread:251
    #16 0x55555579f7f7 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<flatnav::executeInParallel<flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1}>(unsigned int, unsigned int, unsigned int, flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int)::{lambda(unsigned int)#1})::{lambda()#1}> > >::_M_run() /usr/include/c++/9/thread:195
    #17 0x7ffff7467df3  (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0xd6df3)
    #18 0x7ffff685b608 in start_thread /build/glibc-e2p3jK/glibc-2.31/nptl/pthread_create.c:477
    #19 0x7ffff6780352 in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x11f352)

0x7ffff31ff800 is located 0 bytes to the right of 6400000000-byte region [0x7ffe75a7b800,0x7ffff31ff800)
allocated by thread T0 here:
    #0 0x7ffff7682587 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cc:104
    #1 0x5555557dbaef in load_the_npy_file(_IO_FILE*) (/scratch0/brc7/flatnav/build/construct_npy+0x287aef)

Thread T16 created by T0 here:
    #0 0x7ffff75ad815 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cc:208
    #1 0x7ffff74680c9 in std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0xd70c9)
    #2 0x555555794077 in void flatnav::Index<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9>, int>::addBatch<float>(void*, std::vector<int, std::allocator<int> >&, int, int) /scratch0/brc7/flatnav/flatnav/index/Index.h:329
    #3 0x555555794077 in void buildIndex<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9> >(float*, std::unique_ptr<flatnav::distances::DistanceInterface<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9> >, std::default_delete<flatnav::distances::DistanceInterface<flatnav::distances::SquaredL2Distance<(flatnav::util::DataType)9> > > >, int, int, int, int, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /scratch0/brc7/flatnav/tools/construct_npy.cpp:45
    #4 0x5555556cf1df in run(float*, flatnav::distances::MetricType, int, int, int, int, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) /scratch0/brc7/flatnav/tools/construct_npy.cpp:80
    #5 0x5555556c79fa in main /scratch0/brc7/flatnav/tools/construct_npy.cpp:127
    #6 0x7ffff6685082 in __libc_start_main ../csu/libc-start.c:308

SUMMARY: AddressSanitizer: heap-buffer-overflow /scratch0/brc7/flatnav/flatnav/distances/L2DistanceDispatcher.h:13 in defaultSquaredL2<float>
Shadow bytes around the buggy address:
  0x10007e637eb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007e637ec0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007e637ed0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007e637ee0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007e637ef0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x10007e637f00:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x10007e637f10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x10007e637f20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x10007e637f30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x10007e637f40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x10007e637f50: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
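
For readers unfamiliar with this class of bug, here is a minimal sketch of how a byte offset computed in 32 bits wraps once the data passes 4 GiB, and how widening to 64 bits before the multiplication avoids it. The helper names (`byteOffset32`, `byteOffset64`) and the dataset shape (12,500,000 vectors of 128 floats) are assumptions chosen only so that the total matches the 6400000000-byte region reported in the trace; they are not taken from the flatnav source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: byte offset computed entirely in 32 bits.
// Unsigned arithmetic is used here so the wraparound is well-defined
// (modulo 2^32); with a signed int the same overflow would be undefined
// behavior in C++.
constexpr std::uint32_t byteOffset32(std::uint32_t index,
                                     std::uint32_t record_bytes) {
  return index * record_bytes;
}

// Same computation, widened to 64 bits *before* multiplying, so the
// product cannot wrap for any pair of 32-bit inputs.
constexpr std::uint64_t byteOffset64(std::uint32_t index,
                                     std::uint32_t record_bytes) {
  return static_cast<std::uint64_t>(index) * record_bytes;
}

// Assumed shape: 12,500,000 vectors x 128 floats x 4 bytes per float.
constexpr std::uint32_t kNumVectors = 12500000u;
constexpr std::uint32_t kRecordBytes = 128u * 4u;

// Hypothetical self-check contrasting the two computations.
inline void checkOffsets() {
  // 64-bit arithmetic yields the true size of the region.
  assert(byteOffset64(kNumVectors, kRecordBytes) == 6400000000ULL);
  // 32-bit arithmetic wraps: 6400000000 - 2^32 = 2105032704.
  assert(byteOffset32(kNumVectors, kRecordBytes) == 2105032704u);
}
```

An offset that has wrapped this way points at the wrong bytes. Depending on where it lands, the access either reads unrelated data silently or steps outside the allocation, and the latter is the kind of access AddressSanitizer reports.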

@BlaiseMuhirwa BlaiseMuhirwa marked this pull request as ready for review January 4, 2025 19:04
Collaborator

@vihan-lakshman vihan-lakshman left a comment

LGTM

@BlaiseMuhirwa BlaiseMuhirwa merged commit 231b423 into main Jan 5, 2025
13 checks passed
@BlaiseMuhirwa BlaiseMuhirwa deleted the recall-debugging branch January 5, 2025 18:26
3 participants