Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Expressions cause segfault on Windows (MSVC 2019 v16) #1336

Closed
laxnpander opened this issue Nov 28, 2022 · 9 comments
Closed

Expressions cause segfault on Windows (MSVC 2019 v16) #1336

laxnpander opened this issue Nov 28, 2022 · 9 comments
Assignees
Milestone

Comments

@laxnpander
Copy link

Description

Expressions seem to cause a segfault on Windows 2019 v16 (potentially others as well). None of the examples with expressions work. According to the debugger, the following line causes the crash (ExpressionNode.h, L279):

    Record(const Values& values, const ExpressionNode<A1>& expression1, ExecutionTraceStorage* ptr)
        : value1(expression1.traceExecution(values, trace1, ptr + upAligned(sizeof(Record)))) {}

Stacktrace:

 	SFMExampleExpressions.exe!gtsam::internal::CallRecord<2>::CallRecord<2>()	C++
 	SFMExampleExpressions.exe!gtsam::internal::CallRecordImplementor<gtsam::internal::UnaryExpression<Eigen::Matrix<double,2,1,0,2,1>,Eigen::Matrix<double,3,1,0,3,1>>::Record,2>::CallRecordImplementor<gtsam::internal::UnaryExpression<Eigen::Matrix<double,2,1,0,2,1>,Eigen::Matrix<double,3,1,0,3,1>>::Record,2>()	C++
>	SFMExampleExpressions.exe!gtsam::internal::UnaryExpression<Eigen::Matrix<double,2,1,0,2,1>,Eigen::Matrix<double,3,1,0,3,1>>::Record::Record(const gtsam::Values & values, const gtsam::internal::ExpressionNode<Eigen::Matrix<double,3,1,0,3,1>> & expression1, boost::detail::aligned_storage::aligned_storage_imp<1,32> * ptr) Line 280	C++
 	SFMExampleExpressions.exe!gtsam::internal::UnaryExpression<Eigen::Matrix<double,2,1,0,2,1>,Eigen::Matrix<double,3,1,0,3,1>>::traceExecution(const gtsam::Values & values, gtsam::internal::ExecutionTrace<Eigen::Matrix<double,2,1,0,2,1>> & trace, boost::detail::aligned_storage::aligned_storage_imp<1,32> * ptr) Line 322	C++
 	SFMExampleExpressions.exe!gtsam::internal::BinaryExpression<Eigen::Matrix<double,2,1,0,2,1>,gtsam::Cal3_S2,Eigen::Matrix<double,2,1,0,2,1>>::Record::Record(const gtsam::Values & values, const gtsam::internal::ExpressionNode<gtsam::Cal3_S2> & expression1, const gtsam::internal::ExpressionNode<Eigen::Matrix<double,2,1,0,2,1>> & expression2, boost::detail::aligned_storage::aligned_storage_imp<1,32> * ptr) Line 404	C++
 	SFMExampleExpressions.exe!gtsam::internal::BinaryExpression<Eigen::Matrix<double,2,1,0,2,1>,gtsam::Cal3_S2,Eigen::Matrix<double,2,1,0,2,1>>::traceExecution(const gtsam::Values & values, gtsam::internal::ExecutionTrace<Eigen::Matrix<double,2,1,0,2,1>> & trace, boost::detail::aligned_storage::aligned_storage_imp<1,32> * ptr) Line 432	C++
 	SFMExampleExpressions.exe!gtsam::Expression<Eigen::Matrix<double,2,1,0,2,1>>::traceExecution(const gtsam::Values & values, gtsam::internal::ExecutionTrace<Eigen::Matrix<double,2,1,0,2,1>> & trace, void * traceStorage) Line 193	C++
 	SFMExampleExpressions.exe!gtsam::Expression<Eigen::Matrix<double,2,1,0,2,1>>::valueAndJacobianMap(const gtsam::Values & values, gtsam::internal::JacobianMap & jacobians) Line 218	C++
 	SFMExampleExpressions.exe!gtsam::ExpressionFactor<Eigen::Matrix<double,2,1,0,2,1>>::linearize(const gtsam::Values & x) Line 139	C++
 	gtsamDebug.dll!gtsam::NonlinearFactorGraph::linearize(const gtsam::Values & linearizationPoint) Line 270	C++
 	gtsamDebug.dll!gtsam::DoglegOptimizer::iterate() Line 87	C++
 	gtsamDebug.dll!gtsam::NonlinearOptimizer::defaultOptimize() Line 95	C++
 	gtsamDebug.dll!gtsam::NonlinearOptimizer::optimize() Line 98	C++
 	SFMExampleExpressions.exe!main(int argc, char * * argv) Line 95	C++

Steps to reproduce

CMake configuration is with GTSAM_BUILD_UNSTABLE=OFF and GTSAM_BUILD_PYTHON=OFF. Compiler used is MSVC2019 v16 with Boost 1.72 and GTSAM version of Eigen3.

D:\Workspace\spear\spear\3rd\gtsam\cmake-build>cmake --build . --config debug
CMake is re-running because D:/Workspace/spear/spear/3rd/gtsam/cmake-build/examples/CMakeFiles/generate.stamp is out-of-date.
  the file 'D:/Workspace/spear/spear/3rd/gtsam/examples/CMakeLists.txt'
  is newer than 'D:/Workspace/spear/spear/3rd/gtsam/cmake-build/examples/CMakeFiles/generate.stamp.depend'
  result='-1'
-- Selecting Windows SDK version 10.0.19041.0 to target Windows 10.0.19044.
-- GTSAM Version: 4.2a7
-- GTSAM_POSE3_EXPMAP=ON, enabling GTSAM_ROT3_EXPMAP as well
-- Found Eigen version: 3.3.7
-- Could NOT find MKL (missing: MKL_INCLUDE_DIR MKL_LIBRARIES)
-- Could NOT find TBB (missing: TBB_INCLUDE_DIRS TBB_LIBRARIES tbb tbbmalloc) (Required is at least version "4.4")
-- Building 3rdparty
-- Could NOT find GeographicLib (missing: GeographicLib_LIBRARY_DIRS GeographicLib_LIBRARIES GeographicLib_INCLUDE_DIRS)
-- Building base
-- Building basis
-- Building geometry
-- Building inference
-- Building symbolic
-- Building discrete
-- Building hybrid
-- Building linear
-- Building nonlinear
-- Building sam
-- Building sfm
-- Building slam
-- Building navigation
-- Adding precompiled header for MSVC
-- GTSAM Version: 4.2.0
-- Install prefix: D:/Workspace/spear/spear/3rd/gtsam/install
-- Building GTSAM - shared: ON
-- Wrote D:/Workspace/spear/spear/3rd/gtsam/cmake-build/GTSAMConfig.cmake
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE)
-- ===============================================================
-- ================  Configuration Options  ======================
--  CMAKE_CXX_COMPILER_ID type                       : MSVC
--  CMAKE_CXX_COMPILER_VERSION                       : 19.29.30146.0
--  CMake version                                    : 3.23.1
--  CMake generator                                  : Visual Studio 16 2019
--  CMake build tool                                 : C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/MSBuild/Current/Bin/MSBuild.exe
-- Build flags
--  Build Tests                                      : Enabled
--  Build examples with 'make all'                   : Enabled
--  Build timing scripts with 'make all'             : Disabled
--  Build shared GTSAM libraries                     : Enabled
--  Put build type in library name                   : Enabled
--  Build libgtsam_unstable                          : Disabled
--  Build GTSAM unstable Python                      : Enabled
--  Build MATLAB Toolbox for unstable                : Disabled
--  GTSAM_COMPILE_FEATURES_PUBLIC                    : cxx_std_11
--  GTSAM_COMPILE_OPTIONS_PUBLIC                     : /wd4267
--  GTSAM_COMPILE_DEFINITIONS_PUBLIC                 : EIGEN_NO_STATIC_ASSERT;EIGEN_NO_STATIC_ASSERT
--  GTSAM_COMPILE_OPTIONS_PUBLIC_DEBUG               :
--  GTSAM_COMPILE_DEFINITIONS_PUBLIC_DEBUG           :
--  GTSAM_COMPILE_OPTIONS_PUBLIC_RELEASE             :
--  GTSAM_COMPILE_DEFINITIONS_PUBLIC_RELEASE         :
--  GTSAM_COMPILE_OPTIONS_PUBLIC_TIMING              :
--  GTSAM_COMPILE_DEFINITIONS_PUBLIC_TIMING          :
--  GTSAM_COMPILE_OPTIONS_PUBLIC_PROFILING           :
--  GTSAM_COMPILE_DEFINITIONS_PUBLIC_PROFILING       :
--  GTSAM_COMPILE_OPTIONS_PUBLIC_RELWITHDEBINFO      :
--  GTSAM_COMPILE_DEFINITIONS_PUBLIC_RELWITHDEBINFO  :
--  GTSAM_COMPILE_OPTIONS_PUBLIC_MINSIZEREL          :
--  GTSAM_COMPILE_DEFINITIONS_PUBLIC_MINSIZEREL      :
--  Use System Eigen                                 : OFF (Using version: 3.3.7)
--  Use System Metis                                 : OFF
--  Using Boost version                              : 107200
--  Use Intel TBB                                    : TBB not found
--  Eigen will use MKL                               : MKL not found
--  Eigen will use MKL and OpenMP                    : OpenMP found but GTSAM_WITH_EIGEN_MKL is disabled
--  Default allocator                                : STL
--  Cheirality exceptions enabled                    : YES
-- Packaging flags
--  CPack Source Generator                           : TGZ
--  CPack Generator                                  : TGZ
-- GTSAM flags
--  Quaternions as default Rot3                      : Disabled
--  Runtime consistency checking                     : Disabled
--  Rot3 retract is full ExpMap                      : Enabled
--  Pose3 retract is full ExpMap                     : Enabled
--  Allow features deprecated in GTSAM 4.1           : Enabled
--  Metis-based Nested Dissection                    : Enabled
--  Use tangent-space preintegration                 : Enabled
-- MATLAB toolbox flags
--  Install MATLAB toolbox                           : Disabled
-- Python toolbox flags
--  Build Python module with pybind                  : Disabled
-- ===============================================================
CMake Warning at cmake/HandleFinalChecks.cmake:3 (message):
  TBB 4.4 or newer was not found - this is ok, but note that GTSAM
  parallelization will be disabled.  Set GTSAM_WITH_TBB to 'Off' to avoid
  this warning.

Expected behavior

It should not segfault.

Environment

MSVC2019 v16, CMake 3.23, Boost 1.72, no TBB, Eigen3 shipped with GTSAM

Additional information

@laxnpander
Copy link
Author

Update: I removed (ptr) in all lines like these:

Record* record = new (ptr) Record(values, *expression1_, ptr);

so it's like

Record* record = new Record(...);

instead of

Record* record = new (ptr) Record(...);

Now the examples seem to work (no crash, low final error). Can someone smarter than me say if this modification is safe?

@laxnpander laxnpander changed the title Expressions cause segfault on Windows (MSVC 16) Expressions cause segfault on Windows (MSVC 2019 v16) Nov 28, 2022
@dellaert
Copy link
Member

Interesting. This is the placement new operator and I remember it being crucial for performance and correctness. I have no idea why it would give problems on windows, unless there is some change in how they handle this (fairly rarely used) C++ feature.

@ProfFan
Copy link
Collaborator

ProfFan commented Nov 30, 2022

@laxnpander Could you try modifying

static const unsigned TraceAlignment = 32;
to be 64 instead of 32?

@laxnpander
Copy link
Author

laxnpander commented Nov 30, 2022

@ProfFan Still segfaulting. I just realised that Pose2SLAMExampleExpressions.exe works. However, Pose3SLAMExampleExpressions_BearingRangeWithTransform.exe as well as SFMExampleExpressions.exe are crashing. Not sure if this helps. Will recompile your change with debug flag and see if stacktrace changed or not.

Edit: Maybe I was mistaken, Pose2SLAMExampleExpressions seems to be crashing as well.

Stack trace for this:

	Pose2SLAMExampleExpressions.exe!gtsam::internal::ExecutionTrace<gtsam::Pose2>::~ExecutionTrace<gtsam::Pose2>() Zeile 174	C++
 	Pose2SLAMExampleExpressions.exe!gtsam::Expression<gtsam::Pose2>::valueAndJacobianMap(const gtsam::Values & values, gtsam::internal::JacobianMap & jacobians) Zeile 224	C++
 	Pose2SLAMExampleExpressions.exe!gtsam::ExpressionFactor<gtsam::Pose2>::linearize(const gtsam::Values & x) Zeile 139	C++
 	gtsamDebug.dll!gtsam::NonlinearFactorGraph::linearize(const gtsam::Values & linearizationPoint) Zeile 270	C++
 	gtsamDebug.dll!gtsam::GaussNewtonOptimizer::iterate() Zeile 49	C++
 	gtsamDebug.dll!gtsam::NonlinearOptimizer::defaultOptimize() Zeile 95	C++
 	gtsamDebug.dll!gtsam::NonlinearOptimizer::optimize() Zeile 98	C++
 	Pose2SLAMExampleExpressions.exe!main(int argc, char * * argv) Zeile 72	C++

@laxnpander
Copy link
Author

@ProfFan: Out of curiosity: Did you try running these examples on a windows system? I tried on two different laptops and both were throwing the same error. Same MSVC version though. The CI also just checks for compilation. I just wonder if this is a common issue or related to my configuration...

@ProfFan ProfFan self-assigned this Jan 9, 2023
@ProfFan ProfFan added this to the GTSAM 5 milestone Jan 9, 2023
@oicchris
Copy link
Contributor

It appears the issue is caused by destruction order. In MSVC, the pointer is being deallocated prior to internal::ExecutionTrace being destroyed.
gtsam/nonlinear/Expression-inl.h (209-224)

#ifdef _MSC_VER
  auto traceStorage = static_cast<internal::ExecutionTraceStorage*>(_aligned_malloc(size, internal::TraceAlignment));
#else
  internal::ExecutionTraceStorage traceStorage[size];
#endif

  internal::ExecutionTrace<T> trace;
  T value(this->traceExecution(values, trace, traceStorage));
  trace.startReverseAD1(jacobians);

#ifdef _MSC_VER
  _aligned_free(traceStorage); // <-- Deallocation and trace.content.ptr is now dangling.
#endif

  return value;
} // <-- internal::ExecutionTrace<T>::~ExecutionTrace() dtor calls trace.content.ptr if kind==Function

In the destructor, it is invoking a call on the dangling pointer.
gtsam/nonlinear/internal/ExecutionTrace.h (173-177)

  ~ExecutionTrace() {
    if (kind == Function) {
      content.ptr->~CallRecord<Dim>(); // <-- content.ptr is invalid
    }
  }

In my workaround, I used std::unique_ptr<> with a custom deleter. After this change, my test succeeded. Note that I seem to encounter the error when running in Debug, Release seems to be happy accessing the dangling pointer.

#ifdef _MSC_VER
  std::unique_ptr<void, void(*)(void*)> traceStorageDeleter(
    _aligned_malloc(size, internal::TraceAlignment),
    [](void *ptr) { _aligned_free(ptr); } );
  auto traceStorage = static_cast<internal::ExecutionTraceStorage*>(traceStorageDeleter.get());
#else
  internal::ExecutionTraceStorage traceStorage[size];
#endif

  internal::ExecutionTrace<T> trace;
  T value(this->traceExecution(values, trace, traceStorage));
  trace.startReverseAD1(jacobians);

  return value;
} // <-- traceStorage is deallocated after trace's dtor.

@dellaert
Copy link
Member

Hi @oicchris, great sleuthing! Would you be willing to do a quick PR with your fix for the _MSC_VER path?

@oicchris
Copy link
Contributor

oicchris commented Jan 25, 2023

Pfff please ignore the force pushes.. was attempting to reword a comment to make it palatable. Didn't realize it would retain the extra pushes. I'm new to Github, is there a way to clean up the above mess?

dellaert added a commit that referenced this issue Jan 27, 2023
Fix (issue #1336) dangling pointer access violation.
oicchris added a commit to oicchris/gtsam that referenced this issue Jan 28, 2023
@ProfFan
Copy link
Collaborator

ProfFan commented Jan 28, 2023

Closing, thank you @oicchris !

@ProfFan ProfFan closed this as completed Jan 28, 2023
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants