Frequent crashes loading Hasktorch #1465

Closed
tscholak opened this issue Mar 1, 2021 · 32 comments
Labels
can-workaround status: blocked Not actionable, because blocked by upstream/GHC etc. type: bug Something isn't right: doesn't work as intended, documentation is missing/outdated, etc..

Comments

@tscholak
Contributor

tscholak commented Mar 1, 2021

This problem is new for me. I just finished updating my Hasktorch dev environment by bumping the version of ghc to 8.10.4. Previously I was using hls 0.7 with Hasktorch; now I've tried both 0.9 and 1.0, and both are suddenly very unstable.

Your environment

Output of haskell-language-server --probe-tools or haskell-language-server-wrapper --probe-tools:

haskell-language-server version: 1.0.0.0 (GHC: 8.10.4) (PATH: /nix/store/wsrhqfn07rif6n27qglc3p68kjaflyax-haskell-language-server-exe-haskell-language-server-1.0.0.0/bin/haskell-language-server)
Tool versions found on the $PATH
cabal:		3.2.0.0
stack:		2.5.1
ghc:		8.10.4

Which lsp-client do you use:
vs code

Describe your project (alternative: link to the project):
https://github.com/hasktorch/hasktorch/blob/master/cabal.project

Contents of hie.yaml:
https://github.com/hasktorch/hasktorch/blob/master/hie.yaml

Steps to reproduce

start vscode from nix shell, work for a few seconds on random things, crash occurs

Expected behaviour

no crash

Actual behaviour

crash

Include debug information

Execute in the root of your project the command haskell-language-server --debug . and paste the logs here:

Debug output:

the full output is too long to post here. it finishes after a while with:

Completed (36 files worked, 298 files failed)

Paste the logs from the lsp-client, e.g. for VS Code

LSP logs:

likewise, this log is too long. I'll upload a file...

@konn
Collaborator

konn commented Mar 1, 2021

I encountered this issue several times in a closed in-house codebase (without deps on Hasktorch), although I couldn't make a minimal repro.
Judging from circumstantial evidence in our project, it seems that modules with type-checker plugins can make HLS 1.0 crash, but not always. There is no concrete evidence though.

EDIT: We use ghc-typelits-natnormalise, ghc-typelits-presburger-knownnat and ghc-typelits-presburger in our code base, where the first two are common with hasktorch. Our crashing modules use -natnormalise and -presburger simultaneously.

@konn konn added the type: bug Something isn't right: doesn't work as intended, documentation is missing/outdated, etc.. label Mar 1, 2021
@jneira jneira changed the title frequent crashes Frequent crashes loading Hasktorch Mar 1, 2021
@tscholak
Contributor Author

tscholak commented Mar 1, 2021

Thanks for these pointers.
It would be a surprise to me, though. Have hls 0.9 and 1.0 become less stable in the presence of these plugins? I'm a longtime user of hls with hasktorch, and with 0.7 and ghc 8.10.3 I did not have such severe stability problems.

@wz1000
Collaborator

wz1000 commented Mar 1, 2021

GHC 8.10.4 had some changes to the linker which might be responsible. Maybe you could try going back to 8.10.3 with HLS 1.0?

@tscholak
Contributor Author

tscholak commented Mar 1, 2021

let me try that @wz1000!

@wz1000
Collaborator

wz1000 commented Mar 1, 2021

It seems like it's segfaulting with GHC 8.10.4, but I can't manage to get a proper backtrace.

@tscholak
Contributor Author

tscholak commented Mar 1, 2021

at least I know now that I'm not crazy and alone with this issue. did you use the nix environment? stack? cabal?

@wz1000
Collaborator

wz1000 commented Mar 1, 2021

I used cabal.

The segfault happens non-deterministically after editing for a few minutes. I can't figure out how to consistently reproduce it. My guess is that some code is not exception safe, probably in a plugin. ghcide uses async exceptions to cancel old compiles when the file is modified.

@tscholak
Contributor Author

tscholak commented Mar 1, 2021

type checker plugin or hls plugin?

@wz1000
Collaborator

wz1000 commented Mar 1, 2021

I meant type checker plugin, but I'm doubting this hypothesis since an uninterruptibleMask_ around the typechecking code didn't seem to help.

The problem is not with a HLS plugin because I can reproduce it with plain ghcide.

On my last attempt, ghcide printed out ghcide: munmap: Invalid argument before dying.

@wz1000
Collaborator

wz1000 commented Mar 1, 2021

I think the root cause is this ghc issue: https://gitlab.haskell.org/ghc/ghc/-/issues/19417

@wz1000
Collaborator

wz1000 commented Mar 1, 2021

With the debug RTS, ghcide crashes with ghcide: internal error: ASSERTION FAILED: file rts/CheckUnload.c, line 457

@wz1000
Collaborator

wz1000 commented Mar 1, 2021

Workaround for now would be to stick with GHC 8.10.2 or earlier. (8.10.3 may or may not work)

@tscholak
Contributor Author

tscholak commented Mar 1, 2021

no luck with 8.10.3, it still crashes over and over again :(

@tscholak
Contributor Author

tscholak commented Mar 1, 2021

hls 1.0 seems to be stable with 8.10.2 so far
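
The version reports up to this point in the thread (crashes with 8.10.3 and 8.10.4, stable so far with 8.10.2) can be turned into a small guard for a dev-environment script. This is only a sketch: the function name `is_reported_unstable` is made up for illustration, and the version list reflects nothing more than the reports above.

```shell
#!/usr/bin/env sh
# Hypothetical helper: succeeds (exit status 0) when the given GHC version is
# one that this thread reports as crashing HLS/ghcide.
# The list is an assumption taken from the comments above, not an official one.
is_reported_unstable() {
  case "$1" in
    8.10.3|8.10.4) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage (requires ghc on the PATH):
#   if is_reported_unstable "$(ghc --numeric-version)"; then
#     echo "this GHC is reported to crash HLS; consider 8.10.2" >&2
#   fi
```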

@konn
Collaborator

konn commented Mar 2, 2021

Sorry for the late reply, and thank you for the detailed survey @wz1000!

> The segfault happens non-deterministically after editing for a few minutes.

Hmm. In my environment, it occurs WITHOUT any editing, but just by opening the module alone.

> Workaround for now would be to stick GHC 8.10.2 or earlier. (8.10.3 may or may not work)

I tried with GHC 8.10.2, but unfortunately it still continues to crash... Perhaps my problem deserves a separate issue?

We are using stack.

@konn
Collaborator

konn commented Mar 2, 2021

How can we use the debug RTS to debug on our codebase?

@wz1000
Collaborator

wz1000 commented Mar 2, 2021

@konn you just need to compile your executable (ghcide or HLS) with ghc-options: -debug
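
For a cabal-based build, one way to pass that option is a `package` stanza in `cabal.project.local`. The sketch below assumes the executable comes from a package named `ghcide` and that the project is built with cabal; the helper name `enable_debug_rts` is hypothetical.

```shell
#!/usr/bin/env sh
# Hypothetical helper: append a stanza to <project-dir>/cabal.project.local so
# that the ghcide package is built with -debug (link against the debug RTS).
# The package name "ghcide" is an assumption; use haskell-language-server
# instead if that is the executable you are debugging.
enable_debug_rts() {
  project_dir="$1"
  cat >> "$project_dir/cabal.project.local" <<'EOF'

package ghcide
  ghc-options: -debug
EOF
}

# Example usage, from the checkout you build ghcide from:
#   enable_debug_rts .
#   cabal build exe:ghcide
```

Keeping the flag in `cabal.project.local` rather than in the package's own .cabal file means it is easy to remove again once debugging is done.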

@konn
Collaborator

konn commented Mar 3, 2021

Hmm, I compiled ghcide with -debug and just ran it without any RTS options, but no trace is given before the SEGV.
It seems there are several debug flags of the form -D<x>. Which flag was used to get ghcide: internal error: ASSERTION FAILED: file rts/CheckUnload.c?

@konn
Collaborator

konn commented Mar 3, 2021

I tried ghcide typecheck path/to/module +RTS -Da as a guess, and also got the same ghcide: internal error: ASSERTION FAILED: file rts/CheckUnload.c, line 457 with 8.10.4. I'm also trying with 8.10.2 now.
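
When comparing several GHC versions or platforms, it can help to capture the output of each run and check it for the assertion message mechanically. A minimal sketch, assuming the run's output has been redirected to a log file; the helper name `has_rts_assertion` and the log file name are hypothetical.

```shell
#!/usr/bin/env sh
# Hypothetical helper: succeeds (exit status 0) when the given log file
# contains an RTS assertion failure such as the CheckUnload.c one quoted above.
has_rts_assertion() {
  grep -q 'internal error: ASSERTION FAILED' "$1"
}

# Example usage (the ghcide invocation is the one from the comment above):
#   ghcide typecheck path/to/module +RTS -Da > ghcide.log 2>&1
#   if has_rts_assertion ghcide.log; then
#     echo "RTS assertion failure found"
#   else
#     echo "no RTS assertion in the log"
#   fi
```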

@konn
Collaborator

konn commented Mar 3, 2021

Hmm... It seems that on Linux, ghcide fails with 8.10.4 with error ghcide: internal error: ASSERTION FAILED: file rts/CheckUnload.c, line 457 but it succeeds with 8.10.2.

On the other hand, on macOS, it SEGVs also with 8.10.2. Perhaps I encountered the same issue as this issue AND another platform-dependent issue.

@konn
Collaborator

konn commented Mar 3, 2021

I'll also try with Hasktorch code base on my macOS laptop.

@konn
Collaborator

konn commented Mar 3, 2021

Tested against our codebase, it seems HLS + 8.10.2 suddenly SEGVs on macOS without any debug trace even with +RTS -Da.

@konn
Collaborator

konn commented Mar 3, 2021

Agh, it seems Hasktorch won't compile in my environment, due to C errors in inline-c-cpp. I will try to use HLS as a test case instead, as suggested in the GitLab issue.

@tscholak
Contributor Author

tscholak commented Mar 3, 2021

@konn I can help you get through these pesky C errors.
Alternatively, try the nix shell environment Hasktorch ships with. To try ghc 8.10.2 with hls 1.0, set the compiler version to "ghc8102" here (https://github.com/hasktorch/hasktorch/blob/master/nix/haskell.nix#L8) and launch vscode from within the nix-shell:

$ nix-shell
[nix-shell:~/hasktorch] code .

vs code will then use the hls that the nix shell provides (which is 1.0)
(edit: to speed up compilation, use hasktorch's cachix: cachix use hasktorch)

@tscholak
Contributor Author

tscholak commented Mar 3, 2021

update: still no stability issues with hls 1.0 and ghc 8.10.2 on hasktorch and macOS.
Of course I'd prefer to use 8.10.4, but at least I'm productive again.

@tscholak
Contributor Author

tscholak commented Mar 3, 2021

@wz1000 do you know for sure now that this is what needs to be fixed? #1465 (comment)

there is a wip fix here, https://gitlab.haskell.org/ghc/ghc/-/merge_requests/5128/diffs, but I'm not sure when/if this will be released. is this going into 8.10.5?

@wz1000
Collaborator

wz1000 commented Mar 3, 2021

Yes, the fix should go in 8.10.5

@konn
Collaborator

konn commented Mar 8, 2021

@tscholak Thanks, and sorry for the late reply.
I followed the macos+stack+cpu instructions, but I added compiler: ghc-8.10.2 to stack.yaml to force stack to use GHC 8.10.2.
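
For anyone repeating this, the stack.yaml change could be scripted roughly as follows. This is a sketch only: the helper name `pin_stack_compiler` is hypothetical, and it refuses to touch a file that already has a `compiler:` line rather than trying to edit it.

```shell
#!/usr/bin/env sh
# Hypothetical helper: append "compiler: ghc-<version>" to a stack.yaml file,
# unless the file already sets a compiler.
pin_stack_compiler() {
  stack_yaml="$1"
  ghc_version="$2"
  if grep -q '^compiler:' "$stack_yaml"; then
    echo "compiler already set in $stack_yaml" >&2
    return 1
  fi
  printf '\ncompiler: ghc-%s\n' "$ghc_version" >> "$stack_yaml"
}

# Example usage, from the project root:
#   pin_stack_compiler stack.yaml 8.10.2
#   stack build inline-c-cpp
```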

Then stack build inline-c-cpp fails with the following C-error:

$ stack build inline-c-cpp
inline-c-cpp> configure
inline-c-cpp> Configuring inline-c-cpp-0.4.0.3...
inline-c-cpp> build
inline-c-cpp> Preprocessing library for inline-c-cpp-0.4.0.3..
inline-c-cpp> Building library for inline-c-cpp-0.4.0.3..
inline-c-cpp> [1 of 2] Compiling Language.C.Inline.Cpp
inline-c-cpp> 
inline-c-cpp> /private/var/folders/pv/mtbzyjyj229g928n710c9d_40000gn/T/stack-d1fe17bdc8914168/inline-c-cpp-0.4.0.3/src/Language/C/Inline/Cpp.hs:11:1: warning: [-Wunused-imports]
inline-c-cpp>     The import of ‘Data.Monoid’ is redundant
inline-c-cpp>       except perhaps to import instances from ‘Data.Monoid’
inline-c-cpp>     To import instances alone, use: import Data.Monoid()
inline-c-cpp>    |
inline-c-cpp> 11 | import           Data.Monoid ((<>), mempty)
inline-c-cpp>    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
inline-c-cpp> [2 of 2] Compiling Language.C.Inline.Cpp.Exceptions
inline-c-cpp> 
inline-c-cpp> /private/var/folders/pv/mtbzyjyj229g928n710c9d_40000gn/T/stack-d1fe17bdc8914168/inline-c-cpp-0.4.0.3/In file included from cxx-src/HaskellException.cxx:1:0: error: 
inline-c-cpp> 
inline-c-cpp> /private/var/folders/pv/mtbzyjyj229g928n710c9d_40000gn/T/stack-d1fe17bdc8914168/inline-c-cpp-0.4.0.3/include/HaskellException.hxx:26:23: error:
inline-c-cpp>      error: exception specification of overriding function is more lax than base version
inline-c-cpp>       virtual const char* what() const noexcept override;
inline-c-cpp>                           ^
inline-c-cpp>    |
inline-c-cpp> 26 |   virtual const char* what() const noexcept override;
inline-c-cpp>    |                       ^
inline-c-cpp> 
inline-c-cpp> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/exception:102:25: error:
inline-c-cpp>      note: overridden virtual function is here
inline-c-cpp>         virtual const char* what() const _NOEXCEPT;
inline-c-cpp>                             ^
inline-c-cpp>     |
inline-c-cpp> 102 |     virtual const char* what() const _NOEXCEPT;
inline-c-cpp>     |                         ^
inline-c-cpp> 
inline-c-cpp> /private/var/folders/pv/mtbzyjyj229g928n710c9d_40000gn/T/stack-d1fe17bdc8914168/inline-c-cpp-0.4.0.3/In file included from cxx-src/HaskellException.cxx:1:0: error: 
inline-c-cpp> 
inline-c-cpp> /private/var/folders/pv/mtbzyjyj229g928n710c9d_40000gn/T/stack-d1fe17bdc8914168/inline-c-cpp-0.4.0.3/include/HaskellException.hxx:26:35: error:
inline-c-cpp>      error: expected ';' at end of declaration list
inline-c-cpp>       virtual const char* what() const noexcept override;
inline-c-cpp>                                       ^
inline-c-cpp>                                       ;
inline-c-cpp>    |
inline-c-cpp> 26 |   virtual const char* what() const noexcept override;
inline-c-cpp>    |                                   ^
inline-c-cpp> 
inline-c-cpp> /private/var/folders/pv/mtbzyjyj229g928n710c9d_40000gn/T/stack-d1fe17bdc8914168/inline-c-cpp-0.4.0.3/include/HaskellException.hxx:19:7: error:
inline-c-cpp>      error: exception specification of overriding function is more lax than base version
inline-c-cpp>    |
inline-c-cpp> 19 | class HaskellException : public std::exception {
inline-c-cpp>    |       ^
inline-c-cpp> class HaskellException : public std::exception {
inline-c-cpp>       ^
inline-c-cpp> 
inline-c-cpp> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/exception:101:13: error:
inline-c-cpp>      note: overridden virtual function is here
inline-c-cpp>         virtual ~exception() _NOEXCEPT;
inline-c-cpp>                 ^
inline-c-cpp>     |
inline-c-cpp> 101 |     virtual ~exception() _NOEXCEPT;
inline-c-cpp>     |             ^
inline-c-cpp> 
inline-c-cpp> /private/var/folders/pv/mtbzyjyj229g928n710c9d_40000gn/T/stack-d1fe17bdc8914168/inline-c-cpp-0.4.0.3/cxx-src/HaskellException.cxx:31:44: error:
inline-c-cpp>      error: expected function body after function declarator
inline-c-cpp>    |
inline-c-cpp> 31 | const char* HaskellException::what() const noexcept {
inline-c-cpp>    |                                            ^
inline-c-cpp> const char* HaskellException::what() const noexcept {
inline-c-cpp>                                            ^
inline-c-cpp> 4 errors generated.

Environment: macOS Catalina 10.15.7

@tscholak
Contributor Author

@konn sorry I never followed up with you. It's not as easy to diagnose as I thought. Stack pulls its dependencies from Stackage independently of nix. It's possible that the version Stack uses is not working with ghc 8.10.2.

For those using Hasktorch with Cabal and nix, I can confirm that input-output-hk/haskell.nix@1abbd16 has resolved the issue originally described in this issue. I can use ghc 8.10.4 with hls again without problem. The necessary updates to hasktorch are in this branch, https://github.com/hasktorch/hasktorch/tree/mcwitt-update-deps, and I hope we can merge them soon as part of hasktorch/hasktorch#533.

@jneira jneira added status: blocked Not actionable, because blocked by upstream/GHC etc. can-workaround labels Mar 24, 2021
@jneira
Member

jneira commented Mar 24, 2021

Thanks for updating the issue with the workaround for 8.10.4 until ghc-8.10.5 is released.
I would keep this open until then.

@Anton-Latukha
Collaborator

Anton-Latukha commented Dec 25, 2021

All prerequisites in this thread seem to be resolved (GHC reports, PRs (even the backports are done) and releases made).

So it seems this no longer depends on upstream.

Does it still apply, or can it be closed?

@tscholak
Contributor Author

Hi @Anton-Latukha, thanks 🙏 this can be closed 😀
