This repository has been archived by the owner on Feb 15, 2022. It is now read-only.

Simulator hangs or exits abruptly (again) #2636

Open
mmdiego opened this issue Feb 7, 2021 · 8 comments

@mmdiego
Contributor

mmdiego commented Feb 7, 2021

I've tracked several issues related to this problem in the past, along with many attempts to fix it, but the fixes don't seem to work for everybody (including me).

There was a critical change in commit 0ef671e to fix this issue, but it seems it worked for some people and broke things for others.

So, before the modification I found these related issues: #1922, #1971, #1976, #1977
After the modification I found these issues: #1983, #2487, #2315, #2412

I will tag some people involved in this issue: @jorisw @dlasher @Wheaties466 @tenaciousd93
It would be great to find a solution that works for everybody

Right now, I'm using Node version 10.23 and Mongo version 3.2.11, and testing the latest version of the Zenbot unstable branch.
The command I'm using to test is:
node zenbot.js sim binance.BTC-USDT --strategy=noop --period=1m --days=7

For me, rolling back sim.js to the version before the mentioned commit fixes the problem.

How can we proceed?

@kennylbj

kennylbj commented Feb 8, 2021

Try using Node 8 or 14. In my case, Node 10 always makes the process hang.

@mmdiego
Contributor Author

mmdiego commented Feb 8, 2021

I've tried different versions and they behave differently, but none of them gave me consistent behaviour, so I don't think that's a fix for this problem. My idea is to find a way to fix this issue so that it works with any version, as that is what is stated in the requirements for Zenbot.

@DeviaVir DeviaVir added the bug label Feb 8, 2021
@jorisw
Contributor

jorisw commented Feb 9, 2021

I welcome anyone to try Zenbot paper or live mode, first against Binance, then against GDAX, and debug the code that was altered in the mentioned commit. The change was necessary for me to keep Zenbot running against Binance; unfortunately, it seems to have broken continuous operation against GDAX. I for one don't have time to debug it, but I am confident that the code can be made to work for all exchanges.

@mmdiego
Contributor Author

mmdiego commented Feb 9, 2021

But the problem I'm reporting has nothing to do with paper or live trading. It's in simulation, and the modification was inside sim.js, specifically in the getNext() function.
Also, I'm using Binance too, so I don't know if it's related to the exchange.
I've been debugging it and I've seen the same behavior described in #2487.

@mmdiego
Contributor Author

mmdiego commented Feb 9, 2021

I think this is harder than expected. I've been tracking other recent related changes and found these other issues:
#2412 : simulation not working
#2425 : fixed something related to lolex and mongodb that fixed simulation
#2600 : reverted back last change because sims results were incorrect

I think this is highly related to the simulator problem. Tracing back to when lolex was added (#1456), I found it was introduced to fix discrepancies between simulation results and live trading.

I agree the problem is being caused by lolex and its recent incompatibility with newer versions of mongodb, but it seems the fix proposed in #2425 wasn't fully correct, or something else still needs to be done.

@LuisAlejandro
Contributor

I think the "recent versions of mongo" issue should be addressed by pinning the mongo version to one that is well known to work. This should be done in the docker-compose files. For example, change:

mongodb:
    image: mongo:latest

to:

mongodb:
    image: mongo:4.4.2

@Makeshift

I'm also (still) getting this issue. It's definitely something to do with lolex <-> Mongo, but I'm having trouble narrowing it down. I can only get it to appear when I'm running a lot of sims at once (e.g. using the genetic backtester), which makes it difficult to get a debugger involved.

So far I've managed to narrow down exactly which settings are required for lolex to keep sims consistent:

// engine.js withOnPeriod function
    if (!clock && so.mode !== 'live' && so.mode !== 'paper') {
      clock = lolex.install({
        shouldAdvanceTime: false,
        now: trade.time,
        toFake: ['setTimeout', 'Date']
      });
    }

Not faking 'Date' was what broke sims in the previous PR (#2425). Unfortunately, the above doesn't actually fix the issue.
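For readers unfamiliar with fake timers, here is a minimal sketch of what faking 'setTimeout' with shouldAdvanceTime: false implies: a scheduled callback only runs when the fake clock is ticked explicitly. The createFakeClock helper is a hypothetical hand-rolled stand-in, not lolex's implementation or API, and nothing in this thread establishes that a never-ticked timer is the actual cause of the hang.

```javascript
// Hand-rolled stand-in for a fake clock. Hypothetical illustration only;
// this is not lolex's implementation or API.
function createFakeClock(startMs) {
  let now = startMs;
  let timers = [];
  return {
    now() { return now; },
    // Records the callback instead of scheduling it on the real event loop.
    setTimeout(fn, delayMs) { timers.push({ at: now + delayMs, fn }); },
    // Advances fake time and runs every callback that has become due.
    tick(ms) {
      now += ms;
      const due = timers.filter((t) => t.at <= now).sort((a, b) => a.at - b.at);
      timers = timers.filter((t) => t.at > now);
      due.forEach((t) => t.fn());
    },
    pendingCount() { return timers.length; },
  };
}

const clock = createFakeClock(1612656000000); // arbitrary start timestamp (ms)
let fired = false;
clock.setTimeout(() => { fired = true; }, 1000);
console.log(fired, clock.pendingCount()); // false 1 (nothing runs without a tick)
clock.tick(1000);
console.log(fired, clock.pendingCount()); // true 0
```

Any code that waits on a timer under such a clock makes no progress until something advances the clock, which is one reason interactions between fake timers and a database driver can be hard to reason about.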

Switching to a lower mongo version and changing Node versions didn't appear to help me, and neither did dropping the mongodb library version down to 3.6.1. I even tried 3.5.0, but that slowed down sims to the point of being unusable.

I've currently got a proof of concept replacing Mongo with Redis, so I'll see if this issue crops up there too.

@Makeshift

Makeshift commented Dec 12, 2021

After a whole night of messing with it, I'm no closer to fixing it, but I did implement a workaround to kill sims that seem to be stuck:

diff --git a/lib/backtester.js b/lib/backtester.js
index f225d30f..d1860983 100644
--- a/lib/backtester.js
+++ b/lib/backtester.js
@@ -228,6 +228,9 @@ let ensureDirectoryExistence = function (filePath) {
   fs.mkdirSync(dirname)
 }

+let simPercent = {};
+let simProcs = {};
+
 let monitor = {
   periodDurations: [],
   phenotypes: [],
@@ -366,6 +369,13 @@ let monitor = {
           slowestP = p
           slowestEta = eta
         }
+        if (!simPercent[c.iteration]) simPercent[c.iteration] = [];
+        simPercent[c.iteration].push(percentage)
+        let lastPercents = simPercent[c.iteration].slice(Math.max(simPercent[c.iteration].length - 30, 0));
+        if (lastPercents.length === 30 && new Set(lastPercents).size == 1 && simProcs[c.iteration]) {
+          console.log(`${c.iteration} is stuck at ${lastPercents[0]}, killing it off!`);
+          simProcs[c.iteration].kill('SIGKILL')
+        }

         if (homeStretchMode)
           inProgressStr.push(`${(c.iteration + ':').gray} ${(percentage * 100).toFixed(1)}% ETA: ${monitor.distanceOfTimeInWords(eta, now)}`)
@@ -543,6 +553,7 @@ module.exports = {
     var cmdArgs = command.commandString.split(' ')
     var cmdName = cmdArgs.shift()
     const proc = spawn(cmdName, cmdArgs)
+    simProcs[command.iteration] = proc;
     var endData = ''

     proc.on('exit', () => {

Absolutely not ideal, but it appears to happen to only about 2% of my sims, so I'll take it for now...

edit: Obviously this isn't useful if individual sims hang for you. It only helps when a subset of sims hang while you're backtesting.
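The stall check from the diff above can be isolated into a small, testable helper. This is only a sketch of the same idea (a sim counts as stuck when its last N reported percentages are identical); createStallDetector, record and forget are hypothetical names that do not exist in Zenbot, and the window size is a parameter so it can be tried with small values.

```javascript
// Hypothetical helper mirroring the "last 30 percentages are identical" check.
const STALL_WINDOW = 30;

function createStallDetector(windowSize = STALL_WINDOW) {
  const recentByIteration = new Map();
  return {
    // Records one progress sample; returns true when the sim looks stuck.
    record(iteration, percentage) {
      const recent = recentByIteration.get(iteration) || [];
      recent.push(percentage);
      // Keep only the newest samples so memory stays bounded.
      if (recent.length > windowSize) recent.shift();
      recentByIteration.set(iteration, recent);
      return recent.length === windowSize && new Set(recent).size === 1;
    },
    // Drops the history for a sim that has exited or been killed.
    forget(iteration) { recentByIteration.delete(iteration); },
  };
}

const detector = createStallDetector(3); // small window for demonstration
console.log(detector.record('sim-1', 0.42)); // false
console.log(detector.record('sim-1', 0.42)); // false
console.log(detector.record('sim-1', 0.42)); // true
console.log(detector.record('sim-1', 0.43)); // false
```

Unlike the diff, this variant keeps only the most recent samples per sim rather than the full history. Calling forget after killing a process would avoid reporting the same sim as stuck on every later poll.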


6 participants