
OSRM taking long time to fetch the duration matrix #6140

Closed · kk2491 opened this issue Oct 3, 2021 · 15 comments

kk2491 commented Oct 3, 2021

Hi All,

First of all, thanks for this amazing, well-maintained repository.

I noticed that in rare cases OSRM takes a huge amount of time to fetch the duration matrix. In my case, an 11 x 11 matrix took more than 400 seconds.
While going through the issues here, I noticed #6039, which looks similar to this one.
I tried to reproduce the issue by continuously querying OSRM in a loop, but was not successful.

Could you please confirm whether both are the same issue?
I also see that an MR with a fix was merged into master 3 days ago. Do we get this latest update when we use the Docker image for osrm-backend?

Thank you

hoerup (Contributor) commented Oct 3, 2021

Container images are only built on git tags, and there hasn't been a tag/release since #6113 was merged.

But if you are curious, it's quite easy to build the container image yourself.

@mjjbell could you tag v5.26.1-beta1 (or similar) in order to trigger a build?

kk2491 (Author) commented Oct 3, 2021

@hoerup Thanks for your swift response.
Yes, I have gone through issue #6134 and see that the steps are mentioned there. Is there any official documentation for the same?

Could you please also clarify my first question: is > 400 seconds for an 11 x 11 matrix possible? Does the Boost update fix this issue?

Thank you,
Kishor

mjjbell (Member) commented Oct 3, 2021

@kk2491 can you provide reproduction steps for your setup and the queries you are making? It's difficult to answer this otherwise.

kk2491 (Author) commented Oct 4, 2021

To give some background, we have an OSRM server configured on a GCP VM instance, and a Cloud Run service which queries the OSRM server for duration matrices at intervals of less than one second.

I tried reproducing the issue by sending parallel duration matrix queries to the OSRM server in infinite loops, and was partly successful in reproducing it.
For a cost matrix of size 11 x 11, the query usually completes in less than a second. However, there are instances where it takes more than 50 seconds. That is still far less than what we saw previously.

Thank you

nilsnolde (Contributor) commented:

For reproduction I think the following is crucial:

  • exact commands & region used to produce the dataset
  • the exact http query
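
A hedged sketch of capturing that exact HTTP query: the snippet below builds a /table request from plain lon,lat coordinates and prints the full URL so it can be shared verbatim. The host, port (5000 is the osrm-routed default), and coordinates are placeholders, not values from this report.

const axios = require("axios");

const OSRM_HOST = "http://OSRM_IP:5000"; // replace with your server address

// lon,lat pairs; these are hypothetical, not the coordinates from the report
const coords = [
  [13.38886, 52.517037],
  [13.397634, 52.529407],
  [13.428555, 52.523219],
];

const tableUrl =
  `${OSRM_HOST}/table/v1/driving/` +
  coords.map(([lon, lat]) => `${lon},${lat}`).join(";") +
  "?annotations=duration";

console.log("Exact HTTP query:", tableUrl);

axios.get(tableUrl).then((res) => {
  // res.data.durations is an NxN matrix of travel times in seconds
  console.log("Matrix size:", res.data.durations.length);
});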

kk2491 (Author) commented Oct 4, 2021

@nilsnolde @mjjbell I was able to reproduce the issue. I used 3 parallel functions continuously querying the OSRM duration matrix via API requests.

2 of them were querying a 10 x 10 matrix, and one of them was querying 222 x 222.

In most cases:

  • 2 x 2 was taking less than a second.
  • 222 x 222 was taking around 20 seconds.

However, there were rare cases where it was taking > 400 seconds.

Thank you

nilsnolde (Contributor) commented:

While it's nice that you could reproduce the issue, for someone to look into it you will need to enable them to reproduce your exact (more or less) setup and queries (commit used to build, commands & region, HTTP queries).

kk2491 (Author) commented Oct 4, 2021

@nilsnolde @mjjbell I have attached here the Node.js script that I used to test and reproduce the issue.

Please replace the IP address of the OSRM server. Also, out of all these iterations, I was able to find only 5 instances where OSRM took > 400 seconds for the 11 x 11 matrix.

const axios = require("axios");
const express = require("express");
const app = express();
const port = 2020;

// simple health-check endpoint to confirm the test server is running
app.get("/", (req, res) => {
  console.log("Home is working");
  res.send("Home! It is working! vro manager");
});

// starting the server
let server = app.listen(port, () => {
  console.log("listening on port " + port);
});

server.setTimeout(300000);

// Query 1: repeatedly requests the larger duration matrix (long polyline)
const RunOsrmQuery1 = async function (osrmQueryId) {
  let numRequests = 10000;
  let i = 0;

  while (i < numRequests) {
    // let googlePolylineUrl =
    //   "http://OSRM_IP/table/v1/driving/polyline(afbaGnxb~Nx%60C%7CjXvjKo~%7B@odLdy%60Axmq@_reCatp@hw%60C_fU%7CsHvqa@mseAcsEb~QmxHlsO_yHz%7BQ)?annotations=duration";

    let googlePolylineUrl =
      "http://OSRM_IP/table/v1/driving/polyline(e%7B%60aG~hd~NfnVchs@ajVntq@cu%5C~oDpla@ueNoQtdCyvQhzZvnMomUrxCulH_cBbsHuqPlbo@sz%5E%7B%7CwAb@jCf%7DPyte@p%7C%60@%7CtmAlkF_pNuoa@co~@%60eBjw~AfvRqpMffMsueAo%7D%60@dgtAj%7Ch@ciGqojAktaAk~Exb%5Elj%7B@vi%5Cotn@ojh@itGjpy@vul@cji@n%7BJdjX%7By%7C@yk%5CtjDygHzrr@v%60%60@%7CrBbvBffDkd%5E_aa@upr@d~VxvhAkoXaiq@bgGrj~@tfTyeBdsI%7B~KunHzvL%60%7BM%7Djg@%7CqS%7BjKkb_AyrGppGjmkAsqd@sBdox@mnO%7DDnr@ysb@sqYfti@%60e%5BygCf%7Ct@rlFuckAamCwvgAu%7B_Ark%60Cltz@otd@%60yt@uf_@onjB_hFlsz@t%7Dk@yiE_xBkAGaZiZjjs@kp~@qiq@jp%7C@k%7B%5CedqAru%5Bn_tAanAzyU_eQ%60jFxyTg%60%60@%7B%5Dhn@d%60Dku%5EwhWh~a@d~UgzDikF_f@yfLazKxcPrjOax@fCwjN%7BnOcnAowU%7CrRbcf@mdC%7B%60Cqrs@n%7BUbCeDb%7B%7C@qkmBpTwB%7B%7CHt%60wAzfSszHsqOxhNe~IwlJlaXwhz@nm@jkBm%7CLny%60Axg@ddNxoo@o%60mAaxBfeF~oE~hV%60%7D@%60aJqlrBbzy@p%60CmaNh%60%7B@sf_@xzIyzG%60lE%7CjYcfKbyMzpLaihA?aA%7D%7Cd@zahAucDpjEbxSq%7D%5CtlNieFq%7CLdq%60A~AtDij~@c%7CsAuiKabOtpKd~NVg@dsKmyKenGe%7CQhgpAtir@q%60XbaGemA_%7BIwjPvbZb_M_tc@%7B%7CUm%7B%7D@omBbfJbd_@jxhA%60kMirKwdj@qyfApCtzNfme@~uu@hyEffJo~dAsyx@cjFxwPrn_BkdI%60nz@ff%5Ce~xAigHdbIdyCy%7COh_tBj%7DJ%7BzLyq%60AcqpBbcPwwFvvFuvMynKjaMlku@le%5CsmA%7B%7Df@_yCpqq@_cJgrAavUfsLkbJhj_@nxg@i%7Bd@rob@qk@a%7BwAgde@js@jK%7Cr%5Bkmg@dtBkuFlcWhxlAmpId_Pf%7BS%7Dim@gD%7DnW%7D%7CMhd~@k%5Cp%5DtDiGu%7DHs%60W%7BmC_t%7B@d%60NbytAbAPenBwdDwDRylLmxAz~Sx~Bv%7BCntT%7Bpw@_a%7B@fqNeux@hu%5CzzyBsrLavJphOynKyFstE_i%5Bilt@zF~%7DT%60fK~auAssGnrRpzBs_c@ocKbgN%60~Zccw@jiM%7ChKqp%5B%60HsaI_lQnpb@hhSpfAq%7BAk%7CQleH%7ClMmhKicN_hIlhJx_JoAvtM%7C%7CSovNofTrnApmR__KgmK%7CrL~HvQmhTnfQgjj@wpDguNzbDn~q@esPlgFfFa~CtkKfmCatKdrSwkb@hTg~CbeCnlPboNe%7CBgdV%7C%7CUse_@%7BgnAxrWwzVzlDtsgBzzC~tByys@e%7Dk@zvJtlN%60ed@vbYibFe%7DLuz%5CoyKyHgm@p%7Cd@bvw@xmq@_reC)?annotations=duration";

    try {
      const queryStartTime = Date.now();
      await axios.get(googlePolylineUrl);
      const queryEndTime = Date.now();
      let timeTaken = (queryEndTime - queryStartTime) / 1000;
      console.log("RunOsrmQuery1 : Iteration : ", i, " Time taken : ", timeTaken);

      if (timeTaken > 30) {
        console.log("RunOsrmQuery1 : Taking longer time than usual");
      }
    } catch (err) {
      console.log("RunOsrmQuery1 Error : ", err);
    }
    i = i + 1;
    // await new Promise((resolve) => setTimeout(resolve, 500));
  }

  return;
};

// Query 2: repeatedly requests the smaller duration matrix (short polyline)
const RunOsrmQuery2 = async function (osrmQueryId) {
  let numRequests = 100000;
  let i = 0;

  while (i < numRequests) {
    let googlePolylineUrl =
      "http://OSRM_IP/table/v1/driving/polyline(afbaGnxb~Nx%60C%7CjXvjKo~%7B@odLdy%60Axmq@_reCatp@hw%60C_fU%7CsHvqa@mseAcsEb~QmxHlsO_yHz%7BQ)?annotations=duration";

    try {
      const queryStartTime = Date.now();
      await axios.get(googlePolylineUrl);
      const queryEndTime = Date.now();
      let timeTaken = (queryEndTime - queryStartTime) / 1000;

      if (timeTaken > 30) {
        console.log("RunOsrmQuery2 : Taking longer time than usual");
      }

      console.log("RunOsrmQuery2 : Iteration : ", i, " Time taken : ", timeTaken);
    } catch (err) {
      console.log("RunOsrmQuery2 Error : ", err);
    }
    i = i + 1;
    // await new Promise((resolve) => setTimeout(resolve, 500));
  }

  return;
};

// Query 3: same query as Query 2, run in parallel
const RunOsrmQuery3 = async function (osrmQueryId) {
  let numRequests = 100000;
  let i = 0;

  while (i < numRequests) {
    let googlePolylineUrl =
      "http://OSRM_IP/table/v1/driving/polyline(afbaGnxb~Nx%60C%7CjXvjKo~%7B@odLdy%60Axmq@_reCatp@hw%60C_fU%7CsHvqa@mseAcsEb~QmxHlsO_yHz%7BQ)?annotations=duration";

    try {
      const queryStartTime = Date.now();
      await axios.get(googlePolylineUrl);
      const queryEndTime = Date.now();
      let timeTaken = (queryEndTime - queryStartTime) / 1000;

      if (timeTaken > 30) {
        console.log("RunOsrmQuery3 : Taking longer time than usual");
      }

      console.log("RunOsrmQuery3 : Iteration : ", i, " Time taken : ", timeTaken);
    } catch (err) {
      console.log("RunOsrmQuery3 Error : ", err);
    }
    i = i + 1;
    // await new Promise((resolve) => setTimeout(resolve, 500));
  }

  return;
};

RunOsrmQuery1("query-1");

RunOsrmQuery2("query-2");

RunOsrmQuery3("query-3");

kk2491 (Author) commented Oct 5, 2021

@nilsnolde @mjjbell Good day. Did you guys get a chance to have a look at this?

mjjbell (Member) commented Oct 6, 2021

Check the access logs for your OSRM instance. It will output a request time for each query. See if that matches with the durations your script is showing. If not, it could be an issue with the network setup.

And yes, if you want further help, you need to provide details of your OSRM setup so that we can try to reproduce the behaviour:

  • OSRM version
  • OSM dataset used
  • Preprocessing commands (osrm-extract, osrm-contract, etc.)
  • Server spec (RAM, CPUs, etc)

These are all contributory factors to request time, so it's not possible to debug without them.
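
To make it easier to cross-reference client-side timings with the per-query request times in the access log mentioned above, the test script could print a wall-clock timestamp next to each slow request. A minimal sketch, reusing the axios dependency and the 30-second threshold from the script above:

const timedTableQuery = async function (label, url) {
  const startedAt = new Date(); // wall-clock time, for matching against the OSRM access log
  const t0 = Date.now();
  await axios.get(url);
  const seconds = (Date.now() - t0) / 1000;
  if (seconds > 30) {
    console.log(`${label} SLOW at ${startedAt.toISOString()} took ${seconds}s: ${url}`);
  }
  return seconds;
};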

kk2491 (Author) commented Oct 7, 2021

@mjjbell Thanks for the details. I will provide the information requested above soon.

kk2491 (Author) commented Oct 12, 2021

@mjjbell Apologies for the delayed response; please find the information below.

  • OSRM version: 5.25.0 (we are using the Docker container; I assume the image ships with version 5.25.0, correct me if I am wrong here)
  • OSM dataset used: Planet
  • Pre-processing commands: osrm-extract (we are using the documentation from here)
  • Server spec (RAM, CPUs, etc.): 40 vCPUs, 961 GB memory (running on GCP)

It would be great if you could help us pinpoint the issue. Thanks in advance.

Thank you

danpat (Member) commented Oct 12, 2021

@kk2491 Are you able to watch the CPU usage inside the container running osrm-routed during the wait? Is there a core pinned to 100% usage while it works, or is everything idle?

kk2491 (Author) commented Oct 12, 2021

@danpat That's the part we are unable to do.
The issue is so random and rare that it is difficult to keep an eye on per-core utilization.

github-actions bot commented Jul 8, 2024

This issue seems to be stale. It will be closed in 30 days if no further activity occurs.

github-actions bot added the Stale label on Jul 8, 2024
github-actions bot closed this as not planned on Aug 8, 2024