-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Readiness reported as success when FeatureHub is down #160
Comments
HI Amanda, questions like these are good because then I can add extra documentation to the "How the SDK works" docs :-) The docs for repository readiness are here: https://docs.featurehub.io/featurehub/1.7.0/sdks-development.html#_repository_readiness Because the repository is a cache, which depending on how the connector connects and updates - may be out of date for a period of time, its important that it stays Ready once it has received its first set of features. So once it moves from Not Ready -> Ready, it shouldn't ever (except in one circumstance) move back to Not Ready. It would only change to 'Failed". The situation whereby it will return to not ready is when you are using a server evaluated key and you change the context and "build" - then the repository does not reflect in any real sense the right feature values, so it swaps back to NotReady. Its quite important particularly for node servers not to stop being ready otherwise if your FeatureHub install goes down, then all your server apps would go down as well, and if your FeatureHub install is down, its not updating any features so its not really a problem there is no connection. |
Thank you for your explanation, and quick response. Sorry I missed that in the docs. I misunderstood from reading the SDK README (https://github.com/featurehub-io/featurehub-javascript-sdk#what-is-the-deal-with-readyness), so a link from the README to the page you suggested above might make it clearer for others to avoid confusion. |
If I exposed the Edge connector as a queryable status for a prometheus metric, would that be more useful? Or should we do both? I am thinking of looking for an env var that would trigger logging connection state? |
Oh, and by the way, you didn't miss it in the docs :-) I wrote it to clarify how the readiness works essentially as a FAQ and for any new SDKs, we keep adding to the SDKs docs as we build new SDKs. Would maybe a different FHLog level for Edge connector issues be more suitable then you can control it? Suggestions are welcome! |
Great addition to the docs then -> it's nice and clear! What we've ended up doing is having to pick out the logs whereby we see "refreshing connection in case of staleness" AND it has error.message - as that seems to be the case whereby we've actually got a problem connecting, compared to the "refreshing connection in case of staleness" which just has {"type":"error"} as we didn't want to also log all the standard refresh connection messages. We ended up, deciding logging was enough - so some easier way of picking up the logs would make it easier, so if those were at a new log-level that would make it easier. Ideally if it was easier to find out if we were recovering a previously failed connection, compared to a just refresh from staleness would be great. But I'm not sure if actually there really isn't any difference, as you only notice the failure because you are refreshing the connection - so essentially the logic really is the same in both cases. |
Ok, I'll have a think about how I can better surface this as SSE is often quite tricky to ascertain if you got an actual error or the connection dropped because of staleness as you say. Maybe a threshold of failure? like if you get a reconnect failure X times in a row we should start logging it? |
Hey, I ran a local node.js app that successfully connected to FH server, and then I just disconnected my LAN cable to see if some kind of error will appear, but didn't get anything in the logs nor did the readiness status change. I think it's important because the FH server might still be up&running allowing users to still change flags but the app itself doesn't get updates. |
Yes, it is probably worthwhile to add a callback when there is any kind of connectivity issue. There are some changes coming down the line around how SSE and polling are handled as well that have been trialled in a couple of other SDKs so we will be sure to add this capability in. |
Describe the bug
If my client code successfully manages to connect to FeatureHub at startup, then readiness is reported as Ready. I have added both a timer to regular check readiness, and also registered a ReadinessListener. However, if I stop featurehub then the client still always report that the readyness is true.
Which area does this issue belong to?
To Reproduce
Steps to reproduce the behavior:
The readyness listener has a simple log statement to report the readyness, eg..
console.log('Raw listener readyness is now ' + readiness);
The timer also reports readyness, and simply has:
Whereby the VERBOSE log is from the fhLog trace, and the readynesslistener is reporting the NotReady and Ready, and the timer then goes off and sees ready.
It never seems to discover that FeatureHub is down, as FeatureHub can be left down for a long time, and still we see this cycle.
Expected behavior
When FeatureHub is not running then we expect that the readinesslistener is called with NotReady, and that the readiness flag if called would report true.
Screenshots
If applicable, add screenshots to help explain your problem.
Versions
You can get the version of the FeatureHub container by running
docker ps
command.Please include the OS and what version of the OS and Docker you're running.
Additional context
Putting in some extra debug then there is different detailed in the
refreshing connection in case of staleness
in the case of when featurehub is down.When it is the standard refresh then the args on the refresh are:
but when featurehub is down we see:
where featurehub is the name of the dockercontainer that I was running the featurehub party/server in.
The text was updated successfully, but these errors were encountered: