Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Connection to firestore is getting lost after sometime with a message about "invalid authentication credentials". #130

Closed
skywritergr opened this issue Jul 27, 2020 · 19 comments
Labels
api: firestore Issues related to the googleapis/python-firestore API. external This issue is blocked on a bug with the actual product. needs more info This issue needs more information from the customer to proceed. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@skywritergr
Copy link

Environment details

  • OS type and version: Ubuntu (inside a docker container)
  • Python version: 3.7
  • pip version: 20.0.2
  • google-cloud-firestore version: 1.6.2

Steps to reproduce

This is a very weird one to reproduce. What is happening is that I deploy my docker container to Google Cloud Run and for some time everything works fine. I do successfully connect to the Firestore and retrieve data. After some point though things stop working and I get a Exception: 401 Request had invalid authentication credentials.

So very broad steps would be

  1. Deploy a container in Google Cloud run
  2. Execute some requests to Firestore. They should be successful
  3. Wait some time (the container will go into hibernate as that point if you don't have enough traffic)
  4. Throw some more requests to Firestore. Watch it fail.

Code example

from firebase_admin import firestore, initialize_app, credentials

cred = credentials.ApplicationDefault()
initialize_app(cred, {
  'projectId': 'my-project-id-1234',
})

db = firestore.client()

def function_A:
 ...
 data = db.collection(u'data').where(u'title', u'==', name).stream()
 ...

def function_B:
 ....

Stack trace

Exception: 401 Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.

I have been trying to get to the bottom of this but I honestly don't know where the error is. Some assumptions:

  • It seems that after some time the auth credentials expire. So somehow I need to renew them, but I have no idea what's the best way to do this.
  • It could be the fact that when the container goes into hibernate the connection is lost, but when I tried something similar locally on docker by doing docker pause <container> the connection worked fine after "awakening" the container. But that doesn't mean that this is how GCR works.
  • I did try to initialize_app on every API request that uses that resource (essentially moving the code inside function_A) but that is obviously a hug hack and it makes the application complain that the app is already initialized.

I would really appreciate any pointers.

@product-auto-label product-auto-label bot added the api: firestore Issues related to the googleapis/python-firestore API. label Jul 27, 2020
@yoshi-automation yoshi-automation added triage me I really want to be triaged. 🚨 This issue needs some love. labels Jul 28, 2020
@crwilcox
Copy link
Contributor

Hi @skywritergr

When you say 'some time' before the credentials cease working, how long is that? Minutes, an hour, a day?

@crwilcox crwilcox added needs more info This issue needs more information from the customer to proceed. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Aug 20, 2020
@skywritergr
Copy link
Author

In the latest example I can see a request working as expected at: 2020-09-07 10:26:24.804 BST and then in the logs I can see the first batch of failures at 2020-09-07 17:13:07.161 BST. So I would say definitely hours.

@tseaver
Copy link
Contributor

tseaver commented Sep 16, 2020

@skywritergr Can you try updating google-cloud-firestore to version 1.9.0? Also, can you show the output of pip list in your environment?

@tseaver tseaver added priority: p2 Moderately-important priority. Fix may not be included in next release. and removed 🚨 This issue needs some love. triage me I really want to be triaged. labels Sep 16, 2020
@skywritergr
Copy link
Author

skywritergr commented Sep 16, 2020

Thanks for your reply @tseaver, I have now updated to version 1.9.0.

Here is what I get after doing pip3 list (for what is worth, and not sure if it changes anything, I use poetry to handle the dependencies of my project)

Package            Version   
------------------ ----------
astroid            2.3.3     
attrs              19.3.0    
autopep8           1.4.4     
CacheControl       0.12.6    
cachy              0.3.0     
certifi            2020.4.5.1
chardet            3.0.4     
cleo               0.7.6     
clikit             0.4.3     
docutils           0.16      
html5lib           1.0.1     
idna               2.9       
importlib-metadata 1.1.3     
isort              4.3.4     
jsonschema         3.2.0     
keyring            20.0.1    
lazy-object-proxy  1.4.3     
lockfile           0.12.2    
mccabe             0.6.1     
msgpack            1.0.0     
pastel             0.2.0     
pexpect            4.8.0     
pip                20.0.2    
pkginfo            1.5.0.1   
poetry             1.0.5     
ptyprocess         0.6.0     
pycodestyle        2.5.0     
pylev              1.3.0     
pylint             2.4.3     
pyparsing          2.4.7     
pyrsistent         0.14.11   
requests           2.23.0    
requests-toolbelt  0.8.0     
rstcheck           3.3.1     
setuptools         46.0.0    
shellingham        1.3.2     
six                1.13.0    
tomlkit            0.5.11    
typed-ast          1.4.0     
urllib3            1.25.9    
virtualenv         16.2.0    
webencodings       0.5.1     
wheel              0.34.2    
wrapt              1.11.2    
zipp               3.1.0 

@tseaver
Copy link
Contributor

tseaver commented Sep 21, 2020

Hmm, pip3 list doesn't show anything related to google-cloud-* at all. Can you show the equivalent under (maybe poetry export -f requirements.txt? I'm not familiar with it).

@skywritergr
Copy link
Author

skywritergr commented Sep 22, 2020

The pyproject.toml, which is the equivalent of the package.json for the frontend, has the following dependancies.

[tool.poetry.dependencies]
python = "^3.7"
tmdbsimple = "^2.2.0"
flask = "^1.1.2"
requests = "^2.23.0"
beautifulsoup4 = "^4.9.0"
flask_cors = "^3.0.8"
requests_cache = "^0.5.2"
simplejson = "^3.17.0"
firebase-admin = "^4.4.0"
uwsgi = "^2.0.18"

[tool.poetry.dev-dependencies]
pytest = "^4.6"
flake8 = "^3.7.9"
mypy = "^0.770"
black = "^19.10b0"

The requirements.txt exports gives me the following list: https://pastebin.com/hLCRTFpY

@skywritergr
Copy link
Author

I thought that the problem might have solved itself by updating the library. It was working fine for a few days now. Today it happened again. Same error as before.

What I was doing:

  • Started a local server on my machine, developing the frontend of my app
  • Hitting the backend that it's already deployed in Google Cloud Run and inside it it's hitting firestore.
  • Everything was working fine for about an hour.
  • After an hour or so I'm getting HTTP 401 again
Exception: 401 Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.

@zerotrickpony
Copy link

I also see this problem with the python client. I have several long-running download scripts, and they fail after one hour with this error. I assume that the token is expiring? I am going to try automatically expiring and re-creating the client every 30 minutes and see if that helps...

@paulharter
Copy link
Contributor

paulharter commented Dec 10, 2020

I have a wrapper I use around the Firestore client that automatically does the oauth refresh dance.

It does a bunch of other stuff you probably don't want/need but you can see the auth workflow here

https://github.com/paulharter/fam/blob/master/src/fam/database/firestore.py

@yoshi-automation yoshi-automation added the 🚨 This issue needs some love. label Jan 23, 2021
@crwilcox crwilcox added the external This issue is blocked on a bug with the actual product. label Mar 1, 2021
@yoshi-automation yoshi-automation removed the 🚨 This issue needs some love. label Mar 1, 2021
@crwilcox
Copy link
Contributor

crwilcox commented Mar 2, 2021

One thing stood out: Cloud Run has a max time of 3600 seconds (an hour), and @skywritergr you state it was fine for an hour. But I'd imagine the request would time out at that point. Or are the requests smaller requests over the time period? The sample included is a streaming call I assumed was very large.

Also, when docker pause was tried local, how long did you leave it paused?

@skywritergr
Copy link
Author

skywritergr commented Mar 11, 2021

I admit that I paused it for maybe a couple of minutes. Not longer than that. Also the request isn't very large at all. The data are actually the IMDB file document ingested into firebase. So the Python request is a streaming call into firebase that searches for a movie.

It's the dataset from here: https://www.imdb.com/interfaces/ (the ratings one I think)

@crwilcox
Copy link
Contributor

@skywritergr is that how long it is pausing in cloud run? Or is the pause there >60 minutes?

@skywritergr
Copy link
Author

Oh definitely more than 60 minutes on cloud run. This is a very low traffic website that might get 3-4 searches a day.

@crwilcox
Copy link
Contributor

I think auth needs to be refreshed every 60 minutes, and I believe what you are observing are the credentials used by the stream going stale.

@skywritergr
Copy link
Author

skywritergr commented Mar 11, 2021

I saw your message in the other issue. Is the working assumption that the token refresh is happening in the middle of a streaming event and that fails to complete? I can see it happening somewhere in the beginning.

  • A stream starts
  • Credentials are stale
  • Credentials need updating but stream is already underway.

@busunkim96
Copy link
Contributor

Hi @skywritergr if it's not too much trouble, could you enable gRPC logging and let us know what you see?

GRPC_TRACE=plugin_credentials
GRPC_VERBOSITY=DEBUG

https://github.com/grpc/grpc/blob/master/TROUBLESHOOTING.md
https://github.com/grpc/grpc/blob/master/doc/environment_variables.md

This was suggested by a gRPC engineer for tracking down the root cause: https://groups.google.com/g/grpc-io/c/5rOhCcqyHng

@busunkim96
Copy link
Contributor

busunkim96 commented Sep 7, 2021

Hi - could the original author (@skywritergr) or a repo maintainer look this over please? I am not very familiar with Firestore and may have ended up making some incorrect assumptions.

Re-reading the original post it sounds like each end user request results in a new, separate call to firestore. Is this correct, or does the .stream() call stay alive even when the container goes idle?

  • A user makes a request that results in function_a being triggered. function_a queries firestore for items that match query.
  • Container is put in "idle mode" as there are no requests. The client object and its credential is preserved even when the container goes idle.
  • After a few hours, a user makes another request that triggers function_a. SInce there's a pre-existing firestore client, the credentials are re-used. However, the (presumably) expired token is not refreshed pre-request and the request fails with 401.

If this sounds like the correct flow of events, this looks very similar to googleapis/python-api-core#223.

Notes for reproducing:

To make sure the container goes into "idle mode" instead of shutting down, min-instances must be at least 1. (see docs).

@tseaver
Copy link
Contributor

tseaver commented Nov 18, 2021

@skywritergr, a couple of observations.

  • Query.stream returns a generator, but does not make any API call until you begin iterating over the generated results. Your sketch above doesn't show any such iteration -- is it possible that the data in functionA above is being stashed somehow, without first iterating, and then the failure comes later (after token expiration)?

  • You haven't shown the actual "stack trace", but only the exception, so we can't see where in the Firestore code the exception is being raised.

  • The requirements.txt export link is expired. Could you upload it as a Github gist, instead? Those last basically forever.

@meredithslota
Copy link
Contributor

We don't have enough info to continue debugging this, so I'm going to close it out. If the original author or anyone else is seeing an issue like this, can you open a new issue with repro steps? Thank you!

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
api: firestore Issues related to the googleapis/python-firestore API. external This issue is blocked on a bug with the actual product. needs more info This issue needs more information from the customer to proceed. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
Development

No branches or pull requests

9 participants