load_table_from_dataframe support JSON column dtype #1966
I have also tried uploading the data as dtype STRING in hopes that it would be converted to JSON server-side, but that also results in an error.
I was able to upload my data with the following code, but JSON support should be added for pandas dataframes. Code:
Thanks @jlynchMicron for providing a workaround. I think there are a few problems we'll need to work through, one of which is that the BigQuery backend doesn't support JSON in load jobs from Parquet files: https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-parquet#type_conversions

Please file a Feature Request at https://cloud.google.com/support/docs/issue-trackers (specifically, "Create new BigQuery issue").

Another possible workaround is to use CSV as the source format. In your example
For Googlers watching this issue: I have proposed a design (go/bf-json) for a JSONDtype in https://github.com/googleapis/python-db-dtypes-pandas, which would allow us to autodetect when JSON is used in a DataFrame. Before that, though, we could choose the appropriate serialization format based on the provided schema. For example, Parquet must be used with STRUCT/ARRAY columns, but CSV must be used for JSON.
googleapis/python-db-dtypes-pandas#284 can potentially fulfill this feature request.
Thanks for stopping by to let us know something could be better!
PLEASE READ: If you have a support contract with Google, please create an issue in the support console instead of filing on GitHub. This will ensure a timely response.
Is your feature request related to a problem? Please describe.
python-bigquery does not currently appear to support uploading DataFrames when one of the columns in the destination table has JSON dtype.
Quick partial code example:
Result:
google.api_core.exceptions.BadRequest: 400 Unsupported field type: JSON; reason: invalid, message: Unsupported field type: JSON
Describe the solution you'd like
Implement support for loading pandas DataFrames that contain JSON columns to BigQuery.
Additional context
Related issues:
googleapis/python-bigquery-sqlalchemy#399
googleapis/python-bigquery-pandas#698
googleapis/python-bigquery-dataframes#816