Error while parsing queries from json file #57
Comments
I think changing from parse(query) to parse(query['query'],query['name']) might work?
|
Looks like you are using the older version of the notebook but a newer version of the json file. Your change should work though. |
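The mismatch above can be reproduced without the library. The sketch below is illustrative only: `parse_stub` is a hypothetical stand-in that mirrors nothing more than the `name = str(hash(sql))` fallback shown in the traceback, and `normalize` is a hypothetical helper that accepts either a bare SQL string or an entry with the `query` and `name` keys mentioned in the suggested change. The sample SQL and names are made up.

```python
import json


def parse_stub(sql, name=None):
    """Hypothetical stand-in for the parser entry point.

    It only mirrors the `name = str(hash(sql))` fallback from the traceback;
    it does not parse anything.
    """
    if name is None:
        name = str(hash(sql))  # raises TypeError when sql is a dict
    return {"name": name, "sql": sql}


def normalize(entry):
    """Return (sql, name) for either a bare SQL string or a {"name", "query"} dict."""
    if isinstance(entry, str):
        return entry, None
    return entry["query"], entry.get("name")


# Made-up sample covering both shapes of entry.
queries = json.loads(
    '[{"name": "load_orders", "query": "INSERT INTO a SELECT * FROM b"}, "SELECT 1"]'
)

parsed = [parse_stub(*normalize(q)) for q in queries]
```

Passing the whole dict, as in `parse_stub(queries[0])`, raises `TypeError: unhashable type: 'dict'`, which matches the reported error. Unpacking the entry first avoids it.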
The new version of the notebook seems to take credentials and build the catalog on the fly. I wanted to use the catalog generated by dbcat. Please correct me if I'm wrong. |
I haven't removed any of the APIs. You can continue to use the example you are comfortable with. I intend to keep them unchanged and support their usage. Refer to
For official APIs without using a server, REST and sdk. |
I tried the latest docker file. When I executed the sample notebook, it gave me the following error:
On a side note, I'm planning to do the server implementation without Docker, but I thought I'd first see whether I can get it working with the docker file. |
Ugh! psycopg2 packaging is giving me problems. Can you please download a previous version of the container and try? |
I tried the updated docker file, receiving the following error when I execute the sample notebook:
Looks like one of the sql statements is looking for a missing column called source_id to compare against parameters:
|
Can you recreate the database and recreate it with the latest dbcat? Looks like the database has an older version of the schema. |
I recreated the db within the container through the CLI. I was able to catalog, but query parsing seems to be failing:
Code Executed:
|
Can you paste the output of 441 Error code is Most probably the catalog does not have information about
|
Thanks for all your support. You're right: the catalog is not being loaded through docker (even though catalog.scan_source executes successfully). The catalog tables were empty and only the connection details were added to the source table. Any suggestions on how to point the catalog to localhost, or to have the catalog loaded in docker? I tried multiple solutions for the former (editing the docker compose file and the data lineage engine file), but nothing worked. Specifically, to connect to the local catalog I tried setting CATALOG_HOST to localhost and to host.docker.internal, modified postgres_conf.sql to listen on all IP addresses, and modified pg_hba.conf to include all IP addresses. Container log:
Output of catalog.get_table:
|
On a lighter note, I did a Google search for 441; I should have looked through the code :) |
If you want to use an external Postgres database, replace the following parameters in
|
This was my first approach, but it wasn't working. Here are my observations:
Values provided for External Catalog:
|
Ref: https://dev.to/natterstefan/docker-tip-how-to-get-host-s-ip-address-inside-a-docker-container-5anh Can you change to:
|
Thanks. I primarily use a Mac and have already tried host.docker.internal. I also tried playing around with the networks listed in the compose file (tokern-net and tokern-internal), but nothing helped. Also, changing the catalog_host value in tokern_lineage_engine.yml doesn't seem to have any effect; it looks like only parameters from the compose file (docker-compose.yml) are considered. I'll keep trying and post an update if I have any success. |
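When it is unclear whether a container can reach the catalog host at all, a plain TCP check can separate networking problems from configuration problems. This is a minimal sketch, not part of the project: `can_reach` is a hypothetical helper, and it only confirms that something accepts connections on the port, not that Postgres authentication (pg_hba.conf) will succeed.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run from inside the container, a call such as `can_reach("host.docker.internal", 5432)` (5432 being the default Postgres port, assumed here) would show whether the host name resolves and the port is open from the container's point of view.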
There were bugs in both the docker-compose files and in command line parsing. I've fixed them and there should be a new release published in pypi and docker hub in a few minutes. |
Thanks @vrajat. I just tried the latest file. The app was failing after I pulled the latest docker file with the additional parameters given in the instructions. It worked when I manually replaced the catalog_host parameter in tokern_lineage_engine.yml and added "tokern-net" to the list of networks for tokern-api. Couple of issues:
|
Can you paste the output of 'docker container logs ' ? The visualizer is not able to connect to the api. So it must be shutting down due to errors |
I restarted the container. This time it gave me no errors, but I do not see any updated visuals at 127.0.0.1:8000. Visualizer log:
Lineage log:
|
I see that this error log was appended to the lineage log after I posted the previous comment (it would have taken about 3-5 minutes)
|
This solution might help us in resolving the issue: https://stackoverflow.com/questions/55457069/how-to-fix-operationalerror-psycopg2-operationalerror-server-closed-the-conn |
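For connections that the server drops while idle, one general way to make a caller more resilient is to retry the failed call a few times. The sketch below shows that generic pattern only; it is not the fix applied in this project, and `with_retry` and its parameters are hypothetical names. With psycopg2, `retry_on` would presumably be `(psycopg2.OperationalError,)`.

```python
import time


def with_retry(fn, retry_on=(ConnectionError,), attempts=3, base_delay=0.5):
    """Call fn(); on a listed exception, wait and retry, up to `attempts` calls in total."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            time.sleep(base_delay * 2 ** (attempt - 1))


# Demo with a function that fails twice before succeeding.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("server closed the connection unexpectedly")
    return "ok"


result = with_retry(flaky, base_delay=0)
```

Retrying is only safe for operations that can be repeated without side effects, such as reads or calls wrapped in a transaction that was rolled back.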
Interesting. I did not hit these issues. Seems specific to Mac. I'll see how I can make these changes appropriately. |
I've uploaded a potential fix by setting Instructions:
Then replace |
I'm new to docker and build_image.sh wasn't working, so I executed the command that I inferred from the shell script. I ended up with the following error: failed to compute cache key: "/docker/docker-entrypoint.sh" not found: not found. I wasn't able to find anything on Google. Please let me know if you have any suggestions. Log:
|
You should run the command from the parent directory.
|
Thanks. I was able to install successfully, and I think I figured out what's going on:
Next steps: I'll try to use the scan_source() to see if it helps. |
You need to call an API to set default schema:
This is a new feature to find tables without schema mentioned. I'll document it today. Sorry about that. Also it shouldn't fail if there is no default schema set. I'll look into that as well. Did the fix for operational error help? If yes, I'll commit it. |
Yes, the fix definitely helped the app be more resilient. May I know what the schema_id would be for the default schema, or for update_schema()? I'll go through the code as well to figure it out. |
Couple of issues:
Code executed:
Last line from visualizer log: *removed the query as its pretty long |
I've pushed a new release with fixes for the connection error as well as update_source. Can you please move to the latest I am also adding tests for the docker image before deploying, as well as updating the documentation. Sorry for all the trouble. On no lineage being visualized, can you paste logs from |
No problem. Here's the log; there's no sign of errors:
When I executed the same code some time later, I started receiving the same error that was fixed in the current version:
|
I'll try to run a stress test on my machine and reproduce the OperationalError. In the meantime, I can give you instructions to run the API server and visualizer. Do you want to give that a try? |
sure..that would help, thanks! |
I am using this opportunity to improve docs. I have added instructions here: https://tokern.io/docs/data-lineage/installation Can you please check if it helps you run in native mode? I'll also be grateful if you have any feedback on docs in general. I hope to add more info in the next few days. |
Thank you! I'll try out this week and give you an update |
Hi, sorry for the delay. I tried the native mode and had to make the following updates:
After completing the setup, I ended up with the following error:
Code executed:
|
Looks like the server has an error. Can you make sure you are running the latest version ? v0.8.x ? Are there any logs from the server? It should print logs to stderr. |
Closing as there is no activity. |
Sorry, I haven't had a chance to work on this. I will definitely take it up later and give some feedback.
I was able to successfully load the catalog using dbcat, but I'm getting the following error when I try to parse queries from a file in JSON format (I also tried the given test file):
File "~/Python/3.8/lib/python/site-packages/data_lineage/parser/__init__.py", line 124, in parse
name = str(hash(sql))
TypeError: unhashable type: 'dict'
Here's line 124:
data-lineage/data_lineage/parser/__init__.py
Line 124 in f347484
Code executed: