[KED-2643] kedro install fails on pyspark starter #767
Comments
Maybe something like this in context.py of the pyspark starters:

    import logging

    try:
        from pyspark import SparkConf
        from pyspark.sql import SparkSession
    except ModuleNotFoundError:
        logging.warning("This starter requires PySpark to function. "
                        "Run 'kedro install' to install project dependencies.")
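A slightly fuller sketch of the same idea, for illustration only. The guarded import lets the module load without PySpark, and the error is postponed until a Spark session is actually requested. The helper name init_spark_session and the default app name are hypothetical and are not taken from the starter; this is not the fix that was eventually merged.

```python
import logging

logger = logging.getLogger(__name__)

try:
    from pyspark import SparkConf
    from pyspark.sql import SparkSession
except ModuleNotFoundError:
    # PySpark is not installed yet: keep the module importable so that
    # CLI commands such as 'kedro install' can still run.
    SparkConf = None
    SparkSession = None
    logger.warning(
        "This starter requires PySpark to function. "
        "Run 'kedro install' to install project dependencies."
    )


def init_spark_session(app_name="kedro-project"):
    """Hypothetical helper: build a SparkSession only when one is needed."""
    if SparkSession is None:
        raise RuntimeError(
            "PySpark is not installed. Run 'kedro install' first."
        )
    conf = SparkConf().setAppName(app_name)
    return SparkSession.builder.config(conf=conf).getOrCreate()
```

With this shape, importing the module never fails because of a missing PySpark; only code paths that really need Spark raise, and they do so with a message that points at the remedy.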
Hi @glebrh, thanks for flagging this issue! This indeed isn't working properly. I've created a ticket on our backlog to address it.
Thank you @glebrh for bringing this up! We just merged this PR with a quick fix (thanks @thver for the inspiration!). It will become effective once Kedro 0.17.5 is released. We will also be revisiting the flow triggered when calling any CLI command to potentially replace this quick fix with a more robust solution.
@ignacioparicio can we close this as well? kedro-org/kedro-starters#38
Description
After creating a new Kedro project in a brand-new conda environment using the pyspark starter, kedro install fails. It seems that Kedro tries to import the module containing the project context (where the import from pyspark is done) and fails, since Spark is not yet installed.
Other CLI commands (e.g. kedro --version) also fail with the same error when executed inside the project's directory.
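The suspected mechanism can be reproduced without Kedro or PySpark. The sketch below is an illustration under assumptions, not Kedro's actual loading code: it writes two throwaway modules, one with an unguarded top-level import of a package that is not installed and one with a guarded import, then tries to load each. The package name not_yet_installed_dependency is a made-up stand-in for pyspark, and the module names are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Made-up stand-in for a dependency that has not been installed yet.
MISSING = "not_yet_installed_dependency"

# Module body with an unguarded top-level import.
EAGER = f"import {MISSING}\n"

# Module body that tolerates the missing dependency at import time.
GUARDED = (
    "try:\n"
    f"    import {MISSING}\n"
    "except ModuleNotFoundError:\n"
    f"    {MISSING} = None\n"
)


def try_load(module_name, source):
    """Write source to a temporary module and report whether it imports."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, f"{module_name}.py").write_text(source)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            importlib.import_module(module_name)
            return "loaded"
        except ModuleNotFoundError as exc:
            return f"failed: {exc.name}"
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)


eager_result = try_load("eager_context", EAGER)
guarded_result = try_load("guarded_context", GUARDED)
print(eager_result)
print(guarded_result)
```

Any command that has to import the first kind of module fails before doing its own work, which would be consistent with unrelated commands breaking inside the project directory.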
Steps to Reproduce
pip install kedro
kedro new --starter=pyspark
kedro install
Expected Result
kedro installs packages specified in requirements.txt
I'm not sure why the CLI loads project settings at this point, although I guess several CLI commands do need to care about project specifics anyway.
Actual Result
Error with the following stacktrace:
Your Environment
MacOS Catalina (originally got it on Windows 10).
PyCharm CE 2020.3.2
Conda environment (Python 3.7.10)
Kedro 0.17.3