All flytekitplugins maintained by the core team are added here. It is not necessary to add plugins here, but this is a good starting place.
Please file an issue
Flyte plugins are structured as micro-libs and can be authored in an independent repository. The plugins maintained by the core team live in this repository, which makes them easy to discover. When authoring plugins, here are some tips:
- The folder name has to be `flytekit-*`, e.g. `flytekit-hive`. In case you want to group plugins for a specific service, use a name such as `flytekit-aws-athena`.
- Flytekit plugins use a concept called namespace packages, so the package structure is very important. Use the following Python package structure:

      flytekit-myplugin/
         - README.md
         - setup.py
         - flytekitplugins/
             - myplugin/
                - __init__.py
         - tests
             - __init__.py

  NOTE: the inner package `flytekitplugins` DOES NOT have an `__init__.py` file.
- The published packages have to be named `flytekitplugins-{package-name}`, where `{package-name}` is a unique identifier for the plugin.
- The `setup.py` has the following template. You can simply copy-paste it and edit the TODO sections:

      from setuptools import setup

      # TODO put the plugin name here
      PLUGIN_NAME = "<plugin-name e.g. pandera>"

      # TODO decide if the plugin is regular or `data`
      # for regular plugins
      microlib_name = f"flytekitplugins-{PLUGIN_NAME}"
      # For data/persistence plugins
      # microlib_name = f"flytekitplugins-data-{PLUGIN_NAME}"

      # TODO add additional requirements
      plugin_requires = ["flytekit>=0.21.3,<1.0.0", "<other requirements>"]

      __version__ = "0.0.0+develop"

      setup(
          name=microlib_name,
          version=__version__,
          author="flyteorg",
          author_email="admin@flyte.org",
          # TODO Edit the description
          description="My awesome plugin.....",
          # TODO alter the last part of the following URL
          url="https://github.com/flyteorg/flytekit/tree/master/plugins/flytekit-...",
          long_description=open("README.md").read(),
          long_description_content_type="text/markdown",
          namespace_packages=["flytekitplugins"],
          packages=[f"flytekitplugins.{PLUGIN_NAME}"],
          install_requires=plugin_requires,
          license="apache2",
          python_requires=">=3.7",
          classifiers=[
              "Intended Audience :: Science/Research",
              "Intended Audience :: Developers",
              "License :: OSI Approved :: Apache Software License",
              "Programming Language :: Python :: 3.7",
              "Programming Language :: Python :: 3.8",
              "Topic :: Scientific/Engineering",
              "Topic :: Scientific/Engineering :: Artificial Intelligence",
              "Topic :: Software Development",
              "Topic :: Software Development :: Libraries",
              "Topic :: Software Development :: Libraries :: Python Modules",
          ],
          # TODO OPTIONAL
          # FOR Plugins where auto-loading on installation is desirable, please uncomment this line and ensure that the
          # __init__.py has the right modules available to be loaded, or point to the right module
          # entry_points={"flytekit.plugins": [f"{PLUGIN_NAME}=flytekitplugins.{PLUGIN_NAME}"]},
      )

- Each plugin should have a `README.md` that describes how to install it and includes a simple example.
- Each plugin should have its own `tests` package. NOTE: unlike `flytekitplugins`, it does have an `__init__.py` file.
- In some cases you might want to auto-load some of your modules when the plugin is installed. This is especially true for `data-plugins` and `type-plugins`. In such cases, you can add a special directive in the `setup.py` that instructs flytekit to automatically load the prescribed modules. The following excerpt is from the `flytekit-data-fsspec` plugin's `setup.py`:

      setup(
          entry_points={"flytekit.plugins": [f"{PLUGIN_NAME}=flytekitplugins.{PLUGIN_NAME}"]},
      )

- Examples:
  - Example of a simple Python task that adds Python-side functionality only: `flytekit-greatexpectations`
  - Example of a TypeTransformer or type plugin: `flytekit-pandera`. These plugins add new types to Flyte, tell Flyte how to transform them, and add features through types. Remember, Flyte is a multi-language system, and type transformers allow marshalling between flytekit, the backend, and other languages.
  - Example of a TaskTemplate plugin, which also allows plugin writers to supply a prebuilt container for runtime: `flytekit-sqlalchemy`
  - Example of a SQL backend plugin, where the actual query invocation is done by a backend plugin: `flytekit-snowflake`
  - Example of a meta plugin that can wrap other tasks: `flytekit-papermill`
  - Example of a plugin that modifies the execution command: `flytekit-spark` or `flytekit-aws-sagemaker`
  - Example of a plugin that executes the user container with some other context modifications: `flytekit-kf-tensorflow`
  - Example of a persistence plugin, which allows data to be stored to different persistence layers: `flytekit-data-fsspec`
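
The layout tips above are easy to get wrong, in particular the missing `__init__.py` in `flytekitplugins/`. As a minimal sketch, a small checker along the following lines could flag the common mistakes. The function name `check_plugin_layout` and its messages are hypothetical and not part of flytekit; it only inspects the files named in the tips.

```python
from pathlib import Path
from typing import List


def check_plugin_layout(plugin_dir: Path, package: str) -> List[str]:
    """Return a list of layout problems; an empty list means nothing was flagged.

    Hypothetical helper: it only checks the conventions listed in this README.
    """
    problems: List[str] = []

    # The folder name has to follow the flytekit-* convention.
    if not plugin_dir.name.startswith("flytekit-"):
        problems.append(f"folder name {plugin_dir.name!r} should start with 'flytekit-'")

    # README.md and setup.py live at the top of the plugin folder.
    for required in ("README.md", "setup.py"):
        if not (plugin_dir / required).is_file():
            problems.append(f"missing {required}")

    namespace = plugin_dir / "flytekitplugins"
    # The namespace package itself must NOT contain an __init__.py.
    if (namespace / "__init__.py").exists():
        problems.append("flytekitplugins/ must not contain __init__.py")
    # The plugin's own package inside the namespace does need one.
    if not (namespace / package / "__init__.py").is_file():
        problems.append(f"missing flytekitplugins/{package}/__init__.py")

    # The tests package is a regular package with an __init__.py.
    if not (plugin_dir / "tests" / "__init__.py").is_file():
        problems.append("missing tests/__init__.py")

    return problems
```

For example, `check_plugin_layout(Path("flytekit-myplugin"), "myplugin")` would return an empty list for the structure shown in the tips, and one message per deviation otherwise.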
Refer to this blog to understand the idea of microlibs.
Plugins should have their own unit tests.
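
For plugins that use the auto-load `entry_points` directive, a unit test could confirm that the entry point is actually registered once the plugin is installed. The sketch below lists everything registered under the `flytekit.plugins` group using the standard library's `importlib.metadata` (Python 3.8+ is assumed). The helper name `registered_flytekit_plugins` is hypothetical, and this is not necessarily how flytekit itself loads plugins.

```python
from importlib.metadata import entry_points
from typing import Dict


def registered_flytekit_plugins(group: str = "flytekit.plugins") -> Dict[str, str]:
    """Map entry point name -> target module for the given entry point group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are filtered with select().
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}
```

In a plugin's test suite, an assertion such as `assert "myplugin" in registered_flytekit_plugins()` (with `myplugin` standing in for the plugin's `PLUGIN_NAME`) would then fail if the `entry_points` line in `setup.py` was left commented out or the plugin was not installed.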