Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

feat: Support declaring variant for use in package name #1241

Merged

Conversation

kgpayne
Copy link
Contributor

@kgpayne kgpayne commented Dec 5, 2022

This resolves a case where the package name differs from the executable name (e.g. meltanolabs-tap-snowflake with executable tap-snowflake)


📚 Documentation preview 📚: https://meltano-sdk--1241.org.readthedocs.build/en/1241/

@codecov
Copy link

codecov bot commented Dec 5, 2022

Codecov Report

Merging #1241 (27122e7) into main (3fcac26) will not change coverage.
The diff coverage is n/a.

❗ Current head 27122e7 differs from pull request most recent head fb6481c. Consider uploading reports for the commit fb6481c to get more accurate results

@@           Coverage Diff           @@
##             main    #1241   +/-   ##
=======================================
  Coverage   85.19%   85.19%           
=======================================
  Files          54       54           
  Lines        4722     4722           
  Branches      803      803           
=======================================
  Hits         4023     4023           
  Misses        507      507           
  Partials      192      192           

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@edgarrmondragon
Copy link
Collaborator

pre-commit.ci autofix

@kgpayne
Copy link
Contributor Author

kgpayne commented Dec 5, 2022

@edgarrmondragon @aaronsteers FYI this resolves an issue I found where meltanolabs-tap-snowflake can't report its version correctly using --about. Setting cls.name to meltano-tap-snowflake would cause issues as the executable is tap-snowflake (by convention), so we either have to allow the user to set the package name explicitly (if it is different from the exec name) or discover the version by some other means. Open to other solutions, but this seemed cleanest (vs. an explicit __version__ = "<version>" somewhere, or passing pyproject.toml 😅). WDYT?

@edgarrmondragon
Copy link
Collaborator

edgarrmondragon commented Dec 5, 2022

Setting cls.name to meltano-tap-snowflake would cause issues as the executable is tap-snowflake (by convention)

@kgpayne Can you use tap-snowflake for the executable and meltano-tap-snowflake for the package name in pyproject.toml and cls.name?

singer_sdk/plugin_base.py Outdated Show resolved Hide resolved
@aaronsteers aaronsteers changed the title feat: support sourcing version when package name differs from executable name feat: support declaring package name separate from executable name Dec 5, 2022
Co-authored-by: Edgar R. M. <edgar@meltano.com>
@kgpayne
Copy link
Contributor Author

kgpayne commented Dec 5, 2022

@edgarrmondragon

Can you use tap-snowflake for the executable and meltano-tap-snowflake for the package name in pyproject.toml and cls.name?

Here is the pyproject.toml I am working on. The project name is meltanolabs-tap-snowflake which controls the wheel and PyPI name, but also included is the tap_snowflake module and an entry point called tap-snowflake for the CLI 😅 In the SDK, it seems Tap.name is mostly used for log outputs and as a stream prefix. I guess I could technically use meltanolabs-tap-snowflake in all places (except the cli) but it seems unnecessary verbose. It is also neat for all variants of tap-snowflake to produce the same streams (ideally), and for the logs to reference the same name as the executable output (rather than its package). Just as with the PPW plugins (e.g. pipelinewise-tap-snowflake) the meltanolabs- prefix is only important to avoid PyPI name clashes; in all other respects the package is really tap-snowflake 🙂

@edgarrmondragon
Copy link
Collaborator

Here is the pyproject.toml I am working on. The project name is meltanolabs-tap-snowflake which controls the wheel and PyPI name, but also included is the tap_snowflake module and an entry point called tap-snowflake for the CLI 😅 In the SDK, it seems Tap.name is mostly used for log outputs and as a stream prefix. I guess I could technically use meltanolabs-tap-snowflake in all places (except the cli) but it seems unnecessary verbose. It is also neat for all variants of tap-snowflake to produce the same streams (ideally), and for the logs to reference the same name as the executable output (rather than its package). Just as with the PPW plugins (e.g. pipelinewise-tap-snowflake) the meltanolabs- prefix is only important to avoid PyPI name clashes; in all other respects the package is really tap-snowflake 🙂

@kgpayne That makes a lot of sense, thanks for the detailed explanation! Yeah, the package name is only really necessary for extracting the package version at the moment.

@kgpayne kgpayne marked this pull request as ready for review December 5, 2022 20:02
@kgpayne kgpayne requested a review from cjohnhanson as a code owner December 5, 2022 20:03
@edgarrmondragon
Copy link
Collaborator

edgarrmondragon commented Dec 5, 2022

@kgpayne can you update the cookiecutter templates to

  • prompt for the package name and default to the plugin name
  • declare the package_name in the generated plugin class

@kgpayne
Copy link
Contributor Author

kgpayne commented Dec 5, 2022

@edgarrmondragon what do you think about adding a variant name prompt to the cookiecutter, defaulting to None, rather than a free-form package name? If provided, the template will prepend the variant and a "-" to the tap id, and if not just the plain tap id will be used. I think this might encourage a more formal <variant name>-<tap id> convention, and would allow us to inject that variant info elsewhere in future (e.g. --about for even easier hub PRs) 🙂

@kgpayne kgpayne marked this pull request as draft December 5, 2022 21:11
@edgarrmondragon
Copy link
Collaborator

@edgarrmondragon what do you think about adding a variant name prompt to the cookiecutter, defaulting to None, rather than a free-form package name? If provided, the template will prepend the variant and a "-" to the tap id, and if not just the plain tap id will be used. I think this might encourage a more formal <variant name>-<tap id> convention, and would allow us to inject that variant info elsewhere in future (e.g. --about for even easier hub PRs) 🙂

@kgpayne I think that would require also adding the variant name attribute to the plugin class, which is probably beyond the scope of this PR. That's definitely useful and worth its own issue, though.

I'm not proposing a completely free-form package name, but rather default to the answer given for tap_id. Similar to what we already have for library_name:

"tap_id": "tap-{{ cookiecutter.source_name.lower() }}",
"library_name": "{{ cookiecutter.tap_id.replace('-', '_') }}",

Wdyt?

Ken Payne added 3 commits December 6, 2022 14:06
…e' of github.com:meltano/sdk into kgpayne/version-when-package-name-differs-from-exec-name
@kgpayne
Copy link
Contributor Author

kgpayne commented Dec 6, 2022

@edgarrmondragon as I was working on the package_name property, it made sense to just go ahead and implement with variant rather than package_name in the cookiecutters. This solves the meltanolabs-tap-snowflake issue and the common case, and leaves the door open to power users to overload the package_name property if they really want to customise 🙂 Also I think this snippet is neater than package_name directly:

class TapExample(Tap):

    name = "tap-example"
	variant = "meltanolabs"
	# Rather than:
    # package_name = "meltanolabs-tap-example"

This PR is accompanied by a docs PR in meltano#7050

@kgpayne kgpayne marked this pull request as ready for review December 6, 2022 12:53
@kgpayne kgpayne requested a review from a team as a code owner December 6, 2022 12:53
@kgpayne kgpayne requested a review from aaronsteers December 6, 2022 15:12
Copy link
Contributor

@aaronsteers aaronsteers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this basically makes sense. The last thing I want to confirm before we merge is the comment in #1241 (comment):

Does python provide any facility to auto-detect the library name? For instance, can we do something like import sys; module_name = vars(sys.modules[__name__])['__package__'], except using the class name reference and not our own SDK code as the __name__ ref?

Possible lead here.

Maybe this way, if we assume (and it seems like a safe bet) that the tap is being invoked as an app and not as a library?

import sys
sys.modules['__main__']

I feel like there should be some way to auto-detect the library name - although it's also feasible that it just isn't possible to do so. (And if not, for my part, I think this PR is basically ready to go.)

@aaronsteers
Copy link
Contributor

aaronsteers commented Dec 6, 2022

@edgarrmondragon as I was working on the package_name property, it made sense to just go ahead and implement with variant rather than package_name in the cookiecutters. This solves the meltanolabs-tap-snowflake issue and the common case, and leaves the door open to power users to overload the package_name property if they really want to customise 🙂 Also I think this snippet is neater than package_name directly:

class TapExample(Tap):

    name = "tap-example"
	variant = "meltanolabs"
	# Rather than:
    # package_name = "meltanolabs-tap-example"

This PR is accompanied by a docs PR in meltano#7050

I just noticed the above in the thread.

@kgpayne - In the context of #1241, I really don't know if we should add any new references to variant until we do some work on long-term naming patterns.

Also, if I read this correctly, the above infers that the folder name will have the variant as a prefix, which I think we would not want to implement as default.


🥷 Edit:

Regarding the case with meltanolabs-tap-snowflake, for instance, I think it might be worth renaming the folder meltanolabs_tap_snowflake (and therefor the library name) to simply tap_snowflake. But either way, we should encourage decouple the library name from the published package name - especially since forks of the meltanolabs/tap-snowflake repo would have an easier time with the main source code folder being tap_snowflake, since our own meltanolabs- prefix is taken already on PyPi.

None of this changes the importance of this issue though, since we either way want to properly support taps and targets even if library name is not what we expect.

@kgpayne
Copy link
Contributor Author

kgpayne commented Dec 7, 2022

@aaronsteers @edgarrmondragon we are completely aligned on folder and invocation names - as above the variant only applies to the package name in pyproject.toml and as the lookup name when finding the version in the SDK 🙂👍 I 100% agree that the tap should in all ways (folders, CLI entry point etc.) use the unmodified names (i.e. tap-example and tap_example). This code simply adds the variant name prefix to the name field of pyproject.toml if one is supplied (defaulting to None (Skip)). This saves the user having to figure out the packages = [...] addition that is also required when the library name does not match the project/package name. Here is a snippet that illustrates the change:

[tool.poetry]
{%- if cookiecutter.variant != "None (Skip)" %}
name = "{{cookiecutter.variant}}-{{cookiecutter.target_id}}"
{%- else %}
name = "{{cookiecutter.target_id}}"
{%- endif %}
version = "0.0.1"
description = "`{{cookiecutter.target_id}}` is a Singer target for {{cookiecutter.destination_name}}, built with the Meltano Singer SDK."
readme = "README.md"
authors = ["{{ cookiecutter.admin_name }}"]
keywords = [
    "ELT",
    "{{cookiecutter.destination_name}}",
]
license = "Apache 2.0"
{%- if cookiecutter.variant != "None (Skip)" %}
packages = [
    { include = "{{cookiecutter.library_name}}" },
]
{%- endif %}

cookiecutter.variant is not used in any other places (including folder names), which remain determined by cookiecutter.library_name. Sorry for any confusion on this!

Ken Payne added 3 commits December 7, 2022 12:20
…e' of github.com:meltano/sdk into kgpayne/version-when-package-name-differs-from-exec-name
@kgpayne
Copy link
Contributor Author

kgpayne commented Dec 7, 2022

Another update re: @aaronsteers question on avoiding hard-coding the variant or package name:

Does python provide any facility to auto-detect the library name? For instance, can we do something like import sys; module_name = vars(sys.modules[name])['package'], except using the class name reference and not our own SDK code as the name ref?

I was able to figure something out 🎉

Have removed the version-related changes in this PR to here, so as to separate any ongoing variant packaging discussions from the bug fix that prompted them.

Initially I found importlib.metadata.packages_distributions(), which provides a mapping of package_name to distributions. Unfortunately it was only introduced in 3.10 (and back-ported by importlib-metadata for <3.8), which leaves us hanging for 3.8 and 3.9 🤦‍♂️ But I found another solution on SO that works by iterating over the Distributions on sys.path to check included paths for file-path matches of the current package. This works in all supported python versions, so is likely the most robust approach 🚀

@aaronsteers
Copy link
Contributor

Awesome, great news, @kgpayne !

@edgarrmondragon
Copy link
Collaborator

@kgpayne is this blocked by #1254?

@kgpayne
Copy link
Contributor Author

kgpayne commented Feb 8, 2023

@edgarrmondragon no, this is standalone. It basically just adds the variant prefix to the tool.poetry.name value and adds the packages snippet with the un-prefixed plugin name 🙂 Its just a convenience for folks who regularly create taps/targets to be released on PyPi with a variant prefix (e.g. meltanolabs-tap-snowflake) 👍 Related Meltano docs change: meltano#7050

@edgarrmondragon edgarrmondragon changed the title feat: support declaring variant for use in package name feat: Support declaring variant for use in package name Feb 8, 2023
@kgpayne kgpayne enabled auto-merge (squash) February 9, 2023 08:59
@kgpayne kgpayne merged commit a304bb2 into main Feb 9, 2023
@kgpayne kgpayne deleted the kgpayne/version-when-package-name-differs-from-exec-name branch February 9, 2023 09:09
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants