Add an interface for TransformationService and a basic implementation #1932
@achals: Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@@ -423,6 +423,20 @@ def serve_command(ctx: click.Context, port: int):
    store.serve(port)


@cli.command("serve_transformations")
Any reason to use a different serving command (compared to just serve)?
Just so that the transformation server can be started up separately from the feature server. I can collapse them into a single command if that'd be simpler.
Personally I like the idea of rolling it into a single server so that users don't need to run multiple servers during dev.
I think we do want the option, though, because this is also intended to work standalone when users deploy at higher scale? During dev, users probably don't need an FTS at all, since that's covered by the regular get_online_features flow?
I could see us exposing a debug endpoint, but that seems slightly different.
I'm going to stick to a separate command and we can revisit if it becomes too cumbersome during dev cycles, unless anyone feels strongly.
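The separate-command layout settled on in this thread can be sketched roughly as below. This is only an illustration: it uses the standard library's argparse as a stand-in for the click decorators in the PR, and the `start_feature_server` and `start_transformation_server` functions and both default ports are hypothetical placeholders, not Feast APIs or Feast defaults.

```python
import argparse
from typing import Sequence


def start_feature_server(port: int) -> str:
    # Hypothetical placeholder: a real command would start the feature server here.
    return f"feature server on port {port}"


def start_transformation_server(port: int) -> str:
    # Hypothetical placeholder: a real command would start the transformation server here.
    return f"transformation server on port {port}"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="feast")
    subcommands = parser.add_subparsers(dest="command", required=True)

    # Existing command: serves features.
    serve = subcommands.add_parser("serve")
    serve.add_argument("--port", type=int, default=8000)  # arbitrary default
    serve.set_defaults(handler=start_feature_server)

    # Separate command, so the transformation server can be started
    # and scaled independently of the feature server.
    serve_transformations = subcommands.add_parser("serve_transformations")
    serve_transformations.add_argument("--port", type=int, default=8001)  # arbitrary default
    serve_transformations.set_defaults(handler=start_transformation_server)
    return parser


def main(argv: Sequence[str]) -> str:
    args = build_parser().parse_args(argv)
    return args.handler(args.port)
```

Collapsing the two into one command later would mostly mean dropping the second subparser and having a single handler start both servers, which is why deferring that decision is cheap.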
a0a0216 to 26ef01c
generally lgtm!
message GetTransformationServiceInfoRequest {}

message GetTransformationServiceInfoResponse {
this makes sense for the most advanced user.
for a user just getting started, i'd want maybe an "EMBEDDED" mode too where transformations happen on the feature server itself?
You mean in the python feature server, right? How would that be different from just running get_online_features pointing to a python feature server API, which should look up the values and do the transformation?
maybe the earlier question is what is the BATCH or ONLINE mode used for? seems like in both cases, we're applying the same row-level udf to all values?
  string transformation_name = 1;
  string project = 2;

  ValueType transformation_input = 3;
Assuming that this input is WIP, since you call .arrow_value on it below, which ValueType doesn't have?
Are you envisioning that the FTS would communicate with the registry? My preference is probably to start without that, so that as many servers as possible stay stateless.
unless of course we run into performance issues, in which case having a cached registry would be needed
It would have to get the registry at least once because that's how it would unpickle the transformations, right?
hm yeah it needs the udf definition which we currently store in the FTS.
no strong feelings. an alternative would be to pass the udf directly in the request so the FTS becomes purely about execution. though at that point, we may consider using a ray cluster directly instead of managing an FTS.
If both this and the FTS can pull the registry, wondering if there can now be synchronization issues where e.g. a feature server that calls into this has a different version of the registry than what is on the FTS.
Passing in the UDF seems like a performance and dependency nightmare - I think we should stick with a simple approach for now assuming access to the registry. If we don't want to do that then yea we can farm out to a ray cluster from the client directly.
Re: sync issues, i think it's possible, but I don't know if we want to solve that problem now.
sgtm for the first cut
unclear imo, though, whether udf passing would be more complex.
e.g. the dependency issues: wouldn't we run into those today anyway on the regular feature server?
e.g. performance here referring to payload increases causing issues? payload size isn't really directly correlated with network latency, and it probably will be relatively negligible esp if users are passing in large sets of entity keys.
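The registry-backed approach agreed on for the first cut, together with the synchronization concern raised above, might look roughly like the following. This is a sketch under stated assumptions: `RegistrySnapshot`, `TransformationServer`, `RegistryVersionMismatch`, and the `caller_registry_version` parameter are all hypothetical names invented for illustration, and the integer registry version is an assumed mechanism, not something the PR defines.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class RegistrySnapshot:
    # Hypothetical stand-in for a cached copy of the registry:
    # a version number plus the UDFs keyed by transformation name.
    version: int
    udfs: Dict[str, Callable[[dict], dict]]


class RegistryVersionMismatch(Exception):
    """Raised when the caller and the server hold different registry versions."""


class TransformationServer:
    def __init__(self, registry: RegistrySnapshot) -> None:
        self._registry = registry

    def transform(
        self, name: str, rows: List[dict], caller_registry_version: int
    ) -> List[dict]:
        # Refuse to run if the caller resolved its request against a different
        # registry version; this surfaces the sync issue instead of hiding it.
        if caller_registry_version != self._registry.version:
            raise RegistryVersionMismatch(
                f"caller has v{caller_registry_version}, "
                f"server has v{self._registry.version}"
            )
        if name not in self._registry.udfs:
            raise KeyError(f"unknown transformation: {name}")
        udf = self._registry.udfs[name]
        # Apply the same row-level UDF to every input row.
        return [udf(row) for row in rows]


# Usage: the request carries only a name and the rows, never the UDF itself.
registry = RegistrySnapshot(
    version=3,
    udfs={"double_trips": lambda row: {"trips_x2": row["trips"] * 2}},
)
server = TransformationServer(registry)
result = server.transform("double_trips", [{"trips": 4}], caller_registry_version=3)
```

Passing the UDF in the request instead would remove the lookup and the version check, at the cost of serializing code on every call; the sketch does not attempt to measure that trade-off.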
Codecov Report
@@ Coverage Diff @@
## master #1932 +/- ##
==========================================
- Coverage 82.26% 81.94% -0.32%
==========================================
Files 96 97 +1
Lines 7669 7727 +58
==========================================
+ Hits 6309 6332 +23
- Misses 1360 1395 +35
  // Feast version of this transformation service deployment.
  string version = 1;

  // Type of serving deployment, either ONLINE or BATCH. Different store types support different
not sure what this comment means - ONLINE and BATCH also don't match up with the TransformationServiceType options below?
sdk/python/feast/repo_config.py
Outdated
@@ -76,6 +76,19 @@ class RegistryConfig(FeastBaseModel):
    expire. Users can manually refresh the cache by calling feature_store.refresh_registry() """


class TransformationServerConfig(FeastBaseModel):
    """Server Configuration that determines to how the transformation server is configured."""
nit: "determines how"
sdk/python/feast/repo_config.py
Outdated
class ServerConfig(FeastBaseModel):
    """Server Configuration that determines to how feast servers are configured. """
nit: same as above
sdk/python/feast/repo_config.py
Outdated
@@ -175,6 +194,11 @@ def _validate_online_store_config(cls, values):
                [ErrorWrapper(e, loc="online_store")], model=RepoConfig,
            )

        if "servers" not in values:
if we're validating servers here, shouldn't we have a separate validate_servers_config function?
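The split suggested here could look something like the sketch below. It deliberately uses plain functions over a dict rather than the validators on RepoConfig, and the checks themselves (that `online_store` is a mapping and `servers` is a list) are hypothetical rules chosen only to show the structure; the PR's actual validation logic is not visible in this hunk.

```python
from typing import Any, Callable, Dict, Tuple

Values = Dict[str, Any]


def validate_online_store_config(values: Values) -> Values:
    # Hypothetical rule: if an online store is configured, it must be a mapping.
    online_store = values.get("online_store")
    if online_store is not None and not isinstance(online_store, dict):
        raise ValueError("online_store must be a mapping")
    return values


def validate_servers_config(values: Values) -> Values:
    # Kept separate so that server checks do not live inside the
    # online store validator.
    servers = values.get("servers")
    if servers is None:
        return values
    # Hypothetical rule: servers, when present, must be a list.
    if not isinstance(servers, list):
        raise ValueError("servers must be a list")
    return values


VALIDATORS: Tuple[Callable[[Values], Values], ...] = (
    validate_online_store_config,
    validate_servers_config,
)


def validate_repo_config(values: Values) -> Values:
    # Run each section's validator in turn; each one owns a single section.
    for validator in VALIDATORS:
        values = validator(values)
    return values
```

With one validator per section, an error message points at the section that actually failed, and adding a new section does not require touching an unrelated validator.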
            context.set_code(grpc.StatusCode.INVALID_ARGUMENT)
            raise

        assert odfv
nit: throw an exception instead of assert
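One reason behind this nit is that Python skips assert statements entirely when run with the -O flag, so the check would silently vanish in optimized mode. A minimal sketch of the explicit alternative follows; the `OnDemandFeatureViewNotFound` exception and the `find_odfv` helper are hypothetical names for illustration, not part of the PR.

```python
from typing import Dict, Optional


class OnDemandFeatureViewNotFound(Exception):
    """Hypothetical error type for a missing on-demand feature view."""


def find_odfv(views: Dict[str, object], name: str) -> object:
    odfv: Optional[object] = views.get(name)
    # Explicit check instead of `assert odfv`: it still runs under `python -O`
    # and gives the caller a specific, catchable error type.
    if odfv is None:
        raise OnDemandFeatureViewNotFound(
            f"no on-demand feature view named {name!r}"
        )
    return odfv
```

In a gRPC handler, an exception like this could be caught and mapped to a status code in the same place the surrounding code already sets INVALID_ARGUMENT.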
Signed-off-by: Achal Shah <achals@gmail.com>
148dadc to 2a73ae7
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: achals, felixwang9817 The full list of commands accepted by this bot can be found here. The pull request process is described here
/kind feature
Signed-off-by: Achal Shah achals@gmail.com
What this PR does / why we need it:
Working Example:
Which issue(s) this PR fixes:
Fixes #
Does this PR introduce a user-facing change?: