Skip to content
This repository has been archived by the owner on Aug 2, 2023. It is now read-only.

feat: Limit session types per scaling group #523

Merged
merged 20 commits into from
Feb 15, 2022

Conversation

vesselofgod
Copy link
Contributor

@vesselofgod vesselofgod commented Jan 21, 2022

resolves lablup/backend.ai#343
Related to: lablup/backend.ai-webui#1190

Limit session types per scaling group

  • Check allowed_session_type in scheduled_opts and don't start forbidden type's session
    If it is not a allowed_session_type in scaling_group, do not start the session and stay preparing status
    session 안돌게끔

  • Add a column StructuredJSONBColumn to check scheduler_opts's validation using trafaret.
    For validation check, modify "process_result_value" method (receive a result-row column value to be converted)
    validation check

@vesselofgod vesselofgod changed the title feat: Limit session types per scaling group #343 feat: Limit session types per scaling group Jan 21, 2022
@codecov
Copy link

codecov bot commented Jan 21, 2022

Codecov Report

Merging #523 (c9cbe96) into main (468ddbf) will increase coverage by 0.07%.
The diff coverage is 70.83%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #523      +/-   ##
==========================================
+ Coverage   48.80%   48.88%   +0.07%     
==========================================
  Files          54       54              
  Lines        9001     9185     +184     
==========================================
+ Hits         4393     4490      +97     
- Misses       4608     4695      +87     
Impacted Files Coverage Δ
src/ai/backend/manager/scheduler/dispatcher.py 26.39% <ø> (+0.92%) ⬆️
src/ai/backend/manager/scheduler/predicates.py 27.17% <0.00%> (-1.24%) ⬇️
src/ai/backend/manager/models/base.py 53.01% <80.00%> (+0.01%) ⬆️
src/ai/backend/manager/models/scaling_group.py 66.25% <100.00%> (+0.42%) ⬆️
src/ai/backend/manager/registry.py 16.89% <0.00%> (-0.10%) ⬇️
src/ai/backend/manager/defs.py 100.00% <0.00%> (ø)
src/ai/backend/manager/api/auth.py 49.30% <0.00%> (+1.25%) ⬆️
src/ai/backend/manager/server.py 61.57% <0.00%> (+1.84%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 468ddbf...c9cbe96. Read the comment docs.

Comment on lines 210 to 214
schedulerOption = t.Dict(
{
tx.AliasedKey(['allowed_session_type', 'session_type', 'sessionType'],
default='interactive') >> 'allowed_session_type': tx.Enum(SessionTypes),
}).allow_extra('*')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be provided at the time of instantiation (i.e., the model field definitions), not hard-coded inside the class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first, I received the trafaret object as a parameter of StructuredJSONBColumn, but when I tested it, an error occurred
dialector
dialector2
.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you pass a trafaret object when instantiating the column class, it should work as expected.

Example:

    sa.Column('scheduler_opts', StructuredJSONBColumn(
        t.Dict({
            t.Key('allowed_session_types', default=['interactive', 'batch']):
                t.List(tx.Enum(SessionTypes)),
        }).allow_extra('*'),
    ), nullable=False, default={}),

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also there is no need to use tx.AliasedKey because it is for legacy clients which use different names for parameter keys.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I edit source code, I referred to task_template_v1 in the session_template.py, but I didn't fix it according to the allowed_session_types...😥

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the good point!

Comment on lines 734 to 743
sgroup_name = scheduled_session.scaling_group
async with self.db.begin() as conn:
query = (
sa.select(scaling_groups.c.scheduler_opts)
.select_from(scaling_groups)
.where(scaling_groups.c.name == sgroup_name)
)
result = await conn.execute(query)
row = result.first()
allowed_session_type = row['scheduler_opts']['allowed_session_type']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The location of this check should be moved to PENDING -> SCHEDULED transition handler, not here in SCHEDULED -> PREPARING transition handler, to complete the decision before proceeding to the actual preparation step.

@achimnol achimnol added this to the 22.03 milestone Jan 22, 2022
@achimnol
Copy link
Member

I've implemented the StructuredJSONBColumn. There are more things to fix up and it's up to you! 😉

@vesselofgod vesselofgod requested a review from achimnol January 26, 2022 11:36
@achimnol achimnol modified the milestones: 22.03, 21.09 Jan 27, 2022
@vesselofgod vesselofgod self-assigned this Jan 28, 2022
Copy link
Member

@achimnol achimnol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a minor style fix please. 😉

Copy link
Member

@achimnol achimnol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm.. I think the check should be implemented as part of the check_scaling_group predicate.
This has the following advantages over the current implementation:

  • The user becomes able to know why the session is pending as the reason can be recorded as a part of predicate check results.
  • There no need to issue a separate database transaction. (Note: currently the newly added transaction uses db.begin() but it should use db.begin_readonly() because it contains only a single SELECT query.)

Copy link
Member

@achimnol achimnol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move the additional query inside _query() so that it could be retried.

Copy link
Member

@achimnol achimnol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's a good idea to use already available information. 😉

@achimnol achimnol merged commit 96a2dce into main Feb 15, 2022
@achimnol achimnol deleted the feature/limit-session-types-per-scaling-group branch February 15, 2022 06:23
achimnol pushed a commit that referenced this pull request Feb 15, 2022
* feat: Check `allowed_session_type` in `scheduler_opts` during the scaling-group predicate check
* feat: Add `StructuredJSONBColumn` to provide validation/conversion of JSONB columns
  using Trafarets.

Backported-From: main (22.03)
Backported-To: 21.09
# for free to subscribe to this conversation on GitHub. Already have an account? #.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Limit session types per scaling group
2 participants