Skip to content
This repository has been archived by the owner on Aug 2, 2023. It is now read-only.

fix: Migrate persistent kernel stats to Redis to reduce database txn overheads #532

Merged
merged 25 commits into from
Mar 29, 2022

Conversation

kyujin-cho
Copy link
Member

@kyujin-cho kyujin-cho commented Feb 14, 2022

This PR resolves lablup/backend.ai#348.

@kyujin-cho kyujin-cho added this to the 21.09 milestone Feb 14, 2022
@kyujin-cho kyujin-cho self-assigned this Feb 14, 2022
@codecov
Copy link

codecov bot commented Feb 14, 2022

Codecov Report

Merging #532 (3c86a82) into main (02bf6a9) will decrease coverage by 0.17%.
The diff coverage is 35.05%.

@@            Coverage Diff             @@
##             main     #532      +/-   ##
==========================================
- Coverage   49.37%   49.20%   -0.18%     
==========================================
  Files          55       55              
  Lines        9008     9079      +71     
==========================================
+ Hits         4448     4467      +19     
- Misses       4560     4612      +52     
Impacted Files Coverage Δ
src/ai/backend/manager/config.py 63.73% <ø> (ø)
src/ai/backend/manager/registry.py 18.07% <0.00%> (-0.14%) ⬇️
src/ai/backend/manager/models/kernel.py 44.60% <25.92%> (+0.42%) ⬆️
src/ai/backend/manager/cli/__main__.py 47.54% <34.37%> (-6.23%) ⬇️
src/ai/backend/manager/cli/context.py 70.37% <53.33%> (-21.94%) ⬇️
src/ai/backend/manager/models/utils.py 78.01% <0.00%> (-5.68%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 02bf6a9...3c86a82. Read the comment docs.

Copy link
Member

@achimnol achimnol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just two minor code issues.

Copy link
Member

@fregataa fregataa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leave a minor comment.

@@ -311,12 +314,23 @@ async def get_container_stats_for_period(request: web.Request, start_date, end_d
result = await conn.execute(query)
rows = result.fetchall()

async def _pipe_builder(r: Redis) -> RedisPipeline:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

굉장히 사소하긴 하지만, 여기 async가 없어도 되나요?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that's wrong;;;
https://github.com/lablup/backend.ai-common/blob/main/src/ai/backend/common/redis.py#L228-L234
It should return just pipeline, not awaitable of pipeline!

@kyujin-cho kyujin-cho requested review from fregataa and achimnol March 6, 2022 05:12
Copy link
Member

@achimnol achimnol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

common.redis.execute() accepts a function that returns either:

  • async function that executes Redis commands
  • non-async function that returns a Redis pipeline to execute

To use pipeline, the callable must NOT be async.

refs: https://github.com/lablup/backend.ai-common/blob/3a6466fe/src/ai/backend/common/redis.py#L228-L234

@achimnol
Copy link
Member

common.redis.execute() accepts a function that returns either:

  • async function that executes Redis commands
  • non-async function that returns a Redis pipeline to execute

To use pipeline, the callable must NOT be async.

refs: https://github.com/lablup/backend.ai-common/blob/3a6466fe/src/ai/backend/common/redis.py#L228-L234

I just added a PR to allow async pipeline builders: lablup/backend.ai-common#125.

@achimnol achimnol modified the milestones: 21.09, 22.03 Mar 28, 2022
* Statistics are stored and queried per-container (=per-kernel), not per-session.
* SessionStatistics should be implemented separately by aggregating
  its belong KernelStatistics.
Copy link
Member

@achimnol achimnol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to merge redis clear-history with clear-history command. How about your thoughts?

kyujin-cho and others added 5 commits March 29, 2022 12:12
* Introduce more-itertools as a new dependency
* Use attr-class instead of typed dict
  - When there is no need for serialization, it's generally faster
    and safer to use typed attr-classes or dataclasses.
* Adjust the log message for Redis stats records cleanup.
@achimnol achimnol changed the title refactor: store stat data on redis only refactor: Migrate persistent kernel stats to Redis to reduce database txn overheads Mar 29, 2022
@achimnol achimnol changed the title refactor: Migrate persistent kernel stats to Redis to reduce database txn overheads fix: Migrate persistent kernel stats to Redis to reduce database txn overheads Mar 29, 2022
@achimnol achimnol merged commit 17158e7 into main Mar 29, 2022
@achimnol achimnol deleted the refactor/stats-on-redis branch March 29, 2022 08:19
# for free to subscribe to this conversation on GitHub. Already have an account? #.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor kernel statistics to use Redis only
3 participants