Skip to content

packages in pipe report #680

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Open
wants to merge 10 commits into
base: master
Choose a base branch
from
Open

packages in pipe report #680

wants to merge 10 commits into from

Conversation

glehmann
Copy link
Contributor

@glehmann glehmann commented Feb 28, 2025

image

@stormi stormi requested review from stormi and ydirson March 3, 2025 11:35
@stormi
Copy link
Member

stormi commented Mar 3, 2025

Looks like a good candidate for a sync review.

Copy link
Contributor

@ydirson ydirson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be some overlap with #677 here. Also, do we want to depend on koji, when we can get the info from the repos directly?

@stormi
Copy link
Member

stormi commented Mar 3, 2025

There seems to be some overlap with #677 here. Also, do we want to depend on koji, when we can get the info from the repos directly?

You can't get the koji build ID and links to the web interface from repos directly.

Copy link
Contributor

@ydirson ydirson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also note the "hmtl" typo in commit message

koji_output = check_output(['koji', 'buildinfo', build[0]]).decode('utf8')
kv_lines = [line.split(': ') for line in koji_output.splitlines()]
build_info = dict(kv for kv in kv_lines if len(kv) == 2)
print(build_info)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like debugging output, do we want to keep it?

Comment on lines 71 to 89
tags = [f'v{v}-{p}' for v in ['8.2', '8.3'] for p in ['incoming', 'ci', 'testing', 'candidates', 'lab']]
for tag in tags:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not seem necessary to precompute a list of tags. This and the lack of specific details about what exactly gets parallelized makes reading the patch harder than necessary

@glehmann glehmann force-pushed the pkg-in-pipe branch 2 times, most recently from 55ae5cb to 6f29e52 Compare March 6, 2025 10:40
# Run in docker

```sh
docker build -f pkg_in_pipe .
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
docker build -f pkg_in_pipe .
docker buildx build . -t pkg_in_pipe

docker buildx replaces docker build see the "Important" note here https://docs.docker.com/reference/cli/docker/build-legacy/

Copy link
Contributor Author

@glehmann glehmann Mar 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docker build is an alias to docker buildx build and buildkit is the default build engine since docker 23, so I think we should be good with docker build :-)

unless we need to run an older version somewhere?

ref:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch for the -t!


```sh
docker build -f pkg_in_pipe .
docker run -v ~/.koji:/root/.koji:z -e PLANE_TOKEN=<plane token> -v /out/dir:/output:z pkg_in_pipe /output/index.html
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should document in README that a directory from the host should me mounted in the container, and that a positional argument is used to specify the output path and filename inside the container, shouldn't we?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed

@@ -0,0 +1 @@
report.html
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed anymore as the output is flexible, and the example in README.md is outside of the repo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while not strictly needed, I think it's still nice to have it for local development

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is still the default output file.

@glehmann glehmann force-pushed the pkg-in-pipe branch 2 times, most recently from 336d713 to f14946b Compare March 7, 2025 18:20
@glehmann
Copy link
Contributor Author

glehmann commented Mar 7, 2025

@nathanael-h I have also added the last generation time and generation interval in the report

def print_footer(out):
now = datetime.now()
print(f'''
Last generated at {now}. Generated every 5 minutes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the "last generated" is useful info, hardcoding the interval of an external cron job does not feel right.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you think it should be a command line option?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be a --generated-info="Generated every 5 minutes."? --cron-info="5 minutes"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nathanael-h as a consequence of this change, we'll have to add --generated-info "Generated every 5 minutes." on the command line when calling pkg_in_pipe.py

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the delay is short and people see that it was last generated less than 5 minutes ago, I think passing a parameter to tell them "it's generated every 5 minutes" is a bit overkill, but now that it's done, why not :)

@glehmann glehmann force-pushed the pkg-in-pipe branch 3 times, most recently from 4b42ed2 to 10b532b Compare March 10, 2025 10:02
@glehmann glehmann requested a review from ydirson March 14, 2025 08:07
@stormi
Copy link
Member

stormi commented Apr 8, 2025

@glehmann let's discuss it together with @nathanael-h in the coming days?

@glehmann glehmann force-pushed the pkg-in-pipe branch 3 times, most recently from 56bbe01 to 139539f Compare April 15, 2025 11:54
@glehmann glehmann force-pushed the pkg-in-pipe branch 2 times, most recently from 03a273a to 42496a3 Compare April 25, 2025 12:21
@@ -3,82 +3,91 @@
import argparse
import os
from datetime import datetime
from textwrap import dedent
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this "style fixes" should likely be squashed into the first commit, no?

Comment on lines 145 to 294
except Exception as e:
ok = False
print(f"error while building the report: {e}", file=sys.stderr)
print_koji_error(out)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The print_koji_error error message is very specific, but gets triggered by a very wide except Exception. No need to wake up the koji admins if the problem is eg. a disk full while writing the report ;)

Comment on lines 150 to 151
if not ok:
exit(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't exit(1) or return 1 in the exception block do the job, with print_footer in a finally clause?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, not sure why I didn't do that 👍

Comment on lines 17 to 18
# from icecream import ic

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like remnants from a debugging session ;)

tag = tag.split('-')[-1]
return TAG_ORDER.index(tag)

def find_previous_build_commit(session, build_tag, build):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a short comment explaining it is not just the "previous build commit" could be useful

Comment on lines 169 to 170
for commit in gh.get_repo(repo).get_commits(start_sha):
if commit.sha == end_sha:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That idiom feels strange, as git history is not just linear. Isn't it possible do directly get a commit range from gh?
A docstring would help too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't found a way to do that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://docs.github.com/en/rest/commits/commits?apiVersion=2022-11-28#list-commits seems very limited indeed, apparently we would need a local clone to get the proper info (a "monorepo" would help here too). Maybe add a note then, for the time when we see strange results.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that's a problem: github linearizes the history in its commit pages.
We could check in a repo with a complex branch structure (like host-installer) if the commits made before the last commit in the base branch and merged in the current branch appear after the last commit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e15e appears after 6e8c in https://github.com/xcp-ng/host-installer/commits/ydi/10.10.28-8.3, but the merge commit, which is also considered to be in the PR appears before, so we're good for this use case.
I'll add a comment to warn about that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GH linearizes in much the same way git log does, ie I guess it guaranties that all commits descending from the "limit commit" we're looking for will be see first. But other commits merged from a branch whose base does not include that "limit commit" may be listed after the "limit commit" (they bear no relation so order can be arbitrary), so they would be missed by this way of walking.

Question is, how much do we care?

Off the top of my head, I'd say this can happen every time we have a branch that gets merged quickly, while others have been open from some time (in which case the older commits are often sorted as older, and this the probability not to see then listed before the limit is really high). But then we're talking packaging repos only, and the management of package version already pushes towards rebasing before getting a PR merged - so it might not be that much of a problem, but could still occasionally happen.

I still think it warrants a note in the code explaining that this is the best we can reasonably do in the current state of GH APIs, and maybe point to this discussion for more details.

return sorted(prs, key=lambda p: p.number, reverse=True)

parser = argparse.ArgumentParser(description='Generate a report of the packages in the pipe')
parser.add_argument('output', nargs='?', help='Report output path', default='report.html')
parser.add_argument('--generated-info', help="Add this message about the generation in the report")
parser.add_argument('--plane-token', help="The token used to access the plane api", default=os.environ['PLANE_TOKEN'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed by mistake, reintroduced with different indent in next commit (reinident would belong to first commit)

Copy link
Contributor Author

@glehmann glehmann Apr 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a style fix: I used environ.get() to not raise an exception if the envar is not defined.
but indeed it was removed by mistake

Comment on lines 209 to 255
gh = github.Github(auth=github.Auth.Token(os.environ['GITHUB_TOKEN']))
if args.github_token:
gh = github.Github(auth=github.Auth.Token(args.github_token))
try:
gh.get_repo('xcp-ng/xcp') # check that the token is valid
except BadCredentialsException:
gh = None
else:
gh = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"without github" does not seem completely accurate, since that covers "no token passed" and "bad credentials", but not any other error from GH, which happen from time to time.

Also, it looks like print_github_warning would claim "Github malfunction" when it's just "no token passed"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's inaccurate. I'll try something else.

Comment on lines 257 to 286
build_url = f'https://koji.xcp-ng.org/buildinfo?buildID={tagged['build_id']}'
maintained_by = PACKAGES.get(repo.split('/')[-1], {}).get('maintainer')
build_url = f'https://koji.xcp-ng.org/buildinfo?buildID={tagged["build_id"]}'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me some concentration to see why that line is shown as modified, just to realized it does not belong to this commit

Signed-off-by: Gaëtan Lehmann <gaetan.lehmann@vates.tech>
Signed-off-by: Gaëtan Lehmann <gaetan.lehmann@vates.tech>
Signed-off-by: Gaëtan Lehmann <gaetan.lehmann@vates.tech>
Signed-off-by: Gaëtan Lehmann <gaetan.lehmann@vates.tech>
glehmann and others added 3 commits April 30, 2025 14:43
Signed-off-by: Gaëtan Lehmann <gaetan.lehmann@vates.tech>
Co-authored-by: Nathanaël <7300309+nathanael-h@users.noreply.github.com>
Signed-off-by: Gaëtan Lehmann <gaetan.lehmann@vates.tech>
Signed-off-by: Gaëtan Lehmann <gaetan.lehmann@vates.tech>
@stormi stormi removed their request for review April 30, 2025 12:43
@stormi
Copy link
Member

stormi commented Apr 30, 2025

Removing myself from the code reviewers for this one. I'll continue providing feedback on the end result, independently of the implementation.

Signed-off-by: Gaëtan Lehmann <gaetan.lehmann@vates.tech>
Signed-off-by: Gaëtan Lehmann <gaetan.lehmann@vates.tech>
Signed-off-by: Gaëtan Lehmann <gaetan.lehmann@vates.tech>
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants