feat: compliant cli checker for correct variants and dependencies #121

Merged
danieldk merged 23 commits into main from add-compliant-checker-tool
May 7, 2025

drbh
Collaborator

@drbh drbh commented Apr 14, 2025

This PR adds a new CLI tool, compliant, that checks kernels for compliance.

$ compliant
Hugging Face kernel compliance checker

Usage: compliant <COMMAND>

Commands:
  list   List fetched repositories with build variants
  check  Check repository compliance and ABI compatibility
  help   Print this message or the help of the given subcommand(s)

Options:
  -h, --help     Print help
  -V, --version  Print version

check args

$ compliant check --help
Check repository compliance and ABI compatibility

Usage: compliant check [OPTIONS] --repos <REPOS>

Options:
  -r, --repos <REPOS>            Repository IDs or names (comma-separated)
  -m, --manylinux <MANYLINUX>    Manylinux version to check against [default: manylinux_2_28]
  -p, --python-abi <PYTHON_ABI>  Python ABI version to check against [default: 3.9]
  -a, --auto-fetch               Automatically fetch repositories if not found locally
  -r, --revision <REVISION>      Revision (branch, tag, or commit hash) to use when fetching [default: main]
      --long                     Show all variants in a long format. Default is compact output
      --show-violations          Show ABI violations in the output. Default is to only show compatibility status
      --format <FORMAT>          Format of the output. Default is console [default: console] [possible values: console, json]
  -h, --help                     Print help

Usage

List all kernels that are in the cache and have a build variant:

$ compliant list
.
├── kernels-community/activation
├── kernels-community/deformable-detr
├── kernels-community/flash-mla
├── kernels-community/quantization
╰── 4 kernel repositories found

Checking a repo:

$ compliant check --repos kernels-community/activation
├── build: Total: 13 (CUDA: 12, ROCM: 1)
│   ├── ✓ CUDA
│   ╰── ✗ ROCM
╰── abi: compatible
    ├── ✓ manylinux_2_28
    ╰── ✓ python 3.9

With --long, the output lists every build variant per backend:

$ compliant check --repos kernels-community/activation --long
├── build: Total: 13 (CUDA: 12, ROCM: 1)
│  ✓ CUDA
│    ├── torch25-cxx11-cu118-x86_64-linux
│    ├── torch25-cxx11-cu121-x86_64-linux
│    ├── torch25-cxx11-cu124-x86_64-linux
│    ├── torch25-cxx98-cu118-x86_64-linux
│    ├── torch25-cxx98-cu121-x86_64-linux
│    ├── torch25-cxx98-cu124-x86_64-linux
│    ├── torch26-cxx11-cu118-x86_64-linux
│    ├── torch26-cxx11-cu124-x86_64-linux
│    ├── torch26-cxx11-cu126-x86_64-linux
│    ├── torch26-cxx98-cu118-x86_64-linux
│    ├── torch26-cxx98-cu124-x86_64-linux
│    ╰── torch26-cxx98-cu126-x86_64-linux
│  ✗ ROCM
│    ├── torch25-cxx11-rocm5.4-x86_64-linux
│    ├── torch25-cxx11-rocm5.6-x86_64-linux
│    ├── torch25-cxx98-rocm5.4-x86_64-linux
│    ├── torch25-cxx98-rocm5.6-x86_64-linux
│    ├── torch26-cxx11-rocm5.4-x86_64-linux
│    ├── torch26-cxx11-rocm5.6-x86_64-linux
│    ╰── torch26-cxx11-rocm62-x86_64-linux
╰── abi: compatible
    ├── ✓ manylinux_2_28
    ╰── ✓ python 3.9

Screenshot to show coloring:

[image: Screenshot 2025-04-14 at 10 48 49 PM]

Checks

  • must contain all builds for a given arch (CUDA/ROCm)
  • must pass abi-check
  • must have a build dir
  • add JSON output format
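
The first check can be pictured as a set difference between the variants a backend is expected to ship and the variant directories actually present under build. The following is a minimal sketch under that assumption; the names `missing_variants` and `is_compliant` are hypothetical and the checker's real implementation may be structured differently.

```rust
use std::collections::BTreeSet;

/// Returns the expected variants that are absent from `present`, sorted.
/// Hypothetical helper for illustration; not the checker's actual API.
fn missing_variants<'a>(expected: &[&'a str], present: &[&str]) -> Vec<&'a str> {
    let present: BTreeSet<&str> = present.iter().copied().collect();
    expected
        .iter()
        .copied()
        .filter(|variant| !present.contains(*variant))
        .collect::<BTreeSet<_>>() // sort and deduplicate
        .into_iter()
        .collect()
}

/// A backend passes only when no expected variant is missing.
fn is_compliant(expected: &[&str], present: &[&str]) -> bool {
    missing_variants(expected, present).is_empty()
}
```

Under this reading, a backend with a single build out of several expected ones would be reported as failing, which matches the ✗ ROCM line in the sample output above.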

@drbh drbh force-pushed the add-compliant-checker-tool branch from bcc946d to bd4f49f Compare April 14, 2025 22:52
@drbh drbh marked this pull request as ready for review April 24, 2025 00:07
Member

@danieldk danieldk left a comment

Really nice to have this! Added a bunch of comments.

@drbh drbh requested a review from danieldk May 1, 2025 13:30
Member

@danieldk danieldk left a comment

Thanks for making all the changes! Looks very good already, added some more comments on the new changes.

Comment on lines 18 to 44
println!("cargo:warning=Fetching remote variants JSON...");
let url = "https://raw.githubusercontent.com/huggingface/kernel-builder/refs/heads/main/build-variants.json";

let mut remote_variants_json = String::new();

match ureq::get(url).call() {
    Ok(resp) => {
        match resp.into_reader().read_to_string(&mut remote_variants_json) {
            Ok(_) => {
                println!(
                    "cargo:warning=Successfully fetched remote variants ({} bytes)",
                    remote_variants_json.len()
                );
            }
            Err(e) => {
                println!("cargo:warning=Error reading response body: {e}");
                // Instead of returning an empty JSON, provide fallback content
                remote_variants_json = String::from("{}");
            }
        }
    }
    Err(e) => {
        println!("cargo:warning=Error fetching remote variants: {e}");
        // Provide fallback content
        remote_variants_json = String::from("{}");
    }
};

Member

This is incompatible with Nix: we cannot have network access during the build. We should make the JSON file part of the compliance checker source files. I think (but we need to check) that if we add a symlink to the JSON file in e.g. src, cargo publish would replace the symlink with the contents of the file. That way we can keep it in its current repo location, but also include it in the crate (when uploading to crates.io).

I also don't think we should be writing Rust code through a string: we don't get syntax checking, completion, etc. We should use include_str! or include_bytes! and then parse it with serde. The cost of doing so is going to be near-zero.

Collaborator Author

Great point. I've switched to a much simpler solution: include_str! on the build-variants.json at the root of this repo, with a symlink that exposes the file in src (I still need to confirm that publish works too). The build.rs is now removed and the code is much cleaner.

VARIANTS_CACHE.get_or_init(|| {{
    serde_json::from_str(VARIANTS_DATA).unwrap_or_else(|_| {{
        // Provide a fallback empty object if parsing fails
        serde_json::json!({{}})

Member

If the data comes from the repo, we know it's going to be in-sync and we can panic when the JSON is invalid (since it's a bug).
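
A minimal sketch of that approach, using only the standard library. The embedded text is a stand-in: `VARIANTS_DATA` below is a hypothetical `backend=variant` listing rather than the real JSON, and in the crate it would presumably come from `include_str!` and be parsed with serde_json. `parse_variants` and `variants` are illustrative names. The point is that parsing happens once, and malformed embedded data aborts with a clear message instead of silently falling back to an empty object.

```rust
use std::collections::BTreeMap;
use std::sync::OnceLock;

// Stand-in for data embedded at compile time (hypothetical format, not the real JSON).
const VARIANTS_DATA: &str =
    "cuda=torch26-cxx11-cu124-x86_64-linux\nrocm=torch26-cxx11-rocm62-x86_64-linux";

static VARIANTS_CACHE: OnceLock<BTreeMap<String, Vec<String>>> = OnceLock::new();

/// Parses `backend=variant` lines; returns an error message on malformed input.
fn parse_variants(data: &str) -> Result<BTreeMap<String, Vec<String>>, String> {
    let mut map: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for line in data.lines().filter(|l| !l.trim().is_empty()) {
        let (backend, variant) = line
            .split_once('=')
            .ok_or_else(|| format!("malformed line: {line:?}"))?;
        map.entry(backend.trim().to_string())
            .or_default()
            .push(variant.trim().to_string());
    }
    Ok(map)
}

/// Parses the embedded data on first use. Invalid embedded data is a bug in
/// the crate itself, so panic instead of substituting an empty fallback.
fn variants() -> &'static BTreeMap<String, Vec<String>> {
    VARIANTS_CACHE.get_or_init(|| {
        parse_variants(VARIANTS_DATA).expect("embedded variants data is invalid")
    })
}
```

Because the data ships inside the binary, a panic here can only be triggered by a broken checkout or release, which is exactly the case worth failing loudly on.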

Collaborator Author

updated/removed in latest changes

cmd.arg("--force");
}

println!("Using huggingface-cli to download repository");

Member

NIT: I would print such messages to stderr. For me stdout is only for output that you may want to pipe to other processes.
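
A small sketch of that split, with a hypothetical `report` function and made-up message text. Writing through generic `Write` handles keeps the example testable; in the CLI the two handles would be `std::io::stdout()` and `std::io::stderr()`, or simply `println!` and `eprintln!`.

```rust
use std::io::{self, Write};

/// Writes progress to `diag` (stderr in the CLI) and the result to `out`
/// (stdout), so that piping stdout yields only the result line.
fn report<O: Write, D: Write>(
    out: &mut O,
    diag: &mut D,
    repo: &str,
    compatible: bool,
) -> io::Result<()> {
    // Diagnostic chatter: useful to a human, noise to a downstream process.
    writeln!(diag, "Checking repository {repo}")?;
    // Actual output: the part a pipeline consumer wants to read.
    let status = if compatible { "compatible" } else { "incompatible" };
    writeln!(out, "{repo}: {status}")?;
    Ok(())
}
```

With this separation, redirecting or piping stdout captures only the result, while progress messages still reach the terminal.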

Collaborator Author

Good point, I've updated many of the print lines to eprintln!


// Parse object file
let file = object::File::parse(&*binary_data)
    .map_err(|e| eyre::eyre!("Cannot parse object file: {}: {}", so_path.display(), e))?;

Member

In all such cases I always use with_context in place of map_err. The original error is then printed as part of the chain when the error bubbles up out of main. I think error chains are nicer than errors nested in a string (e.g. imagine the caller of this function wrapping it in a string again).
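
To illustrate what an error chain buys over a formatted string, here is a standard-library-only sketch. It does not use eyre; the `Context` type, `parse_size`, and `chain` are hypothetical and only mimic what a context-wrapping helper does: each layer keeps its own message and exposes the cause through `source()`.

```rust
use std::error::Error;
use std::fmt;

/// An error that adds a message and keeps the underlying cause as `source`.
#[derive(Debug)]
struct Context {
    message: String,
    source: Box<dyn Error + 'static>,
}

impl fmt::Display for Context {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Only this layer's message; the cause stays reachable via `source()`.
        write!(f, "{}", self.message)
    }
}

impl Error for Context {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.source)
    }
}

/// Hypothetical fallible step: fails on input that is not a number.
fn parse_size(path: &str, text: &str) -> Result<u64, Context> {
    text.trim().parse::<u64>().map_err(|e| Context {
        message: format!("Cannot parse size in {path}"),
        source: Box::new(e), // cause is preserved, not flattened into the string
    })
}

/// Collects every message in the chain, outermost first.
fn chain(err: &(dyn Error + 'static)) -> Vec<String> {
    let mut messages = vec![err.to_string()];
    let mut current = err.source();
    while let Some(cause) = current {
        messages.push(cause.to_string());
        current = cause.source();
    }
    messages
}
```

Each caller can add another layer without re-quoting the inner text, and a reporter can print the layers one per line.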

Collaborator Author

Oh nice, that's a great recommendation. I've updated all of the map_err calls to with_context, which is a very nice error improvement. Thanks!


/// Manylinux version to check against
#[arg(short, long, default_value = "manylinux_2_28")]
manylinux: ManylinuxVersion,

Member

I think we should stringly-type this to avoid having to keep the enum in sync with the data from the manylinux project.
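
One way to read this suggestion, sketched with a hypothetical `resolve_policy` helper: accept the version as a plain string and validate it against whatever policy names the loaded data contains, so a new manylinux policy needs no code change. The list of known names in the usage below is illustrative, not taken from the project.

```rust
/// Validates a user-supplied policy name against the names present in the
/// loaded policy data. Hypothetical helper; the real lookup may differ.
fn resolve_policy<'a>(requested: &str, known: &[&'a str]) -> Result<&'a str, String> {
    known
        .iter()
        .copied()
        .find(|name| *name == requested)
        .ok_or_else(|| {
            format!(
                "unknown manylinux version {requested:?}; available: {}",
                known.join(", ")
            )
        })
}
```

For example, `resolve_policy("manylinux_2_28", &["manylinux2014", "manylinux_2_28"])` returns `Ok("manylinux_2_28")`, while an unknown name yields an error that lists the available ones.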

Collaborator Author

makes sense, updated in the latest commits. Thanks!

@drbh drbh force-pushed the add-compliant-checker-tool branch from 40d8da1 to b8026b1 Compare May 6, 2025 16:16
@drbh drbh requested a review from danieldk May 6, 2025 16:49
Member

@danieldk danieldk left a comment

🎉

@danieldk danieldk merged commit 4784203 into main May 7, 2025
8 checks passed
@danieldk danieldk deleted the add-compliant-checker-tool branch May 7, 2025 12:43