feat: compliant cli checker for correct variants and dependencies #121

Open: drbh wants to merge 20 commits into main from add-compliant-checker-tool

Collaborator

@drbh commented Apr 14, 2025

This PR adds a new CLI tool, compliant, that checks kernels for compliance.

$ compliant
Hugging Face kernel compliance checker

Usage: compliant <COMMAND>

Commands:
  list   List fetched repositories with build variants
  check  Check repository compliance and ABI compatibility
  help   Print this message or the help of the given subcommand(s)

Options:
  -h, --help     Print help
  -V, --version  Print version

check args

$ compliant check --help
Check repository compliance and ABI compatibility

Usage: compliant check [OPTIONS] --repos <REPOS>

Options:
  -r, --repos <REPOS>            Repository IDs or names (comma-separated)
  -m, --manylinux <MANYLINUX>    Manylinux version to check against [default: manylinux_2_28]
  -p, --python-abi <PYTHON_ABI>  Python ABI version to check against [default: 3.9]
  -a, --auto-fetch               Automatically fetch repositories if not found locally
  -r, --revision <REVISION>      Revision (branch, tag, or commit hash) to use when fetching [default: main]
      --long                     Show all variants in a long format. Default is compact output
      --show-violations          Show ABI violations in the output. Default is to only show compatibility status
      --format <FORMAT>          Format of the output. Default is console [default: console] [possible values: console, json]
  -h, --help                     Print help

Usage

List all kernels that are in the cache and have a build variant

$ compliant list
.
├── kernels-community/activation
├── kernels-community/deformable-detr
├── kernels-community/flash-mla
├── kernels-community/quantization
╰── 4 kernel repositories found

checking a repo

$ compliant check --repos kernels-community/activation
├── build: Total: 13 (CUDA: 12, ROCM: 1)
│   ├── ✓ CUDA
│   ╰── ✗ ROCM
╰── abi: compatible
    ├── ✓ manylinux_2_28
    ╰── ✓ python 3.9

With --long, the output lists every build variant so that ABI compatibility can be inspected per variant

$ compliant check --repos kernels-community/activation --long
├── build: Total: 13 (CUDA: 12, ROCM: 1)
│  ✓ CUDA
│    ├── torch25-cxx11-cu118-x86_64-linux
│    ├── torch25-cxx11-cu121-x86_64-linux
│    ├── torch25-cxx11-cu124-x86_64-linux
│    ├── torch25-cxx98-cu118-x86_64-linux
│    ├── torch25-cxx98-cu121-x86_64-linux
│    ├── torch25-cxx98-cu124-x86_64-linux
│    ├── torch26-cxx11-cu118-x86_64-linux
│    ├── torch26-cxx11-cu124-x86_64-linux
│    ├── torch26-cxx11-cu126-x86_64-linux
│    ├── torch26-cxx98-cu118-x86_64-linux
│    ├── torch26-cxx98-cu124-x86_64-linux
│    ╰── torch26-cxx98-cu126-x86_64-linux
│  ✗ ROCM
│    ├── torch25-cxx11-rocm5.4-x86_64-linux
│    ├── torch25-cxx11-rocm5.6-x86_64-linux
│    ├── torch25-cxx98-rocm5.4-x86_64-linux
│    ├── torch25-cxx98-rocm5.6-x86_64-linux
│    ├── torch26-cxx11-rocm5.4-x86_64-linux
│    ├── torch26-cxx11-rocm5.6-x86_64-linux
│    ╰── torch26-cxx11-rocm62-x86_64-linux
╰── abi: compatible
    ├── ✓ manylinux_2_28
    ╰── ✓ python 3.9
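
The grouping into CUDA and ROCM above can be derived from the variant names themselves. The following is a minimal sketch, assuming the third dash-separated field of a variant name carries the backend (`cu…` or `rocm…`), as it does in the names listed above. `Backend` and `backend_of` are hypothetical names for illustration and are not taken from the PR.

```rust
/// Compute backend of a build variant (hypothetical type, for illustration).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Backend {
    Cuda,
    Rocm,
}

/// Classify a variant name such as `torch26-cxx11-cu126-x86_64-linux`.
/// Assumption: the third dash-separated field names the backend.
fn backend_of(variant: &str) -> Option<Backend> {
    let field = variant.split('-').nth(2)?;
    if field.starts_with("cu") {
        Some(Backend::Cuda)
    } else if field.starts_with("rocm") {
        Some(Backend::Rocm)
    } else {
        None
    }
}

fn main() {
    let variants = [
        "torch25-cxx11-cu118-x86_64-linux",
        "torch25-cxx11-rocm5.4-x86_64-linux",
    ];
    for v in variants {
        println!("{v}: {:?}", backend_of(v));
    }
}
```

Names that do not match either prefix yield `None`, so an unexpected directory name is reported rather than silently counted under one backend.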

Screenshot showing the colored output (image, Screenshot 2025-04-14 at 10 48 49 PM)

Checks

  • must contain all builds for a given arch (cuda/rocm)
  • must pass abi-check
  • conforms to having a build dir
  • add json output format
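
The first check in the list above ("must contain all builds for a given arch") amounts to a set difference between the required variants and the variants found in the build dir. A minimal sketch of that idea, under the assumption that both sides are available as plain lists of variant names; `missing_variants` is a hypothetical helper, not the PR's function.

```rust
use std::collections::BTreeSet;

/// Return the required variants that are absent from `present`, sorted.
/// An empty result means the arch is fully covered.
fn missing_variants<'a>(required: &[&'a str], present: &[&str]) -> Vec<&'a str> {
    let present: BTreeSet<&str> = present.iter().copied().collect();
    let mut missing: Vec<&'a str> = required
        .iter()
        .copied()
        .filter(|v| !present.contains(v))
        .collect();
    missing.sort_unstable();
    missing
}

fn main() {
    // Illustrative data only; the real required list comes from build-variants.json.
    let required = [
        "torch26-cxx11-cu124-x86_64-linux",
        "torch26-cxx11-cu126-x86_64-linux",
    ];
    let present = ["torch26-cxx11-cu124-x86_64-linux"];

    let missing = missing_variants(&required, &present);
    if missing.is_empty() {
        println!("✓ CUDA");
    } else {
        println!("✗ CUDA, missing: {missing:?}");
    }
}
```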

@drbh force-pushed the add-compliant-checker-tool branch from bcc946d to bd4f49f (April 14, 2025 22:52)
@drbh marked this pull request as ready for review (April 24, 2025 00:07)

Member

@danieldk left a comment

Really nice to have this! Added a bunch of comments.

@@ -0,0 +1,29 @@
[package]
name = "compliant"

Member

Maybe we should name it kernel-compliance-check? A bit more descriptive.

Collaborator Author

Agreed, updated the dir and tool name in the latest changes. Thanks!

Comment on lines 1 to 13
use anyhow::{Context, Result};
use clap::{Parser, Subcommand, ValueEnum};
use colored::Colorize;
use hf_hub::api::tokio::{ApiBuilder, ApiError};
use hf_hub::{Repo, RepoType};
use kernel_abi_check::{check_manylinux, check_python_abi, Version};
use object::Object;
use once_cell::sync::Lazy;
use serde::{Deserialize, Serialize};
use std::fmt;
use std::fs;
use std::path::{Path, PathBuf};
use thiserror::Error;

Member

Suggested change
use std::fmt;
use std::fs;
use std::path::{Path, PathBuf};

use anyhow::{Context, Result};
use clap::{Parser, Subcommand, ValueEnum};
use colored::Colorize;
use hf_hub::api::tokio::{ApiBuilder, ApiError};
use hf_hub::{Repo, RepoType};
use kernel_abi_check::{check_manylinux, check_python_abi, Version};
use object::Object;
use once_cell::sync::Lazy;
use serde::{Deserialize, Serialize};
use thiserror::Error;

I like this separation of imports between stdlib <-> external crates <-> local.

Collaborator Author

Oh yeah, that's better. Updated in the latest changes. Thanks!

Comment on lines 91 to 93
/// Automatically fetch repositories if not found locally
#[arg(short, long, default_value = "true")]
auto_fetch: bool,

Member

I think it'd be nice if auto-fetching is the default:

  • If cached and up-to-date, immediately use local.
  • Otherwise download and use local.

Collaborator Author

Removed the flag. The tool now always checks whether the local copy matches the latest hash of the requested revision and syncs if needed. Also added --force to fetch even if you already have a local copy.
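
The behavior discussed in this thread (use the local copy when it is current, otherwise fetch, with --force overriding) reduces to a small decision function. This is a sketch under the assumption that the locally cached commit hash and the latest remote hash for the revision are already known; `SyncAction` and `sync_action` are hypothetical names, and how the hashes are obtained is left out.

```rust
/// What to do before checking a repository (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum SyncAction {
    UseLocal,
    Fetch,
}

/// Decide whether a download is needed.
/// `local_hash` is `None` when the repository is not cached at all.
fn sync_action(local_hash: Option<&str>, remote_hash: &str, force: bool) -> SyncAction {
    match local_hash {
        // Cached, up to date, and no --force: use the local copy immediately.
        Some(local) if !force && local == remote_hash => SyncAction::UseLocal,
        // Missing, stale, or forced: download first.
        _ => SyncAction::Fetch,
    }
}

fn main() {
    println!("{:?}", sync_action(Some("bd4f49f"), "bd4f49f", false));
    println!("{:?}", sync_action(Some("bcc946d"), "bd4f49f", false));
    println!("{:?}", sync_action(None, "bd4f49f", false));
    println!("{:?}", sync_action(Some("bd4f49f"), "bd4f49f", true));
}
```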

#[derive(Subcommand)]
pub enum Commands {
/// List fetched repositories with build variants
List {

Member

Do we need this? I like that with normal Hub/kernels usage there is not a real distinction between local/remote. You give a repo name and the library takes care of everything. With list and non-auto download (see below), you have to think about where things live, etc.

Collaborator Author

Yeah, that makes sense. Removed in the latest commits.

}

async fn fetch_compliant_variants() -> Result<(Vec<String>, Vec<String>)> {
let url = "https://raw.githubusercontent.com/huggingface/kernel-builder/refs/heads/main/build-variants.json";

Member

Maybe we should move this build-variants.json into the src directory of the checker and bake it into the binary? I think this allows more control over future changes to the JSON file and avoids breaking all past versions if the format changes.

Collaborator Author

I think this makes a lot of sense. For now I've added a new build.rs script that vendors the JSON into an .rs file at build time, so it's baked into the binary. I think this should add a lot more stability, and it avoids fetching the variants on each run.
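
To illustrate the build-time vendoring described here: a build script can render the variant list as Rust source and write it to a generated file. The sketch below shows only the rendering step, with the variant names passed in directly. `render_const` and the constant name are hypothetical; the actual build.rs in the PR may be structured differently, and parsing build-variants.json is left out.

```rust
/// Render a list of variant names as Rust source defining a constant slice.
/// In a real build.rs the result would typically be written under `OUT_DIR`
/// and pulled in with `include!`; that part is omitted here.
fn render_const(name: &str, variants: &[&str]) -> String {
    let mut out = format!("pub const {name}: &[&str] = &[\n");
    for v in variants {
        // `{:?}` emits a quoted, escaped string literal.
        out.push_str(&format!("    {v:?},\n"));
    }
    out.push_str("];\n");
    out
}

fn main() {
    // Illustrative input; the real list would come from build-variants.json.
    let cuda = [
        "torch26-cxx11-cu124-x86_64-linux",
        "torch26-cxx11-cu126-x86_64-linux",
    ];
    print!("{}", render_const("CUDA_VARIANTS", &cuda));
}
```

Because the list is compiled in, each released version of the checker keeps working against the variant set it shipped with, which is the stability concern raised in the review comment.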

Comment on lines 461 to 462
let hash = content.trim();
let snapshot_dir = repo_path.join(format!("snapshots/{}", hash));

Member

I think we can directly get the correct repo directory from hf-hub, so that we don't have to care about the on-disk representation?

Collaborator Author

Agreed, this has been simplified to rely on hf-hub in the latest commits.

let repo_path = get_repo_path(repo_id, cache_dir);

// Check if repository exists locally
if !repo_path.exists() || !repo_path.join("refs/main").exists() {

Member

Fails on other branches.

Collaborator Author

now relying on hf-hub
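
For context on the "fails on other branches" comment: the reviewed code hardcodes `refs/main`, so a run with a different --revision still looks up the main branch. The thread was resolved by delegating to hf-hub, but a minimal sketch of the underlying issue, assuming the `refs/<revision>` layout implied by the reviewed code, would parameterize the path. `ref_file` is a hypothetical helper.

```rust
use std::path::{Path, PathBuf};

/// Build the path of the ref file for a given revision instead of
/// hardcoding `refs/main`. Assumes the `refs/<revision>` cache layout
/// that the reviewed code relies on.
fn ref_file(repo_path: &Path, revision: &str) -> PathBuf {
    repo_path.join("refs").join(revision)
}

fn main() {
    let repo = Path::new("cache/kernels-community/activation");
    println!("{}", ref_file(repo, "main").display());
    println!("{}", ref_file(repo, "dev").display());
}
```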

}

// Re-check after potential fetch
let ref_file = repo_path.join("refs/main");

Member

Fails on other branches.

Collaborator Author

now relying on hf-hub

.with_context(|| format!("Failed to read ref file: {:?}", ref_file))?;

let hash = content.trim();
let snapshot_dir = repo_path.join(format!("snapshots/{}", hash));

Member

Get the directory from hf-hub.

Collaborator Author

now relying on hf-hub

@@ -0,0 +1,155 @@
use anyhow::{Context, Result};

Member

Small nit: we use eyre in build2cmake and kernel-abi-check. Maybe we should use the same here for consistency?

Collaborator Author

Good catch, I've switched from anyhow to eyre in the latest changes. Thanks!

@drbh drbh requested a review from danieldk May 1, 2025 13:30