feat: Implement the API design and set up pyproject.toml for building wheel #11

aluu317 · 2024-10-24T05:00:31Z

Run:

python3 -m build

to build a wheel

Then install it:

pip3 install .

And run the sdk example:

python3 fm_training_estimator/sdk/examples/ex1.py

This PR:

exposes 3 more estimate functions in the SDK, for a total of 4: memory, cost, time, and tokens
changes some print statements to logging
updates the SDK example code

ChanderG

I will need to test the various UIs once (there are no integration tests yet!) to be fully sure about some of the changes, but it looks good otherwise.

fm_training_estimator/ui/core.py

fm_training_estimator/config/arguments.py

Signed-off-by: Angel Luu <angel.luu@us.ibm.com>

aluu317 · 2024-10-24T22:47:31Z

fm_training_estimator/throughput/hybrid/hybrid.py

@@ -45,7 +48,9 @@ def check_lookup(self, seqlen):
            "batch_size": self.ta.per_device_train_batch_size,
            "seq_len": seqlen,
            "gpu_model": self.ia.gpuModel,
-            "method": self.fm.technique,
+            "method": "fsdp"


@ChanderG I wasn't sure about this. The data.csv file you gave me has no "full" in the method column, so I assumed we are looking for "fsdp". Should I leave it as self.fm.technique.value, which is "full", or is this change OK?

ChanderG

We should remove the if condition and pass in technique as-is.

I only recently changed the frontend UI to display "full" instead of "fsdp". We can fix this mismatch either in the data.csv file or in the web UI frontend. The source data files are the more likely place, since the label "fsdp" is misleading when both techniques use FSDP.
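The mismatch discussed above can be reproduced in a few lines. This is only a sketch: the CSV columns and rows are invented stand-ins for data.csv, and the alias table is a hypothetical alternative shown for comparison, not what this PR does (the agreed direction is to pass technique as-is and fix the labels at the source).

```python
import csv
import io

# Hypothetical stand-in for data.csv; the real columns and values may differ.
SAMPLE_CSV = """model_name,method,gpu_model,seq_len,batch_size,tokens_per_second
example-model,fsdp,A100,512,4,1000
example-model,lora,A100,512,4,1500
"""

# Hypothetical alias table mapping the UI-facing name to the label in the data.
METHOD_ALIASES = {"full": "fsdp"}


def lookup(rows, method, normalize=False):
    """Return the rows whose 'method' column equals the requested technique."""
    if normalize:
        method = METHOD_ALIASES.get(method, method)
    return [row for row in rows if row["method"] == method]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

print(len(lookup(rows, "full")))                  # 0: no row is labelled "full"
print(len(lookup(rows, "full", normalize=True)))  # 1: matches the "fsdp" row
```

Passing the technique through unchanged keeps the lookup code simple, at the cost of requiring the data files and the UI to agree on one label.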

ChanderG

I was testing the UI with the new changes and ran into a problem. It is in the "parse" function interface. The SDK code as it stands does not need this interface (the user is expected to put together the config components directly in Python), but the other interfaces (UI/CLI) do, because there the input is parsed out of JSON.

Here is a minimal reproducer:

from dataclasses import dataclass, field
from enum import Enum

from transformers import HfArgumentParser


class TuningTechnique(Enum):
    LORA = "lora"
    FULL = "full"


@dataclass
class FMArguments:
    technique: TuningTechnique = field(
        default=TuningTechnique.FULL,
        metadata={"help": "Fine-tuning technique being used"},
    )


# Input as it arrives from JSON in the UI/CLI: the technique is a plain string.
config = {"technique": "lora"}
arg_parser = HfArgumentParser([FMArguments])
# parse_dict returns a tuple with one instance per dataclass.
res = arg_parser.parse_dict(config)

print(res)

We expect the result to be a parsed FMArguments object whose technique field is a TuningTechnique enum member, but instead the technique field is set to the plain string.

I think this is a limitation of HfArgumentParser (https://github.com/huggingface/transformers/blob/main/src/transformers/hf_argparser.py): it may not be possible to parse from a dict to an enum the way we would want.
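One possible workaround, which is not the approach this PR ends up taking, is to coerce the string into the enum inside the dataclass itself. The sketch below uses only the standard library and constructs the dataclass directly; whether it behaves the same when driven through HfArgumentParser depends on the parser building the object through its normal constructor, which is an assumption not verified here.

```python
from dataclasses import dataclass, field
from enum import Enum


class TuningTechnique(Enum):
    LORA = "lora"
    FULL = "full"


@dataclass
class FMArguments:
    technique: TuningTechnique = field(
        default=TuningTechnique.FULL,
        metadata={"help": "Fine-tuning technique being used"},
    )

    def __post_init__(self):
        # Coerce plain strings (for example from JSON) into the enum.
        # Enum lookup by value raises ValueError for unknown values.
        if not isinstance(self.technique, TuningTechnique):
            self.technique = TuningTechnique(self.technique)


config = {"technique": "lora"}
args = FMArguments(**config)
print(args.technique is TuningTechnique.LORA)  # True
```

The coercion keeps enum semantics inside the code while still accepting the string form that JSON input provides.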

Signed-off-by: Angel Luu <angel.luu@us.ibm.com>

aluu317 · 2024-10-28T21:04:23Z

@ChanderG Ahh good catch, I see the problem. I spent some time today trying to figure out how to get enum to work with JSON/dict parsing, but couldn't. So I switched it back to using str. Let me know if this is good to merge!
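For reference, a str-typed field along the lines described above might look like the following sketch. The allowed values are taken from the reproducer in this thread; the validation step and the ALLOWED_TECHNIQUES name are illustrative assumptions, not necessarily what the merged code does.

```python
from dataclasses import dataclass, field

# Assumed set of accepted values, taken from the reproducer in this thread.
ALLOWED_TECHNIQUES = ("lora", "full")


@dataclass
class FMArguments:
    technique: str = field(
        default="full",
        metadata={"help": "Fine-tuning technique being used"},
    )

    def __post_init__(self):
        # Optional guard: reject values the estimator does not know about.
        if self.technique not in ALLOWED_TECHNIQUES:
            raise ValueError(
                f"technique must be one of {ALLOWED_TECHNIQUES}, got {self.technique!r}"
            )


args = FMArguments(**{"technique": "lora"})
print(args.technique)  # lora
```

A plain string round-trips through JSON and dict parsing without any conversion, which is what makes it the simpler choice for the UI and CLI paths.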

ChanderG

LGTM.

aluu317 force-pushed the setup branch from a65608a to 8e44d13 Compare October 24, 2024 05:01

ChanderG reviewed Oct 24, 2024

fm_training_estimator/ui/core.py (outdated)

fm_training_estimator/config/arguments.py

aluu317 added 3 commits October 24, 2024 14:04

build: add pyproject toml file

82c7d7e

Signed-off-by: Angel Luu <angel.luu@us.ibm.com>

refactor: Use the new dataclass for estimate input to sdk

cc90e49

Signed-off-by: Angel Luu <angel.luu@us.ibm.com>

feat: expose 4 endpoints via sdk with the defined types

1aabc1f

Signed-off-by: Angel Luu <angel.luu@us.ibm.com>

aluu317 force-pushed the setup branch from 8e44d13 to 1aabc1f Compare October 24, 2024 20:05

aluu317 added 2 commits October 24, 2024 16:44

docs: update default 0 for numGpusPerPod to autodiscover

a6ec64e

Signed-off-by: Angel Luu <angel.luu@us.ibm.com>

chore: remove unnessary return

4cda7bb

Signed-off-by: Angel Luu <angel.luu@us.ibm.com>

aluu317 commented Oct 24, 2024

aluu317 marked this pull request as ready for review October 24, 2024 22:49

ChanderG requested changes Oct 28, 2024

aluu317 added 2 commits October 28, 2024 14:59

refactor: Use str instead of Enum for technique

ecb9576

Signed-off-by: Angel Luu <angel.luu@us.ibm.com>

refactor: use technique as is for looking up in learned model

77327c5

Signed-off-by: Angel Luu <angel.luu@us.ibm.com>

ChanderG approved these changes Oct 29, 2024

ChanderG merged commit 2b88d82 into foundation-model-stack:main Oct 29, 2024
1 check passed
