ENH remove history writing into model card, convert hyperparameters from a dict into table in model card, removed a header #872

merveenoyan · 2022-05-12T15:48:38Z

This PR removes the history part of the model card and turns the hyperparameters dictionary into a table in the model card, as shown below.

Previously it looked like this:

I removed the metrics (aka history) from the card itself and turned the hyperparameters into a table. The history is already kept in a JSON file, and when people train a model for 1000 epochs the full history does not look good on the card, which is why I removed it.

I also removed the header "Training Procedure" because I felt like it looked redundant.

This solves #869.
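
For readers who want to see the dictionary-to-table change concretely, here is a minimal sketch. It is not the code from this PR: the function name `hyperparameters_to_markdown_table`, the column headers, and the example values are all assumptions made for illustration, and it only handles a flat dictionary.

```python
def hyperparameters_to_markdown_table(hyperparameters):
    """Render a flat dict of hyperparameters as a two-column markdown table.

    Hypothetical helper for illustration only; not the function in keras_mixin.py.
    """
    # Header row and alignment row of a markdown table.
    lines = ["| Hyperparameters | Value |", "| :-- | :-- |"]
    # One row per hyperparameter, in the dict's insertion order.
    for name, value in hyperparameters.items():
        lines.append(f"| {name} | {value} |")
    return "\n".join(lines)


# Example values are made up; a real optimizer config would have more keys.
example = {"name": "Adam", "learning_rate": 0.001, "beta_1": 0.9}
table = hyperparameters_to_markdown_table(example)
print(table)
```

A table keeps one hyperparameter per row, which tends to stay readable on the rendered card where a printed dictionary would wrap into a single long line.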

HuggingFaceDocBuilderDev · 2022-05-12T15:59:40Z

The documentation is not available anymore as the PR was closed or merged.

adrinjalali · 2022-05-16T12:54:41Z

@merveenoyan could you write a test for this so that we could see how it affects the results?

merveenoyan · 2022-05-16T13:35:35Z

@adrinjalali This changes the output in the model card: the hyperparameters were previously printed as a dictionary and are now a table, and I removed the part that wrote the history into the card. I feel like testing this is overkill. If you insist I will write the test, but I would rather not do it for the time being 😅

osanseviero · 2022-05-16T14:04:51Z

src/huggingface_hub/keras_mixin.py

+        if model.history.history != {}:
+            path = os.path.join(save_directory, "history.json")
+            with open(path, "w") as f:
+                json.dump(model.history.history, f)


Could we add a test that checks that this file is created and contains the history?

I already added that test previously, see

huggingface_hub/tests/test_keras_integration.py

Line 228 in 65d3c88

self.assertIn("history.json", files)

If you want, I can create a test for the case where there's no history (I can make a prediction so the history goes away) and assert that the file doesn't exist, but I feel like that's also overkill.

I'm actually confused by this part as I thought this was added in #861. Also, you're not keeping the original format:

path = os.path.join(save_directory, "history.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(model.history.history, f, indent=2, sort_keys=True)

If there's already a test then it's good, but let's keep the json format
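
The two points discussed in this thread (keeping the `indent=2, sort_keys=True` JSON format, and checking both the history and no-history cases) can be sketched as below. This is an illustrative stand-in, not the repository's code or test: `save_history` is a hypothetical helper, and a plain dict replaces `model.history.history` so the example runs without TensorFlow.

```python
import json
import os
import tempfile


def save_history(history, save_directory):
    """Write `history` to history.json; return the path, or None if empty.

    Hypothetical helper mirroring the snippet under review.
    """
    if not history:
        # Nothing was trained, so no file is written.
        return None
    path = os.path.join(save_directory, "history.json")
    with open(path, "w", encoding="utf-8") as f:
        # indent and sort_keys keep the file readable and its diffs stable.
        json.dump(history, f, indent=2, sort_keys=True)
    return path


with tempfile.TemporaryDirectory() as tmp:
    # Case 1: empty history, no file should appear.
    empty_result = save_history({}, tmp)
    files_without_history = os.listdir(tmp)

    # Case 2: non-empty history (made-up values), file should round-trip.
    history = {"loss": [0.9, 0.5], "accuracy": [0.6, 0.8]}
    written_path = save_history(history, tmp)
    files_with_history = os.listdir(tmp)
    with open(written_path, encoding="utf-8") as f:
        text = f.read()
    reloaded = json.loads(text)
```

With `sort_keys=True` the keys are written in alphabetical order regardless of the order in which metrics were logged, so two saves of the same history produce the same file.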

osanseviero

The PR name says "ENH remove history writing into model card and convert hyperparameters into table in model card for Keras #872", but I feel there are more things going on. Improving the PR description and name a bit would be very useful here.

Some general thoughts/questions though:

Creating a markdown table in a function called _extract_hyperparameters_from_keras seems misleading, should this be renamed?
Why was "training procedure" h2 removed?
Why was the metric writing removed entirely?

osanseviero · 2022-05-16T14:13:57Z

src/huggingface_hub/keras_mixin.py

+        if model.history.history != {}:
+            path = os.path.join(save_directory, "history.json")
+            with open(path, "w") as f:
+                json.dump(model.history.history, f)


If there's already a test then it's good, but let's keep the json format

merveenoyan · 2022-05-16T14:23:37Z

@osanseviero I will check and address your comments :)

adrinjalali · 2022-05-16T14:23:53Z

It's just that as it is, it's not easy for me to review the PR since I don't know what the effect of this PR is.

It would make it easier to review if I could see an example before and after this PR, and then we can decide if a test makes sense.

Like, a standalone example to use this change, and generate the modelcard locally and see the difference.

merveenoyan · 2022-05-16T15:13:28Z

@osanseviero I updated the description and name. I also changed the function name for table creation and brought back the JSON format.
@adrinjalali I trained and pushed a model for you:
after this PR: https://huggingface.co/merve/model-card-pr
before this PR: https://huggingface.co/merve/model-card-example

merveenoyan · 2022-05-16T15:21:28Z

The PR name says "ENH remove history writing into model card and convert hyperparameters into table in model card for Keras #872", but I feel there are more things going on. Improving the PR description and name a bit would be very useful here.

Some general thoughts/questions though:

Creating a markdown table in a function called _extract_hyperparameters_from_keras seems misleading, should this be renamed?

Why was "training procedure" h2 removed?

Why was the metric writing removed entirely?

I removed metric writing from the card only. We already put the metrics in a JSON file, so having them in the card as well is redundant, and the card looks bad when a user trains a model for 100 epochs.
I also thought the h2 was redundant, but I can bring it back.
I changed the name of the function, thanks for the heads up :)

nateraw

Ok so I see a few things going on here, not just hparams related stuff.

I see training metrics being removed entirely? (Maybe I looked too quickly?)
I don't see tensorboard in your second model repo you posted above that is the result of this PR. Is this intentional?
I would really like to see some comments/docstrings added so folks who want to contribute can have better understanding of what's going on
The code has changed quite a bit but the tests haven't. Should we be adding in tests for this? (I think probably yes?)

Also, if we're going to add hparams to a file, might I suggest we do so with hparams.yaml so we can get the added benefit of seeing hparams in tensorboard if the user pushes tensorboard? This would take some additional logic + testing I'm sure, but I'm pretty sure if you put hparams.yaml in the same dir as your tensorboard logs, they should surface up in the hparams tab in TensorBoard, which can be really helpful. Not a dealbreaker, but an idea that could be fruitful to some.

nateraw · 2022-05-16T17:16:09Z

src/huggingface_hub/keras_mixin.py

+        if model.history.history != {}:
+            path = os.path.join(save_directory, "history.json")
+            with open(path, "w") as f:
+                json.dump(model.history.history, f, indent=2, sort_keys=True)


Thank you for making sure the format is in line with HF stuff.

nateraw · 2022-05-16T17:16:58Z

src/huggingface_hub/keras_mixin.py

@@ -27,45 +27,27 @@
    import tensorflow as tf


-def _extract_hyperparameters_from_keras(model):
+def _create_hyperparameters_table(model):


A docstring here would be helpful to explain what's going on.

nateraw · 2022-05-16T17:17:39Z

src/huggingface_hub/keras_mixin.py

@@ -123,15 +85,11 @@ def _create_model_card(
    model_card += "\n## Intended uses & limitations\n\nMore information needed\n"
    model_card += "\n## Training and evaluation data\n\nMore information needed\n"
    if hyperparameters is not None:
-        model_card += "\n## Training procedure\n"


Why is this removed? I don't know if it needs to be there or not, but I'm just curious as to why it's gone now.

I thought it was redundant. We're putting training related stuff only if hyperparams are there so it doesn't make any sense to keep it since metrics are gone.

nateraw · 2022-05-16T17:19:58Z

src/huggingface_hub/keras_mixin.py

-            with open(path, "w", encoding="utf-8") as f:
-                json.dump(model.history.history, f, indent=2, sort_keys=True)
-            lines = []
-            logs = model.history.history


Wait so here the training metrics section is gone too? No TensorBoard logs either?

TensorBoard logs don't leverage this anyway. They're still here.
See

huggingface_hub/src/huggingface_hub/keras_mixin.py

Line 422 in 65d3c88

if log_dir is not None:

osanseviero · 2022-05-16T17:31:00Z

Multiple small PRs, each with an individual change, tend to be better than one larger PR with multiple changes that affect completely unrelated things. The current PR is a bit harder for 3 people to go through and understand what is being changed, and it is clearly creating some confusion 😅

merveenoyan · 2022-05-17T09:11:25Z

I didn't remove TensorBoard. I just didn't log anything for the model I pushed, to be quick. See

huggingface_hub/src/huggingface_hub/keras_mixin.py

Line 386 in 49a14d9

if log_dir is not None:

to see that they're still here. My tests wouldn't pass if they weren't there.

merveenoyan · 2022-05-17T09:17:55Z

I will break this PR up.

osanseviero · 2022-05-17T19:04:36Z

Closing this as it's now split.

remove history writing and prettify hyperparameters section

fadb480

merveenoyan requested review from nateraw and osanseviero May 12, 2022 15:48

put history.json back that was accidentally removed

fb8316a

osanseviero reviewed May 16, 2022


renamed table creation and changed format of json

49a14d9

merveenoyan changed the title ~~ENH remove history writing into model card and convert hyperparameters into table in model card for Keras~~ ENH remove history writing into model card, convert hyperparameters from a dict into table in model card, removed a header May 16, 2022

nateraw reviewed May 16, 2022


merveenoyan mentioned this pull request May 17, 2022

ENH Removed history writing in Keras model card #876

Merged

osanseviero closed this May 17, 2022
