
[Feature Request]: Retrieve meta-data for models from a YAML file #8029

Open
1 task done
schumar opened this issue Feb 22, 2023 · 7 comments
Labels
enhancement New feature or request

Comments

@schumar

schumar commented Feb 22, 2023

Is there an existing issue for this?

  • I have searched the existing issues and checked the recent builds/commits

What would your feature do?

Allow storing metadata for models (i.e. Checkpoints, LoRAs, Hypernets,
Embeddings), which can be useful for the WebUI, in a YAML file alongside the
model (in the same directory, with extension .webui.yaml).

This would allow solving a whole bunch of existing feature requests; I found at
least these:
#3121 #3497 #3522 #4996 #5237 #5922 #6013 #6729 #7169

It might also help with:
#4476 #3443 #4286 #1800 #6574

There is already a pull request, #7953, which stores a single piece of metadata
(a description) in a .txt file, similar to sd-model-preview,
but we might want to store much more, e.g.

  • A nice "display name", which can be shown in the UI instead of the filename
  • The already mentioned description
  • Usage hints for the model
  • Trigger words
  • A "favorite" flag (can then show that model with a star icon)
  • URL where to find the model (e.g. to look for updates, and more details)
  • URL where to donate
  • Flag for NSFW-filtering (so we could switch the WebUI to "SFW mode" by not
    showing any models with that tag)
  • Information on whether the model works for realistic/semi-realistic/anime styles
  • Information on whether the model can produce SFW and/or NSFW pictures
  • For checkpoints: Which (if any) VAE is recommended?
  • List of other models this is known to work with (e.g. for a LoRA: which
    checkpoints does it work with?)
  • Example generation data (prompt, neg-prompt, cfg, sampler, steps, seed)
  • Suggested weight(-range)
  • Can the model be used for txt2img and/or img2img?

(and that list is just what I can come up with, I'm sure others will have
a lot of other great ideas!)

Proposed workflow

The user has to create the metadata file manually (at least at first; later on,
there might be extensions that allow doing this via the UI, or model authors
might start providing such files too).

Whenever the WebUI builds a list of models by scanning a directory, it would
also need to load and parse the metadata file and add the retrieved
information to the model object.

This would then allow other parts of the WebUI (and of course extensions) to use
this data (see the list above for some ideas).
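
To make that concrete, here is a minimal sketch of the loading step, assuming PyYAML (which the webui already depends on); load_model_metadata and the metadata attribute are hypothetical names, not existing webui API:

import os
import yaml  # PyYAML

def load_model_metadata(model_path):
    # Look for a ".webui.yaml" sidecar next to the model file,
    # e.g. "AOM3A3.safetensors" -> "AOM3A3.webui.yaml".
    base, _ext = os.path.splitext(model_path)
    meta_path = base + ".webui.yaml"
    if not os.path.isfile(meta_path):
        return {}
    try:
        with open(meta_path, encoding="utf-8") as f:
            return yaml.safe_load(f) or {}
    except yaml.YAMLError:
        # A malformed sidecar shouldn't break model loading.
        return {}

# during the directory scan, something like:
# checkpoint.metadata = load_model_metadata(checkpoint.filename)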

Additional information

I don't expect to simply write this feature request and then have others do all
the work :) I wanted to put this out there before starting any coding.

So this is more of an RFC:

  • What do you think about the idea?
  • What other data could be stored?
  • Is there already something out there that does something similar?
  • Is there some better way to do this?
  • @AUTOMATIC1111, would you be okay with a PR?
  • Any important hints on how to implement this?

Here's an example file for AOM3A (AOM3A3.webui.yaml)

displayname: AbyssOrangeMix3 A3
description: |
  The main model, "AOM3 (AbyssOrangeMix3)", is a purely upgraded model that
  improves on the problems of the previous version, "AOM2". "AOM3" can generate
  illustrations with very realistic textures and can generate a wide variety of
  content.

  A3 is a midpoint of artistic and kawaii. The model has been tuned to combine
  realistic textures, an artistic style that also feels like an oil colour
  style, and a cute anime-style face.
source: "https://civitai.com/models/9942/abyssorangemix3-aom3"
based-on: "SD 1.5"
nsfw-filter: false
does:
  anime:  true
  semirealistic: true
  realistic: false
  sfw: true
  nsfw: true
hints: |
  Don't write a lot of "Negative prompt".
  Sampler: “DPM++ SDE Karras” is good.
  Steps: for Test: 12, illustration: 20
vae: AOM3A3.vae.pt
works-with:
  # 1: no, 2: ok, 3: good, 0/unset: unknown
  vaes:
    - name: anythingv3
      hash: 0e3f4822
      rating: 2
    - name: bastard_mse
      rating: 3
  loras:
    - name: alloldrpgarts11990s_alloldrpgarts11990sV1
      rating: 2
examples:
  - pos: farmer, plowing field, under an alien sun
    neg: machines
    cfg: 7.0
    sampler: Euler a
    steps: 25
    seed: 8123234
  - pos: pumpkin with hat
    neg: witch
@schumar schumar added the enhancement New feature or request label Feb 22, 2023
@Skeula

Skeula commented Feb 22, 2023

I've actually started something similar to this using a fork of the model-keywords extension (though using JSON). My scope is a little less ambitious but very similar. That experience tells me most of this could be implemented as an extension (and probably should be).

What that looks like right now is this:

{
    "title": "MeinaMix",
    "tags": [
        "anime",
        "illustration",
        "mix",
        "semi-realistic",
        "model"
    ],
    "author": "Meinaaa",
    "type": "checkpoint merge",
    "description": "This model may do nsfw art! (add nsfw in the negative prompt if you don't wish for nsfw art )My main objectives for my model is:1- Not need a long prompt to generate good images and relay less in luck, using the prompt only to fine-tune the results.2- Be capable of generating wallpaper like images!However making models, merging and testing takes a lot of time, so i made a ko-fi page in case you like my model and want me to support me improve it by helping me stay awake by giving me coffee <3 , it will be very much appreciated: https://ko-fi.com/meinaRecommendations of use:for the negative: (worst quality, low quality:1.4), (malformed hands:1.4),(poorly drawn hands:1.4),(mutated fingers:1.4),(extra limbs:1.35),(poorly drawn face:1.4), The best samplers in most of the generations is DPM++ SDE/DPM++ SDE Karass at 20 to 50 steps, Euler A at 50 steps, with a CFG scale of 5 up to 10. ( Clip skip 1 or 2. )As for the upscaler in most of the scenarios is R-ESRGAN 4x, with 10 steps at 0.4 up to 0.6 denoising.I've been testing with the Orangemix VAE, it will be added in the download option in case you don't have it. I changed the VAE and it will be baked in all of the versions starting now with the 2.1! I'll love to see the images everyone can generate using it and help me find situations where the model needs improving, it will help for the next versions of Meina to be better!!!In the merged models list: Meina Version 1, Kenshi, AbyssOrangeMix2, PastelMix and Grapefruit, i do not have the exact recipe because i did multiple mixings using block weighted merges with multiple settings and kept the better version of each merge.",
    "link": "https://civitai.com/models/7240/meinamix",
    "version": "Meina V4.1 - Baked VAE",
    "updated": "2023-02-16T12:59:05.832Z",
    "trigger": [],
    "settings": {
        "negative_prompt": [ "(worst quality, low quality:1.4), (malformed hands:1.4),(poorly drawn hands:1.4),(mutated fingers:1.4),(extra limbs:1.35),(poorly drawn face:1.4)" ]
    },
    "suggested": {
        "sampler": ["DPM++ SDE", "DPM++ SDE Karras"],
        "steps": [20,50],
        "clip_skip": [1,2]
    },
    "base": "SD 1.5",
    "preview": "https://imagecache.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/b6e80b63-4ac2-42f0-9a34-b89568aae000/width=400",
    "files": [
        {
            "id": 10454,
            "filename": "Meina V4.1 - Baked VAE.safetensors",
            "url": "https://civitai.com/api/download/models/11187?type=Model&format=SafeTensor",
            "type": "Model",
            "format": "SafeTensor"
        }
    ]
}

(These are generated from a small CLI I wrote that downloads the files associated with a specific version on CivitAI and writes the metadata file alongside the files it downloaded. It also downloads the first image from the site as the preview.)
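
The fetching part boils down to roughly the following; treat the endpoint and field names as assumptions from memory of CivitAI's public API rather than gospel:

import json
import urllib.request

def fetch_civitai_model(model_id):
    # Public endpoint that returns a model's versions, files and
    # preview images as JSON.
    url = f"https://civitai.com/api/v1/models/{model_id}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

model = fetch_civitai_model(7240)        # MeinaMix, per the "link" field above
version = model["modelVersions"][0]      # most recent version comes first
print(version["files"][0]["downloadUrl"])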

The extension reads the settings key for prompt and negative_prompt currently, though I intend to at least add clip_skip to that.

My thoughts on the suggested section were to expose them in the UI (but I haven't got as far as thinking about what that would look like exactly. I like your ideas around tags in the filter section in particular.)

@schumar
Author

schumar commented Feb 23, 2023

Thanks for your reply, Skeula!

I've actually started something similar to this using a fork of the model-keywords extension

Would you mind sharing this?

(though using JSON).

Personally I also prefer JSON over YAML (don't even get me started), but my thoughts were:

  • Model authors already know YAML (it can be used to specify some model-specific stuff I don't understand)
  • As long as the only way to generate that data is "type it in, manually", less typing is good
  • JSON is a subset of YAML, i.e. a YAML parser will be able to read your JSON just fine (see the snippet below)
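
The last point is easy to demonstrate:

import yaml

# A YAML parser accepts the JSON from above unchanged:
meta = yaml.safe_load('{"title": "MeinaMix", "tags": ["anime", "mix"]}')
print(meta["title"])  # MeinaMix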

My scope is a little less ambitious but very similar.

Not sure about that; for now my scope is very limited: just "read the data".
Actually using the data will take more work -- but that first step is necessary :)

That experience tells me most of this could be implemented as an extension (and probably should be).

I trust you here, as I have no experience with this project so far.

Will other extensions still be able to access the data?
(I'm not sure that a single extension which has to make use of all of the provided data is feasible)

What that looks like right now is this:
[...]
(These are generated from a small CLI I wrote that downloads the files associated with a specific version on CivitAI and writes the metadata file alongside the files it downloaded. It also downloads the first image from the site as the preview.)

That's such a cool idea, or as the kids would say, "shut up and take my money!" ;)

The extension reads the settings key for prompt and negative_prompt currently, though I intend to at least add clip_skip to that.

Right, clip_skip should definitely be in there too

My thoughts on the suggested section were to expose them in the UI (but I haven't got as far as thinking about what that would look like exactly

I'm awful at UI/UX; my approach would have been to "color the sliders" for all the values that have a slider.
(Just writing this down conjures up a nightmarish UI. This is why it might be a good idea to just read the data and make it available to other extensions, built by people who know how to UI :)

I like your ideas around tags in the filter section in particular.)

Thanks! Though I didn't think of those as "tags"; my approach was to have those exact 5 keys in does, because with a tag-based approach you don't know whether e.g. a missing sfw tag means "doesn't do SFW" or "don't know/care".

But the more I think about it, the more over-engineered that sounds :) just having this as tags (can-sfw/cannot-sfw?) would be perfectly sufficient :)
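
For the record, the tri-state reading I had in mind would have been something like this (hypothetical helper, not real code from the patch):

def does_sfw(meta):
    # True / False when the key is present, "unknown" when it is missing --
    # exactly the distinction plain tags can't express.
    return meta.get("does", {}).get("sfw", "unknown")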

Thanks again for your thoughts!

@davidmoore-io

+1 on this. Model and config management are real gaps in all the new SD interfaces. I have ideas I'll type up over the weekend... I totally agree with the proposed approach here tho.

@Skeula

Skeula commented Feb 25, 2023

Thanks for your reply, Skeula!

I've actually started something similar to this using a fork of the model-keywords extension

Would you mind sharing this?

Oh, yes, sure...

https://github.com/Skeula/model-specific-prompts/

* Model authors already know YAML (it can be used to specify some model-specific stuff I don't understand)

Honestly this is a pretty good point.

Will other extensions still be able to access the data? (I'm not sure that a single extension which has to make user of all of the provided data is feasible)

There are probably ways, but extensions mostly don't interact much directly.

What that looks like right now is this:
[...]
(These are generated from a small CLI I wrote that downloads the files associated with a specific version on CivitAI and writes the metadata file alongside the files it downloaded. It also downloads the first image from the site as the preview.)

That's such a cool idea, or as the kids would say, "shut up and take my money!" ;)

I've put it up here: https://github.com/Skeula/stable-diffusion-webui/blob/skeula/get-civit

I saw that there's a module for browsing civitai directly from the ui, so I've been thinking that integrating with that would be best.

@schumar
Author

schumar commented Feb 26, 2023

Added some lines to my fork of stable-diffusion-webui, see branch read_metadata_example, to better illustrate my approach.

This will read a .webui.yaml (@Skeula, tested this with your example JSON, works fine, just name it e.g. meinamix_meinaV41.webui.yaml), and add the data to the checkpoint object.

That data can then be used elsewhere, e.g. I have modified ui_extra_networks_checkpoints.py to show the title or displayname from the YAML file instead of the filename.
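
The fallback logic amounts to something like this (names are illustrative, not the webui's actual API):

def display_name_for(checkpoint):
    # Prefer the YAML "displayname", then Skeula's JSON "title",
    # then fall back to the filename as before.
    meta = getattr(checkpoint, "metadata", None) or {}
    return meta.get("displayname") or meta.get("title") or checkpoint.name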

@schumar
Author

schumar commented Feb 26, 2023

Just updated my branch to also do this for LoRAs.

@schumar
Author

schumar commented Mar 7, 2023

Pushed the updates for the other 2 types (embeddings and hypernetworks).

I would consider this patch "complete" for now, i.e. it makes the webui read meta-data, so that other parts can access it.

btw: Just today, someone else tried a completely different approach: butaixianran/Stable-Diffusion-Webui-Civitai-Helper
