
modelscope/ImagePulse


ImagePulse

The ImagePulse project aims to provide dataset support for the next generation of image understanding and generation models. It breaks the capabilities of these models down into atomic capabilities and builds a dedicated dataset for each one.


Atomic Capability Datasets

1. Change, Add, Remove

image_1 | image_2 | mask | editing_instruction | reverse_editing_instruction
(image) | (image) | (image) | Remove the mustache and beard, change the white shirt to a blue turtleneck sweater, and remove the glass of milk. | Add a mustache and beard, change the blue turtleneck sweater to a white shirt, and add a glass of milk.
(image) | (image) | (image) | Add a silver butterfly to the glowing golden lace on her face. | Remove the silver butterfly from the glowing golden lace on her face.
(image) | (image) | (image) | Remove the necklace. | Add a necklace.
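
As a rough illustration of how one row of this dataset could be represented in code, here is a minimal sketch. The field names come from the column headers above; the `EditSample` class, the `flip` helper, and the file names are hypothetical and not part of the project. It also assumes that `editing_instruction` describes the edit from image_1 to image_2 and `reverse_editing_instruction` the edit back, which the README does not state explicitly.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EditSample:
    """One Change/Add/Remove row (hypothetical container, not the project's API)."""
    image_1: str  # path to the first image (assumed to be the edit source)
    image_2: str  # path to the second image (assumed to be the edit result)
    mask: str     # path to the mask image
    editing_instruction: str
    reverse_editing_instruction: str

    def flip(self) -> "EditSample":
        """Return the same pair viewed in the opposite editing direction."""
        return replace(
            self,
            image_1=self.image_2,
            image_2=self.image_1,
            editing_instruction=self.reverse_editing_instruction,
            reverse_editing_instruction=self.editing_instruction,
        )


# Placeholder file names; the instructions are taken from the table above.
sample = EditSample(
    image_1="image_1.jpg",
    image_2="image_2.jpg",
    mask="mask.png",
    editing_instruction="Remove the necklace.",
    reverse_editing_instruction="Add a necklace.",
)
flipped = sample.flip()
print(flipped.editing_instruction)
```

Under that assumption, every stored pair can serve as two training examples, one in each direction.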

2. Zoom In, Zoom Out

image_1 | image_2 | image_cropped | mask | editing_instruction | reverse_editing_instruction
(image) | (image) | (image) | (image) | Zoom in to focus on the headband. | Zoom out to show the full view of the anime girl.
(image) | (image) | (image) | (image) | Remove the superhero costume and replace it with a red shirt. Adjust the lighting to highlight the man's face. | Add a superhero costume with a red and yellow emblem on the chest and a red cape. Adjust the lighting to emphasize the costume.
(image) | (image) | (image) | (image) | Remove the elephant and replace it with a large rock. | Replace the large rock with an elephant.

3. Style Transfer

image_1 | image_2 | image_3 | image_4 | editing_instruction | reverse_editing_instruction
(image) | (image) | (image) | (image) | transform the image into a cartoon style with vibrant colors and a confident expression. | transform the image into a realistic portrait with a serious expression and subtle lighting.
(image) | (image) | (image) | (image) | transform the image to have a brighter, more colorful palette and a clear blue sky. | transform the image to have a more muted color palette and an overcast sky.
(image) | (image) | (image) | (image) | transform the style of the image to an anime illustration, change the jacket to red, and add a cityscape background. | transform the style of the image to a digital painting, change the jacket to black, and remove the cityscape background.

4. Face ID

image_face | image_1 | image_2 | editing_instruction | reverse_editing_instruction
(image) | (image) | (image) | Change the woman's white t-shirt to a white tank top. | Change the woman's white tank top to a white t-shirt.
(image) | (image) | (image) | Add a nighttime street scene with bokeh lights in the background. | Remove the nighttime street scene and bokeh lights from the background.
(image) | (image) | (image) | Change the background to a warmly lit room with lamps, change the suit to maroon, and add a sweater under the suit. | Change the background to a dimly lit room with red lighting, change the suit to black, and remove the sweater.

Running Dataset Generation

pip install -r requirements.txt
python change_add_remove.py \
  --target_dir "data/dataset" \
  --cache_dir "data/cache" \
  --dashscope_api_key "sk-xxxxxxxxxxxxxxxx" \
  --qwenvl_model_id "qwen-vl-max" \
  --modelscope_access_token "xxxxxxxxxxxxxxx" \
  --modelscope_dataset_id "DiffSynth-Studio/ImagePulse-ChangeAddRemove" \
  --num_data 1000000 \
  --max_num_files_per_folder 1000
  • target_dir: Path to store the dataset
  • cache_dir: Cache path
  • dashscope_api_key: DashScope API Key, required when calling DashScope API
  • qwenvl_model_id: ID of the Qwen-VL model on DashScope, required when calling DashScope API
  • modelscope_access_token: Access token from ModelScope, required when uploading datasets to ModelScope
  • modelscope_dataset_id: Dataset ID on ModelScope, required when uploading datasets to ModelScope
  • num_data: Total number of data samples to generate
  • max_num_files_per_folder: Maximum number of files per packaged folder
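
To make the relationship between the flags more concrete, here is a minimal sketch of how such a command line could be parsed and how `num_data` and `max_num_files_per_folder` might interact. It is not the actual contents of `change_add_remove.py`: the `build_parser` and `folder_for_index` helpers are hypothetical, the defaults are copied from the example command rather than from the script, and the folder arithmetic assumes one file per sample, which the README does not confirm.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags listed above."""
    parser = argparse.ArgumentParser(description="Sketch of the dataset generation CLI")
    parser.add_argument("--target_dir", type=str, default="data/dataset")
    parser.add_argument("--cache_dir", type=str, default="data/cache")
    # Only needed when calling the DashScope API.
    parser.add_argument("--dashscope_api_key", type=str, default=None)
    parser.add_argument("--qwenvl_model_id", type=str, default="qwen-vl-max")
    # Only needed when uploading datasets to ModelScope.
    parser.add_argument("--modelscope_access_token", type=str, default=None)
    parser.add_argument("--modelscope_dataset_id", type=str, default=None)
    parser.add_argument("--num_data", type=int, default=1000000)
    parser.add_argument("--max_num_files_per_folder", type=int, default=1000)
    return parser


def folder_for_index(index: int, max_num_files_per_folder: int) -> int:
    """Folder number for the sample at `index`, assuming one file per sample."""
    return index // max_num_files_per_folder


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["--num_data", "2500", "--max_num_files_per_folder", "1000"]
)

# Ceiling division: 2500 samples at up to 1000 per folder need 3 folders.
num_folders = -(-args.num_data // args.max_num_files_per_folder)
print(num_folders, folder_for_index(1000, args.max_num_files_per_folder))
```

With the values from the example command (1,000,000 samples and at most 1,000 files per folder), the same arithmetic would give 1,000 folders under the one-file-per-sample assumption.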

Acknowledgements

  • DiffSynth-Studio: Provided diffusion model inference support for this project
  • ModelScope: Provided storage and download support for models and datasets in this project
  • DashScope: Provided inference API support for large language models in this project