ImagePulse

The ImagePulse project provides dataset support for the next generation of image understanding and generation models. It breaks the capabilities of these models down into atomic capabilities and constructs a dedicated dataset for each one.

Atomic Capability Datasets

1. Change, Add, Remove

Dataset: https://www.modelscope.cn/datasets/DiffSynth-Studio/ImagePulse-ChangeAddRemove
Dataset Construction Script: ./scripts/change_add_remove.py

image_1	image_2	mask	editing_instruction	reverse_editing_instruction
			Remove the mustache and beard, change the white shirt to a blue turtleneck sweater, and remove the glass of milk.	Add a mustache and beard, change the blue turtleneck sweater to a white shirt, and add a glass of milk.
			Add a silver butterfly to the glowing golden lace on her face.	Remove the silver butterfly from the glowing golden lace on her face.
			Remove the necklace.	Add a necklace.
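
Each row pairs a forward instruction with its reverse, so a single record can be read in either direction. The sketch below is an illustration only, not code from this repository: the `EditSample` class, its `reversed` method, and the file names are hypothetical. Only the field names and the example instructions come from the table above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EditSample:
    """One record, using the column names of the table above."""
    image_1: str  # path to the source image (hypothetical file name)
    image_2: str  # path to the edited image (hypothetical file name)
    mask: str     # path to the mask image (hypothetical file name)
    editing_instruction: str          # turns image_1 into image_2
    reverse_editing_instruction: str  # turns image_2 back into image_1

    def reversed(self) -> "EditSample":
        """Return the same record read in the opposite direction."""
        return replace(
            self,
            image_1=self.image_2,
            image_2=self.image_1,
            editing_instruction=self.reverse_editing_instruction,
            reverse_editing_instruction=self.editing_instruction,
        )


# File names are placeholders; the instructions are taken from the table.
sample = EditSample(
    image_1="image_1.jpg",
    image_2="image_2.jpg",
    mask="mask.jpg",
    editing_instruction="Remove the necklace.",
    reverse_editing_instruction="Add a necklace.",
)

flipped = sample.reversed()
print(flipped.image_1, "->", flipped.image_2, ":", flipped.editing_instruction)
```

Whether the mask applies unchanged in both directions is an assumption of this sketch; the dataset card is the place to confirm it.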

2. Zoom In, Zoom Out

Dataset: https://www.modelscope.cn/datasets/DiffSynth-Studio/ImagePulse-ZoominZoomout
Dataset Construction Script: ./scripts/zoomin_zoomout.py

image_1	image_2	image_cropped	mask	editing_instruction	reverse_editing_instruction
				Zoom in to focus on the headband.	Zoom out to show the full view of the anime girl.
				Remove the superhero costume and replace it with a red shirt. Adjust the lighting to highlight the man's face.	Add a superhero costume with a red and yellow emblem on the chest and a red cape. Adjust the lighting to emphasize the costume.
				Remove the elephant and replace it with a large rock.	Replace the large rock with an elephant.

3. Style Transfer

Dataset: https://www.modelscope.cn/datasets/DiffSynth-Studio/ImagePulse-StyleTransfer
Dataset Construction Script: ./scripts/style_transfer.py

image_1	image_2	image_3	image_4	editing_instruction	reverse_editing_instruction
				transform the image into a cartoon style with vibrant colors and a confident expression.	transform the image into a realistic portrait with a serious expression and subtle lighting.
				transform the image to have a brighter, more colorful palette and a clear blue sky.	transform the image to have a more muted color palette and an overcast sky.
				transform the style of the image to an anime illustration, change the jacket to red, and add a cityscape background.	transform the style of the image to a digital painting, change the jacket to black, and remove the cityscape background.

4. Face ID

Dataset: https://www.modelscope.cn/datasets/DiffSynth-Studio/ImagePulse-FaceID
Dataset Construction Script: ./scripts/faceid.py

image_face	image_1	image_2	editing_instruction	reverse_editing_instruction
			Change the woman's white t-shirt to a white tank top.	Change the woman's white tank top to a white t-shirt.
			Add a nighttime street scene with bokeh lights in the background.	Remove the nighttime street scene and bokeh lights from the background.
			Change the background to a warmly lit room with lamps, change the suit to maroon, and add a sweater under the suit.	Change the background to a dimly lit room with red lighting, change the suit to black, and remove the sweater.

Running Dataset Generation

pip install -r requirements.txt

python change_add_remove.py \
  --target_dir "data/dataset" \
  --cache_dir "data/cache" \
  --dashscope_api_key "sk-xxxxxxxxxxxxxxxx" \
  --qwenvl_model_id "qwen-vl-max" \
  --modelscope_access_token "xxxxxxxxxxxxxxx" \
  --modelscope_dataset_id "DiffSynth-Studio/ImagePulse-ChangeAddRemove" \
  --num_data 1000000 \
  --max_num_files_per_folder 1000

target_dir: Directory where the generated dataset is stored
cache_dir: Directory for cached files
dashscope_api_key: DashScope API key, required when calling the DashScope API
qwenvl_model_id: ID of the Qwen-VL model on DashScope, required when calling the DashScope API
modelscope_access_token: ModelScope access token, required when uploading datasets to ModelScope
modelscope_dataset_id: ID of the dataset on ModelScope, required when uploading datasets to ModelScope
num_data: Total number of data samples to generate
max_num_files_per_folder: Maximum number of files in each packaged folder
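
The options above can be modeled with `argparse`. This is a minimal sketch under stated assumptions, not the parser used by the scripts in this repository: the defaults, the choice of which flags are optional, and the `num_folders` helper are hypothetical. Only the flag names and their meanings come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented flags (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Dataset generation options (sketch)")
    parser.add_argument("--target_dir", type=str, default="data/dataset",
                        help="Directory where the generated dataset is stored")
    parser.add_argument("--cache_dir", type=str, default="data/cache",
                        help="Directory for cached files")
    # The four options below are only needed for DashScope calls or
    # ModelScope uploads, so this sketch leaves them optional.
    parser.add_argument("--dashscope_api_key", type=str, default=None)
    parser.add_argument("--qwenvl_model_id", type=str, default=None)
    parser.add_argument("--modelscope_access_token", type=str, default=None)
    parser.add_argument("--modelscope_dataset_id", type=str, default=None)
    parser.add_argument("--num_data", type=int, default=1000,
                        help="Total number of data samples to generate")
    parser.add_argument("--max_num_files_per_folder", type=int, default=1000,
                        help="Maximum number of files in each packaged folder")
    return parser


def num_folders(num_files: int, max_num_files_per_folder: int) -> int:
    """Hypothetical helper: folders needed to hold num_files (ceiling division)."""
    return -(-num_files // max_num_files_per_folder)


# Parse an explicit argument list so the example runs without a command line.
args = build_parser().parse_args([
    "--num_data", "1000000",
    "--max_num_files_per_folder", "1000",
])
print(args.target_dir, args.num_data, args.max_num_files_per_folder)
print(num_folders(1_000_000, args.max_num_files_per_folder))
```

The folder count is computed over files, not samples. A sample may consist of several files (for example two images and a mask), so the number of folders produced for a given `num_data` depends on how the real scripts package each sample.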

Acknowledgements

DiffSynth-Studio: Provides diffusion model inference support for this project
ModelScope: Provides storage and download support for the models and datasets in this project
DashScope: Provides the inference API for the large language models used in this project
