
add CRAFT training code #739

Merged
merged 6 commits on May 30, 2022
Conversation

@gmuffiness (Contributor) commented May 26, 2022

@rkcosmos (Contributor)

Thanks, I'll test it and get back to you.

@rkcosmos (Contributor)

Hi, I have a few questions.

  1. Is the reported score coming from training from scratch? Why do we need to download a pretrained model in training step 2?
  2. How many SynthText images did you use to produce the result? Are they the pre-generated 800,000?
  3. To produce the pretrained model for the SynthText + ICDAR2015 dataset (second row in the readme), do you start from a checkpoint trained on SynthText only? (I saw ckpt_path in ic15_train.yaml; I guess it is the one with 50,000 iterations.)

Thanks.

@gmuffiness (Contributor, Author)

Thanks for the quick response.

  1. The first row in the readme is a model trained with SynthText from scratch. The second row loads the first-step checkpoint (trained for 50,000 iterations with SynthText) and continues training for 25,000 iterations with SynthText and IC15 (see the config sketch after this list).

The training step 2 model is attached simply to show that a model trained with this training code achieves performance comparable to the paper. If you don't need it, you don't have to download it. If that is a bit confusing, would it be better to delete it?

  2. Yes, the 800,000 images provided at the following link: https://www.robots.ox.ac.uk/~vgg/data/scenetext/

  3. Yes. I trained on SynthText only for a total of 50,000 iterations, and among those checkpoints, the one with the best accuracy was uploaded.
    I can replace the best-accuracy checkpoint with the final checkpoint (50,000 iterations), or add the final one alongside it. There is no significant difference in performance between the two. Which would be better for me to do?
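To make the two-step flow above concrete, here is a minimal sketch of how the step-2 training could pick up the step-1 checkpoint. It only assumes a YAML config with a ckpt_path field (as in the ic15_train.yaml mentioned above); the file paths and the checkpoint key are hypothetical, not the exact code added in this PR.

```python
# Minimal sketch of the two-step training flow (hypothetical paths/keys).
import yaml
import torch

# Step 1: train on SynthText from scratch -> produces a checkpoint after
# 50,000 iterations (the best-accuracy one was uploaded).

# Step 2: fine-tune on SynthText + IC15, starting from the step-1 checkpoint
# referenced by ckpt_path in the IC15 config.
with open("config/ic15_train.yaml") as f:      # path is an assumption
    cfg = yaml.safe_load(f)

state = torch.load(cfg["ckpt_path"], map_location="cpu")
# model.load_state_dict(state["craft"])        # key name depends on how the
#                                              # checkpoint was saved
```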

Feel free to ask if you have any more questions.

@rkcosmos (Contributor)

Thanks for the explanation.

I tried to clarify the training section in the readme file a bit. Please comment if it is accurate.

@gmuffiness (Contributor, Author) commented May 30, 2022

Thanks for the supplementary explanation.

I changed the readme a bit and added a separate training script for the SynthText dataset.
I think it will be easier to understand how to use it now.

@rkcosmos merged commit 4b7c26b into JaidedAI:master on May 30, 2022
@rkcosmos (Contributor)

Merged.

The dependency bot reports a few security issues with pillow==8.2.0. Could you check if the code still works with a higher Pillow version? The bot suggests pillow==9.0.1.

@gmuffiness (Contributor, Author)

Thanks!

I checked that it works well with the higher Pillow version (pillow==9.0.1).
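For reference, a quick way to sanity-check such an upgrade is to import Pillow, print its version, and run a basic image round trip. This is a generic check, not the test procedure actually used for this PR.

```python
# Generic sanity check after upgrading to pillow==9.0.1 (not the actual
# test procedure used for this PR).
from PIL import Image, __version__ as pillow_version

print(pillow_version)                        # expect "9.0.1"
img = Image.new("RGB", (64, 64), "white")    # create a small test image
img = img.resize((32, 32), Image.BILINEAR)   # resize, as data pipelines do
img.save("pillow_check.png")                 # round-trip through an encoder
print(Image.open("pillow_check.png").size)   # -> (32, 32)
```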

thuc-moreh pushed a commit to moreh-dev/EasyOCR that referenced this pull request Jul 5, 2023
@VuDinhToan303

Hello gmuffiness,
I have tried to train the CRAFT model using your code. However, when training with my own data, I encountered the following error: operands could not be broadcast together with shapes (200,200) (330,0)
While generating the affinity map, a strange box came out (width: 0, height: 330).
This error comes from gaussian.py; how can I fix it? Thank you.
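The message suggests a degenerate affinity box (width 0), which produces an empty target slice that the (200, 200) Gaussian cannot be broadcast into. A common defensive workaround, sketched below with hypothetical names (not the actual gaussian.py code from this repository), is to skip zero-sized boxes before pasting the warped Gaussian onto the score map.

```python
# Illustrative sketch of the failure mode and a guard (hypothetical names,
# not the actual gaussian.py code).
import numpy as np
import cv2

def paste_gaussian(score_map, gaussian, x, y, box_w, box_h):
    """Warp a 2D Gaussian to (box_w, box_h) and paste it at (x, y)."""
    if box_w <= 0 or box_h <= 0:
        # Degenerate box (e.g. width 0, height 330): the target slice is
        # empty, so broadcasting the (200, 200) Gaussian into it fails.
        return score_map
    warped = cv2.resize(gaussian, (box_w, box_h))        # shape (box_h, box_w)
    region = score_map[y:y + box_h, x:x + box_w]
    score_map[y:y + box_h, x:x + box_w] = np.maximum(region, warped)
    return score_map
```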
