Implement an image text transcription feature using VisionAPI #23

Closed

book000 opened this issue Oct 4, 2023 · 6 comments · Fixed by #70

Member

book000 commented Oct 4, 2023

https://github.com/jaoafa/JDA-VCSpeaker/blob/main/src/main/java/com/jaoafa/jdavcspeaker/Lib/VisionAPI.java

Do not perform Label detection; perform Text detection only.
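
As a rough illustration of "Text detection only", the request body can list a single feature. This is a minimal sketch in Python for brevity (the project itself is Java/Kotlin), assuming the public Cloud Vision REST endpoint `https://vision.googleapis.com/v1/images:annotate`; the function name `build_text_detection_request` is hypothetical and not taken from the linked `VisionAPI.java`.

```python
import base64


def build_text_detection_request(image_bytes: bytes) -> dict:
    """Build an images:annotate body that asks for TEXT_DETECTION only.

    Hypothetical helper: the name and shape of this function are assumptions,
    not the project's actual implementation.
    """
    return {
        "requests": [
            {
                # The image is sent inline as base64 text.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                # Only one feature is listed, so no LABEL_DETECTION is requested.
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }
```

The resulting dict would be serialized as JSON and sent with POST; leaving `LABEL_DETECTION` out of `features` is what keeps each image to a single feature type.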

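Use the file extension to decide whether the file is an image, and branch into "image processing" or "file processing".
If the request would exceed the free tier, skip it (fall back to file processing).
Download the image file to a temporary directory.
Get the MimeType and check that it is an appropriate one (if not, fall back to file processing).
Hash the image contents with md5 or similar and check whether a cache entry exists.
Encode the image file contents with base64.
Send a request to the VisionAPI endpoint. Count the number of requests.
Process the response, then extract and save the strings and their positions.
Edit the image to create an image showing where each string is located.
Read the result aloud or post it as appropriate.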
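
The routing and response-handling steps above could be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the project's implementation: the extension allow-list, the magic-byte table, the `RequestCounter` class, and the function names are all hypothetical, and the free-tier limit is left as a parameter because the issue does not state a number. For simplicity the sketch receives the already-downloaded bytes, whereas the plan above checks the quota before downloading.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional, Tuple

# Hypothetical allow-list; the issue does not say which extensions count as images.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif"}

# Leading "magic" bytes of a few common formats, used to double-check the real type.
MAGIC_BYTES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def sniff_mime(data: bytes) -> Optional[str]:
    """Return a MIME type guessed from the file header, or None if unknown."""
    for magic, mime in MAGIC_BYTES:
        if data.startswith(magic):
            return mime
    return None


@dataclass
class RequestCounter:
    limit: int  # free-tier budget; the actual value is an assumption to configure
    used: int = 0

    def has_budget(self) -> bool:
        return self.used < self.limit

    def record(self) -> None:
        self.used += 1


def choose_route(filename: str, data: bytes, counter: RequestCounter) -> str:
    """Return "image" only if every check passes; otherwise fall back to "file"."""
    if Path(filename).suffix.lower() not in IMAGE_EXTENSIONS:
        return "file"
    if not counter.has_budget():
        return "file"
    if sniff_mime(data) is None:
        return "file"
    return "image"


def extract_texts(response: dict) -> List[Tuple[str, List[Tuple[int, int]]]]:
    """Collect (text, bounding-box vertices) pairs from an images:annotate response."""
    responses = response.get("responses") or [{}]
    results = []
    for annotation in responses[0].get("textAnnotations", []):
        vertices = annotation.get("boundingPoly", {}).get("vertices", [])
        # Defensive default: treat a missing coordinate as 0.
        points = [(v.get("x", 0), v.get("y", 0)) for v in vertices]
        results.append((annotation.get("description", ""), points))
    return results
```

A caller would invoke `counter.record()` once per request actually sent, so the budget check reflects real usage. The vertex lists returned by `extract_texts` are what the image-annotation step would draw onto the picture.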

Cache the results, and reuse the cached response when the hash value is the same.
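
One way to picture this caching rule is a lookup keyed by a digest of the image bytes. The sketch below is a minimal in-memory illustration in Python; the class name `TranscriptionCache` and its methods are hypothetical, and SHA-256 stands in for the "md5 or similar" hash mentioned above. A real bot would presumably persist entries to disk so they survive restarts.

```python
import hashlib
from typing import Callable, Dict


class TranscriptionCache:
    """In-memory cache keyed by a content hash (hypothetical helper)."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    @staticmethod
    def key_for(image_bytes: bytes) -> str:
        # Identical bytes always produce the same key, regardless of filename.
        return hashlib.sha256(image_bytes).hexdigest()

    def get_or_compute(self, image_bytes: bytes, compute: Callable[[bytes], str]) -> str:
        key = self.key_for(image_bytes)
        if key not in self._store:
            # Cache miss: this is the only path that would call the API.
            self._store[key] = compute(image_bytes)
        return self._store[key]
```

Because the key depends only on the file contents, the same image posted twice under different names costs one API request rather than two.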

book000 changed the title ~~VisionAPI~~ Implement an image text transcription mechanism using VisionAPI

book000 changed the title ~~Implement an image text transcription mechanism using VisionAPI~~ Implement an image text transcription feature using VisionAPI

Member Author

book000 commented Oct 4, 2023

Question: is there a better HTTP client library than OkHttp, I wonder?

Member

yuuahp commented Oct 4, 2023

I think we'll probably use Ktor, same as for VoiceTextAPI.

Member

yuuahp commented Oct 22, 2023

Does VisionAPI perhaps require a credit card?

Member Author

book000 commented Oct 22, 2023

Maybe so.

Member

yuuahp commented Oct 23, 2023

That's a no-go.

Member Author

book000 commented Oct 23, 2023

Hmm, let me discuss this internally for a bit.

book000 mentioned this issue

feat: transcribe text when an image is posted to a VC text channel #70

Merged

yuuahp closed this as completed in #70
