Various tasks where LLM might help #6145

Open
nicolas-raoul opened this issue Jan 19, 2025 · 4 comments

Comments

@nicolas-raoul
Member

nicolas-raoul commented Jan 19, 2025

  • help find the correct category name, e.g. suggesting "Cuisine of Japan" when "Japan food" is typed
  • fix obvious typos in depiction search
  • detect unwanted nudity pictures from the caption
  • detect meaningless captions such as PXL1234, etc.
  • show translations of captions etc. that are not available in the user's languages. Either show the translation next to the original, or replace the original when an "AI translation" button is shown. This is better than OS-level translation because we know the source language.

Feel free to comment if you want to add more, thanks! :-)
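
For the "meaningless captions" item, a rule-based baseline might be worth comparing against before involving an LLM. The sketch below is only an illustration: the function name `looks_meaningless` and the list of camera filename prefixes are assumptions, not taken from the app's code. It flags captions that are empty, contain no letters, or look like a camera-generated filename.

```python
import re

# Hypothetical list of camera filename prefixes (an assumption, not from the app).
_CAMERA_NAME = re.compile(
    r"^(PXL|IMG|DSC|DSCN|DSCF|MVIMG|VID|SAM|P)"  # common device prefixes
    r"[ _-]?\d{3,}"                               # optional separator, then digits
    r"([ _-]\d+)*"                                # further digit groups (date, time)
    r"(\.\w{3,4})?$",                             # optional file extension
    re.IGNORECASE,
)


def looks_meaningless(caption: str) -> bool:
    """Return True if the caption carries no descriptive information."""
    text = caption.strip()
    if not text:
        return True  # empty or whitespace only
    if _CAMERA_NAME.match(text):
        return True  # e.g. "PXL1234" or "IMG-0042.jpg"
    if not any(ch.isalpha() for ch in text):
        return True  # digits and punctuation only, e.g. a bare timestamp
    return False


for sample in ["PXL1234", "IMG-0042.jpg", "20250119_101500", "Cuisine of Japan"]:
    print(sample, "->", looks_meaningless(sample))
```

A heuristic like this cannot catch captions that are real words but say nothing about the picture, which is where a language model could still help.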

@nicolas-raoul nicolas-raoul added enhancement gsoc Google Summer of Code labels Jan 19, 2025
@nicolas-raoul nicolas-raoul self-assigned this Jan 19, 2025
@Thejas775

An addition? Maybe check the caption against the image.

@nicolas-raoul
Member Author

> Maybe check the caption with image

Indeed great idea! All Android device-embedded models so far seem to be text-to-text, but that could change in the future.

@Thejas775

Thejas775 commented Jan 23, 2025

> > Maybe check the caption with image
>
> Indeed great idea! All Android device-embedded models so far seem to be text-to-text, but that could change in the future.

Yeah, maybe in the future. Right now, very few devices support running models on-device.
I also have a question: why are we not considering API calls?

@whym
Collaborator

whym commented Jan 23, 2025

I think API calls to a Wikimedia-controlled Toolforge server would be fine from a privacy perspective. (We already call Wikimedia servers, including a Toolforge server, so more calls to Wikimedia servers would be fine.) However, whether Toolforge will host a large vision-language model would depend on their policy and is still unclear (https://phabricator.wikimedia.org/T336905).
