Add NLP coordination table to site #69

Open
cdrchops opened this issue Aug 25, 2021 · 0 comments
Labels
enhancement New feature or request

@cdrchops
Member

If I dumped all of the matching sentences into their own table in a database, one that you (or anyone else you wanted) could add more sentences to, and then offered the corpus as a download in the format of your choosing (comma delimited, tab delimited, xls, etc.) with the entries tagged by source, would that be of interest to you? It'd be like an additional corpus search field on the ced, but on its own page. If you're an admin you can add values one at a time, or if you have an xls I can write a parser to import it, and the imported entries would get a tag. Then anyone who wants to can download the corpus in whatever format they want, including all of the sentences in the ced. I could supply filtering on the page so you could see all of the entries for a particular group or whatever. idk, just thinking this through on what might make it easier for you or for others to use. It's up to you.
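To make the proposal concrete, here is a rough sketch of the table, the source tag, the per-source filter, and the delimited download. It is not the ced site's actual code: the `corpus_sentence` table, its column names, and the `open_corpus`, `add_sentence`, and `export_corpus` helpers are all hypothetical, and SQLite is used only so the example runs on its own.

```python
import csv
import io
import sqlite3

# Hypothetical schema: one row per aligned sentence pair, tagged by source.
SCHEMA = """
CREATE TABLE IF NOT EXISTS corpus_sentence (
    id       INTEGER PRIMARY KEY,
    cherokee TEXT NOT NULL,
    english  TEXT NOT NULL,
    source   TEXT NOT NULL
)
"""


def open_corpus(path=":memory:"):
    """Open (or create) the corpus database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_sentence(conn, cherokee, english, source):
    """Add one aligned pair, as an admin entering values one at a time would."""
    conn.execute(
        "INSERT INTO corpus_sentence (cherokee, english, source) VALUES (?, ?, ?)",
        (cherokee, english, source),
    )


def export_corpus(conn, delimiter=",", source=None):
    """Return the corpus as delimited text, optionally filtered to one source."""
    query = "SELECT cherokee, english, source FROM corpus_sentence"
    params = ()
    if source is not None:
        query += " WHERE source = ?"
        params = (source,)
    query += " ORDER BY id"

    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["cherokee", "english", "source"])
    writer.writerows(conn.execute(query, params))
    return out.getvalue()


if __name__ == "__main__":
    conn = open_corpus()
    # Placeholder rows for illustration only; the source tags are made up.
    add_sentence(conn, "ᎣᏏᏲ", "Hello", "ced")
    add_sentence(conn, "ᏩᏙ", "Thank you", "admin-entry")
    # Tab-delimited download restricted to one source tag.
    print(export_corpus(conn, delimiter="\t", source="ced"))
```

Passing `delimiter="\t"` gives the tab-delimited variant; an xls download would need a spreadsheet library and is left out of this sketch.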

I think that would make things super convenient, and it would really help the work I’m trying to do with the corpus in terms of bootstrapping that NLP stuff. If we had aligned Cherokee/English sentences from all y’all’s sources plus the texts we’ve already added, it would get us ever closer to that 30,000 lines we were after. You say you’d be able to enter pairs on the ced site? That and/or an importer for xls would be very useful.

Ok, awesome. Give me a couple of days and I'll have a hidden page worked up with an xlsx importer to test. I also found that Google has an awesome TensorFlow collaboration hub that I didn't know about. You run your code on their processors, so you're not burning your own processor power.
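For the importer, one possible shape of the tagging step is sketched below. It assumes the spreadsheet has already been read into plain row tuples (for example with openpyxl's `iter_rows(values_only=True)`), with Cherokee in the first column and English in the second. The `tag_rows` name, the column order, and the header handling are assumptions for illustration, not the importer that was actually built.

```python
def tag_rows(rows, source, has_header=True):
    """Turn raw spreadsheet rows into (cherokee, english, source) tuples.

    Rows with a missing or blank cell in either of the first two columns
    are skipped and counted, so the admin can be told how many were dropped.
    """
    tagged = []
    skipped = 0
    for index, row in enumerate(rows):
        if has_header and index == 0:
            continue  # first row holds column titles, not a sentence pair
        # Empty spreadsheet cells typically arrive as None; normalise to "".
        cells = ["" if cell is None else str(cell).strip() for cell in row[:2]]
        if len(cells) < 2 or not all(cells):
            skipped += 1
            continue
        tagged.append((cells[0], cells[1], source))
    return tagged, skipped
```

Keeping the tagging separate from the file parsing means the same function could sit behind an xls, xlsx, or csv upload, with the tag chosen per upload.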

@cdrchops added the enhancement label Aug 25, 2021
@cdrchops added this to the Backlog milestone Aug 25, 2021