What if I put all of the matching sentences into their own table in a database, so that you (or anyone else you choose) could add more sentences to the corpus and then download it in the format of your choice (comma-delimited, tab-delimited, xls, etc.), with the entries tagged by source? Would that be of interest to you? It'd be like an additional corpus search field on the ced, but as a page where, if you're an admin, you can add entries one at a time, or, if you have an xls file, I can write a parser to import it and the imported entries would get a tag. Then anyone who wants to can download the corpus in whatever format they want, including all of the sentences in the ced. I could supply filtering on the page so you could see all of the entries for a particular group or whatever. idk, just thinking this through on what might make it easier for you or for others to use. It's up to you.
I think that would make things super convenient, and it would really help the work I’m trying to do with the corpus in terms of bootstrapping that NLP stuff. If we had aligned Cherokee/English sentences from all y’all’s sources plus the texts we’ve already added, it would get us ever closer to that 30,000 lines we were after. You say you’d be able to enter pairs on the ced site? That and/or an importer for xls would be very useful.
OK, awesome. Give me a couple of days and I'll have a hidden page worked up with an xlsx importer to test. I also found that Google has an awesome TensorFlow collaboration hub that I didn't know about. You use their processors to test your code, so you're not burning your own processor power.
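For reference, here is a minimal sketch of the sentence table and download idea discussed above, written in Python with only the standard library. Everything in it is an assumption for illustration: the table name `corpus_sentences`, its columns, and the helper functions `open_corpus`, `add_pairs`, and `export_corpus` are hypothetical and are not taken from the ced site's actual code. The xlsx parser is left out; it would only need to produce the `(cherokee, english)` pairs passed to `add_pairs`.

```python
import csv
import io
import sqlite3


def open_corpus(path=":memory:"):
    """Open the database and create the (hypothetical) sentence table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS corpus_sentences (
            id       INTEGER PRIMARY KEY,
            cherokee TEXT NOT NULL,
            english  TEXT NOT NULL,
            source   TEXT NOT NULL  -- tag saying where the pair came from
        )
        """
    )
    return conn


def add_pairs(conn, pairs, source):
    """Insert (cherokee, english) pairs, tagging each one with its source."""
    rows = [(cherokee, english, source) for cherokee, english in pairs]
    conn.executemany(
        "INSERT INTO corpus_sentences (cherokee, english, source) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


def export_corpus(conn, delimiter=",", source=None):
    """Return the corpus as delimited text, optionally filtered to one source tag."""
    query = "SELECT cherokee, english, source FROM corpus_sentences"
    params = ()
    if source is not None:
        query += " WHERE source = ?"
        params = (source,)
    query += " ORDER BY id"

    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["cherokee", "english", "source"])  # header row
    writer.writerows(conn.execute(query, params))
    return out.getvalue()
```

With this shape, `export_corpus(conn, delimiter="\t", source="ced")` would give a tab-delimited download restricted to one source tag, and calling it with no arguments would give the whole corpus as comma-delimited text. Using the `csv` module rather than joining strings by hand means sentences that contain commas or quotes are escaped correctly.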