Added python script for parsing xml -> tsv or json. #898

Open. alvadia wants to merge 1 commit into master.

@alvadia alvadia commented Mar 4, 2021

Added a Python script that converts XML to TSV or JSON.
First argument: source file (example: dict.xml)
Second argument: destination file (example: dict.json)
Third argument: output mode (example: json)
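
The three positional arguments described above could be handled roughly as follows. This is only a sketch, not the script from this pull request: the names `source`, `target`, `mode` and the helper `parse_args` are hypothetical, and `tsv` as the second mode value is an assumption (only `json` is given as an example above).

```python
import argparse


def parse_args(argv):
    """Parse the three positional arguments: source, target, mode."""
    parser = argparse.ArgumentParser(
        description="Convert an XML dictionary to TSV or JSON."
    )
    parser.add_argument("source", help="input XML file, e.g. dict.xml")
    parser.add_argument("target", help="output file, e.g. dict.json")
    # "tsv" is an assumed mode name; the description only shows "json".
    parser.add_argument("mode", choices=["tsv", "json"], help="output format")
    return parser.parse_args(argv)


# Example invocation with the values from the description.
args = parse_args(["dict.xml", "dict.json", "json"])
```

Restricting `mode` with `choices` makes argparse reject an unknown mode before any file is opened.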

Sample TSV format (' ' means a space):
id \t root \t data \t extra \n
#header \t dictionary \t version \t revision \n
OpenCorpora \t dictionary \t <version_from_xml> \t \n
#lemmas \t lemma \t variants \t empty \n
[ \t ' ' <';'.join(attributes)> \t ' ' <';'.join(attributes)> [, ' ' <';'.join(attributes)>]* \t \n]*
#gramemes \t parent \t alias \t description \n
[ \t \t \t \n]*
#links \t from \t to \t type \n
[ \t \t \t \n]*
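
As an illustration of the layout above, the first three rows (column names, the `#header` row, and the dictionary row) could be written like this. It is a minimal sketch under assumptions: `write_header` is a hypothetical helper, and the version string `"0.92"` is a placeholder standing in for whatever value is read from the XML.

```python
import io


def write_header(out, version):
    """Write the column names, the #header row and the dictionary row."""
    rows = [
        ["id", "root", "data", "extra"],
        ["#header", "dictionary", "version", "revision"],
        # The fourth column is left empty, as in the sample format.
        ["OpenCorpora", "dictionary", version, ""],
    ]
    for row in rows:
        out.write("\t".join(row) + "\n")


# "0.92" is a placeholder version, not a value taken from the dictionary.
buf = io.StringIO()
write_header(buf, "0.92")
header_text = buf.getvalue()
```

Each row has four tab-separated fields, so the empty `extra` column shows up as a trailing tab before the newline.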

The TSV output requires much less space.
This script is a sample; it requires a .sh wrapper.


alvadia commented Mar 4, 2021

#5
