cariban_irregular_1/data at main · fmatter/cariban_irregular_1

cldf
README.md
abbreviations.csv
apalai_sa_verb_stats.csv
bathe_data.csv
cariban.bib
cognate_sets.csv
examples.csv
extensions.csv
inflection_data.csv
kalina_dictionary.txt
languages.csv
other_lexemes.csv
segments.txt
split_s_data.csv
test.py
verb_stem_data.csv

README.md

This folder contains all the data compiled for the study, organized in CSV (comma-separated values) files. The contents are structured as follows:

cldf: The dataset in CLDF format, generated with this Python script.
verb_stem_data.csv
- Language_ID: Reference to languages:ID.
- Form: Derivational morphology is segmented with +; elements in brackets surface only in some forms.
- Source: bibkey[page] referencing references.bib; multiple sources are separated by ; and pc marks a personal communication.
- Cognateset_ID: cognate_sets:ID+cognate_sets:ID.
- Class: Class of the verb: S_A_ or S_P_. – marks languages without split-S, ? marks an unknown class, and S_A_ / S_P_ marks mixed Wayana verbs.
- Comment: Comments
- Cog_Cert: Cases which do not seem fully cognate are marked with 0.5 here.
- Meaning_ID: Reference to values of cognate_sets:Meaning.
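The `+` notation shared by Form and Cognateset_ID can be unpacked with a few lines of Python. This is a minimal sketch, not part of the repository: it assumes that the parts of the two columns line up by position, which the description above suggests but does not state, and the function name `pair_morphemes` and the sample values are invented for illustration.

```python
from typing import List, Optional, Tuple


def pair_morphemes(form: str, cognateset_id: str) -> List[Tuple[str, Optional[str]]]:
    """Pair each +-separated part of Form with the matching part of Cognateset_ID.

    Assumption: parts are aligned by position; missing IDs become None.
    """
    morphemes = form.split("+")
    cognate_ids: List[Optional[str]] = cognateset_id.split("+") if cognateset_id else []
    # Pad with None when there are fewer cognate-set IDs than morphemes.
    cognate_ids += [None] * (len(morphemes) - len(cognate_ids))
    return list(zip(morphemes, cognate_ids))


# Placeholder values, not taken from verb_stem_data.csv.
print(pair_morphemes("ab+cd", "set1+set2"))
```

Bracketed elements in Form are left untouched here; whether to strip the brackets depends on the analysis.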
split_s_data.csv
- Language_ID: Reference to languages:ID.
- Construction: Kind of verb form.
- Form: Form with - separating morphemes.
- Meaning: Direct translation, not a cognate_sets:Meaning.
- Class: Verb class, S_A_ or S_P_.
- Source: bibkey[page] referencing references.bib; multiple sources are separated by ; and pc marks a personal communication.
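The Source convention used throughout these files (bibkey[page], several sources separated by ;, pc for personal communication) can be parsed roughly as follows. This is a sketch under assumptions: the page part is taken to be whatever stands between the square brackets, the page part is treated as optional, and the function name `parse_source` and the sample key are hypothetical.

```python
import re

# A bibkey, optionally followed by a page specification in square brackets.
SOURCE_PATTERN = re.compile(r"^(?P<key>[^\[\]]+?)(?:\[(?P<pages>[^\]]*)\])?$")


def parse_source(field):
    """Split a Source cell into a list of reference dictionaries."""
    refs = []
    for item in field.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing separator
        if item == "pc":
            refs.append({"key": None, "pages": None, "personal_communication": True})
            continue
        match = SOURCE_PATTERN.match(item)
        if match is None:
            raise ValueError(f"Unrecognized source entry: {item!r}")
        refs.append({
            "key": match["key"],
            "pages": match["pages"],  # None when no [page] part is given
            "personal_communication": False,
        })
    return refs


# Placeholder bibkey, not an entry from the bibliography.
print(parse_source("smith2000[12]; pc"))
```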
other_lexemes.csv
- Language_ID: Reference to languages:ID.
- Form: Form.
- Meaning: Translation.
- Source: bibkey[page] referencing references.bib; multiple sources are separated by ; and pc marks a personal communication.
- Full_Form: Full form as it appeared in the cited source.
- Cognateset_ID: cognate_sets:ID
examples.csv
- ID: An ID, usually consisting of languages:ID-X.
- Language_ID: Reference to languages:ID.
- Sentence: Either identical to Segmentation, an orthographical form, and/or the form in the source.
- Segmentation: - separates morphemes, spaces separate phonological words.
- Gloss: Corresponding to Segmentation.
- Translation: Free English translation.
- Source: bibkey[page] referencing references.bib, pc for personal communication
- Orig_Segmentation: Segmentation of the form as it appears in the source.
- Orig_Glossing: Glossing as it appears in the source.
- Orig_Translation: Translation as it appears in the source.
- Comment: Comments.
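Because Gloss corresponds to Segmentation, the two columns can be aligned word by word and morpheme by morpheme. The sketch below assumes that Gloss uses the same spaces and - separators as Segmentation, which is the usual interlinear convention but is not spelled out above. The function name `align_gloss` and the sample strings are invented.

```python
def align_gloss(segmentation, gloss):
    """Return, per phonological word, a list of (morpheme, gloss) pairs."""
    seg_words = segmentation.split()
    gloss_words = gloss.split()
    if len(seg_words) != len(gloss_words):
        raise ValueError("Segmentation and Gloss differ in word count")

    aligned = []
    for seg_word, gloss_word in zip(seg_words, gloss_words):
        morphs = seg_word.split("-")
        glosses = gloss_word.split("-")
        if len(morphs) != len(glosses):
            raise ValueError(f"Morpheme count mismatch in {seg_word!r} / {gloss_word!r}")
        aligned.append(list(zip(morphs, glosses)))
    return aligned


# Placeholder strings, not an example from examples.csv.
print(align_gloss("ab-cd ef", "1-go now"))
```

Raising an error on a mismatch, rather than padding, makes misaligned rows easy to find when checking the file.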
extensions.csv
- ID: IDs referring to extensions.
- Language_ID: Reference to languages:ID.
- Form: Form of the innovative prefix.
- Cognateset_ID: cognate_sets:ID
- Comment: Comments.
bathe_data.csv
- Language_ID: Reference to languages:ID.
- Form: + separates (etymological) morphemes.
- Source: bibkey[page] referencing references.bib; multiple sources are separated by ; and pc marks a personal communication.
- Cognateset_ID: cognate_sets:ID+cognate_sets:ID
- Transitivity: Either transitive or intransitive 'to bathe'.
- Comment: Comments
apalai_sa_verb_stats.csv
- Form: Form.
- Meaning: Meaning.
- ID: cognate_sets:ID or generic reg_sa
- Count: Number of times the verb occurred.
- % Sa: Share of this verb's tokens among all Sa verb tokens.
- % Words: Share of this verb's tokens among all words.
- High_Frequency: Defined as a frequency above the average; used for classifying cognate verbs.
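The derived columns can be recomputed from Count along the following lines. This is only a sketch: it assumes that "average" means the arithmetic mean of Count over the listed verbs and that % Sa is each verb's count divided by the total count, neither of which is confirmed above. The function names and the sample counts are made up.

```python
from statistics import mean


def sa_share(counts):
    """Percentage of all listed verb tokens accounted for by each verb."""
    total = sum(counts.values())
    return {form: 100 * count / total for form, count in counts.items()}


def flag_high_frequency(counts):
    """Mark verbs whose count lies above the mean count (assumed definition)."""
    average = mean(counts.values())
    return {form: count > average for form, count in counts.items()}


# Placeholder counts, not figures from apalai_sa_verb_stats.csv.
counts = {"a": 5, "b": 3, "c": 2}
print(sa_share(counts))
print(flag_high_frequency(counts))
```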
cognate_sets.csv
languages.csv
- ID: A three- (attested) or four-letter (reconstructed) string.
- Name: The name as used in the paper.
- Glottocode: The identifier used in Glottolog.
- Longitude: Longitude in decimal format.
- Latitude: Latitude in decimal format.
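Since most tables point to languages:ID, a quick consistency check is easy to write. The sketch below uses only the standard library and in-memory CSV text so that it runs on its own; the helper names `language_status` and `unknown_language_ids` and all IDs shown are hypothetical, and the length rule simply restates the ID description above.

```python
import csv
import io


def language_status(language_id):
    """Three-letter IDs are attested languages, four-letter IDs reconstructed ones."""
    if len(language_id) == 3:
        return "attested"
    if len(language_id) == 4:
        return "reconstructed"
    raise ValueError(f"Unexpected ID length: {language_id!r}")


def unknown_language_ids(data_rows, language_rows):
    """Return Language_ID values that have no matching ID in languages.csv."""
    known = {row["ID"] for row in language_rows}
    return sorted({row["Language_ID"] for row in data_rows} - known)


# Placeholder tables; with the real files, open them with csv.DictReader instead.
languages = list(csv.DictReader(io.StringIO("ID,Name\nabc,Lang A\npabc,Proto-A\n")))
data = list(csv.DictReader(io.StringIO("Language_ID,Form\nabc,x\nxyz,y\n")))

print([language_status(row["ID"]) for row in languages])
print(unknown_language_ids(data, languages))
```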
inflection_data.csv
- Meaning_ID: Reference to values of cognate_sets:Meaning.
- Verb_Cognateset_ID: cognate_sets:ID+cognate_sets:ID.
- Language_ID: Reference to languages:ID.
- Inflection: Person inflection of the form.
- Form: - marks boundaries between productive morphemes, + marks boundaries between unproductive morphemes.
- Prefix_Cognateset_ID: cognate_sets:ID+cognate_sets:ID.
- Source: bibkey[page] referencing references.bib; multiple sources are separated by ; and pc marks a personal communication.
- Full_Form: The full form in the source, if applicable.
- Comment: Comments
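The two boundary symbols in the inflection Form column can be separated from the morphemes with a regular expression. This sketch assumes that - and + occur in forms only as boundary markers; the function name `split_form` and the sample form are invented.

```python
import re


def split_form(form):
    """Split a form into morphemes and the type of each boundary between them."""
    # The capture group keeps the separators in the result list.
    parts = re.split(r"([-+])", form)
    morphemes = parts[0::2]
    boundaries = [
        "productive" if separator == "-" else "unproductive"
        for separator in parts[1::2]
    ]
    return morphemes, boundaries


# Placeholder form, not a row from inflection_data.csv.
print(split_form("a-bc+d"))
```

A form with n morphemes yields n - 1 boundary labels, so the two lists can be interleaved again to rebuild the original string.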
