Language-recognizer

A first attempt at programming something useful in Python. The program recognizes the language of a text input using n-gram probabilities.

##langRecognizer.py

Try: python3 langRecognizer.py "tell me: in which language is this input" some_vector_file
The vector file is optional; the default is language_vector.p.
Running python3 langRecognizer.py without any arguments works as well.
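As a rough illustration of recognition by n-gram probability, here is a minimal sketch. It is not the code in langRecognizer.py: the function names, the choice of character trigrams, the cosine-similarity comparison, and the toy sample sentences are all assumptions made for the example.

```python
import math
from collections import Counter


def trigram_profile(text):
    """Map each character trigram in text to its relative frequency."""
    text = text.lower()
    counts = Counter(text[i:i + 3] for i in range(len(text) - 2))
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()} if total else {}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(p * b.get(gram, 0.0) for gram, p in a.items())
    norm_a = math.sqrt(sum(p * p for p in a.values()))
    norm_b = math.sqrt(sum(p * p for p in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def guess_language(text, vectors):
    """Return the language whose vector is most similar to the text's profile."""
    profile = trigram_profile(text)
    return max(vectors, key=lambda lang: cosine(profile, vectors[lang]))


# Toy vectors built from two short sample sentences (hypothetical data,
# far too small for real use).
vectors = {
    "english": trigram_profile("the quick brown fox jumps over the lazy dog and the cat"),
    "german": trigram_profile("der schnelle braune fuchs springt über den faulen hund und die katze"),
}

print(guess_language("the dog and the fox", vectors))
```

Real vectors would be built from much larger texts; with samples this small, the result is only reliable for inputs that closely resemble the training sentences.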

##ngrams.py

Creates n-grams, counts the number of n-grams in a text, computes their probabilities...
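The idea can be sketched as follows. This is not the actual content of ngrams.py: the function names `char_ngrams` and `ngram_probabilities` are hypothetical, and the sketch assumes character-level n-grams with relative frequency as the probability estimate.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return the list of overlapping character n-grams in text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_probabilities(text, n=3):
    """Map each n-gram to its relative frequency (count / total n-grams)."""
    counts = Counter(char_ngrams(text, n))
    total = sum(counts.values())
    # An input shorter than n yields no n-grams, hence an empty result.
    return {gram: count / total for gram, count in counts.items()} if total else {}
```

For example, `char_ngrams("abcd", 3)` gives `["abc", "bcd"]`, and `ngram_probabilities("aaaa", 2)` gives `{"aa": 1.0}`.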

##langVector.py

##addVector.py

Command-line script for adding a new language vector to an existing (or new) vector file:
python3 addVector.py "language" file_with_plain_text file_with_vectors
The last argument is optional; the default is language_vector.p.
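A minimal sketch of how such an update could work. It assumes the vector file is a pickled dict mapping a language name to its vector, which the .p extension suggests but this README does not confirm. The function name `add_vector` is hypothetical and is not taken from addVector.py.

```python
import os
import pickle


def add_vector(language, vector, path="language_vector.p"):
    """Load the dict stored at path (or start empty), add one entry, save it back."""
    vectors = {}
    if os.path.exists(path):
        with open(path, "rb") as handle:
            vectors = pickle.load(handle)
    # Assumed layout: {language name: vector}; an existing entry is overwritten.
    vectors[language] = vector
    with open(path, "wb") as handle:
        pickle.dump(vectors, handle)
    return vectors
```

When experimenting with this sketch, point `path` at a scratch file rather than the shipped language_vector.p, since the real file's layout may differ from the one assumed here.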

##language_vector.p

File with ready-made vectors, intended for testing (created from a small amount of data), but you can also use it to try the recognizer out.
Data sources: Gutenberg.org, Wikipedia.
It can recognize these languages:
- czech
- english
- german
- finnish
- swedish
- norwegian (bokmål)
- nynorsk (new norwegian)
- danish
- slovak
- bulgarian
- hungarian
- russian
- italian
- french
- spanish
- urdu
- persian (farsi)
- arabic
