This repository contains my implementation of IBM Models 1, 2, and 3. A Turkish-English sentence-aligned bilingual corpus was used as the dataset. The dataset is relatively large, so I have not uploaded it; if you need it, please contact me. The project was developed in PyCharm using Python, so the IDE-related files are also included.
At the very beginning of machine translation history, rule-based machine translation systems were popular. In rule-based systems, the translation problem is solved using rules. In 1988, Brown et al. introduced a statistical machine translation system, and in 1993 they described the IBM models in detail. Statistical machine translation tries to find the most probable translation of a given sentence. There are several kinds of statistical machine translation, such as word-based and phrase-based; the IBM models are word-based.
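To make the word-based idea concrete, here is a minimal sketch of how IBM Model 1 translation probabilities can be estimated with expectation-maximization. It is not the code in this repository: the function name `train_ibm_model1`, the `<NULL>` token string, and the four-sentence toy corpus are assumptions made only for this illustration.

```python
from collections import defaultdict

NULL = "<NULL>"  # hypothetical placeholder for the empty target word


def train_ibm_model1(pairs, iterations=10):
    """Estimate t[(f, e)] = P(f | e) from (source_tokens, target_tokens) pairs."""
    source_vocab = {f for fs, _ in pairs for f in fs}
    # Start from a uniform distribution over the source vocabulary.
    t = defaultdict(lambda: 1.0 / len(source_vocab))

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of f aligned to e
        total = defaultdict(float)  # expected count of e aligned to anything

        # E-step: distribute each source word over the target words.
        for fs, es in pairs:
            targets = [NULL] + list(es)
            for f in fs:
                norm = sum(t[(f, e)] for e in targets)
                for e in targets:
                    share = t[(f, e)] / norm
                    count[(f, e)] += share
                    total[e] += share

        # M-step: renormalize the expected counts into probabilities.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})

    return t


# Toy Turkish-English corpus, invented for this example.
corpus = [
    (["ev"], ["house"]),
    (["kitap"], ["book"]),
    (["bir", "ev"], ["a", "house"]),
    (["bir", "kitap"], ["a", "book"]),
]
t = train_ibm_model1(corpus)
```

After training on this toy corpus, `t[("ev", "house")]` ends up larger than `t[("bir", "house")]`, because "ev" appears with "house" in every sentence pair that contains "house". Models 2 and 3 build on the same table by adding alignment and fertility parameters, which this sketch does not cover.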
The program has no dedicated user interface. It can be run in any environment that supports Python 3.5.1. To start it, run the “python3 Main.py” command in a Linux terminal or an equivalent environment. To stop the program, use the environment's usual interrupt command, such as CTRL+C on Linux. Alternatively, choose the terminate option by pressing 9.
All details about the project can be found in the file "Design and Implemantation Report".