Natural-Language-Processing/Eng-Hin-NMT

An implementation of English-to-Hindi Neural Machine Translation (NMT). The model is trained on 140,000 sentences taken from the IIT Bombay English-Hindi Parallel Corpus.

A preview of the model configuration:

Embedding Layer
Bi-directional LSTM # Encoder
Repeat Vector # connects the encoder to the decoder, since the input and output shapes might be different
Bi-directional LSTM # Decoder
Dense # produces the output in the desired shape
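
The layer stack above could be wired together roughly as follows. This is a minimal Keras sketch, not the repository's actual code: the function name `build_model`, the vocabulary sizes, sequence lengths, embedding size, unit count, softmax activation, optimizer, and loss are all assumptions chosen for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model(src_vocab, tgt_vocab, src_len, tgt_len, embed_dim=128, units=128):
    """Hypothetical builder mirroring the five layers listed above."""
    model = keras.Sequential([
        keras.Input(shape=(src_len,)),
        # Map each source token id to a dense vector.
        layers.Embedding(src_vocab, embed_dim),
        # Encoder: reads the sentence in both directions and returns one
        # fixed-size vector (forward and backward states concatenated).
        layers.Bidirectional(layers.LSTM(units)),
        # Repeat the encoder vector once per target time step, so the
        # decoder can emit a sequence whose length differs from the input.
        layers.RepeatVector(tgt_len),
        # Decoder: returns an output for every target time step.
        layers.Bidirectional(layers.LSTM(units, return_sequences=True)),
        # Project each time step onto the target vocabulary.
        layers.Dense(tgt_vocab, activation="softmax"),
    ])
    # Assumed training setup: integer target ids with a softmax output.
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    return model


# Small illustrative sizes, not the values used in the notebook.
model = build_model(src_vocab=1000, tgt_vocab=1200, src_len=10, tgt_len=12,
                    embed_dim=16, units=8)
```

Calling `model.summary()` on the result prints the output shape of each layer, which makes it easy to see where the Repeat Vector layer changes the sequence length from the source side to the target side.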

Dataset Used:

IIT Bombay English-Hindi parallel corpus. The dataset can be downloaded from http://www.cfilt.iitb.ac.in/iitb_parallel/
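
As a rough illustration of working with a parallel corpus, the sketch below reads two line-aligned text files into sentence pairs. It assumes the English and Hindi sides are stored as separate UTF-8 files with one sentence per line; the function name `load_pairs` and the default limit are hypothetical, and the actual cleaning and loading steps are the ones in preprocess.py.

```python
from itertools import islice


def load_pairs(en_path, hi_path, limit=140_000):
    """Read the first `limit` aligned lines and return (English, Hindi) pairs.

    Hypothetical helper: assumes line i of `en_path` is the translation
    of line i of `hi_path`.
    """
    pairs = []
    with open(en_path, encoding="utf-8") as en_file, \
            open(hi_path, encoding="utf-8") as hi_file:
        for en_line, hi_line in islice(zip(en_file, hi_file), limit):
            en, hi = en_line.strip(), hi_line.strip()
            # Drop pairs where either side is empty after stripping.
            if en and hi:
                pairs.append((en, hi))
    return pairs
```

Because empty pairs are dropped after the first `limit` lines are read, the returned list can be slightly shorter than `limit`.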

Steps to run:

Follow the steps in preprocess.py to clean and load the dataset.
1. You can run the notebook in Google Colab by clicking the link at the top of the notebook.
2. To run it on a local machine (not recommended, since it needs a GPU), install the dependencies from requirements.txt, run preprocess.py, and then run the main NMT file.
