CNN and LSTM models for identifying the relation between two specific tokens in sentences

aidamd/RelationExtraction

This code uses Python 3.6 and the following libraries:

  • nltk 3.2.5
  • numpy 1.15.2
  • tensorflow 1.8
  • sklearn 0.19.2

The params.json file contains the parameters of the program:

  • attention_size: size of the hidden vectors that are generated to produce the attention values
  • num_layers: number of cell layers in the RNN
  • model: can be LSTM, BiLSTM or CNN
  • hidden_size: the hidden size of the neural cells
  • n_outputs: number of labels; it is added to the params file by make_vocab.py
  • filter_sizes: sizes of the different filters used by the CNN
  • num_filters: number of filters used in the CNN
  • pretrain: whether to use pretrained word embeddings
  • embedding_size: size of the embeddings
  • learning_rate: learning rate of the model
  • keep_ratio: keep ratio of cells
  • epochs: maximum number of epochs
  • max_char: added by make_vocab.py
  • max_len: added by make_vocab.py
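As a rough illustration of how the parameters above might be laid out and read, the sketch below parses a JSON string and checks the model field. All values are hypothetical placeholders, not the defaults shipped in the repository, and load_params is a helper invented for this example. The keys that make_vocab.py adds (n_outputs, max_char, max_len) are left out.

```python
import json

# Hypothetical example values; the actual params.json in the repository may differ.
EXAMPLE_PARAMS = """
{
  "model": "BiLSTM",
  "num_layers": 1,
  "hidden_size": 128,
  "attention_size": 64,
  "filter_sizes": [3, 4, 5],
  "num_filters": 100,
  "pretrain": true,
  "embedding_size": 300,
  "learning_rate": 0.001,
  "keep_ratio": 0.5,
  "epochs": 50
}
"""

ALLOWED_MODELS = ("LSTM", "BiLSTM", "CNN")


def load_params(text):
    """Parse a params.json-style string and validate the model name."""
    params = json.loads(text)
    if params.get("model") not in ALLOWED_MODELS:
        raise ValueError("model must be one of %s" % ", ".join(ALLOWED_MODELS))
    return params


params = load_params(EXAMPLE_PARAMS)
print(params["model"], params["filter_sizes"])
```

To read the real file instead, the same function could be called as load_params(open("params.json").read()).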

Please add the 300d GloVe word embeddings (download from http://nlp.stanford.edu/data/glove.840B.300d.zip) to the embeddings folder and name the file "glove.300.txt".
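For reference, a GloVe text file stores one token per line, followed by its vector components separated by spaces. The sketch below shows one way such a file could be parsed. It is not the loader used by this repository, and read_glove is a hypothetical helper. A few tokens in the 840B file are reported to contain spaces, so the vector is taken from the right-hand end of each line.

```python
import io


def read_glove(lines, dim=300, vocab=None):
    """Parse GloVe-style lines into a dict mapping token -> list of floats.

    lines: any iterable of text lines (for example an open file).
    dim:   number of vector components expected per line.
    vocab: optional set of tokens to keep; other tokens are skipped.
    """
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) <= dim:
            continue  # skip blank or malformed lines
        # Everything before the last `dim` fields is the token itself.
        word = " ".join(parts[:-dim])
        if vocab is not None and word not in vocab:
            continue
        vectors[word] = [float(x) for x in parts[-dim:]]
    return vectors


# Tiny in-memory stand-in for embeddings/glove.300.txt, using 3 dimensions.
sample = io.StringIO("cat 0.1 0.2 0.3\n. . . 0.4 0.5 0.6\n")
vectors = read_glove(sample, dim=3)
print(sorted(vectors))
```

With the real file, the call might look like read_glove(open("embeddings/glove.300.txt", encoding="utf-8"), dim=300, vocab=my_vocab), where my_vocab is a set of the tokens seen in the data. Restricting to that set keeps memory use manageable.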

You can first run "python3 make_vocab.py" to read the train and test files, partition the train set into train and dev sets, compute position vectors for all sentences, convert the sentences to their GloVe representation, and extract the labels.
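Two of these steps can be sketched in a few lines. The code below is an illustration only and is not taken from make_vocab.py. It assumes that a position vector holds each token's signed offset from one of the two target tokens, which is a common choice in relation extraction, and that the dev set is a random slice of the training data. The function names, the 10% default ratio, and the example sentence are all assumptions.

```python
import random


def split_train_dev(examples, dev_ratio=0.1, seed=0):
    """Shuffle the examples and hold out a fraction of them as a dev set."""
    items = list(examples)
    random.Random(seed).shuffle(items)
    n_dev = int(len(items) * dev_ratio)
    return items[n_dev:], items[:n_dev]


def position_vectors(num_tokens, e1_index, e2_index):
    """Return each token's signed offset from the two target tokens."""
    pos1 = [i - e1_index for i in range(num_tokens)]
    pos2 = [i - e2_index for i in range(num_tokens)]
    return pos1, pos2


tokens = ["the", "fire", "was", "caused", "by", "lightning"]
# Assume the two target tokens are "fire" (index 1) and "lightning" (index 5).
pos1, pos2 = position_vectors(len(tokens), 1, 5)
print(pos1)
print(pos2)

train, dev = split_train_dev(range(20), dev_ratio=0.2)
print(len(train), len(dev))
```

In practice the offsets are usually shifted to be non-negative and padded so they can index an embedding table. Whether max_len is used for that here is not stated in this README.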

make_vocab.py generates data.pkl, which contains all the data. You can also skip this step and use the data.pkl file that is already in the repository.
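The internal layout of data.pkl is not described here, so a small inspection helper can be useful before training. The sketch below loads a pickle and summarizes its top-level structure. describe_pickle and the stand-in contents are hypothetical and say nothing about what the real file holds.

```python
import os
import pickle
import tempfile


def describe_pickle(path):
    """Load a pickle file and summarize the types of its top-level entries."""
    with open(path, "rb") as f:
        data = pickle.load(f)  # only load pickle files from sources you trust
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    if isinstance(data, (list, tuple)):
        return [type(value).__name__ for value in data]
    return type(data).__name__


# Demonstration with a stand-in file; the real data.pkl may be organized differently.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "data.pkl")
    with open(demo_path, "wb") as f:
        pickle.dump({"train": [[1, 2]], "labels": (0, 1)}, f)
    summary = describe_pickle(demo_path)

print(summary)
```

Running describe_pickle("data.pkl") from the repository root should show how the stored data is organized.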
