Error on using Spacy Tokenizer #10

@radhikasethi2011

I edited tokenize.py and, in main, called

tokenizer=SpacyTokenize()

to use the spaCy tokenizer for English text. However, I always end up getting a

tcmalloc large alloc 

memory error when running on Google Colab.

Any thoughts on how I can use the English tokenizer for my own dataset? Alternatively, how do you run the code for the GSM model on the English dataset dailydialoguttr_lines.txt? @zll17
