TextGrids cannot be read if they contain special/IPA characters #52

@mfaytak

Expected behaviour
Read in a TextGrid (long format) using: tg = pympi.Praat.TextGrid(path_to_textgrid)

Actual behaviour
The call raises an AttributeError (traceback below) and the loop halts whenever any interval tier contains non-ASCII characters such as ɪ, ŋ, or ɛ. All other TextGrids are imported without issues, as expected.
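One thing worth checking, as a guess rather than a confirmed diagnosis: Praat can write text files as UTF-16 when they contain non-ASCII characters, depending on its text writing preferences, so the failing files may differ from the working ones in file encoding and not only in tier content. A minimal sketch for inspecting the byte-order mark of each file, using only the standard library (`sniff_bom` is a hypothetical helper name, not part of pympi):

```python
import codecs

def sniff_bom(path):
    """Return a codec name guessed from the file's byte-order mark, or None."""
    with open(path, "rb") as f:
        head = f.read(4)
    # UTF-32 must be tested first: its little-endian BOM starts with the UTF-16 LE BOM.
    if head.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # No BOM: plain ASCII/UTF-8, or some other encoding this check cannot see.
    return None
```

If the failing TextGrids report "utf-16" and the working ones report None, the encoding is a likely culprit.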

System information

  • python version: 3.11 (Jupyter Notebook kernel)
  • os: macOS 13.4.1 (Ventura)
  • are you up to date with the latest master?: Yes

Offending notebook cell (which imports any TGs not containing ɛ or ɪ just fine):

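import os
import pympi

# corpus: path to the corpus root, defined earlier in the notebook
for subj in os.listdir(corpus):
    for file in os.listdir(os.path.join(corpus,subj)):
        # Only process Praat TextGrid files
        if not file.endswith(".TextGrid"):
            continue
        print(file)
        tg = pympi.Praat.TextGrid(os.path.join(corpus,subj,file))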
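To get a full list of the files that fail instead of halting at the first one, the same loop can record exceptions and carry on. This is only a sketch: `load_all` and its `loader` argument are hypothetical names introduced here, and the directory layout (corpus/subject/file) is taken from the cell above.

```python
import os

def load_all(corpus, loader):
    """Call loader(path) on every .TextGrid under corpus/<subj>/.

    Returns two dicts keyed by path: successfully loaded objects and raised exceptions.
    """
    loaded, failed = {}, {}
    for subj in sorted(os.listdir(corpus)):
        subj_dir = os.path.join(corpus, subj)
        if not os.path.isdir(subj_dir):
            continue  # skip stray files such as .DS_Store
        for name in sorted(os.listdir(subj_dir)):
            if not name.endswith(".TextGrid"):
                continue
            path = os.path.join(subj_dir, name)
            try:
                loaded[path] = loader(path)
            except Exception as exc:
                # Record the failure and keep going with the next file.
                failed[path] = exc
    return loaded, failed
```

In the notebook this would presumably be called as `load_all(corpus, pympi.Praat.TextGrid)`, after which `failed` maps each problematic path to its exception.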

Full traceback of the issue I am encountering is included below.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[40], line 11
      9     continue
     10 print(file)
---> 11 tg = pympi.Praat.TextGrid(os.path.join(corpus,subj,file))
     12 for tier in tg.get_tiers():
     13     print(tier.name)

File ~/miniconda3/envs/cameroon/lib/python3.11/site-packages/pympi/Praat.py:44, in TextGrid.__init__(self, file_path, xmin, xmax, codec)
     42 else:
     43     with open(file_path, 'rb') as f:
---> 44         self.from_file(f, codec)

File ~/miniconda3/envs/cameroon/lib/python3.11/site-packages/pympi/Praat.py:101, in TextGrid.from_file(self, ifile, codec)
     99 # Skip the Headers and empty line
    100 next(ifile), next(ifile), next(ifile)
--> 101 self.xmin = float(nn(ifile, regfloat))
    102 self.xmax = float(nn(ifile, regfloat))
    103 # Skip <exists>

File ~/miniconda3/envs/cameroon/lib/python3.11/site-packages/pympi/Praat.py:94, in TextGrid.from_file.<locals>.nn(ifile, pat)
     92 def nn(ifile, pat):
     93     line = next(ifile).decode(codec)
---> 94     return pat.search(line).group(1)

AttributeError: 'NoneType' object has no attribute 'group'
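The traceback fails on the xmin line of the header, before any interval text is parsed, which would be consistent with a file-level encoding problem such as UTF-16 (an unconfirmed guess). The sketch below mimics what the traceback shows (skip three raw lines, decode the next one, search it with a regex) and illustrates how UTF-16 bytes decoded as UTF-8 leave NUL characters that stop the pattern from matching. `float_at_end` is a hypothetical stand-in for pympi's `regfloat`, whose real definition is not shown above, and `xmin_line` and `to_utf8_copy` are likewise illustrative names.

```python
import codecs
import io
import re

# Hypothetical stand-in for pympi's regfloat; the real pattern is not shown in the traceback.
float_at_end = re.compile(r"([\d.]+)\s*$")

# A minimal long-format TextGrid header, enough to reach the xmin line.
header = 'File type = "ooTextFile"\nObject class = "TextGrid"\n\nxmin = 0 \nxmax = 2.5 \n'

def xmin_line(raw, codec="utf-8"):
    """Mimic the traceback: skip three raw lines, then decode the fourth with codec."""
    f = io.BytesIO(raw)
    next(f), next(f), next(f)
    return next(f).decode(codec)

def to_utf8_copy(src, dst, src_encoding="utf-16"):
    """Write a UTF-8 copy of src so that a UTF-8 based parser can read it."""
    with open(src, "r", encoding=src_encoding, newline="") as fin:
        text = fin.read()
    with open(dst, "w", encoding="utf-8", newline="") as fout:
        fout.write(text)

as_utf8 = header.encode("utf-8")
as_utf16 = codecs.BOM_UTF16_LE + header.encode("utf-16-le")

# UTF-8 bytes: the pattern finds the number at the end of the line.
print(float_at_end.search(xmin_line(as_utf8)))
# UTF-16 bytes read as UTF-8: NULs sit between characters, search() returns None,
# and calling .group(1) on that None is what raises the AttributeError.
print(float_at_end.search(xmin_line(as_utf16)))
```

If the failing files do turn out to be UTF-16, a possible workaround is to run `to_utf8_copy` on each of them and load the copies, or to re-save them from Praat as UTF-8.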
