change .docx files and it won't work #6

@hongbiaozhu

I modified the .docx file, and the conversion now fails with this traceback:
Traceback (most recent call last):
  File "/Users/a/Library/Favorites/Code/docx-equation-master/demo.py", line 4, in <module>
    convert_to_html('/Users/a/Library/Favorites/Code/docx-equation-master/equation.docx')
  File "/Users/a/Library/Favorites/Code/docx-equation-master/docx_equation/docx.py", line 56, in convert_to_html
    res = mammoth.convert_to_html(new_docx_filename)
  File "/Users/a/opt/anaconda3/envs/trame/lib/python3.9/site-packages/mammoth/__init__.py", line 12, in convert_to_html
    return convert(*args, output_format="html", **kwargs)
  File "/Users/a/opt/anaconda3/envs/trame/lib/python3.9/site-packages/mammoth/__init__.py", line 26, in convert
    return options.read_options(kwargs).bind(lambda convert_options:
  File "/Users/a/opt/anaconda3/envs/trame/lib/python3.9/site-packages/mammoth/results.py", line 15, in bind
    result = func(self.value)
  File "/Users/a/opt/anaconda3/envs/trame/lib/python3.9/site-packages/mammoth/__init__.py", line 27, in <lambda>
    docx.read(fileobj).map(transform_document).bind(lambda document:
  File "/Users/a/opt/anaconda3/envs/trame/lib/python3.9/site-packages/mammoth/docx/__init__.py", line 31, in read
    return results.combine([
  File "/Users/a/opt/anaconda3/envs/trame/lib/python3.9/site-packages/mammoth/results.py", line 15, in bind
    result = func(self.value)
  File "/Users/a/opt/anaconda3/envs/trame/lib/python3.9/site-packages/mammoth/docx/__init__.py", line 35, in <lambda>
    _read_document(zip_file, read_part_with_body, notes=referents[0], comments=referents[1], part_paths=part_paths)
  File "/Users/a/opt/anaconda3/envs/trame/lib/python3.9/site-packages/mammoth/docx/__init__.py", line 127, in _read_document
    return read_part_with_body(
  File "/Users/a/opt/anaconda3/envs/trame/lib/python3.9/site-packages/mammoth/docx/__init__.py", line 172, in read_part
    return _read_entry(zip_file, name, partial(reader, body_reader=body_reader))
  File "/Users/a/opt/anaconda3/envs/trame/lib/python3.9/site-packages/mammoth/docx/__init__.py", line 202, in _read_entry
    return reader(office_xml.read(fileobj))
  File "/Users/a/opt/anaconda3/envs/trame/lib/python3.9/site-packages/mammoth/docx/office_xml.py", line 20, in read
    return _collapse_alternate_content(parse_xml(fileobj, _namespaces))[0]
  File "/Users/a/opt/anaconda3/envs/trame/lib/python3.9/site-packages/mammoth/docx/xmlparser.py", line 83, in parse_xml
    document = xml.dom.minidom.parse(fileobj)
  File "/Users/a/opt/anaconda3/envs/trame/lib/python3.9/xml/dom/minidom.py", line 1988, in parse
    return expatbuilder.parse(file)
  File "/Users/a/opt/anaconda3/envs/trame/lib/python3.9/xml/dom/expatbuilder.py", line 913, in parse
    result = builder.parseFile(file)
  File "/Users/a/opt/anaconda3/envs/trame/lib/python3.9/xml/dom/expatbuilder.py", line 207, in parseFile
    parser.Parse(buffer, False)
xml.parsers.expat.ExpatError: mismatched tag: line 2, column 6134

The same error occurs even if I only insert a line break (Enter) at the end of the provided equation.docx. I have attached the modified file:

equation.docx

How can I fix it? Thanks.
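For anyone trying to narrow this down: the traceback ends in an expat "mismatched tag" error raised while mammoth parses an XML part of the file that `docx_equation/docx.py` passes to it (`new_docx_filename`). A .docx file is a zip archive of XML parts, so one way to locate the problem is to parse each part directly and print the text around the reported position. The following is only a diagnostic sketch under that assumption; `find_malformed_parts` is a hypothetical helper name and is not part of docx-equation or mammoth.

```python
import zipfile
import xml.parsers.expat


def find_malformed_parts(docx_path, context=40):
    """Return (part_name, error_message, snippet) for each XML part that fails to parse.

    Hypothetical diagnostic helper; not part of docx-equation or mammoth.
    """
    problems = []
    with zipfile.ZipFile(docx_path) as archive:
        for name in archive.namelist():
            # Only XML parts are of interest; skip images and other binaries.
            if not name.endswith((".xml", ".rels")):
                continue
            data = archive.read(name)
            parser = xml.parsers.expat.ParserCreate()
            try:
                parser.Parse(data, True)
            except xml.parsers.expat.ExpatError as err:
                # err.lineno is 1-based; err.offset is the 0-based column,
                # treated here as a byte position within that line.
                line = data.split(b"\n")[err.lineno - 1]
                start = max(err.offset - context, 0)
                snippet = line[start:err.offset + context].decode("utf-8", "replace")
                problems.append((name, str(err), snippet))
    return problems
```

Calling `find_malformed_parts(path)` on the rewritten file (the one mammoth actually receives, not necessarily the original equation.docx) should list the part that triggers the error together with the surrounding markup, which may show which tag is left unclosed after the edit.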
