pandas to_csv will fail on | when text scraping  #4

@NemoAndrea

The text scraper can pick up strings that contain | or other valid CSV separators. The current code does not set an escape character, so to_csv raises an exception in the rare cases where such strings occur.

https://github.com/meronvermaas/PURE_fulltext_analysis/blob/60a8410a754d3a650a4d93cf57c0af05143e4a84/pure_scraper/pdf_converter.py#L60

A possible fix is to set the escapechar option of pandas' DataFrame.to_csv.
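A minimal sketch of the failure and the proposed fix. It assumes the file is written with | as the separator and with quoting disabled (csv.QUOTE_NONE); I have not confirmed that these are the exact arguments used in pdf_converter.py. The sample data and the helper name to_pipe_csv are hypothetical.

```python
import csv
import io

import pandas as pd

# Hypothetical scraped text; the second row contains the separator character.
df = pd.DataFrame({"doc_id": [1, 2], "text": ["plain text", "left | right"]})


def to_pipe_csv(frame, escapechar=None):
    """Serialize with '|' as separator and quoting disabled (assumed settings)."""
    buffer = io.StringIO()
    frame.to_csv(
        buffer,
        sep="|",
        index=False,
        quoting=csv.QUOTE_NONE,
        escapechar=escapechar,
    )
    return buffer.getvalue()


# Without an escape character the writer cannot represent the embedded '|'.
try:
    to_pipe_csv(df)
    failed = False
except csv.Error as exc:
    failed = True
    print("write failed:", exc)

# With an escape character the embedded separator is written as '\|'.
escaped = to_pipe_csv(df, escapechar="\\")
print(escaped)

# Reading back with the same settings restores the original text.
restored = pd.read_csv(
    io.StringIO(escaped),
    sep="|",
    quoting=csv.QUOTE_NONE,
    escapechar="\\",
)
```

Any code that reads the file back needs the same escapechar, otherwise the escaped separator is treated as a column boundary. If quoting does not have to stay disabled, leaving quoting at its default would be an alternative, since pandas then quotes fields that contain the separator.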
