
Span splitter breaks down when using raw file paths #2

@sortedcord


When the query contains a raw path to a file or directory, the span splitter does not parse it correctly, as the following case shows:

"copy all jpg files from /home/rahul/photos/2023_trip/ to /backup/photos/trip_2023/"

Results in:

[[all, jpg, files], [from, /home, /, rahul, /, photos/2023_trip/], [to, /backup, /, photos, /, trip_2023/]]

A possible solution is a regex-based identifier that detects these hardcoded paths in the query and replaces each one with a dummy token that is always classified as a noun, such as Directory00 or File00, during the POS and DEP tagging phase.
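
A rough sketch of what that masking step could look like, using only the standard library. Everything here is an assumption for illustration and not code from the project: the names `PATH_RE`, `mask_paths`, and `unmask` are hypothetical, the pattern only targets POSIX-style paths that start at a whitespace boundary, and the "trailing slash means directory" rule is a guess at how Directory and File tokens might be told apart.

```python
import re

# Assumed pattern: a POSIX-style path that begins at a whitespace boundary,
# optionally prefixed with "~", "." or "..", and runs until the next
# whitespace or quote character. Windows paths are not handled.
PATH_RE = re.compile(r"(?<!\S)(?:~|\.{1,2})?/[^\s\"']*")


def mask_paths(query):
    """Replace each raw path with a placeholder token.

    Returns the masked query and a mapping from placeholder to original path.
    """
    mapping = {}

    def _substitute(match):
        path = match.group(0)
        # Heuristic (assumption): a trailing slash marks a directory.
        kind = "Directory" if path.endswith("/") else "File"
        # One shared counter across both kinds, zero-padded to two digits.
        token = f"{kind}{len(mapping):02d}"
        mapping[token] = path
        return token

    return PATH_RE.sub(_substitute, query), mapping


def unmask(tokens, mapping):
    """Swap placeholder tokens back to the original paths after tagging."""
    return [mapping.get(token, token) for token in tokens]


query = "copy all jpg files from /home/rahul/photos/2023_trip/ to /backup/photos/trip_2023/"
masked, mapping = mask_paths(query)
print(masked)
# copy all jpg files from Directory00 to Directory01
print(unmask(["from", "Directory00"], mapping))
# ['from', '/home/rahul/photos/2023_trip/']
```

The masked query would go through POS and DEP tagging and span splitting as usual, and `unmask` would restore the real paths inside each span afterwards. Whether the tagger reliably treats tokens like Directory00 as nouns would still need to be checked.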

Labels: bug

Status: In progress