When the query contains a raw path to a file or directory, the span splitter does not parse it correctly, as the following case shows:
"copy all jpg files from /home/rahul/photos/2023_trip/ to /backup/photos/trip_2023/"
Results in:
[[all, jpg, files], [from, /home, /, rahul, /, photos/2023_trip/], [to, /backup, /, photos, /, trip_2023/]]
A possible solution is a regex-based identifier that detects these hardcoded paths in the query and replaces each one with a dummy token, such as Directory00 or File00, that is always classified as a noun during the POS & DEP tagging phase.
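The masking idea could be sketched roughly as follows. This is only an illustration, not the splitter's actual code: `PATH_RE`, `mask_paths`, and `unmask_spans` are hypothetical names, the pattern only covers absolute Unix-style paths, and the Directory/File decision (trailing slash or no extension means directory) is an assumed heuristic. Paths followed directly by punctuation are not handled.

```python
import re

# Hypothetical pattern: an absolute Unix-style path that starts after
# whitespace (or at the start of the query) and contains no spaces.
PATH_RE = re.compile(r"(?<!\S)/[^\s/]+(?:/[^\s/]+)*/?")


def mask_paths(query):
    """Replace each raw path with a noun-like placeholder token.

    Returns the masked query and a mapping from placeholder to original path.
    """
    mapping = {}
    counts = {"Directory": 0, "File": 0}

    def _sub(match):
        path = match.group(0)
        last = path.rstrip("/").rsplit("/", 1)[-1]
        # Assumed heuristic: no trailing slash plus a dot in the last
        # segment means a file; everything else is treated as a directory.
        kind = "File" if not path.endswith("/") and "." in last else "Directory"
        token = f"{kind}{counts[kind]:02d}"
        counts[kind] += 1
        mapping[token] = path
        return token

    return PATH_RE.sub(_sub, query), mapping


def unmask_spans(spans, mapping):
    """Put the original paths back into the spans after tagging and splitting."""
    return [[mapping.get(tok, tok) for tok in span] for span in spans]


query = "copy all jpg files from /home/rahul/photos/2023_trip/ to /backup/photos/trip_2023/"
masked, mapping = mask_paths(query)
print(masked)  # copy all jpg files from Directory00 to Directory01
```

The masked query would go through POS & DEP tagging and span splitting as usual, and `unmask_spans` would then restore the real paths, so each path stays a single token inside its span.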