
pg_bulkload on PG13 fails to handle large data duplicates #148

@Tyler-DiPentima

Description

Hey,

I am testing pg_bulkload with a sample dataset that has a unique constraint across 3 of its columns. With a dataset of ~200 lines, pg_bulkload loads the data fine, and when a re-load is attempted with the same data, all rows are recognized as duplicates and discarded. This was on PG 9.6.20.

After upgrading to PG 13.3, the same test with the same table and data behaves identically. However, when the dataset is increased to ~350-400 lines, issues begin to occur: the first load works fine, but instead of the second run being detected as duplicates, the following error is output:

```
277 Row(s) loaded so far to table mbaf_exclude. 2 Row(s) bad. 0 rows skipped, duplicates will now be removed
ERROR: query failed: ERROR: could not create unique index "index_name"
DETAIL: Key (account_number, areacode, phone)=(083439221 , 954, 2222222) is duplicated.
DETAIL: query was: SELECT * FROM pg_bulkload($1)
```

The increased data size works as expected on PG 9.6.20, which makes me believe this is a Postgres version compatibility issue. Any guidance here would be greatly appreciated!
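For context, my setup looks roughly like the sketch below. The table and column names are taken from the error message above; the file path, delimiter, and exact option values are illustrative, not my literal configuration. The duplicate-handling options (`DUPLICATE_ERRORS`, `ON_DUPLICATE_KEEP`) are the standard pg_bulkload control-file parameters for discarding rows that violate a unique index during the post-load index rebuild:

```
# sample.ctl -- illustrative pg_bulkload control file (values are assumptions)
OUTPUT = mbaf_exclude            # target table with the 3-column unique index
INPUT = /path/to/data.csv        # input data file
TYPE = CSV
DELIMITER = ","
WRITER = DIRECT                  # direct load path, where duplicate removal applies
DUPLICATE_ERRORS = -1            # tolerate any number of unique-key duplicates
ON_DUPLICATE_KEEP = OLD          # keep existing rows, discard incoming duplicates
```

With this configuration, a second load of the same data should report the duplicate rows as removed rather than aborting with "could not create unique index", which is what happens on PG 9.6.20 but not on PG 13.3 once the dataset grows past a few hundred lines.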
