About Me

My photo
Working as Technical Lead in CollabNET software private limited.

Wednesday 12 January, 2011

Remove duplicate rows from postgress using ctid

Today, I was in a situation to delete the duplicate row entries in a table which has all the column values similar including the unique constraints too.

I'm not sure how the table accepted the unique constraints of same value, it's been an large dataset and I guess some could've gone wrong on migration or some thing else.

While trying to reindex the table it failed stating

>>

reindexdb: reindexing of database "testdb" failed: ERROR: could not create unique index "folder_pk"

DETAIL: Table contains duplicated values.

<<


After trying to remove the duplicate values, I found all the columns values are similar including the primary keys,

So finally got an idea from one forum, which suggests to use the "ctid" of the rows to delete that.

It looks, for the table rows, internally there is ctid and you can view it using

select ctid, id from table;

Then, remove the specific rows which you need using the ctid.