I may be doing this wrong. I ran about 1,250 images (out of more than 350,000) through a perceptual hash, stored each hash in the database, and am using SQL's BIT_COUNT to get the Hamming distance.
I picked a random hash and ran the query below. It returned 2 images at a distance of 0, even though their hashes differ, and roughly 50 or more at a distance of 1. The largest distance was 27.
These are the two images that have different hashes yet came back at a distance of 0:
Hash: 12d2552c66ddc94b (image: https://10deb7fbfece20ff53da-95da5b03499e7e5b086c55c243f676a1.ssl.cf1.rackcdn.com/a1afc58c6ca9540d057299ec3016d726_l.jpg)
Hash: 12e627593dbc2307 (image: https://10deb7fbfece20ff53da-95da5b03499e7e5b086c55c243f676a1.ssl.cf1.rackcdn.com/3b8a614226a953a8cd9526fca6fe9ba5_l.jpg)
As you can see, they are nowhere near the same image.
SELECT i.*, BIT_COUNT('12e627593dbc2307' ^ i.hash) AS hamming_distance
FROM images i
WHERE i.hash IS NOT NULL
ORDER BY hamming_distance ASC;
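As a cross-check outside the database, the Hamming distance between the two hashes can be computed directly. This is only a minimal sketch: it assumes the stored hashes are hexadecimal strings that should be read as 64-bit integers, which may not be how the SQL `^` operator treats a quoted string operand. The function name `hamming_distance_hex` is made up for illustration.

```python
def hamming_distance_hex(hash_a: str, hash_b: str) -> int:
    """Count differing bits between two hashes given as hex strings.

    Assumption: both strings are plain hexadecimal (no "0x" prefix),
    as they appear in the post above.
    """
    # Parse the hex strings as integers, then XOR them so that every
    # bit that differs between the two hashes becomes a 1.
    diff = int(hash_a, 16) ^ int(hash_b, 16)
    # The number of 1 bits in the XOR result is the Hamming distance.
    return bin(diff).count("1")


if __name__ == "__main__":
    # The two hashes that the query reported at a distance of 0.
    print(hamming_distance_hex("12d2552c66ddc94b", "12e627593dbc2307"))
```

If this prints something other than 0, the hashes themselves are probably fine and the difference lies in how the query interprets the string values.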
Does this approach not work on "created" images? Maybe the sample size is too small for an accurate comparison. From what I have read, the image is reduced to 8x8 before hashing; maybe in my case it should be much larger, but I am not sure where to start.
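To make the 8x8 question concrete, here is a rough sketch of how the downscaled size determines the hash length. It uses an average hash, a simpler relative of the perceptual hash, so it is not necessarily the algorithm that produced the hashes above. The function `average_hash` and the sample grids are hypothetical, and the input is assumed to be a grayscale image that has already been downscaled to a small square grid.

```python
def average_hash(pixels):
    """Return a hex hash for a small grayscale grid (list of rows, values 0-255).

    Each pixel contributes one bit: 1 if it is brighter than the mean
    of the grid, otherwise 0. An 8x8 grid therefore gives 64 bits.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    # Four bits per hex digit; pad with leading zeros to a fixed width.
    hex_digits = (len(flat) + 3) // 4
    return format(bits, "0{}x".format(hex_digits))


if __name__ == "__main__":
    # Hypothetical test grids: simple gradients, not real image data.
    grid_8 = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
    grid_16 = [[r * 16 + c for c in range(16)] for r in range(16)]
    print(len(average_hash(grid_8)))   # 16 hex digits = 64 bits
    print(len(average_hash(grid_16)))  # 64 hex digits = 256 bits
```

An 8x8 grid gives the same 16-hex-digit length as the hashes in the post. A larger grid gives a longer hash that no longer fits in a single 64-bit integer, so it could not be compared with one 64-bit BIT_COUNT call.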