This repository was archived by the owner on Jan 11, 2023. It is now read-only.

"ANGULAR" distance defined differently than in main annoy package #22

@jrhaberstroh


This Java implementation defines ANGULAR distance as cosine similarity, while the C++/Python implementation defines ANGULAR distance as sqrt(2 * (1 - cos(theta))), an approximation of the angle theta.

Evidence:

Java implementation -- dot(u,v), the value of cosine

public static float cosineMargin(final float[] u, final float[] v) {

float margin = (INDEX_TYPE == IndexType.ANGULAR) ? cosineMargin(v, queryVector)

C++ / python implementation -- sqrt(2 * (1 - dot(u,v))), an approximation of theta
https://github.com/spotify/annoy/blob/6f6b0c84ab413337eb4d2e850a4cba637f52ccbc/src/annoylib.h#L473
https://github.com/spotify/annoy/blob/6f6b0c84ab413337eb4d2e850a4cba637f52ccbc/src/annoylib.h#L502
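
To make the difference concrete, here is a minimal Python sketch of the two quantities. It is not code from either repository: the function names `cosine_similarity`, `annoy_angular_distance`, and `cosine_to_angular` are hypothetical, and the formulas are just the two expressions quoted above, with cosine computed as the dot product of the normalized vectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (what the Java code is described as returning)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_to_angular(cos):
    """Map a cosine value to sqrt(2 * (1 - cos)), the C++/Python-style distance."""
    # Clamp at zero so floating-point round-off (cos slightly above 1) cannot
    # produce a negative argument to sqrt.
    return math.sqrt(max(2.0 * (1.0 - cos), 0.0))


def annoy_angular_distance(u, v):
    """sqrt(2 * (1 - cos(u, v))), computed directly from two vectors."""
    return cosine_to_angular(cosine_similarity(u, v))


if __name__ == "__main__":
    u, v = [1.0, 0.0], [0.0, 1.0]
    print("cosine similarity:", cosine_similarity(u, v))
    print("angular distance: ", annoy_angular_distance(u, v))
```

Note that the two values are ordered in opposite directions: cosine similarity is largest for the closest vectors, while sqrt(2 * (1 - cos)) is smallest. Since sqrt(2 * (1 - cos(theta))) equals 2 * sin(theta / 2), it is close to theta only for small angles. Under the assumption that the Java side returns a plain cosine, a caller could convert its results with something like `cosine_to_angular` without changing the index itself.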

Related issue defining "angular" method in the other repo: spotify/annoy#530

Because this would be a significant breaking change, I did not put a PR together. Could the lead devs provide guidance on how to name a new distance that implements angular the same way as the C++/Python package (for example, "angular-cpp" or "angular-correct")?
