co-KNN AUC & co-KNN size
Description
The co-KNN AUC and co-KNN size metrics assess how well local neighborhood structures are preserved after dimensionality reduction. They are based on the concept of common k-nearest neighbors (co-KNN), which are the neighbors shared between the original high-dimensional space and the reduced space.
- co-KNN AUC evaluates the global fidelity of local structure preservation using a ROC curve.
- co-KNN size measures the average number of shared neighbors, reflecting local stability.
These metrics are particularly useful for benchmarking dimensionality reduction methods in biological data analysis, such as single-cell RNA-seq.
Formulas
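co-KNN size
\[
\text{co-KNN size} = \frac{1}{N} \sum_{i=1}^{N} \left| N_k^{\text{orig}}(i) \cap N_k^{\text{embed}}(i) \right|
\]
where: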
- \(N\): total number of data points.
- \(N_k^{\text{orig}}(i)\): the set of \(k\) nearest neighbors of point \(i\) in the original space.
- \(N_k^{\text{embed}}(i)\): the set of \(k\) nearest neighbors of point \(i\) in the reduced space.
- Range: \([0, k]\), where higher values indicate better local structure preservation.
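As a rough illustration of the definitions above, the following sketch averages, over all points, the number of neighbors shared between the original and reduced spaces. It assumes Euclidean distance and a brute-force neighbor search using only the Python standard library; the function names `knn_sets` and `co_knn_size` are hypothetical and not taken from any particular package. For real datasets a library neighbor search (for example scikit-learn's `NearestNeighbors`) would be a more practical choice.

```python
import math


def knn_sets(points, k):
    """For each point, return the index set of its k nearest neighbors.

    Assumption: Euclidean distance, self excluded, ties broken by index.
    """
    n = len(points)
    neighbors = []
    for i in range(n):
        # Rank every other point by its distance to point i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (math.dist(points[i], points[j]), j),
        )
        neighbors.append(set(others[:k]))
    return neighbors


def co_knn_size(original, embedded, k):
    """Average number of neighbors shared between the two spaces."""
    orig_nn = knn_sets(original, k)
    embed_nn = knn_sets(embedded, k)
    shared = [len(a & b) for a, b in zip(orig_nn, embed_nn)]
    return sum(shared) / len(shared)


# Toy data: four points on a line, and a 1-D "embedding" that reorders them.
original = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
scrambled = [(0.0,), (3.0,), (1.0,), (2.0,)]

print(co_knn_size(original, original, k=2))   # 2.0 (identical spaces give k)
print(co_knn_size(original, scrambled, k=2))  # 1.25
```

In the scrambled case three points keep one of their two neighbors and one point keeps both, so the average is \((1 + 1 + 1 + 2) / 4 = 1.25\).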
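co-KNN AUC
Construct a ROC curve over pairs of points, treating a pair as a positive example if the two points are co-KNN and as a negative example otherwise.
Compute the Area Under the Curve (AUC), i.e. the integral of the true positive rate (TPR) over the false positive rate (FPR):
\[
\text{AUC} = \int_0^1 \text{TPR} \, d(\text{FPR})
\]
Range: \([0, 1]\), where a value of 1 indicates perfect neighborhood preservation.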
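The pair-labeling step can be sketched as follows. The description above does not say which score is used to rank pairs when tracing the ROC curve, so this example makes an explicit assumption: each ordered pair \((i, j)\) is scored by its negative Euclidean distance in the reduced space, so that closer pairs rank higher. The AUC is computed through its rank interpretation (the probability that a random positive pair outscores a random negative pair, with ties counted as one half), which equals the area under the ROC curve. The names `knn_sets`, `roc_auc`, and `co_knn_auc` are hypothetical, and the result depends on the assumed score.

```python
import math


def knn_sets(points, k):
    """Index set of the k nearest neighbors of each point (Euclidean, self excluded)."""
    n = len(points)
    return [
        set(sorted((j for j in range(n) if j != i),
                   key=lambda j: (math.dist(points[i], points[j]), j))[:k])
        for i in range(n)
    ]


def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative), ties counted as 1/2."""
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    if not pos or not neg:
        raise ValueError("AUC is undefined without both positive and negative pairs")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def co_knn_auc(original, embedded, k):
    orig_nn = knn_sets(original, k)
    embed_nn = knn_sets(embedded, k)
    labels, scores = [], []
    n = len(original)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Positive pair: j is a neighbor of i in BOTH spaces (co-KNN).
            labels.append(j in orig_nn[i] and j in embed_nn[i])
            # ASSUMPTION (not from the source): rank pairs by closeness
            # in the reduced space.
            scores.append(-math.dist(embedded[i], embedded[j]))
    return roc_auc(labels, scores)


# Two well-separated clusters of two points each; the embedding is unchanged.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
print(co_knn_auc(points, points, k=1))  # 1.0
```

Here every co-KNN pair lies within a cluster (distance 1) while every other pair spans the two clusters (distance 9 or more), so all positives outrank all negatives and the AUC is 1.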