Kruskal's stress¶
Description¶
Kruskal Stress is a fundamental metric for evaluating the quality of representation in Multidimensional Scaling (MDS). It measures the discrepancy between distances in the reduced representation space and the original data dissimilarities. This metric quantifies the information loss during dimensionality reduction, allowing to assess whether the projection in reduced dimensions faithfully preserves proximity relationships between points. Stress ranges from 0 (perfect representation) to 1 (highly distorted representation). It allows validation of the quality of high-dimensional data visualizations (i.e. datasets with a large number of variables that pose problems).
Formulas¶
Stress :
where :
-
\(d_{ij}\) is the Euclidean distance between points \(i\) and \(j\) in the reduced representation space
-
\(\hat{d}_{ij}\) represents the transformed dissimilarity (or proximity) between objects \(i\) and \(j\) in the original space
Interpretation Criteria :¶
-
Stress < 0.05: Excellent representation
-
0.05 ≤ Stress < 0.1: Good representation
-
0.10 ≤ Stress < 0.15: Acceptable representation
-
0.15 ≤ Stress < 0.20: Poor representation
-
Stress ≥ 0.20: Unacceptable representation
Sources¶
“Applying Deep Learning algorithm to perform lung cells annotation”, A. Collin, 2020