- scipy.spatial.distance.cdist(XA, XB, metric='euclidean', *, out=None, **kwargs)
Compute distance between each pair of the two collections of inputs.
See Notes for common calling conventions.
- Parameters:
- XA : array_like
An \(m_A\) by \(n\) array of \(m_A\) original observations in an \(n\)-dimensional space. Inputs are converted to float type.
- XB : array_like
An \(m_B\) by \(n\) array of \(m_B\) original observations in an \(n\)-dimensional space. Inputs are converted to float type.
- metric : str or callable, optional
The distance metric to use. If a string, the distance function can be ‘braycurtis’, ‘canberra’, ‘chebyshev’, ‘cityblock’, ‘correlation’, ‘cosine’, ‘dice’, ‘euclidean’, ‘hamming’, ‘jaccard’, ‘jensenshannon’, ‘kulczynski1’, ‘mahalanobis’, ‘matching’, ‘minkowski’, ‘rogerstanimoto’, ‘russellrao’, ‘seuclidean’, ‘sokalmichener’, ‘sokalsneath’, ‘sqeuclidean’, ‘yule’.
- **kwargs : dict, optional
Extra arguments to metric: refer to each metric's documentation for a list of all possible arguments.
Some possible arguments:
p : scalar
The p-norm to apply for Minkowski, weighted and unweighted. Default: 2.
w : array_like
The weight vector for metrics that support weights (e.g., Minkowski).
V : array_like
The variance vector for standardized Euclidean. Default: var(vstack([XA, XB]), axis=0, ddof=1)
VI : array_like
The inverse of the covariance matrix for Mahalanobis. Default: inv(cov(vstack([XA, XB]).T)).T
out : ndarray
The output array. If not None, the distance matrix Y is stored in this array.
- Returns:
- Y : ndarray
An \(m_A\) by \(m_B\) distance matrix is returned. For each \(i\) and \(j\), the metric
dist(u=XA[i], v=XB[j])
is computed and stored in the \(ij\)-th entry.
- Raises:
- ValueError
An exception is thrown if XA and XB do not have the same number of columns.
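The signature above can be exercised with a small sketch like the following. The arrays and the variable names (`Y`, `out`, `raised`) are illustrative choices, not part of the SciPy documentation; the sketch only shows the output shape, the `out` argument, and the `ValueError` raised on a column mismatch.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Illustrative inputs: m_A = 3 and m_B = 2 observations in n = 2 dimensions.
XA = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
XB = np.array([[0.0, 0.0], [3.0, 4.0]])

# Default metric is 'euclidean'; Y[i, j] is the distance from XA[i] to XB[j].
Y = cdist(XA, XB)

# Optional preallocated output: an m_A x m_B array of doubles.
out = np.empty((3, 2), dtype=np.float64)
cdist(XA, XB, out=out)

# Inputs with different numbers of columns are rejected.
try:
    cdist(XA, np.ones((2, 3)))
    raised = False
except ValueError:
    raised = True
```

Here `Y` has shape (3, 2), and `Y[0, 1]` is the distance from (0, 0) to (3, 4).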
Notes
The following are common calling conventions:
Y = cdist(XA, XB, 'euclidean')
Computes the distance between \(m\) points using Euclidean distance (2-norm) as the distance metric between the points. The points are arranged as \(m\) \(n\)-dimensional row vectors in the matrix X.
Y = cdist(XA, XB, 'minkowski', p=2.)
Computes the distances using the Minkowski distance \(\|u-v\|_p\) (\(p\)-norm) where \(p > 0\) (note that this is only a quasi-metric if \(0 < p < 1\)).
Y = cdist(XA, XB, 'cityblock')
Computes the city block or Manhattan distance between the points.
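As a rough check of how these conventions relate, the sketch below compares 'minkowski' at p=1 and p=2 with 'cityblock' and 'euclidean', and reproduces p=3 by hand with NumPy broadcasting. The random data and variable names are assumptions made for illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
XA = rng.random((4, 3))
XB = rng.random((5, 3))

d_p1 = cdist(XA, XB, 'minkowski', p=1.0)
d_p2 = cdist(XA, XB, 'minkowski', p=2.0)
d_p3 = cdist(XA, XB, 'minkowski', p=3.0)

# Manual p-norm: |XA[i] - XB[j]| for every pair, shape (4, 5, 3).
abs_diff = np.abs(XA[:, None, :] - XB[None, :, :])
manual_p3 = (abs_diff ** 3).sum(axis=-1) ** (1.0 / 3.0)

p1_is_cityblock = np.allclose(d_p1, cdist(XA, XB, 'cityblock'))
p2_is_euclidean = np.allclose(d_p2, cdist(XA, XB, 'euclidean'))
p3_matches_manual = np.allclose(d_p3, manual_p3)
```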
Y = cdist(XA, XB, 'seuclidean', V=None)
Computes the standardized Euclidean distance. The standardized Euclidean distance between two n-vectors u and v is
\[\sqrt{\sum_i {(u_i-v_i)^2 / V[i]}}.\]
V is the variance vector; V[i] is the variance computed over all the i’th components of the points. If not passed, it is automatically computed.
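A minimal sketch of the standardized Euclidean formula, assuming the default variance vector documented in the Parameters section (`var(vstack([XA, XB]), axis=0, ddof=1)`). The data and names are illustrative only.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(1)
XA = rng.random((4, 3))
XB = rng.random((6, 3))

# Variance of each column over all observations from both collections.
V = np.var(np.vstack([XA, XB]), axis=0, ddof=1)

d_default = cdist(XA, XB, 'seuclidean')        # V computed automatically
d_explicit = cdist(XA, XB, 'seuclidean', V=V)  # V passed by the caller

# Direct evaluation of sqrt(sum_i (u_i - v_i)^2 / V[i]) for every pair.
diff = XA[:, None, :] - XB[None, :, :]
manual = np.sqrt((diff ** 2 / V).sum(axis=-1))
```

Passing `V` explicitly is useful when the variances should come from a reference data set rather than from the two inputs themselves.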
Y = cdist(XA, XB, 'sqeuclidean')
Computes the squared Euclidean distance \(\|u-v\|_2^2\) between the vectors.
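Since the squared distance is a monotonic function of the Euclidean distance, it gives the same nearest-neighbour ordering. The sketch below illustrates this on random data; the names are hypothetical.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(2)
XA = rng.random((5, 2))
XB = rng.random((7, 2))

d_eu = cdist(XA, XB, 'euclidean')
d_sq = cdist(XA, XB, 'sqeuclidean')

# Index of the closest row of XB for each row of XA, under each metric.
nearest_eu = d_eu.argmin(axis=1)
nearest_sq = d_sq.argmin(axis=1)
```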
Y = cdist(XA, XB, 'cosine')
Computes the cosine distance between vectors u and v,
\[1 - \frac{u \cdot v} {{\|u\|}_2 {\|v\|}_2}\]
where \(\|*\|_2\) is the 2-norm of its argument *, and \(u \cdot v\) is the dot product of \(u\) and \(v\).
Y = cdist(XA, XB, 'correlation')
Computes the correlation distance between vectors u and v. This is
\[1 - \frac{(u - \bar{u}) \cdot (v - \bar{v})} {{\|(u - \bar{u})\|}_2 {\|(v - \bar{v})\|}_2}\]
where \(\bar{v}\) is the mean of the elements of vector v, and \(x \cdot y\) is the dot product of \(x\) and \(y\).
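The two formulas differ only in the mean-centering of each vector, so the correlation distance should equal the cosine distance of the centered rows. A small sketch under that reading, with illustrative data and names:

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(3)
XA = rng.random((4, 5))
XB = rng.random((3, 5))

# Cosine distance evaluated directly from the formula.
dots = XA @ XB.T
norms = np.linalg.norm(XA, axis=1)[:, None] * np.linalg.norm(XB, axis=1)[None, :]
manual_cos = 1.0 - dots / norms
d_cos = cdist(XA, XB, 'cosine')

# Center each vector by its own mean, then take the cosine distance.
XA_c = XA - XA.mean(axis=1, keepdims=True)
XB_c = XB - XB.mean(axis=1, keepdims=True)
d_corr = cdist(XA, XB, 'correlation')
d_cos_centered = cdist(XA_c, XB_c, 'cosine')
```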
Y = cdist(XA, XB, 'hamming')
Computes the normalized Hamming distance, or the proportion of those vector elements between two n-vectors u and v which disagree. To save memory, the matrix X can be of type boolean.
Y = cdist(XA, XB, 'jaccard')
Computes the Jaccard distance between the points. Given two vectors, u and v, the Jaccard distance is the proportion of those elements u[i] and v[i] that disagree where at least one of them is non-zero.
Y = cdist(XA, XB, 'jensenshannon')
Computes the Jensen-Shannon distance between two probability arrays. Given two probability vectors, \(p\) and \(q\), the Jensen-Shannon distance is
\[\sqrt{\frac{D(p \parallel m) + D(q \parallel m)}{2}}\]
where \(m\) is the pointwise mean of \(p\) and \(q\) and \(D\) is the Kullback-Leibler divergence.
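The formula can be evaluated by hand with `scipy.special.rel_entr`, whose row sums give the Kullback-Leibler divergence. The sketch below assumes natural logarithms and uses probability vectors that already sum to one; the arrays `P` and `Q` are made up for illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.special import rel_entr

# Each row is a probability vector (non-negative, sums to 1).
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.1, 0.8]])
Q = np.array([[0.4, 0.4, 0.2],
              [0.25, 0.25, 0.5],
              [0.2, 0.2, 0.6]])

d_js = cdist(P, Q, 'jensenshannon')

# Broadcast to all (p, q) pairs and form the pointwise mean m.
p = P[:, None, :]
q = Q[None, :, :]
m = 0.5 * (p + q)

# D(p || m) and D(q || m), summed over the last axis.
kl_pm = rel_entr(p, m).sum(axis=-1)
kl_qm = rel_entr(q, m).sum(axis=-1)
manual = np.sqrt(0.5 * (kl_pm + kl_qm))
```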
Y = cdist(XA, XB, 'chebyshev')
Computes the Chebyshev distance between the points. The Chebyshev distance between two n-vectors u and v is the maximum norm-1 distance between their respective elements. More precisely, the distance is given by
\[d(u,v) = \max_i {|u_i-v_i|}.\]
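A brief sketch of the same maximum taken with NumPy broadcasting, on illustrative integer-valued points:

```python
import numpy as np
from scipy.spatial.distance import cdist

XA = np.array([[0.0, 0.0, 0.0],
               [1.0, 5.0, 2.0]])
XB = np.array([[3.0, 1.0, 2.0],
               [1.0, 1.0, 1.0]])

d_cheb = cdist(XA, XB, 'chebyshev')

# Largest coordinate-wise absolute difference for every pair of rows.
manual = np.abs(XA[:, None, :] - XB[None, :, :]).max(axis=-1)
```

For example, the distance from (0, 0, 0) to (3, 1, 2) is 3, the largest of the differences 3, 1 and 2.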
Y = cdist(XA, XB, 'canberra')
Computes the Canberra distance between the points. The Canberra distance between two points u and v is
\[d(u,v) = \sum_i \frac{|u_i-v_i|} {|u_i|+|v_i|}.\]
Y = cdist(XA, XB, 'braycurtis')
Computes the Bray-Curtis distance between the points. The Bray-Curtis distance between two points u and v is
\[d(u,v) = \frac{\sum_i (|u_i-v_i|)} {\sum_i (|u_i+v_i|)}\]
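The Canberra and Bray-Curtis formulas differ in where the sum is taken: Canberra sums the per-coordinate ratios, while Bray-Curtis divides one sum by another. The sketch below evaluates both by hand on strictly positive random data (an assumption chosen so that no denominator is zero).

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(4)
XA = rng.random((3, 4)) + 0.1   # strictly positive entries
XB = rng.random((5, 4)) + 0.1

u = XA[:, None, :]
v = XB[None, :, :]

# Canberra: sum over i of |u_i - v_i| / (|u_i| + |v_i|).
manual_canberra = (np.abs(u - v) / (np.abs(u) + np.abs(v))).sum(axis=-1)

# Bray-Curtis: (sum of |u_i - v_i|) / (sum of |u_i + v_i|).
manual_braycurtis = np.abs(u - v).sum(axis=-1) / np.abs(u + v).sum(axis=-1)

d_canberra = cdist(XA, XB, 'canberra')
d_braycurtis = cdist(XA, XB, 'braycurtis')
```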
Y = cdist(XA, XB, 'mahalanobis', VI=None)
Computes the Mahalanobis distance between the points. The Mahalanobis distance between two points u and v is \(\sqrt{(u-v)(1/V)(u-v)^T}\) where \((1/V)\) (the VI variable) is the inverse covariance. If VI is not None, VI will be used as the inverse covariance matrix.
Y = cdist(XA, XB, 'yule')
Computes the Yule distance between the boolean vectors. (see yule function documentation)
Y = cdist(XA, XB, 'matching')
Synonym for ‘hamming’.
Y = cdist(XA, XB, 'dice')
Computes the Dice distance between the boolean vectors. (see dice function documentation)
Y = cdist(XA, XB, 'kulczynski1')
Computes the Kulczynski 1 distance between the boolean vectors. (see kulczynski1 function documentation)
Y = cdist(XA, XB, 'rogerstanimoto')
Computes the Rogers-Tanimoto distance between the boolean vectors. (see rogerstanimoto function documentation)
Y = cdist(XA, XB, 'russellrao')
Computes the Russell-Rao distance between the boolean vectors. (see russellrao function documentation)
Y = cdist(XA, XB, 'sokalmichener')
Computes the Sokal-Michener distance between the boolean vectors. (see sokalmichener function documentation)
Y = cdist(XA, XB, 'sokalsneath')
Computes the Sokal-Sneath distance between the vectors. (see sokalsneath function documentation)
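For boolean inputs, the 'hamming' and 'jaccard' conventions described earlier can be reproduced by counting disagreements. The following sketch uses small hand-written boolean arrays (illustrative only) in which no pair of rows is all False, so the Jaccard denominator is never zero.

```python
import numpy as np
from scipy.spatial.distance import cdist

XA = np.array([[1, 0, 1, 1],
               [0, 1, 0, 1]], dtype=bool)
XB = np.array([[1, 1, 0, 1],
               [0, 0, 0, 1],
               [1, 0, 1, 1]], dtype=bool)

u = XA[:, None, :]
v = XB[None, :, :]
disagree = u != v   # positions where the two vectors differ
either = u | v      # positions where at least one entry is True

# Hamming: fraction of all positions that disagree.
manual_hamming = disagree.mean(axis=-1)

# Jaccard: disagreements as a fraction of positions where either is True.
manual_jaccard = disagree.sum(axis=-1) / either.sum(axis=-1)

d_hamming = cdist(XA, XB, 'hamming')
d_jaccard = cdist(XA, XB, 'jaccard')
```

The first row of `XA` and the last row of `XB` are identical, so both distances are zero for that pair.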
Y = cdist(XA, XB, f)
Computes the distance between each pair of vectors from XA and XB using the user-supplied 2-arity function f. For example, Euclidean distance between the vectors could be computed as follows:
dm = cdist(XA, XB, lambda u, v: np.sqrt(((u-v)**2).sum()))
Note that you should avoid passing a reference to one of the distance functions defined in this library. For example,
dm = cdist(XA, XB, sokalsneath)
would calculate the pair-wise distances between the vectors in XA and XB using the Python function sokalsneath. This would result in sokalsneath being called once per pair, that is \(m_A \cdot m_B\) times, which is inefficient. Instead, the optimized C version is more efficient, and we call it using the following syntax:
dm = cdist(XA, XB, 'sokalsneath')
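To make the cost of a Python callable concrete, the sketch below wraps a Euclidean distance in a function that counts its own invocations. The function name `counted_euclidean` and the counter dictionary are hypothetical helpers introduced for this illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(5)
XA = rng.random((6, 3))
XB = rng.random((4, 3))

calls = {"n": 0}  # mutable counter updated by the callable below

def counted_euclidean(u, v):
    """Euclidean distance that records how often it is called."""
    calls["n"] += 1
    return np.sqrt(((u - v) ** 2).sum())

d_callable = cdist(XA, XB, counted_euclidean)  # one Python call per pair
d_builtin = cdist(XA, XB, 'euclidean')         # string name, compiled path
```

Both calls return the same matrix, but the callable version performs one Python-level function call for each of the 6 x 4 pairs.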
Examples
Find the Euclidean distances between four 2-D coordinates:
>>> from scipy.spatial import distance
>>> import numpy as np
>>> coords = [(35.0456, -85.2672),
...           (35.1174, -89.9711),
...           (35.9728, -83.9422),
...           (36.1667, -86.7833)]
>>> distance.cdist(coords, coords, 'euclidean')
array([[ 0.    ,  4.7044,  1.6172,  1.8856],
       [ 4.7044,  0.    ,  6.0893,  3.3561],
       [ 1.6172,  6.0893,  0.    ,  2.8477],
       [ 1.8856,  3.3561,  2.8477,  0.    ]])
Find the Manhattan distance from a 3-D point to the corners of the unit cube:
>>> a = np.array([[0, 0, 0],
...               [0, 0, 1],
...               [0, 1, 0],
...               [0, 1, 1],
...               [1, 0, 0],
...               [1, 0, 1],
...               [1, 1, 0],
...               [1, 1, 1]])
>>> b = np.array([[ 0.1,  0.2,  0.4]])
>>> distance.cdist(a, b, 'cityblock')
array([[ 0.7],
       [ 0.9],
       [ 1.3],
       [ 1.5],
       [ 1.5],
       [ 1.7],
       [ 2.1],
       [ 2.3]])
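As a further illustration, the Mahalanobis convention from the Notes can be checked against a direct evaluation of \(\sqrt{(u-v)\,VI\,(u-v)^T}\). This is a sketch on random data; it assumes the default `VI` is the inverse covariance of the stacked inputs, as stated in the Parameters section, and the variable names are illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(6)
XA = rng.random((6, 3))
XB = rng.random((5, 3))

# Inverse covariance estimated from all 11 observations (3 variables).
VI = np.linalg.inv(np.cov(np.vstack([XA, XB]).T))

d_explicit = cdist(XA, XB, 'mahalanobis', VI=VI)
d_default = cdist(XA, XB, 'mahalanobis')  # VI computed automatically

# Quadratic form (u - v) VI (u - v)^T for every pair, then the square root.
diff = XA[:, None, :] - XB[None, :, :]
manual = np.sqrt(np.einsum('ijk,kl,ijl->ij', diff, VI, diff))
```

Supplying `VI` explicitly avoids re-estimating the covariance on every call and lets it come from a larger reference sample.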
scipy.spatial.distance.cdist — SciPy v1.13.1 Manual (2024)
Article information
Author: Allyn Kozey