Which correlation coefficient is resistant to outliers?

The Spearman rank correlation coefficient is resistant to outliers, because it is computed from the ranks of the values rather than from the values themselves.
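To make the claim concrete, here is a small sketch in plain Python that compares Spearman with the ordinary Pearson coefficient on made-up data. The data, the function names, and the choice of Pearson as the point of comparison are all assumptions for illustration, and the ranking helper assumes there are no tied values.

```python
import math

def simple_ranks(values):
    """1-based ranks, assuming all values are distinct (no ties)."""
    position = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [position[v] for v in values]

def spearman(x, y):
    """Spearman coefficient from squared rank differences (no-ties form)."""
    n = len(x)
    rx, ry = simple_ranks(x), simple_ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def pearson(x, y):
    """Ordinary Pearson coefficient computed on the raw values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical data: y_outlier is y_clean with the last value replaced by an outlier.
x = [1, 2, 3, 4, 5, 6]
y_clean = [2, 4, 6, 8, 10, 12]
y_outlier = [2, 4, 6, 8, 10, 1000]

print("clean:  ", pearson(x, y_clean), spearman(x, y_clean))
print("outlier:", pearson(x, y_outlier), spearman(x, y_outlier))
```

With the outlier, Pearson drops to about 0.66 while Spearman stays at 1.0, because the outlier is still the largest value and so its rank does not change.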

To compute it, we first need the rank of every value in each array.

For example, take X = [x1, x2, x3, x4, x5, x6] and Y = [y1, y2, y3, y4, y5, y6].

To find rx, the ranks of X, we sort the data of X in ascending order while recording each value's position in another array.

Example: X = [1, 3, 0, 5, 3, 6].

Now let's sort X and mark each value's position in the sorted array.

The array is X[1]=1, X[2]=3, X[3]=0, X[4]=5, X[5]=3, X[6]=6.

Sorted, it becomes 0 (X[3]), 1 (X[1]), 3 (X[2]), 3 (X[5]), 5 (X[4]), 6 (X[6]).

So the sorted values are 0, 1, 3, 3, 5, 6. Now label each value with its position p in the sorted order: [0 (p[1]), 1 (p[2]), 3 (p[3]), 3 (p[4]), 5 (p[5]), 6 (p[6])].

Now go back to the original sequence X = 1, 3, 0, 5, 3, 6 and replace each value with its position in the sorted array. For instance, the value 1 is at position 2 in the sorted array (p[2]). There are two 3s, at sorted positions p[3] and p[4], so for tied values we take the mean of their positions: (3+4)/2 = 3.5, and both 3s get position 3.5. The other values get their sorted positions in the same way. So the final values for 1, 3, 0, 5, 3, 6 are 2, 3.5, 1, 5, 3.5, 6. These values are called the ranks: r1=2, r2=3.5, r3=1, r4=5, r5=3.5, r6=6, so rx = [2, 3.5, 1, 5, 3.5, 6].
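The ranking steps above can be sketched in a few lines of plain Python. The function name average_ranks is made up for this illustration; it is a minimal sketch of the "mean position for ties" rule, not a reference implementation.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their sorted positions."""
    # Indices of the original array, ordered by ascending value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend `end` to cover the whole run of equal values starting at `pos`.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # The run occupies 1-based positions pos+1 .. end+1; take their mean.
        mean_rank = ((pos + 1) + (end + 1)) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

rx = average_ranks([1, 3, 0, 5, 3, 6])
print(rx)  # [2.0, 3.5, 1.0, 5.0, 3.5, 6.0]
```

The result matches the ranks worked out by hand: the two 3s share rank 3.5.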

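Now we come to the Spearman rank correlation coefficient itself. It is denoted by ρ_rank, and its equation is:

$$\rho_{rank} = 1 - \frac{6 \sum_{i=1}^{N} (rx_i - ry_i)^2}{N(N^2 - 1)}$$

Here N is the size of each array, and rx_i and ry_i are the ranks of the i-th values of X and Y, for i = 1 to N. This form of the formula is exact when there are no tied values; with ties, as in the X example above, it gives an approximation.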
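As a rough sketch, the equation can be applied to the X from the example like this. The Y values below are invented for illustration (the post does not give any), and the helper names average_ranks and spearman_rho are hypothetical.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their sorted positions."""
    sorted_vals = sorted(values)
    ranks = []
    for v in values:
        first = sorted_vals.index(v) + 1   # 1-based position of the first copy of v
        count = sorted_vals.count(v)       # how many times v occurs
        ranks.append(first + (count - 1) / 2)
    return ranks

def spearman_rho(x, y):
    """Spearman coefficient via 1 - 6 * sum(d^2) / (N * (N^2 - 1))."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    rx, ry = average_ranks(x), average_ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

X = [1, 3, 0, 5, 3, 6]   # from the example above
Y = [2, 4, 1, 6, 5, 7]   # assumed values, for illustration only

print(average_ranks(X))  # [2.0, 3.5, 1.0, 5.0, 3.5, 6.0]
print(average_ranks(Y))  # [2.0, 3.0, 1.0, 5.0, 4.0, 6.0]
print(spearman_rho(X, Y))
```

Here the squared rank differences sum to 0.5, so the result is 1 - 3/210, roughly 0.986. Because X contains a tie, this value is the approximation given by the shortcut formula.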
