# python – numpy outputs nan for corrcoef

## Question:

`np.corrcoef` returns `nan`. When the two arrays are constant but differ from each other, `nan` values appear:

``````
np.corrcoef([0.1, 0.1, 0.1], [0.9, 0.9, 0.9])
array([[  1.,  nan],
[ nan,  nan]])
``````

If the two arrays are identical, this does not happen:

``````
np.corrcoef([0.1, 0.1, 0.1], [0.1, 0.1, 0.1])
array([[ 1.,  1.],
[ 1.,  1.]])
``````

It is also unclear why `corrcoef` returns all `nan` when one array is all zeros:

``````
np.corrcoef([0, 0, 0], [0.1, 0.1, 0.1])
array([[  nan,  nan],
[ nan,  nan]])
``````

## Answer:

Let's reproduce step by step what happens inside the `np.corrcoef()` function:

``````
In [37]: x=[0.1, 0.1, 0.1]; y=[0.9,0.9,0.9]

# compute the covariance matrix
In [38]: c = np.cov(x,y)

# for these inputs, 3 of its 4 elements are zero
# so the `np.corrcoef()` result will contain NaN in place of those elements
In [39]: c
Out[39]:
array([[  2.88889492e-34,   0.00000000e+00],
[  0.00000000e+00,   0.00000000e+00]])

In [40]: d = np.diag(c)

In [41]: d
Out[41]: array([  2.88889492e-34,   0.00000000e+00])

In [42]: stddev = np.sqrt(d.real)

In [43]: stddev
Out[43]: array([  1.69967494e-17,   0.00000000e+00])
``````

In the next step, dividing the zero entries by the zero standard deviation (0/0) produces `NaN`:

``````
In [44]: c /= stddev[:, None]
...\py36\Scripts\ipython3:1: RuntimeWarning: invalid value encountered in true_divide

In [45]: stddev[:, None]
Out[45]:
array([[  1.69967494e-17],
[  0.00000000e+00]])

In [46]: c
Out[46]:
array([[  1.69967494e-17,   0.00000000e+00],
[             nan,              nan]])
``````
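The steps above amount to computing each entry as `C[i, j] / sqrt(C[i, i] * C[j, j])`. As a minimal sketch (the helper name `corrcoef_by_hand` is made up for illustration and is not part of NumPy), the same normalization can be written in one function. An all-zero array is used here so that the variance is exactly zero and floating-point noise does not get in the way:

```python
import numpy as np

def corrcoef_by_hand(x, y):
    """Normalize the covariance matrix the way the walkthrough above does."""
    c = np.cov(x, y)                 # 2x2 covariance matrix
    stddev = np.sqrt(np.diag(c))     # standard deviation of each variable
    # Silence the RuntimeWarning; 0/0 still evaluates to nan.
    with np.errstate(divide="ignore", invalid="ignore"):
        return c / np.outer(stddev, stddev)

print(corrcoef_by_hand([1.0, 2.0, 3.0], [2.0, 4.0, 7.0]))  # ordinary data, no nan
print(corrcoef_by_hand([0.0, 0.0, 0.0], [1.0, 2.0, 3.0]))  # first row and column are nan
```

Every entry whose row or column belongs to the zero-variance variable ends up as 0/0, which is why the `nan` values spread across the matrix.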

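One way around this is to add a very small number to one or more elements of the second array. Keep in mind that correlation is undefined for a constant array, so the coefficients obtained this way reflect floating-point noise rather than a real relationship: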

``````
In [109]: x=[0.1, 0.1, 0.1]; y=[0.9,0.9,0.9+1e-16]

In [110]: np.corrcoef(x,y)
Out[110]:
array([[ 1.        , -0.57735027],
[-0.57735027,  1.        ]])
``````

or

``````
In [111]: x=np.array([0.1, 0.1, 0.1]); y=np.array([0.9,0.9,0.9])+1e-16

In [112]: np.corrcoef(x,y)
Out[112]:
array([[ 1., -1.],
[-1.,  1.]])
``````
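
An alternative, if you would rather treat the correlation of a constant array as undefined than perturb the data, is to check for constant input before calling `np.corrcoef`. The following is only a sketch under that assumption; `safe_corrcoef` is a hypothetical helper name, and it expects two non-empty 1-D sequences:

```python
import numpy as np

def safe_corrcoef(x, y):
    """Pearson r for two 1-D sequences, or nan if either one is constant."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # A constant array has zero variance, so r is undefined for it.
    if np.all(x == x[0]) or np.all(y == y[0]):
        return float("nan")
    return float(np.corrcoef(x, y)[0, 1])

print(safe_corrcoef([0.1, 0.1, 0.1], [0.9, 0.9, 0.9]))  # nan, no RuntimeWarning
print(safe_corrcoef([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # perfectly correlated data
```

The result is still `nan` for constant input, but it is returned deliberately and without a warning, so the caller can decide how to handle it.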