numpy – What is Python's equivalent of R's NA?

NumPy's nan is handled well by many functions:

>>> import numpy as np
>>> a = [1, np.nan, 2, 3]
>>> np.nanmean(a)
2.0
>>> np.nansum(a)
6.0
>>> np.isnan(a)
array([False,  True, False, False], dtype=bool)

Scikit-learn doesn't handle missing values currently.
For most machine learning algorithms, it is unclear how missing values should be handled, so we rely on the user to handle them before passing the data to the algorithm.
NumPy doesn't have a dedicated missing value. Pandas uses NaN, but inside numeric algorithms that might lead to confusion. It is possible to use masked arrays, but we don't do that in scikit-learn (yet).
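
As a rough sketch of what "handling them beforehand" could look like with plain NumPy, here are three common options: drop incomplete rows, fill NaN with the column mean, or wrap the data in a masked array. The array `X` and the variable names are made up for illustration, and which option is appropriate depends on your data.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical feature matrix with one missing entry (row 1, column 0).
X = np.array([[1.0, 2.0],
              [np.nan, 3.0],
              [4.0, 5.0]])

# Option 1: keep only the rows that contain no NaN.
complete_rows = X[~np.isnan(X).any(axis=1)]

# Option 2: replace each NaN with the mean of its column,
# computed while ignoring the missing entries.
col_means = np.nanmean(X, axis=0)
imputed = np.where(np.isnan(X), col_means, X)

# Option 3: a masked array that hides NaN (and inf) entries,
# so reductions such as mean() skip them.
masked = ma.masked_invalid(X)
masked_col_means = masked.mean(axis=0)
```

Options 1 and 2 give you an ordinary array without NaN that can be passed on to an estimator; the masked array is mostly useful for your own NumPy computations.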

For pandas, take a look at the missing data documentation:

http://pandas.pydata.org/pandas-docs/dev/missing_data.html

pandas uses NaN. You can test for null values using isnull() or notnull(), drop them from a data frame using dropna(), etc. The equivalent for datetime objects is NaT.
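
A small sketch of those calls, assuming a toy data frame (the column names `value` and `when` are just placeholders):

```python
import numpy as np
import pandas as pd

# Toy frame: one missing float (NaN) and one missing timestamp (NaT).
df = pd.DataFrame({
    "value": [1.0, np.nan, 3.0],
    "when": pd.to_datetime(["2020-01-01", None, "2020-01-03"]),
})

missing = df.isnull()            # boolean frame, True where a value is missing
present = df["value"].notnull()  # boolean series, True where a value exists
cleaned = df.dropna()            # drops every row containing NaN or NaT

# The missing timestamp is stored as NaT, which isnull() also detects.
second_when = df["when"].iloc[1]
```

Here `dropna()` removes only the middle row, since that is the one row with missing entries.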
