I am a beginner and started learning data science a few days ago. My dataset has some categorical features, and I was trying to preprocess them by converting them into numbers. I tried to use ColumnTransformer from the sklearn.compose module, but I got an error when I called the fit_transform() method:

# Turn our categories into numbers 
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer               
categorical_features = ["Make", "Colour", "Doors"]
one_hot = OneHotEncoder()
transformer = ColumnTransformer([("one_hot", 
                                  one_hot, 
                                  categorical_features)],
                                  remainder = "passthrough")
transformed_X= transformer.fit_transform(X)

These are the details of the error:

AttributeError                            Traceback (most recent call last)
<ipython-input-179-9955bb392ac9> in <module>
     12                                   categorical_features)],
     13                                   remainder = "passthrough")
---> 14 transformed_X= transformer.fit_transform(X)
~\Desktop\sample_project1\env\lib\site-packages\sklearn\compose\_column_transformer.py in fit_transform(self, X, y)
    524         X = _check_X(X)
    525         # set n_features_in_ attribute
--> 526         self._check_n_features(X, reset=True)
    527         self._validate_transformers()
    528         self._validate_column_callables(X)
AttributeError: 'ColumnTransformer' object has no attribute '_check_n_features'

Could you provide information about your scikit-learn version:

import platform; print(platform.platform())
import sys; print("Python", sys.version)
import numpy; print("NumPy", numpy.__version__)
import scipy; print("SciPy", scipy.__version__)
import sklearn; print("Scikit-Learn", sklearn.__version__)

It looks like there is something fishy there.

Here is the output of those commands:

Windows-10-10.0.18362-SP0
Python 3.8.3 (default, May 19 2020, 06:50:17) [MSC v.1916 64 bit (AMD64)]
NumPy 1.18.1
SciPy 1.4.1
Scikit-Learn 0.22.1

I ran conda search scikit-learn --info, and the latest version listed was:

----------------------------------
file name   : scikit-learn-0.23.1-py38h25d0782_0.conda
name        : scikit-learn
version     : 0.23.1
build       : py38h25d0782_0
build number: 0
size        : 4.7 MB
license     : BSD-3-Clause
subdir      : win-64
url         : https://repo.anaconda.com/pkgs/main/win-64/scikit-learn-0.23.1-py38h25d0782_0.conda
md5         : 2ec528a84314c584c05be041463a1e5f
timestamp   : 2020-06-22 19:21:32 UTC
dependencies:
  - blas 1.0 mkl
  - joblib >=0.11
  - mkl >=2019.4,<2021.0a0
  - mkl-service >=2,<3.0a0
  - numpy >=1.14.6,<2.0a0
  - python >=3.8,<3.9.0a0
  - scipy
  - threadpoolctl
  - vc >=14.1,<15.0a0
  - vs2015_runtime >=14.16.27012,<15.0a0 

I am confused and don't know whether the problem is related to the current scikit-learn dependencies.

The original traceback shows code from scikit-learn 0.23.1, while your import reports scikit-learn 0.22.1. That mismatch is the fishy part: it suggests the environment holds a mixed or partially upgraded installation.

Could you update or reinstall scikit-learn 0.23?
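
One way to confirm the mismatch from inside the same notebook kernel is to compare the version the imported module reports with the version recorded in the installed package metadata, and to look at the path the module is loaded from. This is a minimal sketch, assuming Python 3.8+ (for importlib.metadata). The helper names describe_install and versions_disagree are made up for illustration and are not part of scikit-learn.

```python
import importlib
import sys
from importlib import metadata  # available in the standard library from Python 3.8


def describe_install(module_name, dist_name):
    """Collect the interpreter path, the imported module's version and
    location, and the version recorded in the installed package metadata."""
    info = {
        "python": sys.executable,
        "module_version": None,
        "module_path": None,
        "metadata_version": None,
    }
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        module = None  # package cannot be imported in this interpreter
    if module is not None:
        info["module_version"] = getattr(module, "__version__", None)
        info["module_path"] = getattr(module, "__file__", None)
    try:
        info["metadata_version"] = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        pass  # no installed distribution with that name
    return info


def versions_disagree(info):
    """Return True only when both versions are known and they differ."""
    module_version = info["module_version"]
    metadata_version = info["metadata_version"]
    return (
        module_version is not None
        and metadata_version is not None
        and module_version != metadata_version
    )


# Hypothetical usage for this issue: import name "sklearn",
# distribution name "scikit-learn".
info = describe_install("sklearn", "scikit-learn")
for key, value in info.items():
    print(f"{key}: {value}")
print("mismatch:", versions_disagree(info))
```

If the two versions differ, or module_path points outside the active environment, it is probably worth restarting the kernel after reinstalling, since a module that is already imported is not refreshed by an upgrade.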