Iris Classification with KNN

sklearn

Published: 2021-01-03 22:56


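1. Import dependencies
2. Load the dataset
3. Split the dataset into training and test sets
4. Standardize the data
5. Define the model
6. Train the model
7. Print the predictions
8. Print the accuracy
Example 2: Adding grid search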

1. Import dependencies

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

2. Load the dataset

iris = load_iris()
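
Before splitting, it can help to look at what load_iris() returns. The short sketch below is not part of the original walkthrough; it only prints attributes of the returned object:

```python
from sklearn.datasets import load_iris

iris = load_iris()

# Feature matrix: 150 flowers, 4 measurements per flower
print(iris.data.shape)      # (150, 4)

# One integer class label (0, 1 or 2) per flower
print(iris.target.shape)    # (150,)

# Human-readable names for the columns and the classes
print(iris.feature_names)
print(iris.target_names)    # ['setosa' 'versicolor' 'virginica']
```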

3. Split the dataset into training and test sets

# test_size is not set, so the scikit-learn default applies: 25% of the samples go to the test set
x_train, x_test, y_train, y_test = train_test_split(iris.data, iris.target,
                                                    random_state=666)
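
To make the effect of the split concrete, here is a small sketch that checks the resulting sizes. The second call is an optional variant, not what the walkthrough uses: test_size=0.2 and stratify=iris.target are illustrative choices that keep the three classes equally represented in the test set.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()

# Same call as in the walkthrough: default test_size of 0.25
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=666)
print(x_train.shape, x_test.shape)   # (112, 4) (38, 4)

# Illustrative variant: a stratified 80/20 split
xs_train, xs_test, ys_train, ys_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=666,
    stratify=iris.target)

# Count how many test samples fall into each class
print(np.bincount(ys_test))          # [10 10 10]
```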

4. Standardize the data

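transfer = StandardScaler()
# Learn the per-feature mean and standard deviation from the training set, then scale it
x_train = transfer.fit_transform(x_train)
# Reuse the training-set statistics on the test set so no test information leaks in
x_test = transfer.transform(x_test)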
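
What StandardScaler does can be reproduced by hand: subtract the per-feature mean of the training set and divide by its standard deviation. The sketch below is only a check of that idea; the names x_train_scaled, x_test_scaled and manual_test are introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=666)

transfer = StandardScaler()
x_train_scaled = transfer.fit_transform(x_train)
x_test_scaled = transfer.transform(x_test)

# Manual version: statistics come from the training set only
mean = x_train.mean(axis=0)
std = x_train.std(axis=0)   # population standard deviation (ddof=0)
manual_test = (x_test - mean) / std

# The manual result matches the scaler's output
print(np.allclose(manual_test, x_test_scaled))        # True
# After scaling, each training feature has mean 0 and standard deviation 1
print(np.allclose(x_train_scaled.mean(axis=0), 0.0))  # True
print(np.allclose(x_train_scaled.std(axis=0), 1.0))   # True
```

Note that the scaled test set will generally not have exactly zero mean, because it is scaled with the training statistics.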

5. Define the model

estimator = KNeighborsClassifier(n_neighbors=3)

6. Train the model

estimator.fit(x_train, y_train)

7. Print the predictions

y_predict = estimator.predict(x_test)
print(y_predict)
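
To see roughly what predict() does for n_neighbors=3, the idea can be sketched by hand: compute the Euclidean distance from a test sample to every training sample, take the 3 closest, and let them vote. This is a simplified illustration, not scikit-learn's actual implementation; knn_predict_one is a hypothetical helper, and ties may be resolved differently than in the library.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=666)
transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)

estimator = KNeighborsClassifier(n_neighbors=3)
estimator.fit(x_train, y_train)


def knn_predict_one(x, train_x, train_y, k=3):
    """Hypothetical helper: majority vote among the k nearest training samples."""
    # Euclidean distance from x to every training sample
    distances = np.sqrt(((train_x - x) ** 2).sum(axis=1))
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Count the votes per class and return the most common class
    votes = np.bincount(train_y[nearest])
    return votes.argmax()


manual = np.array([knn_predict_one(x, x_train, y_train) for x in x_test])

# Fraction of test samples where the manual vote agrees with scikit-learn
agreement = (manual == estimator.predict(x_test)).mean()
print(agreement)
```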

8. Print the accuracy

score = estimator.score(x_test, y_test)
print(score)
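
For a classifier, score() returns the accuracy: the fraction of test samples whose predicted label equals the true label. The following sketch checks this against a manual computation and against accuracy_score from sklearn.metrics; the variable names manual_accuracy and metric_accuracy are introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=666)
transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)

estimator = KNeighborsClassifier(n_neighbors=3)
estimator.fit(x_train, y_train)
y_predict = estimator.predict(x_test)

# Three ways to compute the same number
score = estimator.score(x_test, y_test)
manual_accuracy = (y_predict == y_test).mean()
metric_accuracy = accuracy_score(y_test, y_predict)

print(score, manual_accuracy, metric_accuracy)
```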

Example 2: Adding grid search

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GridSearchCV

# Load the data and split it (default: 25% of the samples in the test set)
iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(iris.data, iris.target,
                                                    random_state=22)

# Standardize using statistics from the training set only
transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)

# n_neighbors is left unset here; the grid search below chooses it
estimator = KNeighborsClassifier()

Set the candidate parameter values for the grid search

param_dict = {'n_neighbors': [1, 3, 5, 7, 9, 11]}

Use 10-fold cross-validation

estimator = GridSearchCV(estimator, param_grid=param_dict, cv=10)
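
Conceptually, a grid search with cv=10 trains and scores every candidate value of n_neighbors on ten different train/validation splits of the training set and keeps the value with the highest mean score. The loop below is a rough sketch of that idea using cross_val_score, not a description of the library's internals; best_k, best_mean and search are names chosen for this illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=22)
transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)

candidates = [1, 3, 5, 7, 9, 11]

# Manual search: mean 10-fold accuracy for each candidate k
best_k, best_mean = None, -1.0
for k in candidates:
    fold_scores = cross_val_score(
        KNeighborsClassifier(n_neighbors=k), x_train, y_train, cv=10)
    mean_score = fold_scores.mean()
    print(k, round(mean_score, 4))
    if mean_score > best_mean:
        best_k, best_mean = k, mean_score

# The same search done by GridSearchCV
search = GridSearchCV(KNeighborsClassifier(),
                      param_grid={'n_neighbors': candidates}, cv=10)
search.fit(x_train, y_train)

print(best_k, best_mean)
print(search.best_params_, search.best_score_)
```

After the search, GridSearchCV also refits the best model on the whole training set by default, which is why the fitted search object can be used directly for predict() and score().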

estimator.fit(x_train, y_train)

y_predict = estimator.predict(x_test)

Print the accuracy on the test set (computed with the best model found by the search)

score = estimator.score(x_test, y_test)
print(score)
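
One caveat with this setup: the scaler was fit on the full training set before cross-validation, so the validation part of each fold has already influenced the scaling statistics. On a dataset like iris the effect is probably small, but a common way to avoid it is to put the scaler and the classifier into a Pipeline and search over that. The sketch below is an alternative arrangement, not the approach used in this post; the step names "scaler" and "knn" are arbitrary labels chosen here.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=22)

# Scaling is now a step of the model, so it is re-fit inside every fold
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Parameters of a pipeline step are addressed as <step name>__<parameter>
param_dict = {"knn__n_neighbors": [1, 3, 5, 7, 9, 11]}

search = GridSearchCV(pipe, param_grid=param_dict, cv=10)
search.fit(x_train, y_train)          # raw, unscaled features go in

print(search.best_params_)
print(search.best_score_)             # mean cross-validation accuracy
test_score = search.score(x_test, y_test)
print(test_score)                     # accuracy on the held-out test set
```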

Print the best parameters, the best cross-validation score, the best model, and the full cross-validation results

print(estimator.best_params_)
print(estimator.best_score_)
print(estimator.best_estimator_)
print(estimator.cv_results_)
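
Printing cv_results_ directly produces a large dictionary that is hard to read. A small sketch of one way to summarize it, one line per candidate value, using the standard keys params, mean_test_score, std_test_score and rank_test_score (the variable names search and results are chosen here for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=22)
transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)

search = GridSearchCV(KNeighborsClassifier(),
                      param_grid={'n_neighbors': [1, 3, 5, 7, 9, 11]}, cv=10)
search.fit(x_train, y_train)

results = search.cv_results_

# One row per candidate: parameters, mean and spread of the fold scores, rank
for params, mean, std, rank in zip(results['params'],
                                   results['mean_test_score'],
                                   results['std_test_score'],
                                   results['rank_test_score']):
    print(f"{params}  mean={mean:.4f}  std={std:.4f}  rank={rank}")

# best_score_ is the mean_test_score of the best-ranked candidate
print(results['mean_test_score'][search.best_index_] == search.best_score_)
```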
