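ML Classification: Understanding the template workflow of machine-learning classification through a binary-classification case study on the diabetes dataset (8→1) with seven algorithms (kNN, logistic regression, SVM, decision tree, random forest, boosted trees, neural network)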

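Contents

Binary classification on the diabetes dataset (8→1) with seven machine-learning algorithms (kNN, logistic regression, SVM, decision tree, random forest, boosted trees, neural network)

Understanding the dataset

1. kNN

2. Logistic regression

3. SVM

4. Decision tree

5. Random forest

6. Boosted trees

7. Neural network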


 

 

Related articles
ML Classification: Understanding machine-learning classification through a binary-classification case study on the diabetes dataset (8→1) with seven algorithms (kNN, logistic regression, SVM, decision tree, random forest, boosted trees, neural network): full version

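Binary classification on the diabetes dataset (8→1) with seven machine-learning algorithms (kNN, logistic regression, SVM, decision tree, random forest, boosted trees, neural network)

Understanding the dataset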

data.shape:  (768, 9)
data.columns: 
 Index(['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin',
       'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome'],
      dtype='object')
data.head: 
    Pregnancies  Glucose  BloodPressure  ...  DiabetesPedigreeFunction  Age  Outcome
0            6      148             72  ...                     0.627   50        1
1            1       85             66  ...                     0.351   31        0
2            8      183             64  ...                     0.672   32        1
3            1       89             66  ...                     0.167   21        0
4            0      137             40  ...                     2.288   33        1

[5 rows x 9 columns]
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 768 entries, 0 to 767
Data columns (total 9 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   Pregnancies               768 non-null    int64  
 1   Glucose                   768 non-null    int64  
 2   BloodPressure             768 non-null    int64  
 3   SkinThickness             768 non-null    int64  
 4   Insulin                   768 non-null    int64  
 5   BMI                       768 non-null    float64
 6   DiabetesPedigreeFunction  768 non-null    float64
 7   Age                       768 non-null    int64  
 8   Outcome                   768 non-null    int64  
dtypes: float64(2), int64(7)
memory usage: 54.1 KB
data.info: 
 None
8
data_column_X:  ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI', 'DiabetesPedigreeFunction', 'Age'] 
 ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI', 'DiabetesPedigreeFunction', 'Age']
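
The output above looks like the result of a few pandas inspection calls followed by a feature/target split. A minimal sketch of that step is shown here. The original loading code is not included in this post, so the placeholder frame, the helper name make_stand_in and the file name diabetes.csv are all assumptions; only the column names and the shape come from the output above.

```python
import numpy as np
import pandas as pd

COLUMNS = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
           "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome"]


def make_stand_in(n_rows=768, seed=0):
    """Hypothetical helper: random placeholder with the real column names."""
    rng = np.random.default_rng(seed)
    values = rng.normal(size=(n_rows, 8))
    frame = pd.DataFrame(values, columns=COLUMNS[:-1])
    # Placeholder binary label, NOT the real Outcome column.
    frame["Outcome"] = (values[:, 1] > 0).astype(int)
    return frame


# With the real file one would call pd.read_csv("diabetes.csv") (path assumed).
data = make_stand_in()

print("data.shape: ", data.shape)
print("data.columns: \n", data.columns)
print("data.head: \n", data.head())
# info() prints its report itself and returns None,
# which is why a "data.info: None" line shows up in the output.
print("data.info: \n", data.info())

# 8 input features -> 1 binary target ("8→1" in the title).
data_column_X = [col for col in data.columns if col != "Outcome"]
print(len(data_column_X))
X = data[data_column_X]
y = data["Outcome"]
```

Selecting the features by name rather than by position keeps the split correct even if the column order in the file changes.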

 

1. kNN

kNNC(n_neighbors=9):Training set accuracy: 0.792
kNNC(n_neighbors=9):Test set accuracy: 0.776
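
A hedged sketch of how a result like the one above could be produced with scikit-learn's KNeighborsClassifier. The data here is a synthetic stand-in with the same shape (768 samples, 8 features), and the stratified split with random_state=0 is an assumption, so the printed accuracies will not match the figures in this post.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (NOT the real diabetes dataset).
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
# Split settings are assumed; the post does not state them.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Sweep n_neighbors to see how model complexity affects both scores.
knn_scores = {}
for k in range(1, 11):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    knn_scores[k] = (knn.score(X_train, y_train), knn.score(X_test, y_test))

train_acc, test_acc = knn_scores[9]
print("kNNC(n_neighbors=9):Training set accuracy: {:.3f}".format(train_acc))
print("kNNC(n_neighbors=9):Test set accuracy: {:.3f}".format(test_acc))
```

With k=1 every training point is its own nearest neighbour, so the training accuracy is perfect and says nothing about generalization. Larger k smooths the decision boundary, which is the usual reason to sweep k before settling on a value such as 9.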

 

 

2. Logistic regression

LoR(c_regular=1):Training set accuracy: 0.785
LoR(c_regular=1):Test set accuracy: 0.771
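
The c_regular label above presumably corresponds to the C parameter of scikit-learn's LogisticRegression (inverse regularization strength); that mapping is an assumption. The sketch below compares a few C values on synthetic stand-in data, so the numbers it prints are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the real diabetes dataset).
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

lor_scores = {}
for c_regular in (0.01, 1, 100):
    # Smaller C means stronger L2 regularization.
    # max_iter is raised above the default so the solver can converge.
    lor = LogisticRegression(C=c_regular, max_iter=1000).fit(X_train, y_train)
    lor_scores[c_regular] = (lor.score(X_train, y_train),
                             lor.score(X_test, y_test))
    print("LoR(c_regular={}):Training set accuracy: {:.3f}".format(
        c_regular, lor_scores[c_regular][0]))
    print("LoR(c_regular={}):Test set accuracy: {:.3f}".format(
        c_regular, lor_scores[c_regular][1]))
```

After fitting, lor.coef_ holds one weight per input feature, which makes it easy to check which of the 8 features the linear model relies on.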

 

 

 

3. SVM

SVMC_Init:Training set accuracy: 0.769
SVMC_Init:Test set accuracy: 0.755
SVMC_Best(max_dept=1,learning_rate=0.1):Training set accuracy: 0.788
SVMC_Best(max_dept=1,learning_rate=0.1):Test set accuracy: 0.781
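
The SVMC_Best label lists max_dept and learning_rate, which are not SVC parameters, so the tuned settings cannot be recovered from the output. As an illustration only, the sketch below contrasts a default SVC with one tuned by a small grid search over C and gamma on standardized features. The scaling step, the grid values and the synthetic stand-in data are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (NOT the real diabetes dataset).
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Baseline: default RBF-kernel SVC on unscaled features.
svc_init = SVC().fit(X_train, y_train)
print("SVMC_Init:Training set accuracy: {:.3f}".format(
    svc_init.score(X_train, y_train)))
print("SVMC_Init:Test set accuracy: {:.3f}".format(
    svc_init.score(X_test, y_test)))

# Tuned: scale inside the pipeline so each CV fold is scaled on its own
# training part, then search a hypothetical grid of C and gamma.
param_grid = {"svc__C": [0.1, 1, 10, 100],
              "svc__gamma": [0.001, 0.01, 0.1, 1]}
search = GridSearchCV(make_pipeline(StandardScaler(), SVC()),
                      param_grid, cv=5).fit(X_train, y_train)
print("best params:", search.best_params_)
print("SVMC_Best:Training set accuracy: {:.3f}".format(
    search.score(X_train, y_train)))
print("SVMC_Best:Test set accuracy: {:.3f}".format(
    search.score(X_test, y_test)))
```

RBF-kernel SVMs are sensitive to feature scale, and the real columns (for example Insulin versus DiabetesPedigreeFunction) span very different ranges, which is why scaling is a common first thing to try here.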

 

4. Decision tree

DTC(max_dept=3):Training set accuracy: 0.773
DTC(max_dept=3):Test set accuracy: 0.740
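
The label above suggests a tree limited to depth 3 (max_dept is presumably max_depth). A minimal sketch with scikit-learn's DecisionTreeClassifier, comparing an unrestricted tree with a depth-limited one on synthetic stand-in data; the split and random_state values are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (NOT the real diabetes dataset).
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# An unrestricted tree grows until every training leaf is pure.
full_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("DTC(full):Training set accuracy: {:.3f}".format(
    full_tree.score(X_train, y_train)))
print("DTC(full):Test set accuracy: {:.3f}".format(
    full_tree.score(X_test, y_test)))

# Limiting the depth (pre-pruning) trades training fit for simplicity.
dtc = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
print("DTC(max_depth=3):Training set accuracy: {:.3f}".format(
    dtc.score(X_train, y_train)))
print("DTC(max_depth=3):Test set accuracy: {:.3f}".format(
    dtc.score(X_test, y_test)))

# Impurity-based importances, one value per feature, summing to 1.
print("feature importances:", dtc.feature_importances_.round(3))
```

On the real data, the feature_importances_ array can be paired with the eight column names to see which measurements the tree splits on.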

 

5. Random forest

RFC_Best:Training set accuracy: 0.764
RFC_Best:Test set accuracy: 0.750
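
The RFC_Best label does not say which settings were selected, so the hyperparameters below (100 trees, max_depth=3) are hypothetical. The sketch only illustrates the general pattern with scikit-learn's RandomForestClassifier on synthetic stand-in data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the real diabetes dataset).
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Hypothetical settings: shallow trees reduce overfitting,
# averaging over many trees reduces variance.
rfc = RandomForestClassifier(n_estimators=100, max_depth=3,
                             random_state=0).fit(X_train, y_train)

print("RFC:Training set accuracy: {:.3f}".format(rfc.score(X_train, y_train)))
print("RFC:Test set accuracy: {:.3f}".format(rfc.score(X_test, y_test)))
# Importances are averaged over all trees in the forest.
print("feature importances:", rfc.feature_importances_.round(3))
```

Because each tree sees a bootstrap sample and a random subset of features at each split, the forest's importances are usually spread over more features than those of a single tree.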

 

6. Boosted trees

GBC(max_dept=1,learning_rate=0.1):Training set accuracy: 0.804
GBC(max_dept=1,learning_rate=0.1):Test set accuracy: 0.781
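
The GBC label above reads like scikit-learn's GradientBoostingClassifier with max_depth=1 and learning_rate=0.1; that reading is an assumption, as is the default of 100 boosting stages. A minimal sketch on synthetic stand-in data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the real diabetes dataset).
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# max_depth=1 makes every weak learner a decision stump;
# learning_rate scales how much each stump corrects the previous ones.
gbc = GradientBoostingClassifier(max_depth=1, learning_rate=0.1,
                                 random_state=0).fit(X_train, y_train)

print("GBC(max_depth=1,learning_rate=0.1):Training set accuracy: {:.3f}".format(
    gbc.score(X_train, y_train)))
print("GBC(max_depth=1,learning_rate=0.1):Test set accuracy: {:.3f}".format(
    gbc.score(X_test, y_test)))
```

Stumps with a small learning rate form a heavily regularized ensemble, which is one common way to narrow the gap between training and test accuracy.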

 

7. Neural network

MLPC_Init:Training set accuracy: 0.743
MLPC_Init:Test set accuracy: 0.672
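
MLPC_Init presumably refers to scikit-learn's MLPClassifier with default settings. The sketch below fits that baseline and, as an assumed follow-up that this post does not show, a variant with standardized inputs and a higher iteration limit. The data is a synthetic stand-in, so the scores are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the real diabetes dataset).
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Baseline: defaults (one hidden layer of 100 units, max_iter=200).
# This may emit a ConvergenceWarning, which is harmless here.
mlp_init = MLPClassifier(random_state=0).fit(X_train, y_train)
print("MLPC_Init:Training set accuracy: {:.3f}".format(
    mlp_init.score(X_train, y_train)))
print("MLPC_Init:Test set accuracy: {:.3f}".format(
    mlp_init.score(X_test, y_test)))

# Assumed follow-up: neural networks generally train better on
# standardized inputs, and more iterations help convergence.
mlp_scaled = make_pipeline(
    StandardScaler(),
    MLPClassifier(max_iter=1000, random_state=0),
).fit(X_train, y_train)
print("MLPC_Scaled:Training set accuracy: {:.3f}".format(
    mlp_scaled.score(X_train, y_train)))
print("MLPC_Scaled:Test set accuracy: {:.3f}".format(
    mlp_scaled.score(X_test, y_test)))
```

On the real data, where the columns have very different ranges, unscaled inputs are a plausible reason for a default MLP to score below the other models.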

 

 

 
