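caret Package in Practice, Part 3: Model Building and Parameter Tuning
When building a model, its tuning parameters need to be optimized. In the caret package, the main function for this is train.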
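First, take the sample data left after feature selection and split it into a training set and a test set.
library(caret)
newdata4 = newdata3[, Profile$optVariables]
inTrain = createDataPartition(mdrrClass, p = 3/4, list = FALSE)
trainx = newdata4[inTrain,]
testx = newdata4[-inTrain,]
trainy = mdrrClass[inTrain]
testy = mdrrClass[-inTrain]
Next, define the training control parameters: method sets the resampling scheme (here repeated cross-validation), number sets the number of folds, and repeats sets how many times the cross-validation is repeated.
fitControl = trainControl(method = "repeatedcv", number = 10, repeats = 3, returnResamp = "all")
Then fix the range of parameter values to search. This example uses the gbm algorithm, for which the following three parameters are tuned: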
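# Note: recent caret releases also expect an n.minobsinnode column in a gbm grid.
gbmGrid = expand.grid(.interaction.depth = c(1, 3), .n.trees = c(50, 100, 150, 200, 250, 300), .shrinkage = 0.1)
Train the model with the train function; the modeling method is boosted decision trees: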
gbmFit1 = train(trainx, trainy, method = "gbm", trControl = fitControl, tuneGrid = gbmGrid, verbose = FALSE)
The results show that accuracy is highest (0.826) with interaction.depth = 1 and n.trees = 150:
interaction.depth  n.trees  Accuracy  Kappa  Accuracy SD  Kappa SD
1                  50       0.822     0.635  0.0577       0.118
1                  100      0.824     0.639  0.0574       0.118
1                  150      0.826     0.643  0.0635       0.131
1                  200      0.824     0.640  0.0605       0.123
1                  250      0.816     0.623  0.0608       0.124
1                  300      0.824     0.640  0.0584       0.119
3                  50       0.816     0.621  0.0569       0.117
3                  100      0.820     0.631  0.0578       0.117
3                  150      0.815     0.621  0.0582       0.117
3                  200      0.820     0.630  0.0618       0.125
3                  250      0.813     0.617  0.0632       0.127
3                  300      0.812     0.615  0.0622       0.126
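The choice of the best setting can be checked by hand from the table. The sketch below is only an illustration in base R: the data frame is typed in from the printed results rather than recomputed, and the names results, best and spread are made up for this example. On a fitted object the same information should be available as gbmFit1$results and gbmFit1$bestTune.

```r
# Resampling results typed in from the table above (not recomputed).
results <- data.frame(
  interaction.depth = rep(c(1, 3), each = 6),
  n.trees  = rep(seq(50, 300, by = 50), times = 2),
  Accuracy = c(0.822, 0.824, 0.826, 0.824, 0.816, 0.824,
               0.816, 0.820, 0.815, 0.820, 0.813, 0.812),
  Kappa    = c(0.635, 0.639, 0.643, 0.640, 0.623, 0.640,
               0.621, 0.631, 0.621, 0.630, 0.617, 0.615)
)

# Row with the highest mean accuracy across resamples.
best <- results[which.max(results$Accuracy), ]
print(best)

# Gap between the best and the worst setting in the grid.
spread <- diff(range(results$Accuracy))
print(spread)
```

The gap between the best and worst setting is about 0.014, which is small next to the Accuracy SD of roughly 0.06 reported for every row, so the ranking of the settings should not be read as a sharp one.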
The same results can also be inspected graphically:
plot(gbmFit1)
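To read the plot, it helps to know how many numbers sit behind each point. The following is a rough sketch in base R, assuming the grid and the resampling settings used above (10 folds, 3 repeats). The names grid, n_combos, n_resamples and n_estimates are invented for this illustration, and the grid is written without the leading dots.

```r
# Same candidate values as gbmGrid above, column names without leading dots.
grid <- expand.grid(interaction.depth = c(1, 3),
                    n.trees = seq(50, 300, by = 50),
                    shrinkage = 0.1)

n_combos    <- nrow(grid)   # parameter combinations evaluated
n_resamples <- 10 * 3       # number (folds) times repeats
n_estimates <- n_combos * n_resamples

cat("parameter combinations:      ", n_combos, "\n")
cat("resamples per combination:   ", n_resamples, "\n")
cat("resampled accuracy estimates:", n_estimates, "\n")
```

Each point in the plot is therefore an average over the hold-out accuracies of one parameter combination. Because returnResamp = "all" was set, the individual values should also be kept in gbmFit1$resample.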