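caret Package Applications, Part 4: Model Prediction and Evaluation
Once a model has been built, we can use the predict function to make predictions, for example for the first five test samples: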
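predict(gbmFit1, newdata = testx)[1:5]
To compare different models, we can also build a second model using bagged decision trees, named gbmFit2:
gbmFit2 = train(trainx, trainy, method = "treebag", trControl = fitControl)
Another way to obtain predictions is the extractPrediction function. Part of its output is shown below: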
models = list(gbmFit1, gbmFit2)
predValues = extractPrediction(models,testX = testx, testY = testy)
head(predValues)
     obs     pred model dataType  object
6 Active   Active   gbm Training Object1
1 Active   Active   gbm Training Object1
2 Active   Active   gbm Training Object1
3 Active Inactive   gbm Training Object1
4 Active   Active   gbm Training Object1
5 Active   Active   gbm Training Object1
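Because extractPrediction stacks the rows for every model and every data type into one long data frame, a quick cross-tabulation helps confirm what it contains before subsetting. The sketch below uses a small hand-made data frame with the same main columns as the output above. Its values are hypothetical and only stand in for predValues.

```r
# Hypothetical stand-in for predValues (made-up rows, same column names)
toy <- data.frame(
  obs      = c("Active", "Inactive", "Active", "Inactive",
               "Active", "Inactive", "Active", "Inactive"),
  pred     = c("Active", "Inactive", "Inactive", "Inactive",
               "Active", "Active", "Active", "Inactive"),
  model    = rep(c("gbm", "treebag"), each = 4),
  dataType = rep(c("Training", "Training", "Test", "Test"), times = 2),
  stringsAsFactors = FALSE
)

# Number of rows per model and data type
counts <- table(toy$model, toy$dataType)
print(counts)

# Keep only the test rows, then compute the share of correct predictions per model
toy_test <- subset(toy, dataType == "Test")
acc <- tapply(toy_test$pred == toy_test$obs, toy_test$model, mean)
print(acc)
```

The same two steps (subset on dataType, then group by model) apply to the real predValues object.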
From this we can extract the predictions for the test samples:
testValues = subset(predValues, dataType == "Test")
To obtain predicted probabilities, use the extractProb function:
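probValues = extractProb(models, testX = testx, testY = testy)
testProbs = subset(probValues, dataType == "Test")
When evaluating performance on a classification problem, the most important step is to examine the confusion matrix of the predictions:
Pred1 = subset(testValues, model == "gbm")
Pred2 = subset(testValues, model == "treebag")
confusionMatrix(Pred1$pred, Pred1$obs)
confusionMatrix(Pred2$pred, Pred2$obs)
The results are below. The first model's accuracy is slightly better than the second's (0.8397 versus 0.8244):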
          Reference
Prediction Active Inactive
  Active       65       12
  Inactive      9       45
Accuracy : 0.8397

          Reference
Prediction Active Inactive
  Active       63       12
  Inactive     11       45
Accuracy : 0.8244
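The reported accuracies can be checked by hand from the counts in the two tables. This is a minimal base R sketch. The helper names accuracy and sens_active are made up for illustration, and "Active" is assumed to be the positive class.

```r
# Counts copied from the two confusion matrices above
# (rows = prediction, columns = reference; matrix() fills column by column)
cm_gbm <- matrix(c(65, 9, 12, 45), nrow = 2,
                 dimnames = list(Prediction = c("Active", "Inactive"),
                                 Reference  = c("Active", "Inactive")))
cm_bag <- matrix(c(63, 11, 12, 45), nrow = 2, dimnames = dimnames(cm_gbm))

# Accuracy: correctly classified samples (the diagonal) over all samples
accuracy <- function(cm) sum(diag(cm)) / sum(cm)

# Sensitivity for "Active": correctly predicted Active over all truly Active
sens_active <- function(cm) cm["Active", "Active"] / sum(cm[, "Active"])

print(round(accuracy(cm_gbm), 4))  # 0.8397
print(round(accuracy(cm_bag), 4))  # 0.8244
print(sens_active(cm_gbm))
print(sens_active(cm_bag))
```

Both models classify the same 45 Inactive samples correctly, so the accuracy gap comes entirely from the two extra Active samples that the first model gets right.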
Finally, we use the ROCR package to plot the ROC curve:
prob1 = subset(testProbs, model == "gbm")
prob2 = subset(testProbs, model == "treebag")
library(ROCR)
# Encode the observed class as 1 (Active) or 0 (Inactive) for ROCR
prob1$label = ifelse(prob1$obs == "Active", 1, 0)
pred1 = prediction(prob1$Active, prob1$label)
perf1 = performance(pred1, measure = "tpr", x.measure = "fpr")
plot(perf1)
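To make the content of the ROC curve concrete, here is a minimal base R sketch that builds the true positive rate and false positive rate by hand and integrates the area under the curve. The scores and classes are hypothetical toy values, not output from the models above, and the sketch assumes there are no tied scores.

```r
# Hypothetical toy data: predicted probability of "Active" and the observed class
score <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
obs   <- c("Active", "Active", "Inactive", "Active",
           "Inactive", "Active", "Inactive", "Inactive")
label <- ifelse(obs == "Active", 1, 0)

# Sort by decreasing score; each position acts as one classification threshold
ord          <- order(score, decreasing = TRUE)
label_sorted <- label[ord]

# Cumulative share of positives (TPR) and negatives (FPR) above each threshold
tpr <- c(0, cumsum(label_sorted) / sum(label_sorted))
fpr <- c(0, cumsum(1 - label_sorted) / sum(1 - label_sorted))

# Area under the curve by the trapezoid rule
auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
print(auc)
```

Calling plot(fpr, tpr, type = "s") draws the curve for the toy data. With ROCR itself, performance(pred1, measure = "auc") returns the corresponding area for the fitted model.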