X. Advice for Applying Machine Learning (Week 6)

机器学习Machine Learning – Andrew NG courses学习笔记

Advice for Applying Machine Learning对应用机器学习的建议

解决应用机器学习算法遇到的trainning set和test set预测不高的问题

Deciding What to Try Next决定接下来尝试什么

机器学习算法不佳时应该怎么做

But sometimes getting more training data doesn’t actually help and we will see how you can avoid spending a lot of time collecting more training data in settings where it is just not going to help.选择怎么做之前要知道的：

So in the next, I’m going to first talk about how evaluate your learning algorithms and after about some of these diagnostics.

Evaluating a Hypothesis假设评估

how do you tell if the hypothesis might be overfitting.

for problems with a large number of features is hard or may be impossible to plot what the hypothesis and so we need some other way to evaluate our hypothesis.The standard way to evaluate a learned hypothesis is as follows.

if your data were already randomly sorted,you could just take the first 70% and last 30%.if your data were not randomly ordered,it would be better to randomly shuffle or to randomly reorder the examples in your training set.Before you know sending the first 70% in the training set and the last 30% of the test set.

Note: overfitting is why the training set’s error is not a good predictor for how well the hypothesis will do on new example.

fairly typical procedure for how you would train and test the learning algorithm

Sometimes there is an alternative test sets metric that might be easier to interpret,and that’s the misclassification error.

Model Selection and Train_Validation_Test Sets模型选择和Train_Validation_Test集

model selection process

{choose what features like the degree polynomial to use with the learning algorithm or choose the regularization parameter for learning algorithm.}choose what degree polynomial to fit to data.it’s as if there’s one more parameter, d, that you’re trying to determine using your data set.

choose the model which has the lowest test set error.

这样选择会导致的问题：because I had fit this parameter d to my test set is no longer fair to evaluate my hypothesis on this test set, because I fit my parameters to this test set, I’ve chose the degree d of polynomial using the test set.And so my hypothesis is likely to do better on this test set than it would on new examples that it hasn’t seen before.

评估计算出的参数更优的方法（split the data set into three pieces）

Sometimes it’s also called the validation set instead of cross validation set.模型选择

同时也用对她的怀念来惩罚自己。

相关文章：

你感兴趣的文章：

标签云：