

Machine Learning FAQ

How can I know if Deep Learning works better for a specific problem than SVM or random forest?

If we tackle a supervised learning problem, my advice is to start with the simplest hypothesis space first, i.e., try a linear model such as logistic regression. If this doesn’t work “well” (i.e., it doesn’t meet our expectation or performance criterion that we defined earlier), I would move on to the next experiment.
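
As a minimal sketch of what that first experiment could look like (assuming scikit-learn; the Iris data is only a stand-in for your own feature matrix and labels):

```python
# Minimal baseline sketch (assumes scikit-learn); the Iris data is only a
# stand-in for your own feature matrix X and label vector y.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

# Simplest hypothesis space first: a plain linear model.
baseline = LogisticRegression(max_iter=1000)
baseline.fit(X_train, y_train)
print("Baseline accuracy: %.3f" % accuracy_score(y_test, baseline.predict(X_test)))
```

If that baseline already meets the performance criterion we defined, there is no need to reach for anything heavier.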

I would say that random forests are probably THE “worry-free” approach – if such a thing exists in ML: there are no real hyperparameters to tune (maybe except for the number of trees; typically, the more trees we have, the better). On the contrary, there are a lot of knobs to be turned in SVMs: choosing the “right” kernel, the regularization penalties, the slack variable, …

Both random forests and SVMs are non-parametric models (i.e., the complexity grows as the number of training samples increases). Training a non-parametric model can thus be more expensive, computationally, compared to a generalized linear model, for example. The more trees we have, the more expensive it is to build a random forest. Also, we can end up with a lot of support vectors in SVMs; in the worst-case scenario, we have as many support vectors as we have samples in the training set. Although there are multi-class SVMs, the typical implementation for multi-class classification is One-vs.-All; thus, we have to train an SVM for each class – in contrast to decision trees or random forests, which can handle multiple classes out of the box.
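
To make the multi-class point concrete, here is a small sketch (again assuming scikit-learn; the Wine data is just an illustration) that fits one binary SVM per class via an explicit One-vs.-Rest wrapper next to a random forest that handles the three classes out of the box:

```python
# Sketch only (assumes scikit-learn): an explicit One-vs-Rest SVM versus a
# random forest on a small three-class toy dataset.
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)  # 3 classes -> 3 binary SVMs under One-vs-Rest

# One binary SVM is trained per class; support vectors can pile up with the data.
ovr_svm = make_pipeline(StandardScaler(), OneVsRestClassifier(SVC(kernel="rbf")))

# The random forest covers all classes with a single model.
forest = RandomForestClassifier(n_estimators=100, random_state=1)

for name, model in [("One-vs-Rest SVM", ovr_svm), ("Random forest", forest)]:
    scores = cross_val_score(model, X, y, cv=5)
    print("%s: %.3f mean CV accuracy" % (name, scores.mean()))
```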

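
And as a rough sketch of the difference in tuning effort mentioned above (scikit-learn again; the parameter ranges below are purely illustrative, not recommendations), the random forest search has essentially one knob, while the SVM grid already spans the kernel, the regularization penalty C, and the kernel width gamma:

```python
# Sketch of the tuning-effort difference (assumes scikit-learn); the grids
# below are illustrative placeholders, not tuning advice.
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)

# Random forest: essentially one knob, and "more trees" is usually safe.
forest_search = GridSearchCV(
    RandomForestClassifier(random_state=1),
    {"n_estimators": [100, 200, 500]},
    cv=5)

# SVM: kernel, regularization penalty C, and kernel width gamma all interact.
svm_search = GridSearchCV(
    make_pipeline(StandardScaler(), SVC()),
    {"svc__kernel": ["linear", "rbf"],
     "svc__C": [0.1, 1.0, 10.0],
     "svc__gamma": ["scale", 0.01, 0.1]},
    cv=5)

for name, search in [("Random forest", forest_search), ("SVM", svm_search)]:
    search.fit(X, y)
    print(name, search.best_params_, "CV accuracy: %.3f" % search.best_score_)
```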

To summarize, random forests are much simpler to train for a practitioner; it’s easier to find a good, robust model. The complexity of a random forest grows with the number of trees in the forest and the number of training samples we have. In SVMs, we typically need to do a fair amount of parameter tuning, and in addition to that, the computational cost grows linearly with the number of classes as well.

Deep Learning

As a rule of thumb, I’d say that SVMs are great for relatively small data sets with fewer outliers. Random forests may require more data, but they almost always come up with a pretty robust model. And deep learning algorithms… well, they require “relatively” large datasets to work well, and you also need the infrastructure to train them in reasonable time. Also, deep learning algorithms require much more experience: setting up a neural network using deep learning algorithms is much more tedious than using an off-the-shelf classifier such as a random forest or an SVM. On the other hand, deep learning really shines when it comes to complex problems such as image classification, natural language processing, and speech recognition. Another advantage is that you have to worry less about the feature engineering part.
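
To give an idea of that set-up overhead, here is a sketch only (assuming PyTorch and scikit-learn; the random input data and the tiny ConvNet are placeholders, not a recommended architecture) contrasting an off-the-shelf classifier with a hand-wired neural network:

```python
# Sketch of the set-up contrast (assumes PyTorch and scikit-learn are installed);
# the random data and the tiny ConvNet are placeholders for illustration only.
import numpy as np
import torch
from torch import nn
from sklearn.ensemble import RandomForestClassifier

X_flat = np.random.rand(256, 28 * 28)   # placeholder "images"
y = np.random.randint(0, 10, size=256)  # placeholder labels

# Off-the-shelf classifier: a single line to define and fit.
forest = RandomForestClassifier(n_estimators=100).fit(X_flat, y)

# Neural network: architecture, loss, optimizer, and the training loop all
# have to be specified (and tuned) by hand.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 14 * 14, 10),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

images = torch.tensor(X_flat, dtype=torch.float32).reshape(-1, 1, 28, 28)
labels = torch.tensor(y, dtype=torch.long)
for epoch in range(5):  # real training would also need batching, schedules, etc.
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
```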

Again, in practice, the decision of which classifier to choose really depends on your dataset and the general complexity of the problem – that’s where your experience as a machine learning practitioner kicks in.

When it comes to predictive performance, there are cases where SVMs do better than random forests and vice versa:

Caruana, Rich, and Alexandru Niculescu-Mizil. “An empirical comparison of supervised learning algorithms.” Proceedings of the 23rd International Conference on Machine Learning.

The same is true for deep learning algorithms if you look at the MNIST benchmarks: the best-performing model in this set is a committee consisting of 35 ConvNets, which were reported to have a 0.23% test error, whereas the best SVM model has a test error of 0.56%. The ConvNet ensemble may reach a better accuracy (for the sake of this example, let’s pretend that these are totally unbiased estimates), but without question, I’d say that the 35-ConvNet committee is far more expensive (computationally). So, if you have to make that decision: is a 0.33% improvement worth it? In some cases, it’s maybe worth it (e.g., in the financial sector for non-real-time predictions); in other cases, it perhaps won’t be worth it, though.