Time complexity for different machine learning algorithms

Let D be an n × m dimensional supervised learning dataset, with n the number of examples and m the number of features.
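
For concreteness, the sketches in the sections below share a small synthetic dataset; the sizes and the names (n, m, c, X, y) are arbitrary choices for illustration, not part of any algorithm.

```python
# Illustrative setup shared by the sketches below; all sizes are arbitrary.
import numpy as np

rng = np.random.default_rng(seed=0)
n, m, c = 1000, 20, 3            # examples, features, classes
X = rng.normal(size=(n, m))      # the dataset D, shape (n, m)
y = rng.integers(0, c, size=n)   # class labels in {0, ..., c-1}
```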

Naive Bayes Classifier

A (Gaussian) Naive Bayes classifier takes O(nm) to train. The prediction of a new sample takes O(mc), with c the number of classes.
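
A minimal sketch of where these bounds come from (an illustrative implementation, not scikit-learn's): training makes one pass over all n·m entries to accumulate per-class statistics, and prediction evaluates m Gaussian densities for each of the c classes.

```python
import numpy as np

def nb_train(X, y, c):
    # One pass over every (example, feature) entry -> O(nm)
    n, m = X.shape
    means, variances, priors = np.zeros((c, m)), np.zeros((c, m)), np.zeros(c)
    for k in range(c):
        Xk = X[y == k]                        # examples of class k
        means[k] = Xk.mean(axis=0)
        variances[k] = Xk.var(axis=0) + 1e-9  # small epsilon for stability
        priors[k] = len(Xk) / n
    return means, variances, priors

def nb_predict(x, means, variances, priors):
    # m Gaussian log-densities for each of the c classes -> O(mc)
    log_probs = (np.log(priors)
                 - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)
                 - 0.5 * np.sum((x - means) ** 2 / variances, axis=1))
    return np.argmax(log_probs)

means, variances, priors = nb_train(X, y, c)  # X, y, c from the setup above
print(nb_predict(X[0], means, variances, priors))
```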

Ordinary Least Squares

Training by ordinary least squares takes O(nm^2), while prediction for a new sample takes O(m).
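
A sketch via the normal equations makes the bound visible: forming X^T X costs O(nm^2), which dominates the O(m^3) solve whenever n ≥ m. The target y_reg below is a made-up regression vector, since the setup dataset above is for classification.

```python
import numpy as np

def ols_train(X, y_reg):
    XtX = X.T @ X                     # O(nm^2): the dominant term when n >= m
    Xty = X.T @ y_reg                 # O(nm)
    return np.linalg.solve(XtX, Xty)  # O(m^3) for the m x m system

def ols_predict(x, w):
    return x @ w                      # a single dot product -> O(m)

y_reg = X @ rng.normal(size=m)        # hypothetical regression target
w = ols_train(X, y_reg)
print(ols_predict(X[0], w))
```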

Support Vector Machines

Training time depends on the training set and on the hyper-parameters, and ranges between O(n^2m) and O(n^3m). Let v be the number of support vectors obtained by training. Then prediction time for a new sample is O(vm).
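
The prediction bound is easy to read off the decision function. Below is a sketch assuming an RBF kernel and already-trained parameters (support_vectors, dual_coefs, bias, and gamma are illustrative names, not a specific library's API): each of the v kernel evaluations costs O(m).

```python
import numpy as np

def svm_predict(x, support_vectors, dual_coefs, bias, gamma=0.1):
    # v squared distances, each over m features -> O(vm)
    sq_dists = np.sum((support_vectors - x) ** 2, axis=1)
    k = np.exp(-gamma * sq_dists)          # one RBF kernel value per support vector
    return np.sign(dual_coefs @ k + bias)  # sign of the decision function
```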

Random Forest

Assuming trees are free to grow to a maximum height of O(log n), training takes O(tun log n), where t is the number of trees and u is the number of features considered for splitting. The prediction of a new sample takes O(t log n).
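
A sketch of prediction under an assumed array-based tree layout (feature, threshold, left, right, and value are illustrative names; left[node] == -1 marks a leaf): each of the t trees is walked from root to leaf in O(log n) steps.

```python
def tree_predict(x, feature, threshold, left, right, value):
    node = 0
    while left[node] != -1:              # O(log n) steps for a balanced tree
        if x[feature[node]] <= threshold[node]:
            node = left[node]
        else:
            node = right[node]
    return value[node]

def forest_predict(x, trees):
    # t independent root-to-leaf traversals -> O(t log n)
    votes = [tree_predict(x, *tree) for tree in trees]
    return max(set(votes), key=votes.count)  # majority vote
```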

eXtreme Gradient Boosting

Training with XGBoost takes O(tdx log n), where t is the number of trees, d is the height of the trees, and x is the number of non-missing entries in the training data. Prediction for a new sample takes O(td).
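
Prediction follows the same pattern as the forest sketch above, except that tree heights are capped at d and tree outputs are summed rather than voted; the tree layout and names are again illustrative assumptions, not XGBoost's internals.

```python
def boosted_predict(x, trees, base_score=0.5):
    score = base_score
    for feature, threshold, left, right, value in trees:
        node = 0
        while left[node] != -1:          # at most d steps per tree -> O(td) total
            if x[feature[node]] <= threshold[node]:
                node = left[node]
            else:
                node = right[node]
        score += value[node]             # boosting sums the leaf outputs
    return score
```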

Marco Virgolin
Junior Researcher on Explainable AI and Evolutionary Machine Learning