Time complexity for different machine learning algorithms

Let D be an n × m dimensional supervised learning dataset, with n the number of examples and m the number of features.
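
For concreteness, the sketches in the sections below share a small synthetic dataset; the sizes and the names (n, m, c, X, y) are arbitrary choices for illustration, not part of any algorithm.

```python
# Illustrative setup shared by the sketches below; all sizes are arbitrary.
import numpy as np

rng = np.random.default_rng(seed=0)
n, m, c = 1000, 20, 3            # examples, features, classes
X = rng.normal(size=(n, m))      # the dataset D, shape (n, m)
y = rng.integers(0, c, size=n)   # class labels in {0, ..., c-1}
```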

Naive Bayes Classifier

A (Gaussian) Naive Bayes classifier takes O(nm) to train. The prediction of a new sample takes O(mc), with c the number of classes.
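
A minimal sketch of where these bounds come from (an illustrative implementation, not scikit-learn's): training makes one pass over all n·m entries to accumulate per-class statistics, and prediction evaluates m Gaussian densities for each of the c classes.

```python
import numpy as np

def nb_train(X, y, c):
    # One pass over every (example, feature) entry -> O(nm)
    n, m = X.shape
    means, variances, priors = np.zeros((c, m)), np.zeros((c, m)), np.zeros(c)
    for k in range(c):
        Xk = X[y == k]                        # examples of class k
        means[k] = Xk.mean(axis=0)
        variances[k] = Xk.var(axis=0) + 1e-9  # small epsilon for stability
        priors[k] = len(Xk) / n
    return means, variances, priors

def nb_predict(x, means, variances, priors):
    # m Gaussian log-densities for each of the c classes -> O(mc)
    log_probs = (np.log(priors)
                 - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)
                 - 0.5 * np.sum((x - means) ** 2 / variances, axis=1))
    return np.argmax(log_probs)

means, variances, priors = nb_train(X, y, c)  # X, y, c from the setup above
print(nb_predict(X[0], means, variances, priors))
```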

Ordinary Least Squares

Training by ordinary least squares takes O(nm^2), while prediction for a new sample takes O(m).
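
A sketch via the normal equations makes the bound visible: forming X^T X costs O(nm^2), which dominates the O(m^3) solve whenever n ≥ m. The target y_reg below is a made-up regression vector, since the setup dataset above is for classification.

```python
import numpy as np

def ols_train(X, y_reg):
    XtX = X.T @ X                     # O(nm^2): the dominant term when n >= m
    Xty = X.T @ y_reg                 # O(nm)
    return np.linalg.solve(XtX, Xty)  # O(m^3) for the m x m system

def ols_predict(x, w):
    return x @ w                      # a single dot product -> O(m)

y_reg = X @ rng.normal(size=m)        # hypothetical regression target
w = ols_train(X, y_reg)
print(ols_predict(X[0], w))
```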

Support Vector Machines

Training time depends on the training set and on the hyper-parameters, and ranges between O(n^2m) and O(n^3m). Let v be the number of support vectors obtained by training. Then prediction time for a new sample is O(vm).
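
The prediction bound is easy to read off the decision function. Below is a sketch assuming an RBF kernel and already-trained parameters (support_vectors, dual_coefs, bias, and gamma are illustrative names, not a specific library's API): each of the v kernel evaluations costs O(m).

```python
import numpy as np

def svm_predict(x, support_vectors, dual_coefs, bias, gamma=0.1):
    # v squared distances, each over m features -> O(vm)
    sq_dists = np.sum((support_vectors - x) ** 2, axis=1)
    k = np.exp(-gamma * sq_dists)          # one RBF kernel value per support vector
    return np.sign(dual_coefs @ k + bias)  # sign of the decision function
```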

Random Forest

Assuming trees are free to grow to a maximum height of O(log n), training takes O(tun log n), where t is the number of trees and u is the number of features considered for splitting. The prediction of a new sample takes O(t log n).
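
A sketch of prediction under an assumed array-based tree layout (feature, threshold, left, right, and value are illustrative names; left[node] == -1 marks a leaf): each of the t trees is walked from root to leaf in O(log n) steps.

```python
def tree_predict(x, feature, threshold, left, right, value):
    node = 0
    while left[node] != -1:              # O(log n) steps for a balanced tree
        if x[feature[node]] <= threshold[node]:
            node = left[node]
        else:
            node = right[node]
    return value[node]

def forest_predict(x, trees):
    # t independent root-to-leaf traversals -> O(t log n)
    votes = [tree_predict(x, *tree) for tree in trees]
    return max(set(votes), key=votes.count)  # majority vote
```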

eXtreme Gradient Boosting

Training with XGBoost takes O(tdx log n), where t is the number of trees, d is the height of the trees, and x is the number of non-missing entries in the training data. Prediction for a new sample takes O(td).
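
Prediction follows the same pattern as the forest sketch above, except that tree heights are capped at d and tree outputs are summed rather than voted; the tree layout and names are again illustrative assumptions, not XGBoost's internals.

```python
def boosted_predict(x, trees, base_score=0.5):
    score = base_score
    for feature, threshold, left, right, value in trees:
        node = 0
        while left[node] != -1:          # at most d steps per tree -> O(td) total
            if x[feature[node]] <= threshold[node]:
                node = left[node]
            else:
                node = right[node]
        score += value[node]             # boosting sums the leaf outputs
    return score
```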

Marco Virgolin
Junior Researcher on Explainable AI and Evolutionary Machine Learning