A commonly used algorithm in clustering and machine learning it basically works by finding the k nearest neighbours to a point and then using the average of those points to predict the value of the point.
The algorithm we are trying to implement is the following:
The implementation is fairly simple, we will be using the sklearn library to do the heavy lifting for us. We will be using the iris dataset to test our algorithm. The iris dataset is a dataset that contains 3 different types of iris flowers and 4 different features of the flowers. We will be using the first 2 features to predict the type of flower.
The biggest problem with this algorithm is that it is very slow. This is because it has to calculate the distance between every point in the dataset and the point that we are trying to predict. This means that the algorithm has to calculate the distance between the point we are trying to predict and every other point in the dataset.
The Big(O) notation for this algorithm is
The results are as follows: