K-Nearest Neighbors
K-nearest neighbors (KNN) is a supervised machine learning algorithm that classifies a data point based on how its neighbors are classified.
- Instant classifier
- Store Data in KNN
- Classifying a Query Point
- Calculating Classification Probability
- Retrieving Nearest Neighbors
- Configuring Distance Metric
Instant classifier
Instant Classifier Function
Function Signature
def instant_classifier(self, x_vals, y_vals, query_point, p=3, k=2):
Parameters
- x_vals: The input data points.
- y_vals (must be converted to int32): The labels corresponding to the input data points.
- query_point: The point for which classification is to be determined.
- p (int, default=3): The power parameter for the Minkowski distance metric.
- k (int, default=2): The number of nearest neighbors to consider for classification.
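For reference, the Minkowski distance of order p between two n-dimensional points x and y is conventionally defined as shown here. Setting p = 1 gives the Manhattan distance and p = 2 the Euclidean distance. That the library computes exactly this form internally is an assumption, not something this page confirms.

```latex
d_p(x, y) = \left( \sum_{i=1}^{n} \lvert x_i - y_i \rvert^{p} \right)^{1/p}
```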
Return Value
Returns the classification result for the query_point based on the k nearest neighbors in the x_vals dataset.
Description
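The instant_classifier function classifies a given query_point in a single call, with the data passed directly as arguments. It finds the k nearest neighbors of the query_point in the x_vals dataset, measuring distance with the Minkowski metric of power p, and then uses the labels of those neighbors to determine the classification of the query_point.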
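The idea described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the names minkowski and knn_classify are hypothetical, and majority voting over the k neighbor labels is an assumption about how the labels are combined.

```python
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length points."""
    return sum(abs(ai - bi) ** p for ai, bi in zip(a, b)) ** (1.0 / p)


def knn_classify(x_vals, y_vals, query_point, p=3, k=2):
    """Hypothetical stand-in for a one-shot KNN classifier (majority vote)."""
    # Rank the training indices by distance to the query point.
    order = sorted(
        range(len(x_vals)),
        key=lambda i: minkowski(x_vals[i], query_point, p),
    )
    # Collect the labels of the k closest points, nearest first.
    nearest_labels = [y_vals[i] for i in order[:k]]
    # Majority vote. On a tie, Counter keeps first-seen order, so the
    # label of the closer neighbor wins in this sketch.
    return Counter(nearest_labels).most_common(1)[0][0]


x_vals = [[1, 2], [2, 3], [3, 4]]
y_vals = [0, 1, 0]
print(knn_classify(x_vals, y_vals, [2, 2], p=3, k=3))
```

With an even k and two classes, the vote can tie; how the library breaks such ties is not stated on this page, so an odd k is the safer choice when experimenting.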
Examples
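import numpy as np
from deeprai.models import KNN
# Sample data
x_vals = [[1, 2], [2, 3], [3, 4]]
# Labels as int32 (assumes a NumPy array is accepted here)
y_vals = np.array([0, 1, 0], dtype=np.int32)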
query_point = [2, 2]
# Create an instance of the classifier
classifier = KNN()
# Classify the query_point
result = classifier.instant_classifier(x_vals, y_vals, query_point, p=3, k=2)
print(result) # This will print the classification result for the query_point
Note: Ensure that y_vals is converted to int32 before passing it to the function.
Store Data in KNN
Storing Values in KNN Classifier
Function Signature
def store_vals(self, x_values, y_values, p=3, k=2):
Parameters
- x_values: The input data points to be stored in the classifier.
- y_values (must be converted to int32): The labels corresponding to the input data points to be stored in the classifier.
- p (int, default=3): The power parameter for the Minkowski distance metric, to be stored for future use.
- k (int, default=2): The number of nearest neighbors to consider for classification, to be stored for future use.
Return Value
This function does not return anything. It modifies the KNN instance by storing the provided values.
Description
The store_vals function stores the provided data points, labels, power parameter, and number of neighbors in the KNN classifier instance. This allows the classifier to use these values in subsequent classification tasks without needing them to be provided again.
Examples
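import numpy as np
from deeprai.models import KNN
# Sample data
x_vals = [[1, 2], [2, 3], [3, 4]]
# Labels as int32 (assumes a NumPy array is accepted here)
y_vals = np.array([0, 1, 0], dtype=np.int32)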
# Create an instance of the classifier
classifier = KNN()
# Store the values in the classifier
classifier.store_vals(x_vals, y_vals, p=3, k=2)
Note: Ensure that y_vals is converted to int32 before storing.
Classifying a Query Point
Classifying a Query Point with KNN
Function Signature
def classify(self, query_point):
Parameters
- query_point: The point for which classification is to be determined.
Return Value
Returns the classification result for the query_point based on the stored values in the KNN instance.
Description
The classify function classifies a given query_point based on the stored values in the KNN instance. The distance between the query_point and each stored data point is calculated using the Minkowski distance metric with the stored power parameter. The labels of the nearest stored data points are then used to determine the classification of the query_point.
The store_vals function must be called before classify so that the necessary values are stored in the KNN instance.
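The store-then-classify pattern can be sketched with a small stand-in class. TinyKNN is hypothetical and is not the deeprai KNN class. Raising RuntimeError when nothing has been stored, and voting by majority, are both assumptions made for the illustration.

```python
from collections import Counter


class TinyKNN:
    """Hypothetical illustration only; not the deeprai KNN class."""

    def __init__(self):
        self._stored = None  # nothing is stored until store_vals is called

    def store_vals(self, x_values, y_values, p=3, k=2):
        # Keep the data and settings so classify needs only the query point.
        self._stored = (list(x_values), list(y_values), p, k)

    def classify(self, query_point):
        if self._stored is None:
            raise RuntimeError("call store_vals before classify")
        x_values, y_values, p, k = self._stored

        def dist(point):
            # Minkowski distance of order p to the query point.
            total = sum(abs(a - b) ** p for a, b in zip(point, query_point))
            return total ** (1.0 / p)

        # Indices of the stored points, nearest first.
        order = sorted(range(len(x_values)), key=lambda i: dist(x_values[i]))
        votes = Counter(y_values[i] for i in order[:k])
        return votes.most_common(1)[0][0]


model = TinyKNN()
model.store_vals([[1, 2], [2, 3], [3, 4]], [0, 1, 0], p=3, k=3)
print(model.classify([2, 2]))  # prints 0
```

Keeping p and k on the instance is what lets classify take a single argument, at the cost of requiring a prior store_vals call.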
Examples
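import numpy as np
from deeprai.models import KNN
# Sample data
x_vals = [[1, 2], [2, 3], [3, 4]]
# Labels as int32, as store_vals requires (assumes a NumPy array is accepted)
y_vals = np.array([0, 1, 0], dtype=np.int32)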
query_point = [2, 2]
# Create an instance of the classifier
classifier = KNN()
# Store the values in the classifier
classifier.store_vals(x_vals, y_vals, p=3, k=2)
# Classify the query_point
result = classifier.classify(query_point)
print(result) # This will print the classification result for the query_point
Calculating Classification Probability
Calculating Classification Probability with KNN
Function Signature
def classify_probability(self, query_point, expected_val):
Parameters
- query_point: The point for which classification probability is to be determined.
- expected_val: The label value for which the probability is to be calculated.
Return Value
Returns the probability (as a percentage) that the query_point belongs to the class specified by expected_val, based on the stored values in the KNN instance.
Description
The classify_probability function calculates the probability that a given query_point belongs to the class specified by expected_val. It first retrieves the nearest neighbors of the query_point using the classify_neighbors function. It then counts how many of these neighbors have the label expected_val and calculates the probability based on this count.
The store_vals function must be called before classify_probability so that the necessary values are stored in the KNN instance.
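The counting step can be illustrated with a short helper. The exact formula is an assumption: this sketch takes the share of neighbors whose label equals expected_val and scales it to a percentage. The name neighbor_probability is hypothetical.

```python
def neighbor_probability(neighbor_labels, expected_val):
    """Percentage of neighbor labels equal to expected_val (assumed formula)."""
    if not neighbor_labels:
        raise ValueError("need at least one neighbor label")
    # Count the neighbors that carry the expected label.
    matches = sum(1 for label in neighbor_labels if label == expected_val)
    return 100.0 * matches / len(neighbor_labels)


# Labels of k = 2 nearest neighbors, one from each class.
print(neighbor_probability([0, 1], 1))  # prints 50.0
```

Under this formula the result can only take k + 1 distinct values, so a small k gives a coarse estimate.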
Examples
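import numpy as np
from deeprai.models import KNN
# Sample data
x_vals = [[1, 2], [2, 3], [3, 4]]
# Labels as int32, as store_vals requires (assumes a NumPy array is accepted)
y_vals = np.array([0, 1, 0], dtype=np.int32)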
query_point = [2, 2]
# Create an instance of the classifier
classifier = KNN()
# Store the values in the classifier
classifier.store_vals(x_vals, y_vals, p=3, k=2)
# Calculate the probability that the query_point belongs to class 1
probability = classifier.classify_probability(query_point, 1)
print(f"The probability that the query point belongs to class 1 is {probability}%")
Retrieving Nearest Neighbors
Retrieving Nearest Neighbors with KNN
Function Signature
def classify_neighbors(self, query_point):
Parameters
- query_point: The point for which the nearest neighbors are to be determined.
Return Value
Returns the indices of the k nearest neighbors to the query_point based on the stored values in the KNN instance.
Description
The classify_neighbors function retrieves the indices of the k nearest neighbors for a given query_point based on the stored values in the KNN instance. Each index refers to a position in the stored x_values.
The store_vals function must be called before classify_neighbors so that the necessary values are stored in the KNN instance.
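A neighbor lookup of this kind can be sketched as a sort over distances. The helper nearest_indices is hypothetical, and returning the indices ordered nearest first is an assumption, since this page does not state the order the library uses.

```python
def nearest_indices(x_values, query_point, k=2, p=3):
    """Indices of the k points in x_values closest to query_point."""
    # Minkowski distance of order p from every stored point to the query.
    distances = [
        sum(abs(a - b) ** p for a, b in zip(point, query_point)) ** (1.0 / p)
        for point in x_values
    ]
    # sorted is stable, so equally distant points keep their original order.
    order = sorted(range(len(x_values)), key=distances.__getitem__)
    return order[:k]


x_vals = [[1, 2], [2, 3], [3, 4]]
print(nearest_indices(x_vals, [2, 2], k=2, p=3))  # prints [0, 1]
```

The returned indices can be used to look up the matching labels, for example [y_vals[i] for i in indices].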
Examples
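import numpy as np
from deeprai.models import KNN
# Sample data
x_vals = [[1, 2], [2, 3], [3, 4]]
# Labels as int32, as store_vals requires (assumes a NumPy array is accepted)
y_vals = np.array([0, 1, 0], dtype=np.int32)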
query_point = [2, 2]
# Create an instance of the classifier
classifier = KNN()
# Store the values in the classifier
classifier.store_vals(x_vals, y_vals, p=3, k=2)
# Retrieve the nearest neighbors of the query_point
neighbors = classifier.classify_neighbors(query_point)
print(f"The indices of the nearest neighbors to the query point are: {neighbors}")
Configuring Distance Metric
Configuring Distance Metric for KNN
Function Signature
def config_distance(self, distance):
Parameters
- distance: The name of the distance metric to be configured for the KNN instance.
Return Value
This function does not return anything. It modifies the KNN instance.
Description
The config_distance function sets the distance metric for the KNN instance.
The valid distance metrics that can be passed to this function are:
- "hamming distance"
- "minkowski distance"
- "manhattan distance"
- "euclidean distance"
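To show how the four metrics differ, here is a plain-Python sketch of their textbook definitions, keyed by the same names. It does not reproduce the library's code. In particular, Hamming distance is written here as a count of differing coordinates, while some libraries return the fraction instead. The METRICS dictionary and function names are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which the two points differ (count variant)."""
    return sum(1 for ai, bi in zip(a, b) if ai != bi)


def minkowski(a, b, p=3):
    """Minkowski distance of order p."""
    return sum(abs(ai - bi) ** p for ai, bi in zip(a, b)) ** (1.0 / p)


def manhattan(a, b):
    """Minkowski distance with p = 1: sum of absolute differences."""
    return minkowski(a, b, p=1)


def euclidean(a, b):
    """Minkowski distance with p = 2: straight-line distance."""
    return minkowski(a, b, p=2)


# Lookup table keyed by the metric names listed above.
METRICS = {
    "hamming distance": hamming,
    "minkowski distance": minkowski,
    "manhattan distance": manhattan,
    "euclidean distance": euclidean,
}

a, b = [1, 2], [4, 6]
for name, metric in METRICS.items():
    print(name, metric(a, b))
```

For these two points the Manhattan distance is 7.0 and the Euclidean distance is 5.0, so the choice of metric can change which stored points count as nearest.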
Examples
from deeprai.models import KNN
# Create an instance of the classifier
classifier = KNN()
# Configure the distance metric to be used
classifier.config_distance("euclidean distance")