## How to Find a Pandas Row with the Closest Profile

A common task in pandas is finding the row of a DataFrame that most closely matches a target profile. One way to do this is to compute a distance between the target profile and every row, then select the row with the smallest distance.

Let’s consider a hypothetical scenario where we have a DataFrame containing information about students, such as their age, height, and weight. We want to find the student whose profile is closest to a given set of target values.

Here is how you can achieve this using pandas:

### Step 1: Create a Sample DataFrame

    import pandas as pd

    data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
            'Age': [20, 22, 21, 25],
            'Height': [165, 170, 168, 175],
            'Weight': [60, 65, 62, 70]}

    df = pd.DataFrame(data)
    print(df)

### Step 2: Define Target Profile

Let’s say our target profile is:

    target_profile = {'Age': 23,
                      'Height': 172,
                      'Weight': 68}

### Step 3: Calculate Distance

We will calculate the Euclidean distance between the target profile and each row in the DataFrame. For a row \((x_1, y_1, z_1)\) and a target \((x_2, y_2, z_2)\), where the three components are Age, Height, and Weight, the distance is:

\[ \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2} \]
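
As a worked check of the formula on the sample data, take Bob's row (Age 22, Height 170, Weight 65) against the target (Age 23, Height 172, Weight 68):

\[ \sqrt{(22 - 23)^2 + (170 - 172)^2 + (65 - 68)^2} = \sqrt{1 + 4 + 9} = \sqrt{14} \approx 3.74 \]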

    df['Distance'] = ((df['Age'] - target_profile['Age']) ** 2 +
                      (df['Height'] - target_profile['Height']) ** 2 +
                      (df['Weight'] - target_profile['Weight']) ** 2) ** 0.5

    print(df)
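
The same distance can be computed without writing out each column by hand. The following is a minimal sketch, assuming the feature columns are exactly the keys of `target_profile`; the helper name `profile_distance` is hypothetical and not part of pandas.

```python
import numpy as np
import pandas as pd


def profile_distance(df, target_profile):
    """Euclidean distance from each row to target_profile (hypothetical helper)."""
    cols = list(target_profile)          # feature columns, taken from the target's keys
    target = pd.Series(target_profile)   # index labels match the column names
    diff = df[cols] - target             # subtraction aligns on column labels
    return np.sqrt((diff ** 2).sum(axis=1))


df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Age': [20, 22, 21, 25],
                   'Height': [165, 170, 168, 175],
                   'Weight': [60, 65, 62, 70]})
target_profile = {'Age': 23, 'Height': 172, 'Weight': 68}

distances = profile_distance(df, target_profile)
print(distances.round(3))
```

Written this way, adding another numeric feature only requires adding a key to `target_profile`.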

### Step 4: Find Row with Closest Profile

    closest_row = df.loc[df['Distance'].idxmin()]

    print("Closest Row:")
    print(closest_row)

In this example, Bob has the closest profile to our target values, with a distance of about 3.74. David is next at about 4.12.
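
To sanity-check the result, it can help to rank every row by distance instead of looking only at the minimum. A small sketch using the same sample data and target as above; `ranked` and `top_two` are illustrative variable names.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Age': [20, 22, 21, 25],
                   'Height': [165, 170, 168, 175],
                   'Weight': [60, 65, 62, 70]})
target_profile = {'Age': 23, 'Height': 172, 'Weight': 68}

# Same Euclidean distance as in Step 3
df['Distance'] = ((df['Age'] - target_profile['Age']) ** 2 +
                  (df['Height'] - target_profile['Height']) ** 2 +
                  (df['Weight'] - target_profile['Weight']) ** 2) ** 0.5

# Full ranking, nearest first
ranked = df.sort_values('Distance')
print(ranked[['Name', 'Distance']])

# Only the two nearest rows
top_two = df.nsmallest(2, 'Distance')
print(top_two[['Name', 'Distance']])
```

`nsmallest` is convenient when several candidate matches are wanted, not just the single closest one.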

## Examples in Different Programming Languages:

### Python:

    import pandas as pd

    # Create sample DataFrame
    data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
            'Age': [20, 22, 21, 25],
            'Height': [165, 170, 168, 175],
            'Weight': [60, 65, 62, 70]}

    df = pd.DataFrame(data)

    # Define target profile
    target_profile = {'Age': 23, 'Height': 172, 'Weight': 68}

    # Calculate Euclidean distance from each row to the target
    df['Distance'] = ((df['Age'] - target_profile['Age']) ** 2 +
                      (df['Height'] - target_profile['Height']) ** 2 +
                      (df['Weight'] - target_profile['Weight']) ** 2) ** 0.5

    # Find the row with the smallest distance
    closest_row = df.loc[df['Distance'].idxmin()]

    print("Closest Row:")
    print(closest_row)
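
One caveat with plain Euclidean distance is that columns measured on larger scales contribute more to the total. A possible variation, sketched here as an assumption and not as part of the method above, divides each difference by that column's standard deviation before combining them.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Age': [20, 22, 21, 25],
                   'Height': [165, 170, 168, 175],
                   'Weight': [60, 65, 62, 70]})
target_profile = {'Age': 23, 'Height': 172, 'Weight': 68}

cols = list(target_profile)
std = df[cols].std()  # per-column sample standard deviation

# Express each difference in units of that column's standard deviation
scaled_diff = (df[cols] - pd.Series(target_profile)) / std
scaled_distance = np.sqrt((scaled_diff ** 2).sum(axis=1))

closest_row = df.loc[scaled_distance.idxmin()]
print(closest_row)
```

Whether to scale depends on the data; with features in comparable units, the unscaled distance may be perfectly adequate.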

### R:

    library(dplyr)

    # Create sample data frame
    data <- data.frame(Name = c('Alice', 'Bob', 'Charlie', 'David'),
                       Age = c(20L, 22L, 21L, 25L),
                       Height = c(165L, 170L, 168L, 175L),
                       Weight = c(60L, 65L, 62L, 70L))

    # Define target profile
    target_profile <- list(Age = 23,
                           Height = 172,
                           Weight = 68)

    # Calculate Euclidean distance from each row to the target
    data <- data %>%
      mutate(Distance = sqrt((Age - target_profile$Age)^2 +
                             (Height - target_profile$Height)^2 +
                             (Weight - target_profile$Weight)^2))

    # Find the row with the smallest distance
    closest_row <- data %>% filter(Distance == min(Distance))

    cat("Closest Row:\n")
    print(closest_row)

This is how you can find the row with the closest profile using pandas in Python, along with an equivalent approach in R using dplyr.