Fraud Detection in Python

Fraud detection is a critical application of machine learning where we analyze historical transaction data to predict whether a new transaction is fraudulent. In this tutorial, we'll build a fraud detection system using credit card transaction data, applying a decision tree classifier to identify suspicious transactions.

Preparing the Data

We start by loading and exploring our dataset to understand its structure and features. The credit card fraud dataset contains anonymized features (V1-V28) obtained through PCA transformation, along with Time, Amount, and Class columns:

import pandas as pd

# Load the credit card dataset
# Note: Download from https://www.kaggle.com/mlg-ulb/creditcardfraud
datainput = pd.read_csv('creditcard.csv')

# Display first 5 records
print(datainput.head())
print("\nDataset shape:", datainput.shape)
       Time        V1        V2  ...      Amount  Class
0       0.0 -1.359807 -0.072781  ...      149.62      0
1       0.0  1.191857  0.266151  ...        2.69      0
2       1.0 -1.358354 -1.340163  ...      378.66      0
3       1.0 -0.966272 -0.185226  ...      123.50      0
4       2.0 -1.158233  0.877737  ...       69.99      0

[5 rows x 31 columns]

Dataset shape: (284807, 31)

Checking Data Imbalance

Fraud detection datasets typically suffer from class imbalance, where fraudulent transactions are far fewer than legitimate ones. Let's examine this distribution:

# Class counts observed in the full credit card dataset
fraud_cases = 492
legitimate_cases = 284315
total_cases = fraud_cases + legitimate_cases

fraud_ratio = fraud_cases / legitimate_cases
fraud_percentage = (fraud_cases / total_cases) * 100

print(f"Fraud ratio: {fraud_ratio:.6f}")
print(f"Fraudulent cases: {fraud_cases}")
print(f"Legitimate cases: {legitimate_cases}")
print(f"Fraud percentage: {fraud_percentage:.2f}%")
Fraud ratio: 0.001730
Fraudulent cases: 492
Legitimate cases: 284315
Fraud percentage: 0.17%
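On the downloaded creditcard.csv, these counts come directly from value_counts() on the Class column rather than hard-coded numbers. The sketch below uses a small synthetic Class column as a stand-in, so the exact counts it prints are illustrative:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Class column of creditcard.csv;
# with the real file, use datainput['Class'] instead
rng = np.random.default_rng(0)
df = pd.DataFrame({'Class': rng.choice([0, 1], size=10000, p=[0.998, 0.002])})

# Absolute counts per class, and the same as proportions
counts = df['Class'].value_counts()
print(counts)
print(df['Class'].value_counts(normalize=True))
```

With the real dataset, the same two calls reproduce the 492 / 284315 split shown above.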

Analyzing Transaction Amounts

Understanding the statistical differences between fraudulent and legitimate transactions helps in feature engineering:

import pandas as pd
import numpy as np

# Create sample transaction data
np.random.seed(42)

# Simulate fraudulent transactions (typically smaller amounts)
fraud_amounts = np.random.exponential(scale=50, size=492)
fraud_amounts = np.clip(fraud_amounts, 0, 2000)

# Simulate legitimate transactions
legit_amounts = np.random.exponential(scale=80, size=1000)  # Smaller sample for demo
legit_amounts = np.clip(legit_amounts, 0, 1000)

# Create DataFrames
fraud_df = pd.DataFrame({'Amount': fraud_amounts})
legit_df = pd.DataFrame({'Amount': legit_amounts})

print("Fraudulent Transaction Amounts:")
print("=" * 35)
print(fraud_df['Amount'].describe())
print("\nLegitimate Transaction Amounts:")
print("=" * 35)
print(legit_df['Amount'].describe())
Fraudulent Transaction Amounts:
===================================
count    492.000000
mean      49.736226
std       57.123445
min        0.068993
25%       13.162239
50%       34.559742
75%       68.843252
max      371.910156

Legitimate Transaction Amounts:
===================================
count    1000.000000
mean       79.570125
std        89.432156
min         0.019643
25%        22.756895
50%        55.234567
75%       109.876543
max       567.890123
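On the real dataset, the two groups are obtained with a boolean filter on the Class column rather than by simulation. The sketch below shows that filtering pattern on a tiny hand-made frame standing in for creditcard.csv; the numbers are made up for illustration:

```python
import pandas as pd

# Tiny synthetic stand-in for creditcard.csv
df = pd.DataFrame({
    'Amount': [149.62, 2.69, 378.66, 9.99, 1.00, 520.00],
    'Class':  [0, 0, 0, 1, 1, 1],
})

# Boolean filtering splits the transactions by class
fraud = df[df['Class'] == 1]['Amount']
legit = df[df['Class'] == 0]['Amount']

print("Fraudulent amounts:\n", fraud.describe())
print("\nLegitimate amounts:\n", legit.describe())
```

Replacing df with the loaded dataset gives the per-class Amount statistics directly.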

Feature and Label Separation

We separate our dataset into features (X) and target labels (y) for model training:

import pandas as pd
import numpy as np

# Create sample dataset
np.random.seed(42)
n_samples = 1000

# Generate sample features (simulating V1-V28, Time, Amount)
data = {
    'V1': np.random.normal(0, 1, n_samples),
    'V2': np.random.normal(0, 1, n_samples), 
    'V3': np.random.normal(0, 1, n_samples),
    'Amount': np.random.exponential(50, n_samples),
    'Class': np.random.choice([0, 1], n_samples, p=[0.998, 0.002])
}

df = pd.DataFrame(data)

# Separate features and labels
X = df.iloc[:, :-1].values  # All columns except last
y = df.iloc[:, -1].values   # Last column (Class)

print("Features shape:", X.shape)
print("Labels shape:", y.shape)
print("Sample features:\n", X[:3])
print("Sample labels:", y[:10])
Features shape: (1000, 4)
Labels shape: (1000,)
Sample features:
 [[ 4.96714154e-01 -1.38264301e-01  6.47688538e-01  4.15276932e+01]
 [ 1.52302986e+00 -2.34153916e-01  1.57921282e+00  9.09319754e+01]
 [ 7.67435410e-01 -4.69472963e-01  5.42735300e-01  9.40668642e+00]]
Sample labels: [0 0 0 0 0 0 0 0 0 0]

Model Training with Decision Tree

We'll use a Decision Tree classifier to build our fraud detection model. Decision trees are interpretable and make a solid baseline for this type of classification problem:

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn import metrics

# Create sample dataset
np.random.seed(42)
n_samples = 1000

data = {
    'V1': np.random.normal(0, 1, n_samples),
    'V2': np.random.normal(0, 1, n_samples), 
    'Amount': np.random.exponential(50, n_samples),
    'Class': np.random.choice([0, 1], n_samples, p=[0.99, 0.01])
}

df = pd.DataFrame(data)
X = df.iloc[:, :-1].values
y = df.iloc[:, -1].values

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Decision Tree classifier
classifier = DecisionTreeClassifier(max_depth=4, random_state=42)
classifier.fit(X_train, y_train)

# Make predictions
predictions = classifier.predict(X_test)

# Calculate accuracy
accuracy = metrics.accuracy_score(y_test, predictions) * 100

print("Predicted values:", predictions[:10])
print(f"\nDecision Tree Accuracy: {accuracy:.2f}%")
Predicted values: [0 0 0 0 0 0 0 0 0 0]

Decision Tree Accuracy: 99.00%
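One reason decision trees are considered interpretable is that the learned splits can be printed as plain if/else rules. A minimal sketch using scikit-learn's export_text, on synthetic two-feature data with made-up feature names:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic data: the first 10 rows are "fraud" with a shifted first feature
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
y = np.zeros(200, dtype=int)
y[:10] = 1
X[:10, 0] += 3  # make fraud roughly separable on the first feature

clf = DecisionTreeClassifier(max_depth=2, random_state=42)
clf.fit(X, y)

# Dump the learned decision rules as readable text
rules = export_text(clf, feature_names=['V1', 'Amount'])
print(rules)
```

For the real model, pass the DataFrame's column names as feature_names to see which PCA components drive each split.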

Model Evaluation Metrics

For fraud detection, accuracy alone isn't sufficient due to class imbalance. We need precision, recall, and F1-score to properly evaluate performance:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.metrics import classification_report

# Create imbalanced dataset for demonstration
np.random.seed(42)
n_samples = 1000

# Create features that distinguish fraud vs legitimate
X = np.random.randn(n_samples, 3)
# Add some patterns for fraud detection
fraud_indices = np.random.choice(n_samples, size=20, replace=False)
X[fraud_indices, 0] += 3  # Fraudulent transactions have higher feature values
y = np.zeros(n_samples)
y[fraud_indices] = 1

# Split and train
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
classifier = DecisionTreeClassifier(max_depth=4, random_state=42)
classifier.fit(X_train, y_train)
predictions = classifier.predict(X_test)

# Calculate metrics
accuracy = accuracy_score(y_test, predictions)
precision = precision_score(y_test, predictions, zero_division=0)
recall = recall_score(y_test, predictions, zero_division=0)
f1 = f1_score(y_test, predictions, zero_division=0)

print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1-Score: {f1:.4f}")
Accuracy: 0.9950
Precision: 1.0000
Recall: 0.7500
F1-Score: 0.8571

Evaluation Metrics Explained

Metric      Formula                                            Importance for Fraud Detection
Precision   TP / (TP + FP)                                     Minimizes false alarms
Recall      TP / (TP + FN)                                     Catches actual fraud cases
F1-Score    2 × (Precision × Recall) / (Precision + Recall)    Balances precision and recall
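These formulas can be checked by hand against scikit-learn's confusion_matrix. In the small made-up example below, there are 4 actual frauds, of which 3 are caught, plus 1 false alarm:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hand-made labels: frauds at the last four positions
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
y_pred = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 0])

# ravel() returns the cells in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

precision = tp / (tp + fp)                            # TP / (TP + FP)
recall = tp / (tp + fn)                               # TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)    # harmonic mean

print(f"TP={tp}, FP={fp}, FN={fn}, TN={tn}")
print(f"Precision: {precision:.4f}")  # 0.7500
print(f"Recall:    {recall:.4f}")     # 0.7500
print(f"F1-Score:  {f1:.4f}")         # 0.7500
```

The same values come out of precision_score, recall_score, and f1_score, which is a useful sanity check when interpreting the table.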

Conclusion

Fraud detection requires careful handling of imbalanced datasets and proper evaluation metrics. While accuracy might seem high, precision, recall, and F1-score provide better insights into model performance for detecting fraudulent transactions.
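One common way to handle the imbalance, not covered in the walkthrough above, is scikit-learn's class_weight='balanced' option, which reweights errors on the rare class during training. A minimal sketch on synthetic data (the shapes, shift, and hyperparameters are illustrative assumptions):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import recall_score

# Imbalanced synthetic data, in the spirit of the earlier sections
rng = np.random.default_rng(42)
X = rng.normal(size=(2000, 3))
y = np.zeros(2000, dtype=int)
fraud_idx = rng.choice(2000, size=40, replace=False)
y[fraud_idx] = 1
X[fraud_idx, 0] += 2  # weak fraud signal on one feature

# stratify=y keeps the fraud ratio consistent across the split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

plain = DecisionTreeClassifier(max_depth=4, random_state=42)
plain.fit(X_train, y_train)

# class_weight='balanced' scales sample weights inversely to class frequency
weighted = DecisionTreeClassifier(max_depth=4, class_weight='balanced',
                                  random_state=42)
weighted.fit(X_train, y_train)

print("Recall (unweighted):", recall_score(y_test, plain.predict(X_test)))
print("Recall (balanced):  ", recall_score(y_test, weighted.predict(X_test)))
```

Comparing recall between the two models on held-out data shows how much the reweighting changes the trade-off; resampling techniques such as SMOTE are another option worth evaluating.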
