Detected fraudulent transactions with Isolation Forest (fraud recall 0.82) and an Autoencoder (fraud precision 0.91) on a dataset of 284,807 credit card transactions.
The objective of this project is to detect fraudulent credit card transactions using Python. Fraudulent transactions are extremely rare (492 of 284,807, about 0.17%), which makes the dataset highly imbalanced. The project applies unsupervised anomaly detection to identify fraud patterns and visualizes the detected anomalies so the results are easier to interpret.
The project uses the Credit Card Fraud Detection Dataset from Kaggle:
- Source: Kaggle – Credit Card Fraud Detection
- Number of transactions: 284,807
- Features:
  - `Time`: seconds elapsed between each transaction and the first transaction in the dataset
  - `Amount`: transaction amount
  - `V1` to `V28`: anonymized PCA features
  - `Class`: target variable (0 = Normal, 1 = Fraud)
- Fraud cases: 492 (~0.17%)
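To make the imbalance concrete, here is a minimal sketch that rebuilds the `Class` column from the two counts quoted above rather than reading the Kaggle CSV. The helper name `class_balance` is hypothetical and not part of the notebook.

```python
import pandas as pd


def class_balance(labels: pd.Series) -> pd.DataFrame:
    """Return the count and percentage share of each class label."""
    counts = labels.value_counts().sort_index()
    share = (counts / counts.sum() * 100).round(4)
    return pd.DataFrame({"count": counts, "percent": share})


# Rebuild the label column from the counts listed above (no CSV needed).
n_total, n_fraud = 284_807, 492
labels = pd.Series([0] * (n_total - n_fraud) + [1] * n_fraud, name="Class")

balance = class_balance(labels)
print(balance)
```

With the real data, the same helper could be called on `df["Class"]` after loading the CSV with `pd.read_csv`.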
- Isolation Forest Recall (Fraud): 0.82
- Unsupervised tree-based anomaly detection algorithm
- Isolates outliers by partitioning data using random trees
- Flags transactions as anomalies (fraud) without using class labels
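The Isolation Forest idea described above can be sketched with scikit-learn as follows. The data is synthetic, and the `contamination` and `n_estimators` values are assumptions for illustration, not the settings used in the notebook.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
# Synthetic stand-in data: 1,000 "normal" points near the origin, 10 distant outliers.
X_normal = rng.normal(loc=0.0, scale=1.0, size=(1000, 5))
X_outlier = rng.normal(loc=8.0, scale=1.0, size=(10, 5))
X_all = np.vstack([X_normal, X_outlier])

# contamination is the expected share of anomalies; 0.01 is an assumed value.
iso = IsolationForest(n_estimators=100, contamination=0.01, random_state=42)
iso.fit(X_all)  # no class labels are passed

raw_pred = iso.predict(X_all)            # +1 = inlier, -1 = anomaly
is_fraud = (raw_pred == -1).astype(int)  # map to the dataset's 0/1 convention
print("flagged:", is_fraud.sum(), "of", len(X_all))
```

Points that sit far from the bulk of the data need fewer random splits to isolate, so they receive lower scores and are the first to be flagged.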
- Autoencoder Precision (Fraud): 0.91
- Neural network-based unsupervised model
- Learns to reconstruct normal transaction patterns
- Transactions with high reconstruction error are flagged as fraud
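The reconstruction-error idea can be sketched without TensorFlow. The example below uses scikit-learn's `MLPRegressor` trained to reproduce its own input as a stand-in for the Keras autoencoder; it is not the notebook's model. The data is synthetic, and the bottleneck size and the 99th-percentile threshold are assumed choices.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Synthetic stand-in data (not the Kaggle dataset).
X_train_normal = rng.normal(0.0, 1.0, size=(2000, 8))  # normal-only training rows
X_test_normal = rng.normal(0.0, 1.0, size=(500, 8))
X_test_fraud = rng.normal(6.0, 1.0, size=(20, 8))      # shifted "fraud" rows

# A 4-unit bottleneck forces a compressed representation of the 8 inputs.
autoencoder = MLPRegressor(hidden_layer_sizes=(4,), activation="tanh",
                           max_iter=500, random_state=0)
autoencoder.fit(X_train_normal, X_train_normal)  # target equals input


def reconstruction_error(model, X):
    """Mean squared error between each row and its reconstruction."""
    return np.mean((X - model.predict(X)) ** 2, axis=1)


# Threshold: 99th percentile of the errors on normal training rows (assumed choice).
threshold = np.percentile(reconstruction_error(autoencoder, X_train_normal), 99)

err_normal = reconstruction_error(autoencoder, X_test_normal)
err_fraud = reconstruction_error(autoencoder, X_test_fraud)
flag_fraud = err_fraud > threshold
print(f"threshold={threshold:.3f}, fraud flagged={flag_fraud.mean():.0%}")
```

Raising the threshold flags fewer transactions, which tends to trade recall for precision. The same thresholding logic applies unchanged to a Keras model's `predict` output.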
- Scaled `Amount` and `Time` features using `StandardScaler`
- Kept PCA features (V1–V28) as-is
- Prepared feature matrix `X` and target vector `y`
- Trained models using only normal transactions for unsupervised learning
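These preprocessing steps could look roughly like the sketch below. It builds a small synthetic frame with the dataset's column layout; all values are made up, and the variable names are assumptions rather than the notebook's own.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Tiny synthetic frame with the dataset's column layout (values are made up).
rng = np.random.default_rng(1)
n_rows = 200
df = pd.DataFrame(rng.normal(size=(n_rows, 28)),
                  columns=[f"V{i}" for i in range(1, 29)])
df["Time"] = np.sort(rng.uniform(0, 172_800, n_rows))
df["Amount"] = rng.exponential(scale=80.0, size=n_rows)
df["Class"] = 0
df.loc[df.index[:4], "Class"] = 1  # a handful of "fraud" rows

# Scale only Amount and Time; V1..V28 are kept as-is.
df[["Amount", "Time"]] = StandardScaler().fit_transform(df[["Amount", "Time"]])

X = df.drop(columns="Class")  # feature matrix
y = df["Class"]               # target vector, used only for selection and evaluation

# Normal-only rows for fitting the anomaly detectors.
X_train_normal = X[y == 0]
print(X.shape, y.value_counts().to_dict(), X_train_normal.shape)
```

The labels in `y` never reach the models as training targets; they only select the normal rows and later score the predictions.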
- Evaluated using confusion matrix and classification report
- Key metric focus: Recall for fraud class (class 1)
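As an illustration of the evaluation step, the sketch below converts Isolation Forest style outputs (+1 / -1) to the dataset's 0/1 labels and computes the confusion matrix and fraud recall. The label arrays are hypothetical toy values, not results from the project.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix, recall_score

# Hypothetical true labels and raw detector outputs (+1 = inlier, -1 = anomaly).
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
raw_pred = np.array([1, 1, 1, 1, 1, -1, -1, -1, -1, 1])

# Convert to the dataset's convention: 1 = fraud, 0 = normal.
y_pred = np.where(raw_pred == -1, 1, 0)

cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
fraud_recall = recall_score(y_true, y_pred, pos_label=1)

print(cm)  # rows = true class, columns = predicted class
print(classification_report(y_true, y_pred,
                            target_names=["Normal", "Fraud"], digits=3))
print("fraud recall:", fraud_recall)
```

Recall for class 1 is the share of actual fraud cases that were flagged, which is why it is the metric of interest when missed fraud is costlier than a false alarm.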
- Isolation Forest reached a fraud recall of 0.82, while the Autoencoder reached a fraud precision of 0.91
- Both models handle highly imbalanced datasets without oversampling
Shows severe imbalance between normal and fraud transactions
Highlights difference in reconstruction errors between normal and fraud transactions
Visualizes fraud (outliers) and normal transactions in 2D
Shows model’s prediction overlay on PCA-reduced data
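A 2D PCA scatter plot of this kind could be produced along the following lines. The features and labels are synthetic, and the colors, sizes, and titles are assumptions rather than the notebook's actual styling.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside Jupyter
import matplotlib.pyplot as plt
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(7)
# Synthetic stand-in features and labels (not the Kaggle data).
X_demo = np.vstack([rng.normal(0, 1, (500, 10)), rng.normal(5, 1, (15, 10))])
y_demo = np.array([0] * 500 + [1] * 15)

# Project onto the first two principal components for plotting.
X_2d = PCA(n_components=2, random_state=7).fit_transform(X_demo)

fig, ax = plt.subplots(figsize=(6, 4))
ax.scatter(*X_2d[y_demo == 0].T, s=8, alpha=0.4, label="Normal")
ax.scatter(*X_2d[y_demo == 1].T, s=20, color="red", label="Fraud")
ax.set_xlabel("PC 1")
ax.set_ylabel("PC 2")
ax.set_title("Transactions in 2D PCA space")
ax.legend()
fig.tight_layout()
```

To overlay model predictions instead of true labels, the same plot can be drawn with the predicted 0/1 array in place of `y_demo`.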
- Install dependencies: `pip install -r requirements.txt`
- Open `fraud_detection.ipynb` in Jupyter Notebook
- Run all cells sequentially to reproduce results and visualizations.
- Languages: Python 3.11
- Data Analysis & Visualization: Pandas, NumPy, Matplotlib, Seaborn
- Machine Learning: Scikit-learn (Isolation Forest, PCA)
- Deep Learning: TensorFlow / Keras (Autoencoder)
- Environment: Jupyter Notebook, VS Code
- Version Control: Git / GitHub
- Built an end-to-end credit card fraud detection system using Isolation Forest and Autoencoder
- Handled a highly imbalanced dataset and standardized the `Amount` and `Time` features with `StandardScaler`
- Visualized anomalies with PCA for clear insights
- Achieved a fraud recall of 0.82 with Isolation Forest and a fraud precision of 0.91 with the Autoencoder, demonstrating data analysis, machine learning, and deep learning skills
- Fully reproducible and ready for portfolio/GitHub showcase
Shadan Tech
Data Analyst
🔗 LinkedIn Profile
🔗 Tableau Public Profile
🔗 Newsletter
If you found this project insightful, give it a ⭐ Star on GitHub — it helps others discover it too!
Connect on LinkedIn for more Power BI, Tableau, and Data Analytics projects.


