Data Analytics; Financial Services: A Case Study

Problem Statement

A regional bank in Tanzania was losing $3 million annually due to fraudulent transactions, including credit card fraud and account takeovers. The existing rule-based fraud detection system flagged too many false positives, overwhelming the fraud investigation team and delaying legitimate transactions.

Solution

The bank implemented a data analytics solution using machine learning to enhance fraud detection accuracy and efficiency.

Technical Approach:

Data Collection and ETL Process:
- Extracted data from transaction logs, customer profiles, and device information stored in an Oracle database.
- Built an ETL pipeline using Apache NiFi to process and transform data, including anonymization of sensitive customer information to comply with GDPR.
- Handled imbalanced data (fraud cases were <1% of transactions) using oversampling techniques (SMOTE) during model training.
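
As an illustration of the oversampling step, here is a minimal pure-Python sketch of the idea behind SMOTE: each synthetic fraud row is an interpolation between a real fraud row and one of its nearest fraud neighbours. The function name `smote_sketch`, its parameters, and the sample rows are hypothetical. The case study does not say which implementation was used; in practice a library version, such as the `SMOTE` class in imbalanced-learn, would normally be preferred.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by interpolating between neighbours.

    minority: list of equal-length numeric feature vectors (fraud rows only).
    Returns n_synthetic new vectors; the original rows are left untouched.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base row.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical fraud rows: [transactions in last hour, amount]
fraud_rows = [[1.0, 200.0], [1.2, 220.0], [0.9, 180.0], [1.1, 210.0]]
new_rows = smote_sketch(fraud_rows, n_synthetic=6, k=2)
```

Oversampling of this kind is normally applied to the training split only, so that synthetic rows do not leak into validation or test data.
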
Feature Engineering:
- Created features such as transaction velocity (number of transactions in the last hour), geolocation mismatches (using Haversine distance), and unusual spending patterns (z-score deviations).
- Stored features in a Redis-backed feature store for low-latency access during real-time inference.
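
The three features named above can be sketched with the standard library alone. The function names, the one-hour default window, and the handling of short histories are assumptions for illustration, not details taken from the bank's pipeline.

```python
import math
from statistics import mean, stdev

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def transaction_velocity(timestamps, now, window_seconds=3600):
    """Count transactions in the window_seconds before `now` (epoch seconds)."""
    return sum(1 for t in timestamps if now - window_seconds < t <= now)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def amount_z_score(amount, history):
    """Number of standard deviations between `amount` and past amounts."""
    if len(history) < 2:
        return 0.0  # not enough history to estimate the spread
    sigma = stdev(history)
    return 0.0 if sigma == 0 else (amount - mean(history)) / sigma
```

A geolocation mismatch could then be flagged when `haversine_km` between the card's last known location and the current transaction exceeds what is physically plausible for the elapsed time; the threshold is a design choice the case study does not specify.
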
Fraud Detection Model:
- Built an anomaly detection model using unsupervised learning with Isolation Forest (scikit-learn) to identify outliers in transaction data.
- Developed a supervised learning model using the XGBoost library to classify transactions as fraudulent or legitimate, achieving a precision of 92%.
- Combined both models in an ensemble approach to improve detection accuracy, using a weighted voting mechanism.
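
A weighted vote over the two models might look like the sketch below. It assumes both scores have already been scaled to [0, 1], with 1 meaning most suspicious (raw Isolation Forest scores would need rescaling first). The weights of 0.7 and 0.3 and the 0.5 threshold are placeholders, since the case study does not give the actual values.

```python
def ensemble_fraud_score(p_supervised, anomaly_score,
                         w_supervised=0.7, w_anomaly=0.3):
    """Weighted soft vote over two scores in [0, 1]; higher means riskier.

    p_supervised: fraud probability from the supervised classifier.
    anomaly_score: anomaly score rescaled so that 1.0 is most anomalous.
    """
    for name, value in (("p_supervised", p_supervised),
                        ("anomaly_score", anomaly_score)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    total = w_supervised + w_anomaly
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Normalise by the weight sum so the result stays in [0, 1].
    return (w_supervised * p_supervised + w_anomaly * anomaly_score) / total


def is_flagged(score, threshold=0.5):
    """Flag a transaction for review when the combined score is high enough."""
    return score >= threshold
```

For example, `ensemble_fraud_score(0.9, 0.4)` combines a confident classifier with a moderate anomaly score; in a real system the weights and threshold would be tuned on validation data against the desired precision and recall.
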
Real-Time Monitoring:
- Deployed the model as a streaming application using Apache Flink to process transactions in real time, with a latency under 100 ms.
- Integrated the model with the bank’s transaction processing system using REST APIs built with FastAPI.
Case Management Dashboard:
- Developed a dashboard using Dash (Python) to prioritize high-risk cases and provide model explanations using SHAP (SHapley Additive exPlanations) values.
- Hosted the dashboard on an internal Kubernetes cluster for scalability and security.
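
The prioritization and explanation logic behind such a dashboard can be sketched independently of Dash. The example assumes each case already carries a risk score and per-feature SHAP values computed elsewhere; the field names (`case_id`, `risk_score`, `shap`) and the sample values are hypothetical.

```python
def prioritize(cases):
    """Order cases from highest to lowest risk score."""
    return sorted(cases, key=lambda c: c["risk_score"], reverse=True)


def top_contributors(shap_values, k=3):
    """Return the k (feature, value) pairs with the largest absolute SHAP value."""
    return sorted(shap_values.items(),
                  key=lambda kv: abs(kv[1]), reverse=True)[:k]


# Hypothetical cases with precomputed SHAP values per feature.
cases = [
    {"case_id": "A", "risk_score": 0.41,
     "shap": {"velocity_1h": 0.05, "geo_distance_km": 0.30, "amount_z": -0.02}},
    {"case_id": "B", "risk_score": 0.93,
     "shap": {"velocity_1h": 0.45, "geo_distance_km": 0.10, "amount_z": 0.25}},
    {"case_id": "C", "risk_score": 0.67,
     "shap": {"velocity_1h": -0.05, "geo_distance_km": 0.02, "amount_z": 0.40}},
]

queue = prioritize(cases)
for case in queue:
    reasons = ", ".join(f"{name} ({value:+.2f})"
                        for name, value in top_contributors(case["shap"], k=2))
    print(f"{case['case_id']}  risk={case['risk_score']:.2f}  drivers: {reasons}")
```

Sorting by absolute SHAP value surfaces the features that moved the score most in either direction, which gives an investigator a short list of reasons to check first.
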

Tech Stack:

Data Storage: Oracle, Redis
Data Processing: Apache NiFi (ETL), Apache Flink (streaming)
Programming: Python
Visualization: Dash
Deployment: FastAPI, Kubernetes

Implementation:

We collaborated with the bank’s in-house tech team and functional teams to build the solution.
We ran a three-month proof of concept (PoC) in the credit card division before full deployment.
We trained the fraud team to interpret model outputs and manage flagged cases.

Impact:

Efficiency: Reduced false positives by 70%, decreasing the workload of the fraud investigation team by 40%.
Productivity: The team resolved fraud cases 50% faster due to prioritized case management and accurate flagging.
Decision-Making: The bank’s leadership gained insights into fraud trends, enabling proactive updates to security policies and customer authentication processes.
Overall Growth: Fraud losses decreased by $0.5 million annually, customer trust improved, and the bank attracted 10% more high-value customers due to enhanced security measures.

