Fraud Detection using Analytics in R
📅 September 17-18, 2020 (9am-5pm)
🌍 English
Course description
The Association of Certified Fraud Examiners estimates that fraud costs organizations worldwide $3.7 trillion a year and that a typical company loses five percent of annual revenue to fraud. Fraud attempts are expected to increase further, making fraud detection essential in most industries. This course shows how fraud patterns learned from historical data can be used to fight fraud. Combining theoretical and practical insights, it discusses the use of predictive analytics (on a labeled dataset) and descriptive analytics (on an unlabeled dataset).
Two main challenges when building a supervised tool for fraud detection are the imbalance or skewness of the data and the different costs attached to different types of misclassification. We present methodologies to address both. Moreover, we present techniques from robust statistics and digit analysis to detect unusual observations that are likely to be associated with fraud.
The discussed techniques can be applied across a wide variety of fraud applications, such as insurance fraud, credit card fraud, anti-money laundering, healthcare fraud, telecommunications fraud, click fraud, tax evasion, and counterfeiting. Various real-life case studies and examples are presented to illustrate the methodologies at work (using the R programming language) and to show how and why these machine learning tools complement traditional expert-based fraud-detection approaches.
Course outline
Chapter 1: Introduction and motivation
- Importance of fraud detection
- Defining fraud
- Types of fraud
- Fraud detection challenges
- Fraud analytics process model
Chapter 2: Data preprocessing
- Types of variables
- Visual data exploration
- Missing values
- Standardizing and transforming data
- Coarse classification
Chapter 3: Featurization and Social Network Analysis
- Traditional features for fraud detection
- Creating interesting features based on
  - Time
  - Frequency
  - Recency
- Social Network Analysis
  - Social network components and characteristics
  - Is fraud a social phenomenon?
  - Social network metrics
  - Adding features based on social networks
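As a rough illustration of the frequency and recency features listed in this chapter, the sketch below derives both per customer in base R. The `tx` data frame, its column names, and the reference date are made up for the example and are not taken from the course material.

```r
# Toy transaction log (hypothetical values, for illustration only).
tx <- data.frame(
  customer = c("A", "A", "A", "B", "B", "C"),
  date     = as.Date(c("2020-01-05", "2020-02-10", "2020-03-01",
                       "2020-01-20", "2020-02-28", "2020-03-10")),
  amount   = c(20, 35, 500, 60, 55, 1200)
)

# Assumed "today" against which recency is measured.
reference_date <- as.Date("2020-03-15")

# Frequency: number of transactions per customer.
frequency <- tapply(tx$amount, tx$customer, length)

# Recency: days between the reference date and the most recent transaction.
days_ago <- as.numeric(reference_date - tx$date)
recency  <- tapply(days_ago, tx$customer, min)

# One row of engineered features per customer.
features <- data.frame(
  customer     = names(frequency),
  frequency    = as.integer(frequency),
  recency_days = as.numeric(recency)
)
print(features)
```

Features like these would typically be joined back onto the transaction or customer table before modeling.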
Chapter 4: Dealing with imbalanced datasets
- Random oversampling (ROS) of minority class
- Random undersampling (RUS) of majority class
- Synthetic Minority Over-sampling Techniques (SMOTE & MWMOTE)
- Diversified Sensitivity-based Under-Sampling (DSUS) or cluster-based under-sampling (CLUS)
- Combining over- and under-sampling
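To make the first technique in this chapter concrete, here is a minimal base R sketch of random oversampling (ROS): minority-class rows are resampled with replacement until both classes are equally frequent. The data are simulated, and `random_oversample()` is a hypothetical helper written for this example, not a function from any package.

```r
set.seed(42)

# Simulated labeled data: 95 legitimate (0) and 5 fraudulent (1) cases.
transactions <- data.frame(
  amount = c(rexp(95, rate = 1 / 50), rexp(5, rate = 1 / 500)),
  fraud  = c(rep(0, 95), rep(1, 5))
)

# Hypothetical helper: duplicate randomly chosen minority rows
# until the minority class is as large as the majority class.
random_oversample <- function(data, label, minority = 1) {
  is_minority <- data[[label]] == minority
  n_needed <- sum(!is_minority) - sum(is_minority)
  if (n_needed <= 0) return(data)
  minority_rows <- which(is_minority)
  # Draw positions with replacement, then map them to row indices.
  picks <- sample.int(length(minority_rows), n_needed, replace = TRUE)
  rbind(data, data[minority_rows[picks], , drop = FALSE])
}

balanced <- random_oversample(transactions, "fraud")
print(table(balanced$fraud))
```

In practice, resampling of this kind is usually applied to the training set only, so that the test set keeps the original class distribution.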
Chapter 5: Supervised techniques for fraud detection
- Linear and logistic regression
- Decision trees and ensemble methods
- Neural networks
- Evaluating fraud detection models
Chapter 6: Unsupervised techniques for fraud detection
- Digit analysis using Benford’s Law
- Multivariate outlier detection using robust statistics
- Clustering approaches
- Dimension reduction techniques
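As a small illustration of the digit analysis topic in this chapter, the sketch below compares the first-digit distribution of a set of amounts with the one expected under Benford's Law, P(d) = log10(1 + 1/d). The amounts are simulated from a log-normal distribution purely for demonstration, and `first_digit()` is a hypothetical helper defined here; base R's `chisq.test()` is used as one possible goodness-of-fit check.

```r
set.seed(1)

# Expected first-digit probabilities under Benford's Law.
benford_expected <- log10(1 + 1 / (1:9))

# Hypothetical helper: leading non-zero digit of each non-zero, finite number,
# read off the mantissa of its scientific-notation representation.
first_digit <- function(x) {
  x <- abs(x[is.finite(x) & x != 0])
  as.integer(substr(formatC(x, format = "e", digits = 15), 1, 1))
}

# Simulated amounts spanning several orders of magnitude (illustrative only).
amounts <- rlnorm(5000, meanlog = 5, sdlog = 2)

# Observed counts per leading digit, keeping all nine categories.
observed <- table(factor(first_digit(amounts), levels = 1:9))

# Side-by-side comparison of observed and expected proportions.
comparison <- data.frame(
  digit    = 1:9,
  observed = as.numeric(observed) / sum(observed),
  expected = benford_expected
)
print(round(comparison, 3))

# Chi-squared goodness-of-fit test against the Benford distribution.
fit <- chisq.test(observed, p = benford_expected)
print(fit$p.value)
```

A small p-value would indicate that the digits deviate from Benford's Law, which is a reason to look more closely at the data rather than proof of fraud.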
👩‍🏫 Lecturers
Prof. dr. Tim Verdonck
Professor at University of Antwerp
Prof. dr. Bart Baesens
Professor at KU Leuven
🏢 Location
Van der Valk Hotel Brussels Airport (Belgium)
Culliganlaan 4b
1831 Diegem
Belgium
hotelbrusselsairport.com
🏫 Organizer
💼 Register
This course has already taken place; registration is no longer possible.