Posted October 4, 2018
Machine Learning (ML) uses algorithms to learn from data and make predictions based on it, much as people make decisions from experience. For example, a payment event coming from a device on which malware has been detected is more likely to be fraudulent. Likewise, if a credit card has averaged three transactions per day over the past month but suddenly makes ten transactions today, the increase in velocity indicates risk. If two of those ten transactions happened in New York while the cardholder resides in San Jose, the activity looks even more suspicious.
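To make the velocity and location signals concrete, here is a minimal Python sketch. The function names and the way each signal is expressed (a ratio and a share) are our own illustrative assumptions, not the features Smart Learning actually computes.

```python
from statistics import mean

def velocity_ratio(daily_counts_past_month, todays_count):
    """Today's transaction count relative to the card's recent daily average."""
    baseline = mean(daily_counts_past_month)
    return todays_count / baseline if baseline > 0 else float("inf")

def remote_share(transaction_cities, home_city):
    """Fraction of today's transactions made outside the cardholder's home city."""
    if not transaction_cities:
        return 0.0
    away = sum(1 for city in transaction_cities if city != home_city)
    return away / len(transaction_cities)

# The example from the text: three transactions per day on average for the
# past month, then ten today, two of them in New York for a San Jose cardholder.
history = [3] * 30
today = ["San Jose"] * 8 + ["New York"] * 2

ratio = velocity_ratio(history, len(today))
share = remote_share(today, "San Jose")
print(f"velocity ratio: {ratio:.2f}, share away from home: {share:.2f}")
```

A ratio well above 1 and a non-zero share away from home are the kind of numeric evidence a model can weigh alongside other signals.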
ML learns from past data, or experience, across many dimensions, such as device information, event velocity, geographical location, merchant information and transaction amount. Together these dimensions form a feature space in which fraud patterns can be described; ML learns these patterns and classifies a future event as fraud or not by assigning it a probability.
In this blog post, we will explore how supervised machine learning can use historical fraud tags to deliver powerful fraud prevention results.
ThreatMetrix has thousands of clients and over 40 billion annual network transactions. One key requirement for an ML product is therefore scalability: we want to build high-performing ML models with minimal human supervision. In addition, many clients have to deal with fast-evolving fraud patterns, so a short development cycle is desirable. Lastly, financial institutions and other clients prefer a clear box solution so that they can understand the decisioning process. To satisfy these needs, we built an automated, lightweight machine learning product: Smart Learning as a Service.
The above graph gives an overview of the Smart Learning architecture. The top left “API” section and the boxes below it describe feature generation, which is usually a time- and energy-consuming part of model building. For Smart Learning as a Service, we moved feature creation from offline to the real-time production environment. When a session query is conducted, all the features defined in the customer policy, global policy and others are calculated within milliseconds. This change of architecture serves two purposes. Firstly, the model development cycle is greatly shortened and the whole process is easier to automate. Secondly, an offline feature generation framework usually produces slightly different results from the production environment for various reasons; generating features in production guarantees that Smart Learning is trained and used on the same feature space.
For the “Truth Data” tab, clients have the option to upload fraud tagging using an update API. Smart Learning then performs all the necessary data extraction, merges the result with the truth data and feeds the data, in the proper format, to the model build, which in turn produces the model report. Clients then review the model report and decide whether they wish to deploy the model into production. If clients want to proceed, the automated importer uploads the model into the customer portal and produces a Smart Learning policy score and Smart Learning risk rank for real-time decisioning. The model is a clear box solution in a readable format. The whole process is fully automated and requires minimal human supervision.
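The merging step can be pictured as a simple join between session records and uploaded fraud tags. The sketch below is illustrative only: the field names (`session_id`, `is_fraud`), the tag value `"fraud"` and the rule that untagged sessions count as non-fraud are all assumptions, not the actual Smart Learning data format.

```python
def attach_truth(sessions, fraud_tags):
    """Label each session using the uploaded tags.

    Assumption: a session without a tag is treated as non-fraud.
    """
    labelled = []
    for session in sessions:
        row = dict(session)  # copy so the input records are not modified
        tag = fraud_tags.get(session["session_id"])
        row["is_fraud"] = 1 if tag == "fraud" else 0
        labelled.append(row)
    return labelled

# Hypothetical session records and a hypothetical truth-data upload.
sessions = [
    {"session_id": "s1", "amount": 25.0},
    {"session_id": "s2", "amount": 980.0},
    {"session_id": "s3", "amount": 14.5},
]
fraud_tags = {"s2": "fraud"}

training_rows = attach_truth(sessions, fraud_tags)
print([row["is_fraud"] for row in training_rows])
```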
This section discusses the model building process, drawn within the dark red rectangular box in the above graph. The following diagram shows the process in more detail:
The data quality step checks the percentage of blank values and the distribution of important fields, as well as fraud and non-fraud volumes, to make sure the fraud tagging is meaningful and properly done. It is followed by data preparation, feature selection, model build and model output, which includes a performance report and the files for model deployment.
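A data quality check of this kind might look like the following sketch. The field names, the sample rows and the pass/fail limits (`max_blank`, `min_per_class`) are hypothetical values chosen for illustration; the source does not state the limits Smart Learning applies.

```python
def blank_rate(rows, field):
    """Share of rows where the field is missing or empty."""
    if not rows:
        return 0.0
    blanks = sum(1 for row in rows if row.get(field) in (None, ""))
    return blanks / len(rows)

def quality_report(rows, fields, label="is_fraud", max_blank=0.5, min_per_class=2):
    """Summarize blank rates and class volumes, then apply the illustrative limits."""
    fraud = sum(1 for row in rows if row.get(label) == 1)
    non_fraud = len(rows) - fraud
    blanks = {field: blank_rate(rows, field) for field in fields}
    passed = (
        fraud >= min_per_class
        and non_fraud >= min_per_class
        and all(rate <= max_blank for rate in blanks.values())
    )
    return {
        "blank_rate": blanks,
        "fraud_count": fraud,
        "non_fraud_count": non_fraud,
        "passed": passed,
    }

# Hypothetical labelled rows.
rows = [
    {"device_id": "d1", "amount": 20.0, "is_fraud": 0},
    {"device_id": "", "amount": 35.5, "is_fraud": 0},
    {"device_id": "d3", "amount": None, "is_fraud": 1},
    {"device_id": "d4", "amount": 12.0, "is_fraud": 1},
]
report = quality_report(rows, ["device_id", "amount"])
print(report)
```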
ML sometimes produces counter-intuitive weights that are difficult to understand. This may be due to counter-intuitive data evidence, or to highly correlated features co-existing in the model. To correct this, Smart Learning constrains model training so that each weight keeps the same sign (+/-) as in the current policy. For extremely conservative users, Smart Learning also lets users review or modify the signs. This approach allows joint decision-making from domain knowledge, data evidence and ML algorithms.
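One generic way to enforce sign constraints is projected gradient descent: after each update, any weight that has crossed to the wrong side of zero is reset to zero. The sketch below applies this to a small logistic regression. It illustrates the idea only and is not the Smart Learning training algorithm; the toy data, the learning rate and the convention that a positive weight raises the fraud probability are all assumptions.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_sign_constrained(X, y, signs, lr=0.5, epochs=2000):
    """Logistic regression trained by projected gradient descent.

    signs[j] is +1 or -1, the sign that the weight of feature j must keep.
    After every gradient step, a weight on the wrong side of zero is set to zero.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        for j in range(d):
            w[j] -= lr * grad_w[j] / n
            if w[j] * signs[j] < 0:  # projection onto the allowed half-line
                w[j] = 0.0
    return w, b

# Toy data: columns are [malware_detected, velocity_spike]; label 1 means fraud.
# Within the malware rows, the velocity flag happens to co-occur with fewer
# fraud labels, so the data alone would push its weight negative.
X = [[1, 1], [1, 0], [1, 0], [1, 1], [0, 1], [0, 0], [0, 0], [0, 1]]
y = [0, 1, 1, 1, 0, 0, 0, 0]
signs = [+1, +1]  # domain knowledge: both signals should only raise risk

w, b = fit_sign_constrained(X, y, signs)
print("weights:", w, "bias:", b)
```

In this toy set, the projection keeps the second weight from going negative, so the fitted model never treats a velocity spike as evidence of legitimacy.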
To assess performance, we use the following metrics:
TPR – True Positive Rate, also called Detection Rate or Capture Rate. This is the percentage of fraudulent transactions that are correctly identified, and it describes the benefit of the model. For example, if there are 1000 fraudulent transactions and the model correctly identifies 750 of them, then the TPR is 75 percent. As the policy score goes up (moving towards +100), the TPR goes up but the false positive rate also goes up.
FPR – False Positive Rate, also called Non-Fraud Review Rate. This is the percentage of non-fraudulent transactions that are incorrectly flagged at a given operating score threshold, and it describes the cost or unwanted impact of the model. An FPR of 1% means that the model flags 1% of legitimate transactions as fraudulent, so further reviews need to be conducted. As the policy score goes down (moving towards -100), the FPR goes down but the detection rate also goes down.
ROC curve – Receiver Operating Characteristic curve, which describes model performance. The curve is created by plotting the true positive rate (TPR) on the y-axis against the false positive rate (FPR) on the x-axis at various threshold settings. When comparing two models, the two ROC curves are plotted on the same chart. From the comparison we can read the detection rate improvement at the same false positive rate or, conversely, the false positive reduction at the same detection rate. Sometimes we focus on the lower FPR range because it corresponds closely to the operational range.
AUC (Area Under Curve) – The area under the ROC curve. This metric falls between 0 and 1, with a higher number indicating better overall classification performance.
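The metrics above can be computed directly from scored, labelled events. The following sketch assumes a model output where a higher score means a higher fraud probability and an event is flagged when its score is at or above the threshold; that convention and the sample data are illustrative and are not taken from the policy score scale.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (TPR, FPR) when every event with score >= threshold is flagged."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives

def roc_points(scores, labels):
    """ROC curve as (FPR, TPR) points, from the strictest threshold to the loosest."""
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tpr, fpr = rates_at_threshold(scores, labels, threshold)
        points.append((fpr, tpr))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Hypothetical model scores with fraud labels (1 = fraud, 0 = non-fraud).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

tpr, fpr = rates_at_threshold(scores, labels, 0.70)
area = auc(roc_points(scores, labels))
print(f"TPR={tpr:.3f} FPR={fpr:.3f} AUC={area:.3f}")
```

Lowering the threshold moves along the curve towards the top right: more fraud is caught, and more legitimate events are sent for review.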
The following are ROC curves comparing Smart Learning performance with existing customer policy performance for selected clients. We also display the fraud detection rate (TPR) at lower false positive rates (FPR) as this is more related to clients’ operational range.
Both clients have strong customer policies, with AUC scores of 0.93 and 0.87. However, Smart Learning achieves better performance, with AUC scores of 0.96 and 0.98 respectively, and with the number of rules drastically reduced. Most importantly, Smart Learning achieves much better fraud detection rates at a low false positive rate (FPR). For Client A, at an FPR of 3%, Smart Learning achieves 81% fraud detection compared to 33% for the existing customer policy, which represents a 145% relative improvement. For Client B, at an FPR of 5%, Smart Learning achieves a 91% fraud detection rate compared to 40% for the current customer policy, corresponding to a 127% relative improvement in performance.
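The relative improvement figures follow from the detection rates quoted above. As a quick check of the arithmetic, with a helper function name of our own choosing:

```python
def relative_improvement(new_rate, old_rate):
    """Percentage gain of new_rate over old_rate."""
    return (new_rate - old_rate) / old_rate * 100.0

client_a = relative_improvement(81, 33)  # detection rates at 3% FPR
client_b = relative_improvement(91, 40)  # detection rates at 5% FPR
print(f"Client A: {client_a:.1f}%  Client B: {client_b:.1f}%")
```

This gives roughly 145.5% for Client A and 127.5% for Client B, in line with the figures quoted.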
The above graph summarizes the automated end-to-end modelling process for Smart Learning. Users can easily run machine learning algorithms customized to a ThreatMetrix environment without necessarily knowing the technical details. Smart Learning empowers fraud experts with handy machine learning tools and helps them refine fraud detection policies by combining technology and domain knowledge.