At Simon-Kucher & Partners, we strive to develop innovative and cutting-edge approaches to unleash revenue potential through the use of machine learning algorithms. We support our clients in designing and implementing solutions on a wide range of topics, from increasing up- and cross-selling potential and reducing customer churn to adopting dynamic pricing.
How can machine learning be effectively implemented in a way that helps generate a competitive advantage? In general, successfully implementing machine learning can be divided into four essential phases:
- Set the goal and determine the scope and strategy
- Select and pre-process data
- Select a machine learning algorithm
- Implement the findings
Figure 1: Workflow of machine learning projects
Set the target, determine scope and strategy
The first step is to set a specific and quantifiable target, e.g. increasing cross-selling rates or reducing customer churn. Defining the target with the SMART framework has proven to be good practice. Based on this target, the scope of the machine learning campaign can be derived. Without prior experience, setting the scope generally proves to be a non-trivial endeavour. Start with the existing sales processes and identify the process steps that are suitable for incentivising customer behaviour.
The machine learning algorithms need to be embedded into the workflow in a way that depends on the sales channel: digital or via the client relationship manager. For the latter, the algorithm must be compatible with the employee’s way of working. For digital channels, information that influences customer behaviour must not be perceived as intrusive. Once the scope has been defined, the next step involves determining the strategy and the specific measures that are most likely to achieve the desired target. Then, a divide-and-conquer approach helps to break the problem down into smaller, more manageable problems, which can be addressed individually with different machine learning methods.
Quality shortcomings are common when determining the scope and splitting problems into smaller tasks. As a consequence, the results of the subsequent phases are likely to be unsatisfactory and can jeopardise the whole endeavour. This is generally due to a failure to combine machine learning with financial industry knowledge. A team consisting of a business specialist and a data scientist might mitigate the negative impact, but cannot remedy it completely.
Selecting and processing data
After the framework conditions have been determined, the next step involves defining the data requirements. Machine learning algorithms attempt to recognise patterns in the data they are given and exploit them. As a rule, the algorithms themselves decide which information to give more weight to and which data is more relevant to solving the problem. However, selecting the data is not sufficient. The data must also be processed and quantified so that the algorithm can use it. Unstructured data, such as texts from email correspondence with customers, should be processed in a goal-oriented manner so that machine learning algorithms can handle it. If the processing cannot be automated, the added value generated by machine learning projects will be severely limited.
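As an illustration of what "quantifying" unstructured text can mean, the following minimal sketch converts email texts into word-count vectors (a simple bag-of-words representation). It is only one of many possible approaches and is not taken from a specific project; the function names and the sample emails are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents):
    """Map every distinct token to a fixed column index (sorted for stability)."""
    tokens = {tok for doc in documents for tok in tokenize(doc)}
    return {tok: idx for idx, tok in enumerate(sorted(tokens))}


def vectorize(text, vocabulary):
    """Turn one text into a list of word counts, one entry per vocabulary token."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for tok, idx in vocabulary.items():
        vector[idx] = counts.get(tok, 0)
    return vector


# Hypothetical email snippets, invented for illustration only.
emails = [
    "Please cancel my account",
    "I would like a second account and a credit card",
]
vocab = build_vocabulary(emails)
features = [vectorize(mail, vocab) for mail in emails]
```

Each email becomes a numeric vector of equal length, which is the form most algorithms expect. Because the whole step is a function of the raw text, it can run automatically whenever new correspondence arrives.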
Selecting machine learning algorithms
The existing framework conditions enable a preliminary selection of potential machine learning algorithms and their validation metrics. Digital channels where customers act autonomously are better suited to supervised methods, because the application is decoupled from the training, which allows the algorithm to make decisions faster in production. Once the potential algorithms have been selected, the relevant parameters must be estimated. Training and validation can be used to determine the best possible combination of algorithms for the use case. It is recommended to start with algorithms that require few or no parameters to be estimated. These can serve as a baseline for the more sophisticated algorithms. When selecting machine learning algorithms, a combination of industry expertise, data science knowledge and experience is key to achieving rapid and effective results and to avoiding flaws such as overfitting and data snooping.
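The idea of starting with a baseline and comparing candidates on held-out validation data can be sketched as follows. This is an illustrative example under simplifying assumptions, not a recommended model: the data set is invented, the baseline always predicts the most frequent class, and the "more sophisticated" candidate is a single threshold rule whose one parameter is estimated on the training data only.

```python
from collections import Counter


def accuracy(predictions, labels):
    """Share of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def fit_majority_baseline(labels):
    """Parameter-free baseline: always predict the most frequent training label."""
    return Counter(labels).most_common(1)[0][0]


def fit_threshold_rule(values, labels):
    """One-parameter model: predict 1 when the feature is >= threshold.

    The threshold is chosen to maximise accuracy on the training data only.
    """
    best_threshold, best_score = None, -1.0
    for candidate in sorted(set(values)):
        preds = [int(v >= candidate) for v in values]
        score = accuracy(preds, labels)
        if score > best_score:
            best_threshold, best_score = candidate, score
    return best_threshold


# Hypothetical data: (number of products held, bought an extra product: 1/0).
train = [(1, 0), (1, 0), (2, 0), (2, 0), (3, 1), (4, 1), (1, 0), (5, 1)]
valid = [(1, 0), (2, 0), (3, 1), (4, 1), (2, 1), (1, 0)]

train_x, train_y = zip(*train)
valid_x, valid_y = zip(*valid)

# Both models are fitted on the training data only.
majority_label = fit_majority_baseline(train_y)
threshold = fit_threshold_rule(train_x, train_y)

# Both models are compared on validation data they have not seen before.
baseline_acc = accuracy([majority_label] * len(valid_y), valid_y)
rule_acc = accuracy([int(v >= threshold) for v in valid_x], valid_y)
```

A more complex algorithm is only worth its additional effort if it clearly beats the baseline on the validation data. Evaluating on data that was not used for fitting is also a basic safeguard against overfitting.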
Implementing the results
If the modelled algorithms return promising results, these findings must be embedded into production. The implementation is usually carried out as a separate project. The decision-making workflow of the machine learning processes is slightly adapted: the steps in the data processing phase are tied to steps in the algorithm validation phase, and for supervised methods the training of the algorithms becomes a support process. As a result, the training is decoupled from the application in the core process.
Support processes: The lubricant for the machine learning methods
After the machine learning methods have been embedded into the core process, care must be taken to ensure that these methods continue to deliver valid results in the future. It is essential to continuously collect data, prepare it in a targeted manner and regularly retrain and recalibrate the algorithms (see figure 2 for an overview).
Figure 2: Support processes for machine learning projects
Data is essential for machine learning algorithms. Missing or faulty data inevitably results in sub-optimal performance. In such cases, it is therefore important to consider third-party suppliers for data acquisition. Ideally, however, financial institutions should leverage their own customer databases and internal data sources rather than rely solely on external sources.
Banks typically gather their data in a very structured manner for the purposes of standardised reports or audits. Scope and procedure rarely go beyond the regulatory requirements, and other, seemingly “unnecessary” data is not collected systematically. Nevertheless, including such data can be a game changer and lead to considerable profit.
Optimising data processing
The structure of the collected data is not rigid. Processing and quantification methods must be developed continuously so that new data sources can be incorporated into the algorithms’ calculations. In addition, existing data preparation steps need to be revised: as more computing power becomes available, more sophisticated data processing methods can be applied. Furthermore, organisational prerequisites must be created to make data more readily available and to keep data updates as uncomplicated as possible. This can be implemented with data lakes, which underpin most big data ventures.
Training and recalibrating algorithms
In this step, production algorithms must be retrained at regular intervals so that they can react to new patterns. The frequency with which training and recalibration have to be carried out varies depending on the actual use case.
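One simple way to decide when retraining is due is to combine a maximum model age with a check on recent performance. The sketch below is illustrative only: the function name, the 90-day limit and the tolerance of 0.05 are assumptions chosen for the example and would need to be set per use case.

```python
from datetime import date, timedelta


def needs_retraining(last_trained, today, recent_accuracy, reference_accuracy,
                     max_age=timedelta(days=90), tolerance=0.05):
    """Return True when the model is too old or its accuracy has degraded.

    max_age and tolerance are illustrative defaults, not recommendations.
    """
    # Rule 1: retrain at regular intervals, regardless of performance.
    too_old = today - last_trained > max_age
    # Rule 2: retrain early if recent accuracy falls clearly below the
    # accuracy measured when the model was validated.
    degraded = recent_accuracy < reference_accuracy - tolerance
    return too_old or degraded


# Hypothetical monitoring snapshots.
recently_trained = needs_retraining(date(2024, 1, 1), date(2024, 2, 1), 0.80, 0.82)
outdated = needs_retraining(date(2024, 1, 1), date(2024, 6, 1), 0.80, 0.82)
drifted = needs_retraining(date(2024, 1, 1), date(2024, 2, 1), 0.70, 0.82)
```

The age rule covers regular retraining, while the performance rule allows an earlier reaction when customer behaviour changes faster than expected.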
Successfully implementing machine learning solutions depends on the one hand on structuring the problem correctly and on the other hand on combining knowledge of the financial industry with machine learning expertise. The latter ensures that the right decisions are taken during the strategy phase, that data is adequately selected and pre-processed, and that valid results are attained rapidly. When financial institutions are at the beginning of their learning curve but still want to progress quickly, the involvement of a consulting company is advisable.