Big Data & Machine Learning

Posted by Maurik van den Heuvel on 14 Mar 2018

Big Data, Machine Learning

In our previous blog post we explored the concept of Big Data, briefly discussed the phases of a Big Data solution, and showed how the so-called association algorithm lets you advise customers on the next product they might want to purchase. In this post I will briefly discuss different types of Big Data algorithms and machine learning models. For those who are not very familiar with models and algorithms, I have added a practical example: a fraud detection solution in healthcare that we developed last year. Who knows, it might give you some ideas for your own application.

The first experiment with machine learning

Machine learning is a form of artificial intelligence (AI) in which the 'machine' learns automatically and keeps improving itself without being explicitly programmed to do so. The term was coined in the 1950s by Arthur Samuel, who tried a number of different methods to teach a computer how to win a game of checkers. Samuel distinguished two types of learning. The first is rote learning: the computer saves each move in the game together with the score of that move, so that it can make the best choice later when the same situation occurs. The second is learning by generalization: the computer does not store all possible outcomes, only generalized rules. By repeating this in an iterative process, those rules become better and better.
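
To make the difference between the two types of learning concrete, here is a minimal Python sketch. It is not Samuel's program: the positions, the two features (piece advantage and king advantage), the scores and the learning rate are all made up for illustration. Rote learning is a lookup table; generalization keeps only two weights and nudges them a little after every example.

```python
# Rote learning: remember the score of every position seen so far.
rote_memory = {}

def rote_learn(position, score):
    rote_memory[position] = score

def rote_lookup(position):
    # Returns None for a position that was never stored.
    return rote_memory.get(position)

# Generalization: keep only one weight per feature instead of every outcome.
# A position is represented here as a hypothetical feature tuple:
# (piece advantage, king advantage).
weights = [0.0, 0.0]

def predict(position):
    return sum(w * x for w, x in zip(weights, position))

def generalize(position, score, learning_rate=0.05):
    # Move each weight a small step in the direction that reduces the error.
    error = score - predict(position)
    for i, x in enumerate(position):
        weights[i] += learning_rate * error * x

# Made-up training examples: (position features, observed score).
training = [((1, 0), 1.0), ((2, 0), 2.0), ((0, 1), 3.0), ((1, 1), 4.0)]

# Iterate many times so the generalized rules keep improving.
for _ in range(500):
    for position, score in training:
        rote_learn(position, score)
        generalize(position, score)

unseen = (2, 1)
print(rote_lookup(unseen))        # None: this position was never stored
print(round(predict(unseen), 2))  # the learned weights still give an estimate
```

The lookup table has nothing to say about a position it has never seen, while the generalized rule still produces an estimate. That is the essence of the second type of learning.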

Successful experiment

Thus it has become possible to make statements about situations of which we have little or no knowledge, based on things we know from the past. And I think that is a nice definition of machine learning: "to say something meaningful about things we do not know, based on things we do know." That is what Samuel's experiment comes down to. Anyone who wants to know exactly how it works can read Samuel's full article.

The Azure Machine Learning Platform

Nowadays there are many different tools and platforms for implementing machine learning projects. A platform I am happy with is Azure Machine Learning from Microsoft, mainly because it has a visually attractive and easy-to-use interface. Moreover, it is relatively easy to publish trained machine learning models as a web service, so that you can call them from other applications. Very cool!

Azure Machine Learning distinguishes four families of machine learning algorithms. Not exhaustive, but a good start! Each family contains different methods that can be used to achieve a certain goal:

Anomaly detection: identifying unusual data points, for example for fraud detection.
Clustering: the discovery of structure in data, for example to be able to divide consumers into different segments and thus define separate marketing strategies.
Classification: predicting which of two or more categories something belongs to, for example whether a client of a bank will or will not pay back a loan.
Regression: predicting a numerical value, for example how many more bags of chips you will sell if you lower the price by 10%.
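
For readers who like to see things in code, the four families can be sketched in a few lines of plain Python. These are deliberately simple stand-ins (a z-score, two-group k-means, a nearest-centroid rule and a least-squares line), not the algorithms Azure Machine Learning ships, and every number below is made up for illustration. The sketch assumes each list contains at least two distinct values.

```python
from statistics import mean, stdev

def anomalies(values, threshold=2.0):
    """Anomaly detection: flag values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), stdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]

def two_means(values, iterations=10):
    """Clustering: split one-dimensional values into two groups (k-means with k=2)."""
    low, high = min(values), max(values)
    for _ in range(iterations):
        a = [v for v in values if abs(v - low) <= abs(v - high)]
        b = [v for v in values if abs(v - low) > abs(v - high)]
        low, high = mean(a), mean(b)
    return sorted(a), sorted(b)

def nearest_centroid(train, x):
    """Classification: predict the label whose average feature value is closest to x."""
    labels = {label for _, label in train}
    centroids = {lbl: mean([v for v, l in train if l == lbl]) for lbl in labels}
    return min(centroids, key=lambda lbl: abs(x - centroids[lbl]))

def fit_line(xs, ys):
    """Regression: least-squares line y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Anomaly detection: one claim amount is far out of line with the rest.
claim_amounts = [100, 110, 95, 105, 102, 98, 900]
print(anomalies(claim_amounts))

# Clustering: monthly spending falls into two consumer segments.
spending = [10, 12, 11, 50, 52, 49]
print(two_means(spending))

# Classification: (debt-to-income ratio, outcome) for earlier loans.
loans = [(0.1, "repays"), (0.2, "repays"), (0.8, "defaults"), (0.9, "defaults")]
print(nearest_centroid(loans, 0.25))

# Regression: price discount in percent versus bags of chips sold.
discounts = [0, 5, 10, 15]
bags_sold = [100, 110, 120, 130]
slope, intercept = fit_line(discounts, bags_sold)
print(slope * 10)  # extra bags expected at a 10% discount
```

Note that only classification and regression need labelled examples from the past; anomaly detection and clustering work on the raw data alone.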

A practical example: fraud detection in healthcare

Last year we developed an application for a health insurance company outside Europe that uses machine learning methods to identify care providers who submit fraudulent claims. What I find particularly interesting about this application is that machine learning is used here as part of a complete system. As a result, no human needs to be involved between the moment a claim is submitted and the moment an appointment is made with the practitioner who submitted it to discuss the conclusions. Of course, human involvement remains possible if desired.

The phases of this solution are as follows:

1) Data sources: Data about the care provider and continuously updated claim data. In this case the care providers are dentists.

2) Integration: The phase in which the data is moved to the servers where further analysis can take place.

3) Data stores: The databases where the analysis is performed. We use databases that are specially designed for analysis. This makes it more efficient, and we do not have to disrupt the source systems by querying for analysis.

4) Analytical methods and techniques:

a) Pre-clustering dentists into groups with a similar profile to make peer comparison possible. Think of the size of the practice, but also of different focus profiles such as 'Orthodontics', 'Children' or 'Prostheses'.

b) Identifying unusual claims as soon as they arrive.

c) Categorizing health care providers into different fraud risk groups.

5) Data visualization: The results of the various analyses appear in a report indicating the reasons why a claim is considered to be fraudulent or wrong.

6) Integration into the business process: The final step in the application is the automatic generation of a letter to the care provider in which the conclusions are presented, and an appointment is made to discuss the case at the office of the insurance company.
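
The three analytical steps in phase 4 can be sketched roughly as follows. This is a toy illustration, not the system we built: the dentist IDs, amounts, the "three times the peer median" rule and the risk thresholds are all hypothetical, and the peer groups are simply given here, whereas the real solution derives them by clustering.

```python
from collections import Counter
from statistics import median

# Step a (stand-in): every dentist already belongs to a peer group.
# In the solution described above these groups come out of a clustering step.
peer_group = {
    "D1": "Orthodontics", "D3": "Orthodontics",
    "D4": "Children", "D6": "Children",
}

# Hypothetical historical claim amounts per peer group.
history = {
    "Orthodontics": [1000, 1100, 950, 1050],
    "Children": [200, 220, 210, 190],
}

def is_unusual(dentist, amount, factor=3.0):
    """Step b: flag a claim that exceeds `factor` times the peer-group median."""
    return amount > factor * median(history[peer_group[dentist]])

def risk_category(flag_count):
    """Step c: map the number of flagged claims to a (made-up) risk group."""
    if flag_count >= 2:
        return "high"
    if flag_count == 1:
        return "medium"
    return "low"

# Claims are scored one by one as they arrive: (dentist, amount).
incoming = [("D3", 4000), ("D3", 5000), ("D1", 1020), ("D6", 900), ("D4", 205)]

flags = Counter()
for dentist, amount in incoming:
    flags[dentist] += is_unusual(dentist, amount)  # True counts as 1, False as 0

risk = {dentist: risk_category(count) for dentist, count in flags.items()}
print(risk)
```

Comparing a dentist only with peers in the same group is what makes the check fair: an amount that is normal for an orthodontist can be very unusual for a practice that mainly treats children.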

More than $3 million less in claims

And it has an effect! Thanks to this application, the claims have already been reduced by more than 3 million dollars within a year!

Which applications do you see for your business?

With these models you can, of course, detect much more than just fraud. There are countless other possible applications. And it is not only big tech companies that are working on this; all sectors can benefit greatly from machine learning. I wonder what applications can be thought up for your business. Do you have questions or ideas? Or is something bubbling up that you cannot quite put your finger on yet? Do not hesitate: call us or email us, and we will think along with you. And perhaps we will write something about it next time.

Maurik van den Heuvel

Tecknoworks Nederland BV

Pascalstraat 13H | 2811 EL Reeuwijk | Nederland

T: +31 (0)881 182 200 | M: +31 (0)6 5104 4631

E: maurik.vandenheuvel@tecknoworks.com | W: www.tecknoworks.com
