Rise of the machines
Watchfinder.co.uk might only be an SME, by most definitions, but some time in the near future, the fast-growing buyer and seller of luxury watches might be generating quotes not via its in-house panel of experts, but by using a machine-learning application running in the cloud.
“I like the approach that the cloud operators are taking, which is that machine learning just becomes a ‘Lego brick’ in your entire big data strategy,” says Watchfinder IT director Jonathan Gill.
Since Microsoft introduced machine learning into its Azure cloud platform last year, Gill has been experimenting with the service to see how it could be used at the company.
“It enables us to dump everything into Hadoop, and do all the historical analysis and then, literally, you pull in the machine learning app and just experiment with it. Once you get something you’re happy with, you can click a button, create a web service and then bring it into your main data pipeline – it’s the whole Lego brick aspect of it,” he says.
For Watchfinder, which hosts much of its IT infrastructure in the Amazon cloud, the ease with which machine learning can be bought and integrated as a cloud service, rather than as a packaged application, has finally brought the kind of predictive analytics that only major organisations used to be able to afford within the price range of ordinary businesses.
What appeals to Gill about Microsoft’s machine learning cloud service is the speed with which it has iterated and improved the service in just the past year, combined with the marketplace it has opened up for third-party algorithms and other tools that can be plugged into the core service.
“The speed with which Microsoft is iterating services on Azure now is ridiculous… But the really cool thing about it is that they have opened up the machine-learning marketplace, so you’ve got, for example, the Bing team contributing recommendation algorithms for search.
“We are ‘dumping’ our view data, give it a product and it will recommend items based on what people are viewing – we don’t need to do the hard work to create the algorithm because the guys at Bing have already done a pretty good one,” says Gill.
Mike Gualtieri, a principal analyst at Forrester Research, describes the marketplace for algorithms and other pre-prepared machine learning tools on both Amazon and Microsoft as “nascent”. Indeed, he adds, the algorithm itself is normally just a small part of the overall package required for organisations to build machine learning into their applications.
“There’s two different marketplaces: there’s a marketplace for built models – to predict customer churn or for a recommendation engine. Those models are built based on the data, so you may need to do some data wrangling to make them work,” he says.
He adds: “Is there a marketplace for algorithms out there that data scientists can use? Yes and no. The advantage is that there’s so many algorithms already and many of them, if you don’t have a predictive model, you can probably create one from those that exist.”
The key, though, is building a predictive model that is not only tailored to the needs of the organisation, but which also improves accuracy – and all that involves work.
Particularly popular in the Azure Data Market, says John Bronskill, partner architect at Microsoft Research in Cambridge, are recommendation engines. “We have ready-made APIs that any online store can feed in purchase or intent-to-purchase history. Based on what users have clicked on, or rated highly, we can recommend other items that they will either need as accessories or, perhaps, something new that will appeal to them based on what they have purchased in the past,” says Bronskill.
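The idea Bronskill describes, recommending items based on what other customers have bought alongside them, can be illustrated with a minimal sketch. This is not Microsoft's API or algorithm: it is a simple co-occurrence count, and the function names and sample purchase histories below are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(baskets):
    """Count how often each pair of items appears in the same basket."""
    counts = defaultdict(Counter)
    for basket in baskets:
        # De-duplicate and sort so each unordered pair is counted once per basket.
        for a, b in combinations(sorted(set(basket)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(counts, item, top_n=3):
    """Return up to top_n items most often bought alongside `item`."""
    neighbours = counts.get(item, Counter())
    # Highest count first; ties broken alphabetically for a stable order.
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:top_n]]


# Hypothetical purchase histories (assumed data, not from the article).
baskets = [
    ["watch_a", "strap", "winder"],
    ["watch_a", "strap"],
    ["watch_b", "strap"],
    ["watch_a", "winder", "case"],
]
counts = build_cooccurrence(baskets)
print(recommend(counts, "watch_a"))
```

A production recommendation service would typically weight clicks, ratings and purchases differently and correct for item popularity; the sketch only shows the basic shape of feeding in history and getting ranked suggestions back.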
Other items in the Azure Data Market include Customer Churn Prediction, Text Analytics, Frequently Bought Together, and Anomaly Detection. However, the marketplace currently offers just 41 items under “machine learning”, backing up Gualtieri’s judgement.
For Gualtieri, Gill’s preference for Microsoft Azure for machine learning makes sense. “I’d say that Microsoft’s is a much more sophisticated tool, at this stage, than Amazon’s,” says Gualtieri.
The main reason for this, he adds, is Microsoft’s early decision to base its cloud machine learning technology on R, a programming language and software environment for statistical computing that is itself written primarily in C, Fortran and R.
“It has a very visual tool for creating arbitrary analytical workloads. Behind many of those operations is an open-source programming language called R. In contrast, Amazon has some light data preparation tools, but only one class of linear modelling algorithms, so it’s much more limited,” says Gualtieri.
He continues: “With the Microsoft solution, because R is behind it, you can have a lot more text analytics – there’s hundreds more possibilities.