The annual Alibaba Singles Day shopping festival is coming this weekend. As hundreds of millions of people splurge online and media attention focuses on a big screen updating the sales figures in real time, it will be a huge test for the data experts behind the screen.
Among the technologies that make such real-time data analysis possible is stream computing. It not only displays the latest sales figures but also optimizes the layout of goods on the website and makes personalized recommendations to shoppers.
So how exactly does stream computing work? Imagine you are standing at the entrance of a shopping mall and customers are coming from every direction. You need to constantly make predictions about the next customer's gender, age, preferences, purchasing power and so on.
Based on that, you assume a certain pattern for incoming customer traffic and then change the content of an electronic billboard to attract more shoppers. This is what the Singles Day real-time big screen is about.
From this example we can see that the value of data can vanish immediately, or decline as time goes by. Data analysis hence has to happen in real time. In the past, incoming data was stored in a database before being analyzed. But this no longer works.
In this internet era, real-time analysis of a tremendous volume of continuous data inflow is often needed to unlock the value of those data. The response time required can be as short as a split second.
For example, real-time ad bidding typically allows a response time of less than 200 milliseconds.
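The shift from store-then-analyze to stream processing described above can be sketched in a few lines of Python. This is only a minimal illustration, not how Alibaba's engine works: the event shape `(category, amount)` and the function names `batch_total` and `stream_totals` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event shape for illustration: (category, amount)
Order = Tuple[str, float]


def batch_total(orders: Iterable[Order]) -> float:
    """Store-then-analyze: the answer exists only after all data has arrived."""
    stored = list(orders)  # stands in for writing everything to a database first
    return sum(amount for _, amount in stored)


def stream_totals(orders: Iterable[Order]) -> Iterator[Tuple[float, Dict[str, float]]]:
    """Stream processing: update the running figures as each order arrives."""
    total = 0.0
    by_category: Dict[str, float] = defaultdict(float)
    for category, amount in orders:
        total += amount
        by_category[category] += amount
        # A fresh snapshot is available after every single event,
        # which is what a continuously updating sales screen needs.
        yield total, dict(by_category)


if __name__ == "__main__":
    events = [("phones", 300.0), ("shoes", 50.0), ("phones", 200.0)]
    for running_total, snapshot in stream_totals(events):
        print(running_total, snapshot)
```

The batch version produces one number at the end, while the streaming version yields an up-to-date figure after each order, so the result is usable while the data is still flowing in.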
In Taobao’s case, the system needs to dynamically calculate the chance of an ad being clicked on different pages, based on each user's preferences, geographical location and browsing record, and then decide which ads to place on its web pages, and where and when to place them.
Remember that a web page may have tens of thousands of product listings and receive hundreds of thousands of visitors every second, and there are numerous ad spaces on each page.
Alibaba needs to figure out the best location for pay-per-click ads in such an environment. Therefore, the system handling the job needs to be powered by a low-latency, highly reliable and scalable data processing engine. This ability underlies the e-commerce giant’s core competitiveness.
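To make the idea of continuously recalculating click chances concrete, here is a deliberately simplified Python sketch. It is an assumption-laden toy, not Taobao's actual method: it ignores user preferences, location and browsing history, and simply keeps a smoothed click rate per ad and page slot that is updated with every impression. The class `ClickRateTracker`, its methods and the prior values are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (ad_id, slot_id); hypothetical identifiers


class ClickRateTracker:
    """Keeps a smoothed click-rate estimate per (ad, slot), updated per event."""

    def __init__(self, prior_clicks: float = 1.0, prior_views: float = 20.0) -> None:
        # The priors keep estimates stable for ads with very few impressions.
        # Their values are arbitrary assumptions chosen for illustration.
        self.prior_clicks = prior_clicks
        self.prior_views = prior_views
        self.views: Dict[Key, int] = defaultdict(int)
        self.clicks: Dict[Key, int] = defaultdict(int)

    def record(self, ad: str, slot: str, clicked: bool) -> None:
        """Fold one impression into the running counts as it arrives."""
        self.views[(ad, slot)] += 1
        if clicked:
            self.clicks[(ad, slot)] += 1

    def estimate(self, ad: str, slot: str) -> float:
        """Smoothed chance that this ad is clicked when shown in this slot."""
        key = (ad, slot)
        clicks = self.clicks.get(key, 0) + self.prior_clicks
        views = self.views.get(key, 0) + self.prior_views
        return clicks / views

    def best_ad(self, slot: str, candidates: List[str]) -> str:
        """Pick the candidate with the highest current estimate for the slot."""
        return max(candidates, key=lambda ad: self.estimate(ad, slot))


if __name__ == "__main__":
    tracker = ClickRateTracker()
    for i in range(10):
        tracker.record("ad_a", "top_banner", clicked=(i < 5))
        tracker.record("ad_b", "top_banner", clicked=False)
    print(tracker.best_ad("top_banner", ["ad_a", "ad_b"]))
```

Because each impression updates the counts immediately, the choice of ad for a slot can change from one page view to the next. A production system would additionally need to condition on the visitor and answer within the tight latency budget mentioned earlier.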
The significance of stream computing will increase as the Internet of Things comes of age. Technology pertaining to big data will continue to evolve and advance. To survive and thrive, companies will need to move in step with the times.
This article appeared in the Hong Kong Economic Journal on Nov 7
Translation by Julie Zhu
[Chinese version 中文版]