The birth of real-time data analysis through time-series databases

The world has more data than it knows what to do with. The statistics are incredible: IBM says the world creates 2.5 quintillion bytes (2,500,000,000,000,000,000 bytes) of data every day.

Plus, “about 90 per cent of all the data in the world has been generated in the past two years”.

For manufacturing it’s a similar story. IIoT sensor technologies are collecting and sending unparalleled amounts of data. But the challenge is turning this data into action. To address it, we need to look at what the ‘Googles’ of the internet have done with their data.

In particular, how have they revolutionised databases to better manage and analyse data? 

The answer: real-time data analysis

While process historians were doing their thing in industrial plants and on shop floors, the financial sector was using time-series databases to track stock volatility or monitor a security's price over time.

Then Amazon and Google created NoSQL databases to address the sheer scale of internet users and pieces of data that required processing. These databases were designed to cope with millions upon millions of unstructured data points and connect with other modern web-based applications.

Then, instead of closing the technology off, they made their IP publicly available – allowing the open source community to develop many of the NoSQL products around today.

About 10 years ago Yahoo implemented the open source Apache Hadoop big data framework to improve its search indexing. Other internet companies followed suit – Facebook, Twitter and eBay rolled it out in 2009.

The advancements made by tech giants over the past 10 years have revolutionised database computing, and the field hasn't stopped moving. NoSQL databases have kept improving to meet demand for computing at scale. For example, Amazon added DynamoDB to its database service portfolio, while Microsoft added DocumentDB to its Azure suite. These have bolstered database performance and give customers instant scalability with minimal hassle. More recent is Google’s Cloud Bigtable, which is targeted at IoT vendors as a time-series database that performs data analysis and anomaly detection, among other functions.
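To make "anomaly detection" on time-series data a little more concrete, here is a minimal Python sketch. It is not how Cloud Bigtable or any other product mentioned here does it; it simply flags a sensor reading when it sits far from the average of the readings just before it. The function name, the window size, the threshold and the sample temperature values are all illustrative assumptions.

```python
from collections import deque
from statistics import mean, stdev


def find_anomalies(readings, window=5, threshold=3.0):
    """Return the (timestamp, value) pairs that deviate sharply from
    the trailing window of earlier values.

    readings: iterable of (timestamp, value) tuples in time order.
    window, threshold: illustrative defaults, not tuned values.
    """
    recent = deque(maxlen=window)  # holds the last `window` values seen
    anomalies = []
    for timestamp, value in readings:
        if len(recent) == window:
            mu = mean(recent)
            sigma = stdev(recent)
            # Flag the point if it is more than `threshold` standard
            # deviations away from the trailing mean.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                anomalies.append((timestamp, value))
        recent.append(value)
    return anomalies


# Hypothetical sensor feed: one temperature reading per second.
values = [20.1, 20.3, 19.9, 20.0, 20.2, 20.1, 35.0, 20.2, 20.0, 20.1]
readings = list(enumerate(values))
print(find_anomalies(readings))  # -> [(6, 35.0)]
```

A real deployment would run a check like this continuously against data streaming into the database, and would need a more robust method than a simple rolling average, but the principle of comparing each new point with its recent history is the same.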

In short, these databases can scale easily, integrate with multiple data sources and software, and process more data, quicker than ever.

And, because they are open source, they’re cost effective and easy to use. InfluxDB, Grafana and Elasticsearch are a few with Google technology behind them. We'll explore the capabilities of these tools in a future blog.

“Data is the new oil” – Clive Humby

For Google, Amazon and major social networking sites, the ability to store, process and control their data analysis is central to business success – or failure. The same can be said of heavy industries and manufacturing.

By harnessing the technologies that the leading data companies of the world use, manufacturers and industrial operators will ensure they manage their data competitively well into the future. Time-series is one place to start. In case you missed it, we explored why process historians and time-series are essential for data analysis in a previous blog.

Clive Humby is the data scientist who, along with his wife, changed the way retailers gather and use consumer data. They created the world's first supermarket loyalty card for Tesco. Although he said it in 2006, Humby's revelation that "data is the new oil" is more potent than ever.

Michael Palmer, of the Association of National Advertisers, expands on Humby's quote:

"Data is just like crude. It's valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc to create a valuable entity that drives profitable activity; so must data be broken down, analysed for it to have value."

A consolidated view of operations is key to harnessing and 'refining data'. Our free downloadable cheat sheet on the opportunities and risks of integrating IT and OT manufacturing systems explores how to go about integrating disparate data systems. 

Download the cheat sheet today!

Headline image by Edho Pratama on Unsplash.
