What Is Big Data?
Data is created around us every second. Digital processes, social media, mobile devices, automated systems, and sensors all contribute to producing it. The result is huge volumes of data, created and stored from multiple sources at tremendous speed and in great variety. This large volume of information is termed 'Big Data', comprising both the structured and unstructured information that inundates enterprises on a day-to-day basis. However, what matters is not how much data is collected and stored, but what organizations can do with it. Big data needs to be analyzed for insights that lead to better decisions and strategic business moves. To extract meaningful value from big data, strong analytical skills and ample processing power are required.
"Customers, employees and citizens will become engaged principally through digital means. With operational processes quickly becoming digitalized, traditional analog and manual processes will be automated, including both physical and human elements. Many, if not most, decisions will be algorithmic, based on automated judgment," says Gartner.
Big Data History and the 3 'Vs'
People have long collected information and made decisions based on analyzing it. The term "Big Data" has gained importance since 2001.
In the late 1990s, while many analysts, clients and vendors were discussing fast-growing information stores, Doug Laney observed that something else was going on. The speed at which information was flowing, mainly due to electronic commerce, and the growing range of structures, formats and information sources brought by the post-Y2K ERP application boom were as challenging as, or more challenging than, the increasing quantity of information. In February 2001, Doug Laney published a research note entitled "3-D Data Management: Controlling Data Volume, Velocity and Variety". Today his "3 Vs" have become a ubiquitous way of understanding "Big Data".
Volume: Many companies create and store enormous amounts of information on a day-to-day basis. Organizations like NASA, Facebook and Google collect information in huge volumes through various means. This data needs to be saved, analyzed and processed to create value, for example by understanding market trends and customers well enough to build relevant solutions that succeed in the market.
Variety: The information generated through various channels is mostly unstructured or semi-structured. It comes as text, images, videos, emails, binaries and other formats, most of the time without uniformity. Traditional systems handle structured data efficiently, but they are not capable of handling the huge quantity of unstructured data now being saved through these channels.
Velocity: Traditional systems were adequate as long as a query for a single detail had to search through millions or billions of records. But stored data is growing at high speed, and a query may now need to analyze and process information in the range of hundreds or thousands of petabytes, exabytes and sometimes more. A system that can process data at higher speed and with high scalability is therefore needed.
In addition to the 3 Vs, Veracity and Value have now been added to the characteristics of Big Data.
Veracity: Veracity refers to the abnormality, biases, and noise in information. It is necessary to understand whether the data being stored and mined is meaningful to the problem being analyzed. The best way to avoid unnecessary expenditure of resources is to define the objectives as early as possible.
Value: Any big data project's objective should be to create value for the company. Information collection and analysis should never be done just for the sake of technology.
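To make the Velocity point above more concrete, here is a rough back-of-the-envelope sketch in Python. It estimates how long a single full pass over one petabyte would take on one machine versus many machines working in parallel. The throughput figure (200 MB/s per worker) and the worker count (1,000) are assumptions chosen purely for illustration, not measurements, and the model ignores network and coordination overhead.

```python
def scan_time_seconds(data_bytes: float,
                      throughput_bytes_per_s: float,
                      workers: int = 1) -> float:
    """Idealized time to read data_bytes once, split evenly across workers."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return data_bytes / (throughput_bytes_per_s * workers)


PETABYTE = 10**15                    # bytes (decimal petabyte)
ASSUMED_THROUGHPUT = 200 * 10**6     # 200 MB/s per worker: an assumption

single = scan_time_seconds(PETABYTE, ASSUMED_THROUGHPUT)
parallel = scan_time_seconds(PETABYTE, ASSUMED_THROUGHPUT, workers=1000)

print(f"1 worker:     {single / 86400:.1f} days")    # 57.9 days
print(f"1000 workers: {parallel / 3600:.1f} hours")  # 1.4 hours
```

Under these assumed numbers, one machine would need about two months for a single scan, while a thousand machines would finish in under an hour and a half. That gap is why scalability matters as much as raw speed.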
Short Definition of Big Data
Big data is a term used for data sets of such volume, velocity, variety and complexity that traditional tools are incapable of collecting, processing, storing, managing and analyzing them.
Why Is Big Data Important?
It doesn't matter how much data is collected, or how it is collected and stored. What makes a difference is what is done with it. Whatever source the data is taken from, organizations should be able to analyse it to find ways to reduce cost and time, develop new products and support timely decision-making. When big data is analysed, it helps accomplish business-related tasks by identifying the root causes of failures, issues and defects in business processes in real time.