2013 is often called the first year of big data, the year in which industries of all kinds began moving into the era of big data applications. Big data is still widely discussed today.
What is big data?
Is 1PB big enough?
If that is hard to picture, think of the hard disk in your computer. A typical configuration is 500 GB to 1 TB, and most people do not fill it even after a year or two of use. 1 PB = 1024 TB = 1,048,576 GB.
In practice, a moderately well-known game can generate tens of terabytes of data in a single day, or even more.
If you think the petabyte is already the largest unit, you would be quite mistaken.
Above the PB come the EB (exabyte, 1024 PB), the ZB (zettabyte, 1024 EB), and the YB (yottabyte, 1024 ZB). These are simply the units currently defined to make statistics on massive data convenient, and larger units may appear in the future.
Brian Krzanich, then CEO of Intel, predicted that by 2020 the average Internet user would generate 1.5 GB of data every day.
IHS forecasts that by 2025 the installed base of Internet of Things (IoT) connected devices worldwide will reach 75.44 billion. One can imagine how much data these devices will generate every day.
By the unit conversions above, 1 ZB is about 1.1 trillion GB, a quantity that has been compared to the number of grains of sand in the world.
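The unit arithmetic above can be checked with a short script. This is a minimal sketch that assumes the binary convention used in this article (each unit is 1024 times the previous one); the `convert` helper is a name introduced here for illustration only.

```python
# Binary storage units, each 1024 times larger than the one before it.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def convert(value, from_unit, to_unit):
    """Convert a quantity between binary storage units (illustrative helper)."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return value * 1024 ** steps


print(convert(1, "PB", "TB"))  # 1024
print(convert(1, "PB", "GB"))  # 1048576
print(convert(1, "ZB", "GB"))  # 1099511627776, about 1.1 trillion
```

The last line confirms the figure quoted above: 1 ZB is 1024 to the fourth power GB, or roughly 1.1 trillion GB.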
The figures above make it clear that Internet data is growing explosively every year. Of course, big data is not just a large amount of data; it has other, deeper meanings.
The McKinsey Global Institute defines big data as:
"A data set that is large enough to greatly exceed the capabilities of traditional database software tools in terms of acquisition, storage, management and analysis."
Big data has five characteristics, known as the 5 Vs.
- Variety
Variety means that the types and sources of data are diverse. Data can be structured, semi-structured, or unstructured, and it appears in forms including but not limited to text, images, videos, and HTML pages.
- Volume
Volume refers to the sheer size of the data, as described above.
- Velocity
Velocity refers to how quickly data grows and how quickly it must be processed. Data in every industry is growing explosively every day. In many scenarios the data is time-sensitive: a search engine, for example, must present what the user is looking for within a few seconds. Enterprises and systems facing rapidly growing volumes of data must process it and respond quickly.
- Value (low value density)
Low value density means that only a small share of a massive data source is truly valuable. Much of the data may be wrong, incomplete, or unusable. Overall, the proportion of valuable data is very low, and extracting it is like panning for gold in the sand.
- Veracity
Veracity refers to the accuracy and reliability of the data, that is, its quality.
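To make the idea of value density concrete, the sketch below filters a handful of records and reports the share that is usable. The records, the field names, and the `is_usable` rule are all hypothetical and chosen only for illustration; real pipelines apply far richer quality checks.

```python
# Hypothetical raw records; None marks a missing field.
records = [
    {"user": "a", "amount": 120.0},
    {"user": "b", "amount": None},    # incomplete: amount missing
    {"user": None, "amount": 35.5},   # incomplete: user missing
    {"user": "c", "amount": -10.0},   # wrong: negative amount
    {"user": "d", "amount": 80.0},
]


def is_usable(record):
    """Assumed rule: no field is missing and the amount is non-negative."""
    return (
        record["user"] is not None
        and record["amount"] is not None
        and record["amount"] >= 0
    )


usable = [r for r in records if is_usable(r)]
value_density = len(usable) / len(records)
print(value_density)  # 0.4
```

Here only two of five records survive the checks. In real big data sets the usable share is typically far lower.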
Data has always existed; only the way we handle it has changed
The significance of big data lies not only in producing and holding huge amounts of data, but also in processing the valuable data professionally.
Humans have never lacked data. What has been lacking is the ability to mine and exploit its deeper value. Data has existed ever since human society developed writing, and that is still true today. What has changed is the form that data generation, recording, and use take.
- Data production
In the early days of human society, food was what mattered most, and data was mostly tied to goods, food, land, and the like. People in Paleolithic tribes carved notches into branches or bones to record daily trades or supplies.
To measure length, the Chinese invented units such as chi, li, cun, zhang, bu, and ren. To measure quantities of goods such as grain, they invented units such as sheng, dou, and hu.
In the Internet era, producing data has become much easier. International Data Corporation (IDC) once noted that data on the Internet grows by about 50% every year, roughly doubling every two years. More than 90% of the world's data was generated in just the last few years.
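The two growth figures quoted above are consistent as rough estimates, which a quick compound-growth calculation shows. This is a minimal sketch that assumes a constant 50% annual growth rate; the variable names are illustrative.

```python
import math

annual_growth = 0.50  # 50% per year, the figure quoted above

# Growth factor after two years of compounding.
two_year_factor = (1 + annual_growth) ** 2

# Number of years needed to double at this rate.
doubling_time = math.log(2) / math.log(1 + annual_growth)

print(two_year_factor)          # 2.25
print(round(doubling_time, 2))  # 1.71
```

At 50% per year, data slightly more than doubles in two years (a factor of 2.25), so "doubling every two years" is a conservative rounding.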
Everyone generates large amounts of data every day, such as video data, e-commerce data, and social data.
(Figure: data generated worldwide every 60 seconds)
- Data recording
Thousands of years ago, people recorded data on tortoise shells, stone drums, bamboo slips, and silk, and later, once papermaking and block printing had matured, on paper and any other medium they could obtain.
Thousands of years later, people record data in books, newspapers, hard disks, optical discs, memory, and other more flexible and convenient media.
- Data utilization
The ancients used oracle bone inscriptions to predict good and bad fortune and astrology to predict the rise and fall of dynasties. They watched ants relocating, swallows flying low, and earthworms surfacing to predict the weather.
In the Internet era, companies and products use e-commerce data to recommend products to users and social data to target advertising.
Before the concept of big data took hold, most enterprises paid little attention to the value of data and simply produced and recorded it. Some even regarded massive data as a burden. Because storing and managing data is costly, few enterprises were able to treat data as a resource and recognize the value behind it. Even now, integrating and exploiting data resources remains a major challenge for every enterprise.
Big data applications
Big data is a technology that can transform how industries operate, but it delivers real value only when it is actually put into practice.
In fact, big data has a wide range of applications, not only in the Internet industry but also in finance, manufacturing, transportation, logistics, and other fields.
- Big data makes lending more reassuring
In the financial industry, take lending as an example. Before granting a loan, the lender uses big data to review the borrower in order to protect the repayment rate afterward.
The lender legally collects labeled information about the borrower from various channels, such as educational background, occupation, salary, and past loan and repayment history (a single user is said to have as many as 7,000 label dimensions). This data is fed into an anti-fraud model, a repayment-capability model, an identity-verification model, and other models, which ultimately produce an assessment of whether the application should be approved, the loan amount, the borrower's willingness to repay, and so on.
The more data is collected about the borrower, the finer the label dimensions, and the more truthful the data, the more thorough the review.
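The review flow described above can be sketched as a chain of checks. Everything here is hypothetical: the `Applicant` fields, the thresholds, and the `review` function stand in for what would in practice be trained models over thousands of label dimensions, and are not taken from any real lender's system.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    identity_verified: bool  # output of an assumed identity-verification step
    fraud_score: float       # assumed anti-fraud model output, 0.0 (safe) to 1.0 (risky)
    monthly_income: float
    monthly_debt: float


def review(applicant, requested_amount, fraud_threshold=0.7, max_debt_ratio=0.5):
    """Return (approved, amount) using simple, assumed rules."""
    # Step 1: identity verification.
    if not applicant.identity_verified:
        return False, 0.0
    # Step 2: anti-fraud screening.
    if applicant.fraud_score >= fraud_threshold:
        return False, 0.0
    # Step 3: repayment capability, capped at one year of spare capacity.
    spare = applicant.monthly_income * max_debt_ratio - applicant.monthly_debt
    if spare <= 0:
        return False, 0.0
    return True, min(requested_amount, spare * 12)


good = Applicant(True, 0.1, 10000.0, 2000.0)
risky = Applicant(True, 0.9, 10000.0, 2000.0)
print(review(good, 50000.0))   # (True, 36000.0)
print(review(risky, 50000.0))  # (False, 0.0)
```

Placing the cheap, decisive checks first (identity, then fraud) means the repayment calculation runs only for applicants who pass them.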
- Big data makes advertising more efficient
Advertising is one of the most common ways to monetize in the Internet industry. With big data behind it, advertising shifts from being an annoyance to being content and a service in its own right.
At some point you may have noticed that the advertisements you see every day seem to understand you remarkably well. Open Taobao, and products you like are recommended in the home page banner. Open WeChat Moments, and you see an ad for the car maintenance you were planning. Open Baidu search, and the villa listing you looked at two days ago suddenly appears.
All of this is made possible by big-data-driven advertising.
Before an ad campaign launches, big data techniques are used to integrate and analyze large volumes of data, including users' browsing habits, purchasing behavior, browsing history, and ad click counts, and to mine useful information from them. This builds a comprehensive user profile which, combined with the advertising business, pinpoints target users and keeps the advertising well targeted.
(Figure: building a user profile with big data)
In the middle and later stages of a campaign, real-time data feedback, combined with changes in the user's location and the time of day, is used to dynamically optimize ad creatives and adjust how and where ads are displayed. The same user then sees different ads in different scenarios, every user gets a personalized experience ("a thousand faces for a thousand people"), the marketing becomes more effective, and advertisers' KPIs improve.
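As a rough illustration of profiling and targeting, the sketch below counts each user's browsing categories and picks the ad that matches the strongest interest. The events, the ad inventory, and the function names are hypothetical; production systems use far more signals and learned models rather than simple counts.

```python
from collections import Counter

# Hypothetical browsing events: (user_id, category).
events = [
    ("u1", "cars"), ("u1", "cars"), ("u1", "villas"),
    ("u2", "phones"), ("u2", "phones"), ("u2", "cars"),
]

# Hypothetical ad inventory keyed by category.
ads = {
    "cars": "car maintenance offer",
    "villas": "villa listing",
    "phones": "phone discount",
}


def build_profiles(events):
    """Build a per-user profile as a count of browsed categories."""
    profiles = {}
    for user_id, category in events:
        profiles.setdefault(user_id, Counter())[category] += 1
    return profiles


def pick_ad(profile, ads):
    """Pick the ad for the most-browsed category that has an ad available."""
    candidates = [(count, cat) for cat, count in profile.items() if cat in ads]
    if not candidates:
        return None
    _, best_category = max(candidates)
    return ads[best_category]


profiles = build_profiles(events)
print(pick_ad(profiles["u1"], ads))  # car maintenance offer
print(pick_ad(profiles["u2"], ads))  # phone discount
```

Feeding new events into `build_profiles` as they arrive would shift each user's counts over time, which is a crude version of the real-time optimization described above.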
- Big data empowers retail
In the new retail era, customer demand changes constantly. Big data lets retail adapt across its three elements: people, goods, and venues.
Retailers can use big data to predict future market demand and manage inventory ahead of time. Before a traffic peak, they replenish stock promptly to keep goods available. Before traffic tapers off, they clear stock promptly to avoid a backlog.
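One simple way to turn sales history into a replenishment decision is a moving-average forecast. The sketch below is only an illustration under assumed parameters (a three-day window and a two-day delivery lead time); the function names are hypothetical, and real retailers use much richer demand models.

```python
def forecast_demand(daily_sales, window=3):
    """Forecast next-day demand as the mean of the last `window` days."""
    recent = daily_sales[-window:]
    return sum(recent) / len(recent)


def reorder_quantity(daily_sales, stock_on_hand, lead_time_days=2, window=3):
    """Units to order so stock covers forecast demand during the lead time."""
    expected = forecast_demand(daily_sales, window) * lead_time_days
    return max(0, round(expected - stock_on_hand))


sales = [10, 12, 20, 30, 40]  # hypothetical daily unit sales, rising toward a peak

print(forecast_demand(sales))        # 30.0
print(reorder_quantity(sales, 25))   # 35
print(reorder_quantity(sales, 100))  # 0
```

With rising sales the forecast calls for replenishment, while a well-stocked shelf yields an order of zero, which corresponds to holding off and letting inventory run down.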
By analyzing the geographical distribution of users, store traffic, consumer habits, and similar data, retailers can open stores and build warehouses in suitable areas. In logistics and delivery, transport routes can be planned sensibly from the data to reduce transportation costs.
Data can also unify the interaction between upstream and downstream parts of the supply chain, resolve data inconsistencies, reduce the bullwhip effect, and improve efficiency at every link of the supply chain.