Data labeling is the process of using automated tools to mark collected data through classification, bounding boxes, annotations, and similar methods, producing high-quality data that computers can recognize and analyze.
In 2019, China’s total data output was 3.9ZB, up 29.3% from 2018. In 2020, the overall scale of China’s big data market was expected to exceed US$10 billion for the first time. Rising data volume is expected to push big data industry spending higher year by year.
Against this backdrop, demand for data labeling has grown along with data volume. In 2019, about 36EB of data needed labeling and the market reached 3.09 billion yuan; in 2020, the market was around 3.6 billion yuan. On the supply side, the growth of the big data industry drives demand for cleaning and labeling unstructured data, which in turn increases the number of companies working on data labeling.
With rising data volume, China’s big data spending increases
The rapid development of China’s Internet industry in recent years has brought a rapid increase in the amount of data. In 2019, China’s total data output was 3.9ZB, an increase of 29.3% year-on-year, accounting for 9.3% of total global data output. China’s per capita data output was 3TB in 2019, a year-on-year increase of 25%.
Since 2015, with strong support from national and local governments, the development of China’s big data industry has accelerated. A large number of big data industrial parks have been built one after another, the industry ecosystem is taking shape faster, the relevant standards and technical systems continue to improve, the application market is growing, and the industry’s international influence keeps increasing.
According to the latest forecast data released by IDC in March 2021, the overall scale of China’s big data market was expected to exceed US$10 billion for the first time in 2020, a year-on-year increase of 15.9% over 2019.
In the long term, China’s overall big data spending is growing steadily. The total market is expected to exceed US$20 billion in 2024, an increase of 145% compared to 2019. China’s big data market has a five-year CAGR of 19.7%, the highest growth rate in the world.
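The 145% increase and the 19.7% CAGR can be checked against each other with the standard compound-growth formula. The following is a minimal sketch, assuming the five-year window runs from 2019 to 2024 (the forecast does not state the window explicitly); the function names are illustrative.

```python
def cumulative_growth(cagr: float, years: int) -> float:
    """Total fractional growth after compounding `cagr` for `years` years."""
    return (1 + cagr) ** years - 1


def cagr_from_growth(total_growth: float, years: int) -> float:
    """Compound annual growth rate implied by a total fractional growth."""
    return (1 + total_growth) ** (1 / years) - 1


# Assumption: the five-year CAGR covers 2019 to 2024.
growth_2019_2024 = cumulative_growth(0.197, 5)
implied_cagr = cagr_from_growth(1.45, 5)

print(f"Total growth at 19.7% CAGR over 5 years: {growth_2019_2024:.1%}")
print(f"CAGR implied by a 145% increase over 5 years: {implied_cagr:.1%}")
```

Under this assumption the two figures agree to within rounding: compounding 19.7% for five years gives a total increase of a little over 145%.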
China’s data labeling demand is about 36EB, and the market is developing rapidly
At present, 1% of the data on the market can be collected and saved, and 90% of the data is unstructured. Unstructured data only becomes valuable after it has been cleaned and labeled, which creates a steady stream of cleaning and labeling demand. On the basis that this 90% of unstructured data must all be cleaned and labeled before it can be used in artificial intelligence development, the amount of data that needed to be labeled in China in 2019 was about 36EB.
In terms of market size, according to iResearch data, the data annotation market was 3.09 billion yuan in 2019 and exceeded 3.6 billion yuan in 2020. It is expected to exceed 10 billion yuan by 2025, indicating that China’s data annotation industry is in a stage of high-speed development.
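The 36EB figure is consistent with applying the two ratios above to the 3.9ZB of data produced in 2019. The sketch below reproduces that arithmetic. The article does not spell out its calculation, so the formula and the unit conversion (1ZB = 1024EB) are assumptions, and the function name is illustrative.

```python
EB_PER_ZB_BINARY = 1024   # assumption: binary convention, 1 ZB = 1024 EB
EB_PER_ZB_DECIMAL = 1000  # alternative: decimal (SI) convention


def labeling_demand_eb(total_output_zb: float,
                       retained_share: float,
                       unstructured_share: float,
                       eb_per_zb: int = EB_PER_ZB_BINARY) -> float:
    """Estimate the data volume needing labeling, in exabytes.

    total_output_zb    -- total data output in zettabytes
    retained_share     -- fraction of data that is collected and saved
    unstructured_share -- fraction of data that is unstructured
    """
    return total_output_zb * eb_per_zb * retained_share * unstructured_share


# 2019 figures from the article: 3.9 ZB output, 1% retained, 90% unstructured.
demand_binary = labeling_demand_eb(3.9, 0.01, 0.90)
demand_decimal = labeling_demand_eb(3.9, 0.01, 0.90, EB_PER_ZB_DECIMAL)

print(f"Binary convention:  {demand_binary:.1f} EB")
print(f"Decimal convention: {demand_decimal:.1f} EB")
```

The binary convention lands closest to the reported 36EB; the decimal convention gives a slightly lower estimate.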
The number of data annotation companies in China is on the rise, exceeding 700 by the end of 2020
According to statistics from “AI Data Labeling Ape”, the number of companies in China with data labeling business was 565 in April 2020 and had risen to 705 by December 2020, an increase of 24.78% over that period.
To date, thousands of companies in China, including small workshops, take data annotation as their core business. As the big data industry continues to develop, the number of companies related to data labeling is expected to keep increasing.
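The 24.78% figure follows directly from the two company counts. A minimal sketch of the calculation, with an illustrative function name:

```python
def percent_increase(old: float, new: float) -> float:
    """Percentage change from `old` to `new`, relative to `old`."""
    return (new - old) / old * 100


# Company counts from the article: April 2020 and December 2020.
growth = percent_increase(565, 705)
print(f"Increase from April to December 2020: {growth:.2f}%")
```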