How important is data privacy?
Take face recognition as an example. The technology is now widely used for payments and transfers, unlocking and decryption, traffic case handling, real-name registration, account opening and cancellation, access control, attendance tracking, and other scenarios. Each of these touches our property, health, privacy, and other aspects of personal security.
A recent CCTV evening news report found that on one online trading platform, thousands of face photos could be bought for only 2 yuan, and more than 5,000 face photos cost less than 10 yuan, which works out to less than 1 cent per photo. These are real-life photos and selfies that real people shared on social networks. If they are combined with the users' identity information, they are likely to be used in crimes such as targeted fraud, money laundering, and other criminal activity.
Most of us can no longer remember how much private information we have left on the Internet, or on how many platforms. And we know almost nothing about where this data ultimately ends up, how it is used, or how securely it is kept.
In recent years, China has begun legislating on the protection of citizens' personal data and privacy. The "Cybersecurity Law" and the "Civil Code" both contain provisions on the protection of personal information, and the "Data Security Law" and "Personal Information Protection Law" are in the process of soliciting public comments.
Legislation, however, mostly guarantees redress after the fact. Protecting personal data and private information still has to start at the source: each network platform must implement comprehensive protection and supervision of data at the technical level.
At the same time, data trading and data circulation have become a major constraint on the development of China's big data industry. How to obtain credible, high-quality data through legal, compliant, safe, and efficient means has become an urgent problem for many technology companies and platforms.
On the one hand, leaks of users' private data are rampant; on the other hand, enterprise platforms find it difficult to obtain usable, compliant data resources. This contradiction has led more and more companies to call for a new approach to data governance and application.
Against this backdrop, privacy computing (Privacy Computing), a class of techniques that keeps data from leaking while still allowing it to be analyzed and computed on, has formally been put on the agenda.
The "Millionaire" Problem: The Origin of Privacy Computing
"Suppose two millionaires meet, and each wants to know who is richer, but neither wants the other to learn how much wealth they actually have. How can they find out who is richer without involving a third party?"
This is the "millionaires' problem" posed in 1982 by Academician Yao Qizhi (Andrew Chi-Chih Yao), winner of the 2000 Turing Award. The puzzle turns on a contradiction: to compare who is richer, it seems the two must disclose their real wealth figures, yet neither wants the other to know how much they have. At first glance, this looks like an unsolvable paradox.
This seemingly intractable problem is really about the ownership of data versus the right to use it. The wealth a millionaire holds corresponds to ownership of the data, and disclosing the wealth figure corresponds to granting the right to use it. At present, when major Internet platforms provide services to you, they obtain the right to use your data and, in practice, something close to ownership of it as well. Users retain nominal ownership, but most people simply leave their data on these platforms, and few ever ask a platform to destroy it.
Given the two millionaires' concerns, is there a technology that can separate the ownership of data from the right to use it? The millionaires would disclose their wealth data to this technology platform, which, after a series of computations on encrypted data, outputs only the result (who is richer). Internet platforms or companies that need user data would then no longer obtain ownership of the original data; they would obtain only the results of computations on encrypted data, delivered as a service to the data demander.
Once you understand this thought experiment, you understand the general idea behind privacy computing.
In privacy computing, this is a specialized cryptographic problem. It can be stated precisely as "a collaborative computing problem among a group of mutually distrustful parties, under the premise of protecting private information and without a trusted third party," and it is solved by a secure computing protocol. Along with the problem, Academician Yao Qizhi also proposed his own solution, "secure multi-party computation" (MPC).
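To make the millionaires' problem concrete, here is a minimal Python sketch in the spirit of the 1982 protocol. It is a simplified illustration, not a faithful or secure implementation: the RSA primes are tiny, wealth is restricted to the integers 1 to 10, and all function and variable names (bob_round1, alice_round2, MAX_WEALTH, and so on) are hypothetical names chosen for this example. It also assumes both parties follow the protocol honestly.

```python
import random

# Toy RSA key pair owned by Alice. Tiny primes for readability only: NOT secure.
P, Q = 1009, 1013
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # modular inverse (Python 3.8+)

MAX_WEALTH = 10  # wealth is an integer in 1..MAX_WEALTH (hypothetical units)
SMALL_PRIMES = [p for p in range(500, 1000)
                if all(p % d for d in range(2, int(p ** 0.5) + 1))]


def bob_round1(j):
    """Bob hides his wealth j as an offset from an RSA ciphertext."""
    x = random.randrange(N)
    k = pow(x, E, N)            # encrypt x under Alice's public key
    return x, (k - j + 1) % N   # Bob keeps x and sends the offset value


def alice_round2(i, m):
    """Alice answers without knowing which decryption equals Bob's x."""
    # Decrypt every candidate m, m+1, ...; only position j yields x.
    ys = [pow((m + u - 1) % N, D, N) for u in range(1, MAX_WEALTH + 1)]
    for p in random.sample(SMALL_PRIMES, len(SMALL_PRIMES)):
        zs = [y % p for y in ys]
        # Require all residues to be at least 2 apart (mod p), so that
        # adding 1 below cannot make two entries collide.
        if all((a - b) % p not in (0, 1, p - 1)
               for idx, a in enumerate(zs) for b in zs[idx + 1:]):
            # Leave the first i values intact and add 1 to the rest.
            ws = [z if u <= i else (z + 1) % p
                  for u, z in enumerate(zs, start=1)]
            return p, ws
    raise RuntimeError("no suitable prime found; rerun the protocol")


def bob_round3(j, x, p, ws):
    """Position j is unchanged exactly when Alice's wealth i >= j."""
    return ws[j - 1] == x % p


def alice_at_least_as_rich(i, j):
    """Run all three rounds; i is Alice's wealth, j is Bob's."""
    x, m = bob_round1(j)
    p, ws = alice_round2(i, m)
    return bob_round3(j, x, p, ws)
```

For example, alice_at_least_as_rich(7, 3) returns True and alice_at_least_as_rich(3, 7) returns False. Alice only ever sees a value offset from a ciphertext, and Bob only sees residues modulo p, so in this sketch neither side is handed the other's figure directly; only the comparison result comes out.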
When MPC was proposed in the early 1980s, it was only a theoretical idea still awaiting practical validation. As computing power has steadily improved and private data has grown in use and importance, MPC technology has gradually matured and found applications.
Today, beyond the progress in MPC, privacy computing has also taken on new technical features and solutions. So what is the current state of its technical readiness and industrial application?
Privacy computing gestation period: the eve of large-scale applications
Why is privacy computing becoming more and more important now? It is not only that the leakage of citizens' personal data, mentioned at the beginning, has reached a point where governance is urgently needed. Data has also become the most important core asset of enterprise platforms, and enterprises now have every incentive to fully protect the data on their platforms and to use it in a compliant way.
This year, for the first time, China defined data as the fifth major factor of production, alongside land, labor, capital, and technology. Not long ago, the draft "Personal Information Protection Law" reviewed by the National People's Congress stipulated that for serious violations of personal information rights and interests, illegal income shall be confiscated and a fine of up to 50 million yuan or up to 5% of the previous year's turnover shall be imposed. The 5% ceiling even exceeds the 4% ceiling of the EU's GDPR, which is known as the "most stringent data protection" regime.
Whether for compliance and legal reasons or for the sake of data applications, companies are stepping up their efforts to protect data privacy. According to the latest strategic technology trend forecast by Gartner, an international research organization, privacy computing will be one of nine key technologies to watch in 2021. Gartner also predicts that by 2025, half of large enterprises will use privacy computing to process data in untrusted environments and in multi-party data analysis use cases.
These new trends place new demands on privacy computing and also open up a broad range of industrial applications for it.
On the technical side, there are two mainstream approaches to privacy computing. One relies on cryptography and distributed systems; the other relies on trusted hardware that takes in private data from multiple parties and outputs only the computed results.
At present, the cryptographic approach is represented by MPC, which is built from specialized techniques such as secret sharing, oblivious transfer, garbled circuits, and homomorphic encryption. In recent years its generality and performance have improved significantly, and it now has practical application value. The trusted hardware approach is mainly based on the Trusted Execution Environment (TEE), which builds a hardware-secured enclave so that data is computed on only inside that enclave. At its core, this approach still places trust in hardware vendors such as Intel and AMD. Because of its high generality and low development difficulty, it is valuable in scenarios where data protection requirements are less strict.
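Of the MPC building blocks named above, secret sharing is the easiest to illustrate. The following is a minimal Python sketch of additive secret sharing used to compute a joint sum, assuming honest parties that follow the protocol. The modulus and the function names (split_into_shares, secure_sum) are assumptions made for this example and do not come from any particular library.

```python
import secrets

MODULUS = 2 ** 61 - 1  # public modulus agreed on by all parties (assumption)


def split_into_shares(secret, n_parties):
    """Split a secret into n random-looking shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (secret - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recover a value by adding all of its shares."""
    return sum(shares) % MODULUS


def secure_sum(private_inputs):
    """Each party shares its input; parties add the shares they hold locally.

    Only the per-party partial sums are combined at the end, so no single
    party ever sees another party's raw input.
    """
    n = len(private_inputs)
    # distributed[i][j] is the share of party i's input that party j holds.
    distributed = [split_into_shares(v, n) for v in private_inputs]
    # Party j adds up the n shares it received (one from each party).
    partial_sums = [sum(distributed[i][j] for i in range(n)) % MODULUS
                    for j in range(n)]
    return reconstruct(partial_sums)
```

For instance, secure_sum([120, 45, 300]) returns 465, the same result as adding the inputs in the clear, while any individual share is just a uniformly random number. In this simulation all parties run in one process; a real deployment would send each share over a network to a separate participant.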
In addition, in the context of artificial intelligence and big data applications, "federated learning" is another major approach being promoted and applied in the field of privacy computing.
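The general idea of federated learning is that each participant trains on its own data locally and shares only model parameters, which a coordinator then averages. The Python sketch below illustrates this with federated averaging on a one-variable linear model. It is a simplified illustration under stated assumptions: the model, learning rate, and function names (local_update, federated_average, train) are hypothetical, and a real system would add protections such as secure aggregation, since raw parameter updates can themselves leak information.

```python
def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one client's private data."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum((w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Average client models, weighted by how many samples each client has."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def train(clients, rounds=200):
    """Coordinator loop: broadcast the model, collect updates, average them."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        # Each client trains locally; only (w, b) leaves the client.
        updates = [local_update(global_weights, data) for data in clients]
        global_weights = federated_average(updates, [len(d) for d in clients])
    return global_weights


# Two clients hold disjoint samples of the same relationship y = 2x + 1.
client_a = [(x, 2 * x + 1) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
client_b = [(x, 2 * x + 1) for x in (-1.0, 0.0, 1.0)]
w, b = train([client_a, client_b])
```

After training, w and b are close to 2 and 1 even though neither client's raw samples were ever sent to the coordinator.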
In the new technology cycle represented by artificial intelligence and big data applications, privacy computing sets a higher bar for data governance at Internet platforms and enterprises. The goal is to be truly user-centric: security no longer depends on data servers controlled by the enterprise itself or by third-party companies, and users genuinely control ownership of their own data while its security and privacy are protected.
On the industrial side, privacy computing application scenarios continue to expand.
Take the financial industry as an example. Domestic privacy computing products are currently used mainly for risk control and customer acquisition in finance: several financial institutions jointly build customer profiles and make product recommendations without disclosing customers' personal information, which can effectively reduce default risk in scenarios such as long-term loans.
In the medical industry, privacy computing lets medical institutions and insurance companies analyze insured persons' health information without sharing the original data. In government affairs, privacy computing can provide solutions that combine government data with data held by society at large, such as that of telecommunications and Internet companies. In the plans of some local governments, privacy computing is expected to become a focus of the next round of application promotion.
In the future, privacy computing will be widely used in fields that handle sensitive private data, such as finance, insurance, medical care, logistics, and the automobile industry. While solving the problem of data privacy protection, it will also help alleviate the problem of data islands in these industries and provide a compliant path for training and deploying large numbers of AI models.
A long way to go: the predicament of privacy computing and the way out
Now that society has entered the era of data as a factor of production, the mobile Internet has entered its second half, and the international situation has become unpredictable, data issues have grown more complex. For privacy computing, the legal status of the safe use of citizens' data, the analysis and application of data within and between enterprises, and the cross-border trading and circulation of data are all facing unprecedented challenges, and each of these areas has its own problems.
First, on the legal status of using citizens' data in privacy computing, China's laws do not yet state clearly whether privacy computing is legal. Existing regulations provide that "network operators shall not provide personal information to others without the consent of the person whose information is collected." The goal of privacy computing is to compute on data from multiple parties, which in principle conflicts with this requirement, yet it may also fall under the exception for information that "cannot identify a specific individual and cannot be restored after processing." This ambiguity has become the first legal bottleneck restricting the development of privacy computing.
Second, applying privacy computing inside enterprises is still difficult. For example, the data standardization and data quality of most enterprises cannot meet the data consistency that privacy computing requires among participants. The complexity and computational cost of privacy computing itself raise the bar for large-scale commercial use, and the cost of trial and error is high. In addition, privacy computing is something of a "black box" to the users who actually benefit from it. It is hard for people to understand and trust the technology, so the cost of popularization and acceptance is high.
In addition, cross-border trading and flows of data now face numerous difficulties. For example, in the US government's recent actions against TikTok, one accusation was that it collects data from American citizens, and one demand was that it not store that data on Chinese servers. In Europe, Ireland also ordered Facebook to suspend the transfer of its EU users' data to the United States. In 2016, the European Union adopted the GDPR, the world's most stringent data protection regulation, under which non-compliance with data privacy rules draws severe sanctions and huge fines. Previously, Google was fined 50 million euros by the French data protection regulator. Recently, the Swedish company H&M was fined 35 million euros for illegally monitoring employees' private lives.
With data supervision tightening and the international situation growing more complex, companies engaged in cross-border data activities need to reconsider their underlying architecture. To avoid having their data split up and handled separately by region, and also to avoid becoming locked in to the hardware giants, adopting new privacy computing solutions has become an important task for some companies with cross-border business.
These obstacles to applying privacy computing urgently need to be addressed by all parties. That includes active promotion by governments of regions and countries around the world, especially clear definitions in laws and regulations of the rights and responsibilities involved in privacy computing, as well as sustained investment by big data companies in corporate data governance.
For the technology companies driving the development of privacy computing, a series of new trends is now emerging.
The first is the emergence of blockchain technology, which offers new options for privacy computing. Combining privacy computing with the blockchain not only makes privacy computing results more tamper-resistant and verifiable to a certain extent, but also improves the confidentiality of data on the blockchain. This has become the direction of technology integration for many vendors. For example, one permissionless privacy computing service uses TEE-based trusted computing nodes around the world to ensure the stability and security of its computations.
Second, software-hardware co-design and platform integration are greatly improving the performance and convenience of privacy computing. Platform infrastructure provides hardware acceleration and capability orchestration for privacy computing, delivering a full range of capabilities from storage and computation to modeling and data mining.
In addition, privacy computing is moving toward large-scale distributed computing, and its implementations are becoming more diverse. Some projects offer low-code or even zero-code privacy computing products that greatly improve development efficiency and lower the barrier to entry.
Finally, in the "digital rights era," when data is becoming more valuable and data security more important, privacy computing will become the most important gatekeeper between protecting users' data and letting enterprises use the value of that data. Privacy computing companies must act as both data managers and service providers. That role is no longer simply comparing figures for the "two rich men"; it means providing them with full data protection and being able to operate their data "assets" on their behalf.
It is foreseeable that privacy computing will play a pivotal role in the future data governance and data collaboration between enterprises and organizations, as well as the commercial applications of emerging digital industries such as artificial intelligence and new infrastructure.