The Cyberspace Administration of China (CAC) has proposed a new set of draft regulations to govern AI-aided synthesis systems, including deepfakes, virtual reality scenes, text generation, audio, and other sub-sectors of AI media synthesis – a field in which China produces a prodigious number of academic papers and innovative research projects each month.
A post (Google Translation, original here) at the official site of the CAC sets out the proposed obligations and characterizes affected services as ‘deep synthesis service providers’. It invites citizens to comment on the draft proposals, with a deadline of 28th February.
Not Just Deepfakes
Though the suggested regulations have been reported in terms of their potential impact on the creation and dissemination of deepfakes, the document attempts an all-encompassing purview over the ability of algorithms to generate any type of content that could be interpreted as ‘media’ in the broadly understood sense.
Article 2 declares the projected scope of the regulations across six sectors*:
(1) Techniques for generating or editing text content, such as chapter generation, text style conversion, and question-and-answer dialogue;
(2) Technologies for generating or editing voice content, such as text-to-speech, voice conversion, and voice attribute editing;
(3) Technologies for generating or editing non-voice content, such as music generation and scene sound editing;
(4) Face generation, face replacement, character attribute editing, face manipulation, gesture manipulation, and other technologies for generating or editing biometric features such as faces in images and video content;
(5) Techniques for editing non-biological features in images and video content, such as image enhancement and image restoration;
(6) Technologies for generating or editing virtual scenes such as 3D reconstruction.
China criminalized the use of AI to spread fake news at the end of 2019, at which time the CAC expressed concern about the potential implications of deepfake technology, prompting many to speculate that the Chinese government would eventually ban deepfake technology outright.
However, this would entail China officially abandoning one of the most politically and culturally significant developments in the history of media generation, AI, and even politics, and cutting itself off from the benefits of global and open scientific collaboration.
Therefore, it seems that China is now determined to experiment with controlling rather than banning this potentially rogue technology, which many believe will ultimately move beyond its current phase, in which pornography has been the main accelerator, and into a legitimate and exploitable set of use cases, notably in entertainment.
Article 2.6 addresses the generation or editing of virtual scenes, such as 3D reconstruction, a more nascent technology than deepfake impersonation, and one that has gained most of its prominence over the past two years through the advent of Neural Radiance Fields (NeRF), where a photogrammetry-like process uses multiple photos of a scene to train a machine learning model in which the entire scene can then be explored.
However, NeRF is rapidly expanding beyond tableaux of models and walk-throughs of environments into the generation of full-body video, with Chinese researchers having advanced some major innovations in this respect.
Though NeRF has produced a blizzard of new research since its announcement in 2020, its implementation in VR or AR systems, and its suitability for visual effects pipelines, still face notable challenges and technological bottlenecks. Moreover, NeRF’s growing ability to reconstitute and edit complete human physiognomies has yet to incorporate any of the standard identity-transforming deepfake capabilities that have characterized news headlines over the last two years.
Audio Deepfakes a Priority?
If one is to take Article 2’s list order as an indication of the deep synthesis technologies which China is most concerned to control and regulate, this would suggest that text-based AI-generated fake news is of primary concern, with voice synthesis ahead of video deepfakes in terms of its potential impact.
If so, this accords with the fact that deepfake video has yet to be reported in any crime not related to pornography (several Asian jurisdictions have not hesitated to criminalize deepfake porn), whereas deepfake audio has been implicated in at least two major financial crimes, in the UK in 2019, and in the United Arab Emirates in 2021.
The new draft regulations oblige users wishing to exploit a person’s identity via the use of machine learning systems to seek written permission from the individual. Additionally, synthesized media must display some kind of ‘prominent’ logo or watermark, or other means by which the person consuming the material can be made to understand that the content is altered or entirely fabricated. It is not entirely clear how this can be accomplished in the case of audio deepfakes.
If ratified, the draft proposals would obligate deep synthesis service providers to register their pertinent applications with the state, in accordance with the existing Provisions on the Administration of Algorithm Recommendations for Internet Information Services, and to comply with all necessary filing procedures. Deep synthesis providers would also be required to cooperate with supervision and inspection, and to provide ‘necessary technical and data support and assistance’ on request.
Further, such providers would need to establish user-friendly portals for complaints, to publish expected time limits for processing those complaints, and to offer ‘rumor-refuting mechanisms’ – though the proposals do not detail how these mechanisms should be implemented.
Infractions could prompt fines of between 10,000 and 100,000 yuan (roughly $1,600 to $16,000), and could also expose offending companies to civil and criminal lawsuits.