Artificial Intelligence and Copyright: Rights and Remuneration
Giovanni Maria Riccio
2025
Abstract
Given the uncertainty over whether Article 5(1) of the InfoSoc Directive on temporary acts of reproduction applies (and hence whether copyrighted works need to be licensed), it is unclear what role copyright collecting societies (CCSs) will play in issuing licenses for the pretraining and training of AI technologies and, subsequently, in collecting the royalties generated by such uses. Recently, some copyright collecting societies have tried to prevent the use of the works in their repertoires, although the measures adopted appear unsatisfactory. For instance, SACEM, the main French copyright collecting society, has opted out of machine-learning systems by invoking Article L122-5-3 of the French IPC, which transposes Article 4(3) CDSM. SIAE, one of the two existing Italian copyright collecting societies, has made the same choice for its repertoire by invoking Article 70-quater of the Italian copyright law, which allows holders of copyright or related rights, including database right holders, to prohibit the extraction of text and data, thus limiting TDM activities. As far as we know, SACEM and SIAE have only sent cease-and-desist letters to the major AI platforms, prohibiting them from using works from their repertoires for machine learning activities; they have not put in place any technology that could identify works as belonging to their repertoires. Furthermore, it is questionable whether the mandate given by authors and publishers to SACEM covers SACEM’s right to opt out on their behalf. Is this an effective answer that meets the opt-out model already discussed?
In our opinion, this solution is ineffective (at least for the moment): under Article 4(3) CDSM, the burden of implementing machine-readable means that GenAI systems can detect lies with the copyright holders, and the reservation on the use of the works must be expressed “in an appropriate manner”. Yet these works are distributed on third-party platforms (YouTube, Spotify, Amazon), on which AI systems have been trained and continue to be updated: how does the mere declaration of a will to opt out of training satisfy Article 4(3) CDSM? Furthermore, the boundaries and interpretation of the notion of a “sufficiently detailed summary”, which Article 53(1)(d) of the AI Act requires of AI model providers, are debatable. In our view, providers of AI systems are not obliged to list individual works, not least because such a duty could prove practically impossible to fulfil. By contrast, the obligation should certainly cover the repertoires of copyright collecting societies, regardless of the type of works used (literary works, musical works, drawings and paintings, etc.). Similarly, also in light of New York Times v. OpenAI, we take the view that summaries should include not only the list of newspapers used for the pretraining and training of AI machines, but also additional details, such as the years covered or whether only individual sections of the newspapers were analyzed (e.g., politics, culture, sports, and so on). Another problem concerns the legitimacy of copyright collecting societies acting on behalf of their members. In fact, the main affiliation contracts, which will be analyzed in the paper, do not provide for this possibility and do not assign to CCSs the right to exercise an opt-out on behalf of the authors and publishers they represent.
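The point about third-party platforms can be made concrete with a minimal Python sketch of how a machine-readable reservation might operate. It rests on stated assumptions: robots.txt is taken here as one possible vehicle for such a reservation, and whether it satisfies Article 4(3) CDSM is a legal question this sketch does not settle. The crawler name `ExampleAIBot`, the domains `rightholder.example` and `platform.example`, and the helper `may_crawl` are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt on the rightholder's own website:
# the (hypothetical) AI crawler is excluded, everyone else is admitted.
RIGHTHOLDER_ROBOTS = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical robots.txt on a third-party platform hosting the same work.
# The rightholder does not control this file.
PLATFORM_ROBOTS = """\
User-agent: *
Allow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The reservation is honoured on the rightholder's own site...
on_own_site = may_crawl(
    RIGHTHOLDER_ROBOTS, "ExampleAIBot",
    "https://rightholder.example/catalogue/song-1",
)
# ...but says nothing about the copy hosted by the platform.
on_platform = may_crawl(
    PLATFORM_ROBOTS, "ExampleAIBot",
    "https://platform.example/watch/song-1",
)

print("AI crawler admitted on rightholder site:", on_own_site)
print("AI crawler admitted on third-party platform:", on_platform)
```

In this sketch the reservation binds only crawlers that visit the rightholder's own site. The copy hosted on the third-party platform remains reachable, which is the gap described above.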
The agreements with CCSs allow those entities to negotiate license agreements and to collect and distribute royalties, but they do not empower CCSs to prohibit certain uses across the board, especially where those uses may generate revenue for authors and publishers. In any case, CCSs are a necessary intermediary: it is unimaginable for every individual author or publisher to negotiate compensation with AI technology operators, since, again, transaction costs would be prohibitive. The issue may seem straightforward from a legal perspective (it would suffice to amend the membership agreements between CCSs, authors, and publishers), but it is less so politically. Many authors feel strongly threatened by the advent of artificial intelligence, and it is by no means certain that they would agree to grant a license allowing developers of AI technologies to train machines on their works. Two scenarios could therefore arise. In the first, CCSs may be compelled to reject licensing proposals from companies developing AI systems. In the second, opposite scenario, CCSs may have to separate the works of authors who have consented to the training of AI machines on their works from the works of authors who have refused. This latter solution could be time-consuming and expensive for CCSs, given that license fees are likely to be very low: technologies based on machine learning require millions of works to train their systems. Therefore, the cost of the license may not reflect the one currently applied to users of copyright-protected works, such as radio, television, and film or music streaming platforms. The distribution of royalties is yet another issue.
As mentioned above, Article 53 of the AI Act requires AI model providers to make available a detailed summary of the content used for training, and it remains unclear whether this summary may be limited to a list of the websites, platforms, or other sources that were scraped, or whether AI providers must instead report the repertoire of a CCS (which could be particularly complex, given that repertoires are frequently modified, even for individual rights, especially since the implementation of the Barnier Directive). These aspects should be clarified by a template that, under Article 53(1)(d), will be made available by the AI Office. However, it seems important for legal scholars and industry practitioners to give the AI Office guidance that helps it understand a market as complex as the collective management of copyright. Finally, how should the royalties collected by CCSs be distributed among authors and publishers? Analytically, by assessing the individual works used for machine learning, or according to other parameters, such as the other uses of an author's or publisher's works? In other words, some CCSs may consider the so-called market share of each member and then grant royalties proportionate to those distributed to that member in previous years. Again, the starting point is whether Article 5 of the InfoSoc Directive applies to pretraining or training activities and, if not, how to measure the use of the works, i.e., which royalties AI providers should pay.
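The two distribution criteria mentioned above can be compared with a short Python sketch. It is purely illustrative and does not describe the rules of any existing CCS: the function `pro_rata`, the member names, and all figures are hypothetical. The analytical criterion weights each member by the number of their works found in the training data, while the market-share criterion weights each member by the royalties they received in previous years.

```python
def pro_rata(pool: float, weights: dict[str, float]) -> dict[str, float]:
    """Split a royalty pool among members in proportion to their weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {member: pool * w / total for member, w in weights.items()}


# Hypothetical royalty pool collected from an AI provider.
POOL = 1000.0

# Analytical criterion: number of each member's works used for training.
works_used = {"author_a": 600, "author_b": 300, "publisher_c": 100}

# Market-share criterion: royalties distributed to each member in past years.
past_royalties = {"author_a": 2000.0, "author_b": 2000.0, "publisher_c": 6000.0}

by_usage = pro_rata(POOL, works_used)
by_market_share = pro_rata(POOL, past_royalties)

print("Analytical distribution:  ", by_usage)
print("Market-share distribution:", by_market_share)
```

With these invented figures the two criteria diverge sharply: the member whose works were used most for training receives the largest share under the analytical criterion but not under the market-share one. This is why the choice between them matters for authors and publishers.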