Image by Kohji Asakawa from Pixabay
Generative Artificial Intelligence constantly hits the news headlines; a brief search online produces thousands upon thousands of results. Popular culture reflects our concerns and thrills about future worlds in which Artificial Intelligence is cast as either our saviour or, more frequently, our doom. In an effort to determine how these rapidly evolving technologies will affect us, some countries have created regulations governing the compliant use of Gen AI.
Gen AI is transforming how we work, study and play. There are many benefits to embrace, but equally we cannot ignore the legal and ethical concerns. Understanding how Copyright and Intellectual Property (IP) law plays a significant part in the use of Gen AI will enable you to use these technologies responsibly in Higher Education Institutions.
Generative Artificial Intelligence (Gen AI) is a subcategory of Artificial Intelligence (AI). Whereas traditional AI systems are trained on specific data to perform specific tasks, Generative AI systems are trained on vast amounts of unstructured data so that they have a 'foundation' from which to adapt and perform multiple tasks and functions*. Simply put, traditional AI can recognise patterns to analyse and classify information, whereas Generative AI can recognise patterns to generate new content.
Foundation models** such as large language models (LLMs) like OpenAI's ChatGPT are able to recognise and generate text based on words they have seen before and produce new works such as poetry, stories or essays. Other foundation models, such as OpenAI's DALL-E 2, can create custom images from prompts based on the data sets they have been trained on, while GitHub's Copilot can complete code and provide code suggestions while developers work on the more complex areas of a project.
Generative AI uses machine learning techniques such as deep learning to create or generate new content from the data it has been trained on, similar to how a human learns but at a much greater speed and scale.
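As a simple illustration of what 'generating new content' can look like in practice, the short Python sketch below uses the open-source Hugging Face transformers library with a small pre-trained language model; the library and the model choice ("gpt2") are assumptions for illustration only and are not named in this guidance.

# A minimal sketch of text generation with a small pre-trained language model.
# Assumes the Hugging Face `transformers` library is installed
# (pip install transformers torch); the model "gpt2" is illustrative only.
from transformers import pipeline

# Load a text-generation pipeline backed by a small language model.
generator = pipeline("text-generation", model="gpt2")

# The model predicts likely next words based on patterns in its training data,
# producing new text rather than retrieving an existing document.
result = generator(
    "Generative AI is transforming higher education because",
    max_new_tokens=40,
    num_return_sequences=1,
)

print(result[0]["generated_text"])

Each run can produce different text from the same prompt, which is why questions about the provenance of the training data and the reproducibility of outputs, discussed below, matter.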
Some of the legal and ethical issues concern the lack of transparency around the training data sets. For example, where is the Generative AI getting its information from? Is it part of a data collection? Is the data freely available? Is the information out of date? Can the results be replicated? Will the work you create using the software be available for a third party to use? Does it contain copyright-protected works? Some Generative AIs can produce works of art that are almost indistinguishable from the data they have been trained on, which has caused huge concern about copyright infringement within the creative industries.
Another concern is the lack of references, which is highly problematic in relation to plagiarism. Generative AIs cannot autonomously evaluate the accuracy of the outputs they create, and they can also produce false outputs known as 'hallucinations'. 'Hallucination' is the term used when a Generative AI presents false information as fact in response to a prompt.
*For more information, please refer to IBM's "What are Generative AI models?"
**For definitions of foundation models, LLMs and Generative AI, please visit The Alan Turing Institute.
For an informative library guide that looks at generative AI, please see our Generative AI Guidance.
For an extensive Generative AI guide, including the University of Derby Code of Practice, see our Generative AI guide.
On our Policy Hub we also have Guidance on the Acceptable and Responsible Use of Generative Artificial Intelligence in Research.