A blog post explaining how GPT AI systems are problematic because they scrape content without recognising other people’s intellectual property and fail to understand the systemic bias embedded in historic datasets.

DALL·E 2 image from prompt: a black and white photo of an AI system trying to write a complicated post about the ethics of IP scrapping and toxic bias in historic datasets

GPT (Generative Pretrained Transformer) AI systems, like many other AI systems, are trained on large amounts of existing text data. This data is used to help the AI system learn how to generate text that is similar to the input it was trained on. However, this reliance on existing text data can lead to a number of problems, including the scraping of content without recognizing other people’s intellectual property and the failure to understand the systemic bias embedded in historic datasets.

One of the major problems with GPT AI systems is that they often scrape content from various sources without recognizing the intellectual property of the original creators. This can lead to the theft of content and can be detrimental to the creators who have spent time and effort developing their ideas. In order to avoid this problem, it is important for GPT AI systems to be trained on data that has been properly licensed and to be used in accordance with the terms of that license.

Another problem with GPT AI systems is their failure to understand the systemic bias that is present in many historic datasets. These biases can come in many forms, such as gender, race, and socioeconomic status. As a result, GPT AI systems that are trained on biased datasets may generate text that reinforces these biases, leading to the perpetuation of harmful stereotypes. In order to avoid this problem, it is important for GPT AI systems to be trained on diverse and representative datasets, and for their outputs to be carefully monitored for biases.

Overall, while GPT AI systems have the potential to be incredibly powerful tools for generating text, it is important to be aware of their limitations and to take steps to address the problems they can cause. By properly licensing the data used to train GPT AI systems and by monitoring their outputs for biases, we can help ensure that these systems are used responsibly and ethically.

This led to a second prompt:

A blog post explaining in 5 points the specific actions to take to prevent GPT AI systems scraping internet content without permission and regurgitating toxic bias embedded in datasets.

This was output by ChatGPT based on the two prompts mentioned in the title and in bold.


Sensory Design Consultant, usability researcher and workshop facilitator. www.linkedin.com/in/alastair-somerville-b48b368 Twitter @acuity_design & @visceralUX
