Generative artificial intelligence, or generative AI for short, has brought a range of opportunities, ethical questions, and issues, not least when it comes to evaluating the accuracy and reality of what it produces.
On this page we'll look at some things to consider when evaluating the information that generative AI tools like chatbots provide. But you should also take a look at the main Generative AI guide for our other pages, and at the University of York's generative AI guidance and policies:
Artificial intelligence involves using computers to do tasks which mimic human intelligence in some way. There's a huge range of methods, tools, and techniques that fall under the term artificial intelligence, and generative AI is just one of them. Generative AI often uses a branch of artificial intelligence known as machine learning, which focuses on getting computers to appear to learn how to do something, like recognising or generating certain kinds of content. Models are created using training data: the model is shown many examples of content, such as text and images, and can then be used to generate new content of its own.
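To make that idea a little more concrete, here's a deliberately tiny toy sketch in Python (the example sentences and the whole approach are simplified for illustration; real generative AI models use enormous neural networks and vastly more data). It 'trains' by counting which word tends to follow which in some example text, then 'generates' new text by repeatedly picking a likely next word:

```python
import random
from collections import defaultdict

# A tiny 'training set' (real models are trained on billions of words).
training_words = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat chased the dog ."
).split()

# 'Training': record which words follow which in the examples.
model = defaultdict(list)
for current_word, next_word in zip(training_words, training_words[1:]):
    model[current_word].append(next_word)

# 'Generation': start with a word and keep sampling a plausible next word.
def generate(start="the", length=8):
    words = [start]
    for _ in range(length):
        options = model.get(words[-1])
        if not options:      # no known continuation, so stop
            break
        words.append(random.choice(options))
    return " ".join(words)

print(generate())  # e.g. "the dog sat on the mat ." (varies each run)
```

The output looks plausible, changes every time you run it, and is assembled purely from patterns in the training examples; the program has no idea what a cat or a mat actually is. That's the principle worth holding on to when thinking about the far more sophisticated tools discussed below.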
Generative AI tools are applications, usually web-based, which allow you to harness the power of this AI generation. Commonly these come in the form of chatbots like Google Gemini, Microsoft Copilot and ChatGPT. Typically, you give the tool some kind of prompt, perhaps text and/or images, and it 'generates' content in return.
One important thing to be aware of with AI generation tools is that they are all based on datasets. The data that the AI tool has been 'trained' on affects the results you will get, in terms of quality, accuracy, and bias. You will see mistakes in content generated by AI, and you need to critically evaluate anything it generates. That training data might also be copyrighted or someone else's intellectual property (IP), which raises further issues about what people do with the outputs of generative AI tools.
Before we do anything in life, we have to think through the ramifications and potential consequences of our actions.
Generative AI has received a lot of attention for its potential detrimental impact on areas such as education, employment, privacy, intellectual property and climate. And we should indeed consider those ethical risks when deciding whether to engage with AI tools.
An important part of thinking critically is to consider the balance of those risks, and whether we're applying the same level of scrutiny to other technologies we might be inclined to feel more favourably towards. In other words, we need to examine our own biases.
Take environmental impact, for instance — a problem for all digital technologies. Our phones may contain metals mined by child labour; the videos we back up online take up storage space on server farms; the act of simply reading this page has probably dumped about a gramme of CO2 into the atmosphere somewhere. The thing is that a lot of the impact of the technologies we use is really difficult to estimate because life is just so complicated in our modern global society. How did the device you're using to read this reach you? And the components that went into its manufacture? And the data storage and transmission infrastructures? The electricity that's powering it all? And that's before you get to me as the writer of this paragraph, the equipment I'm using, the biscuit I just ate… There are compromises involved in measuring impact, just as there are compromises involved in using the technologies at all. Life is a horrible mess of compromises. Critical thinking is hard. We have to draw our own lines and follow our consciences. We need to decide whether those impacts are sufficient to stop us from experimenting with a technology and/or engaging with it entirely, be that AI tools or anything else.
Another important thing to consider is whether generative AI is helping you or whether it is actually making you worse at certain things. If you've got a magic box that can create things for you, does that make you less inclined to be creative yourself? You might not think you are particularly creative, but we are all capable of creativity. It just takes a little practice. The less we practise, the harder it becomes. Do you want to give AI tools all the fun and blunt your own abilities?
If I need to write a thousand words and I get the AI to do it, what do I gain? Time? Only if I don't proof-read it and correct all the mistakes. Only if I don't pass off the AI's work as my own and land myself in a heap of trouble when I'm found out. There are certain uses of AI which are the technological equivalent of copying the answers out of the back of the book. That's not often the best way to gain an understanding of a topic. Understanding takes effort — it might be nice if it didn't, but it does. Everything else is delusion.
On the other hand, generative AI is, undoubtedly, a really interesting technology that can do an awful lot of things. Some of those things might even be useful! At its simplest it can serve as an assistant, or a sounding board we can use to develop ideas — a second person in the virtual room. It can take an empty canvas and give you a starting point to work beyond. And just as the really interesting stuff you can do with a synthesiser lies beyond the simple replication of other instruments, so we can find art and interest in AI for its own sake.
There's a lot of hype around AI at the moment. AI tools are getting everywhere in our society. And maybe it's a bubble that will eventually burst, or maybe it's not. But right now it's a big thing and it may well transform aspects of our lives one way or another. So, simply from an educational perspective, it's probably in our interests to investigate this technology. From all angles. To see not just how it works and how we can use it, but also to question what we think of it.
One use that's often been promoted for generative AI tools is as a source of information, and that's something we've looked at elsewhere on these guides (Searching for information: AI tools). Because of how generative AI works, it can sometimes put together accurate information from the sources it has been fed, but these responses are just a jumble of different things chosen by probabilities. And the probabilities are linguistic, not logical: fragments of sentences cut-and-pasted from millions of sources and reassembled as something that looks 'typical'. The computer does not 'understand' what it's writing in an analytical sense — something demonstrated by the notorious inability of chatbots to count the number of letters in certain words. The reliability of the chatbot's responses is going to depend on a number of factors, including:
That's a lot of hurdles, and not necessarily the greatest recipe for a reliable source of information. As with so many things, we're going to need to think critically about the answers we're given.
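To see why that matters, here's another toy sketch, this time with completely invented numbers (this is a caricature, not how any real chatbot is implemented). It illustrates the 'linguistic, not logical' problem above: the continuation is chosen by how often words appear together in text, and at no point is anything checked against reality, so a statistically 'typical' answer can still be factually wrong:

```python
import random

# Invented probabilities for what word follows the prompt
# "The capital of Australia is" in some imaginary pile of training text.
# Sydney gets written about more, so it 'wins' more often -- even though
# the correct answer is Canberra.
next_word_probabilities = {
    "Sydney": 0.5,
    "Melbourne": 0.2,
    "Canberra": 0.3,
}

words = list(next_word_probabilities)
weights = list(next_word_probabilities.values())

# Pick a continuation by linguistic frequency alone; no facts are consulted.
print("The capital of Australia is", random.choices(words, weights=weights)[0])
```

Real chatbots are far more sophisticated than this and would usually get that particular question right, but the underlying choice is still statistical rather than fact-checked, which is why the hurdles above matter.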
A lot of the principles we can use when evaluating what an AI says are the principles we might apply for any source of information, but some may be more useful than others...
The first thing to consider is 'where is the AI tool getting the information?' — is it citing sources, and are those sources real and credible? How many sources are there, and is the AI tool representing them accurately? Realistically, we can't just rely on the AI's summary — we're going to have to do at least a little bit of our own reading to double-check.
When considering the sources themselves, and their credibility, think about who wrote them and why they were written. What biases might the authors have brought to the topic? You should also check when a source was written and whether that information may have been superseded. Consider, too, the level of detail the source brings to the topic — is it a superficial summary or a detailed account?
Even if you're using an AI summarising tool like NotebookLM, where you've provided the sources yourself and are relying on the AI to tell you about them, you still need to think carefully about what it tells you. For instance…
Q: Is the AI making the source sound more certain than it really is?
A: probably, yes!
Generative AI tools love to over-represent the information they convey, and to assert with confidence claims that the original source may have made far more tentatively. Don't rely on the AI as a gauge of sentiment or authority; look to the original sources and see how they wrote about the topic.
It's not. It's a machine. And it may have been programmed to behave in a human-like way, but even then that doesn't make it your friend. Still, it's easy to fall into thinking that it is our friend. Especially when its role is essentially to act as a colleague or an assistant. It's easy to drift into relying on the AI as a confidante.
If you're using a free AI tool then there's a serious risk that it will betray your confidences to others — it will gossip. If you're providing personal information to an AI (be it your own information or someone else's) then it may well be using that material as building blocks for future replies to other people. That's why we recommend you use Google Gemini with your University log-in — because we have assurances that it won't re-use your data.
And while the robot may be very convincingly human, it's ultimately just a probability machine — and one that's been programmed to write with enthusiasm and a sense of authority it doesn't necessarily have; it's programmed to try to please. Which means you've really got to take a step back and judge everything it says with that caveat in mind. If you find yourself unable to judge with that degree of objectivity — if you start to fall for its smooth talking — then you need to take a step even further back. Because the AI isn't just a gossip; it's also a yes-man who will tell you what it thinks you want to hear. And that's not a friend. That's something altogether more sinister — potentially even dangerous. Never ask an AI for advice on a life-or-death topic: don't ask it to help you forage for mushrooms; don't ask it for medical advice. That's not something you want to get wrong, and generative AI does, by its very nature, occasionally get things wrong.
There's nothing new about image manipulation. Since the birth of photography, we've been messing with the medium, overpainting details to make them something other than what they were. People have been airbrushing other people out of photos since long before Stalin made it fashionable, or if not whole people then at least 'undesirable' details like spots or stomachs. In the digital age it even became so common a practice that "photoshop" became a verb.
AI-generated images are just the latest complexity when it comes to believing what you see. The technology is developing fast. We could link to any number of webpages that list ways of identifying AI-generated images, like this reddit post, or this BBC article and quiz, but there's no telling how useful they'll be tomorrow. There are also all manner of quizzes you can take to test your ability at spotting AI-generated images (just do a web search for something like ai generated image quiz if you want a demonstration of just how hard it's getting). Maybe, as time goes on, we'll learn better 'tells' to help us spot when we're being deceived, but the truth is that being deceived has always been a risk, and it's only becoming more of a risk with each passing day.
Fundamentally, then, we can't rely on an image as the sole evidence for something. It's just one piece of evidence, and we'll likely need to look for other clues before we can gauge how far to trust it: clues like the trustworthiness or reliability of the source of the image or of the person or publication who shared it.
Academic publishing is a business, and while more and more academic literature is published in a way that is freely available online, there's still a significant amount which exists behind a paywall. The University spends a lot of money to access academic literature so that it's available for its members to read. And it also pays for specialist databases that can search that literature in a systematic and controlled way.
We don't know what data our AI tools are trained on, but it is very likely that they do not have access to everything that is available. They will not be able to search behind every paywall, and they cannot search all of the Library's collections. It's also possible that the data powering the AI model is not as up to date as you might need it to be.
And it's difficult to repeat a search with an AI. The model that powers it may give one result today and another tomorrow. This isn't necessarily a problem for quick searches scoping out the general picture of available literature, but a lot of academic research is about doing things that can be replicated and validated by others, and that's not really possible with an AI tool.
The lack of transparency over the coverage of what's being searched is a concern. And partial coverage may lead to biases within the responses you receive. Libraries have been doing a lot to 'decolonise' their collections — to try to overcome potential biases. The overwhelming majority of academic literature has historically been written by white western men of a certain age. And AI tools are averaging machines — they play the probabilities and return the 'obvious' answers, which means they may well end up amplifying the same voices.
The certainty with which these chatbots present their results is also something to guard against. They talk like experts but they're just cutting and pasting without any real understanding. A set of search-results is relatively neutral and easy to skim-read. An AI response may be loaded with additional values you'll need to critically peel away.
What's more, the information they provide might not even be accurate. A BBC study found that over half of its engagements with AI regarding news stories contained factual errors, and that's before we get to the other issues we've already noted.
It's easy to fall into the mistake of assuming that the robot is an authority: that it knows what it's talking about. But it doesn't. It's just working with probabilities: it's literally a chancer. Remember that you're better than it is: you can actually perceive and understand. It might be easier and safer to do the research for yourself than to rely on the robot to do it for you. It might also be more rewarding. Which brings us to the next point...
It's important to consider whether AI is helping you or whether it is actually making you worse at certain things. AI tools are being sold to us as shiny solutions to our problems but perhaps they generate more labour than they save. We've already noted how a list of search-results might be easier to skim-read than paragraphs of summary, and how the chatbots might not even be up to the task of research. But Microsoft's own research has also found that the use of AI can have a negative impact on our critical thinking. As students we need to learn how to do things; as researchers we need to engage with literature — it's part of the point. If we're getting a robot research assistant to do our work for us, we really need to be able to trust it, otherwise we're either giving ourselves extra work to verify what it's found for us, or we're putting our reputation on the line.
Take a look at our Being critical guide for more advice on how to critically assess sources of information: