University of York Library

Searching for information: a practical guide

AI tools

Searching for information is not always straightforward. We got a bunch of librarians to suggest some insider pointers and useful techniques.

Artificial intelligence, or AI for short, involves using computers to do tasks which mimic human intelligence in some way. It's something that's getting talked about a lot at the moment, with several high-profile tools having been opened up for public use. Chatbot tools like ChatGPT and Bing Chat have become particularly popular as a way of finding out information or generating answers to specific queries. Just as we might turn to a Librarian for an answer to something, so we might turn to an AI chatbot service.

On this page we'll be concentrating on AI tools as a reference source: as a way of finding out information. Our Digital Creativity page covers AI generation tools.

Once upon a time machine learning

It's worth pointing out that this sort of technology is not actually new. Chatbot programs like ELIZA have been around for about as long as the University has. ELIZA itself followed hand-written pattern-matching rules, but today's chatbots work by a process of machine learning whereby they're exposed to passages of text which are analysed to identify patterns of structure that can then be mimicked. It's like the computer has got a pile of books, a pair of scissors and a tub of glue. You ask it a question and then it flicks through the books to cut and paste together something that looks like it might be an answer. The more books you give it, the greater the probability of it putting together an accurate and intelligible response.

Put in very basic terms, imagine you've asked the bot to tell you a story. Well, stories often begin with "Once upon a time..." so the bot decides to start with that. As for what comes next, the word "time" is often followed by the word "machine", especially in stories. Let's try it. And if the bot's creators keep talking to it about the concept of machine learning, then "learning" is going to be a pretty strong candidate, probabilistically, to come after "machine". All the while it needs to check that what's being written makes 'sense' (at least with respect to what it's seen before), so maybe the next word needs to be "was": "Once upon a time, machine learning was..." and so on until they all live happily ever after.
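To make the "what word comes next?" idea a bit more concrete, here's a minimal sketch in Python. It's an assumption-laden toy, not how ChatGPT or any real chatbot is built: the four-sentence CORPUS is invented for illustration, and the function names build_table and continue_text are our own. All it does is count which word most often follows each word, then keep picking the most frequent follower.

```python
from collections import Counter, defaultdict

# A tiny invented "pile of books"; real systems learn from vastly more text.
CORPUS = [
    "once upon a time machine learning was new",
    "once upon a time machine learning was hard",
    "once upon a time there was a princess",
    "the time machine was broken",
]


def build_table(sentences):
    """Count, for every word, which words follow it and how often."""
    table = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            table[current][following] += 1
    return table


def continue_text(table, start, max_words=8):
    """Repeatedly append the most frequent follower of the last word."""
    words = start.split()
    for _ in range(max_words):
        followers = table.get(words[-1])
        if not followers:
            break  # the last word never appeared mid-sentence in the corpus
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


table = build_table(CORPUS)
story = continue_text(table, "once upon a time", max_words=3)
print(story)  # once upon a time machine learning was
```

In this toy corpus "machine" follows "time" three times and "there" only once, so "machine" wins. Swap the sentences for different ones and the 'story' changes with them.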


Ebora is a mock Chatbot librarian. It is a parody of Eliza which was described by Joseph Weizenbaum in 1966.

EBORA: Would you like some information ?
>yes please
What would you like to know ?
>the capital of peru
Would it help you to know the capital of Peru ?
>yes
Can you elaborate on that ?
>not really
Come come elucidate your thoughts.
>_
The sort of thing you'd've got with an old chatbot.
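ELIZA-style programs worked by matching what you typed against a list of hand-written patterns and filling in a canned reply. The sketch below shows that general technique in Python. The RULES list, the DEFAULT_REPLY text and the reply function are hypothetical, loosely modelled on the exchange above; this is not Ebora's or ELIZA's actual script.

```python
import re

# Hypothetical rules: each pairs a pattern with a reply template.
# "{0}" is filled with whatever the first bracketed group captured.
RULES = [
    (re.compile(r"^yes please$", re.IGNORECASE), "What would you like to know ?"),
    (re.compile(r"^yes$", re.IGNORECASE), "Can you elaborate on that ?"),
    (re.compile(r"^not really$", re.IGNORECASE), "Come come elucidate your thoughts."),
    (re.compile(r"^the (.+)$", re.IGNORECASE), "Would it help you to know the {0} ?"),
]

# Fallback for anything the rules don't cover (also an assumption).
DEFAULT_REPLY = "Please go on."


def reply(user_input):
    """Return the reply for the first rule whose pattern matches the input."""
    text = user_input.strip()
    for pattern, template in RULES:
        match = pattern.match(text)
        if match:
            return template.format(*match.groups())
    return DEFAULT_REPLY


print(reply("the capital of peru"))
```

Notice that the program never looks anything up: it just hands your own words back to you in the shape of a question, which is why the conversation goes nowhere.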

In the past, chatbots like this didn't have the storage or the speed to be able to respond especially effectively (as the above example goes some way to illustrating). But computing power has improved dramatically, and the internet has given the bots something huge to work with: countless webpages, social media posts, electronic correspondence and digitised publications; a library of unimaginable scale that can be cross-referenced at immense speed.

Even so, these 'AI' systems are still not very intelligent in real terms: they're just collaging bits of other people's sentences together based on patterns and trends. Predictive text does pretty much the same thing as programs like ChatGPT, albeit on a smaller dataset: it works out what you're likely to say next based on previous things you've written. If all you ever do is type nonsense, the predictions will be nonsense too. And that's something important to remember about AI technology like this: it's only ever as good as the information that's been fed into it.

Suppose you asked an AI "What's the capital of Peru?" but most of the data the AI had to work with was people doing bad jokes and naff puns... You might be more likely to get "P" back than "Lima". It's crowd-sourced data, and so you're very reliant on your crowd being right!
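Here's a deliberately crude sketch of that "reliant on your crowd" problem. It assumes, purely for illustration, that the 'AI' simply returns whichever answer appears most often in its data; the two lists of answers and the most_common_answer function are made up.

```python
from collections import Counter


def most_common_answer(answers):
    """Return whichever answer appears most often in the data."""
    return Counter(answers).most_common(1)[0][0]


# Invented data: what two different crowds said the capital of Peru was.
careful_crowd = ["Lima", "Lima", "Lima", "P", "Lima"]
joking_crowd = ["P", "P", "Lima", "P", "P"]

print(most_common_answer(careful_crowd))  # Lima
print(most_common_answer(joking_crowd))   # P
```

Same question, same code, different crowd: the answer is only as good as the data behind it.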

But then the internet's a bit like that, too, isn't it? And just as we have to think critically about the websites we encounter and the content within them, or even the academic sources we might find in a database, so we have to be vigilant about the 'answers' we get from AI tools. Like a daft dog with a wagging tail these bots are desperate to please us, but like a cat who's just left us a dead bird as a present, they don't always get it right.

Interrogating a chatbot

AI chatbots are, in a way, glorified search engines. They're searching across millions of records and using algorithms to piece together what other algorithms suggest are the required results.

In fact, Google has been using machine learning AI as part of its search ranking algorithms since 2015. The results you get from a Google Search are, in part, being determined behind the scenes by AI technology. That's even more obvious from the various extras you get in Google results these days like the content along the top and down the right-hand side. The answer we get in a chatbot AI, then, is not all that far removed from just going to the Google homepage and clicking on the "I'm Feeling Lucky" button.

The difference is in the way that that answer is presented and constructed. Rather than us being taken to a source that might, somewhere within it, give us our answer, that information is being presented to us by the chatbot, reframed in its 'own words' (words it's found in various places and glued together in a way it suspects will make sense). As with any source of information, the question we have to ask ourselves is 'do we trust it?'

Chatbots vs the encyclopædia

This way of presenting information is, again, far from new: encyclopædias are a more established way of condensing and collating large amounts of information for easy reference. Wikipedia is a well-established online, crowd-sourced reference tool that's so useful that we've already linked to it several times on this page alone. Since chatbots are also collating information from multiple sources, might they be as reliable a reference source?

The difference is in the way the information is being published: an encyclopædia will go through some form of editorial scrutiny, even if, as is the case with Wikipedia, that scrutiny is just a huge audience of potential editors. A chatbot's reply isn't vetted in quite the same way: the answer you're getting has an audience of one; it's been generated for you, there and then, with no real editorial control. Sure, it's getting its information from correlations within a huge database, but that's not quite the same thing.

And there's an additional problem: chatbots have an unfortunate habit of just making stuff up. They lie. Or rather, in the act of assembling a sentence based on probabilities, sometimes those probabilities take their replies to all sorts of strange places. It's almost as if, in an effort to please you, they sycophantically tell you what you want to hear.

That's where citation comes in useful. Look at a Wikipedia article and it's full of citations linking off to other, hopefully more reputable sources of information. And while we might read Wikipedia to get an understanding of a topic, if we're doing a piece of academic work then it's those more scholarly sources that we, like Wikipedia, will be needing to work with.

Some chatbots also give citations, and you'll need to explore these to confirm that what the bot is telling you is accurate, just as you need to pay attention to the citations in an encyclopædia. Occasionally you'll find on Wikipedia a citation that isn't really supporting what's being stated, and the same is very much true with chatbots.

When it comes to finding information, then, chatbots and encyclopædias perform very similar 'summary' roles, and in both cases we really need to pay attention to (and critically assess) the sources they claim to be using.

Reformulating your query

Just as with a search engine, it will very often take more than one attempt to get the information you need. With a web or database search you'll need to tweak the terms you're using based on the response you get, and the same is true with a chatbot AI tool. The benefit with the AI is that you're working in sentences: you're constructing your search requests in meaningful English, not in fragmentary keywords and operators. That makes your query reformulations much more natural — a conversation. The drawback is that you have to be even more precise about what you're asking (sometimes even devious!), and even more cautious about what you receive in return. You should treat everything you find with a degree of healthy scepticism, but that's even more true when the gatekeeper of information is a robot. Who knows what rubbish it's been reading?!

It's a puppet!

[Image: a sock puppet enters from screen-right with a daft grin on its face.]

Throughout all of the above we've kind of been writing as if these chatbot services are actually intelligent — that they're autonomous beings that are really reading lots of texts and thinking about what they contain. It's easy to fall into this trap of personifying the thing beyond the screen, especially when it's been programmed to respond in a lifelike manner. But it's important to remember that this is not actually an intelligent, conscious, thinking creature: it's lines of code sifting through tables of probabilities in order to assemble fragments of sentences in a naturalistic way. The choices it 'makes' are being governed by the algorithms written by its programmers, and the raw materials being used will also have been selected by those programmers.

We can approach this information in two ways: on the one hand we could get all philosophical and question to what extent our own human consciousness is actually just metaphorical cogs and code. On the other hand, we should also think about the role of the programmer and the influence of their choices when writing the software and when setting the criteria of what's included in the data.
