Description
In this episode, Lily and David aim to demystify data science across various disciplines. They use the Post Office’s automated sorting of mail to illustrate the power of AI systems, and then look at other case studies that highlight some of their limitations. This episode is part of the series aimed at Responsible AI for Lecturers, but is still relevant for those not following the course.
[00:00:00] Lily: Hello and welcome to the IDEMS podcast. I’m Lily Clements, a data scientist, and I’m here with David Stern, one of the founding directors of IDEMS. This is an episode specifically intended for our Responsible AI for Lecturers series.
Hi David.
[00:00:11] David: Hi Lily. It’s quite nice combining this in with our standard podcast episodes, but this particular episode, as you say, is related to the course which we’re developing for lecturers to be able to sort of interact with and engage with AI responsibly.
[00:00:29] Lily: Yeah, and this is the second one of them. So the one before was on using generative AI, and that was, I’m not sure if you listened to them in order, but was released…
[00:00:37] David: A week earlier.
[00:00:40] Lily: A week earlier, and this is the second of our five part series.
[00:00:44] David: Absolutely. I think we’re going to dig in today to really these elements about responsible AI. And these are things we’ve discussed in earlier episodes in the podcast series. But it’s something where we’re going to dig into this a little bit more specifically related to the needs, if you want, and how we think we can help to demystify data science and responsible AI for lecturers specifically.
[00:01:10] Lily: Yes, and I guess that’s the first point. So why do we need to demystify data science?
[00:01:17] David: I think in the context of responsible AI for lecturers, it is really important that lecturers engage with AI without being overwhelmed by it. The key point is, it’s not just for lecturers of data science, or of statistics or the mathematical sciences. If you’re a philosopher, if you’re somebody in psychology, if you’re in agriculture, it doesn’t really matter what domain you’re in, your world as a lecturer will be turned upside down, or will have been turned upside down, by the advances in generative AI in recent years. And hence, actually thinking about how to respond to this positively and responsibly is really critical.
[00:01:57] Lily: And I would say that students will be using AI. That’s not a choice. And so knowing the power and the limitations can help lecturers with their students in using it.
[00:02:08] David: Absolutely.
[00:02:09] Lily: Well, just that they can use it in a way that’s responsible and informed and that if we know the limitations of it, then we can make sure that we use it to the best of its ability.
[00:02:19] David: And the fact that that’s a moving target, because things are changing so fast, doesn’t change the fact that a lot of the basic principles remain the same. And therefore it is possible for lecturers, if they’re not trying to treat this as a competition, to be responsible in how it is used, to know how their students are using it, and to support that.
So let’s dig into demystifying it a bit, because that’s the heart of this.
[00:02:48] Lily: Great. Yeah. And I know where you want to start. And I guess that’s with this example that shows the real power of AI and data science.
[00:02:54] David: And I think the key point is this is an old example. This is before the new advances with generative AI, but it really highlights why this method is so powerful compared to maybe the more statistical or other research methods that people are familiar with. This goes back to the 90s, when AI was just starting to win at chess and therefore got into the news. This is one of the real outcomes of those same mechanisms, those same machine learning algorithms, and it has been really well documented; it’s a very common example to take. It’s the example of automatically reading handwritten postcodes to be able to automatically sort mail.
[00:03:48] Lily: Yeah. So the way mail was sorted was that it would be done by people, by humans, reading the postcodes and sorting out, okay, there’s mail for here, there’s mail for here. And my understanding, from what you’ve said, is that around the 90s they moved to machines doing it with algorithms.
[00:04:05] David: The key advance in the 90s was this ability to use machine learning algorithms to essentially turn the postcode part of the envelope into, basically, boxes of dots, and to then automatically train models to recognise which characters those were, to be able to classify and send the mail to the right place.
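To make that concrete, here is a minimal sketch of the same kind of task, handwritten digit recognition, using scikit-learn’s small bundled digits dataset rather than anything resembling the Post Office’s actual system. Each box of dots becomes a row of pixel values, and a simple classifier is trained to say which digit it shows.

```python
# Minimal sketch of handwritten digit classification (illustrative only,
# not the Post Office's system). Requires scikit-learn.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

digits = load_digits()                                 # 8x8 grids of "dots"
X = digits.images.reshape(len(digits.images), -1)      # flatten each box into pixel features
y = digits.target                                      # the true digit, 0-9

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=2000)              # a simple stand-in classifier
model.fit(X_train, y_train)                            # training = learning from labelled examples

print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```

A stronger model or more data would do better, but the shape of the problem is the same: labelled examples go in, and a classifier that can sort new examples comes out.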
[00:04:32] Lily: And then I guess the first point that I think of is, well, it can go wrong.
[00:04:38] David: Of course, and it did. You have the numbers probably better than I do, but I think it was 90 percent success rate to start with.
[00:04:44] Lily: Yeah, around 90 percent success rate to start with. And bearing in mind, this is before emails, or at least before emails were used as widely. There was a lot more post then, presumably.
[00:04:53] David: But whether there was more post or less post, it was an important function.
[00:04:57] Lily: Yeah.
[00:04:57] David: A little bit of history: in the Second World War, an African American female battalion was brought over to the UK to sort the mail and get rid of the backlog that had built up because of the war. And this was a huge deal, and the fact that they were able to do so brilliantly, ahead of schedule and all the rest of it, was a huge boon to the war effort. Because this is such an important function, the ability to automate it so much more, even if things were going wrong, was perceived as something which would really improve things.
And the key point about why this is so powerful is the learning, and the fact that at the beginning, good human sorters outperformed the AI systems. But with the AI systems, when they got something wrong, it would get caught somewhere else, get sent back into the system to be re-read, and be corrected. And that improved the algorithms. And so now the algorithms are way better than the human sorters. And that’s the key. It’s having humans in the loop: where things go wrong, the corrections are fed back in, enabling the systems to learn in constructive ways, in ways which are aligned with the correct answer, the truth, if you want. That is when the system really thrives.
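Continuing that toy example, a sketch of the human-in-the-loop idea (again purely illustrative, not the real sorting pipeline) could look like this: items the model gets wrong are later caught and corrected by a person, those corrections are folded back into the training data, and the model is retrained.

```python
# Illustrative feedback loop: human corrections of the model's mistakes are
# added to the training data and the model is retrained each round.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression

digits = load_digits()
X = digits.images.reshape(len(digits.images), -1)
y = digits.target

model = LogisticRegression(max_iter=2000)
train_idx = np.arange(200)                       # start with a small labelled set
model.fit(X[train_idx], y[train_idx])

for round_number in range(3):
    predictions = model.predict(X)
    mistakes = np.flatnonzero(predictions != y)  # items a human later catches and corrects
    corrections = mistakes[:200]                 # one batch of human-corrected examples
    train_idx = np.union1d(train_idx, corrections)
    model.fit(X[train_idx], y[train_idx])        # retrain with the feedback folded in
    accuracy = np.mean(model.predict(X) == y)
    print(f"after round {round_number + 1}: accuracy {accuracy:.3f}")
```

The corrections never expire: unlike an experienced sorter who eventually leaves the job, the retrained model keeps every improvement.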
[00:06:19] Lily: And then it gets to this point where the system is more powerful or is more accurate now than the human sorting it.
[00:06:27] David: Exactly. And with human sorters, their error rate would also have reduced as they got more experienced. But when somebody has got really experienced at sorting, well, they might not stay in the job forever; they cannot stay in the job forever. And so that experience eventually gets lost. It can’t be transmitted from one human to another in that perfect way, whereas with the automated system, that knowledge is never lost, those improvements are never lost, those learnings aren’t lost.
Now there’s all sorts of details around this which we’re not going to dig into, but it’s a really good image to keep in mind when you’re trying to think about what AI is really doing, is it’s feeding off humans who give it feedback to be able to learn and repeat those learnings continuously, and therefore that’s where the systems improve.
Now, if we’re thinking about demystifying it, I think there’s one other instance I’d like to take where I see a real challenge. And this is about birdsong, of course.
[00:07:37] Lily: Your other favourite story.
[00:07:38] David: It’s another favourite story of mine. And again, these are ones where we have whole episodes around these ideas, so if you want to go into the backlog, you can dig into more detail. And the key point here being that the algorithms that identify birds using birdsong are fantastic, and you can get apps on your phone to do so. What I struggle to see is how we can put humans in the loop, so that those systems can continue to improve over time, and get better and better, to the point where you can really rely on them if you’re doing conservation or something like this.
And let me give a very concrete example, where a colleague of ours used such an app and it claimed there was a very rare bird nearby, and they eventually traced it to find that, no, the sound was coming from a toad, which was being misidentified.
Now that might seem like a bit of fun, or not really that important, but if you were thinking about this as a conservation effort for rare birds, and you miscalculated the number of rare birds because you were confusing a population of toads with that rare bird, this could have really negative consequences. This could lead to the extinction of certain species, if you are purely relying on this.
And so it’s about thinking through and understanding not just what it is that the algorithms can do, and what they can do is amazing, really incredible, because of the ways they can learn, but also recognising where not to rely on that, where we need to be cautious and careful. Because it doesn’t know the truth by default. It only knows the truth that it is given, and it can bring out things which are simply not correct without anyone really noticing.
[00:09:32] Lily: And I think what’s often really difficult is those limitations. When you think of, say, ChatGPT, it comes out so confident, it reads so confidently, and the same with these kinds of birdsong apps: the output comes out so confidently that, particularly with birdsong, you’re often not going to have someone go round and check that it is the correct bird. That’s a really long, difficult task. So how can we build these kinds of feedback loops in? I can see where it is, or partly where it is, in generative AI: they let you rate the answer, sometimes they give you two different answers and ask which one you prefer out of these, and that’s helping their training models.
[00:10:13] David: But it’s helping them on preference. It’s not helping them on substance. If your generative AI is giving you something which is actually false, that feedback doesn’t distinguish it, doesn’t say, no, let me correct this. There are models which do have elements of recognising this built in in different ways, and are trained in different ways on that, and there are some really interesting advances happening in how you can do that.
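As a deliberately crude illustration of this point (a toy, not how any real chatbot is trained), preference feedback of the kind Lily describes can be thought of as fitting a model that predicts which of two answers a user will pick, from features about style; nothing in that data says whether either answer is factually correct.

```python
# Toy pairwise-preference model: learns which answer users prefer from
# style-like features; factual accuracy never appears in the data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
style_a = rng.normal(size=(500, 3))              # hypothetical style features of answer A
style_b = rng.normal(size=(500, 3))              # hypothetical style features of answer B
prefers_a = (style_a[:, 0] > style_b[:, 0]).astype(int)  # simulated choices driven purely by style

preference_model = LogisticRegression().fit(style_a - style_b, prefers_a)
print("agreement with simulated preferences:",
      preference_model.score(style_a - style_b, prefers_a))
```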
However, this distinction between substance and style is really important, in terms of recognising that in generative AI the advance which has happened, which has blown people away, is really about style. It’s about the fact that it’s passed the Turing test. And we should remind our listeners what the Turing test is: it’s where an external observer, a human observer, who is observing human-to-human interaction and human-to-AI interaction, can’t identify which is the AI. That’s the Turing test.
And many of the models, under certain circumstances, are now passing this test; you can’t easily identify that they are AI. And this has played out in the education sphere, where we recently had the University of Reading case, where all but two of 33 AI-written assignments that were graded were not identified as being AI: 31 slipped through, and only two were identified.
That’s where we are at the moment. There are also stories out there about people who have had their assignments identified as AI even when they’re not. So there are false positives and false negatives, and actually being able to differentiate between what is AI and what isn’t is hard, and it’s going to get harder and harder as we go on.
[00:12:15] Lily: And I think that’s a really good point. That’s one that I’ve not thought of in that way, but you’re right. When ChatGPT, for example, does give you those two options, the substance is the same behind them; it’s the style which is different. And I know that you’ve explained to me before how these kinds of machine learning and AI work.
I definitely cannot remember it enough to explain it back, but maybe that’s something that we could dig into, the kind of how it works, by looking at…
[00:12:40] David: The internet, for example. It’s learning from such a huge amount of text which exists and which is out in the world, and that’s what’s in its banks, in its memory, so to speak. And so it’s able to identify elements and ways to use that, to communicate something back in a particular style. You can ask it to use a particular style, and it might adapt to what you like as a style, and so on.
This ability really is extremely powerful, and it comes from underlying algorithms which are not yet very well understood, where these huge data systems have, if you want, advanced mathematical models applied to them, and then you have these learning processes where the system is able to identify, based on the learning, which styles are appreciated in different ways. And often that’s a very human process, so you have to have humans in the loop to enable that learning. And sometimes this is something that people forget.
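At a vastly smaller scale than any real large language model, and purely as a sketch of the idea of learning from text to produce more text in a similar style, one can record which words tend to follow which in some training text and then generate by repeatedly picking a plausible next word. Nothing in the process checks whether what comes out is true.

```python
# Toy next-word generator: learns word-to-word transitions from a tiny corpus
# and produces plausible-sounding text with no notion of truth.
import random
from collections import defaultdict

corpus = ("the mail is sorted by the machine and the machine learns from the "
          "corrections that the human sorters feed back into the system").split()

next_words = defaultdict(list)
for current_word, following_word in zip(corpus, corpus[1:]):
    next_words[current_word].append(following_word)   # record every word seen after this one

random.seed(0)
word, generated = "the", ["the"]
for _ in range(12):
    choices = next_words[word] or corpus               # fall back if the word ends the corpus
    word = random.choice(choices)
    generated.append(word)

print(" ".join(generated))
```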
Take the Amazon example of this, where they were trying to have Amazon shops, Amazon Fresh I think they were called, which automatically tracked you during your shopping so that you paid without going to a checkout. But it was then recently closed, because actually their AI algorithms hadn’t progressed fast enough. And so really what this was doing was removing jobs from wherever the supermarket was and employing people in India, who then watched that whole video of what people did to figure out what they put in their shopping basket. So this wasn’t AI at all. This was HI, human intelligence.
[00:14:33] Lily: And we don’t know, anyway, okay, what can AI actually achieve and what can it not achieve? You see these things like the Amazon shopping and think, that’s incredible, how are they doing that? And they’re claiming it’s AI.
[00:14:47] David: The point is they did have AI models. It’s just that the AI models were not yet reliable enough; the problem was too complex. In the future, I’m sure we’ll be able to get models that can do it, but their models at the moment weren’t good enough. And so basically every shopper was going through and being watched by somebody in a low resource environment, paid very little. That’s how they were hoping to train the AI, but the AI models were not yet up to it. And that’s an important thing to remember: that boundary is very fluid. Maybe in 5 or 10 years it’ll absolutely be possible. But nobody really knows where that boundary is at this point in time, because it keeps changing, it keeps moving.
[00:15:29] Lily: So Amazon’s decided to pivot away from this “just walk out” approach, and instead they’re now doing smart shopping carts. And so it shows that they’re moving towards a simpler, still pretty impressive, but much more straightforward technological solution, because there are practical challenges.
[00:15:50] David: But smart shopping carts have existed, not in that form, but in other forms, for a long time. So there’s nothing that innovative about it. A big part of the attraction was the innovation behind it and the way that this was actually working. And so I think, you know, we’re getting a little bit distracted, so we need to bring this back to responsible AI and thinking about how this relates to lecturers.
But I think the important thing to remember is that boundary between what AI can do now and what it can’t do; this is a challenge for that next generation. The students coming through now are going to be continually updating what is possible and what’s not possible, as advances come and go, in a sense. And for us to be able to support them in their learning, and to enable that generation to be ready for that world, which is going to include AI and generative AI in very concrete ways, what do we need to understand about the processes so that we can be responsible in how we think about it ourselves?
And so I want to recap, if you want, some of the key things we’ve mentioned already. So we’ve mentioned this fact that in the responsible AI thinking, understanding that it is humans in the loop who are leading that learning through the feedback they give to the system is critical. And that’s really central to any AI system that’s currently out there.
If you don’t have that, there have been cases where things have gone off track where, you know, the way the system’s learned has not gone according to plan, so to speak. And that has led to a number of scandals. And that is a concern when we think about very powerful AI systems, which have been designed, which work now, but which don’t have obvious systems in place to ensure that the learning system is getting more reliable.
And this will be the case, I think, for generative AI systems for quite a while to come, where for many topics and areas of knowledge, we would expect the AI to be able to do a reasonable job, but not to really understand the substance and consistently draw out and give correct substance.
So in terms of using it responsibly as experts, we need to recognise that AI will not replace expert knowledge, or the ability to fact check, for quite a while yet. Those are going to be areas where experts who really understand what they’re talking about will be able to get at nuance and make a more nuanced argument. AI doesn’t deal very well with nuance, for obvious reasons: as data, two statements can sit very close together, and yet the nuance can mean that they are actually very different in meaning. And that might be contextual. So it’s very difficult to train for that nuance.
And similarly, it’s very difficult to train it to be able to tell the difference between something which is true and something which merely appears as if it could have been true. Whether something actually happened or didn’t happen relies on processes being put in place which are beyond the scope of current AI systems.
And there are subjects where there are ways to build some of this in, verification steps and so on, but there are others where that’s just not feasible.
[00:19:41] Lily: I think that there’s one thing that we haven’t yet touched on in terms of the limitations.
[00:19:47] David: Yes.
[00:19:48] Lily: Because we’ve spoken about these feedback loops, but there’s this other limitation of that kind of human interpretation.
[00:19:55] David: Oh yes, and this is so important. Our most powerful case study for this is an instance, again we’ve done an episode on this in the past, where in the Netherlands they used an algorithm to try to identify fraud in childcare benefits, which is, I think, a fair way to describe it. And in that algorithm, there may have been issues, and there were, but the main issue was in the interpretation: the algorithm was predicting risk of fraud, and this was then interpreted as incidence of fraud, which couldn’t actually be verified or contested.
And so it led to really terrible consequences. And fundamentally, again, it comes down to awareness about what the algorithm could do, which it was doing relatively successfully, with problems, which was identifying risks, identifying those at risk. But the human interpretation was that the outputs were telling them who was fraudulent or not.
And there wasn’t the data. There was no way, with the data that was going in, that the algorithm could know who was fraudulent or not. It could only identify elements which correlated with risk of fraud, and therefore where it might have been worth a follow-up or some human intervention.
And the key point is that if you think about the issues around risk, of course, this is highly correlated with issues around poverty and other forms of marginalisation within society. And it might be that the correct human response to the identification of risk of fraud would have been to go in with the idea of offering support, to reduce the incidence of fraud. To actually recognise, okay, this person may be at risk because they’ve lost their job and they have no money coming in, and therefore their incentives have just changed, or whatever it is.
That might have been a moment. I’m not saying that’s what was happening; we don’t have the details on exactly what the algorithm was identifying. We do know that there were issues around race there, and there were other issues as well. But what I do know is that with this type of algorithm, which is identifying risk, the most logical response to risk should be support, not assumptions of guilt. Because when somebody is at risk, it is likely that they are at risk because of challenges that they are facing that could lead to this behaviour. There could be other reasons, and there could be other elements which are leading to that identification of risk. But if you understand what the AI algorithm is giving you, then you can think differently about how to respond to it.
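A small sketch of the distinction being drawn here (a toy example, nothing like the Dutch system) might look like this: the model only produces a risk score, and a responsible use of that score is to flag cases for human review, possibly an offer of support, rather than to treat a high score as a finding of fraud.

```python
# Toy risk model: outputs probabilities, which flag cases for human follow-up;
# a high score is not a finding of fraud.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 4))                            # hypothetical case features
y = (X[:, 0] + rng.normal(size=1000) > 1.5).astype(int)   # simulated historical outcomes

risk_model = LogisticRegression().fit(X, y)
risk_scores = risk_model.predict_proba(X)[:, 1]            # a probability of risk, nothing more

# Responsible interpretation: prioritise the highest-risk cases for review and support.
flagged_for_review = np.argsort(risk_scores)[-10:]
print("cases flagged for human review:", flagged_for_review)
```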
[00:23:03] Lily: I was just going to add and say, yeah, it was that kind of lack of human oversight, or the misinterpretation of what the algorithm was saying. Maybe part of that is not understanding the limitations of AI at the time, thinking, okay, it’s what the AI says, and trusting it too much. I’m not sure.
[00:23:22] David: Absolutely. And it’s this element that we recognise quite often, quite well: our own limitations in understanding. We could be fooled, we could be misled, because of what we see, because of what we observe. Whereas, actually, if we think about what the algorithm is basing its conclusions on, very often it has less information than an expert would have, somebody going in to observe, who would still find it difficult to actually make a valid judgement. So it’s very easy to pass responsibility to the AI systems. But this is something where there is now regulation coming in in the EU which will make this unlawful.
And I think this is something where, when thinking about responsible AI and how we need to be aware of what it can and can’t do, it’s really important to articulate that the role AI will play in the future depends in large part on how we as a society understand what it is capable of, what it is doing, and the power we give it.
And so one of the reasons we’re so keen on promoting for lecturers the use of AI responsibly with their students is because this becomes part of that wider effort to ensure that there is a large cohort of the population that understands some of these subtleties around AI, about what it takes to use it responsibly, whatever their area of specialisation and expertise may be.
[00:25:23] Lily: Just to say a final thing on the Dutch childcare benefits scandal: there were absolutely no winners there. The prime minister and his entire cabinet resigned over it, they ended up paying 30,000 euros to each affected family, and they made no financial gain from it. But most important were the victims, the 26,000 families who went through these ordeals, some of them for nine years or so, with horrific outcomes.
[00:25:55] David: Absolutely. And the point is that’s where this education is so important.
[00:26:00] Lily: Yeah.
[00:26:00] David: And this was all pre-generative AI. So this wasn’t the issue of us not being able to identify AI; this was simply the issue of us misusing AI. And so now that generative AI has entered into our public consciousness much more, it is an opportunity for us as a society to try and take stock of what it is doing, to recognise that it is just a data analytic tool. It is taking in data and using it in extremely powerful ways either to classify things, as in the case of postcodes, which is the same as in the Dutch childcare scandal, where they were classifying people in certain ways, that’s what the algorithm was doing, or to generate things, to actually produce things based on the data that has been brought in and the learning. And that’s what it is: it is just a set of approaches which are able to combine large amounts of data with learning processes to produce outputs of various types.
And that means that at the heart of it, whatever we’re using, whatever we come out with, whatever we use as an output, we need to also have some awareness of what’s being fed in, and of the learning which has led to the outcomes we’re looking at. And one of the things which I think is so important on this is that if we’re not careful in how we use this, those learnings could lead to a loss of diversity in the way we speak, in the different writing styles we have, and so on. If we start to rely on AI in ways where we’re not enhancing this diversity in the ways we communicate, then we may in the future live in a society where we have become more standardised, and we will have let that happen to ourselves.
And this, I think, is a final thought in terms of this idea: as lecturers, when we are using and enabling our students to use AI, how can we do so in a way which increases that understanding, those skills, to use it to enhance the different ways that we can communicate, rather than just reproduce the things that it thinks are standard, are the best answer?
And I think this comes back to the previous session, where we discussed the different ways to use generative AI for brainstorming, for iterations, and for polishing. And that process is really important.
[00:29:12] Lily: I absolutely, yeah, absolutely agree. Do you have any final thoughts?
[00:29:17] David: I guess maybe just the very final thought is that if we have lecturers, or non-lecturers, engaging in this process, one of the things to take away is that we’re trying to create this experience for lecturers which will empower them and enable them to seize the opportunities presented by generative AI in a way which will enhance learning for their students rather than detract from it. So while we’ve spent a lot of this episode on warnings, trying to demystify a bit of what’s going on and why it’s led to these issues and these things that can go wrong, we do want to emphasise that if you understand what can go wrong, it can help you to use it responsibly and to communicate to others how to use it responsibly.
[00:30:19] Lily: Absolutely. Thank you very much, David. It’s been great to chat and I look forward to next week’s on Friend or Foe.
[00:30:25] David: Oh, yes. Thank you.
[00:30:26] Lily: Thank you.