ADCET

ILOTA Things: Episode 6 - New Frontiers of Expression: I'm All Ears

ADCET Season 1 Episode 6

Welcome to ILOTA Things, the ADCET podcast where we explore Inclusive Learning Opportunities through AI.

In this episode, titled New Frontiers of Expression: I'm All Ears, we're going to take a look at how some of the modern AI tools can help educators and students create and use audio and speech, providing multiple modes of engagement, expression and representation.

More information, including episode notes and links, is available on the ADCET website.

Announcer: Welcome to ILOTA Things, the ADCET podcast where we explore Inclusive Learning Opportunities through AI. In this series, we'll explore the exciting convergence of universal design for learning, UDL, artificial intelligence, AI, and accessibility, and examine ways in which we can utilise emerging technologies to enhance learning opportunities for educational designers, educators, and students. Now, here are your hosts, Darren, Elizabeth, and Joe. 

Elizabeth: Hello and welcome from whenever, wherever and however you are joining us and thank you for your time as we investigate ILOTA things, that is, Inclusive Learning Opportunities Through AI. My name is Elizabeth Hitches and joining me on the artificial intelligence, universal design and accessibility deep sea dive are my co-hosts Joe Houghton,

Joe: Hi from Dublin,

Elizabeth: and Darren Britten,

Darren: Hello from Australia.

Elizabeth: In this episode, titled New Frontiers of Expression: I'm All Ears, we're going to be diving into AI and the conversion and generation of sound, audio and music. This ability to convert text to sound is creating opportunities for content to be created and shifted into different formats, giving learners and educators the ability to express themselves and their ideas in new and exciting ways. As usual, we will present a buffet of some of the AI tools that are available in the audio space and examine how we can use audio to support a UDL approach. It might be altering formats into audio to provide multiple means of representation. It could be providing ways for students to act on information and express what they know through auditory means, or we might even be adjusting formats to support student engagement. Each of these could be explored in depth all on their own, but today we want to explore the possibilities broadly first. So over to you, Darren, and the first thing that springs to mind for me, even prior to the capabilities of AI, is text-to-speech. So, from that format-shifting perspective, why is this important for accessibility?

Darren: Thank you, and it's a great question, because that notion of format shifting with text-to-speech has been around for a long time. Think back to a blind student who may not have been able to use a physical book: we used to employ people to read to cassette tape and then produce audio versions, old-school analog audio, of those learning resources. Cut to maybe a decade, a decade and a half ago, and some of the tools that came out at the time, like TextAloud, used some very early synthetic voices, the very robotic-sounding voices that were installed into the early versions of Windows and those kinds of things, where text could be read out. Very artificial-sounding as they certainly were, they've improved dramatically over that decade and a half, and in the last couple of years they've improved tremendously, to the point that they're now very human-sounding voices.

There's a whole range of reasons that we've used this, particularly with that lens of students with disability, where they may be using it for a range of different reading difficulties or learning disabilities. A common one is dyslexia, where having the text highlighted while you're hearing the audio at the same time is hugely beneficial, and that's been a standard go-to tool in a whole range of scenarios for students to assist them with their learning. So it does a range of things, from improving reading comprehension, which can be useful for any student, to simply giving options: I don't want to sit and stare at the computer for hours on end getting fatigued, I just want to hear this. I don't need to concentrate; there are no tables, no diagrams, it's a narrative that I need to listen to, or a story, or whatever that might be. It also enhances independence. With these tools, somebody doesn't have to read it to tape for me. I can now put my text in there, or I can drag across a PDF of a learning resource and say, can you read this to me? Or I might just convert it into an MP3 that I can listen to on public transport on my way home, or in the car on the way to my job. It helps anywhere and everywhere become a learning environment, so I can take that resource in those multiple modes, and with the voices getting better, it's much nicer to listen to and far more nuanced than it used to be.
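
A quick aside for the show notes: that drag-a-PDF-across, get-an-MP3-back workflow is simple enough to script yourself. Here is a minimal sketch in Python, assuming the pypdf and gTTS packages, which are our own illustrative choices rather than tools mentioned in the episode (gTTS also needs an internet connection):

    # Minimal sketch: turn a PDF learning resource into an MP3 with text-to-speech.
    # Assumes: pip install pypdf gtts
    from pypdf import PdfReader
    from gtts import gTTS

    # Pull the plain text out of every page of the PDF.
    reader = PdfReader("lecture_notes.pdf")
    text = "\n".join(page.extract_text() or "" for page in reader.pages)

    # Synthesise speech and save it as an MP3 to listen to on the go.
    gTTS(text, lang="en").save("lecture_notes.mp3")

The dedicated tools discussed in this episode add niceties like voice choice and synchronised highlighting, but the underlying format shift is just these two steps: extract the text, synthesise the audio.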

Look, it also boosts confidence for some students, hearing words played back and going, oh, that's the word. That's particularly true for some international students, or where English may not be a first language: on the page it's just text, but then hearing it, oh, that's how I say seismology, or that's how I say this term. Again, AI has improved that no end. It used to be that a lot of those words were pronounced wrongly and you had to add pronunciation edits and so on into the mix.

Elizabeth: Another example I was just thinking of is also when students are editing documents. For some students they may be reading over that text multiple times and either it's incredibly familiar to them or when they read it they just don't realise those errors. But sometimes, when that text is read aloud, that's when they realise oh okay, that's not the word I meant to type, that's an error. Or perhaps they realise that a couple of words may be in the wrong order and need to be adjusted, so benefits there even for editing.

Joe: Yeah, I teach business writing as a course, as you know, and that's a tip that I use a lot: read your work aloud. Even just when you've written your work, if you're writing a long email or any kind of piece, read it aloud, because of the cadence that you speak at, the fact that you've got a long sentence and you suddenly realise that you're running out of breath because there are no commas in it, or that sentence went on for 15 paragraphs and maybe you should cut it down and make it a little bit shorter. So having something read aloud, particularly in a regular, human-sounding voice rather than the robotic stuff we've had for years, can really give you a sense of the flow of your piece and help you perhaps break it up into more audible, friendly chunks, if you like. And there are clearly loads of tools out there that will help us do this kind of stuff now. In the show notes, I've put together a little web page that's going to point you at dozens of tools in this space: text to speech, text to video, creating music, all the kind of stuff that's covered in the title today. But some notable ones just to be aware of, perhaps. Speechify is a well-known speech tool, now a very powerful platform, that provides things like adjustable reading speeds. Even with old documents, it will provide OCR, optical character recognition, for scanning in an old PDF of a paper from before things were put out as free text, so you can listen to pretty much anything. If you can get it on the screen, now you can listen to it, and tools like this are really good.

Darren mentioned earlier on that sometimes he might want to convert something to, say, an MP3 file for use on his phone, to play in his headphones on the bus or whatever. AI Reader is a tool which will take books, PDFs, Word docs, video scripts, whatever you like, and save those as MP3 files. So there are lots of ways, and that harks back to last episode, where we were talking about conversion from one format to another. So don't be constrained by the fact that you have material in one format. If it's a document, you can now listen to it, you can now represent it in different ways, and obviously that's very UDL friendly, which is one of the three pillars of the podcast.

So there are many, many ways of doing this kind of stuff and then saving those files in different formats for consumption in different environments. You might want to consume this listening in your own headphones on the bus, or you might want to listen to it in the car, and our devices give us this capability. So lots to go on.

Darren: Just harking back to using it as a revision tool: even with the earlier versions of these things, I've been recommending to students for a couple of decades now, have your own essay played back to you, even with the robotic voices. Something I'm at fault of doing quite regularly is putting "the" and "the" next to each other, and I only read one "the". I don't see it when I'm reading it, but if I have it played back, it's really obvious, even with the computer voices, when it reads out "and such and rather the the", and you go, oh, there's something completely wrong there. So it helps some students to do that as their first pass of proofreading, to go back and hear what's in there. And Joe, you've probably had that experience with students as well.

Joe: Yeah, a lot of my students are multicultural; they're not just from Ireland. In my project management class of 50-odd, I typically have about 14 nationalities, so I might have 20 Indian and 15 Chinese students or whatever. A lot of these students are coming to education in Ireland with English as a second or even a third language, and I'm always amazed at the bravery of coming to do a master's programme not only in a different part of the world but in a language that isn't your native one. And the tools now allow these students to get translation in real time, or to record a session.

I don't call them lectures anymore because I don't do lectures anymore; I've learned that lectures are not a good thing. Now we run workshops and facilitation sessions and exploration sessions. But I tell my students in all my courses, you are free to record these sessions, and many of my students do so, and then run the transcript or the session recording through a tool that will translate it not only into text but also into their own language. So we've got two conversions going on there, but that means that they can access and process and retain this information far better than if it was only presented in English by me speaking, where they got to hear it once and I expected them to make sense of all of it in that 40-minute slot or whatever it was. Many of my foreign students say that the ability to take recordings and then spend time afterwards listening to those in their native language gives them a huge boost in learning capability, and that plays to lots of different UDL perspectives, Elizabeth.

Elizabeth: Yeah, I can feel the UDL brain pinging everywhere, and let's start with the really obvious one: this is giving us the capability to present something that is text-based, or something that might be visual, through auditory means, and that's one of our key principles in the UDL guidelines, representation, being able to represent something through more than just one means.

The other thing that really prompts me to think about, and I think we've talked about it before, is that the quality of what you put in really determines the quality of what you get out. So if you were doing something like Joe mentioned, where a transcript was developed from that recording, you would really want to be checking that any closed captions or transcript were actually accurate. I've had a lot of fun this week in my own classes editing errors out of transcripts, and they have been utterly hilarious. So I can't even imagine, if you had some of those really hilarious errors and then translated those, just how inaccessible that information could very quickly become. So just a caveat there if you are doing that conversion, and a general good rule of thumb is to make sure that you're checking those closed captions and transcripts for accuracy. And on that note I'd love to throw to Darren, because this really plays into the accessibility of those materials.

Darren: It certainly does, and utilising the UDL principles in your teaching makes that recording better. So, elaborating on things, rephrasing questions: if a student asks a question, saying, that's a really good question, let me paraphrase that, so what you're asking is … You want to make sure that your resources, from that UDL approach, are as useful as possible by including as much information as you can in there.

I had one academic who, on reviewing the transcripts of their work, realised there weren't enough breaks and there wasn't enough explanation. They always referred in this class to the brown book and the blue book, because those were the main colours of the covers: in the brown book, go to chapter three. And they realised that some students missed that very bit of information about which was the blue book and which the brown, because some were using an online chapter version and some were accessing things through a different resource; they didn't have the whole book. So there's the cognitive load of, now I've got to go and work out which book you're talking about. Having a clearer UDL approach to the resource, or that seminar or session as Joe's saying, in the first place gets a much better, more useful transcript, which hopefully will produce more useful things that we can turn into audio.

Elizabeth: I'd love to jump in here, because I think there's also a really interesting place that this audio actually allows us to go, and that is students turning their own text into audio. Perhaps they have a presentation to give, and verbalising that information may not be an accessible experience for them, but some of these tools actually allow that particular experience to happen. And I know of one educator in the US, who hopefully we'll get in contact with and have a chat to in the near future, who has used technology like this to take a student's speech sounds, which may be really difficult for those around them to decode, use AI to do that decoding, and then provide an audio version of that in real time. This particular student, I believe, was asked, what would you like for lunch? Usually very few people would be able to interpret their response, but they got to know this student wanted a peanut butter and jelly sandwich, and I thought what that audio experience created was just fantastic. And so I wanted to know, Joe and Darren, putting you on the spot, have you come across any students using sound in this way, or using their own text and turning that into sound?

Joe: Yeah, my Chinese students particularly talk about this now, and it's interesting because, again, there's this gap, isn't there? There's still this fear in a lot of educators' minds about using AI in the classroom: is it cheating and all this kind of stuff? I've been doing it now for 18 months or so, and one thing that keeps coming up is students who are making use of this technology, having been given permission to do so. I give overt permission, I encourage them to do it, and they come up afterwards, during the course, even after the course, and they say, thank you so much for making this okay to do. I didn't want to do it before.

And different parts of the world have different levels of power distance, don't they? You know, the old Hofstede stuff. Perhaps in the East there's this perceived difference between the professor and the student, where the professor is this kind of sage and all the rest of it, and so very often students from cultures with that high power distance would not do this, because they're fearful of doing something wrong, or offending the professor, or whatever it is. So giving people not just the ability but the permission to say, okay, use these tools, accelerate your own learning, develop your own potential using these tools, I think is so important. And it's not really anything to do with AI, is it? It's about thinking about our students and the learning environment, and then pointing them at tools that they may not be aware of that can help them with this.

So even if you just give them a list: I mean, these are smart people, they're doing a master's programme. All they need to do is be told, one, yes, you can use them; two, here are some examples of tools; and three, maybe, if these ones don't work, go find some more tools that will do similar things that might work for you. And maybe the Chinese students type in Mandarin or whatever, and there are 15 Chinese apps that I've never heard of that will do transcription and things like that. But until you take them through those mental steps, perhaps, they might not even start to do this. So, yeah, I think it's really important. What about you, Darren?

Darren: Yeah, I've certainly seen some students use it, and it's advanced a lot with AI, particularly for learning languages. So putting in the text and saying, play it back to me in this language, so they're learning the phonetics, they're learning the sounds, because it's gotten much better at representing those words; it's no longer a really bad English version of a word from another language. So they're utilising it in that way to get some feedback. And within some of these engines, you can put tags around the words to say, this is French, this bit's English, this bit's whatever language it might be, and it can translate that on the fly and change voices, for instance, into those various languages.
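
For the show notes, here is a rough illustration of that switch-voices-per-language idea in Python, using the pyttsx3 package to drive whatever offline voices are installed on your machine. pyttsx3 is our own illustrative choice: the tagging Darren describes is usually done with markup such as SSML inside the commercial engines, and support varies from tool to tool.

    # Rough sketch: speak mixed-language text by switching installed voices per segment.
    # Assumes: pip install pyttsx3, and that your OS has an English and a French voice.
    import pyttsx3

    engine = pyttsx3.init()
    voices = engine.getProperty("voices")

    # Pick the first installed voice whose name mentions the language we want
    # (crude matching, purely for illustration).
    def find_voice(keyword):
        for v in voices:
            if keyword.lower() in v.name.lower():
                return v.id
        return voices[0].id  # fall back to the default voice

    # Each segment is tagged with the language it should be spoken in.
    segments = [("english", "Hello everyone, and"), ("french", "bonjour à tous")]
    for language, phrase in segments:
        engine.setProperty("voice", find_voice(language))
        engine.say(phrase)
    engine.runAndWait()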

I've also seen students use it for communication disorders: I've got a stammer and I need to do a presentation for the class, or I've got a lisp and I really don't want that in there. Again, I think that's the student's choice. I love diversity being in there, and I think the more we can have that represented, the better. But some may not want that, and so for their presentation, or their video presentation, they'll put another voice over the top. And there are some tools now that will allow that speech to speech, so it will pick up on how I speak, my inflections, and it will play back a synthetic voice that can still have my mannerisms and my tone in there. We didn't have that 18 months ago by any means, even 12 months ago. It's come along in leaps and bounds and, Joe, look, I know the tools have just changed dramatically.

Joe: The tools are changing even as we speak. We're expecting to get a new version of Amazon's Alexa app; there's going to be a paid tier and a non-paid tier, but Alexa's just about to get huge new smarts and also much more natural-sounding language. At the WWDC conference that Apple did a month or two ago, again, Siri is going to get seriously updated. Rather than it sounding robotic, and rather than you having to wait those, for me, really annoying three or four seconds between asking a question and anything happening, that pause with some weird graphic going on the screen, what we're going to see coming out in the next few months is AIs with zero latency, or near enough zero latency.

Yeah, so if we're having a conversation, there's normally very little latency: I say something, Elizabeth responds, you jump in, Darren. It's a flowing conversation, and the AIs are getting to that point much, much more quickly. In a natural conversation, if there's a latency of three or four seconds between a question and an answer, you think in a different way: you're processing, you're waiting, it's stilted. If the conversation flows, you start to free up mentally. I'm using ChatGPT's audio feature quite a lot now when I'm in the car. If I'm driving for a while, I'll fire up ChatGPT, put it into audio mode, and just ideate around a problem, or structure up a class that I'm going to be teaching next term or whatever, by asking questions of ChatGPT, and it answers and gives me follow-up ideas and questions, and we have a conversation. Then that transcript is waiting for me when I get home, and I can mine that transcript from the text that's in the ChatGPT screen when I get back to my computer. So again, using text and speech in this combined way is giving me different opportunities for creating and ideating material. Elizabeth, again, what are your thoughts?

Elizabeth: We've talked a bit about the representation guidelines. We've talked a bit about action and expression, and I think what you talk about there really opens up that section around engagement. How many times might we be confronted with a really large document and think, oh, I don't have the stamina to read this right now? But perhaps we can have it read aloud to us and be cleaning the kitchen while we listen to that document, opening up those opportunities for engagement in those moments when it might be more challenging to engage through a purely text-based means. So, yeah, the whole set of UDL guidelines really fits into this space, which is really great.

Joe: Absolutely. I mean, Penny, my wife, just loves listening to audio versions of her papers. She's a psychotherapist, so she's listening to material on psychotherapy that means nothing to me, but she'll go in the shower and stick her phone on, and she's listening away to EMDR or polyvagal theory or whatever it is. She's really making good use of time that would otherwise be, I suppose, relatively dead time in terms of learning or engagement.

Darren: And I've seen some students also utilise it where they couldn't get somebody to help them, for instance with an assessment. In one case, in psychology, the assessment was a case study of interviewing a patient who might have a particular condition, reading the prompts and cues from the verbal responses of that person: are they being evasive, are they doing this or that? So the student could create that second character in text from the responses, then create an audio version and record themselves giving an interview to this person, picking up on those bits of speech, so it's all audio at the end and it sounds like an interview, because these tools have gotten that good. And there are even tools like EVI, the empathic voice interface from Hume AI, that pick up on your tone while you're speaking to it: you sound a little bit hesitant there, or you sound excited. So these kinds of tools aren't just text to speech, they're also speech to speech now, converting across, and it's far easier to have a conversation, albeit a virtual conversation, to get some of those key things in and out. Some of those tools can get that audio out for a student who sits there and says, I'm really stumped with X, I've come to a bit of a crossroads at the moment and I'm not sure whether I should do this or this. The promise of tools like EVI, which hume.ai has, and I know they're still in beta, is that ability, I think, for students to problem-solve themselves and to help direct them. So there's that personal tutor, which is again coming from the text, but because it's using text to audio, it seems much more friendly, or more accessible, for a student to go in there and do that. I think with some of this text-to-speech coupled with AI, the first things that will be replaced are some of those FAQ-type resources, where I can put in a question and it will give me a verbal response and a written response in a much more human way. Joe.

Joe: Well, I'm going to try something we haven't done before. I'm going to try and interact with one of these bots. You just mentioned Hume AI, and I had been playing with Hume AI, so I've just gone to https://demo.hume.ai. Okay, and Hume is now listening to me talking, and while it's listening to me, on the right-hand side of the screen it's giving me some feedback on how I sound. So I was getting determination and calmness and contemplation. So, Hume, what do you think of the conversation so far?

Hume AI: I appreciate you exploring the demo and pointing out how I'm designed to pick up on vocal tones. You're right, I aim to recognize emotions behind the words, to have more natural, empathic conversations. It sounds like you're feeling a mix of interest, contemplation and maybe a hint of confusion or skepticism so far.

Joe: Yeah, I'm on a podcast with Darren and Elizabeth, so I'm always confused. But you know, that's great.

Darren: I've given a couple of demos with EVI, and the potential's certainly there. I think that's what we're talking about in a lot of things: those opportunities. There are huge opportunities for tools like that to help students and assist them, even within personalised learning. I can say, you're getting a bit frustrated with your tone, can you help me with my presentation? Having it give you real-time feedback, or close to real-time feedback, you're a bit hesitant, you need to breathe more, or whatever, along with the rest of the AI, can be a really invaluable tool. Some students would prefer to talk to one of these tools than to a human in the first place, and then go seek a human's advice after: I might have a really stupid question, or, can you help me pronounce this, I'm really unsure of these terms, and it can give you that feedback. There are a lot of tools, but this is just one that's certainly on the cutting edge of some of this technology. Joe.

Joe: Yeah, another one that probably everybody's got access to is just built into PowerPoint now. If you go into PowerPoint, there's a presenter coach on the ribbon, and you can deliver your presentation and the coach will listen to how you deliver it and give you immediate feedback on delivery: all your stuttering and your ums and your ahs, and whether you're sounding confident, and all that kind of stuff. So that's a really interesting one for both students and educators to play with, because you can do it in the safety of a room with nobody else around and hone it to perfection. Have your TED Talk ready to go. So that's really, really good.

Darren: And I think, look, the flip side of this focus on text to speech, and on the speech side generally, is music generation and the other types of audio generation that are out there. So, and I'll throw to you, Elizabeth, in terms of expression, being able to introduce music: it might be in a presentation, it could be in any range of things, and there are lots of tools coming out in that space. I'm not a musician, but it helps me write background music; it helped us write the music for this podcast. Even the announcer for this podcast is text-to-speech. So, Elizabeth, in terms of UDL practice, where does music fit into that scope?

Elizabeth: Well, we think about all the different ways that an individual can express themselves, and, in terms of music, there's a lot of emotion that can be conveyed. We can really foreground the tone of a presentation, or the tone of an episode, with that music. Ours, hopefully, sounds like we're a bit inquisitive and a bit excited about the topic, but you could imagine if we'd started with some ominous Jaws music, you'd know: wow, they're going to go really, really deep into the dangers of AI. So the different music that you can create and present can itself be a mode of expression. And perhaps we may not have access to some of the music that we'd like to put into a presentation, but we now have the ability to create something and really refine it to convey a particular tone.

But also, thinking along those UDL lines, whatever we create that represents information through an auditory pathway, we want to be sure we have an alternative means available. So, if it is perhaps a podcast like this, you have that text-based transcript, which we're all going to be checking for errors; that's the takeaway from today. Just be sure, because I've come across so many errors in my teaching where the automated systems believe that I'm saying something that I'm not, and you need that human lens to say, that is not what I said, let's change that. The other thing we really need to think about is that if we're using something like music, we're not going to have a text transcript, but we do need to have audio descriptions, so we can describe what type of tone or emotion is going to be conveyed by that audio. And in that sense, whatever the individual hearing that music would take away from it, that's what we want to put into that audio description. If you want a taste of what that looks like, you can always turn on some of those audio descriptions; I believe the ABC programs here in Australia do a lot of audio description. So, if you're interested, one night here in Australia, turn on the TV, try one of those audio descriptions, and just get a taste for what exactly those sounds are that they're describing and how they're doing that. I'm going to throw over to Joe.
 
Joe: Yeah, I just want to give a very quick plug to one music or song generator tool that I've had a lot of fun with, which is called Suno (www.suno.com). I've created a few songs in here. So there's a song that I've got here called End of Term Bliss, and I just put in, give me an electrifying rock ballad about end-of-term bliss, and within a few seconds …

*song plays* "…a mountain, reaching the sky, red pens ready bright as flames, fuelled by students' dreams…"

and it gave me three or four verses of this ballad that kind of gets bigger and bigger and bigger and explodes with a rocky thing. We'll put a link to that song in the show notes for you.

Elizabeth: I love that line, you know, "fuelled by students' dreams".

Joe: Absolutely. It's absolutely easy now to do this, and I've run creative classes with students where we've done this. I've pointed them at Suno and said, right, go and create a song on whatever is of interest to you, and they come back with gangster raps and they come back with soft rock or whatever it is, and that gives you wonderful space to now start exploring with your students. Why gangster rap, or why hip hop? Oh well, I'm into hip hop. Oh, I didn't know that. And all of a sudden we know more about each other, don't we? We've found out more, and our learning journey has been amplified and made so much richer, because we've gone into a creative space using AI that we probably would never have gone into before. I mean, how many of you have used music in your classrooms? Generally that's not something we do. Elizabeth.

Elizabeth: That just brought up some interesting examples that I heard, it could have been a couple of weeks ago, and I don't believe it's from technology that we've mentioned today, but there was some AI software where voices were created that were incredibly reminiscent of very famous human voices. I remember there being a little bit of a challenge around that: how similar is too similar, if someone can hear that voice and say, actually, that sounds just like this actor, or that sounds just like this singer? I think that's a really interesting space that we're in at the moment, and in any of those tools, I think we all want to be acting ethically. Whatever tool you select, it might be good to do a bit of background research into the ethics of where those voices come from, who's given consent for those voices, and even for you, too, how your data is going to be used. Just some points to consider.

Darren: And be careful. Some of these tools certainly allow you to clone your own voice and put it in there, but if you look at the terms of some of those, your voice becomes their property, and they and other people can use your voice to do things unless you come to a specific licensing agreement. So while some of these are fun to go and play with, again, be careful of those terms, be careful of attribution for things that are in there. I know there are lots of challenges to some of these tools: what were they trained on? So again, that note of caution.

I'm certainly encouraging students to create ballads, background music, atmosphere, those kinds of things, to enhance a presentation. And I'm trying to encourage designers, if there's group work, to get people to include audio in there, just so that everybody's presentation isn't the same. Because if the class has to sit there and watch, everybody goes, oh, here we go, another one of the same thing: it's a PowerPoint slide with somebody talking, it's a PowerPoint slide with somebody talking. I've had a student who, for various reasons, wasn't going to be identified to the class and the remote students, so they used an avatar with a different voice. It's not their voice, but it's perfectly acceptable; it's still showing what they know, and they still had to write out the script of everything that was there. A little bit more work, possibly, but it was probably one of the more engaging presentations. So there are multiple modes of expression built in with some of these tools, and the advantages that come with that, and, Joe, you would have seen that with students as well.

Joe: Absolutely. Last term I set my management consulting students, as one of their final team assignments, the task of creating a one-hour online learning course on a subject of their choice. They're just about to go out into consulting, they're going to be working in companies, and they've finished however many years of study, so this allowed them to pull together all their knowledge from the studies that they did and also go and play with modern course design techniques and tools and AIs and other aspects. A number of the groups utilised AI, as you were just saying, Darren, avatars and voices, as well as doing regular talk-to-camera and stuff. And it was very, very impressive, the material they were able to put together, even using mostly the free tools, although some of them did pay a few dollars and get some of the pro tools. A lot of the AI voice tools, and certainly the music generators, have got some safeguards built in as well. You can't go into Suno and say, make me a song like Taylor Swift. If you put in a recognised artist's name, it comes back and says, we can't do that. So you have to get creative with your prompts and not mention any identifiable names.

But Darren's point about attribution and licensing is very, very well made, because you do have to be very careful. Digital identities are going to become a thing very, very quickly, our own identities, our personas on camera, and it is possible now, with 15 seconds of audio, to clone somebody's voice and then feed a script in and have that person's voice speak it. So the technology exists, and the bad actors out there will no doubt make use of it. You need to be very careful about where your digital persona is going, and make sure that you don't hand it out by default to places that you don't want it to be.

Darren: While the tools are fun to create with, you really need to be cautious with some of these things, because with some of them you cannot use the outputs that are generated beyond personal use, for your own listening. So I can't go and convert a book chapter by somebody into audio and then go and share it with my friends; you're breaching copyright by doing that, because you've transformed it into another format beyond personal use. In some countries you can't even legally turn it into audio; it's restricted. So you've got to be very careful about the use of these. And with some tools, whatever you make is available to the world; the fact that you've made it means it's public, and you don't have the right to make it private so nobody else can hear it. You may be putting heartfelt things in there, or, as you're saying, basing it on a real-life case or something like that for your studies, but now suddenly everybody in the world is aware of this thing you created, and it might have some very identifiable stuff in there if you're using names, hello, I'm so-and-so, and using it to do those things. So again, as with all of these, a word of caution: check the terms of the tool that you're actually using. But I think Joe's right, from the educator's point of view, about demonstrating these tools and demonstrating their use safely and ethically, as much as we can, and saying, we can't stop you from using them, but we want you to explore these tools, we'd like you to play with these tools, because of all of the advantages that are in there, and the UDL, and, as Joe said, some students just need that permission. There's a lot of fear among students, with academics saying, if I use this tool I'm going to be flagged for academic integrity, I'm going to be told that I'm cheating, so there's a resistance as well to using some of them. Having that permission upfront and clear, and the boundaries around how you're expecting it to be used, helps set our students up as the next generation in the workforce, with some of these parameters already built in around them, saying, hang on, we shouldn't just be using it for this, and we shouldn't be using it for that, for a whole range of reasons.

There are some other aspects where I think these tools will be really useful, and we were talking about EVI and some of those before. At one of the institutions that I'm working with, we've been talking about the ability for students to practise role playing in a whole range of scenarios, but particularly for job interviews and those kinds of things. So, tools that can bounce ideas back and forth, like that coaching presence that Joe's talking about. These are becoming much better coaches than the old pattern of, I'm going to talk and then I need to read the response; having it verbalised back to me keeps me in the moment. So I think the opportunity for these tools is huge and we're only just scratching the surface. This has come on in a very short amount of time, so we're still exploring the possibilities that are there. But, Joe, there's a lot changing in this space with audio, and there are a lot of new tools coming out as well, so what's the sector looking like?

Joe: I mean, there are dozens and dozens of tools out there that touch on what we've talked about today, and on the Perplexity page, which is in the show note links, I've got a lot of material for you to go and look at. I'll mention a couple. Pocket is one that a lot of people use as a kind of drop place for articles that they want to read later on, and that has good text-to-speech features in it as well. There's one called Voice Dream Reader, which handles many, many different file formats and lots of customisation. And there's one called Capti Voice, which specialises particularly in educational applications, so it's very much aimed at students and teachers. And again, there are quite a number of specifically education-aimed text-to-speech options.

Probably the market leader in text-to-speech at the moment is ElevenLabs, that's all one word. It's a voice generation platform where you can go in and clone your own voice, and use avatars to speak whatever text you want, so that's well worth a look. But don't discount the plain old chatbots that you've already got. Don't discount ChatGPT and Copilot. Go in and give them a role: I want you to act as an experienced corporate interviewer, and I'm going to be interviewing for a role as a junior project manager at Google in Dublin. Now, you be my interviewer and give me a hard interview. And it will adopt that persona of an experienced senior Google manager who's used to doing interviews and who is now going to quiz you for a project management role at Google. So if you use your own imagination to set up the scenario that you want to role play, the AI will be very good at taking on a persona and engaging with you. And if you do that and then turn on the voice, the microphone, so that you're speaking and it is speaking, it'll feel even better.
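
If you'd rather script that interviewer role-play than type it into the chat window, here is a minimal sketch against OpenAI's Python SDK for the show notes. The model name and prompt wording are our own illustrative choices, not something prescribed in the episode, and you'd need your own API key:

    # Minimal sketch of Joe's give-the-chatbot-a-role pattern via the OpenAI API.
    # Assumes: pip install openai, and an OPENAI_API_KEY environment variable.
    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative; any current chat model works
        messages=[
            # The system message sets the persona, just as Joe describes doing in chat.
            {"role": "system",
             "content": "Act as an experienced corporate interviewer. I am "
                        "interviewing for a junior project manager role at Google "
                        "in Dublin. Give me a hard interview, one question at a time."},
            {"role": "user", "content": "I'm ready for my first question."},
        ],
    )
    print(response.choices[0].message.content)

Turn on a voice mode, or feed the reply through any of the text-to-speech tools above, and you get the spoken back-and-forth Joe describes.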

Elizabeth: And I'm also thinking, it can still feel a bit scary to engage with AI, and I think we're still in a time when perhaps people have come through an education degree and haven't even grappled with the basics of accessibility. That isn't often built into education degrees; I don't know why, but it's not baked in as often as it should be. So, if you want to have a look at text-to-speech, or introduce text-to-speech in a non-AI way, you can do really simple things. Go to your Microsoft Word document: there's a tab in the tool ribbon at the top called View, and under View, I'm doing this from memory, I hope I don't mislead anybody, there should be something called Immersive Reader. Within Immersive Reader is a whole range of tools, one of which is to have that text read aloud. So if you just want to see what that read-aloud software is like, there are really accessible places to find it that don't involve you exploring AI, if you don't feel like you're quite there yet and you want to take a first step before diving in. That might be a nice, comfortable way to do it.

Darren: And, of course, we encourage you to go and explore these tools yourself and try some of these things out in a safe environment, as Elizabeth pointed out, or maybe jump straight into Perplexity, into ChatGPT, into some of the other tools that we've been talking about today, and have a play: create a song, create some music. We could take a leaf from Thomas Tobin's presentations as well: here's a minute's thinking music, to take a bit of a break after I've asked you a question. Rather than just giving you a couple of seconds, I'm actually going to give you some thinking time and some music, so it's not just dead air. But look, you'll find links to the tools and things we've discussed in this episode in the show notes, along with some possible prompts for creating some tunes yourself, and these will all be up on the ADCET website at www.adcet.edu.au/ilotathings.

Elizabeth: And, of course, we would love to hear from you. We really want this podcast series to be a conversation, and there are so many things that you may have come across that even we may not have come across yet. So let us know what you're seeing, what you're engaging with, any interesting things that you find, and if you have a question or even a comment about AI, UDL, accessibility, or anything that we've discussed, you can contact us via our old-school email at feedback@ilotathings.com.

Joe: So that brings us to the end of this episode and, as always, we hope we've given you some insights into how AI can help you deliver and explore your learning pathways, both for you and your students, in multiple ways, giving flexibility and agency to your students. Thank you all for listening and we hope you can join us on the next episode, as we will continue to explore ILOTA Things. So, until then, take care and keep on learning. 

Darren: Thanks everybody, bye.

Elizabeth: Bye.

Announcer: Thank you for listening to this podcast brought to you by the Australian Disability Clearinghouse on Education and Training. For further information on universal design for learning and supporting students through inclusive practices, please visit the ADCET website. ADCET is committed to the self-determination of First Nations people and acknowledges the Palawa and Pakana peoples of Lutruwita, upon whose lands ADCET is hosted. We also acknowledge the traditional custodians of all the lands across Australia, and globally, from wherever you may be listening to this podcast, and pay our deep respect to Elders past, present and emerging, recognising that education and the sharing of knowledge have taken place on traditional lands for thousands of years.