Welcome to our newest season of HumAIn podcast in 2021. HumAIn is your first look at the startups and industry titans that are leading and disrupting ML and AI, data science, developer tools, and technical education. I am your host, David Yakobovitch, and this is HumAIn. If you like this episode, remember to subscribe and leave a review. Now, onto our show.
Listeners, welcome back to the HumAIn podcast, where we’re discussing all things AI, as we’re leading in a digital-first world. This year, I’m so excited to talk about the changes that we’re seeing in the media. I don’t know about you, but I’ve been glued to my phone nonstop. And today I have one of our experts, Claire Leibowicz, who leads the AI and Media Integrity Program at the Partnership on AI.
I’m really excited to have her with us on the show because, as we’ve all been glued to our digital devices, there’s a lot of information out there. And we don’t always know fact from fiction. Claire, thanks so much for joining us on the show.
Thanks for having me, David.
You’ve seen it firsthand. Everything from elections through consuming material and all the big open and big tech platforms. Can you start framing for our audience a little bit about the work that you’re doing at the Partnership on AI and why it’s so important?
The Partnership on AI, a little bit of context that might help, we’re a global multi-stakeholder nonprofit devoted to responsible AI. And we were founded about three years ago by the heads of AI research at some of the largest technology companies: Facebook, Apple, Amazon, DeepMind, IBM, Microsoft, Google. And really meaningfully, we’re multi-stakeholder.
So what that means is, not only those tech companies should be involved in creating good, responsible, ethical AI, but also we need folks from civil society organizations, academic venues, other parts of industry and especially media, which I’ll talk about a little bit more, to think about what healthy, responsible AI means.
In the past two years, we’ve been working on questions of AI and media integrity, which is, basically, in a very simple way, how do we have good, healthy, beneficial information online? And how do we use AI systems to do that? And what’s really important, which we’ll probably dive into, David, is that not everyone agrees what type of content should be allowed online.
Even humans don’t agree about what misinformation is or what content should be shown to people through technology, and those fundamental questions we explore quite a bit. So again, we have over a hundred partners that help us do this, and importantly, come from all over the world, so that we can pay attention to different information contexts and the nuances of how information flows in different geographies and demographics.
I love how you brought that up on information context, Claire. I spent a couple months at the tail end of 2020 and the beginning of 2021 in Taipei, and being around China and Singapore and Japan, and seeing how East Asia operates is very different and very much the same as in the States. There’s a lot of myths to dispel around information contexts and biases. What have you seen, first and foremost, as some of the changes and trends in the past couple of years?
Most recently, and it’s hard to pay attention to global trends, but coming from the US and in a bellwether moment for the field recently, we saw the de-platforming of Donald Trump. We’ve seen some tech companies be really empowered to take people off platforms. So, not only just to apply a label or more context around people, but really to take a public figure off a platform, which is really an emboldening, compared to a few years ago, of platform agency in contributing to who is allowed to speak and who’s not.
So that’s one thing that’s new, not going totally into if that’s good or bad, or all the complexity of that, but more akin to, as some human rights defenders point out, to certain tendencies in China or other countries that may actually regulate the internet and speech in a way that’s more active than we’ve seen in the United States. So that’s one interesting trend. And then, in terms of tactics for misinformation, how people create misinformation, how they spread content, we see that it’s not only happening on the big platforms that we think about like Twitter, Facebook and social media. But even in closed venues.
There’s misinformation flowing in WhatsApp groups, in texts, in all these different venues. We now have TikTok, which is a new medium for how information flows. And amidst that, a real movement towards this kind of misinformation that’s not just total misrepresentation of an event or a fact, but a slant or a leaning, or a caption that may make a post have a different connotation than it would if it was written by someone else. So that’s really complicated. The nuance inherent in misinformation has always been there, but with new technologies, it can be pursued and manifest itself in lots of different ways. That makes it really hard to combat as an issue.
I’ve seen firsthand a lot of these misinformation labels as a consumer of technology in the last couple of years. I’ll sign onto Facebook and Twitter and TikTok on my devices and see that this ad may actually be doctored or this tweet may have misleading information or graphic violence.
It’s so fascinating that a couple of years ago none of these labels existed. We, as consumers, just assumed everything was true, or we didn’t have much of that filter. Why do you think we’re starting to see that shift today for misinformation? Why do we need to look at manipulated media with tools and more than tools?
That’s a perfect observation, David. And we have a whole area of work devoted to studying things like labels, and other interventions that allow for more context around a post. And it’s funny you bring this up, because it’s become really hard to label things. In the past few months, there was a time on Twitter when, if you looked at Donald Trump’s feed in the spring, and not to dwell on him, basically every post said, this is manipulated media.
And there’s been a reckoning in the platform community that we really want a public that can distinguish credible information from misleading information. So, since 2016, the election was this really significant moment for bringing misinformation on platforms to light. And it’s taken some time to think about strategies.
And labeling is an interesting, almost in-between option, because it’s not necessarily limiting speech or saying you can’t share this post or saying someone’s information shouldn’t be seen. It’s giving you more context. So, the idea is that there’s this nice middle ground for platforms to seem like they’re giving the user control and autonomy, and being able to judge for themselves what’s credible. But we actually don’t know a lot about how effective these labels are.
And you brought up a really interesting point, which is, does merely knowing that a video or image has been manipulated, let’s say, you disclosed that you put a filter on your Instagram post, is that really helpful for judging if a post is true or false or if you should believe it? So we actually did research pretty recently not to go in the weeds on it, about how different users encounter visual misinformation and labels across platforms. Because another really interesting point is that people don’t typically consume information or a story on Facebook or on Twitter. They consume it across all of these platforms. So it’s really important to ground strategies and our understanding of labels more generally in how we understand people consuming content in their own information context, across platforms, over time.
Something interesting in our research about labels is that there might be unintended consequences of the labels. Some people are really skeptical about platforms posting a little image that says this has been manipulated, and we really saw a major division in user attitudes between those who supported labeling interventions in good faith, and wanted them and thought they’re important for people to be healthy consumers of content, and those who found them really biased and partisan and error prone. That is such a fundamentally human question that when you think about its application to maybe automating that label deployment, meaning how might you want to label all content at scale, it’s really complicated. And we don’t really know what the best intervention is right now to help bolster credible content consumption.
That’s right. And to be a healthy consumer, I always try to see both sides of the aisle. So, back to not only the election, but the news that we’ve seen around the coronavirus pandemic: what does CNN say? What does Fox News say? What does the Wall Street Journal say? What does The New York Times say, or The Washington Post? I want to see how different sources that I would deem as both retail information and then institutional, how they make opinions or how they provide facts on different theses, on different topics.
And it’s fascinating to see as some of these platforms have gone more all in, as you say, Claire, gone emboldened around fact-checking and around labeling. So I feel that it’s the early days, and we’ve seen with the de-platforming of Donald Trump, that now we’re living in a new society where we are giving the rights of freedoms to platforms to say, we can get content so that we’re providing the best interest for our users. One question I think about is, do users want that? And what does that mean for the internet and platforms as a whole? Whether that results in unification or splintering? So it’s a big question. Love to see if you can unpack any of that for us.
I also want to emphasize, I go back and forth because, in some ways, the platforms have been emboldened and that has a connotation that we’re going to become the arbiters of truth, which is really scary. It would be scary to those who value free speech and principles, about the online public square and how the internet was founded as a venue for democratizing speech and allowing people to speak. However, there’s also the possibility that they aren’t emboldened enough, that the interventions they’re using are kind of these, forgive my language, half-assed middle ground solutions as opposed to pursuing other routes.
So for example, rather than labeling, maybe they could say we should down rank some of this content that’s perceived as manipulated or less credible. Down ranking means that they could tweak their algorithms that show you content that maybe you might like based on your clicks. But maybe they could include other information to choose what those algorithms show you or not that has to do with its credibility. So there are other solutions that the platforms can take to change how content gets shown beyond just labeling. So that’s just one point to highlight.
But overall, from our research, we’ve found that platform labels alone, to your point, are insufficient to address the question of what people trust and why there is this general distrust, which you were alluding to, David, in the principle of platforms to self-regulate and for fact-checkers and media companies to offer non-politicized ratings.
So there’s this philosophy for many users that they just don’t trust the partisan platforms or fact-checkers to tell them what’s right. There’s this lost trust and anger towards labels as being really punitive and condescending. And they don’t actually say what people care about, which is, is this false or is it true? as opposed to, as I said before, how it’s been manipulated. So, certain suggestions, if I were to wave a magic wand, not that there’s a perfect solution: we need to better design interventions that don’t repress people, but really respect the intelligence and autonomy that you have described. What’s really wonderful is this awareness of looking into a source, and media literacy.
And while that’s not a perfect solution, meeting people where they are is really important for building those types of skills. So holistic, digital literacy, educational interventions. And even what we’ve seen on Twitter, this new rollout called Birdwatch, there’ve been some ideas to focus on community-centric moderation, which is kind of like Wikipedia, in that people in the community, rather than platforms or the platform itself, are the ones doing the moderation, which might increase trust in how the speech is being labeled and ultimately decided upon.
Back to the presidential debates, when Donald Trump and Joe Biden were going at it in public discourse, we would see live moderation there, with fact-checks and flags for manipulated media provided live, and so forth. And I’m thinking about our world today, where conversation is moving at such a fast speed on audio. We have platforms like Clubhouse that have been gaining a lot of traction, as we’re thinking about the digital world and moving into the hybrid world for life after the pandemic.
Back to when I attended in-person events and often at these conferences, everyone would be dialed in onto their phones to look up the startup that was mentioned, or if the statistic was actually true. The challenge that I’m thinking about is public discourse doesn’t live only online. It’s both online and offline. And where are we looking from your research to start solving for understanding more about misinformation in public discourse?
Yes. I work with several colleagues and peers who think about these general questions, about polarization and mediation that are important to the dynamics of in-person interaction. What you’re describing might have applicability to the internet. So there are some interesting models, I have to look up the names, so forgive me, but there’s a new effort to create a healthy public square. It’s called New Public out of the University of Texas in Austin.
Some faculty there are devoted to trying to replicate the beneficial, welcoming public spaces of the in-person environment. So a lot of people look to libraries or community centers as really interesting models for welcoming, cordial, non-antagonistic venues for thought. And there’s a desire to make our online spaces match that.
So there’ve been some interesting efforts there. I will say misinformation in particular, it is fueled by a lot of the dynamics on the internet and there are differences. So when you don’t have to show your face, when you can be anonymous, when you can just like something, as opposed to having to describe to someone in detail why you disagree or not, there’s a lot lost in those interactions that might make it harder to promote the norms of in-person interaction online.
Though what’s an interesting point to underscore is, a lot of the policies that platforms have about speech on the platforms have to do with the way in which they cause real-world harm. So, what I mean by that is, you may have a policy that says we don’t label speech, we don’t do anything until there’s a perception that a post might prompt real-world harm.
So in the case of Donald Trump, the fact that he was inciting violence, based on the platform’s judgment, was what really emboldened them, I’ll use that word again, to take action. So there is this really meaningful connection between what happens on these platforms and what happens in the real world. How can we have online spaces take the good parts of real-world interaction? And also, how might we change our policies and interventions on a platform based on how the online information will affect real-world practice?
Let’s dive deeper into that. When we’re talking about platforms, the big challenge that I’ve been giving a lot of thought to lately is how platforms should address media manipulation. When we think back, of course, it’s the big story that many people have thought about this year and last around the de-platforming of Donald Trump. Different platforms went faster and other platforms went slower for different reasons on responding to media manipulation. But we need to get some general practices, some best practices there around integrity, around information. What are some techniques or strategies that platforms should be considering, Claire?
That’s a great question. For the past year we’ve been working really explicitly on manipulated media. To give listeners a step back, manipulated media is basically any visual artifact that has been manipulated, not to be really literal, but that could be the Instagram story filter you use. I like to use this funny example to show people that in a canonical New York City coffee cup, you can insert your face. That’s manipulated media.
Also, recently there was a video during the election of Joe Biden listening to a song about defunding the police, or F the police, that was manipulated. He wasn’t actually listening to that song. That was tweeted by Donald Trump to make it seem that Biden wanted to defund the police, a politically touchy issue. And what’s meaningful is, I don’t think there’s any harm to the public square of me putting my face in that coffee cup filter, but it’s manipulated media. But there might be harm from other types of political speech or those that are misleading. So when we talk about manipulated media, it’s really important to underscore what makes that misleading or problematic.
And that isn’t the nature of how it’s manipulated, whether it used AI, or Photoshop tools, or basic tools. It’s more about the meaning and the context around it. So a lot of people have advocated for AI-based solutions to deal with manipulated media. Last year, we were involved in something that Facebook ran called the Deepfake Detection Challenge. The idea being that AI tools might make it easy for people to manipulate media in a way that people can’t tell with the naked eye, and technology companies will need to be able to see how things have been manipulated.
But two important things came from that. And we worked with a multidisciplinary group of experts in human rights and other fields to help govern that challenge and those questions about deep fakes.
And we realized that it’s not just how an artifact has been manipulated that matters. It’s partially the intent, why it’s been manipulated and what it conveys that really matters, which is a really complicated question for machines to answer.
And ultimately, we might not even just care about the media manipulation itself, but maybe if it has new context or a caption put over it or a new sound, and there’s all these complicated variables that make it misleading or not. So in terms of where that leaves us for meaningful solutions, now there are a lot of efforts to bake notions of trust into media artifacts and the web.
So our colleagues at the Content Authenticity Initiative at Adobe, and Project Origin, which is an effort, lots of buzzwords have been named, but basically an effort by the BBC, the New York Times and the Canadian Broadcasting Corporation in Canada, want to bake signals of media authenticity into the web. So rather than having to retroactively say, has this been tweaked, has this been tampered with? you’ll be able to say, here’s this artifact and here’s where it came from, which may help users.
So basically, I’m just trying to lay out for your listeners that just because something has been manipulated doesn’t mean it’s inherently misleading or automatically misinformation. iPhone portrait mode actually uses AI to manipulate the image, to actually make it more realistic, which is an interesting use case.
But rather, what we should care about is why might something have been manipulated? What’s the effect of that manipulation? And that’s a really hard task for machines to gauge, let alone people. We really disagree about that. So any meaningful solution will empower users to understand more about the artifact and also the context around it, and be really explicit about why any decision-maker is evaluating it as misleading or something that should be labeled or taken down.
That’s a lot of common sense because as you described, if you’re going to your Instagram filters or TikTok filters for sound and audio and video and images, this is all fun and games. So, it is manipulated media. But there’s not a bad intent. We see, even in podcasting today, there’s platforms like Descript that help you remove Um’s and Uh’s and all these pauses.
I need those.
That’s technically media manipulation, but it’s not a bad intent. It’s improving the listenability and improving the user experience. So that’s the challenge about finding where it is becoming beneficial or actually negating that experience for users.
And we’ve seen a lot of that material grow. As you mentioned, with Joe Biden and some of the other figures, we’ve seen the growth of the deep fake movement, deep fakes and cheap fakes, and how you can now take your body and put it on some celebrity in a movie and on different portraits and art. All these are generally very fun, creative, bringing consumers together, bringing community together when it’s all played in jest, but then sometimes it becomes more serious.
And we’re not always certain where that’s leaving consumers. One example that I’ve found really fascinating is, there’s this startup called Synthesia. And Synthesia did a demo back in 2020, where they took David Beckham, the famous soccer, football player, and superimposed his voice into other languages like German and French and Spanish and English to serve ads to the patrons of football soccer. And that begs the question. That is media manipulation. But do we need to tell advertisers and customers that David Beckham doesn’t speak those languages? Where do we find that right line on information integrity?
So we think about that a lot. And our colleagues at Witness, which is a video and human rights organization, care deeply about the power of video being perceived as realistic to speak truth to power and be able to have redress for human rights offenses. They focus also a lot on satire as a really powerful, beneficial use case for deep fakes. To your point, there’s this threshold between something that’s misleading and that will cause a degradation in public trust in video and imagery, and satire, which is a really potent mechanism. In discourse, satire is often one of the most powerful mechanisms for change.
And interestingly, there’s this example called Sassy Justice, which was created by the creators of South Park, which was a satirical, deep fake-based political commentary that came out before the election, which was just a really interesting use case. But where do we draw the line? What’s really important to mention is the need for global stakeholders being involved in that conversation, and figuring out who gets to decide where the line is drawn is a really meaningful question because some people might say, it’s the platforms doing that right now.
It should be the government in the United States. Noticeably, much of our conversation today has used examples from the US which, for better or worse, are precedent setting for a lot of speech decisions around the world, just because of where Silicon Valley companies are based.
And yet, there might be use cases in different countries that are really different than the ones we’re describing about manipulated media and how it might affect speech. So let’s say in a country where, hypothetically, there’s a ton of satirical deep faking to protest an authoritarian government. You wouldn’t want to stifle that just because maybe in the US those deep fakes are being used for being misleading. So we actually just admitted some new partners to PAI, including Code for Africa, which is based in Africa.
And we have some partners like Witness, who I mentioned, who work around the world in different information ecosystems. So they work in Latin America, in the regions in Asia, which you described, and their goal is to really ground a lot of the recommendations for where that line might be drawn or how policies should be constructed at tech companies in the cultural demographic and nuanced context of different regions.
So a really integral part of our work, even though there’s no magic wand to get all of that input involved, is to make sure there’s global attention to the case examples of manipulated media. So if you were going to theme and create a framework, you really want to make sure you don’t just have examples from the US presidential election, but many different countries and many different contexts, and not just politics too, because political speech may be treated in one way, whereas scientific and vaccine misinformation may be treated differently. So there’s a question of whether or not domain specificity is really important for the types of manipulated media. And I’ll also just say, some people think this is all ludicrous to be focused on this.
We’ve had image manipulation for thousands of years. Humans were manipulating imagery and Joseph Stalin in the 20th century had teams devoted to manipulating images to change public opinion. And even nowadays, some people say it’s just going to be like Photoshop. We’ll get used to it. So there’s a lot of different opinions about what the best strategy is. But no matter what, it needs to embed real world attention to what’s happening on the ground in different countries and cultures.
Now, most of our conversation today, Claire, has been guided around the different countries, the different cultures, the different policies at the government level, thinking about how should governments get involved? How can platforms lead the charge for best labeling, organizing information for our users or our customers or our listeners?
On the flip side, what do people really want? How do we think about giving choice around this? I deserve to know this label. I deserve to know if it’s media manipulation, and how that might be looking as a choice on free versus paid platforms. It’s also what do people want, if a user says, I want to see information from certain people. So let’s play devil’s advocate here. Donald Trump was de-platformed and someone might say, no, I really want to listen to it because I find Donald Trump hilarious and funny. And even though all the content is ludicrous, I want to see that with the label. So how do we enable our users to also select or make choices? Is that part of the conversation?
You’ll notice that what you’re describing brings to mind the platform siloing that people have described, and that user autonomy. If they can’t access the Trump tweets on Twitter, they’re just going to go follow him on Parler, which is going to be even more of the filter bubbling that we’ve seen on individual platforms. It will just be this fracturing of the internet, where, typically based on partisanship, some people will go to certain platforms and others to others.
And that is a real, potential hazard. An interesting mediating variable for that is that some of the content moderation that’s been happening isn’t just happening on platforms, but even on AWS. So from Amazon, which hosts a lot of these platforms, they have been intervening to say, no, we don’t want Parler online. Because we don’t want people to be able to see Donald Trump’s tweets, which is a really interesting moment for content moderation.
And we see something similar with the app stores and the power that they have, despite not being platforms, to intervene and affect the extent to which people have that agency that you’re describing, David. So if the app store bans your ability to download the app where you can get Donald Trump’s speech, that’s a really powerful force, almost beyond him being de-platformed on Twitter, to make it even harder to hear what he has to say.
So, in terms of the agency that users have today, Donald Trump is a really interesting example. He has plenty of opportunities to speak his mind. Being de-platformed was meaningful for him, but he is a person who can start his own media company with millions of dollars saved up. Whereas, what’s really more important almost is the person screaming into a void where Twitter’s their only venue for having access to speak their mind. And that’s an even more interesting use case for how we might treat them if they got de-platformed or labeled in a certain way that they found was particularly punitive. So right now, users have agency to find public figures on the internet. But what that may result in is this fracturing of spaces and certain venues for certain speech.
It’s so interesting to think as a consumer, that media is media. Audio, video, text, we’re consuming on a daily basis. But as consumers, we often don’t think deeper about what effects might be impacting our opinions. Some of this research I’ve read up on from PAI and others is about the implied truth effect, the continued influence effect, the picture superiority effect, the illusory truth effect, all these very interesting notions that the lay person may not know. Could you unpack some of these that are relevant and can help us in forming better opinions about real data?
I’ll dive into the implied truth effect, mostly. It is really interesting to me. And that has to do with the implication, to use the word implied, that if platforms label content, they’re only going to be able to label a subsection of content as being misleading. They’re never going to be able to monitor all content on the web. And if users start to assume that they have guideposts or signals for what is misinforming content, they come to expect them, and they may wrongfully assume that anything that’s not labeled is true. And that might increase the legitimacy or credibility associated with other posts.
So for example, if you get used to seeing manipulated media labels on Twitter, you may assume that anything that doesn’t have that label is not manipulated, which might not be the case. So this question of what the unintended consequences of labels may be, is really important. And we need to focus user research on that. And something notably that the platforms could do better if they really wanted to contribute to scholarship and ensuring that labels are effective, we don’t even have metrics for measuring the efficacy of these labels. And what I mean by that is, the platforms don’t really report out on what these labels are doing. Facebook had a metric recently in a blog that said 95% of people who saw a label were less likely to click through it or something like that.
But that doesn’t really say much about what that label is doing. Just because you don’t click through to a post doesn’t mean it might not have an effect on you. And this question of how labels actually change belief is one that we really need to interrogate before we decide if labels are a really useful intervention for misinformation, along with the psychology and cognitive first principles around how people interpret these labels.
Dave Rand does wonderful work on this at MIT, but we really need more reporting from the platforms, metrics, and measures to evaluate what these labels are doing. So we have upcoming work that’s hopefully going to draw from our multi-stakeholder cohort, which includes fact-checkers, journalists, others in the field to be really precise about what metrics we want from the platforms, so that we can actually evaluate them as a field. So not just as these little independent kingdoms of Twitter, of Facebook, of Google, what the labels are doing, but really evaluate as a field, what they’re doing and how they may be affecting different communities differently.
So, from all the topics that we covered today, there’s so much information that now listeners can discover and they can go to different resources to become better informed. And that’s fantastic, Claire, but what’s next? What else can our listeners do to get more actionable today around the misinformation, disinformation campaigns?
Hopefully, they’re becoming actionable about preventing their consumption and sharing of mis- and disinformation, but there are some resources that are really wonderful. We have some colleagues at First Draft News, a mis- and disinformation research organization that puts out wonderful resources for those looking to understand misinformation dynamics.
For those looking to help family members, peers who are susceptible to this type of content, think about having those conversations. So I’ll point people there for really wonderful resources. Also, if you want to learn more about the global impacts of emergent threats, like synthetic media and deep fakes, Witness is an organization that has done wonderful work on that.
And maybe I can send these links to you, David, if they can be attached in the show notes. And just being really intentional about what you share, and reading more about things that you’re skeptical of, always do that and have empathy for those around you. And even the platforms to a certain degree who are trying to deal with, basically, billions of posts per day. So really, being conscious and skeptical and thinking about these topics before you share content and, ultimately, reading up on media literacy and all of those points, too.
I’m excited to see as our content becomes more labeled, more structured, more available this year, and in the coming decade. I’m sure there’s going to be a lot of new technology, a lot of new tools and a lot of new policies in place. Claire Leibowicz, thank you so much for joining us, sharing your take on it, and for all the great research you’re doing. That was Claire Leibowicz, who’s leading the AI and Media Integrity Program at the Partnership on AI.
Thanks, David, for the really thoughtful questions. It was so fun chatting.
Thanks for joining us on the show. Thank you for listening to this episode of the HumAIn podcast. Did the episode measure up to your thoughts on ML and AI, data science, developer tools and technical education?
Share your thoughts with me at humainpodcast.com/contact. Remember to share this episode with a friend, subscribe and leave a review. And listen for more episodes of HumAIn.