Machine Learning Bias and Fairness with Timnit Gebru and Margaret Mitchell

This week, we dive into machine learning bias and fairness from a social and technical perspective with machine learning research scientists Timnit Gebru from Microsoft and Margaret Mitchell (aka Meg, aka M.) from Google.

They talk with Melanie and Mark about ongoing efforts and resources to address bias and fairness, including diversifying datasets, applying algorithmic techniques, and expanding research team expertise and perspectives. There is no simple solution to this challenge, and they share insights into what work is in progress in the broader community and where it is going.

Timnit Gebru

Timnit Gebru works in the Fairness Accountability Transparency and Ethics (FATE) group at Microsoft Research's New York lab. Prior to joining Microsoft Research, she was a PhD student in the Stanford Artificial Intelligence Laboratory, studying computer vision under Fei-Fei Li. Her main research interest is in data mining large-scale, publicly available images to gain sociological insight, and working on computer vision problems that arise as a result, including fine-grained image recognition, scalable annotation of images, and domain adaptation. The Economist and others have recently covered part of this work. She is currently studying how to take dataset bias into account while designing machine learning algorithms, and the ethical considerations underlying any data mining project. As a cofounder of the group Black in AI, she works to both increase diversity in the field and reduce the impact of racial bias in the data.

Margaret Mitchell

M. Mitchell is a Senior Research Scientist in Google’s Research & Machine Intelligence group, working on artificial intelligence. Her research involves vision-language and grounded language generation, focusing on how to evolve artificial intelligence toward positive goals. Margaret’s work combines machine learning, computer vision, natural language processing, social media, and insights from cognitive science. Before Google, Margaret was a founding member of Microsoft Research’s “Cognition” group, focused on advancing artificial intelligence, and a researcher in Microsoft Research’s Natural Language Processing group.

Cool things of the week
  • GPS/Cellular Asset Tracking using Google Cloud IoT Core, Firestore and MongooseOS blog
  • GPUs in Kubernetes Engine now available in beta blog
  • Announcing Spring Cloud GCP - integrating your favorite Java framework with Google Cloud blog
Interview
  • PAIR | People+AI Research Initiative site
  • FATE | Fairness, Accountability, Transparency and Ethics in AI site
  • FAT* Conference site & resources
  • Joy Buolamwini site
  • Algorithmic Justice League site
  • ProPublica Machine Bias article
  • AI Ethics & Society Conference site
  • Ethics in NLP Conference site
  • FACETS site
  • TensorFlow Lattice repo

Sample papers on bias and fairness:

  • Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification paper
  • Facial Recognition is Accurate, if You’re a White Guy article
  • Mitigating Unwanted Biases with Adversarial Learning paper
  • Improving Smiling Detection with Race and Gender Diversity paper
  • Fairness Through Awareness paper
  • Avoiding Discrimination through Causal Reasoning paper
  • Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings paper
  • Satisfying Real-world Goals with Dataset Constraints paper
  • Axiomatic Attribution for Deep Networks paper
  • Monotonic Calibrated Interpolated Look-Up Tables paper
  • Equality of Opportunity in Machine Learning blog

Additional links:

  • Bill Nye Saves the World Episode 3: Machines Take Over the World (includes Margaret Mitchell) site
  • “We’re in a diversity crisis”: Black in AI’s founder on what’s poisoning the algorithms in our lives article
  • Using Deep Learning and Google Street View to Estimate Demographics with Timnit Gebru TWiML & AI podcast
  • Security and Safety in AI: Adversarial Examples, Bias and Trust with Mustapha Cisse TWiML & AI podcast
  • How we can build AI to help humans, not hurt us TED
  • PAIR Symposium conference
Question of the week

“Is there a gcp service that’s cloud identity-aware proxy except for a static site that you host via cloud storage?”

Where can you find us next?

Melanie will be at FAT* in New York in Feb.

Mark will be at the Game Developers Conference | GDC in March.

MARK: Hi, and welcome to episode number 114 of the weekly Google Cloud Platform podcast. I'm Mark Mandel and I'm here with my colleague, Melanie Warrick. How are we doing this fine morning, Melanie?

MELANIE: Great Mark. How are you doing?

MARK: I'm doing very well. Glad to be back in the recording studio with you, today.

MELANIE: Yes, it is good to be back here, as well.

MARK: Yes. Talking is hard.

MELANIE: And today, we are going to be-- on this very special day-- we're going to be actually talking with Timnit and Margaret who are coming from Microsoft and Google, respectively, working in machine learning and AI research. And they're also, specifically, here to talk to us about machine learning bias and fairness, and they do a lot of research around that, which we had a great time talking to them about. So I'm looking forward to that interview.

MARK: Yeah, amazing.

MELANIE: And unfortunately, we didn't have enough time for it. But before we get to that, we always do the cool things the week and we also have our question at the end of the podcast, which is basically a chat between Mark and a mutual friend of ours, KF. The question is, pretty much that, is there a GCP service that's basically Cloud Identity-Aware Proxy, except for a static site, that you host via Cloud Storage? So we'll get into that at the end of the podcast.

All right, Mark, what's a cool thing of the week that's on your mind?

MARK: Cool things of the week. So I'm going to go to the community first, rather than one of our blog posts. There is a really cool "Medium" blog post talking about GPS cellular asset tracking using Google Cloud IoT Core, Firestore, and Mongoose OS. It is written by someone whose name I am, unfortunately, going to completely and utterly butcher-- I'm going to go with Alvaro Viebrantz-- seems close enough.

MELANIE: Sounds good to me.

MARK: It's a really great article that goes into great technical depth about how they put all the pieces together. It shows off a lot of code on how, basically, they get information from the assets themselves or the mobile devices into IoT Core, which then takes things into Pub/Sub, which then puts things into Cloud Functions, which then goes into Cloud Firestore, which then goes into a web app based on Firebase Hosting. It's a really great read. You should probably check that out, if any of that sounds particularly interesting to you.

MELANIE: Yes. And they provide a tutorial on how to do this, as well, which is really awesome. So definitely check that out.

MARK: What have you got?

MELANIE: All right. I have that there are GPUs now available with Kubernetes Engine. So basically, GPUs in Kubernetes Engine are now available in beta. And when you are spinning up anything with Kubernetes Engine, you can just set up what kind of GPUs you need-- type, number, and you're good to go.

MARK: Sweet. Finally, I know there are a lot of Java programmers out there, and Spring is a very popular framework. So we are actually announcing Spring Cloud GCP, integrating your favorite Java framework with Google Cloud. So if you're using Spring and you want to use a product, like say, Cloud SQL, or Cloud Pub/Sub, or Cloud Storage, or Stackdriver Trace, or the Runtime Configuration API, there are direct integrations with Spring frameworks, such as JDBC, Spring Resources, and Spring Cloud Sleuth, that are available.

If you want to get started, there are links to code samples, reference documentation, the GCP project page, as well as Spring Cloud codelabs-- wow, that's hard to say-- that you can also go and have a play with. So if Spring is your thing-- oh, that was fun-- go check that out.

MELANIE: It's coming up. All right. Well, I think it's time to go talk with Timnit and Margaret, so let's do that. This week's podcast, we're excited to have Timnit Gebru and Margaret Mitchell join us to talk about machine learning fairness. They both have experience and research backgrounds, especially around machine learning, cognitive science, computer vision, and natural language processing. I'm definitely excited to have them here, because they've been doing some work around bias and fairness in the machine learning and AI space that I think will be very interesting to hear about. So welcome.

MARGARET: Thanks. Thanks for having us.

TIMNIT: Thanks for having us.

MELANIE: So Timnit, can you give us a little more information about your background?

TIMNIT: Sure. I have a very roundabout way of arriving where I am right now. I studied electrical engineering at Stanford. And well, I am originally from Ethiopia. I was a refugee here. I came to the States when I was 16. And so then I went to Stanford to study electrical engineering, and I did analog circuit design, like hardware. And I did that for my master's-- and then I worked at Apple doing that-- designing circuits and things like this. And then I got a master's. And then somehow, I was working in device-- something called device physics, which is, then, even more analog circuit design.

And somehow, I started getting interested in image processing, and then that became computer vision. And then I just kind of switched my entire direction to do computer vision in my PhD. And then towards the end, when I was working on computer vision, I was very, very worried about the lack of diversity in that entire field, and just in AI, in general. Every time I went to a conference, I would see very few women, but also very, very few black people.

And then towards the end of my PhD, I also read this "ProPublica" article about software that's predicting crime recidivism rates, so that started to worry me. And then I decided I wanted to spend a little bit of time trying to understand the societal impacts of AI and some work of people who were working on fairness, and things like this. Meg has worked on it way longer than me.

So then I joined Microsoft Research in New York and the FATE group. So FATE stands for fairness, accountability, transparency, and ethics. So right now I'm doing a post-doc there.

MELANIE: Thanks. That's great. And Margaret, your background-- I know you're at Google, but you've also had a background with Microsoft.

MARGARET: Yeah, so I did my PhD in computer science focusing on natural language processing. And then, I eventually joined MSR as a researcher there, in the natural language processing group, under Bill Dolan, and worked a lot on language generation, which is producing linguistic descriptions about the real world. My work was generally grounded in images. So I worked on that while at Microsoft Research, and I was really noticing and focusing on some of the things that my systems were starting to generate.

As we moved into deep learning, the language that was being generated seemed more and more human-like, and within that human likeness, there were more and more things that were alarming me. And so I started focusing more and more directly on a couple of things. One, is how can we take the technology we're working on and put it forward for positive, ethical-use cases, long term? And that was really focusing on things like accessibility, as well as how can we do things like mitigate bias or address problematic human biases and stereotypes in the data we're training on, so that eventually we don't propagate or amplify the biases.

There was an opportunity for me to really focus on the ethics part of this at Google. And so, after three years, I left Microsoft Research and then joined Google to work directly on fairness and ethics and machine learning.

MELANIE: Nice. I know when I met both of you, you were co-hosting or co-running an AI ethics group that was made up of people from all different groups, all different companies, education backgrounds, and it was a great session in terms of just talking about some of the issues-- the broader issues.

I think the main question I want to ask is, to just help our listeners starting out, you've touched on it, but can you give us how to think about what is ML fairness?

MARGARET: That's a really big question. So when we talk about ML fairness, we're often talking about something like prejudice or stereotypes that are propagated by algorithmic systems, and specifically, machine learning systems. There's a lot of ways to talk about what fairness is. And bias comes in, and then bias can mean a few different things, some of it mathematical, some of it social.

Basically, the fairness world encompasses a lot of things that have to do with diversity and inclusion in the outputs of our systems, and looking at whether or not there are disproportionate negative effects or disproportionate inaccuracies on some subgroups, corresponding to different kinds of social groups like race, gender, and age.

MELANIE: Thanks. And in terms of the work that you've both done, I know you've talked about the groups that you've been involved in-- can you tell us a little bit more about what Google's doing, what Microsoft is doing in these spaces with ML fairness?

TIMNIT: I can talk about what Microsoft is doing, specifically-- especially my group. So one of the reasons I joined FATE-- it's a newly created group-- is because it's an interdisciplinary group. So I really believe that you can't address this problem by just talking to other machine learning people, or by just talking to other lawyers, because, for example, there are some laws-- equal employment opportunity laws, for example, that I don't think are being applied to algorithms, like hiring algorithms that could have biased outputs or things like this.

So at FATE there are people like Kate Crawford-- Hanna Wallach does machine learning, but also computational social science. [INAUDIBLE] and there are also other people who work with us peripherally, like Glenn, who is a very well-known economist.

And so, what we do is we not only try to understand societal impacts, but also then work on some technical machine learning solutions. Personally, what I've been very focused on, even though what I like to do most is technical work-- computer vision or purely machine learning-- this year, I've been more focused on trying to uncover some biases that exist in commercial APIs, for example, that are being sold to people. And then trying to figure out how to have a standardized way of providing some sort of information for people.

So for example, from my hardware background, I know that any little component that you use, from the cheapest resistor to something as complicated as a CPU, always comes with something called a data sheet. And so, for me, I'm trying to evangelize this idea of a data sheet for pretrained models, APIs, and data sets, and trying to understand what that would look like, what the pitfalls are. Or if we had a data sheet for this kind of process, what it would look like, and trying to talk to other product groups as well-- trying to work with them to see what it would look like for them-- if it's useful for them to have this idea, et cetera, et cetera.

So in addition to, I guess, the technical work, I'm personally more focused on trying to understand policy and what kinds of standardizations we can have, and things like this.

MELANIE: Nice.

MARGARET: At Google, one of the things that really, really appealed to me was the ability to work across the organization. And so part of my job now involves talking to people who are working on mobile devices, Assistant, Cloud-- all these different kinds of products and services that Google is offering-- and talking to them at the level of code, at the level of data: what's going on that might be amplifying different kinds of biases?

We're starting to put out some stuff publicly, specifically focused on machine learning code that can help mitigate different kinds of biases. We're working incredibly hard internally, across a variety of different dimensions, from what the user experience should be like when you have some sort of bias issue, to what kind of input we want from different kinds of sources, like Timnit says, including people from policy, including people who have focused on ethics philosophically-- really trying to dig into this cross-disciplinary effort across Google as a whole.

So it's been a really nice chance for me to try and make change from within, working directly with products that I know are touching millions or billions of people. And really walking through how can we make this really work for everyone.

MELANIE: Nice. It's nice that there's both the internal, but also like, Timnit, you're saying, working in this cross-disciplinary way of looking at how to think about this on a broader scale, too.

MARK: Timnit, you touched on this briefly. I'm happy to let you all talk. You know way more about this than I do, but Timnit, you touched on this briefly about-- you're talking about biases in hiring practices. I wonder if you could expand on maybe some common or possibly some not well known biases that ML can introduce? Or what-- basically, some examples, especially for those people who aren't necessarily as familiar with the space.

MELANIE: And I know you've got a research paper that was out around demographics, especially looking at cars, so I know that's one that might be on your mind.

TIMNIT: So let me answer your question in a particular way, to bring in something I always want to talk about, which is diversity. So this is why I think diversity is important. It is basically almost like having domain knowledge. So for example, when-- I guess this is a well known example, which is the crime recidivism rates that I was talking about.

There was this "ProPublica" article that talked about how this company was using machine learning to predict someone's likelihood of committing a crime again. So when I read this article, I was immediately, absolutely horrified, because I just knew that there was absolutely no way that this software would not be biased. The reason is that I've been a black woman all my life and I live in the US and I know all sorts of things that happened to my friends. I also know statistics for the likelihood of being caught by police, if you commit a crime as a black person versus not, in certain neighborhoods, et cetera.

So all of these things just came to my brain when I was reading about this article and I immediately-- even though I have never worked in fairness before, and I didn't even think before that about this concept of algorithmic fairness. That article, for me, was very scary, because I had this domain knowledge that this bias exists, because of my background and because of the types of things I've been reading about.

So I guess my point is that there are many such things where-- so one thing I learned, for example, is that speech recognition does not work well for younger people, but also people who have hearing impairments, or things like this. And I'm not in that community, so it's more difficult for me to know about these biases than someone who has that type of domain knowledge. It's important for me to interface with someone that has that type of domain knowledge, in order to know about these biases.

So for me, this issue of bias and diversity go hand in hand. And sometimes I get frustrated if we only talk about the technical aspects, because I just feel like they go hand in hand.

MARGARET: Timnit, I really hear you on that. And one thing that I think has really struck me as I've worked on fairness and I've worked with others on it, is that it's actually really difficult to work on machine learning models where you're recognizing, for example, that this only works for pale males, or this only works for men and women, and not see the same kind of patterns in your everyday work life.

I mean ideally, these are two separate worlds, but I think for the engineers and researchers, we're good at seeing these kinds of patterns, we're working on these kinds of patterns, and not seeing them in the world is no longer an option. It's kind of like once your eyes are opened to it, then you see it. Although they are sort of separate, I think the people who get the most inspired by it-- some of the people who are the most driven-- are the people who are seeing the patterns on both sides of the coin, both in the machine learning models world, as well as in their everyday work life.

TIMNIT: I was also going to say, one of the things I was thinking about is that in many of the papers or the works, when we're talking about machine learning and fairness, we don't look at the data generation process. I guess there is this paper of Moritz's on causality that tries to look at, OK, how could this data have been generated. But most of the time, we just take the data and we say, given this data, what are the biases that exist? What notions of fairness can we use, and stuff?

And one direction-- I don't know how to bring this in-- but one direction I really want to think about is, how can you transfer the knowledge that you have about what types of biases could exist to your model of how this data could have been generated, and therefore, in what ways it could be biased, so that maybe you can have knobs.

Instead of-- if you had a crime recidivism rate model, instead of giving it to some judges and it just spits out a probability saying, this person has x likelihood of committing a crime, maybe the judge-- if someone had a tool to say, OK, what if I know that this particular data could have been biased in a particular way. For example, this particular zip code is x times more heavily policed or something like that.

What if I can have some knobs and change what kinds of biases could exist, and then that tells me what the variance could be for a particular person, like prison outcomes, or something like this. I don't know. In that case, then your knowledge of the world and society is very, very useful. And you could transfer it directly to your work.

MELANIE: Do you think that there are certain problems that we should not be solving right now with machine learning, especially based on the fact that we have bias in our data, bias in the way we're developing?

MARGARET: I think we need to be really careful to include the experts and the people who are really qualified to work on different problems when we're thinking about using machine learning. And I can give us some specific examples. One thing is, when we work on machine learning in the clinical realm or the health domain, one of the issues that is really difficult from a pure machine learning perspective is how to create technology that is directly useful for clinicians and that also has some sort of explainability for patients.

And how do you work with consent around clinician and patient, if you're using machine learning models? And how exactly will this be applied? And one thing that's really struck me about this, is that we need to make sure that as we work on machine learning, in light of diversity and in light of inclusion, include all the people who this would be directly affecting and who would potentially be working with this, in order to understand what the real problems are and bring in the expertise where we don't have it as computer scientists.

Machine learning is super powerful and it's easy to get into this thought process where you feel like you can do anything, like you're Superman or something. And that misses the fact that there are so many nuances, and so many details across so many different disciplines that you're just going to overlook when you're only focusing on the machine learning hammer.

MELANIE: That's a great point, considering, I know, there's always the question of how do you test these algorithms-- how do you fully test these algorithms? And that-- maybe not fully, but definitely looking at a broader way to test the algorithms.

TIMNIT: I totally-- I always agree with Meg, but I totally agree with this, because one example was, I was talking to Adam Kalai about some idea I was really excited about. And we just went on the whiteboard, writing probabilities, and things like that. And then Glenn, who is an economist, said, well, but the way it would really work in the courts is someone would sue someone like this, and he just kind of killed our idea like that.

But it was really interesting to hear his perspective, because I had absolutely no idea what he was talking about before. But he explained it to me. But unless you work with people like that, you're not going to come up with something that's directly applicable, and you're not going to come up with something that actually addresses the problem. You might come up with a paper that's cool, but I don't think you are going to come up with something that actually addresses the problem.

MARK: Cool, this is super fascinating. I'm also kind of curious to hear-- you've touched on it a bit, saying that we shouldn't only talk about the technical side, but I'd love to talk about the technical side, too, and see if there are solutions on the software side of things as well, or in how you build your models, or how you build your data sets? Can you talk a little about that, to see if there is opportunity there, as well, to help with biases?

MARGARET: Yeah. There's definitely a lot to do. We're not in a situation where we have a silver bullet or a clear solution, but we do have a lot of things that we can do, for example, on the data side, by including sources that are not just those that are the easiest-- or most available to us. And actually thinking about sampling from locations across the world, different kinds of demographic categories, making sure that what we do is representative of who we want to serve and who we want to work with, as opposed to what's easily accessible.

And this is something that requires people who are good at collecting data from people, which might not be machine learning scientists, necessarily, but maybe a whole other section of expertise. There's a lot to be done on how to create inclusive data sets, data sets that capture a lot of different world views, that aren't necessarily going to be downloadable from Flickr or Twitter by doing some simple query searches.

On the machine learning side, there is a lot of really interesting work coming out now in the space of bias mitigation. It's also kind of called un-biasing or de-biasing, which is an oversell. That's a bit of hype. What we're actually doing when we work with these models, is we take a look at some sort of known biases, some known stereotypes.

For example, Timnit mentioned the Kalai work associating woman to homemaker, while associating man to computer programmer. These are the kinds of biases that we can discover and then build models around in order to define, what is this subspace of bias? What is this subspace of gender bias, for example? And can we do some cool math to address it-- doing something like subtracting it out from our representations.
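[The "subtracting it out" idea Margaret describes can be sketched in a few lines. This is an illustrative version of the "neutralize" step from the debiasing word embeddings paper linked above; the vectors here are made-up toys, not real embeddings, which use hundreds of dimensions.]

```python
import numpy as np

# Made-up 4-dimensional toy vectors standing in for word embeddings.
emb = {
    "he":         np.array([ 1.0, 0.2, 0.1, 0.0]),
    "she":        np.array([-1.0, 0.2, 0.1, 0.0]),
    "programmer": np.array([ 0.4, 0.9, 0.3, 0.1]),
}

# 1. Estimate a gender direction from a definitional pair.
gender_dir = emb["he"] - emb["she"]
gender_dir = gender_dir / np.linalg.norm(gender_dir)

# 2. "Neutralize": project out the component along that direction
#    for a word that should be gender-neutral.
def neutralize(vec, direction):
    return vec - np.dot(vec, direction) * direction

debiased = neutralize(emb["programmer"], gender_dir)

# The debiased vector is now orthogonal to the gender direction.
print(np.dot(debiased, gender_dir))
```

[The full method in the paper also equalizes definitional pairs around the neutralized words; this sketch shows only the projection step.]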

I've been doing some fun work at Google where we do adversarial predictions. We basically take predictions on some sensitive attribute and then negate the gradient that we get from that, in order to force the model to not be able to make decisions based on race or based on gender or based on different kinds of sensitive characteristics.

There's a lot that is coming out and can be done on the ML side. There's a lot more to do. And I'd love to see a lot more focus on the data side but these are just-- these are just starts. This is just the start of something that could be much bigger.
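[The gradient-negation trick Margaret describes can be sketched roughly as below. This is not Google's implementation; the data, dimensions, and hyperparameters are all invented for illustration. A shared representation feeds both a task head and an adversary head that tries to predict the sensitive attribute; the heads train normally, but the shared weights ascend the adversary's gradient.]

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented toy data: y is the main task label, z is a sensitive
# attribute correlated with the same underlying feature.
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(float)   # task label
z = (X[:, 0] > 0).astype(float)             # sensitive attribute

w_enc = rng.normal(size=(5, 3)) * 0.1       # shared "encoder"
w_task = rng.normal(size=3) * 0.1           # task head
w_adv = rng.normal(size=3) * 0.1            # adversary head

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-np.clip(a, -30, 30)))

lr, lam, n = 0.1, 1.0, len(X)

for _ in range(200):
    h = X @ w_enc                            # shared representation
    err_task = (sigmoid(h @ w_task) - y) / n  # logistic-loss gradient signal
    err_adv = (sigmoid(h @ w_adv) - z) / n

    # Each head descends its own loss gradient.
    w_task -= lr * (h.T @ err_task)
    w_adv -= lr * (h.T @ err_adv)

    # The encoder descends the task gradient but ASCENDS the
    # adversary's gradient (the negation), pushing h toward
    # carrying no information about z.
    g_enc = X.T @ (np.outer(err_task, w_task) - lam * np.outer(err_adv, w_adv))
    w_enc -= lr * g_enc
```

[In practice this is done inside a deep learning framework with a gradient-reversal layer rather than hand-written gradients, and the strength `lam` of the adversary is tuned carefully.]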

TIMNIT: So one thing I was thinking about is that we talk about certain things, like, for example, I gave the example of this OpenCV library not detecting Joy's face. So a lot of times we talk about diversifying our data sets. So even with our paper with Joy, we have this more diverse data set, so people can test out their algorithms. But humans-- as a human, if I've never seen-- is this true? If I've never seen a black person before, and I see a black person for the first time, I would identify them as a black person, right? I mean as a person. I wouldn't just not detect their face. Or someone--

If I see people of a race I've never seen before, given that all of the data I've seen before is not from that race-- I feel like maybe there's an opportunity for us to go farther than just purely supervised machine learning, where if you don't see it in your data set, then it's not accurate, and in order to make it more accurate, you need to diversify your data set. I feel like it could help us have a paradigm shift in how we approach these types of problems.

MELANIE: Well, and I know there's the value of using these models, especially to think of new ways to solve problems or new ways to think about things. That's the real value add that several of these models have brought about, which touches on what you're talking about, like this unsupervised learning. We try not to tell the model what to think, and try not to influence what the model should be thinking-- we try to let it discover on its own. But I know the biggest challenge is with the data to begin with. Do you think that there's still a way to fully pull out the bias, especially when you are working with predictive models that are doing something in relation to people?

TIMNIT: So one thing that's interesting, is that-- if we pick the attribute, the, what we call sensitive attributes, or protective-- protected classes-- so race, gender, age, disability, sexuality, religion, etc, etc, etc-- then I think we have a better chance, because we can see what the disparities could be with those things.

But when you look at things like health care, when you move into a broader view of things, these are not going to be the only things that you will have systematic bias in, in my opinion. And so then the question will be, I think, how could we discover the bias that we might not already know exists?

So even in the work that Meg mentioned, they already knew-- they kind of guessed that there would be a bias, a gender-based bias. So what about other attributes that we haven't thought about? Because we're not 100% diverse. Even if you're well-meaning, there is going to be a bias that you're not aware of, unless you're interacting with somebody from that part of the world. So how do we also address those issues? I don't really have the answer to that. But, that's a question I'm thinking about.

MELANIE: You have to have all the answers.

MARGARET: One thing-- one thing that is really useful-- and I'm going to put a plug here for this, in case there are machine learning people listening-- one thing that is really useful to do is to take a look at your false-positive rate and false-negative rate, where the false-negative rate is 1 minus the true-positive rate, so this is the kind of thing you get from an ROC curve.

I think in both NLP and computer vision, there tends to be a focus on precision and recall-- in NLP, F-score is big; in vision, accuracy is big. This overlooks the kinds of errors that might disproportionately affect different kinds of subgroups, different kinds of groups of people. And so having some focus, at least in our classification systems, on what the differences are in the false-negative rate and false-positive rate can give us a sense of how these things might be underperforming when we do intersectional analyses by subgroup.

That doesn't say much about score-based systems or ranking-based systems, but there is some basic things that we can do with our given machine learning and statistics machinery to focus in on some sort of statistical version of disparate impact, or how things might be affecting some subgroups worse than others.
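[Margaret's suggestion, slicing the false-positive and false-negative rates by subgroup rather than reporting one aggregate accuracy, can be sketched as follows. The labels, predictions, and group column here are all made up for illustration.]

```python
import numpy as np

# Invented predictions from a binary classifier, with a subgroup
# label per example.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0, 1, 1, 0, 0])
group  = np.array(["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"])

def rates(y_true, y_pred):
    """False-positive rate and false-negative rate."""
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    return fp / np.sum(y_true == 0), fn / np.sum(y_true == 1)

# An aggregate accuracy hides whether errors fall evenly; the
# per-group breakdown makes disparities visible.
for g in np.unique(group):
    mask = group == g
    fpr, fnr = rates(y_true[mask], y_pred[mask])
    print(f"group {g}: FPR={fpr:.2f}  FNR={fnr:.2f}")
```

[The same intersectional breakdown extends to combinations of attributes, which is the analysis the Gender Shades paper in the links above performs for commercial gender classifiers.]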

TIMNIT: In our current paper, we were analyzing APIs-- gender classification APIs. They just look at a face and give you a binary male/female label. They don't give you a probability or anything like that, just the label. And this does not allow you to adjust for your particular setting. You might want different trade-offs between false-positive and false-negative rates, and if you don't get the probability output, you don't have a chance to adjust this.

MARGARET: The thresholds-- yeah.

TIMNIT: The threshold-- exactly. So this is one concrete step, I think, that commercial APIs can take, because I'm certainly very worried about commercial APIs. Because at least-- OK, in research, you have an understanding that it's research. Maybe it's only for the research community, etc.
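The threshold point is easy to make concrete: if an API exposed probability scores rather than only hard labels, each deployment could pick its own operating point. A minimal sketch (the function and numbers are hypothetical, not from any real API):

```python
def classify(probs, threshold=0.5):
    """Turn probability scores into hard labels at a chosen threshold.

    Lowering the threshold trades false negatives for false positives;
    an API that returns only hard labels fixes this trade-off for every
    user, no matter what their setting requires.
    """
    return [1 if p >= threshold else 0 for p in probs]

scores = [0.30, 0.45, 0.55, 0.90]
print(classify(scores))        # default operating point
print(classify(scores, 0.4))   # a caller who cares more about misses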

These commercial APIs are being used by so many different startups, so many different people. And we don't know-- we don't have any regulation--

MARGARET: Potentially governments.

TIMNIT: Governments can use them. And we don't have-- they're not accountable. They don't have to tell us what they're using, what the accuracy is, what the characteristics of the software are, or for which purposes. So I guess I'm bringing it back to the standardization point. I think Meg's example is a great example of the kinds of additional information that these APIs need to give us.

MELANIE: And interpretability-- I know that's a huge one, too, in terms of understanding what's really going on inside of the model to begin with. And Mark, I want to make sure you have a chance-- I know you had another question you wanted to ask.

MARK: No, you're good. Keep going.

MELANIE: OK. Well, I think that one of the other questions-- and you both have touched on this already-- is, what are some of the other tools in the space that you'd recommend people can use now to help with their own work in machine learning and driving fairness?

MARGARET: So on my side, I can speak to some Google products and services that I'm actually kind of excited about and have been working with a bit. So one thing is Facets, which is a visualization for ML data sets. It's available on GitHub. And it's a way that you can do different slicing across different kinds of groups or subgroups or user demographic characteristics, in order to see how well your data, or how well your output, is representing each of these different kinds of groups.

It's a really handy visualization tool. And you can look at the effect of different thresholding and do a lot of really cool stuff with it. So that's something that Google has put out to help on the data side of understanding some of the possible biases. There's also this thing that's come out from Google called Lattice, which helps with some kinds of post-hoc analyses and interpretability for different machine learning models. I believe it's publicly called-- yes, TensorFlow Lattice.

And we're putting more and more into TensorFlow to help with different kinds of machine learning problems that can deal with fairness and bias. We're keeping it really technically focused right now, but creating this code base that we can share with others to help with these kinds of bias and fairness issues.

MELANIE: Great. Timnit, what about you?

TIMNIT: Oh, I would recommend that people check out the FAT* Conference that's about to happen. The tutorials section has a whole bunch of tools, like software to visualize your data and what kind of discrimination might exist to pre-process some data, some tools to help with interpretability. So really like all-- I think a lot of the tutorial is just about these different tools. So I would recommend that people check those out.

MARGARET: Joy is totally using Facets, by the way, Timnit. I don't know if you know that, but for all of her beautiful visualization-- totally using Facets. I'm like dude, that's the tool.

MARK: I'll jump in real quick. You've mentioned Joy twice. Do you want to talk a little bit about what that is, because I think we talked about it before the actual podcast started.

TIMNIT: Joy is a person-- a person who is a bundle of joy--

MARGARET: She is a person--

[LAUGHTER]

TIMNIT: --and energy.

MARGARET: She is a mutual friend--

TIMNIT: She is a bundle of joy and energy. Joy is a student at the MIT Media Lab who focuses on face recognition and bias in face recognition. And I consider-- I mean, at this point, she's a computer vision person. She keeps on saying that she's not, but-- So Joy--

MARGARET: Her name is Joy Buolamwini, we should say--

TIMNIT: She has something called the Algorithmic Justice League. She even gave a TED talk. I mean-- and she just started her PhD. I met her two years ago-ish, and we've been collaborating ever since. Our current paper at FAT* is actually based on her master's thesis. So she spent a whole bunch of time analyzing face recognition, and she did this intersectional kind of analysis.

So usually, people just look at aggregate accuracy on some sort of data set, and then maybe they break it down by male, female, or race. And so she said, OK, you can't just break it down by just gender or just race or just age-- you have to look at the intersection of gender and race, or gender, race, and age, et cetera, et cetera.

And then, she also thought, people have done this analysis across different racial groups, but race, as we know, is a very unstable category. What is considered black in the US is very different from Brazil, from South Africa, from Ethiopia, etc. And it's not even stable across time. So her idea was to use a more objective metric of skin tone, the Fitzpatrick Skin Classification System, and analyze accuracy by the intersection of gender and that. And also other things, too. But for this paper, it was just gender.
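The intersectional breakdown Timnit describes amounts to grouping results by the combination of attributes rather than one attribute at a time. A minimal sketch with made-up data (the function, attribute values, and numbers are hypothetical, not from the Gender Shades paper):

```python
from collections import defaultdict

def accuracy_by_intersection(records):
    """Accuracy broken down by the intersection of two attributes.

    records: list of (correct: bool, gender, skin_type) tuples.
    Aggregate accuracy, or a breakdown by gender alone or skin type
    alone, can hide gaps that only appear at the intersections.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for correct, gender, skin in records:
        key = (gender, skin)
        totals[key] += 1
        hits[key] += int(correct)
    return {k: hits[k] / totals[k] for k in totals}

# Toy data: every subgroup looks fine except darker-skinned women.
records = [
    (True, "male", "lighter"), (True, "male", "lighter"),
    (True, "female", "lighter"), (True, "female", "lighter"),
    (True, "male", "darker"), (True, "male", "darker"),
    (True, "female", "darker"), (False, "female", "darker"),
    (False, "female", "darker"), (False, "female", "darker"),
]
print(accuracy_by_intersection(records))
```

In this toy set, overall accuracy is 70% and every marginal group looks reasonable, but the ("female", "darker") cell sits at 25%-- the pattern that only an intersectional slice reveals.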

And so, Joy has been very vocal about this. She's been working with various organizations, like the IEEE, to come up with a new standard. But in addition to that, she also works on algorithmic fairness more broadly, from the social standpoint and from the technical side. She does so many things, I can't keep track of all the things she's doing. She also makes short movies, and she also raps, by the way. And--

MARGARET: She is super inspiring.

TIMNIT: Again diversity, for me, comes in, because people like her are generally either turned away from the field of machine learning or somehow weeded out.

MARGARET: Yeah.

TIMNIT: I think that type of personality is generally given some sort of subliminal message that they don't belong in this environment. And we really need to change the educational system, or the marketing, or whatever it is that is excluding people like her, to bring in people like her, because we absolutely need those types of people in this field, especially when we're talking about fairness.

MELANIE: Completely agree.

MARGARET: Just to be clear, her last name is Buolamwini. And you can check out her TED talk if you want, which covers a lot of this. The last name is spelled B-U-O-L-A-M-W-I-N-I.

MARK: We'll put a link in the show notes.

MARGARET: Awesome. Awesome. Yeah.

TIMNIT: Also, ajlunited.org is the Algorithmic Justice League.

MELANIE: Great. Thank you. And I also wanted to make sure, Timnit-- can you also tell us, because I know you touched on it, a little bit more about FAT*? I know this is a conference-- are you one of the people who founded it?

TIMNIT: No, I'm on the program committee, and so is Meg.

MELANIE: OK.

TIMNIT: But the person who-- well, the main people are Sorelle Friedler, Suresh, and Solon Barocas. If you go to Fatconference.org-- I'm just a big fan. I always talk about it. I didn't-- I don't want people to think that I founded this conference.

MELANIE: No worries.

TIMNIT: But if you go to Fatconference.org, you'll see who the editors are and who the program chair is and things like this.

MARGARET: This was actually started as a workshop at NIPS, which is a top deep-learning conference. I believe Solon Barocas and Moritz Hardt were the original creators, a few years back. And that was before it was super hot, super hip to look at diversity in policy and these kinds of issues. They were really interested in combining machine learning and law-- things like taking the Civil Rights Act of 1964 and its notion of disparate impact, and then applying that mathematically to something like demographic parity.

And so really making those connections between the legal world and the ML mathematical world. And it's really gotten tons of interest and is really making some strides in this space. And so now it's become its own conference. And it switched from FAT ML to FAT* as a kind of nod to regular expressions. So it's fairness, accountability, and transparency, not just for machine learning, but for lots of other things as well.
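The legal-to-mathematical translation Margaret mentions-- disparate impact read as demographic parity-- is often operationalized as the EEOC's "four-fifths rule": the lowest group selection rate should be at least 80% of the highest. A minimal sketch under that assumption (function names and data are hypothetical):

```python
from collections import defaultdict

def selection_rates(decisions, groups):
    """Fraction of favorable (1) decisions per group."""
    pos, tot = defaultdict(int), defaultdict(int)
    for d, g in zip(decisions, groups):
        tot[g] += 1
        pos[g] += int(d)
    return {g: pos[g] / tot[g] for g in tot}

def passes_four_fifths(decisions, groups):
    """Four-fifths rule: min group selection rate must be at least
    80% of the max group selection rate (a statistical proxy for
    disparate impact, not a legal test)."""
    rates = selection_rates(decisions, groups)
    return min(rates.values()) >= 0.8 * max(rates.values())

# Group "a" is selected at 75%, group "b" at 25%: a clear disparity.
decisions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a"] * 4 + ["b"] * 4
print(selection_rates(decisions, groups))
print(passes_four_fifths(decisions, groups))
```

Here 0.25 is well under 80% of 0.75, so the check fails-- the mathematical analogue of the legal red flag.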

TIMNIT: You know what? What I'm excited about with this conference is the fact that it's interdisciplinary. Again, I am a huge fan of interdisciplinary teams, in general. And one of the things that I haven't seen before is that, in the tutorials, they also have these translational tutorials-- tutorials that translate a concept that is very well known in one specific area for people from a different specific area.

So it could be translating a very well-known machine learning concept to lawyers, or translating very well-known things in health to machine learning people. So it's a start at bridging this gap of knowledge between the different groups, so that we can have some sort of common understanding and common ground to start working together.

MELANIE: This is great. And I hate to say it, but we're pretty much at time, because we could talk about this for a while, clearly-- clearly. I did want to ask you both, as we get close to wrapping this up: are there any last things that you wanted to talk about, anything you want to make sure people are aware of, anything you want to plug? Margaret, do you have anything that comes to mind?

MARGARET: Right. So if people are interested in this space, it's not just fairness that is of interest, I think, to people working on fairness-- it's ethics, broadly. And so there are also ethics conferences to be aware of. There was just one that was co-located with AAAI, the AI, Ethics, and Society conference.

Coming up, I'm hosting one specifically for natural language processing-- ethics in natural language processing-- which will be in New Orleans on June 6th. So there are lots of other ways to get involved, if you'd like to take a look at ethics, broadly, with machine learning.

MELANIE: Right. Thank you. Timnit?

TIMNIT: Yeah. I want to say something about hype and ethics and fairness. I think that we should all care about fairness and ethics and diversity, of course, but sometimes, all of these words are used as buzz words. And sometimes, they're used for PR purposes. And sometimes, they're used for image purposes. So let me give an example.

If you have an all-male panel on AI for ethics, or AI for social good, or something like this, I have very little faith that this is actually AI for social good. Or if you have an all-male team, or a very homogeneous team, or you've never graduated a female student, or something like this, and now people are talking about AI ethics and things like that.

So people say charity starts at home, and I encourage everybody to think about our surroundings and think about AI ethics in that sense. Because it's not just about raising money or creating the next coolest fairness algorithm-- it's also about including people who are very well positioned to work on these problems in your immediate surroundings.

MARGARET: Yeah. One thing I like to look at is, how many papers have been published with women? What is your track record for publishing with women? What is your track record for producing projects with women? This is one of the things that gets overlooked, but it can also be a sign of a lack of understanding or inclusion of these kinds of other viewpoints.

TIMNIT: Yeah, and so this is why I'm always so happy that Meg is always talking about accessibility, as well, as part of the conversation around ethics, for example.

MARK: Wonderful. I wish we had more time. I really do. This has been a remarkable, remarkable conversation. And I really-- thank you so much, to both of you, Timnit and Margaret.

TIMNIT: Thank you.

MARGARET: Thank you.

MARK: For joining us. I really appreciate it.

MELANIE: Thank you very much.

MARK: Maybe, we just have to do this again at some point, soon.

MELANIE: I know. We'll have to do a part two.

MARGARET: Any chance I get to talk to Timnit, I'm really happy--

[INTERPOSING VOICES]

MELANIE: Next time, we'll get Joy.

MARGARET: Yeah.

[INTERPOSING VOICES]

MARGARET: That's good. She's like, missing.

MARK: Wonderful. Well, again, thank you so much for joining us.

MELANIE: Thank you, Timnit and Margaret, again, for that great interview. I really wish we'd had more time with them, because it was a good topic. It's an important topic, and it's something that we can explore in more detail in many ways. As Timnit and Margaret mentioned, the FAT* conference is coming up next week. And actually, over the weekend, there was some great press coverage for a paper that both Joy and Timnit worked on, titled "Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification."

And they do this assessment of facial recognition systems, basically of how good they are at recognizing people, especially from a skin-shade and gender perspective. And they use some interesting techniques to assess that. So we'll include links to that in our show notes, along with links to all the things that were referenced while we were talking to them. Good stuff.

So Mark, I've got to ask you this question. Question of the week. He's always got to get a good question in there. How do you feel? No. Our question of the week is, as I mentioned, from this wonderful chat you had with KF--

MARK: I always have-- any chat with KF is wonderful.

MELANIE: And it's all on Twitter. What you were both talking about is: is there a GCP service that handles Cloud Identity-Aware Proxy, especially for sites hosted via Cloud Storage? So Mark, is there?

MARK: Is there? OK. So if people aren't aware of Cloud Identity-Aware Proxy, it's a proxy that you can put in front of your applications that controls whether people have access, basically through their account logins. It works with App Engine Standard, App Engine Flexible Environment, Compute Engine, and Kubernetes Engine. But what KF wants to do is put it in front of Google Cloud Storage, for static sites.

So the short answer is, unfortunately, no. But there are a couple of interesting workarounds that I think are worth discussing. The first one-- and it's a bit of a cheeky answer-- is that you can always host static content on App Engine itself. There's no reason why not; it's actually really, really easy to do. There's a lot of opportunities there for [INAUDIBLE] static content. And if you're using some sort of programming language and you wanted to mix and match, there are opportunities to do stuff like that as well. So rather than putting the static content on Google Cloud Storage, you could put it on App Engine, and then put Cloud Identity-Aware Proxy in front of that. And you still get a lot of the really great edge caching you would get on Google Cloud Storage.
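Mark's first workaround-- serving the static files from App Engine itself so Cloud IAP can sit in front of them-- might look roughly like the app.yaml below. This is an illustrative sketch, not from the episode: the directory name and runtime are assumptions, and IAP itself is enabled at the project level in the Cloud Console, not in this file.

```yaml
# Hypothetical app.yaml serving a static site from App Engine Standard.
# All files live in a local "www/" directory (an assumed layout).
runtime: python27
api_version: 1
threadsafe: true

handlers:
# Serve the site root.
- url: /
  static_files: www/index.html
  upload: www/index.html
# Serve everything else as static files.
- url: /(.*)
  static_files: www/\1
  upload: www/(.*)
```

With this deployed, turning on Cloud IAP for the App Engine app gates every request behind a Google account login, which is the effect KF was after for Cloud Storage.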

The other option is an open-source project that's available on GitHub, and we'll have links in the show notes. It's called weasel, which is an interesting name in and of itself. It's basically an App Engine proxy for files that you store on Google Cloud Storage. So you put this up on App Engine, host whatever you need to host in Google Cloud Storage, and then put the Identity-Aware Proxy in front of that application running on App Engine.

MELANIE: And it's called weasel?

MARK: It's called weasel. Yeah-- I--

MELANIE: Great name.

MARK: Yeah, I don't know. I don't know. It's a thing. I'm going to assume the naming has something to do with-- weasels make tunnels, and it's like tunneling through from App Engine to Cloud Storage.

MELANIE: Sure, we'll go with that.

MARK: Yeah, I genuinely don't know. But those are the two options. Neither is exactly what KF is looking for, but it does give you some options to allow you to do something similar to what it is you're looking to do.

MELANIE: I know. She was definitely like, well, that's not really what I want. But hey, we're trying to help.

MARK: Yes.

MELANIE: OK. So Mark, where are you going to be in the next couple of weeks?

MARK: Still doing all the work that I possibly can for GDC. I'm pretty excited about that.

MELANIE: Is GDC a thing?

MARK: GDC is a thing. I'll start talking a bit more about that. But yeah, we'll be having a booth presence across Google on the floor, so I'll be giving talks there. There's some other fun stuff we're also organizing as well. So as we get a bit closer, and everything gets super locked down--

MELANIE: Lots of preparation.

MARK: Lots of preparation.

MELANIE: Nice.

MARK: What are you up to?

MELANIE: I'm going to go to New York for FAT* next week, and that's where you will be finding me. But outside of that, I think that covers us for this week.

MARK: All right. Well Melanie, thank you for joining me, once again, on the episode this week.

MELANIE: Thanks, Mark.

MARK: And thank you all for listening. And we'll see you all next week.

Hosts

Mark Mandel and Melanie Warrick

Continue the conversation

Leave us a comment on Google+ or Reddit