Innovation Monitor: Gartner Hype Cycle Trend #2 — Algorithmic Trust
Welcome to this week’s Innovation Monitor. Previous editions of this newsletter dug into the dangers of algorithmic bias, but what about its flip side — algorithmic trust?
As companies, governments, and the public become increasingly aware of the challenges behind the notion of an objective algorithm, many are trying to build and check solutions to create more transparency. We can thank researchers and journalists uncovering countless examples of discrimination, false positives, dataset bias, sheer carelessness, and much more in tech today. And these solutions encompass Gartner’s second major hype cycle trend — Algorithmic Trust.
We discussed this during one of NYC Media Lab’s virtual Machines+Media panels with Cathy O’Neil, who raised an important guiding question: For whom will [this tech] fail? She also warned that the use of these algorithmic tools is replacing difficult, complex conversations on everything from teacher evaluations to prison reform.
Here’s Gartner’s take on it: “Increased amounts of consumer data exposure, fake news and videos, and biased AI, have caused organizations to shift from trusting central authorities (government registrars, clearing houses) to trusting algorithms. Algorithmic trust models ensure the privacy and security of data, provenance of assets, and the identities of people and things.”
One example is authenticated provenance — “a way to authenticate assets on the blockchain and ensure they’re not fake or counterfeit.” (Check out The New York Times R&D team’s News Provenance Project.) Other emerging technologies include differential privacy, responsible AI, and explainable AI. This week, we’re diving deep into each.
Finally, if you’re looking for a documentary to watch this weekend, we’re recommending The Social Dilemma on Netflix. As always, we wish you and your community safety, calm and solidarity as we support each other through this unprecedented time. Thank you for reading!
Erica Matsumoto

BLOCKCHAIN

Gartner’s definition stresses blockchain as a way to solidify trust in the technology sphere. The provenance and authentication that blockchain tech allows for could be applied to such a wide array of use cases that it still has the potential to revolutionize the world.
Countless startups, bank pilot programs, academic initiatives, etc. have tested authentication and provenance using a public or private blockchain in the past few years, and I understand that every reader of this newsletter has almost certainly read a great deal on the topic. One of my favorite recent pieces was from the NY Times R&D group on fighting misinformation with blockchain. But, as I mentioned in the intro, I wanted to use this newsletter to dig into the other three concepts as they’re all incredibly timely and important.

DIFFERENTIAL PRIVACY

While big data can help unearth invaluable patterns that can be employed in health research, identifying discrimination, and even reducing traffic, the downside is that you’re aggregating massive amounts of personal information… and we’ve seen for decades how that can go sideways. Anonymizing the data isn’t foolproof either. In fact, we’ve known it wasn’t for years.
De-anonymization can take supposedly anonymized data and trace it back to an actual person. And while 2018’s NY Times investigation — Your Apps Know Where You Were Last Night, and They’re Not Keeping It Secret — put the dangers of so-called “anonymous” data into stark limelight, this isn’t something we’ve just discovered. Back in 2007, a few researchers from the University of Texas took anonymous data from the $1M Netflix Prize competition and traced it back to real IMDB reviewers. Does this 13-year-old paragraph sound familiar?
“Privacy worries have heightened in the past few years following a number of data breaches that have leaked sensitive information on millions of people. In November, the head of HM Revenue & Customs, the United Kingdom’s tax agency, resigned after two data discs containing sensitive, yet unencrypted, personal details of 25 million U.K. citizens were lost in the mail. In January, retail giant TJX Companies announced that data thieves had stolen the credit- and debit-card details on, what currently is estimated to be, more than 94 million consumers.”
You wouldn’t be surprised if that happened yesterday — actually, you’d likely be less surprised than 2007 you. Andrew Trask, who leads OpenMined, an open-source community that builds privacy tools for artificial intelligence, says that “just erasing a piece of someone’s fingerprint doesn’t get rid of the whole thing.” Multiple sources can help threat actors connect the dots or re-identify real-life counterparts based on seemingly anonymous data points.
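To make the "connect the dots" idea concrete, here is a minimal sketch of a linkage attack in Python. It is a toy illustration, not the method the University of Texas researchers used: the function name `reidentify`, the `min_overlap` threshold, and all of the data are hypothetical. It simply pairs an "anonymous" profile with a public one when the two share enough (title, date) events.

```python
def reidentify(anonymous, public, min_overlap=2):
    """Link anonymous IDs to public names via shared (title, date) events.

    An anonymous profile is linked only when exactly one public profile
    shares at least `min_overlap` events with it (hypothetical rule).
    """
    matches = {}
    for anon_id, anon_events in anonymous.items():
        candidates = [
            name
            for name, public_events in public.items()
            if len(anon_events & public_events) >= min_overlap
        ]
        if len(candidates) == 1:  # unambiguous match only
            matches[anon_id] = candidates[0]
    return matches


# Made-up "anonymized" ratings: names removed, events kept.
anonymous_ratings = {
    "user_1": {("Film A", "2005-03-01"), ("Film B", "2005-03-04"), ("Film C", "2005-04-10")},
    "user_2": {("Film A", "2005-06-01")},
}

# Made-up public reviews posted under real names.
public_reviews = {
    "alice": {("Film A", "2005-03-01"), ("Film B", "2005-03-04")},
    "bob": {("Film D", "2005-01-01")},
}

linked = reidentify(anonymous_ratings, public_reviews)
print(linked)
```

Even though no name appears in the first dataset, two overlapping events are enough to tie `user_1` to a named reviewer, and every other rating in that profile is exposed along with it.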
Differential privacy (also known as epsilon indistinguishability) might help here. According to Built In: “Differential privacy makes data anonymous by deliberately injecting noise into a data set — in a way that still allows engineers to run all manner of useful statistical analysis, but without any personal information being identifiable.”
Like de-anonymization, this practice isn’t new. In fact, it’s been around for well over a decade. Brookings links it to a 2006 research paper — Calibrating Noise to Sensitivity in Private Data Analysis.
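As a rough illustration of "calibrating noise to sensitivity," the sketch below answers a counting query with Laplace noise, the textbook differential privacy mechanism. It is a minimal example under stated assumptions, not how the Census Bureau, Google, or Apple implement it: the function name `private_count`, the sample ages, and the epsilon value are all hypothetical.

```python
import random


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of records matching `predicate`.

    Adding or removing one person changes a count by at most 1, so the
    sensitivity is 1 and the Laplace noise scale is 1 / epsilon.
    Smaller epsilon means more noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    scale = 1.0 / epsilon
    # The difference of two independent exponential draws with mean
    # `scale` follows a Laplace(0, scale) distribution.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


# Hypothetical data: ages of ten survey respondents.
ages = [23, 35, 41, 29, 52, 67, 38, 45, 31, 58]

noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5)
print(f"noisy count of respondents aged 40+: {noisy:.2f}")
```

Any single answer is off by a random amount, which is what hides an individual's presence in the data, while the average over many queries or a large population stays close to the truth. That is also why the technique works best on large datasets.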
For 2020’s census, the Census Bureau will incorporate differential privacy as it collects population data. Google used it this year too for its mobility reports, which reported on population movement patterns during the pandemic (also see the company’s differential privacy repo). Apple uses the technique to analyze user data.
The practice can enable privacy, but it also does something far more effective: it incentivizes privacy. “If the data is cloaked so that no one can pick out an individual, it can be shared — and therefore analyzed and monetized — around the globe, even if it’s “going” to a place with stringent privacy regulations,” says Built In.
Still, Brookings notes, differential privacy has its drawbacks (besides decreased accuracy due to noise injection). It requires resources and a large dataset, and there is concern that organizations might be exaggerating how much privacy they’re providing. Cynthia Dwork and researchers at UC Berkeley have proposed an Epsilon Registry in response: “a publicly available communal body of knowledge about differential privacy implementations that can be used by various stakeholders to drive the identification and adoption of judicious differentially private implementations.” Read more about the idea here.

RESPONSIBLE AI

Wait, you might be thinking. Isn’t explainable AI, in a sense, responsible? And how can AI be responsible? Hasn’t the notion that an algorithm can be both the scapegoat and the solution to machine-caused bias been thoroughly quashed by researchers and investigative reporters? OK, so let’s back up. What’s the difference between explainable AI and responsible AI? Futurist Anand Tamboli wrote this nice explainer in a Medium post:
“Think of an air crash investigation; it is a classic example with which we can compare explainable AI. In the air crash investigation, when something goes wrong, say there was an accident. You first find the Black Box, open it, analyze it, and go through the whole sequence of operations. Then understand what happened, why it happened, and how you can prevent it next time. But that is the post-facto operation, a postmortem. You are not avoiding the incident in the first place.
As a responsible approach, you train your pilots, your crew to avoid these kinds of mishaps. You build your operations in such a way that it prevents these accidents from happening. When it is explainable AI, it is post-facto. It is necessary as an after-the-fact. But when it comes to responsibility AI, it is essential to prevent mishaps from happening.”
So we’ll start with the preventative measure — responsible AI. BCG noted that while companies can think “beyond barebones algorithmic fairness and bias in order to identify potential second- and third-order effects,” and even create legit principles, that doesn’t translate to tangible action. To cross what they call the “Responsible AI Gap” they suggest six steps companies can follow. These give an idea of the concentrated effort needed by organizations to walk the talk (also see Google’s Responsible AI Practices).
- Empower Responsible AI leadership: “An internal champion such as a chief AI ethics officer should…. [convene] stakeholders, [identify] champions across the organization, and [establish] principles and policies that guide the creation of AI systems.”
- Develop principles, policies, and training: “Although principles are not enough to achieve Responsible AI, they are critically important, since they serve as the basis for the broader program that follows.”
- Establish human + AI governance: “Beyond executive leadership and a broadly understood ethical framework, roles, responsibilities, and procedures are also necessary to ensure that organizations embed Responsible AI into the products and services they develop.”
- Conduct Responsible AI reviews: “For Responsible AI to have an impact, the approach must be integrated into the full value chain.”
- Integrate tools and methods: “For Responsible AI principles and policies to have an impact, AI system developers must be armed with tools and methods that support them.”
- Build and test a response plan: “Preparation is critical to making Responsible AI operational. While every effort should be taken to avoid a lapse, companies also need to adopt the mindset that mistakes will happen.”
EXPLAINABLE AI

Explainable AI (remember, post-facto), or XAI, aims to answer why a model made a particular decision, something that gets numbingly complex as you venture into deep learning territory. The practice is placed nearly at the peak of Gartner’s emerging tech hype cycle. It’s going to be a long, steep way down to the trough of disillusionment.
It’s not that practical techniques for model interpretation don’t exist; they do. It’s just that practitioners looking for predictive performance might apply these techniques incorrectly and come to the wrong conclusions. Christoph Molnar, who literally wrote the book on explaining black box models, posted a tweet thread of “poorly drawn” comics to illustrate these pitfalls.
These issues are also explained in-depth in this excellent ZDNet piece. Since we can write entire newsletters on each of these pitfalls, we’ll stick to one in particular: Unnecessary Use of Complex Models.
“Using opaque, complex ML models when an interpretable model would have been sufficient (i.e., having similar performance) is considered a common mistake. Starting with simple, interpretable models and gradually increasing complexity in a controlled, step-wise manner, where predictive performance is carefully measured and compared is recommended.
Measures of model complexity allow us to quantify the trade-off between complexity and performance and to automatically optimize for multiple objectives beyond performance. Some steps toward quantifying model complexity have been made. However, further research is required as there is no single perfect definition of interpretability but rather multiple, depending on the context.”
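The "start simple, add complexity step by step" advice quoted above can be sketched as a small selection rule in Python. This is an illustration only: the function name `simplest_adequate`, the complexity ranks, the tolerance, and the scores are all made-up assumptions, not measurements from any real model or a method from the quoted piece.

```python
def simplest_adequate(candidates, tolerance=0.01):
    """Pick the least complex model whose score is within `tolerance` of the best.

    `candidates` is a list of (name, complexity_rank, score) tuples, where a
    lower complexity_rank means a more interpretable model.
    """
    best_score = max(score for _, _, score in candidates)
    adequate = [c for c in candidates if best_score - c[2] <= tolerance]
    return min(adequate, key=lambda c: c[1])


# Hypothetical validation scores, ordered from simple to complex.
models = [
    ("decision tree", 1, 0.860),
    ("random forest", 2, 0.865),
    ("neural network", 3, 0.868),
]

choice = simplest_adequate(models, tolerance=0.01)
print(choice[0])
```

With these made-up numbers the decision tree is chosen, because the more opaque models buy less than one point of accuracy. Tightening the tolerance shifts the choice toward the complex model, which makes the complexity versus performance trade-off an explicit decision instead of a default.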
Simpler models — like decision trees and Bayesian classifiers — are inherently more traceable, as Forbes points out. When you start getting to more complicated algorithms like random forests and neural networks, you sacrifice “transparency and explainability for power, performance, and accuracy.” But researchers are still trying, even with deep learning — in fact, it’s a hot field in AI research.
And researchers do stress the nascency of the field, which might justify Gartner’s prediction that XAI is on a roller coaster dive to disillusionment as organizations realize it’ll be years of playing catch-up. If you want an in-depth, but easy-to-read explainer, definitely check out The Royal Society’s Explainable AI: the basics, and give Molnar’s tweet thread and the ZDNet piece above a good look too. Till #3!

This Week in Business History

As a current New Yorker, and with many New Yorkers in our audience, it’s tough to look at “this day in history” and not think about one thing. Pulling out a light-hearted business history fact feels a bit ill-suited on September 11th.
Instead, we will close this week’s newsletter with a piece that captures the gravity of 9/11/01: The Falling Man by Tom Junod.
NYC Media Lab · 370 Jay Street, 3rd floor · Brooklyn, New York 11201 · USA