DuckDuckGo's Gabriel Weinberg: How Internet giants grab content, trample privacy

Gabriel Weinberg, founder of DuckDuckGo. The no-tracking search site, founded in 2009, is based in Paoli.

DuckDuckGo founder Gabriel Weinberg is among the digital prophets warning us how Google, the Goliath he’s been challenging for 10 years with his Paoli-based non-tracking search engine, and other giants of information are squeezing personal privacy and independent content: “If people are consuming all the Inquirer’s stuff for free on Facebook or Google, it’s a losing proposition for you.”

Check out what’s happened to, he said, which “just announced layoffs for half the staff. Facebook has taken their content and, with it, their audience, forcing them to pay to place their own new videos. And they keep increasing the price; it’s prohibitively expensive. Your content goes viral, and you don’t make any money.”

But isn’t complaining that the internet is a device to enrich tech moguls at public expense like objecting to progress? Aren’t the internet algorithms that tell us what YouTube videos to watch, which old classmates to friend on Facebook, and which celebrities wear cooler clothes on Instagram, based on math, demographics and logic?

It’s worse than “garbage in, garbage out,” Weinberg says. He tells how some of his heroes — investigative reporters, working for old- and new-style news organizations — have found “significant bias in the algorithms.”

See Kashmir Hill’s work for Gizmodo: her takedown of the “Internet of Things,” showing how companies collect and interpret your data from devices and services you pay for, in “The House That Spied on Me”; and her reporting on how Google uses its power to quash ideas it doesn’t like.

Government isn’t ready to shield us, Weinberg says: Congress allowed the old prohibition against your internet service provider selling your browser history to expire last year. So Verizon, Comcast and other ISPs “can now collect and sell your data, on where you visit. A lot of it turns out to have been unencrypted. Hulu shows have been unencrypted.” Your preferences are easier to track, list and sell.

So it’s extra worrisome how even government agencies are putting personal data to work, no matter the accuracy and privacy concerns.

Here’s how it works: IT contractors/consultants come in to government agencies “and say, ‘Your old system is not good. We’ll make a better one for you,’ ” Weinberg says. “The algorithms are often not complicated. But they can still be inherently biased. And every time there are unintended consequences.”

Aren’t government applications of personal data public? No: “Because they get the algorithm from a company, it’s ‘proprietary.’ But there is an argument they should be made available to the public, under transparency laws. So news organizations and pro bono lawyers are suing to get the algorithms,” Weinberg told me.

Or the agency will justify a costly new personal-data processing application, saying, “ ‘This is Artificial Intelligence.’ ” As if that alone is a reason to use the new tool. But it’s a tool someone put together using limited information, probably in secret, with unintended but significant consequences, putting some people at a disadvantage, and biasing results. This should be public. “When you hear about it, you should be saying to yourself, ‘Maybe I can get this uncovered!’ ”

Weinberg admires how nonprofit ProPublica used freedom-of-information (aka right-to-know) law requests to show how New York state judges “bought an algorithm to assist in sentencing suggestions” that had the effect of concentrating perceived bias, rather than making sentencing dispassionate. Weinberg says public financing programs using similar algorithms can similarly entrench neighborhoods’ economic character. (For more on how bad data science makes bad public policy, Weinberg recommends Cathy O’Neil’s book Weapons of Math Destruction.)

I asked about NeoGov, the digital hiring system that Pennsylvania and 25 other states are supposed to be using to replace the old system of state job exams. “It sounds very much worth checking for unintended bias,” Weinberg says.

Or consider the political impact of YouTube: “Its goal is to get users to keep watching ‘related’ videos. These tend to be extreme, and outrageous and to push viewers toward extreme-outrageous positions. Even if that is not intended.” Wired has noted as much. See also the recent Guardian article featuring ex-Google engineer Guillaume Chaslot’s observations on “how YouTube’s algorithm distorts truth.”

How do we unwind social media to find what the old futurist Vance Packard called “hidden persuaders” and Columbia Law prof Tim Wu calls the Attention Merchants? “Julia Angwin pioneered this area of journalism,” starting at the Wall Street Journal and moving on to ProPublica, Weinberg says. See Angwin’s investigation tracking Google’s information collection under its Chrome browser’s supposed Incognito mode.

See also Angwin’s work for the Wall Street Journal, aided by Weinberg, who has given grants for reporters to hire programming assistance. “We found a lot of ‘tailoring’ happening in Incognito searches, around topics including gun control, abortion, climate change. There were ‘magic keywords’ that would add additional news articles Google said you ‘previously searched for.’

“It turned out ‘Obama’ was a magic keyword and ‘Romney’ wasn’t. There were millions of extra Obama ideas and items. It was the reverse of what happened in the Trump/Clinton election,” when Trump videos got a lot more attention than Clinton videos because they were bolder and drew in more viewers, ensuring more Trump videos got served to viewers. “Unintentional? But effective.”

Angwin also showed how “Amazon was promoting their own products over others even though the price is not lower.” See also her ProPublica series on “machine bias.” Angwin’s reporting, Weinberg says, “has kicked off more of this in other organizations.” Similar methods can be, and have been, used to show, for example, what looks like racial bias in insurance policies quoted online. “There is also this idea of charging people different prices based on their data or location.” What bases do auto insurers, travel sites, banks and retailers use for quoting some readers higher prices, some lower? Is it something as crude as a zip code, as tenuous as a shopping-site visitation history?

It’s the Filter Bubble, as leftish Web-watcher Eli Pariser called the phenomenon in his 2011 book. “It’s not balanced,” says Weinberg. “You can apply this to Netflix, Twitter, YouTube, Facebook, in terms of the individualized filter bubble and what they are generally promoting. Examine search results. Compare users in different locations and circumstances. Ask: Do the search results fall, not fairly, across political and racial lines?”
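For readers who want to try the comparison Weinberg describes, here is a minimal sketch in Python of one way to measure how much two sets of search results overlap. It is an illustration only, not a method from Weinberg or any of the studies mentioned: the profile names and result lists are made up, and the Jaccard overlap score is just one common, simple choice.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of results two lists have in common (1.0 = identical sets)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0  # two empty result lists are treated as identical
    return len(sa & sb) / len(sa | sb)


def pairwise_overlap(results_by_profile):
    """Compare every pair of profiles and return their overlap scores."""
    return {
        (p, q): jaccard(results_by_profile[p], results_by_profile[q])
        for p, q in combinations(sorted(results_by_profile), 2)
    }


# Hypothetical data: the same query run by two made-up user profiles.
results = {
    "profile_a": ["site1.example", "site2.example", "site3.example"],
    "profile_b": ["site1.example", "site2.example", "site4.example"],
}
overlap = pairwise_overlap(results)
```

A low score between two profiles that differ only in location or history would be a reason to look closer, not proof of bias on its own.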

It takes work, but it’s not complex, Weinberg adds. “These are repeatable studies. The tech side of this is not too complicated. When you read one, check the published methodologies. They make an ‘automated crawl.’ An undergraduate intern could run the numbers, any programmer. Maybe they will pretend they are consumers filling out an online application form, and get the quotes.”

I noted how professional news organizations typically prevent reporters from lying about their identities. But what’s wrong with testing different potential residences or incomes? he asked.

“Someone can always get the data. The more difficult part is analyzing the data and making sure you are not making a spurious conclusion. Once you know what you are looking for, it’s not that difficult. There are a lot of tools available. It’s interesting, it’s important, to repeat studies; it’s like science, it can apply in a lot of areas.” What works in analyzing Amazon will work with your local retailer, too. And with big financial companies — Allstate and Nationwide, Fidelity and BlackRock, Fanatics vs. Dick’s.
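One way to guard against the spurious conclusions Weinberg warns about is a permutation test, which asks how often a price gap as large as the one observed would show up if the group labels were shuffled at random. The Python sketch below is a hedged illustration, not the approach of any study named here; the function name, the default of 10,000 shuffles and the sample quotes are all assumptions.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_iter=10_000, seed=0):
    """Estimate how likely a gap in average quotes this large is by chance."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)  # randomly reassign quotes to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_iter + 1)


# Hypothetical quotes (in dollars) collected for two made-up zip codes.
zip_one = [100, 101, 102, 103, 104, 105]
zip_two = [200, 201, 202, 203, 204, 205]
p_value = permutation_p_value(zip_one, zip_two, n_iter=2000)
```

A small value suggests the gap is unlikely to be random noise. It does not say why the gap exists, which is where the reporting comes in.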

Weinberg is looking to sponsor more professional reporting, including Philly-centric work. “I’m interested in anyone calling these things to task. I’m interested in this area, I live here. I’m interested in any reporting around deep analysis of what tech companies are doing. Related to what we are doing [at DuckDuckGo] or not. I think it’s pernicious for everyone.”