TERRY GROSS, host:
This is FRESH AIR. I'm Terry Gross.
So if you're wondering what new ways surfing the Internet might compromise your privacy, you've come to the right place. Our guest, Julia Angwin, recently led a team of reporters from the Wall Street Journal, who discovered that nearly all of the most commonly visited websites are using sophisticated software to track our movements through the Web so they can sell the information they get about us. In many cases, the sites actually install tracking software on our home computers without our knowledge.
One of the fastest-growing businesses on the Internet, Angwin writes, is the business of spying on Internet users. The Journal reporters found that even their own newspaper's website is in on the consumer surveillance game.
Julia Angwin is senior technology editor of the Wall Street Journal and author of the book "Stealing MySpace: The Battle to Control the Most Popular Website in America." She spoke with FRESH AIR contributor Dave Davies.
DAVE DAVIES: Well, Julia Angwin, welcome to FRESH AIR. I thought we'd begin with the 26-year-old woman from Tennessee who appears in your story and who has discovered, I assume through your reporting, that her computer has software that's gathered a lot of information about her. What kind of information was being harvested without her knowing?
Ms. JULIA ANGWIN (Senior Technology Editor, Wall Street Journal; Author, "Stealing MySpace"): So what we found that the company tracking Ashley Hayes-Beaty knew was all of her favorite movies, her age, her hometown and that she liked quizzes and entertainment news.
DAVIES: And this information was being gathered how?
Ms. ANGWIN: So there were two parts to the information gathering. The first part was there - she was given an ID number which was stored on her computer in something called a cookie. And a cookie is something that is just a text file on your computer, really just gives you an ID. And when you visit a website, oftentimes these cookies are installed without you knowing it. So she had an ID number in her cookie.
Separately, when she went to some websites, they had a different kind of technology called a beacon, which is another invisible kind of tracker that runs some software while you're on a page and tries to assess what you're doing on that page.
So in her case, this beacon was actually seeing her activity around movies, in particular. So she had listed her favorite movies on a website, and it saw that she was typing those in and captured that data and put it in her profile, which is stored at some mother ship, where there's a little drawer that has her ID number. And inside the drawer, there's a file, and the file says these are her favorite movies. And every time they find out new information about her, they add more to the file.
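(Editor's illustration: the "drawer" Angwin describes can be pictured as a lookup table keyed by the ID number stored in the cookie. The short Python sketch below is an assumption-laden toy, not Lotame's actual system; the names `profiles`, `get_or_assign_id` and `record_observation` are hypothetical, and the movie titles are placeholders.)

```python
import uuid

# Hypothetical stand-in for the tracking company's "mother ship":
# one drawer (dictionary entry) per visitor ID.
profiles = {}

def get_or_assign_id(cookies):
    """Return the ID stored in the browser's cookie jar, creating one if absent."""
    if "visitor_id" not in cookies:
        cookies["visitor_id"] = uuid.uuid4().hex  # random ID, no name attached
    return cookies["visitor_id"]

def record_observation(cookies, category, value):
    """Add one observed fact to the file kept under the visitor's ID."""
    visitor_id = get_or_assign_id(cookies)
    drawer = profiles.setdefault(visitor_id, {})
    drawer.setdefault(category, []).append(value)
    return visitor_id

# A browser with no cookies visits two pages where a beacon reports typing.
browser_cookies = {}
record_observation(browser_cookies, "favorite_movies", "Movie A")
record_observation(browser_cookies, "favorite_movies", "Movie B")
print(profiles[browser_cookies["visitor_id"]])
```

The point of the sketch is that the file grows under the same ID on every visit, even though no name is ever stored.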
DAVIES: Okay, but the information that was doing all this was not something that she asked for or that came installed on her computer when she bought it, right?
Ms. ANGWIN: No. So what's happening is that there are tons of companies like the one who was following Ashley that are out there in the business of gathering information about people while they browse the Internet.
So this company, named Lotame, had a relationship with some of the websites that she was browsing on, and those websites basically allowed Lotame to install this monitoring software on her computer.
DAVIES: I guess what's surprising to a lot of people is that when you go onto the Internet, this little piece of information-gathering software comes the other way.
We think of us as connecting to the Internet, but when we connect to some of these sites, this little information-gathering software slithers back into our computer, embeds itself there and then begins spying on us. Is that too strong a word?
Ms. ANGWIN: Yeah. I mean, I think spying is not necessarily too strong a word, although I think surveillance is probably slightly more accurate. The thing is that the advertising industry online has evolved to really rely on a lot of these surveillance or monitoring techniques.
And most people that we have heard from since writing these stories did not know what was going on. So when they - when you go to a website, you're not thinking about the fact that they might have relationships with all these different types of monitoring firms, and that those firms are installing things that are invisible to you on your computer.
Now, there are some very tech-savvy people who know - who are diligent about removing cookies and trying to block all this type of monitoring software, but the vast majority of people that we've heard from didn't know about it.
DAVIES: Right. So the average computer user, the person that does some shopping online, that enjoys surfing the Web, how many different little pieces of information-gathering software might they have on their computer, without their knowledge?
Ms. ANGWIN: Well, definitely hundreds, right? Because we did a survey of the top 50 U.S. websites to see how many they installed on our test computer, and collectively, they installed more than 3,000 pieces of tracking technology.
So that would imply that by just browsing the top 50 websites, you would get 3,000 things on your computer.
DAVIES: That's a lot.
Ms. ANGWIN: Yes.
(Soundbite of laughter)
Ms. ANGWIN: It's a lot.
DAVIES: Maybe you should explain just a little bit about how you and your team at the Journal did this kind of research. I assume you did not go to the public relations departments of these websites and ask them.
Ms. ANGWIN: No. I mean, we did talk to the public relations departments of these websites after we did our survey. But what we did was we hired a researcher who specialized in doing this type of data collection.
He had a clean computer, and he cleared it of all tracking devices that had already been installed, made sure that, you know, he wasn't browsing and collecting things from previous visits.
And then he went to visit each of these sites and saw what kind of technology was installed on his computer after that visit. And he tried to visit 20 pages on each site.
And during this, we - there are many different types of tracking technology out there. We focused on collecting data on three of the most prevalent types - so cookies, which are text files that are stored on your computer, and most of them give you sort of a unique ID number.
We looked at Flash cookies, which are also similar to cookies, but they're stored by your Flash video player. So when you watch YouTube videos or other Flash animations online, that program could be installing a Flash cookie.
And we also looked at beacons, which are basically invisible bits of software code that are installed or that run live while you're on a Web page.
DAVIES: One of the most surprising things that I read in one of your stories was that a number of the websites that you looked at that were clearly installing information-gathering software on people's computers didn't even know they were doing it. How does that happen?
Ms. ANGWIN: Yeah, it's amazing. So we were surprised about that. We thought that it would be the case that most websites would know that they had a relationship with these third-party companies that are installing tracking devices.
But, in fact, we found that tracking devices, the way they're distributed is that they can often be distributed by a third party. So, for instance, the tracking company might place one of its little monitoring devices within an ad, and then when that ad appears on the website page, the device is installed. So the website doesn't actually know what's being installed on its users' computers.
We also found that tracking devices sometimes contained other tracking devices in them. So there is what's known as piggy-backing. And so they might know about the relationship with one, but they don't know about the one that was contained inside of it.
DAVIES: Wow. Let's put some names with these. You looked at the nation's top 50 websites, found that they were installing, collectively, around 3,000 pieces of information-tracking software in people's computers. What are some of these websites?
Ms. ANGWIN: Well, so we looked at all the big websites. I mean, the top 50 list includes, you know, Google and YouTube and Facebook, all the sites that you normally think of as the top websites.
What we found was the sites that were installing the most, the one site that installed the most was dictionary.com, actually. They - a visit to dictionary.com resulted in 234 trackers being installed on our test computer. And only 11 of those were installed by dictionary.com.
So this might be a good moment to just mention that some tracking devices are completely innocuous. A cookie, or some kind of tracker is what remembers your password. And so if you ask a website to remember your login, that would be stored on a cookie.
So there are tracking devices that are useful to you as a Web browser. And those tend to be the ones installed by the website that you actually have a relationship with, not the ones that you've never heard of before that are sort of secretly lurking behind the scenes.
And so on dictionary.com, most of the trackers, the vast majority, more than 200 of the 234 were installed by companies that the person visiting the site probably had never heard of.
DAVIES: Hmm. So, if those of us who are out there on the Web, using it as normal users do, have dozens, maybe hundreds of these pieces of information-tracking software that are gathering information about us without our knowledge, where does it go, and what do the people who are getting it do with it?
Ms. ANGWIN: Basically, there's an ecosystem of hundreds of online advertising companies who are in the business of tracking Web surfers. And for many years, most of them had their own network of tracking, and then they would take that data and try to sell it directly to an advertiser.
What's happened in the past year and a half is that now there are data markets where these tracking companies now try to sell your data on an exchange, which is really like a real-time, Wall Street sort of automated trading floor, where the data about your behavior online is being sold at auction.
DAVIES: Wow. Now, this I'm trying to picture, and you did write in the piece that one of these companies that trades this stuff sells 50 million pieces of information a day for as little as a 10th of a penny apiece. Explain this to me. What kind of information do they get about me, and who's going to want to buy it that day for a fraction of a penny?
Ms. ANGWIN: So if you do something that has high value online, for instance searching for a car - basically, searching for something to purchase, that makes you very interesting to an advertiser.
And so the information would be something like ID number so-and-so is looking for a car. And the company that captures the information may post it for sale immediately, almost instantaneously, on one of these trading floors.
So the one that we wrote about was called BlueKai, and for instance, they have a relationship with eBay. So at the moment - I've done this many times. I went and searched for something on eBay. Then I looked in the BlueKai registry - which they are kind enough to show you what data they're selling about you. You can see almost immediately that it pops up and says you're in the market for whatever it is that you've just searched for on eBay.
Then there are a bunch of advertisers lined up at their door who have already placed bids to automatically buy up anybody who's in the market for that item. And so instantaneously, they can show you an ad wherever you next land on the Web for whatever product it was you were just looking for.
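(Editor's illustration: the standing bids Angwin describes can be sketched as a simple matching step. The Python below is a toy under stated assumptions, not BlueKai's actual exchange: the advertiser names, segment labels and prices are invented, and "highest standing bid wins" is an assumed rule the interview does not specify.)

```python
# Hypothetical bids placed in advance by advertisers for a given interest.
standing_bids = [
    {"advertiser": "advertiser_a", "segment": "in_market:cars", "bid": 0.001},
    {"advertiser": "advertiser_b", "segment": "in_market:cars", "bid": 0.003},
    {"advertiser": "advertiser_c", "segment": "in_market:shoes", "bid": 0.002},
]

def sell_segment(visitor_id, segment, bids):
    """Match a freshly observed interest to the highest pre-placed bid, if any."""
    matching = [b for b in bids if b["segment"] == segment]
    if not matching:
        return None  # nobody has bid on this interest
    winner = max(matching, key=lambda b: b["bid"])
    return {
        "visitor_id": visitor_id,
        "segment": segment,
        "advertiser": winner["advertiser"],
        "price": winner["bid"],
    }

# ID number so-and-so has just searched for a car.
print(sell_segment("id-12345", "in_market:cars", standing_bids))
```

Because the bids already exist when the search happens, the sale needs no human in the loop, which is why the ad can appear on the very next page.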
DAVIES: Wow. So if I look for some new dining room furniture, some third-party tracker gets that information, puts this up on an exchange and says: Anybody want to find people looking for dining room furniture? Somebody who's selling dining room furniture says, yeah, I'll buy 200 of them for X price, they get that information, and then can find a way to get to me.
Ms. ANGWIN: Yeah. And you may have had this experience. I mean, I certainly have. There was a pair of shoes that I was looking at online, and then those particular shoes actually followed me around on every website I went to for a month.
They had clearly bought me. And maybe this is because I'm tuned into it because I'm writing about it, but you may have noticed that various items are following you around online.
DAVIES: Now, explain this. What do you mean these shoes are following you around?
(Soundbite of laughter)
Ms. ANGWIN: They really were. They really were following me around. I looked at them, and then I put them in my basket, because for me - which is rather sad, but for me, this is what constitutes fun, is putting things in my basket that I'm not going to buy.
(Soundbite of laughter)
Ms. ANGWIN: It's just like - it's like window shopping, right? So I put them in my basket, thinking if I was really rich, I would buy these shoes. But I don't feel rich right now, so I'm not going to.
But they were able to follow me around, and a good number of the websites that I went to in the next month had those exact shoes in an ad looking at me. And this particular type of tracking and monitoring is called re-targeting, and it's considered one of the most productive types of targeting because I actually ended up buying the shoes because they kept following me for so long that I finally caved in.
DAVIES: Oh, that is creepy.
(Soundbite of laughter)
Ms. ANGWIN: Yeah, and you know what? It's not that creepy with shoes, to be honest with you. I wasn't - I didn't mind so much. But I think what's creepy about it is that they target on things like, are you bipolar?
So imagine that you've been searching for information about bipolar disease, and then every ad is targeting you as bipolar. That seems creepy. That's where you get into the question about health data and financial data, or some of these things maybe should be protected categories.
DAVIES: We're speaking with Julia Angwin of the Wall Street Journal. She led a team that looked into companies that gather information about us on the Internet.
We'll talk more after a break. This is FRESH AIR.
(Soundbite of music)
DAVIES: If you're just joining us, we're speaking with Julia Angwin. She is a senior technology editor at the Wall Street Journal. She led a team that looked into companies that gather information about us on the Internet. She's also the author of a book called "Stealing MySpace: The Battle to Control the Most Popular Website in America."
I wanted to ask you a little bit about these, I guess for a lack of a better word, middlemen - I mean, these people who promote these tracking devices and then gather this data and then slice it and dice it and price it and put it up on these exchanges and sell it in huge batches to whoever is offering to meet the price. Who are these people? I mean, are these 26-year-olds in sneakers? Are they big companies?
Ms. ANGWIN: They're mostly 26-year-olds in sneakers, and they might have a Ph.D. in math or statistics. So, I mean, these are quants, right? They used to - you know, some of them used to work on Wall Street. Some of them might have been responsible for blowing up the financial markets.
But basically, the ad business, it's no longer "Mad Men." You don't have men sitting there drinking martinis, at least not with this online business. This is really a math exercise. People are building algorithms that can automatically slice and dice this data and build these profiles without any humans sort of really looking at them, and it's a very - it's like a real-time, you know, everything happens in milliseconds and it's all about, you know, how big is the computer you can get to do these calculations.
DAVIES: Now let's clarify a couple of things here. When this information is gathered about you, are they getting your name, or just some long serial number that - associated with your computer?
Ms. ANGWIN: No. They are generally not getting your name. In fact, that's - the industry says we never get your name. Occasionally, you might register at a website, and that website has your name, but in general, these third-party monitoring, tracking devices don't know your name, even if the website you're visiting does. So that's one thing to point out, is they don't know your name. They know your behavior.
One thing, though, that the industry is doing more and more often is they're using your behavior to make inferences about who they think you are. So they may buy data from some offline brokers estimating your income, your age, your hometown, et cetera. And so they can make some pretty educated guesses about who you are, but they are not attempting to find out your name.
DAVIES: Okay, so they build a - kind of a demographic profile of you, in effect.
Ms. ANGWIN: Yeah, it's exactly right. They put you in some sort of bucket. And some of these buckets have funny names. We talked about them in one of the stories that, you know, they categorize Americans into all these different segments, like white picket fences or bohemian mix or, you know, bohemian urban-dwellers. You know, so you might be in some bucket.
DAVIES: But I guess, you know, to the layman who hears this, you wonder, well, okay. But if they're tracking all this stuff, if they're getting into my computer without me knowing it, and if they can follow my keystrokes, what prevents them from getting the password to my online bank account or looking at the Quicken files with all my family financial information?
Ms. ANGWIN: Yeah, look. I think that the keystroke monitoring that we wrote about is pretty avant-garde. That was, you know, the websites involved do know that they're allowing that particular kind of monitoring, and they only allow it on certain pages where fairly innocuous things are happening, like talking about movies that you like.
We haven't seen any evidence of somebody deliberately allowing keystroke monitoring on a page where you would enter any sort of sensitive information, and I think you would be entering into the realm of fraud at that point, not what this is, which is a legitimate industry based around tracking your movements in an anonymous way.
Now, you might want to argue with whether this is legitimate, what they're doing, but I don't think anyone in it is actively engaged in fraud that I know of.
But it does raise the question, because the technology is out that it could get into, you know, bad hands.
DAVIES: I want to clarify one thing: Beacons are a particular kind of software that gets installed on people's computers to gather information. Does that track keystrokes?
Ms. ANGWIN: So, beacons are basically a kind of software that runs in the background while you're on a page, live. It doesn't install onto your computer. And they can do a variety of things, including log your keystrokes, if the website that has installed that beacon allows that to happen.
It's fairly rare that anyone is allowing a keystroke capture. The instance that we talked about, though, with the movies and the woman whose favorite movies were known by a tracking company, that company had a relationship with a website that allowed them to capture her typing in her favorite movies, but that's fairly unusual and probably the cutting edge of use of beacons.
DAVIES: So what distinguishes a beacon from cookies is that they're not on your computer, they're on the website, and they're following what you're doing?
Ms. ANGWIN: Yeah. The beacon is running on the page, in the background, but a cookie is something that is stored on your hard drive and doesn't take - doesn't sort of actively monitor. It only knows what page you're on. But it's not attempting to see what you're doing on that page.
GROSS: Julia Angwin will continue her discussion with FRESH AIR contributor Dave Davies about how we're tracked on the Internet in the second half of the show. Angwin is senior technology editor for the Wall Street Journal.
I'm Terry Gross, and this is FRESH AIR.
(Soundbite of music)
GROSS: This is FRESH AIR. I'm Terry Gross. We're talking about one of the fastest growing businesses on the Internet: spying on consumers. Our guest, Julia Angwin, led a team of reporters from The Wall Street Journal that investigated how most of the commonly visited websites use sophisticated software to track our movements through the Web so they can sell the information they get about us.
Angwin is senior technology editor at The Wall Street Journal. She spoke with FRESH AIR contributor Dave Davies about the use of tracking software and the new issues it raises.
DAVIES: This has to raise some privacy concerns. Are there court cases on this, on what's permissible and what isn't?
Ms. ANGWIN: Basically, the argument was it was sort of like the website was a caller and you were - and they were talking to you on the phone and they were sort of secretly allowing a friend to listen in on another line. And so, there is a legal grounding for this type of monitoring.
What's happening now is that this kind of monitoring has exploded, right. There's more trackers than anyone ever envisioned back then a decade ago, and so a lot of people are rethinking some of these laws. There's a couple bills pending in Congress and the Federal Trade Commission is rethinking its guidelines on privacy and planning to issue new guidelines by the end of the year.
DAVIES: What are some of the issues that you think should concern us the most?
Ms. ANGWIN: One issue is the question of anonymity. I think it's totally fair to say that these tracking companies don't know your name, but my feeling is, if they know everything else about you...
(Soundbite of laughter)
Ms. ANGWIN: ...does it matter that they don't know your name? Right, because it feels intrusive to have somebody know so much about you, particularly when we do so much online. I mean, for - at least for me, I do everything online. So my, when I look at my record of my browsing history or I look at what Web pages I've looked at, it really seems to be like a record of my thoughts.
Every time I have a thought it seems I actually take an action online and Google it. So it does build up these incredibly rich dossiers. So I think one question is: Is knowing your name the right definition of anonymity? Right now that is considered anonymous. If they don't know your name, they're sort of not covered by laws that regulate personally identifiable information. And that's what the FTC is considering, is whether the definition of personal information should be expanded beyond just name and Social Security address number and a few other things.
Another thing that it raises is sensitive information. So if you're looking at gay websites and then, you're sort of labeled as gay in some database somewhere and that follows you around and you're sold on some exchange as gay and you just may not want that to happen. So I feel that there are some categories of information being collected that we as a society might not want to be collected: our political affiliation, our diseases, our income levels and, you know, many other things. You know, what if you're feeling suicidal and you Google, you spend a bunch of time looking at suicide pages - do you want that sort of in some database somewhere?
DAVIES: In addition to which, they can get it wrong, right? I mean, you might be researching something that you're not necessarily interested in and they could be misled.
Ms. ANGWIN: Exactly. And then you're going to see that all the time and people are going to make judgments about you based on that. And, you know, and we have laws about your credit score. Your credit score affects a lot of things in your financial life. And because it's so important and affects so many things, there are laws that say you get to see the components of your credit score and have the ability to correct them. Now it's not so easy to do that, as anyone who has tried knows, but at least you have the legal right to that.
And I think that we might reach a time where this kind of tracking information, your electronic dossier, might determine so much about your life that you would want the ability to see it and correct it. And there are websites that are beginning to offer that, so the industry has acknowledged that but it doesn't go far enough, in my opinion.
DAVIES: And how would they correct it? I mean, you can look at your own profile and amend it?
Ms. ANGWIN: Yeah. So some websites, some of the bigger companies in this tracking field: Google, Yahoo, Microsoft, this trading exchange BlueKai that I mentioned, Lotame, the company that monitored that woman's movie preferences, they have places you can go on their Web pages, assuming you've ever heard of them, and know to do this, and they will tell you what they think - what they know about you. So you might go in and it says, oh, we think you're, you know, 15 years old and you love, you know, you love gambling and you love sports and if we're wrong you can fix this, or you can try to remove all the information. Most of them will allow you to remove all the information from the profiles.
DAVIES: You know, a lot of computers are shared. And then, of course, if a profile is built of you, around you based on the computer you use, you're not going to keep that computer forever. Isn't this information, I mean, both transitory and in some ways misleading because you could have four or five people using a computer at home?
Ms. ANGWIN: Yes, it's true. The information is both transitory and misleading. So what most of the data exchanges or most of the people buying this information pay a lot for recency(ph) because they feel that any actions you took a month ago might not really reflect who you are. But a lot of this market is actually really instantaneous, so that you could take an action on one Web page and then see something different on the next Web page. And so, that's how they attempt to get around the problems that you're talking about, which are the shared computers and the fact some of it might be wrong.
DAVIES: Which browsers or websites stood out as protecting privacy better than others?
Ms. ANGWIN: So it's worth calling out Wikipedia because they, on principle, don't allow any type of tracking technology on their website, so they were the only one we found with none. Facebook actually and a few other sites had very few. I think Facebook is a good example of a company that has plenty of information about you and really no interest in sharing that information. So there's no reason for them to allow someone else in to spy on you since, you know, their big value is that they know so much about everybody. So some of the websites were better, but on average, most websites install 64 different types of tracking technology, so the average is still pretty high.
In terms of Web browsers, none of them are great at blocking this type of monitoring. We had a big story about Microsoft and the battle at Microsoft internally about whether they could improve the privacy of their Web browser software. And ultimately, they decided not to improve it, in large part because of the advertising side of their business. And that's - the sad truth is that the two - two of the biggest browser makers, Microsoft and Google, are very heavily in the advertising business, which is essentially a business of tracking. So they have very little incentive to improve the ability of the Web browser to block this type of software. So the best way to get around this kind of tracking is you have to install some additional software into your Web browser that would block it.
DAVIES: Yeah, let's just pause over Microsoft for a moment here, because Microsoft Explorer I guess is by far the most widely used Web browser. And there was this debate about whether they would change, I guess, the default settings, right, so as to exclude this kind of information gathering. Was that the debate, essentially?
Ms. ANGWIN: Yeah. I mean, the engineers at Microsoft had a very innovative idea, which was to attempt to block tracking devices from companies that didn't appear to be the one that you were transacting with. So, meaning, if it's not the website that you're actually visiting and it's some other company installing some tracking device on your Web browser, the default was going to be no, I don't want that. And unfortunately, their view was overruled by the advertising side of the company.
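(Editor's illustration: the rule the engineers proposed amounts to asking whether a tracker comes from the same site as the page being visited. The Python sketch below is a simplified assumption, not Microsoft's actual design: it treats the last two labels of the host name as "the site", which is wrong for suffixes such as co.uk, and the function names and example addresses are hypothetical.)

```python
from urllib.parse import urlsplit

def site_of(url):
    """Crude 'site' for a URL: the last two labels of its host name."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])

def is_third_party(page_url, resource_url):
    """True when the resource comes from a different site than the page."""
    return site_of(page_url) != site_of(resource_url)

def should_block(page_url, resource_url):
    """Assumed default: refuse trackers that are not from the visited site."""
    return is_third_party(page_url, resource_url)

page = "http://www.example.com/article"
print(should_block(page, "http://static.example.com/login-cookie"))  # same site
print(should_block(page, "http://tracker.example.net/beacon.gif"))   # other site
```

Under this default, the cookie that remembers a login would still work, while a tracker from a company the visitor has never heard of would be refused.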
DAVIES: Right. And just kind of help us understand that. Why would the advertising side of Microsoft want users of the Internet Explorer to have all these third-party information tracking devices installed or pieces of software on their users' computers?
Ms. ANGWIN: Well, Microsoft is not just a Web browser maker, it's also, it runs a very big online advertising firm. And that firm was among the biggest trackers that we found in our database, so they were the second largest installer of tracking software among the 50 websites we surveyed after Google. So they are in the business of tracking users and compiling profiles of users and then selling those to advertisers. So the idea that this Web browser would've blocked that would undermine the entire business model, not just of their ad business, but really of the entire online advertising industry. So it was a very threatening move that these engineers at Microsoft were proposing.
DAVIES: We're speaking with Julia Angwin of The Wall Street Journal. She led a team that looked into companies that gather information about us on the Internet. She's also the author of a book called "Stealing MySpace: The Battle to Control the Most Popular Website in America."
We'll talk more after a break. This is FRESH AIR.
(Soundbite of music)
DAVIES: If you're just joining us, we're speaking with Julia Angwin. She is a senior technology editor at The Wall Street Journal. She led a team that looked into companies that gather information about us on the Internet.
Nobody has more information about us than Google. Are they going to - are they marketing what they know about us?
Ms. ANGWIN: Yes. So Google is - yes, Google seems to know everything about me for sure. And I do, I think that that's probably true for most people. Google has been extremely cautious in the past about using what it knows about people, in part because their business is built on trust. One reason we share so much with Google is that we trust them not to abuse it.
But the developments in the online ad industry have been moving so fast, right, so tracking has become so pervasive and the markets for data have become - have sprung up out of nowhere, that Google is now actually being forced - maybe forced is too strong a word - but Google is now moving more in the direction of tracking than it ever has before.
In the early days, you know, the founders of Google were opposed even to the idea of using cookies. And now they're among the biggest trackers and they are considering all sorts of moves into what really is the leading edge of this industry.
DAVIES: So just to be clear, I mean, one of the things that we know Google does now and it's a great revenue source for them is that if we go online looking for running shoes and do a search, it'll pop up ads from people who are selling running shoes, even running shoes near where we live. But what we're talking about is them keeping - aggregating all this data they get from all of the times we've used their search engine to go to all kinds of websites, taking all that data, aggregating it and selling it to others. They're not doing that yet but they may?
Ms. ANGWIN: Well, the thing is that right now when you go to Google and do a search, you've told them what you're interested in, right? So when they serve up an ad to you based on that interest, it's not as creepy as in...
Ms. ANGWIN: ...the shoes that I was looking at started following me around from website to website. But Google, in addition to having a search advertising business, has a company called DoubleClick that they bought two years ago and that company really places ads across many websites all across the Internet. And so, Google is really looking at how do they improve that business? So not so much about their search business, but how can they use all the data they have about online behavior across all their properties to help improve their ability to serve up these ads across all the other websites that DoubleClick and their other ad properties serve?
And that's where you get into the issue of, well, could they use some of the data they have in one bucket, maybe their search data or information they have about what you're interested in from Gmail, to improve the targeting abilities of their other advertising business.
DAVIES: Is anybody in the government thinking about this stuff?
Ms. ANGWIN: Yeah. I think that, you know, there are some bills in Congress that aim to tackle these issues and the Federal Trade Commission has been holding a series of roundtable discussions this year about privacy and, particularly, about this tracking business. But it's a really hard thing to regulate.
This is an industry that's moving at light speed. There are new types of tracking technology every day. There are new issues about new platforms. So we've been talking about tracking online but mobile is a whole different world, right, that's where they know your location. And, you know, the cookies on your mobile phone, the tracking places, you can't even see them.
At least on your computer you can go into the back room, sort of, of your computer and see what's being installed there. Most cell phones don't even allow you to see what kind of data is being tracked. So I think the problem for regulators is, how do you get your arms around sort of this ever-expanding pool of technology, and as soon as you regulate some, another one's going to pop up somewhere else.
DAVIES: Let's talk about things consumers can do themselves if they want to. Can you block this information gathering software from making its way on to your computer?
Ms. ANGWIN: Yeah, you can do a couple things. You can try to play around with your Web browser settings to block the type of cookies, but none of the Web browsers have made it particularly easy - although, Apple Safari, by default, blocks third-party cookies, which is a large part of the tracking but not all of it. Then you could also install additional software on to your computer that would block this tracking. So there's one in particular that we recommended called Abine, A-b-i-n-e, which will block all the types of tracking that we looked at in our database, which is cookies, flash cookies and beacons.
And also you can go to the websites of all of these tracking companies and ask them not to track you which is kind of absurd, because you'd have to know who they are. There is a list of all of them on the ad industry's webpage and you can opt out of all of them at the same time. But one thing to know about opting out of tracking is that they actually put a tracker on your computer saying don't track me. So you're opting in to being tracked for not being tracked.
(Soundbite of laughter)
DAVIES: Well, and I should also say that the series that you and your team did in The Wall Street Journal has a really terrific Web section, in which a lot of this is explained and a lot of information about individual browsers and websites is contained. So you can go on to The Wall Street Journal website, in the non-charging part of it - although, as you note in your piece, when you go on to The Wall Street Journal's website, it also installs tracking software, right?
(Soundbite of laughter)
Ms. ANGWIN: Yes, it does. We checked ourselves, too, to make sure that we are being fair and The Wall Street Journal installed just slightly less than the average number of trackers. But yeah, we built a big online database of all the trackers we found on each website, and this was actually for us, at The Journal, a new thing. We haven't been building big comprehensive online databases. So we wanted people to really have a chance to look at their favorite websites, see what we found there and explore some of the information about how their data might be used.
DAVIES: Is there anything else that should worry us or reassure us on this subject?
Ms. ANGWIN: Well, one thing that worries me about this emerging market for data is what - who's buying it. Right now it's primarily advertisers. But there's nothing to stop a health insurance company from going in and buying data about all the people who are browsing sites that are talking about certain diseases and then trying to figure out whether they can exclude them from their policies. Or an insurance company looking at people who are browsing sites for like super fast car driving and then trying to go make their premiums go up. And we haven't seen this yet, but the fact that this data is available raises that possibility and that worries me.
DAVIES: You know, one of the fascinating, and maybe troubling things to contemplate about this, is if these companies are collecting this information, building a profile about us that they assume is accurate but may not be, that's then sold to advertisers as well as, you know, websites which then tailor what they present to us on the Internet, based on who they think we are. I mean is there a concern that we're - what - getting a, either a reflection of our own interests and losing kind of the diversity of the Web, or that they're getting it wrong?
Ms. ANGWIN: I mean yes, this is a concern. I mean, I don't know if you or our listeners know this, but already, Google has started using what it thinks it knows about you to give you different search results. So you and I would see different results based on where Google thinks we're located and the cookie on our computer, the I.D. number, and the information they have about our previous searches. And they are trying to provide us better service, but at the same time they could be wrong about what we want to see. And this is true across the Web. So as more websites are using this type of data to customize their experience, so a news site customizing it, based on the fact that they think you're a sports fan and putting sports news first - where we wrote about a credit card company that's showing credit card offers based on who they think you are - that you're starting to see an Internet that looks like yourself - that you're standing in a hall of mirrors and all you see around you is this reflective version of yourself - that might be true, actually, but is sort of the dossier that's been built up about you electronically.
To me, that changes the entire experience of the Internet and makes it a place that feels very narrow. And the beauty of the Internet is sort of the breadth and width of our experience there, that you can find anything you want and that you have this feeling that you're anonymous and that you can sort of peer in to other people's lives.
DAVIES: Well, Julia Angwin, it's been an education. Thanks so much for speaking with us.
(Soundbite of laughter)
Ms. ANGWIN: Okay. Thank you.
GROSS: Julia Angwin spoke with FRESH AIR contributor Dave Davies. She's senior technology editor at The Wall Street Journal.
Our website has links to her articles about consumer surveillance on the Internet, and we have links to websites that will help you opt out of being tracked by companies that want to gather information about you while you're on the Internet. Our website is freshair.npr.org.
Coming up, Kevin Whitehead reviews the new album by Steve Coleman and Five Elements.
This is FRESH AIR.
(Soundbite of music)
Transcript provided by NPR, Copyright NPR.
Nearly all of the most commonly visited websites install invisible tracking software on your computer so the information can be sold to advertisers. Julia Angwin, who recently led a team of Wall Street Journal reporters investigating the practice, explains what companies do with the information -- and how you can protect your privacy online.
One of the fastest-growing online businesses is the business of spying on Internet users by using sophisticated software to track movements through the Web, so that the information can be sold to advertisers.
Julia Angwin recently led a team of reporters from The Wall Street Journal in analyzing the tracking software. They discovered that nearly all of the most commonly visited websites gather information in real time about the behavior of online users. The Journal series identified more than 100 tracking companies, data brokers and advertising networks collecting data -- which are then sold on a stock market-like exchange to online advertisers.
In a recent conversation with Fresh Air contributor Dave Davies, Angwin explains how consumer surveillance works, how users can disable the tracking software -- and how advertisers are continually evolving to keep up with the data they receive. She notes that many Internet users are unaware that their information is being tracked and then traded.
"Most people that we have heard from since writing these stories did not know what was going on," Angwin explains. "So when you go to a website, you're not thinking about the fact that they might have relationships with all different types of monitoring firms, and those firms are installing things that are invisible to you on your computer."
Julia Angwin is senior technology editor of The Wall Street Journal, and author of the book, Stealing MySpace: The Battle to Control the Most Popular Website in America.
How cookies and beacons work
Based on the Wall Street Journal profile of 26-year-old Ashley Hayes-Beaty and what tracking companies knew about her based on her Internet usage.
"The company tracking Ashley knew all of her favorite movies, her age, her hometown and that she liked quizzes and entertainment news. ... She was given an ID number, which was stored on her computer in something called a cookie. And a cookie is a text file on your computer and really just gives you an ID. And often times when you visit a website, these cookies are installed without you knowing it. So she had an ID number in her cookie. Separately, when she went to some websites they had a different kind of technology called a beacon, which is another invisible kind of tracker that runs some software while you're on a page and tries to figure out what you're doing on that page. So in her case, this beacon was actually seeing her activity around movies in particular -- she had listed her favorite movies on a website -- and it saw that she was typing those in, and captured that data and stored it in a profile, which is stored at some mother ship where there's a little drawer that has her ID number, and inside the drawer it says, 'These are her favorite movies.' And every time they find more information about her, they add more to the file."
On Dictionary.com, the site with the most trackers installed (among the 50 most-popular websites)
"The one site that installed the most was Dictionary.com. A visit to Dictionary.com resulted in 234 trackers being installed on our test computer, and only 11 of those were installed by Dictionary.com. Some tracking devices are completely innocuous. A cookie, or some type of tracker that remembers your password, [can be innocuous]. So if you ask a website to remember your login, that can be stored on a cookie. There are tracking devices that are useful to you as a Web browser. And those tend to be the ones that are installed by the website that you actually have a relationship with, not the ones that you've never heard of before that are sort of secretly lurking behind the scenes. So on Dictionary.com, the vast majority of the trackers (200 out of 234) were installed by companies that the person visiting the site probably had never heard of."
On privacy concerns
"It's totally fair to say the tracking companies don't know your name, but my feeling is if they know everything else about you, does it matter that they don't know your name? Because it feels intrusive to have somebody know so much about you, particularly when we do so much online. When I look at my record of my browsing history or I look at what pages I look at, it really seems to be a record of my thoughts. Every time I have a thought, I take an action online and Google it. So [online tracking] does build up these incredibly rich dossiers. One question is: Is knowing your name the right definition of anonymity? Right now, that is considered anonymous. If they don't know your name, they're not covered by laws that regulate personally identifiable information. And that's what the Federal Trade Commission is considering -- that the definition of personal information should be expanded beyond name and Social Security number. Another thing that [online tracking] raises is sensitive information. So if you're looking at gay websites, then you're labeled as gay in some database somewhere and then you're followed around and sold on some exchange as gay, and you just may not want that to happen. So I feel like there are some categories that we as a society may not want collected: our political affiliation, our diseases, our income levels and many other things."
On how to protect yourself as a consumer
"You can try to play around with your web browser settings to block the type of cookies [that install tracking software], but none of the web browsers have made it particularly easy. Apple Safari, by default, blocks third-party cookies, which is a large part of the tracking but not all of it. Then you can also install additional software that would block this tracking. So there's one [browser add-on] in particular that we recommended called Abine, which will block all the types of tracking that we looked at in our database, which was cookies, flash cookies and beacons. Also, you can go to the websites of all of these tracking companies and ask them not to track you -- which is absurd, because you'd have to know who they are. There is a list of all of them on the ad industry's webpage, and you can opt out of all of them at the same time. But one thing to know about tracking is they actually put a tracker on your computer saying don't track me. So you're opting in to being tracked for not being tracked."
On tracking software installed by The Wall Street Journal
"We checked ourselves, too, to make sure we were being fair. And The Wall Street Journal installed just slightly less than the average number of trackers."