OkCupid co-founder Christian Rudder is a man obsessed with data. His dating site is known for gathering enormous amounts of information about users — the more questions you answer about yourself, the better the site's algorithm can, in theory, find you a match.
Like other social sites, OkCupid keeps track of user data in order to make the site more effective. But, Rudder says, that information could also change the way we see ourselves.
It's true that data isn't everything, he says. "Look, there's no way OkCupid, Facebook, Twitter, these sites even added all together can stand in for the entirety of the human condition," Rudder tells NPR's Arun Rath. "People do all kinds of things they don't do online."
But as more and more activities have some sort of online component, there's an increasing amount of data accessible about our lives. Rudder collects some of that information in his book Dataclysm: Who We Are (When We Think No One's Looking). It's full of charts and graphs that use aggregated online data to help explain everything from political beliefs to speech patterns — and, as Rudder tells Rath, even race relations.
On why shorter messages aren't necessarily less erudite
I looked at a large sample of tweets and looked at the average word length in tweets. And of course, the first blush would be that with all the abbreviations like "u" for you and thanks, you know, "thx," that there would be shorter words on Twitter. And I found that just wasn't the case: [the words] were longer not only than Shakespearean English but also [than] in longform journalism. It's much shorter, obviously, because there's only 140 characters, but it is just as robust in a certain sense.
On what online dating tells us about race relations
All the data on race I have is from dating sites, but on these sites black users, especially, there's a bias against them. Every kind of way you can measure their success on a site — how people rate them, how often they reply to their messages, how many messages they get — that's all reduced.
And so obviously race is always a topic in this country, but especially now with Ferguson, it's such an emotional issue. And it's rare that you can find data that speaks of how one person of whatever race treats another person of another race in an aggregate and kind of measurable way. And so online data is very good at that, and specifically dating data because it's all just strangers mixing with each other. The whole premise of a dating site is to judge people, and so you really are able to tease that out free of any offline social constraints: your friends don't know what you do on a dating site like they do know what you do on Facebook.
... Of course, when you ask anybody directly, even if it's just a computer asking the question, people are like, "Oh yeah, interracial marriage is great. I don't care what race my match is," you know all of that stuff you'd expect a kind of decent, forward-thinking person to say. But when you actually are observing passively what they're doing when they don't think you're keeping track, you see a totally different story.
On how men and women perceive attractiveness online
Age is a huge variable in online dating. Women generally like a guy to be the same age as them up until the guys hit about 40. But when you flip it around, when you look at how men perceive women, it's pretty much just a straight-ticket vote for 20, which is the lowest age I looked at in my data set. So even 45-year-old guys rate 20-year-old women the best. It's a very pervasive opinion among at least the people on OkCupid.
... This is just measuring people's opinions, not what they actually go out and do. What you see when you actually look at what people do, you see the realism set in. So these 40-year-old guys ... the people they actually have the courage to actually go out and message are a lot older: it's 30, 35-year-old women.
On whether mass data collection is a good thing
I definitely think it's good. ... All of this data — everything in the book and generally anything you read online about people's behavior on sites — is aggregated and anonymous. Nobody's looking at your personal account. But when you put all this stuff together, you're able to look at people in a way that people have never been able to look at people before. ... You have millions and millions of people living their lives through an interface that records what they're doing as they live. ...
Copyright 2016 NPR. To see more, visit http://www.npr.org/.