
Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are


Foreword by Steven Pinker

Blending the informed analysis of The Signal and the Noise with the instructive iconoclasm of Think Like a Freak, a fascinating, illuminating, and witty look at what the vast amounts of information now instantly available to us reveal about ourselves and our world, provided we ask the right questions.

By the end of an average day in the early twenty-first century, human beings searching the internet will amass eight trillion gigabytes of data. This staggering amount of information, unprecedented in history, can tell us a great deal about who we are: the fears, desires, and behaviors that drive us, and the conscious and unconscious decisions we make. From the profound to the mundane, we can gain astonishing knowledge about the human psyche that, less than twenty years ago, seemed unfathomable.

Everybody Lies offers fascinating, surprising, and sometimes laugh-out-loud insights into everything from economics to ethics to sports to race to sex, gender, and more, all drawn from the world of big data. What percentage of white voters didn’t vote for Barack Obama because he’s black? Does where you go to school affect how successful you are in life? Do parents secretly favor boy children over girls? Do violent films affect the crime rate? Can you beat the stock market? How regularly do we lie about our sex lives, and who’s more self-conscious about sex, men or women? Investigating these questions and a host of others, Seth Stephens-Davidowitz offers revelations that can help us understand ourselves and our lives better.

Drawing on studies and experiments on how we really live and think, he demonstrates in fascinating and often funny ways the extent to which all the world is indeed a lab. With conclusions ranging from strange-but-true to thought-provoking to disturbing, he explores the power of this digital truth serum and its deeper potential: revealing biases deeply embedded within us, information we can use to change our culture, and the questions we’re afraid to ask that might be essential to our health, both emotional and physical. All of us are touched by big data every day, and its influence is multiplying. Everybody Lies challenges us to think differently about how we see it and the world.



30 reviews for Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are

  1. 4 out of 5

    Jessi

    This book tries too hard to be Freakonomics. The first two parts are full of random examples of interesting but mostly pointless things that can be learned via Google search trends. However, a whole lot of assumptions are made off these bits of data that don't seem to have much basis in factual scientific methods of research. Unprofessional jokes are thrown in randomly. If you need a footnote to explain why a joke was not homophobic, maybe you should have just skipped the joke. And any book of less than 300 pages of text should not need to use the same example three times, especially when it's about how the author can't believe women are concerned about the smell of their vagina.

    The last section of the book explains the limitations big data holds and is really the most grounded section, the rest being almost hagiography. It would have done a lot to work the third section into the examples of the first two sections. It would have balanced out the praise and also would have done much to explain the flaws present in some of the examples included. Some cool facts buried in a lot of murky oddness.

    Disclaimer: I was given this book in a Goodreads giveaway.

  2. 5 out of 5

    Will Byrnes

    …people’s search for information is, in itself, information. When and where they search for facts, quotes, jokes, places, persons, things, or help, it turns out, can tell us a lot more about what they really think, really desire, really fear, and really do than anyone might have guessed. This is especially true since people sometimes don’t so much query Google as confide in it: “I hate my boss.” “I am drunk.” “My dad hit me.”

    There are lies, damned lies, and then there are statistics. One must wonder: do the lies get bigger as the datasets grow? Seth Stephens-Davidowitz posits that the availability of vast amounts of new data not only allows researchers to make better predictions, but offers them never-before-available tools that can yield insight that direct questioning never could.

    We have seen steps up of this type before. Malcolm Gladwell has made a career of such, with Blink, Outliers, and The Tipping Point. Freakonomics is the one I would expect most folks would know. Nate Silver put his data expertise into The Signal and the Noise. All these looks at data and how we interpret it rely on the analyst, regardless, pretty much, of the data. While the same might be true of Stephens-Davidowitz’s approach, he focuses on the availability of materials that have not been there in the past. The smarts that must be applied to get the most interesting results can now be applied to new oceans of data. It is more possible than it has ever been to draw inferences and actually test them out.

    In addition to the volume of data that is now available, there is the kind of data. The author looks at Google and FB data for evidence of underlying realities. Surveys can sometimes offer inaccurate outcomes when the people being queried do not provide honest answers. Are you a racist? Yes/No. But one can look at what people enter into Google to get a sense of possible racism by geographic area. The everyday act of typing a word or phrase into a compact rectangular white box leaves a small trace of truth that, when multiplied by millions, eventually reveals profound realities. Looking for queries on jokes involving the N-word, for example, turns out to yield a telling portrait of anti-black sentiment, which also correlates with lower black life expectancy (and with pro-Trump vote totals).

    We are treated to looks into a variety of research subjects, from picking the ponies, to seeing what really interests or concerns people sexually, to looking for patterns of child abuse, selecting the best wine, and using the texts of a vast number of books and movie scripts to come up with six simple plot structures. I thought the most interesting piece was on the use of associations, and provoking curiosity, rather than relying on overt statements to influence how people feel about a different group of people. Another was on comparing one’s (anonymous) medical information with that of others who share many characteristics to improve medical diagnoses.

    There are some areas in which it was not entirely persuasive that the methodology in question was tracking what was claimed. SS-D sees in searches of Pornhub, for example, what people really want and really do, not what they say they want and say they do. Really? I expect that what people check out online does not necessarily track with what might be of interest in real life. It would be like someone with an interest in mysteries being thought to have homicidal tendencies after searching for a variety of homicide-related titles. Should a writer doing research into a dark subject like child pornography, human trafficking, or cannibalism expect the heavy knock of the police on his or her door? Where is the line between an academic or titillation search and one made for planning?

    SS-D makes a point about there being a significant difference between searches that offer projections for groups or areas, and their inapplicability for predicting individual behavior, although that will not necessarily remain the case. In baseball, for example, the explosion of available information may very well be applied to specific players to diagnose and even correct flaws in technique, or recognize patterns that might expose underlying medical issues, or predict their arrival. The Big Data related here is much more macro, looking at group proclivities. It is useful for spotting trends and measuring public sentiment, but in more detail than has been heretofore possible.

    And of course there is the impact of dark players. Those with the resources and motivation could manipulate the Big Data produced by Google and Facebook. Such players would not necessarily be limited to Russian cyber-spies and pranksters, but could include corporate and ideological players as well, like Robert Mercer. There could have been a bit more in here on those concerns.

    The book offers plenty of anecdotal bits that could have been lifted from any of the other data books noted at the top of this review. What one needs, ultimately, is smart, insightful analysis. Having all the data in the world (that means you, NSA) is merely a burden unless there is someone insightful enough to figure out the right questions to ask, and how to ask them.

    SS-D notes several Google services (Trends, Ngrams, Correlate) that might be familiar to folks doing actual research, but which were news to me. It might be useful to check out some of these, maybe even come up with meaningful queries to shed light on pressing, or even completely frivolous, questions.

    Not all problems can be solved, or even examined, by the addition of ever more data. Sometimes, many times, the information that is available is perfectly sufficient to the task, but other factors prevent the joining together of its various pieces to create a meaningful whole. The now classic example is from 9/11, when an absence of coordination between the CIA and FBI resulted in suicide bombers who could have been foiled succeeding in their mission. Politics and the culture of nations and organizations figure into how data is used.

    So if everybody lies, is Seth Stephens-Davidowitz telling us the truth? I am sure there is a query one could construct that would look at diverse data sources, pull them all together and give us a fuller picture, but for now, we will have to make do with reading his book and articles, checking out his videos, applying the analytical tools already incorporated into our brains, and seeing if there is enough information there with which to come to a well-grounded conclusion. And that’s no lie.

    Review first posted: May 5, 2017
    Publication date: May 9, 2017

    EXTRA STUFF

    Links to the author’s personal, Twitter, and FB pages

    VIDEOS: SS-D speaking
    - Stanford Seminar - Insights with New Data: Using Google Search Data
    - Google Sex with Seth Stephens-Davidowitz - Arts & Ideas at the JCCSF
    - Big Data and the Social Sciences - The Julis-Rabinowitz Center for Public Policy and Finance

    The June 2017 National Geographic cover story has particular relevance to the treatment of actual truth in today's political environment. It is illuminating, if not exactly uplifting.
    - Why We Lie: The Science Behind Our Deceptive Ways - By Yudhijit Bhattacharjee

    July 12, 2017 - Washington Post - one of the very serious applications of big data - The investigation goes digital: Did someone point Russia to specific online targets? - by Philip Bump

    July 15, 2017 - One of the ways big data gets compromised is via automated dishonesty - Please Prove You’re Not a Robot by Tim Wu - Thanks to Henry B for letting us know about the article

  3. 4 out of 5

    Lori

    When sociologists ask people if they waste food, people give the only correct answer: it's wrong to waste food. When sociologists survey the contents of the same people's garbage, they get a more accurate answer. Just imagine how much more information is available trolling through internet searches.

  4. 5 out of 5

    David

    This is an engaging book about how big data can be used to improve our understanding of human behavior, thinking, emotions, and preferences. The basic idea is that if you ask people about their behavior or their preferences in surveys, even anonymous surveys, they will often lie. People do not like to admit to low-brow preferences; racists do not want to admit to their prejudices; most people who watch pornography do not want to admit to it; and even voting is often misrepresented: some people who voted for Trump would not admit to it.

    But, by analyzing immense datasets from Google, public archives, social media, and the like, Seth Stephens-Davidowitz has been able to unearth a lot of fascinating answers to puzzling questions. For example, he is able to predict, through Google searches for various symptoms, who is likely to have early stages of pancreatic cancer. He can predict epidemic breakouts of some contagious diseases well before they are announced by the CDC (Centers for Disease Control and Prevention). He shows that the single factor that correlates with voting for Trump is racism.

    Then there are the fun factoids about the sorts of things that people search for most often on Google. Most commonly, the search "Is my son ..." is followed by "gifted", while the search "Is my daughter ..." is followed by "overweight". That tells us something about the stereotypes in the way people think about their children.

    Interestingly, the release of a new violent movie in a city is correlated with a decrease in violent crime in that city. Perhaps the reason is that violent people who are watching the movie are not out on the streets, committing crimes. And here we get to the main problem with this sort of analysis. Undoubtedly, the research and analysis of big datasets is done correctly. However, once a surprising result is found, understanding the motivations behind the online activity is often subjective and open to interpretation. While this book is very careful about its underlying assumptions, it is a slippery road to getting the correct interpretations and explanations.

    This is an easy, well-paced book that should appeal to anybody who enjoys books like Freakonomics: A Rogue Economist Explores the Hidden Side of Everything.

  5. 4 out of 5

    Richard Derus

    2020 EXHORTATION
    Wednesday, 29 July 2020, the four horse-manuremen of the datapocalypse will testify before Congress about their insane, untrammeled greed and its deleterious effect on Society. (I am presupposing the end result of the hearing here because I am under no obligation to hide my own opinion of these nauseating monopolists.)

    2019 EXHORTATION
    We're entering the 2020 election cycle for real at this moment. Please, all US citizens, PLEASE read books! Especially books about data, how it's acquired and analyzed, how it's massaged and manipulated. The more you know about the topic, the harder it will be for agenda-having politicians to lie to you with numbers.

    I have nothing unique to add to the conversation about this book. I think those most in need of reading it won't, and that's frustrating. If you've ever seen a number adduced to explain a trend, read this book. If you've ever asserted that a certain percentage of something was something/something else, read this book. If you've ever seen a politician quote a study and your innate bullshit filter clogged up, read this book.

    Really simple, high-level terms: READ. THIS. BOOK.

  6. 5 out of 5

    Eli ad

    Such an interesting book. It broadened my views, and I'm looking forward to reading more books by the author.

  7. 5 out of 5

    Monica

    Everybody Lies has all the makings of the kind of book I get suckered into buying during an Amazon Kindle sale: a pop culture polemic that has a very short half-life of relevancy. After reading it, my first blush was to say that I was spot on. But as I thought about it, I realized it had more depth. That's likely because Seth Stephens-Davidowitz is an actual scientist trying to educate people about what they are actually revealing with everything that they say and do.

    The late 20th Century has heralded access to vast quantities of information on every one of us: our buying habits, our browsing habits, and the news sources we use in a very proliferated world of news access. We are telling about ourselves every time we go online on our tablets, phones, and computers. Every text, every phone call, every e-mail is adding data to our digital makeup. Like it or not, data is collected and available for each and every one of us on some very personal things. There is a field of science dedicated to analyzing the data and interpreting its meaning. The byproducts of this new field are used for good and evil. Corporations can use the information to target people who may buy their items (how did Goodreads know that I was thinking about buying a mattress?). Some data mining results could be used to determine how much people need certain types of government services. Some internet searches combined with buying habits, forum discussions, book reviews, blog posts, etc. have led to medical discoveries. The amount of data is staggering, and the ability to compile and analyze the data to reveal useful information is a new science that goes way beyond statistics. It requires knowledge of math, sociology, psychology, engineering, and biological science, plus an understanding of human nature, to attempt to mine useful information.

    What Stephens-Davidowitz has discovered is that everybody lies about…well everything. His primary argument is that people rarely tell the truth in polls, surveys, etc. They also lie in their online data habits, oftentimes to themselves. That little fact obviously complicates the mining of data, for example in Red States, which, despite their stated evangelical postures, consume the most porn and have the highest rates of internet searches for access to abortions. People lie in their own searches as they seek to reinforce their own positions and don't necessarily search for answers. Those kinds of actions are not surprising, but they complicate analysis (understatement).

    This book was a very interesting data primer. There is so much more to the amount of data most of us generate every day, and Stephens-Davidowitz does a great job of explaining the basics. Some of his examples and his approach are a bit superficial, juvenile, pop culture. I don't find myself curious about the users of Pornhub, or average penis size, or baseball stats. Some of that was silly and salacious, betraying his youth and blatantly catering to what his data mining perceived would be an audience of young males. Bah. Also, he quoted Malcolm Gladwell as a resource, which in my view should never be done if you hope to build a foundation based upon experience in the field and credibility on the subject…of anything.

    Nonetheless, I enjoyed the book and I think Stephens-Davidowitz has a very compelling and prosperous future as both a scientist and a writer.

    4 Stars

    Read on Kindle

  8. 5 out of 5

    Carole

    Everybody Lies: Big Data, New Data, and What the Internet Reveals About Who We Really Are by Seth Stephens-Davidowitz takes us into the world of social sciences via the internet. I might have found the book version a bit on the boring side, but I enjoyed the audiobook. Big Data can answer any and all of our questions. But will the answer be what we want to hear? And do we need to know about all aspects of the world we live in? The book is similar to some of Malcolm Gladwell's work, but it is not Malcolm Gladwell. However, you will be informed, you will learn about our world, and you will sometimes be amused.

  9. 5 out of 5

    Jim

    “I am now convinced that Google searches are the most important data set ever collected on the human psyche,” writes the author early on, & he shows why. (Google Trends is available to all here: https://trends.google.com/trends/) He also checked other big data sets including Wikipedia, Facebook, Pornhub, & even Stormfront, the largest racist site. What he found was really interesting & it will help harden the soft, social sciences. It's a new frontier.

    He points out problems with traditional reporting. In the section about child abuse & abortions, Google searches suggest that child abuse does increase during economic downturns while gov't figures incorrectly show little change. Closing abortion clinics doesn't stop abortions; it simply leads to more self-induced abortions. Both happen off the books, but there is now convincing supporting data to show us what we need to address & to help us make more informed decisions with resources.

    Big data has an advantage over every other type of survey because few realize it is being collected, so we don't lie to make ourselves look better. It's also anonymous & aggregate, so caution needs to be used when forming conclusions. For instance, based on Pornhub searches, the author concludes that about 5% of men are gay because they searched for gay porn. That seemed a reasonable conclusion until he pointed out that 15% of women search for rape porn. Does that mean they want to be raped? The author says of course not & makes a big deal out of the difference between fantasy & reality. That makes me question his first conclusion, although it seems about right. Gut reactions are often wrong & he provides several examples where they're wrong due to cognitive biases.

    He also points out "The Curse of Dimensionality". Given large enough sets of data, there will be correlations just through chance. For instance, there are graphs that show how closely autism diagnoses track with organic food sales or Jenny McCarthy's popularity. Separating these out is a whole other problem.

    Big Data only gives us trends that we need to examine. We can't use it on the individual level. While 1000 people searched for how to kill their girlfriend, only 1 girl was killed in his example. That's horrific & might have been stopped if someone had looked at the killer's search history, but do we give up everyone's privacy for a 1 in 1000 chance that we might prevent a murder? Some might be willing, but I'm not, so we also have new questions to address.

    The audiobook was well narrated & I didn't miss the graphs too much. They're provided in the extra material, but weren't handy when I was listening & the book took that into account for the most part. Highly recommended in either format.
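
The reviewer's point about chance correlations lends itself to a small simulation. The sketch below is an illustration added here, not something taken from the book or the review: it assumes purely random Gaussian series, a short 12-point target series, and a hypothetical helper named `strongest_chance_correlation`, and it reports the largest absolute Pearson correlation found between the target and any of the unrelated candidate series.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / (spread_x * spread_y) ** 0.5


def strongest_chance_correlation(n_series, n_points, seed=0):
    """Largest |r| between one random target and n_series unrelated random series.

    Hypothetical helper for illustration: every series is independent noise,
    so any correlation it finds is due to chance alone.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_series):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        best = max(best, abs(pearson(target, candidate)))
    return best


if __name__ == "__main__":
    for n_series in (5, 50, 2000):
        best = strongest_chance_correlation(n_series, n_points=12)
        print(f"{n_series:>5} unrelated series -> strongest |r| = {best:.2f}")
```

Because the same seed produces the same leading candidates, searching 2,000 random series can only match or beat the best correlation found among the first 5. With so few data points per series, some unrelated series tends to line up with the target by luck alone, which is the caution the review describes.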

  10. 4 out of 5

    Amos

    No practicing analyst or social scientist will find anything of value in this book. It verges on being dangerously deceptive, filled with logical fallacies and half-baked reasoning for its conclusions. The book claims to be finding truth in an uncertain world, but actually is just adding to the noise.

  11. 4 out of 5

    Rachel

    I wanted to like this book. It's an interesting topic. But I found the methodology extremely sloppy. Or maybe the author just omitted some key facts.

    He was clearly determined to prove that racism caused the election of Donald Trump. But it's disconcerting to read the conclusion BEFORE the data analysis itself. On one hand, he says that Obama easily won two terms, DESPITE racism. Then he quickly says that Trump won the 2016 election BECAUSE of racism. So which is it? Is racism so widespread that it caused both candidates to win, the black man despite it, the white man because of it? It makes no sense.

    Nor was I terribly convinced that Google searches for the word n-gger are actually clear-cut reflections of a person who would never vote for a black president but always vote for Trump. That's a pretty big leap of logic. As is his notion that black people would spell it "n-gga", therefore all these searches are by white racists. Also, is it a real absolute that racists would never vote for a black president? After all, you could vote for Obama because you feel he's the better of two options, yet still be a flaming racist. Likewise, if you search Google for racist jokes, does that actually prove that you are treating minorities unfairly? It may sound like a reasonable conjecture, but this is data science, not an op-ed column. There should be a more decisive connection before making a grand sweeping statement that Trump won due to racists but Obama won despite racists.

    The author is even sloppier in the section on searches of a pornographic nature. He refers to a data set from a porn site called PornHub. He has to assume that anyone who registers on that site and states "I am male" or "I am female" is absolutely telling the truth. But how do we know that? Are we sure that men never pretend to be women to chat with others, exchange messages, or share videos on porn sites? I'm not convinced.

    As was widely reported, 25% of searches by (alleged) women on porn sites are for rather violent porn. I don't mean a little spanking, but hardcore search terms including words like "brutal" and "crying" and so forth. 20% of the (alleged) women's searches are for lesbian porn. But the author is quick to point out: this is sexual fantasy! It's not real life! Those women aren't actual lesbians, nor do they want to engage in violent sex. But when it comes to men's searches, he regards those as literal fact. If men search for gay porn, it's because they're gay, maybe closeted, but definitely gay. Why does he insist this is true for men, but not for women?

    The same sloppy reasoning is applied to various other search terms. The fact that "boyfriend won't have sex" is a far more common search than "girlfriend won't have sex" is the foundation for his notion that men are more likely to refuse sex to their partners than vice versa. But how do we know that's actually true? What about the notion that women are more likely to SEARCH for a solution to this problem online?

    The fact is that we know absolutely nothing about the people performing these searches: whether they are male or female, racist or fair-minded, gay or straight. So making assumptions about their motivations based solely on search terms is just poor data science. Maybe there's some essential research that the author omitted. But it looks like pure speculation based on search terms, which is not what I would expect of an author who claims to be a data scientist.

    Stick with Freakonomics if this topic interests you.

  12. 5 out of 5

    Trish

    Maybe everyone does lie. But they don’t lie all the time. Stephens-Davidowitz makes the good point that asking people directly doesn’t always, in fact may not often, yield true answers. People have their own reasons for answering pollsters untruthfully, but it is clear that this is a documented fact. People sometimes lie to pollsters. Stephens-Davidowitz was told by mentors and advisors not to consider Google searches worthwhile data, but the more he looked at it, the more he was convinced that Google searches contained the best data for determining what people are concerned about. He has uncovered some interesting trends that are not apparent through direct questioning because people are sometimes ashamed of their fears, feelings, prejudices, and predilections. I didn’t really like this book. Partly the reason is because I listened to it, and Stephens-Davidowitz gives charts, graphs, data points that obviously cannot be represented in the audio version. These usually help me to grasp things easily and maybe bypass pages of material that is not as interesting to me. It wasn’t that his material was hard, it was that I oftentimes did not like what he was talking about. He had a tendency to focus on deviant behavior, e.g., sexual predators, abuse, porn, etc. One might make the argument that these behaviors are important to understand and therefore worth looking at. Possibly.
However, if ‘everybody lies,’ one might make the argument that we do not have to look at deviance to find untruthfulness. What we discover is that to test Stephens-Davidowitz’s thesis that ‘everybody lies,’ we have to spend quite a lot of time with statistics and creating studies, or as he is wont to do, studying big data. Big data probably irons out discrepancies in the reasons for our Google searches, e.g., that it is not me that is interested in the herpes virus, it is my brother, because in the end it doesn’t matter why we did the search; what matters is that we did the search. Besides, maybe I’m lying about my brother having the virus, but my interest in the topic is not a lie. Stephens-Davidowitz has made a career so far out of the study of big data, showing us ways to slice and dice it so that it is useful to our view of the world. Only thing is, I am not as interested in what big data tells us as he is. He’d trained as an economist, and towards the end of the book he hit a couple of areas I did find more interesting, like the notion of regression discontinuity, a term used to describe a statistical tool created to measure the outcomes of people very close to some arbitrary cut-off.** S-D talks about using this tool on federal inmates, discovering criminals treated more harshly committed more crimes upon their release. But S-D also studied students on either side of the admissions cut-off for the prestigious Stuyvesant High School: those who attended Stuyvesant did not have a significant performance difference in later life than students who did not. Apparently Stephens-Davidowitz went into data science because of Freakonomics, the bestselling book by Steven D. Levitt. He believes that many of the next generation of scientists in every field will be data scientists. I did finish the audiobook, another study he took note of in the last pages. Apparently few readers finish ‘treatises’ by economists. 
He believes this is his big contribution to our knowledge base, and there is no doubt his contrariness did highlight ways big data can be used effectively. If I may be so bold, I might be able to suggest a reason why many female readers may not be as interested in the material presented, or in Stephens-Davidowitz himself (he was/is apparently looking for a girlfriend). Stay away from the deviant sex stuff, Seth. It may interest you but I can guarantee that fewer women are going to find that appealing or reassuring conversation or reading material. An interesting corollary to this economist’s data view is the question of whether the truth matters, which is how I came to pick up this book. Recently on PBS’ The Third Rail with Ozy, Carlos Watson asked whether the truth matters. At first blush the answer seems obvious, and two sides debated this question. One side said of course truth matters…but most of us know one man’s truth to be another man’s lie. The other side said ‘everybody lies.’ It got me to thinking…I do think the two ways of coming to the notion of lying dovetail at some point, and one has to conclude that truth may not matter as much as we think. What matters is what we believe to be true. Finally, it appears Stephens-Davidowitz agrees to some degree with Cathy O'Neil, author of Weapons of Math Destruction, in that he agrees you best not let algorithms run without human tweaking and interference. The best outcomes are delivered when humans apply their particular observations and knowledge and expertise along with big data. ** S-D describes it this way: “Any time there is precise number that divides people into two different groups, a discontinuity, economists can compare, or regress, the outcomes of people very very close to the cut off.”

  13. 4 out of 5

    Matt Ward

    This book could have used a good editor. It tries to be a Gladwell-type of book without fully succeeding. Issue 1 is that the anecdotal stories are not fleshed out enough to really draw you in like Gladwell does. This causes much of the book to come across as a list of facts, and it gets pretty old by the midway point. The other issue is a growing trend among people writing data books. They want to write in a colloquial style to make it seem informal and easy to read. They don't want to scare off people with talk of algorithms and things like that. Unfortunately, using tons of sentence fragments and colloquial phrases only makes a book like this harder to read. It's precision and clarity that make books easy to understand. Introducing ambiguity in order to sound like a friendly conversation is exactly the wrong approach. Overall, there are a bunch of interesting facts in here. I think Seth gets a bunch wrong, though, in not understanding fully why certain search terms are used.

  14. 5 out of 5

    aPriL does feral sometimes

    I was annoyed by the author’s writing style in ‘Everybody Lies’. I have no doubts author Seth Stephens-Davidowitz was trying to write to a large general audience, including that assumed class of American non-science reader who hates math and binge watches ‘Keeping Up with the Kardashians’. Good for him, and maybe you, right? But I became more and more annoyed as I read. Ah, well. It is an interesting and informative read, in spite of trying too hard to be fun, imho. What is the book about? I am glad to report it has genuine information about the science of statistics and ‘big data’ collecting, and how the erroneous selection of study parameters or assumptions about what is relevant data to study affects conclusions (as far as I know - I am a dunce at scientific math, despite that I passed a statistics class). The author used what seemed to me genuinely interesting new methods to formulate statistical studies, primarily using Google’s forensic tools, along with other sources. I was shocked by what people type into Google Search (which Google compiles into anonymous data). For example, President Obama’s race appears to have truly ignited racists into coming out of their closets.
Comparing survey interviews with people who state they are racist (a low percentage) with the percentage of those who Googled “n***** jokes” state by state turns out to show some truly hidden pockets of unexpected racism - and the total percentage of racist searches on Google was WAY higher than the racism that typical surveys show. In addition, those places who adore Trump also searched most for “n***** jokes”. Correlation? Idk, no one does know for the record, but I think yes. Also of interest to me (please don’t bust my balls because of my prurient interests - and maybe there is a pun in this sentence, hehheh - read on) men really truly do Google a lot about penis sizes. Come on, fellas, give it a rest! (Yes, I am trying to be snarky since the too much ‘at rest’ position is part of what men appear to be most anxious about!) Men prowl porn sites in humongous numbers - shocking, right? - which is good for statisticians looking for Truth about sexuality for their inputs into their mathematical equations. Based on Google porn searches, the author estimates 5% of the population is gay. (Btw, conservatives mostly use the word ‘homosexual’ while liberals use the phrase ‘same-sex’, statistically, in Google searches.) Not to neglect what Google says about what the ladies’ biggest sexual worry is, all I can say is, Oh. My. God. Vagina odor. Really? Really!! All statisticians should take note - interrogative surveys often show different results from those statistics revealed in Google searches about the percentages of who is thinking/feeling what where and when, especially in those morally-weighted or personally embarrassing areas of society. Of course, interpretation is always fraught with possible erroneous judgements whatever the source of sampling. I have always trusted those insurance actuarial tables FAR more than political or media spins or even university data studies - so now I am adding Google statistics to my ‘trusted info’ list. 
Of course, gentle reader, I know any compilations of data can be erroneously or purposely manipulated or massaged. ‘Garbage in, garbage out’ still applies...which is the case ‘Everybody Lies’ makes as well. The book seemed on top of the science, as far as I know. I am not a science-brain, but an amateur wannabe. My one irritation with this book is all about the manner in which the information is explained. Gentle reader, my complaint is subjective as hell. Honestly, I can’t put my finger on it, though. The writer seemed to be trying to fill out his actual 200-page book to 300 pages by having personal emotional filler similar to the gaspy asides many shows use to increase the viewers’ emotional high about what is being discussed. Are you familiar with those TV shows that, after each commercial break, recap the entire show in the preceding minutes before the commercial break in a breathless montage manner? And they often had a shocked-gasp teaser of what will be shown before the commercial break? Anyway, I felt there was a lot of that style of emotional manipulation (and extending of the material) going on in this book, somehow. I simply did not appreciate the personal ‘fun’ filler so much. Maybe there wasn’t enough snark. I prefer snarky humor, if there is humor. Bite me. Maybe a more tightly edited book would have worked better for me to enjoy reading it. Anyway, I realize I am floundering about here. None of this may be true at all for you. Ultimately, this is a book worthy of reading for the general reader (for the record, I definitely have a lit/history brain, so yes, I am a general science reader!) and the explanatory information about how statistical studies are done (the only math-involved college class which engaged me) and what people are really feeling and thinking (if Google searches are to be believed, and I think they are). Included are extensive Notes and Index sections.

  15. 5 out of 5

    Tam

    A pretty short book with some interesting remarks, but not yet charming enough for me. The author definitely has his quirky and funny moments, when he presents himself, his family, and especially his views more. Yet the book's ideas and findings aren't exactly ground breaking. The types of questions like in this book have been posed in Freakonomics: A Rogue Economist Explores the Hidden Side of Everything. The usefulness of big data has been discussed by ones such as Dataclysm: Who We Are (the discussion on sex and gender actually resembles Dataclysm a lot). I was looking for something more nuanced, a long and rigorous thematic research on humans' tendencies, and data as an extremely useful tool but not the main focus. Instead in the book, it's more like a collection of observations. Each time Stephens-Davidowitz has an idea, he looks for an answer from the available data, then moves on. The questions are somewhat related to humans' private behaviors that traditionally we can't observe. The tool seems to be a bit more at the center here, but he doesn't discuss the cons and all the ethical implications of big data deeply enough, except for a short section at the end of the book. Now, that's totally ok, for a casual and light, yet still useful read. More importantly, we have to consider that these types of research and the topic of big data are still relatively new.
It takes decades and decades more to build a literature huge enough to draw really meaningful and profound conclusions. The time simply hasn't arrived yet for the book of my taste, but this one, as the author states, hopefully would raise interests in young people, young social scientists, steering them towards potentially fruitful topics and research methodologies. That's why it's a 3 star.

  16. 4 out of 5

    Emma Deplores Goodreads Censorship

    3.5 stars This is an engaging and informative book about the huge amount of data available online and what it tells us about society. I read it alongside Dataclysm and found Everybody Lies to be by far the better of the two, presenting a wealth of information in a cohesive fashion and making fewer unfounded assumptions. The author was a data scientist at Google, and draws in large part on the searches people make on the site, along with information from sites including Facebook and Pornhub. There’s a lot of interesting stuff in the data, from the rate of racist searches in the rust belt predicting the rise of Donald Trump, to common body anxieties and whether they actually matter to the opposite sex, to an estimate of how many men are gay and whether that varies by geography (it appears not), to rates of self-induced abortions. This is a great book to read if you love unusual factoids, whether on sexual proclivities or how sports fans are made. The author also writes in a compelling way about the uses of Big Data itself, and while he waxes evangelical about it (evidently preferring to spend all his time immersed in statistically significant data, he finds novels and biographies too “small and unrepresentative" and therefore uninteresting), there are certainly a lot of possibilities there.
In health, for instance, compiling early searches about symptoms with later searches for how to handle a diagnosis can help doctors detect pancreatic cancer at an earlier stage, while epidemics can be tracked through symptom searches. The author is also interested in how applying data can revolutionize a field, discussing at length the data that predicted the success of the racehorse American Pharoah. (By "at length" I mean 9 pages; this is a book that moves through a broad range of topics quickly.) Overall, the writing is engaging and the book hangs together well, being informative while mostly resisting the urge to speculate. But the author does make a couple of assumptions worth pointing out. One is that people’s Google searches are made in earnest and for personal reasons. Certainly, you might search for “depression symptoms” out of concern that you or someone you know is depressed. But you also might want to be prepared in advance to identify warning signs, or might have encountered something in the media that sparked your interest, or you might be a student writing a paper on the topic. On the other hand, if you’re intimately familiar with depression already, you’re unlikely to google the symptoms. None of this means the author’s finding a 40% difference in rates of depression symptom searches between Chicago and Hawaii isn’t relevant, but data that’s both over- and under-inclusive serves better as a starting point for research than a definitive conclusion. It's certainly not proof that better geography is twice as effective as antidepressants, as the author suggests. The other assumption is that everybody lies: the book insists on it, based largely on the fact that typically rosy social media posts fail to reflect all those unhappy or hateful searches. Selectively sharing information doesn’t necessarily seem to me to be lying, but the author appears invested in proving the book’s title.
For instance, he discusses a particular type of tax fraud: in areas where few tax professionals or people eligible for the scheme live, 2% of people who could benefit from this lie tell it, while in areas with high concentrations of both, the rate of cheating is around 30%. The author concludes that “the key isn’t determining who is honest and who is dishonest. It is determining who knows how to cheat and who doesn’t.” This bleak view of the world fails to account for the 70% who don’t cheat even in areas with high levels of knowledge; finding that significant numbers of people cheat if they know how is a far cry from finding that everyone does. So, like the author of Dataclysm, Stephens-Davidowitz is probably a better statistician than sociologist. But if you’re interested in Big Data, or in getting a peek at the thoughts and anxieties people ask Google about because they’re not comfortable sharing with others, this is the book I recommend. You’ll certainly get a lot of interesting tidbits from it, along with perhaps new inhibitions about typing things into Google!

  17. 5 out of 5

    Montzalee Wittmann

    Everybody Lies Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are By: Seth Stephens-Davidowitz, Steven Pinker - foreword Narrated by: Tim Andres Pabon Wow, this book really lays out a lot of data itself! It speaks about how people say one thing, or respond to a poll, yet they are lying. They lie less online than having to face someone. How everything that is searched is checked and through the data searches a lot can be noted. Examples of this include what people are doing while out of work, how they feel about their children, politics, and more. Even how racist the country is at certain times. Interesting. I guess the people that study this data know more about us than we know ourselves!😁

  18. 5 out of 5

    Cheryl

    Believe the hype. This is not a perfect book, but it's fun, enlightening, ground-breaking, and important. Too many people don't know the potential power of the new methodologies of data analytics, and too many ppl who think they do know that power don't know the limitations. SethSD does, and he shares a lot of what he knows with us. This is good science for arm-chair science consumers like me, and a good read for those who just like to dabble in non-fiction. It's both concise and rich. Documented with notes, an index, and the author's own website which he promises has lots more hard info. It may turn out to be a four-star book as more on the topic get published. But right now I urge everyone to read it. Next, I do hope to read Seth's next book, and more on the subject. Yes, Seth, I did read right to the end, and still I'm glad you didn't keep struggling to say anything for the ages in your conclusion... imo, you ended it perfectly. On a personal note, one of the key points from the intro. and one of the key points from the conclusion are amazingly relevant. Here's the thing. Our youngest is looking for a school to transfer up to, at the same time we're looking for our first post-retirement community. We're hoping to find a college & town all three of us would like, and a particular field of study for our kid.
In the beginning of this book are two maps (one that reveals Trump supporters, and one that reveals pockets of closet racists as exposed by their Google searches)... which is obviously relevant data for us as we choose what part of the country to move to. And at the end of the book, Seth tells my geeky son what studies to focus on: "I hope there is some young person reading this right now who is a bit confused on what she wants to do with her life. If you have a bit of statistical skill, an abundance of creativity, and curiosity, enter the data analytics business." (Well, my young person has been listening to me read bits from the book, but otherwise that could have been directly tailored for him.) Read the book. Don't be fooled by my long review; I'm only sharing a bit of what I learned from it. Other book darts: "[P]laces with the highest racist search rates included upstate New York, western Pennsylvania, eastern Ohio, industrial Michigan and rural Illinois, along with West Virginia... The true divide... was not South versus North; it was East versus West. You don't get this sort of thing much west of the Mississippi. And racism was not limited to Republicans...." The 4 powers of Big Data can be summarized: "Offering up new types of data..." "Providing honest data..." "Allowing us to zoom in on small subsets of people..." "Allowing us to do many causal experiments...." Now we get to an example of what is not perfect about the book. First, context: Seth is a careful scientist; he knows about sampling errors, biases, correlation not equaling causation, etc. However, sometimes he forgets about alternative explanations and interpretations. That is to say, when the book shows us data, it's fine, but sometimes when Seth interprets the data, he gets trapped by a fallacy. Eg, he says, "[O]f the minority of women who visit PornHub, there is a (25%) subset who search... for rape imagery...
sometimes people have fantasies they wish they didn't have and which they may never mention to others." Maybe... or maybe they're victims trying to process, or maybe they're wannabee authors doing research, or they're men lying to present as female.... It looks to me like Seth didn't want to think too hard about this one.... Big data allows researchers to zoom in on subsets of demographic groups, and geographical regions.... "But another huge--and still growing--advantage of data from the internet is that is easy to collect data from around the world.... And data scientists get an opportunity to tiptoe into anthropology." Big data could really help in the field of healthcare. When I'm done here I'm going to check out the site PatientsLikeMe.com. "Heywood hopes that you can find people of your age and gender, with your history, reporting symptoms similar to yours--and see what has worked for them." I also want to consider reading Irresistible: The Rise of Addictive Technology and the Business of Keeping Us Hooked and Super Crunchers: Why Thinking-By-Numbers Is the New Way to Be Smart.

  19. 4 out of 5

    Yaaresse

    At 58%, I give up. DNF. I've seldom read anything that contained so many individually interesting (if shallow) sentences and still bored the hell out of me. I'm also tired of reading about the author's infatuations with baseball, Google, and porn. I am counting this book as read, however, because I should get some small (if valueless) reward for the time I lost reading it. Some random, non-linear thoughts because I'm not interested enough in the book to try harder at this point:
1. The author worked for Google. Apparently, he's still smitten by them and thinks their massive collection of data on users is the most wondrous gift to humans ever. Of course he does. His degree and career depend on it. I am of the opinion that the most, and perhaps the only, honest thing Google has ever done is when they got rid of their "Don't be evil" corporate slogan. Google is kind of like Walmart to me. I don't like it, trust it, or believe anything it claims and use it as little as possible.
2. The fact that the government turned over every American's tax data to whatever researchers want to dig through it--and in such detail that they can trace income changes for every address a single individual has reported from--is worrisome. "Oh, but the individuals weren't identified." Bullshit. Maybe not by name, but there is enough info there to cross-match with other data and individually identify people. That every search you've ever made is time/location stamped and available to whoever wants it is just effing creepy. While we all should know by now that we're just products to be data-mined for profit, seeing a time-stamped list of an individual's exact searches over a 24-hour period is disconcerting.
3. If this guy is right, I've been using search engines for all the wrong things. (Read that in snark font.) Do people really do Google searches for "Why are Jews cheap?" or "Is my daughter ugly?" or "Am I gay?" Gee, all this time, I've been using searches for things like "What is the capital of Latvia" or "What is the formula for calculating amortization?" Every now and then I sink to searching for David Bowie music videos. I think I once stooped to "Why the hell are the Kardashians famous?" Supposedly women everywhere are Googling "Does my vagina smell?" and "Is my husband cheating"? Well, honey, if you have to ask....
4. Data for data's sake serves little purpose. It just makes for more noise, not more clarity.
5. Apparently the author believes that the way to keep readers' interest is to use the most sensationalist examples he could come up with. Throw around a lot of racist terms and references to sex kinks or insecurities. When all else fails, talk baseball and throw in some frat boy humor. I get it: the porn industry drives nearly every internet innovation, from web design to security and data collection. Fine. But the tenth or 15th time he refers to women being worried about vaginal odor or how common Pornhub searches for incest videos are, it comes off as an attempt to be provocative rather than informative. For a chapter or two when that content is relevant to the topic, fine. Every single chapter? Boring. And really creepy after a while.
6. If there is any discussion about the ethics of data mining in this thing, it's so far buried in the back that I didn't get to it. If there is substantive material on statistical relevance, hypothesis testing or phantom populations, it's buried in the back. He's the kid who walks in with his bright, shiny, non-traditional method and basically says he's right and everyone else who ever studied data is wrong. (And maybe, if as the author claims, the last chapters of books don't get read as much as the beginning ones, it's because books are getting less and less informative and/or credible.)

  20. 5 out of 5

    Ms.pegasus

    Author Seth Stephens-Davidowitz offers an amusing example of lying. He cites a follow-up on a survey of high school student behavior: “...a meaningful percent of teenagers tell surveys they are more than seven feet tall, weigh more than four hundred pounds, or have three children. One survey found 99 percent of students who reported having an artificial limb to academic researchers were kidding.” (Location 4702) Students like to mess with adults, he cautions. Of course I didn't need to read the book to learn that. When he was a teen, my son clued me in when I expressed alarm at the results of a survey conducted by his high school indicating high usage of drugs and alcohol. What I did learn from this book was how analytical tools are helping researchers make sense of the vast amounts of data out there, data harvested from unwittingly honest interactions. The author is a Big Data specialist and he has written an engaging and approachable book about a highly technical subject. His analytic arsenal includes Google Trends, Google Correlate, Google N-gram, and Google Adwords. He uses these tools to uncover startling and even appalling revelations about sexual proclivities conveyed in Google searches. From there he moves on to more serious topics such as the role of racism in political outcomes and gender bias themes in parenting. Most of what he reveals is unexpected and contrary to conventional wisdom in a big way.
Stephens-Davidowitz appeals to a wide range of interests in this book. He explores a concept called Doppelganger Searching as it was used to predict the trajectory of David Ortiz's career in major league baseball. (It's a perverse irony that Ortiz was the victim of mistaken identity, a figurative “doppelganger,” in a near-fatal shooting.) Much has been written about abuses of Big Data. Stephens-Davidowitz is not unmindful of this problem, although his concerns will not assuage the fears of those concerned with privacy issues. If anything, the power of the tools he wields might heighten that anxiety. Stephens-Davidowitz explains in reader-friendly fashion other technical concepts besides Doppelganger Searching. He tackles the “curse of dimensionality,” a problem relevant to social science research where the number of variables can expand almost infinitely. He cautions against the addictive power of metrics in another amusing example which many readers might identify with. He relates how a marketing professor became obsessed with her count on a pedometer: “She walked early in the morning, late at night, at nearly all hours of the day — twenty thousand steps in a given twenty-four hour period. She checked her pedometer hundreds of times per day, and much that remained of her human communications was with other pedometer users online, discussing strategies to improve scores.” (Location 2908) An intriguing idea Stephens-Davidowitz proposes is that Big Data analysis can breach the boundary between correlation and causation through A/B testing. He offers some tests; the reader will be astonished at how widely the data diverges from an initial gut reaction. Explanations of behavior constantly see-saw between the microcosm of stories from individual viewpoints and confident generalizations based on flawed statistics. Seth Stephens-Davidowitz sees Big Data analysis as a possible bridge between these two methodologies.
This was an eye-opening book worth reading by anyone interested in public policy decision-making. NOTES: I read the Kindle edition of this book. The editors have taken a step in the right direction. Asterisks tag the author's comments and readers can easily toggle back and forth to the main text. On the other hand, footnotes are not marked in the text. The reader is forced to read the footnotes and conduct a search to link them back to the relevant passages.

  21. 4 out of 5

    Caroline

    I wish I could give this book more than five stars. Anyone who has a sneaking feeling that Americans aren't who they SAY they are will find confirmation here. It's also easy to read, no academic language here. I was already riveted by the introduction. His premise is that we all lie to each other, pollsters, and ourselves, but not to that white box where you type internet searches. Both before and after the election everyone went nuts trying to figure out why Trump was doing so much better than polls would indicate, looking for factors that would explain it. There was only one. "[Nate] Silver found that the single factor that best correlated with Donald Trump's support in the Republican primaries was that measure I had discovered four years earlier. Areas that supported Trump in the largest numbers were those that made the most Google searches for 'n-----'." (He uses the real word, which deepens the revulsion you feel at what he's discovered.) Despite Obama's two easy election victories and the narrative that we were post-racism, the Google search data tells another story about reactions to those victories. Immediately after the San Bernardino shootings, what happened online? A ton of people searched for "kill Muslims." And there is a lot more, about sex and child abuse and sexism. Did you know that the most common term used to complete the sentence "Is my son..." is 'gifted' or some variant thereof, and the most common term used to complete the sentence "Is my daughter..." is 'overweight'? America is not post-anything except maybe post-good intentions. What use does he think this can be? Well, he did have some good suggestions, and none of them are based on finding out who any individual is who's done a search. For example, if searches for "kill Muslims" spike in a certain city, a few extra police could be deployed to watch over the local mosque until the spike subsides. He spends a moment talking about how big data is not meant to be, and should not be, used to try to figure out who specifically is going to commit crimes. By the time I was done with this book I was a bit discouraged at who Americans seem to be, but it's better to know. I hope that this kind of study continues, so we can attempt to realistically work with our society instead of pretending it's something it's not.

  22. 4 out of 5

    Lubinka Dimitrova

    I sought out the book after reading an interview with the author, and it was totally worth it. The book is quite enlightening, and to be honest, deeply frightening. Internet data can work miracles for the benefit of humanity, but it can bring to life many unimaginable, Big-Brother-type nightmares (current US presidents not excluded, just sayin...). Still, it's good to know.

  23. 4 out of 5

    Steven

    This book is kind of a mess, but its subject is interesting enough (and some of the findings are intriguing and potentially important enough) to make you breeze your way through it. I call it a mess because 1) the title is Everybody Lies, yet Stephens-Davidowitz in no way shows, let alone proves, that everyone lies (only, perhaps, that people tend to keep intimate, embarrassing, and politically incorrect things to themselves or, if they do share them, are more willing to confide in Google than in others through traditional sources of data collection), and 2) the subtitle is What the Internet Can Tell Us About Who We Really Are, but the book is not about the Internet as such, but about the potentialities of big data in all its manifest forms (text, words, images, bio-data, and so on). Stephens-Davidowitz probably used big data to establish which words people search for and have been buzzing the most... Anyway, having said that, the book doesn't go into nearly as much depth as I would have liked it to: assumptions go unexamined, serious ethical concerns go unaddressed, and alternative hypotheses go unexplored. However, this relative superficiality has an upside: the book is light and entertaining, although some of the findings that S-D discusses are also shocking and depressing (like how many people search for horrible things online).
At the core of the book lie the four 'powers' of big data: (1) It offers up new types of data (e.g., people's sexual desires and preferences gathered through porn sites). (2) It provides honest data (e.g., people's genuine concerns as expressed through questions asked in Google). (3) It allows us to zoom in on small subsets of people (because there is so much data, even small slices of a population can provide meaningful [statistical] information). (4) It allows us to do many causal experiments (e.g., presumably, through the kind of A/B testing done by Facebook and sellers of ads; this is, in my eyes, a problematic area that is not adequately addressed by S-D). For all its weaknesses, some of which merely require some more attention and research, others of which are more fundamental, Stephens-Davidowitz should get credit for the work he has done on and with big data. I do agree with him that it's a very important source of information (which ought to be used with care, all the same) and tool for research, perhaps even, as he definitely seems to think, the most important in and for the future.

  24. 4 out of 5

    Sonja Arlow

    3.5 stars. You may come across as liberal to the world but secretly google racist jokes... Although you may spill your deepest darkest secrets to Google, make no mistake: this data sits somewhere ready to be analysed. I work with big data every day, so I was immediately drawn to this book. But you really don’t need to be in the data industry to appreciate the book. It is written for the layman with humour and interesting titbits sprinkled throughout the book. The first third of the book gives a notable amount of space to data on sex, politics and racism. These are things we are not always honest about with our friends (or even ourselves). The one downside for me was that the data was very USA-centric, so some of the case studies dealing with baseball or basketball were just not very interesting to me. There were also one or two graphs that were not properly explained. The author is also very passionate about this subject matter, which means he sometimes flitted around from one topic to another in such quick succession that you almost lose the point he is trying to make. But there were sections that I also found fascinating: the explanation of doppelganger search algorithms (this is how Amazon and Netflix suggest books/movies you may like) and its applications across various industries. The case study on how race horses were chosen and how new data also has its limitations were just as great.
This is a little like Freakonomics for Big Data, and the author himself is clearly a HUGE fan of Steven Levitt, as the conclusion reads like an ode to his idol. And finally, last week the world got its first glimpse of a black hole, only possible due to the crunching of HUGE data using sophisticated algorithms. Thanks Katie Bouman! This shows just how powerful the application of big data can be. The big questions and mysteries of our time may very well be answered one data set at a time. Recommended.

  25. 5 out of 5

    Bryan Alkire

    Update: I now give this a 3.

  26. 5 out of 5

    Elena

    The author is a bit too bragging, exaggerating, and name-dropping for my taste. Still, I do not regret spending the time with the book (but would have regretted paying money for it had it not been a library loan). Memorabilia. Predicting rate of unemployment with the frequency of porn site searches (amount of time on their hands). Predicting success of dating (listen, then listen some more, then, when you think you are done listening, listen some more). Doppelganger (DOPP-el-gang-er) searches in Internet (by medical history, interests, etc.). Regression discontinuity (sample is taken from the section around a sharp numerical divide). Natural experiments. Presidents association and the afterlife of the economy. Future of students who made it into prestigious schools and who did not. Recidivism of prisoners who were treated harsher (because they just made it into the more dangerous classification) and vice versa.

  27. 4 out of 5

    Tressa

    I couldn't even make it through the introduction. This is a perfect example of starting with a conclusion and then finding the data to support your conclusion. All a search shows you is the number of times the word or phrase is searched. It does not show intention. It does not show the number of times a certain person searches for the same word or phrase. The author makes a lot of assumptions based on his own presuppositions. I thought this would be a fun read, but I have no intention of slogging through a heavily biased account of his own ideas. Data is just that: data. It is one piece of a puzzle. It is not the puzzle.

  28. 4 out of 5

    Moshe Zioni

    Don't get me wrong, it is nice, funny and worth a short read. A problem for me: the causality vs correlation part comes waaaay too late in the book, and the author sometimes mixes the two, IMHO. The biggest thing for data scientists to tackle is the issue of causality and if/how it can be proven; most of the time it just cannot be proven by this method because of its built-in limitations. But the author gives this a pass, which makes some of his assumptions, plausible as they may be, naive in nature.

  29. 4 out of 5

    Wen

    The title steered me a bit off-course at first: I thought it was one of those self-help psychology books that I tend to avoid. I eventually decided to give it a shot, mostly because Steven Pinker, an author I highly respect, wrote the foreword. So glad I did. To the author, Mr. Davidowitz: I did finish the book, as I did the first two books you mention below; it's a moot point for the third book as it’s not even on my to-read list ;-) “more than 90 percent of readers finished Donna Tartt’s novel The Goldfinch. In contrast, only about 7 percent made it through Nobel Prize economist Daniel Kahneman’s magnum opus, Thinking, Fast and Slow. Fewer than 3 percent… made it to the end of economist Thomas Piketty’s much discussed and praised Capital in the 21st Century.” As the subtitle suggested, the book was a primer on data science, a still-budding field that serves as the very foundation for hot markets du jour, like artificial intelligence and machine learning. First of all, as informative as the book was, I’d say the book targeted the general reading public. It mostly steered clear of mathematical, statistical and programming jargon. The writing style was light-hearted; it certainly did not remind me of a serious (boring??) college textbook. That said, I assume those readers who love numbers and prefer talking in percentage terms would enjoy the book more. In short, data science in this book was telling stories through data, big data, new data, i.e. the gigantic data sets we now have access to thanks mostly to keywords we put into internet search engines every day. And today even our personal computer might be capable of processing and analyzing such data sets, given increasingly cheaper memory chips and more powerful CPUs, GPUs or other processors. While Davidowitz admitted our guts could do a decent job drawing conclusions and making predictions naturally, he pointed out we need big data to “sharpen the picture”. For instance, it is common sense that harsh winter weather could lead to depression (the D-word was frequently brought up by tour guides during my recent vacation in Northern Europe), but how much of a temperature drop could affect people’s mood materially: 10 degrees, or 50? Would other factors, say “economic conditions, education levels, and church attendance”, muddy the picture? And how about our guts getting it totally wrong? The book gave an example of a study that concluded, totally against our intuition, that couples maintaining separate sets of friends tend to stay in the relationship longer. So Davidowitz spent the bulk of the book, in Part Two, illustrating “the powers of the big data” that we have access to today: 1) new types of data that are beyond survey data or tabular data, think tweets and pictures; 2) honest data, the data generated at the subconscious level, such as doing a Google search (instead of answering a survey), when people are not as inclined to lie; 3) data granular enough that we could zoom in on small subsets for our particular study; 4) data so large and comprehensive that it would allow us to undertake rapid, controlled experiments. Very soon into the book, I spotted its similarity with a very popular title published more than a decade ago that I also loved, Freakonomics: A Rogue Economist Explores the Hidden Side of Everything by Steven Levitt and Stephen Dubner. It was not only because Levitt was frequently mentioned in this book.
• The two books shared a similar carefree and witty writing style, and were of similar length;
• both tried to employ data to debunk myths largely derived from our intuition;
• as an engaging exercise for the reader, both gave lists of factors and asked readers to guess which ones had significant impact on the outcome, then gave the answers;
• both devoted a large portion to explaining the distinction between “causation” and “correlation”, and describing A/B testing, the randomized, controlled study;
• the two books even tackled a similar subject, albeit from different angles: whether the school choice determines the student’s later success.
Essentially, both books encouraged readers to think outside the box and ask the right questions. Of course this book was more up-to-date; it listed data sources (Google Trends, Ngrams, and Correlate), along with unstructured data types that are more relevant to today’s digitally-connected readers. Not surprisingly, at the end of the book Davidowitz revealed that it was the very “rogue economist” Levitt and his Freakonomics that inspired him to pursue his current career. In fact, Davidowitz explored the same data set, birth certificate data in California which included black residents’ first names (and whether each was a common white name or a distinctive black name). While Levitt established the connection between a black person’s first name and his socioeconomic background, Davidowitz built on the study, and used a black person’s first name as a proxy for his socioeconomic background, to study the linkage between this factor and the chance of the person making the NBA. To me this exposed one of the best known pitfalls of data analysis, so-called “garbage in, garbage out”. What if someone came out to prove Levitt wrong in his linkage? That would be like an earthquake to Davidowitz’s subsequent study. It would be great if this topic could be covered in more depth in the sequel to this book.
To be fair, Davidowitz covered several limitations of big data in the book, particularly from the moral/ethics standpoint, e.g. price gouging, discrimination, and privacy. There were other parts in the book that I found less convincing. "... in the prediction business, you just need to know that something works, not why." As an example, the author cited that Walmart discovered its customers preferred stockpiling strawberry Pop-Tarts before hurricanes. So the store should just stock the pastries on its shelves simply based on the data, without confirming the causality first? I'd also be cautious about reading too much into the various first-date signals in the book. So a guy would be more interested in me if I act more like a narcissist? hmmm... Well, as an intro the book only scratched the surface of data science. And yet, it was an enjoyable, fast-paced, thought-provoking read. I decided to change my rating to 4, as I think both Freakonomics and Thinking, Fast and Slow mentioned above are relatively better reads.

  30. 4 out of 5

    Anton

    Delightful, very engaging read on modern takes on data analysis. Fans of Levitt and Pinker I am sure will enjoy. Hardly any 'cons' to flag up... but it is a bit on the short side and overwhelmingly US focused. Still very clever and thought-provoking. Overall: definitely worth your time.
