Google searches mined to uncover our true opinions


Every time you ask Google to help solve your personal problems, you are taking part in one of the largest social science experiments ever conducted.


Google Trends is a web tool for tracking what terms Google users are searching for in a given period. The anonymised, aggregate data is compiled from 100 billion searches every month, and can be broken down by region. Google itself has published an annual Zeitgeist of the most popular searches worldwide since 2000 (2012's included "Whitney Houston", "Gangnam Style" and "Hurricane Sandy").


"It's a really powerful way to measure the pulse of what is capturing people's interest," says Roya Soleimani, a member of the Google Trends team.


Since 2008, Google Flu Trends has followed the spread of annual virus outbreaks, based on the assumption that wherever people are Googling "flu symptoms", an outbreak is imminent.


Searches don't lie


Now, researchers are turning the Google lens on more difficult issues, some of which are invisible to conventional polls, says economist Seth Stephens-Davidowitz. "There are very important questions where the existing data sources may be leading to misleading conclusions," he says. "People might lie or misremember."


While a PhD student at Harvard University, Stephens-Davidowitz, now an intern at Google, became interested in the potential of Google search data to gauge users' views. People are more likely to be honest in searches than in polls, he reasoned: people Google the word "porn" far more than most ever let on, for example.


He analysed Google searches and voting patterns in the US to measure the extent to which racism hurt Barack Obama in the 2008 presidential election. He ranked states based on the proportion of searches that contained the word "nigger" between 2004 and 2007. The states with the highest rates of these searches were ones where Obama underperformed in the election, such as Ohio and West Virginia. He concluded that racism alone cost Obama 3 to 5 per cent of the total vote.


"This is about twice as high as is found by most surveys, presumably because many people do not want to admit this motivation," he says.


He has also used Google to test the notion that child abuse rates went down during the recent recession, a trend hinted at by declines in incidents reported to the authorities. Searches for terms like "My dad hit me", or "child abuse signs" from concerned adults, actually went up in regions where unemployment was higher or where social services budgets had been cut.


A study published last month focused on public interest in the environment based on searches for terms like "extinction", "endangered species" and "climate change" between 2001 and 2009. Only "climate change" saw an uptick in searches, which the researchers interpreted as a loss of engagement in other environmental issues.


Chris Scheitle of the College of Saint Benedict in St Joseph, Minnesota, found that states with the highest number of searches for "creationism" were also those with the most stringent laws restricting the teaching of evolution in public schools. "It can help understand the influence of a certain religious group on public policy," he says.


Flu and the Oscars


These findings offer insights, but there is reason for caution. Search rates for "child abuse" can go up in response to one local news story, for instance, rather than many incidents. Searches for flu symptoms tend to correlate with searches for "Oscar nominations", because both the seasonal virus and interest in the Hollywood awards spike in January and February. The environmental attitudes study saw a sudden spike in searches for "extinction" in September 2007, the same month the action movie Resident Evil: Extinction came out.
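The flu and Oscars example can be made concrete with a toy calculation. The sketch below uses invented numbers, not real Google Trends data: two monthly series that both peak in January and February, with year-to-year shifts that are unrelated to each other. The profiles, the shifts and the helper names `pearson` and `deseasonalise` are all assumptions made for illustration.

```python
from math import sqrt

# Invented monthly search-interest profiles, January to December.
# Both peak in January/February; neither comes from real data.
FLU_PROFILE = [11, 10, 6, 3, 2, 2, 2, 2, 3, 4, 6, 9]
OSCAR_PROFILE = [10, 11, 4, 1, 1, 1, 1, 1, 1, 1, 1, 3]

# Invented year-level shifts (a strong or weak flu season, a more or
# less talked-about Oscars), chosen to be unrelated to each other.
FLU_YEAR_SHIFT = [1, -1, 1, -1]
OSCAR_YEAR_SHIFT = [1, 1, -1, -1]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def deseasonalise(series, period=12):
    """Subtract each calendar month's average across all years."""
    month_means = [
        sum(series[m::period]) / len(series[m::period]) for m in range(period)
    ]
    return [value - month_means[i % period] for i, value in enumerate(series)]


# Four years of monthly values: the seasonal profile plus that year's shift.
flu = [p + shift for shift in FLU_YEAR_SHIFT for p in FLU_PROFILE]
oscars = [p + shift for shift in OSCAR_YEAR_SHIFT for p in OSCAR_PROFILE]

raw_r = pearson(flu, oscars)
adjusted_r = pearson(deseasonalise(flu), deseasonalise(oscars))

print(f"raw correlation:            {raw_r:.2f}")
print(f"after removing seasonality: {adjusted_r:.2f}")
```

In this toy setup the raw series look strongly related, but once the shared winter peak is subtracted the relationship vanishes. Real search data would be noisier, and removing monthly averages is only one of several ways to handle seasonality.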


And sometimes a seeming correlation disappears, says computer scientist Keith Winstein of the Massachusetts Institute of Technology. Google Flu Trends marched in step with official data from the US Centers for Disease Control and Prevention between 2009 and 2012, and was hailed as a success. This year, however, it dramatically over-reported flu rates. No one knows why.


If there is no other source of data to check against, you might never know if the tool is failing, says Winstein. "The world is very new at this," he says. "It is provocative, but we don't know how to do it well yet."
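Winstein's point about needing something to check against can be sketched as a simple comparison. The figures below are invented, and the function names and the 25 per cent tolerance are assumptions; this is not how Google Flu Trends or the CDC evaluate their numbers.

```python
def relative_error(estimate, official):
    """Signed error of an estimate as a fraction of the official figure."""
    return (estimate - official) / official


def flag_divergence(estimates, officials, tolerance=0.25):
    """Indices of periods where the estimate is off by more than `tolerance`."""
    return [
        i
        for i, (est, off) in enumerate(zip(estimates, officials))
        if abs(relative_error(est, off)) > tolerance
    ]


# Invented weekly flu rates (per cent of doctor visits), for illustration only.
official_rates = [2.0, 3.0, 4.0, 5.0]
search_based_rates = [2.1, 3.2, 6.0, 9.5]

flagged_weeks = flag_divergence(search_based_rates, official_rates)
print("weeks where the search-based estimate diverges:", flagged_weeks)
```

Without the `official_rates` column there would be nothing to compare against, and the over-estimates in the later weeks would go unnoticed.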


If you would like to reuse any content from New Scientist, either in print or online, please contact the syndication department first for permission. New Scientist does not own rights to photos, but there are a variety of licensing options available for use of articles and graphics we own the copyright to.


