Antisemitism and Islamophobia in the United States

In the USA recently, attempts to address antisemitism are often linked with those to address Islamophobia. Examples include those of the White House, Harvard, and Columbia University, to name but a few. The question is, why do the two appear together, and why these two and not hate against other religious groups?

To be sure, both antisemitism and Islamophobia are problems in the United States (and obviously elsewhere, as I’ve written about in the past). But why the sudden need to mention both whenever antisemitism is mentioned (and not, for example, vice versa or other religious groups)? Is Islamophobia such a significant issue compared to antisemitism?

The FBI’s Crime Data Explorer provides an analysis of reported hate crimes by who they were directed at. Here’s a snapshot of the latest data, that of 2022.

FBI hate crime statistics by the group they were perpetrated against, 2022 data.

There were 1124 anti-Jewish (antisemitic) crimes, the second largest category after anti-Black hate crimes. Anti-Islamic (Islamophobic) crimes are in 15th place with 158 crimes. Pretty bad, but only one seventh of the number of antisemitic crimes. If one adds anti-Catholic, anti-Protestant and other anti-Christian crimes we find that there were 375 such crimes, more than Islamophobic crimes, but POTUS didn’t mention anti-Christian hate in their publication.

Of course, both Islam and Judaism are minority religions in the USA, so perhaps it isn’t fair to compare them to Christianity. According to Wikipedia there are 7.15 million Jews and 3.45 million Muslims in the USA. If we normalize the data per-capita, there are still 3.5 times more antisemitic crimes than Islamophobic crimes reported to the FBI. However, on a per-capita basis, anti-Sikh crimes (currently ranked 14th in the number of instances) are much worse than both. Note, however, that there are wildly different estimates for the number of people according to religious affiliations: The US Religion Census claims 4.45 million adherent Muslims and 2.07 million adherent Jews. If we use these estimates there are 15 times more crimes against Jews than against Muslims on a per-capita basis!
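The per-capita comparison above is simple arithmetic, and a short sketch makes it easy to redo with other population estimates. The counts and populations below are the ones quoted in the text; the dictionary and function names are only illustrative.

```python
# Reported hate-crime counts quoted in the post (FBI, 2022 data).
CRIMES = {"anti-Jewish": 1124, "anti-Islamic": 158}

# Population estimates in millions, as quoted in the post.
POPULATION = {
    "Wikipedia": {"anti-Jewish": 7.15, "anti-Islamic": 3.45},
    "US Religion Census": {"anti-Jewish": 2.07, "anti-Islamic": 4.45},
}


def per_capita_ratio(crimes, population):
    """Ratio of anti-Jewish to anti-Islamic crimes per million group members."""
    jewish_rate = crimes["anti-Jewish"] / population["anti-Jewish"]
    muslim_rate = crimes["anti-Islamic"] / population["anti-Islamic"]
    return jewish_rate / muslim_rate


# Ratio of raw counts, before any normalization.
raw_ratio = CRIMES["anti-Jewish"] / CRIMES["anti-Islamic"]

# Per-capita ratio under each population estimate.
ratios = {source: per_capita_ratio(CRIMES, pop) for source, pop in POPULATION.items()}

print(f"raw ratio: {raw_ratio:.2f}")
for source, ratio in ratios.items():
    print(f"{source}: {ratio:.2f}")
```

With these inputs the per-capita ratios come out close to the rounded figures quoted above, and the spread between the two population sources shows how sensitive the comparison is to the denominator.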

Unfortunately, the FBI’s data doesn’t cover 2023. Some organizations such as the Anti-Defamation League have reported a 5-fold increase in antisemitic crimes (see CNN’s coverage, and when you do, look for the graphs worthy of “How to lie with statistics”). So what do Google searches tell us?

Looking at searches for the topic of antisemitism and Islamophobia since August 2023, we see a huge jump in the former, whereas the latter is almost at zero. The jump begins almost immediately after Hamas’ attack on Israel on October 7th.

Searches for the topics “Antisemitism” and “Islamophobia” in the USA for the 6 months starting August 7th, 2023

However, we don’t know how Google groups queries into topics. Therefore, I also looked at queries which begin with “Why are Jews” and “Why are Muslims” which, in the past at least, were associated with hate. Here the rise in Islamophobia is greater than we saw in the previous graph, but it’s still small. The total volume (as measured by the area under the graph) of antisemitic searches is 2.9 times that of Islamophobic searches. On a per-capita basis, that is still 1.4 times greater (using Wikipedia’s population estimates) or 6.3 times greater (using the Religion Census estimates).
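For anyone who wants to repeat the "area under the graph" comparison, here is a minimal sketch. It assumes equally spaced samples and uses the trapezoidal rule, which may not be exactly how the figure above was computed. The two weekly series are hypothetical placeholders, not the Google Trends data; only the 2.9 ratio and the population figures come from the text.

```python
def area_under_curve(values, step=1.0):
    """Trapezoidal-rule area for equally spaced samples."""
    if len(values) < 2:
        return 0.0
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


def per_capita_adjusted(ratio, pop_a, pop_b):
    """Turn a raw A-to-B volume ratio into a per-capita A-to-B ratio."""
    return ratio * pop_b / pop_a


# HYPOTHETICAL weekly search-interest values, for illustration only.
why_are_jews = [5, 6, 5, 40, 100, 70, 45, 30]
why_are_muslims = [4, 4, 5, 20, 30, 22, 15, 10]
example_ratio = area_under_curve(why_are_jews) / area_under_curve(why_are_muslims)

# Per-capita adjustment of the 2.9 ratio quoted in the post
# (populations in millions: Jews first, Muslims second).
wikipedia_adjusted = per_capita_adjusted(2.9, 7.15, 3.45)
census_adjusted = per_capita_adjusted(2.9, 2.07, 4.45)

print(f"example raw ratio: {example_ratio:.2f}")
print(f"Wikipedia estimates: {wikipedia_adjusted:.2f}")
print(f"Religion Census estimates: {census_adjusted:.2f}")
```

Small differences from the quoted per-capita figures are to be expected, because the 2.9 used here is itself a rounded number.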

Perhaps the only optimistic observation from this graph is that the jump was related to Hamas’ October 7th attack and both Islamophobia and antisemitism are going down fairly quickly.

Searches for the queries beginning “why are Jews” and “why are Muslims” in the USA for the 6 months starting August 7th, 2023

Going back to my original question, why do antisemitism and Islamophobia appear together, and why these two? It seems I don’t have good data to answer the question.  

Hamas’ attack on Israel through the lens of Google queries

On October 7th, 2023, Hamas attacked Israel from the Gaza Strip, which it has been governing since 2007. The details of the attack are horrific, and I won’t describe them here. Suffice it to say that they are in line with the acts of the Nazi regime and ISIS.

We know from the past that Google search volume tells an interesting story about major world events, so here are a few graphs about that attack, through the lens of Google Search Trends.

First, the attack has caused major trauma to Israelis, as visible in this graph of search volume for anxiety and for depression. The latter serves as a comparator. As the graph shows, searches for anxiety spiked on October 7th and were high for around a month after then. As somewhat expected, searches for depression were attenuated for the same period, which is perhaps similar to an effect we saw at the start of the COVID19 pandemic.

Normalized Google search volume for anxiety (orange) and depression (blue) in Israel over time. The dotted line marks the date of the attack.

In Israel, interest in news spiked at the start of the war but has since gone down significantly. Similarly, queries about Hamas and the Gaza Strip spiked and then went down, though not to baseline (yet).

Normalized Google search volume for news in Israel over time. The dotted line marks the date of the attack.

Normalized Google search volume for Hamas (orange) and the Gaza Strip (blue) in Israel over time. The dotted line marks the date of the attack.

Worldwide, the trend is as expected from the literature, which shows a half-life of a few days for major news events.

Normalized Google search volume for Hamas in Israel (orange) and the world (blue) over time. The dotted line marks the date of the attack.
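The half-life mentioned above can be estimated from a search-volume series in a few lines. This is a sketch under the assumption that interest decays roughly exponentially after the peak, so that the logarithm of the volume falls on a straight line. The series is synthetic, built to halve every three days, and is not the data in the graph.

```python
import math


def estimate_half_life(days, volumes):
    """Least-squares fit of log(volume) = a + b * day; half-life is ln(2) / -b."""
    logs = [math.log(v) for v in volumes]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("volume is not decaying, so a half-life is undefined")
    return math.log(2) / -slope


# SYNTHETIC post-peak values that halve every 3 days (illustration only).
days = list(range(10))
volumes = [100 * 0.5 ** (d / 3) for d in days]

half_life = estimate_half_life(days, volumes)
print(f"estimated half-life: {half_life:.2f} days")
```

On real Trends data one would fit only the days after the peak and drop zero values, since the logarithm of zero is undefined.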

What about effects of this war on the world at large? Here’s interest in the phrase “Free Palestine”. Note how it is almost nonexistent until October 9th but then, just two days after Hamas’ atrocities, it spikes. Somewhat similar to interest in Hamas it’s decaying quickly, but is not yet back to baseline.

Normalized Google search volume for “Free Palestine” in the world over time. The dotted line marks the date of the attack.

Perhaps it’s more about fashion than about substance? Let’s check the USA. Here we see a strong political effect in these queries, i.e., more Democratic states are more likely to be interested in Free Palestine. The voting share for Biden-Harris in the 2020 Presidential elections explains almost 70% of the variance. This is interesting because, of course, President Biden is a Democrat and he took a strong pro-Israel stance in this war.

Normalized Google search volume for “Free Palestine” per US state as a function of the voting share for Biden in the 2020 elections in that state. DC is excluded (though it falls nicely on the line). The dotted line is a linear regression line with R2=0.69.
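"Explains almost 70% of the variance" refers to the R2 of a straight-line fit. The sketch below shows one way to compute the fit and its R2 with ordinary least squares; the five data points are hypothetical and stand in for the per-state values, which are not reproduced here.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys).

    Returns (slope, intercept, r_squared), where r_squared is the share
    of the variance in ys explained by the line.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# HYPOTHETICAL per-state values: Biden vote share (%) and search interest.
biden_share = [30, 40, 50, 60, 70]
free_palestine_interest = [20, 35, 30, 55, 60]

slope, intercept, r_squared = linear_fit(biden_share, free_palestine_interest)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R2={r_squared:.2f}")
```

For a single predictor, R2 is simply the square of the Pearson correlation, which is what the last line of the function computes.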

As Greta Thunberg noticed, Free Palestine is highly correlated with another political issue – climate change. The correlation explains more than 60% of the variance! The outliers on the top (meaning, more interest in climate change than expected according to the interest in Free Palestine) are mostly states that are likely to be affected by climate change such as Hawaii, Vermont and Alaska. Outliers on the other side are (in my opinion) Democratic states with large universities, but this warrants more careful research.

Normalized Google search volume for “Free Palestine” per US state as a function of normalized search volume for Climate Change. The dotted line is a linear regression line with R2=0.62.

My prediction is that interest in Hamas and this war will soon wane as the world moves to the next crisis. In Israel, expect a spike in pregnancy queries in a few months.

Antisemitism: (Almost) everyone has their favorite reason

“Oh, the Protestants hate the Catholics,

And the Catholics hate the Protestants,

And the Hindus hate the Moslems,

And everybody hates the Jews.”

(Tom Lehrer, National Brotherhood Week)

Today, something a little different and not entirely related to health: Antisemitism. It’s not entirely divorced from health either, as the bones of my forefathers, scattered from Spain to Poland will testify, but thankfully these days the physical aspects of antisemitism are on a somewhat less grandiose scale than in previous generations. I used Google Trends to see which Jewish conspiracies were searched in different countries. Unfortunately, it isn’t always easy to capture an entire topic with a single query, so I couldn’t encompass all the hate around this issue, but here is the volume of queries since 2004 for several antisemitic tropes:

The Protocols of the Elders of Zion
Zionist Occupation Government
Holocaust denial
Jewish lobby
Jewish bankers
Jewish Bolshevism
Prevalence of queries for common Jewish conspiracy theories

It’s interesting to see that each conspiracy has its own fan base, though some countries (perhaps owing to Internet penetration and population size) are represented in the maps of more than one conspiracy. It’s also notable that there is little correlation between the size of the Jewish population in a country and the volume of antisemitic searches therein: Pakistan, Norway, etc., have tiny Jewish populations (if any) and are not neighbors of Israel, and yet too many people in those countries have a favorite Jewish conspiracy theory.

How prevalent are those “theories”? It’s difficult to say with confidence. Querying for something doesn’t mean that a person believes in it, only that they are interested in the topic. Google Trends data has a few other drawbacks. Nevertheless, anecdotally, Jewish conspiracy theories seem to have volume (worldwide) of the same order of magnitude as common anti-Muslim theories. However, there are around 100 times as many Muslims as there are Jews.

Is there a lesson here? I don’t know. Perhaps it’s just that antisemitism is too common and that it manifests itself in a variety of ways. Perhaps it’s another demonstration that online activity offers a window into all aspects of human behavior, even the less savory ones.

A warning for internet platforms

China has, once again, instructed Bing to turn off the autosuggest feature of the search engine. The reason given by China’s State Information Office is, to quote from The Register’s article, that “Bad use of algorithms affects the normal communication order, market order and social order, posing challenges to maintaining ideological security, social fairness and justice and the legitimate rights and interests of netizens.”

I don’t know the details of why the Chinese government asked to remove autosuggest, nor whether and why Bing complied, but it seems to me that there is a lesson here for search engine operators and for people interested in algorithmic fairness.

Search engines are perhaps the most widely used internet service. They’ve replaced libraries for many of the information acquisition tasks we perform. When Google started, its stated mission was “to organize the world’s information and make it universally accessible and useful.” This implies that the results it provides reflect the world’s information. Indeed, many writers (e.g., this one in The Atlantic) wrote about the idea that search engines are automatic and reflect the knowledge available in the world. More recently, Google’s CEO said in testimony to the US Congress that “We use a robust methodology to reflect what is being said about any given topic at any particular time. It is in our interest to make sure we reflect what’s happening out there in the best objective manner possible. I can commit to you and I can assure you, we do it without regards to political ideology. Our algorithms do it with no notion of political sentiment.”

Unfortunately, as anyone in this business knows, a lot of manual work goes into an automated search engine. That manual work is done by people who have opinions, as do their managers. These people’s opinions can affect the results, and there is currently quite a lot of evidence that results don’t reflect the world’s knowledge anymore. Instead, they reflect the world as some people would like it to be.

I could provide many examples that seem to have this bias, but let’s take one of my favorites: Consider the search results for the innocuous query “Renaissance Europe art” below. Before you do, think to yourself what art you think should be shown. Botticelli? The Mona Lisa? The Sistine Chapel?

Now click on the spoiler below to see a screenshot of the Bing results for this query.

Click to see the Bing results

Notice the preference for paintings of particular people?

It seems to me that governments have taken notice of the fact that “algorithmic results” are no longer algorithmic (and in fact, they probably never were entirely algorithmic). If results are human-generated, they say, why shouldn’t we, the representatives of the people, decide what the results should be? Why should workers at internet platforms who may have specific views of the world get to decide that these are the “right” views?

This is a logical argument, though the devil, as they say, is in the details. If a government takes a heavy hand and decides to censor views it doesn’t like, what happens to how people learn about the world? There are parallels between this problem and that of book banning at public libraries (see, for example, this overview), especially now that search engines have replaced libraries.

It is hard to say whether this situation could have been averted, and perhaps this Pandora’s box has already been opened. But I do wonder whether a little more modesty in changing algorithmic results would have kept us from ending up where we are today.

I work for Microsoft, which operates Bing. The views in this post (as indeed, the entire blog) are my own and not those of my employer. I do not have any inside information on which queries undergo manual editing.

The cost of data privacy

The other day I was talking to a friend of mine, a senior medical doctor at a research hospital. We were discussing clinical trials and how the recent staff shortages in the US made it difficult to start new clinical trials there. He mentioned off hand that clinical trials have been difficult to do in Europe for a few years now because of GDPR.

The European Union’s General Data Protection Regulation, or GDPR, is a regulation on data protection and privacy. It provides people with rights related to their data including, for example, the right to ask companies for data they collect about an individual (Article 15). GDPR is implemented in countries of the European Union, members of the European Economic Area and other countries which chose to implement it. The latter group includes Andorra, Argentina, Canada (only commercial organizations), Faroe Islands, Guernsey, Israel, Isle of Man, Jersey, New Zealand, Switzerland, Uruguay, Japan, the United Kingdom and South Korea.

The flip side of GDPR is that, for both companies and other organizations, it’s much harder to collect and process data. This may be a good idea when we’re thinking about companies which use these data to sell us more stuff, but it may be that these regulations have a less than beneficial effect for medical science. I wanted to see if there’s evidence for the latter.

In recent years medical researchers have begun registering their clinical trials on the US government’s ClinicalTrials.gov website. This can help patients find relevant trials, improve recruitment, and also reduce the likelihood of cheating (see Ben Goldacre’s wonderful talk on this subject). I took these data and extracted from them the country where each clinical trial is held (some clinical trials are held in multiple countries and I accounted for those) and the date at which it was first registered.

The figure below shows the number of clinical trials registered each month between January 2010 and July 2021. I divided the countries where the trials were held into three groups: the United States, countries where GDPR was implemented, and all other countries of the world.

Number of new clinical trials per month in the United States (top), GDPR-implementing countries (middle) and other countries (bottom). Light colors are before May 2018 and dark ones after it. Dotted lines are linear fits to the curves, with the slopes and fit shown below them.

In the graph the light colors are the number of clinical trials per month prior to the implementation of GDPR and the dark colors are the same numbers after it. I’ve also fit linear regression curves to each of these. As one can see, the number of clinical trials up to May 2018, when GDPR was implemented, rises slowly. Interestingly, it rises more slowly in the US than in the other two groups.

After May 2018 the rise in the number of clinical trials in the US and in countries where GDPR was implemented abruptly stops and flattens. However, in countries which did not implement GDPR (and are not the US), the pace of growth rises dramatically and accounts for the expected growth in both this group of countries and most of what we would have expected in the previous two groups. It seems as though GDPR put a brake on clinical trials in countries where it was implemented, as well as in the United States.

Which countries benefited from this move from the US and GDPR-implementing countries? To test this, I computed, for each country, the fraction of its clinical trials in the registry that were registered after the implementation of GDPR. I only looked at countries which had at least 500 clinical trials in the data. The 5 countries which had the largest fraction of trials post-GDPR are Pakistan, Egypt, Turkey, Indonesia, and China. Unfortunately, these countries are not bastions of human rights. According to Freedom House they are judged either “Not free” or “Partly free”.
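The per-country computation can be sketched as follows. This is an illustration under stated assumptions, not the code used for the post: the input format (a list of countries plus a registration date per trial) is made up, a multi-country trial is counted once for each of its countries, and the cutoff is taken as 25 May 2018, the date GDPR became applicable. The tiny sample uses a threshold of 2 instead of 500 so that it produces output.

```python
from collections import defaultdict
from datetime import date

GDPR_START = date(2018, 5, 25)  # assumed cutoff: the day GDPR became applicable
MIN_TRIALS = 500                # threshold used in the post


def post_gdpr_fraction(trials, cutoff=GDPR_START, min_trials=MIN_TRIALS):
    """Fraction of each country's trials first registered on or after the cutoff.

    `trials` is an iterable of (countries, first_registered) pairs.
    Countries with fewer than `min_trials` trials are dropped.
    """
    total = defaultdict(int)
    after = defaultdict(int)
    for countries, registered in trials:
        for country in set(countries):  # count a trial once per country
            total[country] += 1
            if registered >= cutoff:
                after[country] += 1
    return {c: after[c] / n for c, n in total.items() if n >= min_trials}


# HYPOTHETICAL sample records, for illustration only.
sample_trials = [
    (["A", "B"], date(2017, 1, 10)),
    (["A"], date(2019, 3, 5)),
    (["B"], date(2016, 7, 1)),
    (["C"], date(2020, 2, 2)),
]

result = post_gdpr_fraction(sample_trials, min_trials=2)
for country, fraction in sorted(result.items(), key=lambda kv: -kv[1]):
    print(country, f"{fraction:.2f}")
```

Sorting the result in descending order gives the ranking from which a "top 5" list like the one above can be read off.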

Thus, it seems that one of the negative aspects of GDPR was the movement of clinical trials from countries which implemented it to those which did not. Whether this is a price worth paying is a personal judgment. To me, it seems that GDPR must be changed so that studies which improve the lives of people should be able to continue even at minimal cost to data privacy.

The current state of things reminds me of a story, possibly apocryphal, told to me by a lecturer during my graduate studies: A colleague of my lecturer who was a pain researcher from one of the industrialized countries took his sabbatical in Libya. This was, I think, in the late 1980s. My lecturer said that he asked the researcher, “why Libya?”. The reply was “it’s easier to do work there”…

Let’s not have GDPR cause medical research to move to countries which don’t take human rights seriously.

Caveat: I know there may be confounders that appeared at similar times. This isn’t a scientific paper, so take my explanations above with a grain of salt.

COVID19 vaccines and Ivermectin: The strange story of trust, politics, and media sensationalism

Over the past couple of years we’ve seen several unusual (to say the least) methods for treating COVID19, ranging from anti-malaria drugs to Yoga. Some of us may recall bleach as another idea suggested by then-president Trump, but he didn’t actually suggest it.

One of the more recent ideas was to repurpose an anti-parasitic medication, Ivermectin, to treat COVID19. This drug is licensed for use in both humans and livestock, leading to the derogatory “cow dewormer” moniker. The evidence for effectiveness of this drug came initially from lab studies, but doses were far greater than approved for human use.

Several randomized controlled trials followed, with the most recent meta-analysis finding an interesting outcome: Studies in some countries outside the US found the drug to be effective, while those conducted in the US did not. It may be that in countries where parasitic infections are common, treating these infections helps people defeat COVID19, but the drug doesn’t help those who don’t have such infections.

Nevertheless, some media channels and politicians recommended using the drug, and if you believe recent media stories, many people decided to use ivermectin rather than choose the more effective solution and vaccinate against COVID19. It seems that overdoses of the drug became more common.

However, I wanted to see how many people were really interested in ivermectin, compared to the vaccine.

As usual, I looked at Google Trends data (at the state level) for Ivermectin, Hydroxychloroquine, and the COVID19 vaccine. The volume of searches for ivermectin is negatively correlated with interest in the vaccine during 2021. However, there is no such relationship for hydroxychloroquine. In the graphs below the axes are search volumes.

Interest in ivermectin (horizontal axis) and the COVID19 vaccine (vertical axis) in different states, as measured through Google Trends search volume. The line is a linear regression curve.
Interest in hydroxychloroquine (horizontal axis) and the COVID19 vaccine (vertical axis) in different states, as measured through Google Trends search volume. The line is a linear regression curve.

Second, interest in ivermectin is small compared to interest in the COVID19 vaccine, even in the states where it had the highest search volume. Below are figures for the entire USA and for Oklahoma.

Google Trends search volume in the USA for ivermectin and for the COVID19 vaccine.
Google Trends search volume in Oklahoma for ivermectin and for the COVID19 vaccine.

I tried to see if the voting results for the presidential elections in 2016, 2020 and the current governor of each state were a predictor of the search volume for the vaccine or for ivermectin. The most predictive factor for the 2016 election results is interest in vaccination. The accuracy of the prediction is very high (Area Under the Receiver Operating Characteristic Curve, or AUC, of 0.91), meaning that more interest in the vaccine correlated with voting for a Democrat in the 2016 elections. Outcomes of the 2020 elections are much harder to predict using interest in the vaccine (AUC=0.64).

Interest in hydroxychloroquine doesn’t predict election results, but search volume for ivermectin, and even better the ratio of search volume for ivermectin to the volume for vaccine predicts the 2016 election results (AUC=0.86). Here higher ratios of ivermectin to vaccine searches predict a vote for Trump.
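For readers unfamiliar with AUC, here is a minimal sketch of how it can be computed for a single predictor. It uses the rank interpretation: AUC is the probability that a randomly chosen positive example (here, a state that voted for Trump) gets a higher score than a randomly chosen negative one, with ties counted as one half. The scores and labels are hypothetical, and the numbers in the post may well have come from a fitted model rather than a raw ratio.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# HYPOTHETICAL per-state values: ratio of ivermectin to vaccine search volume,
# and whether the state voted for Trump in 2016 (1) or not (0).
ivermectin_to_vaccine = [0.9, 0.8, 0.35, 0.6, 0.3, 0.2]
voted_trump_2016 = [1, 1, 1, 0, 0, 0]

example_auc = auc(ivermectin_to_vaccine, voted_trump_2016)
print(f"AUC = {example_auc:.2f}")
```

An AUC of 0.5 means the predictor is no better than chance, and 1.0 means it separates the two groups of states perfectly.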

What do all these findings show?

To me, the most interesting finding is that support for former President Trump is a strong predictor of interest in ivermectin over vaccines. This is somewhat similar to the findings of my previous blog post, and of a study, about Israeli politics.

As an aside, it seems to me that the ivermectin story was somewhat overblown by the media. Interest (as measured in search engine data) was much lower in actuality.

Politics, income and COVID19 vaccines in Israel

The rate of COVID19 vaccination is strongly correlated with party affiliation. Specifically, the Kaiser Family Foundation found that people in counties that voted for Biden during the last elections had significantly higher rates of COVID19 vaccination compared to those who voted for Trump.

I wondered if the same was true for Israel. So, I downloaded from the Israeli Health Ministry the vaccination rates at 253 towns and cities (as of August 10th), the voting data from our last elections from the Israeli Central Elections Committee and income data from the Israeli National Insurance Bureau. The last source also gave me the Gini index for each location.

I manually labelled the towns and cities as to whether they were predominantly Jewish or not. I also computed the percent of voters in each location who voted for the current coalition government.

Here are a few results. First, in predominantly Jewish towns and cities vaccination rates are strongly correlated with income, but even more strongly (and statistically significantly more so) with voting for the current government.

Vaccination rates as a function of income in Jewish cities
Vaccination rates as a function of the percentage of people who voted for the current coalition in Jewish cities

In predominantly non-Jewish cities the picture is more complicated. First, the correlation is much lower than the one we observed in Jewish cities. More interestingly, while income is still correlated with vaccination rates, voting for the current government is negatively correlated with vaccination rates.

Vaccination rates as a function of income in non-Jewish cities
Vaccination rates as a function of the percentage of people who voted for the current coalition in non-Jewish cities

A linear model of the data (with interactions) bears this out:

The model for Jewish towns reaches R2 of 0.67, which is extremely high. The statistically significant variables are vote for the coalition (positively correlated), Gini index (negatively correlated), and the interaction of income with the Gini index (positive) and with income (negative). Therefore, cities that voted for the government and had less inequality were more likely to vaccinate.

The model for non-Jewish towns reaches a lower R2 of 0.46. Here the statistically significant variables are vote for the coalition (negatively correlated) and the interaction of the Gini index with voting for the government (negatively correlated). This means that the most indicative variable for vaccination rate was not voting for the current government and, in cities that have more inequality and higher income this is even stronger.

My understanding from these results is that, in Israel as in the US, voting is correlated with vaccination rates. I don’t think, however, that one is causal of the other. Instead (at least in Israel) there is probably a third variable driving both. For example, the Arab party which joined the coalition is the Islamic party, whose voters tend to come from populations with lower income and that live in areas with less access to healthcare. In the Jewish population, one of the main blocs not part of the current government is the Ultra Orthodox, who are also less likely to vaccinate. They are also poorer than the general population.

The bottom line? Vaccination rates in Israel are correlated with political affiliation, but perhaps for different reasons than those in the US.

COVID19: How are we doing?

Note: The following is somewhat different from my usual blog posts because it doesn’t involve internet data. It’s my analysis of publicly available health data which I did to answer a question I had.

The current phase of the COVID19 pandemic is affected by several trends which are driving the pandemic in opposing directions. On the one hand, the vaccination rate is high in many developed countries. On the other, new strains such as the Delta strain are more infective and the vaccines are thought to be less effective against these strains (even though they are still highly effective!).

What is the overall trend?

Scotland may be a good area to examine this question. On the one hand, at the time of writing 54% of the population has received two vaccine doses (73% have received at least one). On the other, since mid-May 2021 the Delta strain has been the dominant strain in Scotland.

Here is a plot of four indicators (source) of the pandemic: Number of daily positive cases, hospital admissions, ICU admissions and deaths. They are smoothed using a 7-day moving average. 

Four indicators of the COVID19 pandemic in Scotland
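The 7-day smoothing mentioned above can be sketched as a trailing moving average. Whether the plot used a trailing or a centered window is not stated, so the trailing window here is an assumption, and the daily counts are hypothetical.

```python
def moving_average(values, window=7):
    """Trailing moving average; returns one value per full window."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the day that left the window
        if i >= window - 1:
            averages.append(running / window)
    return averages


# HYPOTHETICAL daily positive cases, with a weekly reporting dip.
daily_cases = [120, 130, 90, 60, 140, 150, 160, 170, 180, 110, 80, 190, 200, 210]

smoothed = moving_average(daily_cases, window=7)
print([round(v, 1) for v in smoothed])
```

A 7-day window is a natural choice for this kind of data because it averages out day-of-week reporting effects.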

On average, hospital and ICU admissions are best correlated with daily cases when those are taken 7 days later (that is, it takes around a week until a case is hospitalized), and another 7 days until deaths occur.
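One way to find such a lag is to shift one series against the other and keep the shift with the highest Pearson correlation. The sketch below does that on synthetic data in which admissions are constructed to follow cases by exactly 7 days; it illustrates the technique and is not the analysis run on the Scottish data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def best_lag(cases, outcome, max_lag=21):
    """Lag in days at which outcome[t + lag] correlates best with cases[t]."""
    best = None
    for lag in range(max_lag + 1):
        x = cases[: len(cases) - lag]  # cases on day t
        y = outcome[lag:]              # outcome on day t + lag
        r = pearson(x, y)
        if best is None or r > best[1]:
            best = (lag, r)
    return best


# SYNTHETIC series: admissions are 10% of the cases reported 7 days earlier.
cases = [100 + 50 * math.sin(t / 5) for t in range(60)]
admissions = [0.1 * cases[max(t - 7, 0)] for t in range(60)]

lag, correlation = best_lag(cases, admissions)
print(f"best lag: {lag} days (r = {correlation:.3f})")
```

On real data the peak is much less sharp, so it is worth looking at the correlation across all lags rather than only at the maximum.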

Therefore, I used the daily positive data to predict both hospital admissions and deaths at the appropriate lag (7 and 14 days). In both cases I used a non-linear model (second order polynomial to predict the quadratic root of the dependent variables) trained on data until the end of April 2021. The models had a good fit (R2=0.69 and 0.52, respectively).  

Here are the actual and predicted hospitalizations, compared to case numbers:

Daily positive cases, hospital admissions and predicted hospital admissions

As we can see, hospital admissions have been rising since mid-May, but not as fast as the prediction. We would expect around 170 people to be hospitalized at this point, but there are around 45. That’s around one quarter of the expected number.

A look at deaths is even more telling:

Daily positive cases, deaths and predicted deaths

Deaths have risen very slightly: We would have expected almost 40 per day at this stage, but are seeing around 2 (that’s one twentieth of the expected!).

My takeaway from this is that we will see a rise in hospitalizations and in deaths, but it will be much smaller than in previous waves of COVID19, especially in terms of deaths. The vaccines are providing significant protection against the worst aspects of COVID19.

Why does flu happen in winter? COVID19 could help us answer the question

There are reports of a Respiratory Syncytial Virus (RSV) outbreak in Israel. RSV is a virus which causes a flu-like illness and is especially dangerous for children. What’s strange about this outbreak is that it’s happening in early summer, whereas previously RSV outbreaks always happened in winter.

I was wondering if this is something special to RSV and to Israel or perhaps something bigger?

Luckily, a few years ago we looked at the association of search engine query volume and the incidence of RSV and found that it was quite high. Therefore, I extracted Google Trends data (using the Google Trends Anchor Bank toolbox) for RSV from the US, United Kingdom and Canada and plotted it below:

Query volume for RSV in Canada, United Kingdom and United States between May 2016 and June 2021. Note that Canada and the UK are on a different axis from the US.

However, starting from April 2021 there is a dramatic rise of RSV in the US and UK, but not in Canada. Thus, Israel is similar to US and UK, but Canada seems an outlier.

Is there something special about RSV?

Here are the time series for several other seasonal viruses in the US:

Query volume for RSV, Rotavirus, Norovirus and Common cold in the United States between May 2016 and June 2021. Note that common cold is on a different axis from that of the other viruses.

Here we see similar correspondence, except for two outliers: First, common cold queries happened in the winter of 2021, but to a lesser extent. Second, RSV is rising, but so is norovirus, which started earlier and may already be on its way down.

Here is another virus, Rabies, compared to RSV. Queries for rabies usually spike in summer, and in the summer of 2020 there was no spike. This year, however, they seem to be rising to normal levels. Note that it is unlikely that the search query volume for rabies represents rabies cases, as it does for RSV. Even though there is evidence for seasonality of rabies, in this case it probably reflects worry about rabies due to close contact with mammals.

Query volume for RSV and rabies in the United States between May 2016 and June 2021. Note that RSV is on the left axis and rabies on the right axis.

What’s happening here? Perhaps opening up for social gatherings in Israel, the US, and the UK has enabled RSV and other viruses to spike. We are looking into whether there is supporting evidence for this hypothesis.

These findings raise the interesting question of why RSV (and other viruses) occur in winter? Is it because of the colder weather which causes people to congregate indoors and perhaps constricts our airways? Is it because there is some level of immunity in the population which slowly decays over the year until, early in winter, it is low enough for an epidemic to begin?

COVID19 may allow us to resolve this question.

(Special thanks to Prof. Lev Muchnik for interesting discussions on this topic)

Should the president be impeached? It all depends who you ask

“The President, Vice President and all civil Officers of the United States, shall be removed from Office on Impeachment for, and Conviction of, Treason, Bribery, or other high Crimes and Misdemeanors.” Article II, Section 4, US Constitution

Professor Anat Rafaeli and I teach a course at the Technion which is intended to teach students with a background in psychology the tools of Data Science and specifically how to answer research questions from the social sciences with internet data. As part of the course, students choose a research question and answer it during the semester, using the tools we teach them.

A couple of years ago, while explaining their research question, one of the students (whose name I don’t have anymore, but will be happy to add him, if he’s reading this) showed us an intriguing chart: It displayed the search volume (from Google Trends) for the term “impeachment” for the period of a few months before and after President Trump was inaugurated. There were several spikes during that year from people searching for how to impeach the president. I didn’t find this particularly surprising given the news coming from the US.

What was surprising was a similar chart he showed us for the same period around President Obama’s inauguration. It showed similar spikes! I hadn’t heard of anyone who wanted to impeach Obama, so that spike was shocking to me.

Recently I repeated the exercise this time adding data from similar time periods around President Biden’s inauguration (Technical note: For the first two presidents I used the “impeachment” topic, while for Biden I used the term “impeach Biden”, to exclude searches related to Trump’s impeachment trial). You can see the results in the figure below.

Search volume for impeachment during the months before and after inauguration of presidents Obama (blue), Trump (orange) and Biden (gray).

As you can see, each president has people searching for how to impeach them, first around the time that they are elected and then around inauguration. After these two events “impeachment” spikes every so often (as you can see in the Obama and Trump spikes at the end of May). More broadly, here’s Obama’s entire second term:

Google Trends search volume for impeachment over the period of President Obama’s second term in office.

Who are these people, who are so eager to impeach their president?

We can try to answer this question by looking at the correlation between how each state voted in the recent Presidential elections and the percentage of people searching for impeachment of a president. The graphs below plot the percentage of people in each state who voted for Biden as president (i.e., roughly speaking, are Democrats) compared to the search volume for impeachment.

Percentage of Biden voters in each state (Democrats) versus search volume for impeachment of Obama (top), Trump (center) and Biden (bottom). Outliers are marked.

People who wanted to impeach Obama were mostly from Republican states, as shown by the negative correlation between search volume and the percentage of Democrats in the state. The opposite is true for people who searched for impeachment during Trump’s first months in office. With Biden the correlation is much worse, but the data is skewed by a single point which, if removed, leaves a correlation that is again reasonable (R2=0.13): For some reason, the mostly Democratic voters in Vermont are those searching more often for Biden’s impeachment.

What’s the bottom line? If you don’t know anyone who thinks your favorite president should be impeached, you just don’t know the right people.