Our newest paper suggests an intriguing possibility: We may be able to predict a stroke event by observing people’s activity on a search engine.
What does the evidence show?
We started with a group of anonymous Bing users who, at some point, indicated in their queries that they had undergone a stroke. We kept only those users who were active almost every day, then became inactive for between one and several days, and indicated their stroke just after that inactivity period. We hypothesize that the inactivity was due to their stroke, which happened just after they disappeared. We then tried to separate these users from other users, some of whom were of similar ages and others who indicated having other medical conditions.
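The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the input format (a list of days on which a user issued queries), and every threshold are hypothetical and are not taken from the paper.

```python
from datetime import date, timedelta

def qualifies_for_cohort(active_days, stroke_mention_day,
                         baseline_days=28, min_active_fraction=0.9,
                         max_gap_days=7, max_days_to_mention=3):
    """Check for the pattern: near-daily activity, a short gap, then a stroke mention.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    # Days with activity up to the mention; the mention itself counts as activity.
    days = sorted({d for d in active_days if d <= stroke_mention_day}
                  | {stroke_mention_day})

    # Find the most recent run of inactive days before the mention.
    gap = None
    for prev, nxt in zip(days, days[1:]):
        missing = (nxt - prev).days - 1
        if missing >= 1:
            gap = (prev, nxt, missing)
    if gap is None:
        return False  # no inactivity period at all

    last_before, first_after, missing = gap
    if missing > max_gap_days:
        return False  # gap too long to count as "one to several days"
    if (stroke_mention_day - first_after).days > max_days_to_mention:
        return False  # the mention did not come just after the gap

    # Require near-daily activity in the window that ends right before the gap.
    window = {last_before - timedelta(days=i) for i in range(baseline_days)}
    active_fraction = len(window & set(days)) / baseline_days
    return active_fraction >= min_active_fraction
```

For example, a user who queried every day from January 1 to January 28, vanished for three days, and mentioned a stroke on returning would pass this filter, while a user with no gap, or with only sporadic activity beforehand, would not.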
To separate the users, we represented them through a variety of attributes such as the time of day of their queries and the time since their previous session, but more importantly, through attributes that have previously been linked to cognitive decline, such as the complexity of queries, the deepest link that was clicked, and more.
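To make this kind of representation concrete, here is a minimal sketch of turning one user's query log into a handful of numbers. Everything in it is an assumption for illustration: the record fields (`time`, `text`, `deepest_click_rank`), the use of word count as a stand-in for query complexity, and the use of gaps between consecutive queries as a stand-in for time between sessions. The paper's actual attributes may be defined differently.

```python
from datetime import datetime
from statistics import mean

def user_features(queries):
    """Summarize one user's queries as a small attribute dictionary.

    `queries` is a non-empty list of dicts with hypothetical keys:
      'time' (datetime), 'text' (str), 'deepest_click_rank' (int or None).
    """
    queries = sorted(queries, key=lambda q: q["time"])

    # Hour of day of each query (a plain mean ignores wrap-around at midnight).
    hours = [q["time"].hour for q in queries]

    # Hours elapsed between consecutive queries, a rough proxy for session gaps.
    gaps = [(b["time"] - a["time"]).total_seconds() / 3600
            for a, b in zip(queries, queries[1:])]

    # Word count as a crude proxy for query complexity.
    words = [len(q["text"].split()) for q in queries]

    # Rank of the deepest result clicked, where a click was recorded.
    clicks = [q["deepest_click_rank"] for q in queries
              if q["deepest_click_rank"] is not None]

    return {
        "mean_hour_of_day": mean(hours),
        "mean_hours_since_previous": mean(gaps) if gaps else 0.0,
        "mean_words_per_query": mean(words),
        "mean_deepest_click_rank": mean(clicks) if clicks else 0.0,
    }
```

A dictionary like this, computed per user, is the sort of input a standard classifier could use to try to tell the populations apart.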
We found that it was quite easy to separate these populations of users. Of course, there may have been other differences between these populations, even though we took care to select the comparison groups in the same way that we chose the stroke population. However, we did find that people with cardiovascular diseases were harder to differentiate from the stroke population than people with other conditions.
We also applied our model to data that was collected a year later. Here we didn’t have many people who indicated a stroke, so we used a weaker label: the number of times each user expressed interest in stroke. This indicator has been used in the past to find people who are suffering from cancer. The model successfully found those people who were interested in stroke, just by looking at the metadata of their queries, through attributes such as those described above.
Predicting when a stroke will occur
It seems that it’s possible to differentiate populations of users who will undergo stroke from others. Can we also localize the stroke in time? That is, can we predict if a stroke will occur within the next few days?
The results here are not as strong, but they do indicate the possibility of localizing the stroke event. According to our findings, in the 3-4 months prior to the stroke event something begins to change, and people’s attributes become more similar to those of people who will undergo a stroke. This could be because of microstrokes or other cardiovascular events.
In summary, our findings raise the intriguing possibility that cognitive changes occur some time before a stroke happens, and that these changes can be identified through people’s interactions with search engines. If this is true, the upshot could be dramatic: We may be able to prevent strokes by analyzing people’s queries and, if they indicate a possible future event, having doctors prescribe simple medications such as aspirin. As our medical partners (Prof. Stern and Dr. Shaklai) said, there’s a lot to do before a stroke and not a lot after it.
However, all our data is derived from the queries of people who indicated their health conditions. We don’t have their medical records. Therefore, we’re now trying to set up a clinical trial that will collect both query data and medical records from people and validate our hypothesis.