Researchers are tapping into social media activity to help predict asthma outbreaks
When the thunderclouds blew pollen over Melbourne on 21 November 2016, causing a 10-fold increase in hospital presentations for asthma, it took some time for the authorities to recognise that they had a city-wide emergency on their hands.
But hours before the hospital deluge, people were reporting symptoms of wheezing, coughing and breathing difficulties on their public Twitter accounts.
Researchers at CSIRO’s Data61 are now using these types of tweets to demonstrate how social media might form part of an early-warning system for such rapidly developing asthma outbreaks.
In all, the team developed 18 algorithms to analyse the tweets. Three of the algorithms would have detected the thunderstorm asthma outbreak up to nine hours before the first official report.
Five of the algorithms could have raised the alert before the first news report.
“We do not expect that social media alone will be useful for epidemic intelligence,” said lead researcher Dr Aditya Joshi, a computational linguist and postdoctoral fellow at CSIRO’s Data61.
“Traditional forms of syndromic surveillance or epidemic intelligence are extremely valuable. However, this work shows that social media (which tends to be a real-time source of information) can be a useful and viable alternative, especially with respect to events, like the thunderstorms, which require a quick response.”
The research was completed with the assistance of Raina MacIntyre, a professor of global biosecurity at the Kirby Institute at UNSW, and Dr Cecile Paris, the chief scientist of Data61, along with other collaborators at Data61.
While tweets were a useful source of data, the task of building an alert system using only tweets was a complicated one, said Dr Joshi.
The first big problem is that of false alarms. No one wants an alert system that cries wolf all the time.
To solve this “alert swamping” issue, the researchers narrowed the dataset to each user’s first health-related tweet of the day, geo-restricted to Melbourne.
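The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the researchers' code: the `Tweet` record, the keyword list, the bounding box and every function name are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical keyword list; the article does not give the researchers' list.
HEALTH_WORDS = {"wheezing", "coughing", "breathe", "breath", "asthma"}

# Rough, illustrative bounding box around Melbourne: (south, west, north, east).
MELBOURNE_BBOX = (-38.5, 144.4, -37.4, 145.6)


@dataclass
class Tweet:
    user: str
    text: str
    time: datetime
    lat: float
    lon: float


def is_health_related(text):
    """Crude keyword match standing in for a real health-tweet classifier."""
    words = {w.strip(".,!?'\"").lower() for w in text.split()}
    return bool(words & HEALTH_WORDS)


def in_melbourne(tweet):
    south, west, north, east = MELBOURNE_BBOX
    return south <= tweet.lat <= north and west <= tweet.lon <= east


def first_health_tweet_per_user_per_day(tweets):
    """Keep only each user's first health-related Melbourne tweet of each day."""
    seen = set()  # (user, calendar date) pairs already counted
    kept = []
    for tweet in sorted(tweets, key=lambda t: t.time):
        if not (in_melbourne(tweet) and is_health_related(tweet.text)):
            continue
        key = (tweet.user, tweet.time.date())
        if key in seen:
            continue
        seen.add(key)
        kept.append(tweet)
    return kept
```

Counting each user at most once a day means a single person tweeting repeatedly about the same symptoms cannot, alone, push the daily count towards an alert.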
Another difficulty with tweets was that people often used health words as a figure of speech, Dr Joshi said.
For example, someone might tweet: “I saw the new trailer of this film. Oh my god, it’s awesome. I can’t breathe.” This is clearly not a description of physical symptoms.
The researchers separated these figurative tweets using a pre-trained algorithm that uses vectors to tell how closely connected two words are in terms of their meaning.
This model, called GloVe, is publicly available and was trained on a body of text containing 27 billion words.
Each word is represented as a point in 200-dimensional space, and the algorithm calculates the connectivity between two words by averaging out the vectors.
For the tweet, “I saw the new trailer of this film. Oh my god, it’s awesome. I can’t breathe,” the algorithm would detect that the words “trailer” and “film” were far away in meaning from a cluster of health-related words.
By comparison, the tweet: “There’s smoke today, and I can’t breathe”, contains content words that are mostly health-related so this would be classified as a personal health report by the algorithm.
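To make the idea concrete, here is a small sketch of how averaged word vectors could separate the two example tweets. It is an illustration under stated assumptions, not the researchers' implementation: the three-dimensional vectors are invented (real GloVe vectors would be loaded from the published files and have 200 dimensions), and the cosine-similarity rule, the health word list and the 0.8 threshold are all hypothetical.

```python
import math

# Toy 3-dimensional vectors invented for this example (NOT real GloVe values).
TOY_VECTORS = {
    "wheezing": (1.0, 0.0, 0.0),
    "coughing": (0.9, 0.0, 0.1),
    "asthma": (1.0, 0.1, 0.0),
    "breathe": (0.9, 0.1, 0.0),
    "smoke": (0.8, 0.2, 0.1),
    "trailer": (0.0, 1.0, 0.1),
    "film": (0.1, 0.9, 0.0),
    "awesome": (0.1, 0.6, 0.5),
    "today": (0.3, 0.3, 0.6),
}

# Hypothetical cluster of health-related words.
HEALTH_WORDS = ("wheezing", "coughing", "asthma")


def mean_vector(words):
    """Average the vectors of the words we have vectors for (None if none)."""
    vectors = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not vectors:
        return None
    return tuple(sum(component) / len(vectors) for component in zip(*vectors))


def cosine(a, b):
    """Cosine similarity: close to 1.0 when two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def health_similarity(text):
    """Similarity between a tweet's averaged vector and the health cluster."""
    words = [w.strip(".,!?'\"").lower() for w in text.split()]
    tweet_vector = mean_vector(words)
    if tweet_vector is None:
        return 0.0
    return cosine(tweet_vector, mean_vector(HEALTH_WORDS))


def looks_like_health_report(text, threshold=0.8):
    # The threshold is an arbitrary choice for the toy vectors above.
    return health_similarity(text) >= threshold
```

With these toy values, the film-trailer tweet falls below the threshold because “trailer”, “film” and “awesome” pull its averaged vector away from the health cluster, while the smoke tweet stays above it.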
“[However], there are lots of things that might not work about this approach,” Dr Joshi said.
Firstly, there are many symptoms that people would feel uncomfortable or embarrassed to share on social media, limiting the types of diseases that a tweet-based alert system could track.
Secondly, social anxiety tended to increase the number of fear-related tweets about health, which could throw out the AI, he said.
“So, for example, the tweet ‘I have a rash on my hand. Oh my god, do I have measles?’ Now, the person is reporting a rash, but this is not really a report or a confirmation of measles,” he said.