Wikipedia Medicine Stats

Researchers Forecast the Spread of Diseases Using Wikipedia

An anonymous reader writes: Scientists from Los Alamos National Laboratory have used Wikipedia logs as a data source for forecasting disease spread. The team successfully monitored influenza in the United States, Poland, Japan, and Thailand, dengue fever in Brazil and Thailand, and tuberculosis in China and Thailand. The team was also able to forecast all but one of these, tuberculosis in China, at least 28 days in advance.
This discussion has been archived. No new comments can be posted.

  • This is really interesting stuff. I guess we have every single thing on Wikipedia.
  • I wonder when they will start trying to predict diseases (or maybe PC sales) from /. posts.

  • by Anonymous Coward

    Sounds familiar. Didn't someone already do that half a year or a year ago using Google search string mapping?

    • Re: (Score:2, Informative)

      by Anonymous Coward

      Thought so, it was Google, and they even created a page with real-time stats.
      http://www.google.org/flutrends/us/#US

    • by umghhh ( 965931 )
      It works the way fighting evil regimes by clicking the 'like' button on FB and similar sites does.
  • How did they do it? I started reading the linked paper, but my brain started hurting two sentences in. I couldn't extract any useful information on the 'how'.

    • Re:How? (Score:5, Informative)

      by ctrl-alt-canc ( 977108 ) on Friday November 14, 2014 @07:54AM (#48384609)
      They made the assumption that if a disease is spreading somewhere, people there start looking for information about the disease on Wikipedia.
      This implicitly relies on some big assumptions, among them that people are aware of the disease and that they have internet access.

      You can easily understand why their approach is of very limited usefulness, and scientifically questionable. I think that it is not by chance that their method fails to work when analyzing data for Uganda (where internet usage probably isn't widespread) and does not score well for China (where censorship limits both information about disease outbreaks and internet access).

      They also state in their paper: "With these constraints in mind, we used our professional judgement to select diseases and countries.", and this raised my eyebrows a lot...

      I would like to put their approach to the test by sifting Wikipedia access data for the Ebola keyword in the Slovenian language, and then forecasting the spread of Ebola in Slovenia (nil up to now...), but I try to spend my time testing methods that are better posed.
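
The assumption described in this comment (more page views for a disease article when the disease is spreading) can be illustrated with a lagged correlation. This is only a minimal sketch under stated assumptions, not the Los Alamos team's actual model: the function names are hypothetical, the weekly numbers are synthetic and invented purely for illustration, and a plain Pearson correlation stands in for whatever statistics the paper uses.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

def lagged_correlation(page_views, case_counts, lag):
    """Correlate page views in week t with case counts in week t + lag."""
    if lag < 0 or lag >= len(page_views) - 1:
        raise ValueError("lag out of range")
    if lag == 0:
        return pearson(page_views, case_counts)
    return pearson(page_views[:-lag], case_counts[lag:])

def best_lag(page_views, case_counts, max_lag):
    """Return the lag (in weeks) with the highest correlation."""
    return max(
        range(max_lag + 1),
        key=lambda lag: lagged_correlation(page_views, case_counts, lag),
    )

# Synthetic data (hypothetical, for illustration only): case counts are
# constructed to follow page views by two weeks at half the magnitude.
page_views = [10, 12, 30, 80, 60, 25, 12, 10, 11, 10]
case_counts = [4, 5, 5, 6, 15, 40, 30, 12.5, 6, 5]

if __name__ == "__main__":
    for lag in range(5):
        print(lag, round(lagged_correlation(page_views, case_counts, lag), 3))
    print("best lag:", best_lag(page_views, case_counts, 4))
```

A high correlation at some lag only says that attention and case counts moved together in this sample. It does not separate readers who are infected from readers who are merely curious, which is the objection raised here.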

      "There are three kinds of lies: lies, damned lies, and statistics."

      • I don't think you're being fair. This research extends their ground-breaking study that searching Google for "Jennifer Lawrence iCloud hack" predicted fapping with 100% accuracy.
      • Re:How? (Score:4, Funny)

        by Rosco P. Coltrane ( 209368 ) on Friday November 14, 2014 @09:53AM (#48385017)

        They made the assumption that if a disease is spreading somewhere, people there start looking for information about the disease on Wikipedia

        Imagine the potential: if a lot of search logs contain "EBOL-AAAARGH", they'll know a particularly fast-acting variant of the virus has emerged.

      • Which raises the question: if you search for symptom keywords (rash, boils, bleeding, coughing), can Wikipedia actually list diseases with those keywords?

        From experience I know that many food names can be typed in their native language and will still lead to the correct page on English Wikipedia, roughly.
        But if I start searching for terms and keywords, Wikipedia tends to be worse than Google.

      • 'They made the assumption'
        They made a hypothesis, then tested that hypothesis against the null hypothesis. This is otherwise known as science. Why do you hate science?

    • That was my thought. The only way I can think of to use Wikipedia log data to predict outbreaks would also have predicted that America was in the grip of a huge Ebola epidemic a few weeks ago. Perhaps this wiki data is just an easy way to measure media attention to a subject, which often is correlated with an epidemic? It is measuring the public's attention, not actually making a prediction.
      • I think the most important piece of news of this story is that Wikipedia is no better than Google or Facebook, and exploits/sells search data too.

  • Wait... what? Diseases now use Wikipedia?

    • Why not? Viruses use Outlook.
  • Now that they've spread the word, will the approach start to be 'gamed' by big pharma or gov't trying to sow the seasonal flu panic?

  • And teachers always say not to use Wikipedia for research. "Wikipedia is the devil!" When used correctly Wikipedia is a valuable resource.
    • by terbo ( 307578 )

      The teachers might not know about 'Talk Pages', 'Revisions', and 'What Links Here':
      things that make wikipedia much more advanced than traditional encyclopedias.

      • The teachers might not know about 'Talk Pages', 'Revisions', and 'What Links Here':
        things that make wikipedia much more advanced than traditional encyclopedias.

        No, teachers know that lazy students will just blindly copy and paste stuff from wikipedia.

  • I thought Wikipedia was spreading just misinformation and biased information. Now they are spreading actual biological diseases using Wikipedia? I'm not surprised. The Internet is a lawless frontier and anything goes there.
  • Why not Google Trends? It's already categorized.
    • by necro81 ( 917438 )
      Google has been working on that; it's called Flu Trends [google.org]. But it hasn't really proven itself yet. See my post below [slashdot.org].
      • You can cross-correlate multiple medical term searches and conditions and see the trends in search over time, broken down by region. It's not limited to flu. You can do it for other (some slowly spreading) medical conditions.
  • Google tried (is still trying?) to track the spread of influenza [google.org], by watching the trends in searches for information about the disease. It's a very interesting bit of work, but as I recall, failed to be meaningfully predictive [google.com]. The trouble is, there are lots of prosaic reasons why someone might search out information about the flu (or any other disease) other than actually having it. Separating that noise (general interest in the flu) from the genuine signal (particular interest from people who are infec
  • I always wash my hands after using Wikipedia.

  • Like others I found the headline confusing. I read it as "Researchers are predicting the use of Wikipedia as a vector for the spread of disease". This may mean that:

    • Disinformation and ignorance are diseases.
    • Memes and computer viruses are diseases.
    • Wikipedia contains information that leads to depression.
    • Instructions on Wikipedia lead to substance abuse.
    • This is getting entertaining, fill in your own reason here.
  • Google has been forecasting flu through search data for a while.

    http://www.google.org/flutrends/us/

    It doesn't work perfectly though:

    http://www.nature.com/news/when-google-got-flu-wrong-1.12413

    • by Fpdx ( 2689069 )

      Yes, but Google does not share its log files!

      Google published a Nature paper out of it. AFAIK the data (Google queries) on which that research is based is kept secret. Therefore it is not possible to validate what they did. Science cannot be based on secret data, and the journal Nature in this case published an advertisement ("how awesome is Google"), not a scientific paper ("these are the data, this is our method, check out our conclusions").

      As the authors here say, approaches from closed sources like
