Big Data Use Cases in Biotechnology

[Big Data Series] Chapter 3. Reading Trends with Big Data, Big Data Use Cases in Biotechnology

Smartphone users can now easily self-diagnose and monitor their health thanks to the wide range of available healthcare apps. These apps can be used to monitor simple aspects of health, such as calories burned and caloric intake, as well as complex aspects, such as blood pressure and sleeping habits. The healthcare sector has begun incorporating big data into its operations to create new added value from the vast volumes of medical data generated by these apps.
There is a wide range of big data use cases in biotechnology, spanning from healthcare and bioresources to agriculture and the environment. In this chapter, we will focus on big data use cases in the healthcare industry.

National Health Index Monitoring System

Remember Google’s flu prediction web system? Google launched a system that made predictions about flu activity using search queries and displayed the results on a map of the United States. Google Flu Trends is one of the most well-known big data use cases in the world.
After that, the ACSM (American College of Sports Medicine) began annually publishing more detailed measurements of the overall health of American metropolitan areas.
Using big data, the ACSM developed the American Fitness Index (AFI), which compares the overall health and diet of American metropolitan areas. The annual reports are also openly available to the healthcare industry.

In Korea, the NHIS (National Health Insurance Service) operates a similar service that forecasts national health levels. This system provides forecasts of common colds, eye infections, food poisoning, asthma, and dermatitis. Disease severity is color coded into four different levels: guarded, elevated, high, and severe. The severity of an illness in each region is displayed on a map for different age groups, along with preventative measures that can be taken against the disease. This system utilizes treatment records and keywords from social media to predict disease transmission.
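
As a rough illustration of the four-level coding described above, the sketch below maps a risk score to one of the named levels. The 0-to-1 risk score, the threshold values, and the region names are assumptions made for this example; the NHIS method for computing its levels is not described here.

```python
from enum import Enum


class Severity(Enum):
    GUARDED = 1
    ELEVATED = 2
    HIGH = 3
    SEVERE = 4


# Hypothetical cut-offs on a 0-to-1 risk score; not the actual NHIS thresholds.
THRESHOLDS = [
    (0.75, Severity.SEVERE),
    (0.50, Severity.HIGH),
    (0.25, Severity.ELEVATED),
]


def classify(risk_score):
    """Return the severity level for a risk score between 0 and 1."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    # Check the highest cut-off first and return the first level that matches.
    for cutoff, level in THRESHOLDS:
        if risk_score >= cutoff:
            return level
    return Severity.GUARDED


# Hypothetical regional scores for one disease, for illustration only.
regional_scores = {"Region A": 0.10, "Region B": 0.55, "Region C": 0.90}
regional_levels = {
    region: classify(score) for region, score in regional_scores.items()
}
for region, level in regional_levels.items():
    print(region, level.name)
```

A map display like the one described would then only need to assign one color to each of the four levels.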

In 2016, the NHIS took its big data applications a step further. It is developing a system that can detect seasonal diseases by combining high volumes of healthcare data, including insurance claims and drug prescriptions, with weather forecast information. By leveraging big data in this way, the Korean government aims to detect diseases that might threaten national health early enough to take preventative measures.

Big Data Use Cases in Healthcare

In the past, even though hospitals had access to a wide array of patient data, including diagnosis history, medical charts, nursing records, genetic information, and personal habits, there was no system that could effectively manage it all. However, big data analytics can now be deployed in the healthcare industry to analyze this largely unstructured data.
An example of a big data use case in Korean healthcare comes from Samsung Medical Center (SMC). The center provides big data-driven personalized and precision medicine based on a patient’s genetic makeup, medical history and lifestyle patterns.
This service uses big data technology to analyze a patient’s entire medical record in order to formulate a personalized and optimized treatment plan and to diagnose any other potential illnesses. The focus of this system is on managing and protecting patients’ health, rather than just providing treatment.

In 2013, the SMC partnered with Daumsoft to develop a suicide forecast system.
Predictors of suicide, drawn from social media data and from indicators such as the consumer price index, unemployment rate, weather, and well-publicized suicides, were used to monitor national suicide rates. The SMC’s successful application of big data has inspired other hospitals, such as Seoul National University Hospital and Ajou University Hospital, to follow suit.

Big Data Use Cases in Pharma

Pharmaceutical companies are also using big data across all their operations, from new drug development and clinical testing to sales and marketing. Traditionally, developing a new drug requires massive investment from the development stage through the clinical testing phase. However, big data can reduce new drug development costs by drawing on data about existing drugs, such as preference or side effects. Big data analytics can also support marketing: it can be used to estimate sales, determine target demographics, and generate the information required to draw up an effective marketing strategy, ultimately reducing marketing costs.
For example, when Dong-A ST and Ajou University Hospital partnered to develop a combination drug, they utilized big data analytics. By analyzing prescriptions and adverse side effects, they were able to minimize costs and risks during the drug development phase.
Let’s assume that, since arthritis medicine and digestive medicine are often prescribed together, a pharmaceutical company decides to develop a combination drug. If the company uses big data technology to match its medical expertise to actual prescription demand, instead of randomly testing different combinations of medicine, it will be able to significantly reduce costs and risks during drug development.
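
The co-prescription idea in this example can be illustrated with a minimal sketch. It assumes that each prescription record is available as a list of drug categories. The sample records, the category names, and the function `count_co_prescriptions` are hypothetical, and the sketch does not describe the method that Dong-A ST and Ajou University Hospital actually used.

```python
from collections import Counter
from itertools import combinations


def count_co_prescriptions(prescriptions):
    """Count how often each pair of drug categories appears in the same prescription."""
    pair_counts = Counter()
    for drugs in prescriptions:
        # Sort and de-duplicate so that (a, b) and (b, a) are counted as one pair.
        for pair in combinations(sorted(set(drugs)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical sample data, for illustration only.
sample_prescriptions = [
    ["arthritis", "digestive"],
    ["arthritis", "digestive", "sleep"],
    ["digestive", "sleep"],
    ["arthritis", "digestive"],
]

pair_counts = count_co_prescriptions(sample_prescriptions)
top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)
```

In this toy data set, the pair that is prescribed together most often is the natural first candidate for a combination drug. A real analysis would also need to weigh side effects and clinical evidence.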

Big Data Use Cases in Pharma for New Drug Development: Pharmaceutical companies are utilizing big data technology to save money on new drug development.

IBM Watson Health

IBM’s Watson, a supercomputer that is capable of answering questions posed in natural language, extracts insights from vast volumes of healthcare data to diagnose and recommend treatment options to doctors.
Watson can almost instantaneously sift through millions of medical certificates, patient records, and medical textbooks in order to suggest a diagnosis and appropriate treatment option to doctors. By drawing on objective data, Watson helps minimize diagnostic errors made by doctors.
While it can take days for a hospital to run tests and find the appropriate treatment option for a patient, with Watson that process can take just a few minutes, allowing patients to start receiving treatment much sooner.
IBM Watson entered into a partnership with MD Anderson Cancer Center for cancer diagnosis and treatment method research. According to a recent announcement, Watson was able to diagnose cancer with 96% accuracy within a couple of minutes, whereas a doctor took 160 hours to reach the same diagnosis. IBM Watson’s imaging precision for CTs and MRIs has also shown significant improvement.

IBM’s Watson Supercomputer (Source: IBM website)

Not everyone views IBM’s Watson favorably. Some people question whether a computer like Watson could actually replace doctors. In my opinion, doctors should make the final diagnosis based on Watson’s analysis. In other words, Watson’s diagnosis should only be used as a reference.


The U.S. government believes that utilizing big data in the healthcare industry can create more than 100 billion dollars in value a year and is accordingly allowing healthcare institutions to share medical and pharmaceutical records.
In Korea, the use of medical data is restricted by the Personal Information Protection Act and the Medical Service Act. Moreover, it is difficult for the government to share its medical data with the medical industry. There is also no data standardization between medical institutions to facilitate data utilization.
Even with the possibility of medical data breaches, Korean laws need to be amended to allow medical data to be used with big data to create added value. By reforming current policies in a way that minimizes risks to data, the healthcare industry will be able to leverage big data to save and improve lives.
In the next chapter, we will take a look at multimedia big data use cases.

▶   The contents are protected by copyright laws, and the copyrights are owned by the creator and Samsung SDS.
▶   Re-use or reproduction as well as commercial use of the contents without prior consent is strictly prohibited.

Senior Engineer, Seoyeon Kim
AI/Analytics
Samsung SDS Smart Factory Business Division

After receiving her Ph.D. in industrial engineering from Georgia Tech in 2009, Dr. Seoyeon Kim worked as an industrial engineering researcher at the National University of Singapore before joining Samsung SDS in September 2010. As a data scientist, she leads multiple big data projects while also working as an instructor for an in-house data scientist training program she helped create. Dr. Kim regularly shares her in-depth big data expertise as a contributor to CommonSDS and IE magazine and also actively participates in various industry seminars.