Thanks to the abundance of data available, Data Science is disrupting the healthcare sector. Find out how data analysis and AI are transforming the medical environment, and how to become a "Healthcare Data Scientist"...
The healthcare sector generates immense quantities of data. According to a study conducted by the Ponemon Institute, this field alone accounts for 30% of global data.
Medical records, clinical trials, genetic information, invoices, connected objects, databases and scientific articles are just some of the many sources of data available to the medical community.
With the rise of teleconsultations and health-related internet searches, the volume of data is growing explosively. For industry professionals, patient data is now centralized and more accessible than ever before.
The term “quantified health” is now used to refer to the integration of data from connected objects such as connected bracelets, and accessories such as glucose meters and scales into medical records via smartphones.
This is what platforms like Apple HealthKit and Google Fit offer. Thanks to these resources, it is now possible to quickly detect alarming signals and carefully monitor changes in behavior and vital indicators.
All this data can be exploited by healthcare professionals, opening up a wealth of possibilities. Discover how Data Science is revolutionizing the medical field.
Drug discovery
On average, it takes $2.6 billion and 12 years to create a drug and bring it to market. Data science can drastically reduce both cost and time.
Thanks to data, scientists can now simulate a drug’s reaction with the body’s proteins and different cell types. According to Mark Ramsey, Chief Data Officer at pharmaceutical giant GSK, the process could be reduced to less than two years thanks to this simulation method.
Several startups are also exploring this avenue. London-based BenevolentAI, for example, has raised $115 million to launch over 20 drug creation programs and develop an artificial brain capable of creating new drugs and treatments.
Disease prevention
Prevention is better than cure, as the saying goes. By combining data from connected objects and other tracking devices with the patient’s history and genetic information, it is possible to detect a problem before it gets out of hand.
Omada Health, for example, uses connected accessories to create personalized behavior plans and online coaching to help prevent chronic conditions such as diabetes, hypertension and high cholesterol.
For its part, Propeller Health has created an inhaler usage tracker that uses GPS to couple data from at-risk individuals with environmental data from the US CDC. The aim is to propose interventions for asthma sufferers.
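As a rough illustration of what coupling GPS-tagged usage data with environmental data can look like, the sketch below attaches the nearest monitoring station's reading to each inhaler event. It is not Propeller Health's actual method: the station list, the field names (such as air_quality_index) and the sample coordinates are all hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two GPS points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def attach_nearest_station(events, stations):
    """Return a copy of each inhaler event enriched with its nearest station's reading."""
    enriched = []
    for event in events:
        nearest = min(
            stations,
            key=lambda s: haversine_km(event["lat"], event["lon"], s["lat"], s["lon"]),
        )
        enriched.append(
            {
                **event,
                "station": nearest["name"],
                "air_quality_index": nearest["air_quality_index"],
            }
        )
    return enriched


# Made-up sample data, for illustration only.
stations = [
    {"name": "north", "lat": 43.10, "lon": -89.40, "air_quality_index": 42},
    {"name": "south", "lat": 42.90, "lon": -89.40, "air_quality_index": 95},
]
events = [
    {"patient": "p1", "lat": 42.92, "lon": -89.41},
    {"patient": "p2", "lat": 43.08, "lon": -89.38},
]
enriched = attach_nearest_station(events, stations)
```

Once each event carries an environmental reading, it becomes possible to look for links between inhaler use and local conditions, which is the kind of analysis that could inform interventions.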
Canadian startup Awake Labs, meanwhile, collects data from autistic children via connected accessories. This enables parents to be alerted in the event of a potential seizure.
Artificial Intelligence has also been used on several occasions to detect diseases early. For example, researchers at the University of Campinas, Brazil, have developed an AI platform to diagnose the Zika virus using metabolic markers.
Disease diagnosis
Today, doctors’ diagnoses are unfortunately still often wrong. According to the National Academies of Sciences, Engineering and Medicine, some 12 million Americans are misdiagnosed each year.
The consequences can sometimes be fatal. According to a BBC survey, misdiagnosis causes between 40,000 and 80,000 deaths a year.
And yet, Data Science can greatly improve the accuracy of diagnoses. This is particularly the case for medical imaging analysis.
Computers can learn to interpret MRIs, X-rays, mammograms and other types of medical imaging. The machine learns to identify patterns in these visual data, and can then detect tumors, arterial stenosis and other anomalies with an accuracy often surpassing that of human experts.
Without even going as far as automated analysis of medical imaging, Data Science makes it possible to enlarge an image or improve its resolution, which makes interpretation easier for human experts.
In addition, researchers at Stanford University have developed data-driven models for detecting irregular heart rhythms from electrocardiograms faster than a cardiologist. Other models are able to distinguish benign marks on the skin from malignant lesions.
Iquity, a company developing a predictive analytics platform for the healthcare sector, has carried out a study analyzing four million data points on 20 million New Yorkers.
By combining data from patients diagnosed (rightly or wrongly) with multiple sclerosis, Iquity was able to predict with 90% accuracy the onset of the disease eight months before it could be detected with traditional tools.
For their part, Microsoft researchers analyzed the web search data of 6.4 million Bing users whose search queries suggested they had pancreatic cancer.
They then reviewed keywords from these users' earlier searches, such as weight loss or blood clots. The results suggest that search engine data could be used to anticipate a diagnosis of pancreatic cancer.
Personalized treatment
Thanks to Data Science, it is also possible to offer more targeted and personalized treatments that take into account the subtle differences between individuals, for more effective care.
For example, the National Institutes of Health’s 1000 Genomes Project is an open study of genome regions associated with common diseases such as diabetes or coronary heart disease. This study enables scientists to better understand the complexity of human genes and how a specific treatment can be better adapted to an individual.
Emory University and the Aflac Cancer Center have partnered with NextBio to study malignant medulloblastoma brain tumors. While radiation therapy was once the only treatment for this cancer, analysis of a patient’s genetic and clinical data now makes it possible to discover specific biomarkers for personalized treatment.
MapReduce makes it possible to read genetic sequences in parallel, reducing the time needed to process the data. SQL is used to retrieve genomic data, manipulate BAM files and process data.
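To make the MapReduce idea concrete, here is a minimal single-machine sketch of its three phases (map, shuffle, reduce) applied to counting short subsequences, or k-mers, across sequence reads. The function names, the choice of k-mer counting as the task, and the sample reads are illustrative assumptions; a real pipeline would run these phases across a cluster using a framework such as Hadoop.

```python
from collections import defaultdict
from itertools import chain


def map_read(read, k=3):
    """Map phase: emit a (k-mer, 1) pair for every k-length window in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle phase: group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce phase: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


# Made-up reads, for illustration only.
reads = ["ACGTAC", "GTACGT"]
mapped = chain.from_iterable(map_read(r) for r in reads)
kmer_counts = reduce_counts(shuffle(mapped))
```

Because each read is mapped independently and each key is reduced independently, both phases can be spread over many machines, which is where the time savings come from.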
Patient follow-up after discharge
Every operation or treatment can lead to side effects, complications or recurring pain. It can be difficult to track and monitor these phenomena after a patient has left hospital.
Data Science enables doctors to continue monitoring patients remotely in real time after they return home. For example, Cloudera’s software can predict a patient’s chances of readmission within 30 days, based on their medical data and the socio-economic status of the region where the hospital is located.
SeamlessMD is developing a platform for post-operative care. This platform has enabled Saint Peter’s Healthcare System in New Jersey to reduce the average post-operative length of stay by one day.
This represents a saving of $1,500 for each patient, who simply enters his or her pain level into the application each day and lets caregivers monitor progress over time. In the event of a potential problem, the app also issues alerts.
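The alerting logic of such an application can be pictured with a simple rule-based sketch. This is not SeamlessMD's implementation: the function name, the 0 to 10 pain scale, the threshold of 7 and the three-reading rising window are all hypothetical choices made for illustration.

```python
def pain_alerts(daily_scores, high=7, rise_days=3):
    """Return (day_index, reason) alerts for a list of daily pain scores.

    An alert is raised when a score reaches `high`, or when the last
    `rise_days` readings are strictly increasing.
    """
    alerts = []
    for day, score in enumerate(daily_scores):
        if score >= high:
            alerts.append((day, "high pain"))
            continue
        # Look at the most recent `rise_days` readings, including today's.
        window = daily_scores[max(0, day - rise_days + 1): day + 1]
        rising = len(window) == rise_days and all(
            earlier < later for earlier, later in zip(window, window[1:])
        )
        if rising:
            alerts.append((day, "pain rising"))
    return alerts


# Made-up scores entered by a patient over one week.
scores = [3, 2, 3, 4, 5, 8, 4]
alerts = pain_alerts(scores)
```

A production system would likely replace these fixed rules with thresholds validated by clinicians, but the principle of turning daily self-reported values into caregiver alerts is the same.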
AI-enabled mobile applications can also help patients. Chatbots, or virtual voice assistants, can communicate with patients. Patients can describe their symptoms or ask questions, and receive valuable information from a vast network linking symptoms to diseases.
These applications can also remind patients to take their medication on time, and arrange an appointment with a doctor if necessary. Among the most popular are the Woebot chatbot developed by Stanford University to help depressed patients, and the virtual assistant from Berlin startup Ada, which predicts illnesses based on symptoms.
Hospital management
Hospitals are complex and difficult to manage. Data analysis helps determine how many caregivers need to be on duty at each hour of the day to maximize efficiency.
It also ensures that enough beds are available to meet demand, and much more. Predictive analytics can also be used to optimize schedules and streamline emergency services.
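As a simplified sketch of how a demand forecast can be turned into a staffing plan, the function below converts an hourly forecast of patients into a minimum number of caregivers. The ratio of four patients per caregiver, the minimum staffing level and the forecast values are assumptions for illustration, not figures from any hospital.

```python
from math import ceil


def caregivers_needed(expected_patients_by_hour, patients_per_caregiver=4, minimum_staff=1):
    """Convert an hourly patient forecast into a minimum headcount per hour."""
    return {
        hour: max(minimum_staff, ceil(patients / patients_per_caregiver))
        for hour, patients in expected_patients_by_hour.items()
    }


# Made-up forecast: hour of the day -> expected number of patients.
forecast = {8: 10, 12: 17, 20: 3, 2: 0}
staffing = caregivers_needed(forecast)
```

In practice the forecast itself would come from a predictive model trained on historical admissions, and the ratio would vary by department.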
At Emory University Hospital, Data Science is used to predict demand for laboratory tests. This reduces waiting times by up to 75%.
Business Intelligence can also be used to improve the billing system and identify patients at risk of payment difficulties.
These analyses can be coordinated with insurance and financial departments. For example, the Centers for Medicare & Medicaid Services has saved $210.7 million thanks to Big Data-based fraud prevention.
The future of Data Science in the medical field
The healthcare industry is being transformed by data science. Pharmaceutical giants, biotech startups, research centers, investors and healthcare institutions are investing heavily in this revolution.
Many challenges remain. For example, data is often scattered across several regions, administrative units and hospitals.
This makes it difficult to consolidate the data into a single system.
In addition, many patients are concerned about the protection and confidentiality of their personal data. Some private companies are interested in exploiting this valuable data for targeted advertising. Google, for example, has been sued for such practices.
Finally, there is concern that the relationship between doctors and patients is disappearing in favor of interaction with machines and algorithms. It’s true that human contact is essential in healthcare.
Despite these challenges, data science holds great promise for the future of medicine. As the technology develops, new possibilities will emerge…
How do I become a Healthcare Data Scientist?
The medical field thus presents itself as an ideal breeding ground for Data Science. The term “Health Data Science” is now used to describe the generation of data-driven solutions to healthcare problems. It is an emerging discipline, at the crossroads of statistics, computer science and medicine.
Health Data Scientists are increasingly in demand in the healthcare sector in all countries, in both the public and private sectors. However, only 3% of US Data Scientists currently work in the medical field.
A Healthcare Data Scientist’s role is to design studies and evaluations, carry out complex data analyses, or advise healthcare establishments and caregivers based on the results of their analyses.
He or she will need to use data to predict the effects of drugs and understand diseases affecting humans. The role also involves deploying the power of artificial intelligence and enriching public health datasets.
This professional can work for government health departments, hospitals, universities and research institutes, pharmaceutical companies, health insurers or private companies.
Becoming a Healthcare Data Scientist requires the same skills as a classic Data Scientist. However, these skills need to be coupled with a solid knowledge of the healthcare field.
A Healthcare Data Scientist must have skills in mathematics, quantitative analysis and statistics. He or she must also be able to communicate with the various players in the medical field.
Of course, it’s also important to have a good understanding of the concepts involved in this sector, thanks to a background in medicine, epidemiology or virology.
Some institutions offer specialized programs. Harvard University, for example, has developed a Master’s degree in Health Data Science. This 18-month program teaches you how to analyze and exploit health data to meet the greatest challenges in this field.
Alternatively, you can combine a generalist Data Scientist degree with training in the healthcare sector. You can start by taking the training offered by DataScientest, to acquire a degree certified by the Sorbonne University. This option is also highly relevant if you are already a doctor and wish to acquire Data Scientist skills…