Real World Data - AI Applications

23rd April 2025

Healthcare data is growing faster than data in any other industry

The implementation of new policies, along with scientific discoveries and innovative healthcare technologies, has driven a significant increase in the volume and variety of available data.

Genomic sequencing, combined with the rise of biomarker-specific medications, has led to a surge in genetic testing. At the same time, wearable technology and health apps are collecting a wide range of user data, including heart rate, step count, geographic location, and more.

These factors have accelerated the growth of the entire Real World Data (RWD) ecosystem.
The major tech giants, Facebook, Amazon, Microsoft, Google, and Apple, have turned their attention to the healthcare sector, which is valued at $8.3 trillion.

Over $6.8 billion in deals have been closed in this space since the beginning of 2020.

Microsoft has invested in conversational AI for healthcare and launched Microsoft Cloud for Healthcare, a tech stack for enterprise healthcare organizations that combines artificial intelligence, automation, and low-code app development. Meanwhile, Google introduced an AI-powered search tool to diagnose skin conditions, an EHR search solution for providers, and an interoperability tool for payers.

The market is expanding rapidly, and opportunities for innovation abound.

Cromodata is a data-driven platform specializing in healthcare intelligence.
Being classified as a "data originator" would typically require generating primary data, such as that produced by medical devices, sensors, or clinical studies. We act instead as an intermediary between those sources and the data consumers: we establish relationships with healthcare providers, hospitals, clinics, and other institutions that generate the actual data.

Although we could be seen as "data aggregators," at Cromodata we go a step further.
We collect, clean, structure, tokenize, and anonymize the information, ensuring both its quality and usability.
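
To make the tokenize and anonymize steps more concrete, here is a minimal sketch of one possible approach: replacing a patient identifier with a keyed hash and dropping direct identifiers. It is not a description of Cromodata's actual pipeline. The field names, `SECRET_KEY`, `DIRECT_IDENTIFIERS`, `make_token`, and `tokenize_record` are all hypothetical and introduced only for illustration.

```python
import hashlib
import hmac

# Hypothetical secret; a real system would load it from a key management service.
SECRET_KEY = b"example-key-not-for-production"

# Hypothetical set of direct identifiers to strip from every record.
DIRECT_IDENTIFIERS = {"name", "phone"}


def make_token(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable keyed token (HMAC-SHA256, hex) for an identifier."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def tokenize_record(record: dict) -> dict:
    """Swap the patient ID for a token and drop direct identifiers."""
    token = make_token(str(record["patient_id"]))
    cleaned = {
        field: value
        for field, value in record.items()
        if field not in DIRECT_IDENTIFIERS and field != "patient_id"
    }
    cleaned["patient_token"] = token
    return cleaned


# Illustrative record with made-up values.
raw = {"patient_id": "12345", "name": "Jane Doe", "phone": "555-0100", "diagnosis": "E11.9"}
safe = tokenize_record(raw)
```

Because the same identifier always maps to the same token, records from different sources can still be linked without exposing the identifier itself. Keyed hashing on its own is pseudonymization; full anonymization usually requires further steps, such as generalizing dates or suppressing rare values.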
Below, we present some of the key areas within the healthcare system that are closely connected to the world of Real World Data (RWD).
(mental health, wellness, fitness, nutrition and supplements, remote monitoring)
(POC testing, lab testing, diagnostic technologies, decision support, population health)
(telehealth, home care, primary treatments, specialized treatments, hospitals)
(rehabilitation, social care, chronic care, elder care)
(training and certification, health and safety)
(health records, practice management, scheduling and referrals, health analytics)
(wearables, medical devices, medical equipment, medical imaging, medical robotics)
(health benefits, corporate wellness, health insurance, health asset financing, healthcare real estate)
(drug manufacturing, drug commercialization, healthcare logistics, pharmacies)
(discoveries, clinical trials, clinical insights, precision medicine, genomics)
Key Characteristics and Applications of Real World Data (RWD)
Firstly, Real World Data is observational in nature, as opposed to data obtained in controlled environments such as randomized clinical trials.

Secondly, a significant portion of RWD is unstructured, such as free-text records or medical images, and may contain inconsistencies due to the heterogeneity in documentation practices across professionals, institutions, and healthcare systems.

Thirdly, this data can be generated at high temporal frequencies, as seen with wearable devices that capture measurements every few milliseconds, resulting in large, complex, and continuously updated datasets.
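
As a small illustration of how such high-frequency streams are often made manageable, the sketch below averages millisecond-stamped readings into one value per second. The `(timestamp_ms, value)` sample format and the `downsample_per_second` helper are assumptions made for this example, not a format described in this article.

```python
from collections import defaultdict
from statistics import mean


def downsample_per_second(samples):
    """Average (timestamp_ms, value) samples into one value per whole second."""
    buckets = defaultdict(list)
    for timestamp_ms, value in samples:
        # Integer division groups every reading that falls in the same second.
        buckets[timestamp_ms // 1000].append(value)
    return {second: mean(values) for second, values in sorted(buckets.items())}


# Made-up heart-rate readings: three in the first second, two in the next.
samples = [(0, 71.0), (250, 72.0), (900, 73.0), (1000, 80.0), (1500, 82.0)]
per_second = downsample_per_second(samples)
```

Here the five raw readings collapse into two per-second averages. The right aggregation window depends on the analysis, and averaging can hide short events that the raw signal would show.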

Fourthly, RWD is often incomplete or lacks critical information for certain analyses, since it was not originally collected for research purposes. For example, insurance data rarely includes clinical outcomes, while many care records offer limited longitudinal follow-up.

Fifthly, this data may be subject to systematic biases and measurement errors. A common example is selection bias in data generated through digital platforms, mobile devices, or wearables, which can compromise the representativeness of the sample compared to the target population.
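
One common way to reduce this kind of selection bias is post-stratification: reweighting the sample so that each group counts in proportion to its share of the target population. The sketch below uses made-up resting heart rate values and hypothetical age groups and population shares. It only corrects for the variable used to define the groups and assumes each group's sample is representative of that group.

```python
from collections import defaultdict


def naive_mean(records):
    """Plain average over all (stratum, value) records, ignoring who is over-represented."""
    return sum(value for _, value in records) / len(records)


def poststratified_mean(records, population_shares):
    """Weight each stratum's mean by its share of the target population.

    Every stratum in population_shares must appear at least once in records.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for stratum, value in records:
        totals[stratum] += value
        counts[stratum] += 1
    return sum(
        share * totals[stratum] / counts[stratum]
        for stratum, share in population_shares.items()
    )


# Made-up wearable sample that over-represents younger users (80 vs. 20 records).
records = [("18-39", 62.0)] * 80 + [("40+", 70.0)] * 20
# Hypothetical shares of each age group in the target population.
population_shares = {"18-39": 0.4, "40+": 0.6}

unweighted = naive_mean(records)
reweighted = poststratified_mean(records, population_shares)
```

With these illustrative numbers the unweighted average is about 63.6, while the reweighted estimate is about 66.8, because the under-sampled older group now carries its full population weight.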
AI Applications
Artificial Intelligence (AI) is making a strong impact in the healthcare and life sciences industries, with specialties such as oncology and neurology rapidly adopting AI tools developed in recent years, especially since 2020.

AI is an invaluable support tool for radiologists and pathologists interpreting mammograms, MRIs, CT scans, and images from digital devices.

Computer vision, a subfield of AI, is a key application in medical imaging. Working from 2D or 3D images and video (which can even be captured with a mobile phone), automated systems can analyze the images, detect potential anomalies, and generate preliminary reports in real time. This not only reduces diagnostic turnaround times, but also improves accuracy and expands medical coverage, especially in remote areas with limited access to radiologists or imaging equipment. These systems can also be integrated into telemedicine solutions, allowing results to be reviewed remotely by specialists located elsewhere in the country or around the world.

Beyond pathology, specialties including radiology, ophthalmology, and dermatology have demonstrated clinical applicability for AI. Thanks to these advances, multiple cancers have already been detected at early stages from imaging and studies performed over the last two years, allowing patients to receive treatment more promptly and improving their prognosis.

Real World Data comes from a range of sources:

Clinical data covers Electronic Health Record (EHR) data, that is, data on hospitalizations, interventions, treatments, medical consultations, diagnoses, symptoms, laboratory tests and imaging, and even clinical notes, as well as demographic data, laboratory test results, procedures, pathology/histology data, radiology images, microbiology data, provider notes, admission/discharge and progress reports, functional status, and so on.
Genomic and genetic test data (SNPs/panels); multi-omic data (proteomics, transcriptomics, metabolomics, lipidomics); and the status of other biomarkers.
Fitness trackers, wearables, and other health applications that measure activity and bodily function. These include mobile devices such as smartphones, tablets, monitoring devices, and personal digital assistants. They also include wearables such as smartwatches and activity bands (Fitbit, Apple Watch, etc.), which monitor health parameters such as heart rate, physical activity, blood oxygen levels, and sleep quality. Other medical devices, such as blood pressure monitors, digital thermometers, pulse oximeters, blood glucose monitors, and any device that lets patients track their health in real time, also generate data.
Medical claims and other data on the use of medications and treatments. This category also includes patient-reported records: surveys, diets, habits, personal health records, adverse event reports, and quality-of-life measures, among others. Other insurer and billing records belong here as well.
Administrative records, concomitant therapies, point-of-sale data, and medical claims.
Climate factors, pollutants, infections, lifestyle (diet, habits), personal health records, adverse event reports, and quality-of-life measures, among others.
Disease burden, clinical characteristics, prevalence/incidence, treatment rates, resource use and costs, disease control, quality-of-life measures, etc.
Historical data on health conditions and allergies for the patient and their extended family, smoking status, alcohol consumption, general habits, and demographic data.

Some of the applications of AI in healthcare include:

🔹Extracting complex and multimodal data to build a more complete longitudinal view of the patient, incorporating not only clinical and patient-reported data, but also handling discrete and continuous molecular data.
🔹Using AI to interpret medical images and videos beyond human capability.
🔹Discovering physiological and molecular targets, and modeling AI-assisted treatments.
🔹Leveraging AI-powered workflows and cloud technologies to enhance virtual research environments and drive greater collaboration and productivity at scale.

The perfect combination: Statistics + AI
The combination of statistical inference and machine learning (ML) is key to improving the generation of Real World Evidence (RWE) and understanding causal relationships in data analysis.

A well-known methodological approach is targeted learning.

Targeted learning uses both statistical inference and machine learning to produce more accurate causal estimates. It has been successfully applied, for example, in causal inference for dynamic treatment regimes using electronic health record (EHR) data, and in evaluating the effectiveness of treatments for COVID-19.
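
For readers who want the mechanics, a standard textbook instance of targeted learning is the targeted maximum likelihood estimator (TMLE) of an average treatment effect for a binary outcome. The notation below is introduced here for illustration and is not taken from the studies mentioned above: $W$ denotes baseline covariates, $A$ a binary treatment, and $Y$ the outcome. Under the usual causal assumptions (no unmeasured confounding, positivity), $\psi$ is the average treatment effect.

```latex
\begin{align*}
\text{Target: } & \psi = \mathbb{E}\big[\,\mathbb{E}[Y \mid A=1, W] - \mathbb{E}[Y \mid A=0, W]\,\big] \\
\text{Initial fits (any ML method): } & \hat{Q}(A,W) \approx \mathbb{E}[Y \mid A, W], \qquad \hat{g}(W) \approx P(A=1 \mid W) \\
\text{Clever covariate: } & H(A,W) = \frac{A}{\hat{g}(W)} - \frac{1-A}{1-\hat{g}(W)} \\
\text{Targeting step: } & \operatorname{logit}\hat{Q}^{*}(A,W) = \operatorname{logit}\hat{Q}(A,W) + \hat{\varepsilon}\,H(A,W) \\
\text{Estimate: } & \hat{\psi} = \frac{1}{n}\sum_{i=1}^{n}\big[\hat{Q}^{*}(1,W_i) - \hat{Q}^{*}(0,W_i)\big]
\end{align*}
```

Here $\hat{\varepsilon}$ is obtained by a logistic regression of $Y$ on $H(A,W)$ with $\operatorname{logit}\hat{Q}(A,W)$ as a fixed offset. Machine learning supplies the flexible initial fits, and the targeting step is the statistical correction that supports valid inference on $\psi$.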

Given the technological and methodological advancements we've discussed, we believe it’s fair to say that the future lies in the ability to integrate statistical inference with machine learning to generate RWE and learn causal relationships.

In fact, one of the most promising recent methodological developments is moving in this direction: leveraging advances in semiparametric theory and empirical processes, while incorporating the benefits of machine learning in comparative effectiveness research using Real World Data (RWD).

This is the true future of data.
The future of Real World Data.
