Project details

This project analyzed the care pathways and profiles of patients who have had lung cancer. Its purpose was to detect the characteristics of the disease that increase the risk of early death.

  • Client: CHI Créteil
  • Therapeutic area: Oncology
  • Expertise: Data analysis


Lung cancer is the leading cause of cancer death in France. The 5-year survival rate after a lung cancer diagnosis is only 20%.
As part of her research master's degree in medicine, Dr. Hélène Goussault asked Kaduceo to help her detect the aggravating factors of the disease, and thereby identify patients with a high probability of early death.

Data characteristics

50,000 patients were enrolled

Data came from the French National Health Data System (SNDS). The data used for this study provided medical information on patients whose lung cancer was detected between 2016 and 2018. Environmental data were added to enrich the database.

Solution provided by Kaduceo

To begin, Kaduceo identified and collected the relevant tables from the SNDS.

Then, Kaduceo cleaned and corrected the selected data and enriched them with environmental data. On this new dataset, we first carried out a descriptive analysis of the variables to detect aggravating factors. This analysis was also performed on two subsets of the dataset to compare the pathways of patients who received chemotherapy with those of patients who underwent lung surgery.

Finally, we created a statistical model to help doctors identify patients with a high probability of early death.

Start-up phase

  • Obtain access to the SNDS data from the Clinical Research Center of the CHIC
  • Select and collect data
  • Clean and prepare data

Analysis phase

  • Descriptive analysis of the whole dataset
  • Comparison of the pathways of patients who had lung surgery with those of patients who had chemotherapy
  • Statistical modelling
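
As an illustration of the comparison step above, here is a minimal sketch of how an early death rate could be computed per patient group. It is not the code used in the study. The record layout, the field names (`treatment`, `department`, `died_within_3_months`), the 3-month cut-off used to flag an early death, and the toy records are all assumptions made for this example. Real SNDS data have a different structure.

```python
from collections import defaultdict


def early_death_rate_by(records, group_key, death_key="died_within_3_months"):
    """Return {group value: share of patients flagged as early deaths}."""
    totals = defaultdict(int)
    deaths = defaultdict(int)
    for record in records:
        group = record[group_key]
        totals[group] += 1
        if record[death_key]:
            deaths[group] += 1
    return {group: deaths[group] / totals[group] for group in totals}


# Made-up toy records for illustration only. These are NOT study data.
toy_records = [
    {"treatment": "surgery", "department": "94", "died_within_3_months": False},
    {"treatment": "surgery", "department": "94", "died_within_3_months": False},
    {"treatment": "surgery", "department": "75", "died_within_3_months": True},
    {"treatment": "surgery", "department": "75", "died_within_3_months": False},
    {"treatment": "chemotherapy", "department": "94", "died_within_3_months": True},
    {"treatment": "chemotherapy", "department": "94", "died_within_3_months": False},
    {"treatment": "chemotherapy", "department": "75", "died_within_3_months": True},
    {"treatment": "chemotherapy", "department": "75", "died_within_3_months": False},
]

# Same helper, two groupings: by treatment and by department.
rates_by_treatment = early_death_rate_by(toy_records, "treatment")
rates_by_department = early_death_rate_by(toy_records, "department")
print(rates_by_treatment)
print(rates_by_department)
```

The same helper works for any grouping column, so a per-department rate can be produced in the same way as a per-treatment rate.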

Early death rates among lung cancer patients between 2016 and 2019 by department

Rate of lung cancer patients between 2016 and 2019 by department

I had the chance to work with the Kaduceo team for my Master 2 thesis. They provided invaluable help with the extraction, formatting and analysis of my data, and guided me in making the best use of the data available. Despite the working conditions imposed by the pandemic, their availability was not affected: we held weekly meetings and exchanged emails very regularly to answer my questions and refine the statistical methodology. I will not hesitate to renew this collaboration for my future work.
Dr Hélène Goussault, Pulmonologist, CHI Créteil

Working method

We adopted a working method based on regular exchanges and interactions between the Kaduceo team and the doctor. This gave us a better understanding of the doctor's needs and allowed us to quickly resolve potential blocking points.

Key figures

  • 3 months of collaboration
  • More than 1M medical records reviewed
  • 1 presentation at the SPLF congress in 2021
  • 1 scientific paper being written for submission to a journal

Results and limitations of the study

Studies on lung cancer have generally been conducted within a single health establishment. This study, which analyzed all French health care institutions, highlighted differences between these institutions and between French departments.

Some variables had a significant effect on the 3-month survival rate: whether the first hospital stay came through the emergency department, whether the cancer progressed during the first 3 months, and the type of establishment.

Unfortunately, information that is usually important, such as tobacco or alcohol consumption or certain chemotherapies, is under-represented in SNDS data. As a result, the statistical model is not accurate enough.