Project details

This project analyzed the care pathways and profiles of patients who have had lung cancer. Its purpose was to identify the characteristics of the disease that increase the risk of early death.


Lung cancer is the leading cause of cancer death in France. The 5-year survival rate after lung cancer is detected is only 20%.
As part of her master's degree in medical research, Dr. Hélène Goussault asked Kaduceo to help her identify the aggravating factors of the disease, and thereby the patients with a high probability of early death.

Data characteristics

Patients were enrolled

The data came from the French national health data system (SNDS). The data used for this study provided medical information on patients whose lung cancer was detected between 2016 and 2018. Environmental data were added to enrich the database.

Solution provided by Kaduceo

To begin, Kaduceo identified and collected the relevant tables from the SNDS.

Then, Kaduceo cleaned and corrected the selected data and enriched them with environmental data. On this new dataset, we first carried out a descriptive analysis of the variables to detect aggravating factors. This analysis was also performed on two subsets of the dataset, to compare the pathways of patients who received chemotherapy with those of patients who underwent lung surgery.
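
As an illustration of this kind of subset comparison, here is a minimal sketch in Python. It is not the code used in the study: the records are synthetic, the field layout is hypothetical, and "early death" is assumed here to mean death within 3 months.

```python
from collections import defaultdict

# Hypothetical, synthetic records: (treatment, died_within_3_months).
# These values are invented for illustration and are not SNDS data.
records = [
    ("chemotherapy", True),
    ("chemotherapy", False),
    ("chemotherapy", False),
    ("chemotherapy", True),
    ("surgery", False),
    ("surgery", False),
    ("surgery", True),
    ("surgery", False),
]


def early_death_rates(rows):
    """Return the share of early deaths for each treatment group."""
    totals = defaultdict(int)
    deaths = defaultdict(int)
    for treatment, died_early in rows:
        totals[treatment] += 1
        deaths[treatment] += int(died_early)
    return {group: deaths[group] / totals[group] for group in totals}


rates = early_death_rates(records)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} early deaths")
```

On real data, a raw difference between the two groups would still need to be adjusted for patient characteristics before drawing any conclusion.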

Finally, we built a statistical model to help doctors identify patients with a high probability of early death.
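
The type of model used in the study is not stated here. Purely as an illustration, the sketch below assumes a logistic regression fitted by gradient descent on two hypothetical binary variables (emergency admission and early progression). The patients are synthetic and all names are invented for the example.

```python
import math

# Synthetic patients: (emergency_admission, early_progression) -> early death.
# All values are invented for illustration; they are not SNDS data.
patients = (
    [((0, 0), 1)] * 1 + [((0, 0), 0)] * 3
    + [((1, 0), 1)] * 2 + [((1, 0), 0)] * 2
    + [((0, 1), 1)] * 2 + [((0, 1), 0)] * 2
    + [((1, 1), 1)] * 3 + [((1, 1), 0)] * 1
)


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_risk(weights, bias, features):
    """Estimated probability of early death for one patient."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def fit_logistic(rows, learning_rate=0.5, steps=5000):
    """Fit a logistic regression by batch gradient descent on the log-loss."""
    n_features = len(rows[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in rows:
            error = predict_risk(weights, bias, features) - label
            grad_b += error
            for i, x in enumerate(features):
                grad_w[i] += error * x
        # Average the gradient over all patients before each update.
        bias -= learning_rate * grad_b / len(rows)
        weights = [w - learning_rate * g / len(rows)
                   for w, g in zip(weights, grad_w)]
    return weights, bias


weights, bias = fit_logistic(patients)
high_risk = predict_risk(weights, bias, (1, 1))
low_risk = predict_risk(weights, bias, (0, 0))
print(f"Both risk factors present: {high_risk:.2f}")
print(f"No risk factor present:    {low_risk:.2f}")
```

A model of this kind outputs a probability per patient, which a doctor could compare against a chosen threshold to flag high-risk cases. In practice a dedicated statistics library and a held-out validation set would be preferable to this hand-written fit.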

Start-up phase

Analysis phase

Early death rates among lung cancer patients between 2016 and 2019 by department

Rate of lung cancer patients between 2016 and 2019 by department

I had the chance to work with the Kaduceo team for my Master 2 thesis. They gave me valuable help with the extraction, formatting and analysis of my data. They guided me in making the best use of the available data. Despite the working conditions linked to the pandemic, their availability was not affected, with weekly meetings and very regular email exchanges to answer my questions and refine the statistical methodology. I will not hesitate to renew this collaboration for my future work.
Dr Hélène Goussault
Pneumologist - CHI Créteil

Working method

We adopted a working method based on regular exchanges and interactions between the Kaduceo team and the doctor. This gave us a better understanding of the doctor's needs and let us quickly resolve any blocking points.

Key figures

  • 3 months of collaboration
  • More than 1M medical records reviewed
  • 1 talk at the SPLF congress in 2021
  • 1 scientific paper in preparation for submission to a journal

Results and limitations of the study

Generally, studies on lung cancer have been carried out in a single health establishment. This study, which analyzed all French health care institutions, highlighted differences between these institutions and between French departments.

Some variables had a significant effect on the 3-month survival rate: whether the first hospital stay followed an emergency admission, whether the cancer progressed during the first 3 months, and the type of establishment.

Unfortunately, information that is usually important, such as tobacco or alcohol consumption or some chemotherapy treatments, is under-represented in SNDS data. As a result, the statistical models are not accurate enough.
