so-link/KidEase

KidEase

KidEase is an explainable multimodal AI system that integrates age-stratified division, LLM-driven phenotype learning, and a multi-expert network for differentiated multimodal feature extraction and fusion.

On the held-out test set, KidEase achieved a mean absolute error (MAE) of 0.6150 and a Pearson correlation coefficient of 0.8796 for continuous pain score regression, with consistent performance across all nine centers (MAE: 0.47–0.96) and across pain intensity levels (MAE < 1.0 for mild pain, < 2.0 for severe pain).
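
For readers who want to reproduce the two headline metrics on their own predictions, the following is a minimal sketch of how MAE and the Pearson correlation coefficient are computed. The score lists are made-up values on the 0-10 FLACC range, and the function names are illustrative rather than taken from this repository.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between reference and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up scores for illustration only; these are not MPP data.
reference = [0.0, 2.5, 5.0, 7.5, 10.0]
predicted = [0.5, 2.0, 5.5, 7.0, 9.0]

mae = mean_absolute_error(reference, predicted)
r = pearson_r(reference, predicted)
print(f"MAE={mae:.4f}  Pearson r={r:.4f}")
```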

The system produces structured, clinician-readable reports that map multimodal signals to specific pain phenotypes, and it reveals distinct pain expression phenotypes between children aged 0–7 and 8–15 years. The edge-cloud collaborative workflow (video capture on a smartphone, analysis in the cloud) returns a rapid diagnosis within 15 seconds and a detailed report in 30 seconds, supporting deployment in routine clinical settings.

Setup

Prerequisites: Python 3.9

Install dependencies:

pip install -r requirements.txt

MPP Dataset

We conducted experiments across 9 medical centers and constructed the multicenter pediatric acute pain dataset (MPP), comprising 5,609 pediatric samples in MP4 video format. Labels were obtained by averaging pain scores assigned by 4–8 expert clinicians using the FLACC (Face, Legs, Activity, Cry, Consolability) scale.
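
The labelling rule described above (one continuous label per video, obtained by averaging the raters' scores) can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, and the range check relies only on the fact that a FLACC total lies between 0 and 10.

```python
from statistics import mean


def consensus_label(rater_scores):
    """Average the FLACC totals that several raters assigned to one video."""
    if not rater_scores:
        raise ValueError("at least one rater score is required")
    for score in rater_scores:
        # Five FLACC items scored 0-2 each give a total between 0 and 10.
        if not 0 <= score <= 10:
            raise ValueError(f"FLACC total out of range: {score}")
    return mean(rater_scores)


# Four hypothetical raters scoring the same video.
label = consensus_label([4.0, 5.0, 5.0, 6.0])
print(label)
```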

Weights

We utilize pretrained weights from three open-source models, all available at their respective repositories:

Model          Repository
PANNs (CNN6)   https://github.com/qiuqiangkong/audioset_tagging_cnn
EmoCLIP        https://github.com/NickyFot/EmoCLIP
MSCLAP         https://github.com/microsoft/CLAP

Note: Our trained KidEase model weights and the MPP dataset are not publicly available due to patient privacy constraints.

Usage

1. Age Division

We determine the optimal age division by agglomerative hierarchical clustering on phenotype distances. The script is run once per modality: ./emoclip for facial phenotypes and ./msclap for vocal phenotypes.

python ./emoclip/age_division.py
python ./msclap/age_division.py
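
To illustrate the idea behind this step, here is a minimal, dependency-free sketch of agglomerative clustering on a precomputed distance matrix. It is not the code in age_division.py: the average-linkage criterion, the one-dimensional toy "phenotype profile" per age bin, and all names are assumptions chosen for illustration. The toy values are deliberately constructed so that the split falls between ages 6 and 8.

```python
def agglomerative_clusters(dist, n_clusters):
    """Average-linkage agglomerative clustering on a symmetric distance matrix.

    dist[i][j] is the distance between items i and j. Returns a sorted list
    of clusters, each a sorted list of item indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical age bins and a toy scalar profile per bin (not real data).
ages = [0, 2, 4, 6, 8, 10, 12, 14]
profile = [0.10, 0.15, 0.12, 0.20, 0.80, 0.85, 0.90, 0.82]
dist = [[abs(a - b) for b in profile] for a in profile]

groups = [[ages[i] for i in c] for c in agglomerative_clusters(dist, 2)]
print(groups)
```

In practice the distance matrix would come from the phenotype representations of each age bin, and the number of groups would be chosen from the resulting merge hierarchy rather than fixed in advance.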

2. Phenotype Generation & Selection

All phenotypes (attributes) are generated by prompting the Qwen-plus LLM with the prompts defined in ./emoclip/face_phen_generation_prompt.md and ./msclap/vocal_phen_generation_prompt.md. The generated results have been saved to the corresponding attr_select.py files. Using the optimal age division obtained above, we then perform pain/painless phenotype selection for each age group.

python ./emoclip/attr_select.py
python ./msclap/attr_select.py

3. Multimodal Expert Network Training

With the age-stratified phenotypes, we train a multimodal multi-expert network guided by LLM-generated phenotypes. Training proceeds in three stages:

  1. Unimodal feature extraction model training
  2. Unimodal phenotype-guided multi-expert training
  3. Multimodal phenotype-guided multi-expert training

CUDA_VISIBLE_DEVICES=0 python ./train/main.py --config ./train/config/config.yaml
CUDA_VISIBLE_DEVICES=0 python ./train/main.py --config ./train/config/are_config.yaml
CUDA_VISIBLE_DEVICES=0 python ./train/main.py --config ./train/config/mare_config.yaml
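
If you prefer to launch the three runs from a single script, a small driver such as the one below may help. It is a sketch, not part of the repository: it assumes the three config files correspond to the three stages in the order listed, which this README does not state explicitly, and it defaults to a dry run that only prints the commands.

```python
import os
import subprocess

# Assumed stage order, inferred from the order of the commands above.
STAGE_CONFIGS = [
    "./train/config/config.yaml",
    "./train/config/are_config.yaml",
    "./train/config/mare_config.yaml",
]


def build_commands(configs=STAGE_CONFIGS):
    """Return one argument list per training run."""
    return [["python", "./train/main.py", "--config", cfg] for cfg in configs]


def run_all(gpu="0", dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence if a run fails.
            subprocess.run(cmd, env=env, check=True)


run_all()  # dry run: prints the three commands without starting training
```

Call run_all(dry_run=False) from the repository root to actually start training.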

4. Report Generation

Once trained, the model takes a video as input and outputs the pain score, top-10 phenotype intensities, and an evaluation report generated in combination with the Qwen-plus LLM.

python ./amme_wrapper.py
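
As an illustration of the "top-10 phenotype intensities" part of the output, the sketch below picks the ten strongest entries from a mapping of phenotype name to intensity. The phenotype names, the intensity values, and the function name are hypothetical; the actual output format of amme_wrapper.py is not documented here.

```python
import heapq


def top_k_phenotypes(intensities, k=10):
    """Return the k (name, intensity) pairs with the highest intensity."""
    return heapq.nlargest(k, intensities.items(), key=lambda item: item[1])


# Hypothetical phenotype names and intensities, for illustration only.
scores = {f"phenotype_{i:02d}": i / 20 for i in range(1, 16)}

top10 = top_k_phenotypes(scores)
for name, value in top10:
    print(f"{name}: {value:.2f}")
```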

Website

A corresponding website and mobile app are available for online demonstration. Visit: https://kidease.cn/

Acknowledgements

This work was supported by the Guangzhou Medical University 2024 Research Capacity Enhancement Program — Major Clinical Research Projects (Grant No. GMUCR2024-02019) and the Guangdong Basic and Applied Basic Research Foundation (Grant Nos. 2024A1515012287, 2023A1515220013).
