



SMITID: Statistical Methods for Inferring Transmissions of Infectious Diseases from deep sequencing data

A 4-year ANR-funded project - Work Programme 2016 (extended to March 31, 2022)

ANR summary of the project

Keywords: Computational biology; Quantitative molecular epidemiology; Model; Statistical inference; Disease transmissions



Viruses can cause high-impact epidemics in developing and developed countries alike. For such pathogens, inferring transmission links within a host population or between host populations (e.g. for zoonoses) is crucial for building epidemiological predictions and control strategies. To this end, for fast-evolving pathogens, one can take advantage of the statistical analysis of pathogen sequence data, because these data indicate which hosts contain the most closely related pathogen variants. However, existing models have so far mostly exploited a limited amount of information from sequencing data, such as consensus Sanger sequences, although deep Sanger sequencing (DSS; based on amplicon cloning) and high-throughput sequencing (HTS) techniques can reveal the polymorphic nature of within-host pathogen populations. In this project, we propose an avant-garde modelling and statistical approach that will exploit DSS and HTS data to infer disease transmission links for fast-evolving pathogens, such as viruses, and to infer relationships between transmissions and the environment.

Ribaud M., Gabriel E., Hughes J., Soubeyrand S. (2021). Identifying potential significant factors impacting zero-inflated proportions data. ⟨hal-02936779v3⟩

Alamil M., Bruchou C., Ribaud M., Thébaud G., Soubeyrand S. (2020). A study of factors influencing the performance of the reconstruction of transmissions in disease outbreaks. Poster at CIRM, Mathematical Modeling and Statistical Analysis of Infectious Disease Outbreaks, Marseille, France, 17-21/02/2020.

Alamil M., Hughes J., Berthier K., Desbiez C., Thébaud G., Soubeyrand S. (2019). Inferring epidemiological links from deep sequencing data: a statistical learning approach for human, animal and plant diseases. Philosophical Transactions of the Royal Society B: Biological Sciences 374: 20180258. doi:10.1098/rstb.2018.0258

Picard C., Dallot S., Brunker K., Berthier K., Roumagnac P., Soubeyrand S., Jacquot E., Thébaud G. (2017). Exploiting Genetic Information to Trace Plant Virus Dispersal in Landscapes. Annual Review of Phytopathology 55. doi:10.1146/annurev-phyto-080516-035616



Alamil, M. (2017). Modélisation de la cinétique et de l'évolution virale intra-hôte. Master 2 Internship Report, BioSP, INRA. Under the supervision of S. Soubeyrand.

Gawinowski, M. (2017). A simulation model for the kinetics, evolution and transmission of viral populations. Master 1 Internship Report, BioSP, INRA. Under the supervision of S. Soubeyrand.

Boge, J. (2019). Développement d’un composant de visualisation spatio-temporelle d’une épidémie. Master 2 Internship Report, BioSP, INRA. Under the supervision of J.-F. Rey.


Data

Mélina Ribaud, & Joseph Hughes. (2021). Equine Influenza dataset [Data set]. Zenodo. http://doi.org/10.5281/zenodo.4837560

Mélina Ribaud, Davide Martinetti, & Samuel Soubeyrand. (2021). Data for the comparison of COVID-19 mortality in European and North American geographic entities [Data set]. Zenodo. http://doi.org/10.5281/zenodo.4769671 

People involved

Samuel Soubeyrand (coordinator)
Jean-François Rey
Karine Berthier
Cécile Desbiez
Gaël Thébaud
Joseph Hughes


2020/12/14: Mélina Ribaud gives a talk during the ModStatSAP webinar about the identification of environmental factors favoring disease propagation

2020/12/11: Maryam Alamil defends her PhD thesis. Congratulations, Maryam!

2020/02/18: Maryam Alamil presents a poster at CIRM, investigating the performance of SLAFEEL

2020/02/18: Samuel Soubeyrand gives a talk at CIRM about SLAFEEL

2019/10/14: Mélina Ribaud starts a postdoc funded by SMITID, with a co-supervision by Edith Gabriel

2019/06/06: Samuel Soubeyrand gives a talk about SLAFEEL for the French Biometrics Society at SFdS days, Nancy

2019/05/26: Maryam Alamil gives a talk about SLAFEEL at the Mathematical and Computational Evolutionary Biology meeting, Porquerolles

2019/05/13: Maryam Alamil gives a talk about SLAFEEL at the meeting of the GDR Ecostat, Avignon

2019/05/01: Maryam Alamil gives a talk about SLAFEEL at the meeting of young statisticians at Porquerolles

2019/03/12: Maryam Alamil gives a talk about SLAFEEL at the ModStatSAP meeting, Paris

2019/05/06: Our manuscript corresponding to the methodological core of the SMITID project is published in Philosophical Transactions B - Alamil et al. (2019)

2019/02/19: R packages SMITIDstruct and SMITIDvisu are on CRAN

2019/02/01: Julien Boge joined us to work on the SMITID software WP during his Master internship supervised by Jean-François

2019/01/30: Maryam Alamil gives a talk about SLAFEEL at the annual workshop on Statistical Methods for Post Genomic Data held in Barcelona

2018/09/11: R scripts for estimating epidemiological links are available in the Zenodo citable archive, along with the Ebola, swine influenza and potyvirus data used in our analyses

2018/06/25: First external PhD monitoring meeting for Maryam Alamil 

2018/06/22: Maryam Alamil presents a poster at the 3rd Mathematical Biology Modelling days of Besançon

2018/01/08: Karine Berthier will visit BioSP one day per week, in particular to work on the SMITID project

2017/10/01: Maryam Alamil begins her PhD funded by SMITID

2017/06/29: Meije Gawinowski starts her summer internship funded by SMITID

2017/06/07: Our review on characterizing plant virus spread using molecular epidemiology is online in Annual Review of Phytopathology

2017/05/02: Maryam Alamil starts her master internship funded by SMITID

2017/02/28: SMITID was presented at INRA’s conference for the 2017 Paris International Agricultural Show (SIA) - Symposium on People, Animals and the Environment: One Health 

2017/02/09-10: Discussion about the links between the research carried out in SMITID and the topics of interest in the STrATEGE network (STATistics in Ecology and GEnomic data)

2016/11/25: Kick-off meeting at the ANR headquarters (Paris) to launch the 2016 projects of the "Emerging pathogens-OneHealth" committee (CES 35)

2016/11/04: Kick-off meeting in Avignon gathering the project members and a few other colleagues

2016/11/01: Starting date of the project