Charles Truong

Researcher in Statistics and Computer Science at ENS Paris-Saclay

Paris, France

About Me

⚠️ OPEN PhD POSITION FOR FOREIGN STUDENTS! ⚠️

I obtained my PhD in 2018 in applied mathematics. After a quick round-trip to the start-up nation, I became a post-doctoral researcher at the Centre Borelli (ENS Paris-Saclay) in March 2020, working on machine learning for time series analysis.

I focus a large part of my research activity on the problem of detecting events in multivariate signals.

Various settings are considered: supervised and unsupervised learning, classical statistical models, and deep signal representations trained end to end.

In addition to studying the theoretical aspects of those methods, I put much effort into proposing documented and efficient implementations (mostly in Python and C/C++). From an application standpoint, I mainly focus on medical data and, more recently, industrial data.

Publications

  • Probing Single-Molecule Dynamics in Self-Assembling Viral Nucleocapsids

    Journal Article
    Publisher: Nano Letters
    Date: 2024
    Authors:
    T. Bugea, R. Suss, L. Gargowitsch, C. Truong, K. Perronet, G. Tresset
    Description:

    All viruses on Earth rely on host cell machinery for replication, a process that involves a complex self-assembly mechanism. Our aim here is to scrutinize in real time the growth of icosahedral viral nucleocapsids with single-molecule precision. Using total internal reflection fluorescence microscopy, we probed the binding and unbinding dynamics of fluorescently labeled capsid subunits on hundreds of immobilized viral RNA molecules simultaneously at each time point. A step-detection algorithm combined with statistical analysis allowed us to estimate microscopic quantities such as the equilibrium binding rate and mean residence time, which are otherwise inaccessible through traditional ensemble-averaging techniques. Additionally, we could estimate a set of rate constants modeling the growth kinetics from nonequilibrium measurements, and we observed an acceleration in growth caused by the electrostatic screening effect of monovalent salts. Single-molecule fluorescence imaging will be crucial for elucidating virus self-assembly at the molecular level, particularly in crowded, cell-like environments.

  • Convolutional Sparse Coding for Time Series via a L0 Penalty: an Efficient Algorithm with Statistical Guarantees

    Journal Article
    Publisher: Statistical Analysis and Data Mining: The ASA Data Science Journal
    Date: 2024
    Authors:
    C. Truong, T. Moreau
    Description:

    Identifying characteristic patterns in time series, such as heartbeats or brain responses to a stimulus, is critical to understanding the physical or physiological phenomena monitored with sensors. Convolutional sparse coding (CSC) methods, which aim to approximate signals by a sparse combination of short signal templates (also called atoms), are well-suited for this task. However, enforcing sparsity leads to non-convex and intractable optimization problems. This article proposes finding the optimal solution to the original and non-convex CSC problem when the atoms do not overlap. Specifically, we show that the reconstruction error satisfies a simple recursive relationship in this setting, which leads to an efficient detection algorithm. We prove that our method correctly estimates the number of patterns and their localization, up to a detection margin that depends on a certain measure of the signal-to-noise ratio. In a thorough empirical study, with simulated and real-world physiological data sets, our method is shown to be more accurate than existing algorithms at detecting the patterns’ onsets.

  • Shape analysis for time series

    Conference Paper
    Publisher: Advances in Neural Information Processing Systems (NeurIPS)
    Date: 2024
    Authors:
    T. Germain, S. Gruffaz, C. Truong, A. O. Durmus, L. Oudre
    Description:

    TL;DR: This paper introduces an unsupervised representation learning algorithm for time series tailored to biomedical inter-individual studies using tools from shape analysis.

    Abstract: Analyzing inter-individual variability of physiological functions is particularly appealing in medical and biological contexts to describe or quantify health conditions. Such analysis can be done by comparing individuals to a reference one with time series as biomedical data. This paper introduces an unsupervised representation learning (URL) algorithm for time series tailored to inter-individual studies. The idea is to represent time series as deformations of a reference time series. The deformations are diffeomorphisms parameterized and learned by our method called TS-LDDMM. Once the deformations and the reference time series are learned, the vector representations of individual time series are given by the parametrization of their corresponding deformation. At the crossroads between URL for time series and shape analysis, the proposed algorithm handles irregularly sampled multivariate time series of variable lengths and provides shape-based representations of temporal data. In this work, we establish a representation theorem for the graph of a time series and derive its consequences on the LDDMM framework. We showcase the advantages of our representation compared to existing methods using synthetic data and real-world examples motivated by biomedical applications.

  • An Efficient Algorithm For Exact Segmentation of Large Compositional and Categorical Time Series

    Journal Article
    Publisher: Stat
    Date: 2024
    Authors:
    C. Truong, V. Runge
    Description:

    Change-point detection, also known as signal segmentation, is an essential preprocessing step in many applications, ranging from industrial monitoring to bioinformatics. In short, it consists in finding the temporal boundaries of homogeneous regimes in long and non-stationary time series. While this area of research is active, most existing methods are designed for Euclidean data. However, in many practical scenarios, the collected time series are compositional, meaning that each observation belongs to the probability simplex (the set of non-negative vectors whose components sum to one). In this work, we propose an algorithm for detecting change-points in large compositional signals with an underlying piecewise stationary model. We cast the change-point detection task as a discrete optimization problem, whose solution is shown to converge to the true change-points. We introduce a new and time-efficient dynamic programming algorithm that solves this problem exactly. To limit the number of operations, we describe a novel pruning rule that allows us to reduce the set of candidate change-point indices. Our method is tested in a thorough simulation study, which confirms its efficiency. Additionally, we apply our method to a human activity segmentation task, highlighting the necessity for such novel techniques compared to standard algorithms.

  • Selective review of offline change point detection methods

    Journal Article
    Publisher: Signal Processing
    Date: 2020
    Authors:
    C. Truong, L. Oudre, N. Vayatis
    Description:

    This article presents a selective survey of algorithms for the offline detection of multiple change points in multivariate time series. A general yet structuring methodological strategy is adopted to organize this vast body of work. More precisely, detection algorithms considered in this review are characterized by three elements: a cost function, a search method and a constraint on the number of changes. Each of those elements is described, reviewed and discussed separately. Implementations of the main algorithms described in this article are provided within a Python package called ruptures.
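
As a rough illustration of the three elements named in the review abstract above (a cost function, a search method, and a constraint on the number of changes), here is a minimal pure-Python sketch. It assumes a univariate signal, a least-squares cost, and a known number of changes. The names `make_l2_cost` and `segment` are invented for this example and are not the ruptures API.

```python
from itertools import accumulate


def make_l2_cost(signal):
    """Cost function: squared deviation from the mean on signal[start:end]."""
    # Prefix sums let each segment cost be evaluated in constant time.
    sums = [0.0, *accumulate(signal)]
    squares = [0.0, *accumulate(x * x for x in signal)]

    def cost(start, end):
        total = sums[end] - sums[start]
        return (squares[end] - squares[start]) - total * total / (end - start)

    return cost


def segment(signal, n_changes):
    """Search method: dynamic programming over all segmentations.

    Constraint: the number of change points is fixed to n_changes.
    Returns the sorted list of change-point indices.
    """
    n = len(signal)
    cost = make_l2_cost(signal)
    inf = float("inf")
    # best[k][t]: minimal cost of splitting signal[:t] into k + 1 segments.
    best = [[inf] * (n + 1) for _ in range(n_changes + 1)]
    # prev[k][t]: position of the last change point in that optimal split.
    prev = [[0] * (n + 1) for _ in range(n_changes + 1)]
    for t in range(1, n + 1):
        best[0][t] = cost(0, t)
    for k in range(1, n_changes + 1):
        for t in range(k + 1, n + 1):
            for s in range(k, t):
                candidate = best[k - 1][s] + cost(s, t)
                if candidate < best[k][t]:
                    best[k][t] = candidate
                    prev[k][t] = s
    # Backtrack from the end of the signal to recover the change points.
    changes = []
    t = n
    for k in range(n_changes, 0, -1):
        t = prev[k][t]
        changes.append(t)
    return sorted(changes)


if __name__ == "__main__":
    toy_signal = [0, 0, 0, 0, 5, 5, 5, 5, -3, -3, -3, -3]
    print(segment(toy_signal, n_changes=2))  # [4, 8]
```

Swapping in another cost (for instance a kernel-based one) or another search method (for instance a pruned or approximate search) changes only one of the two functions, which is the modularity the abstract describes. This exhaustive search is cubic in the signal length for a fixed number of changes, so it is only meant for small toy inputs.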

Teaching

  • Introduction to Machine Learning

    From: 2023, Until: present

    Organization: ENSIIE (Evry, France)
    Field: MSc in Statistics (Master 2 MAL)

    Description:

    Co-coordinator with Mathilde Mougeot

  • Introduction to R programming

    From: 2023, Until: present

    Organization: University of Évry-Val d'Essonne
    Field: BSc in Biostatistics

    Description:

    Main coordinator

  • Machine Learning for Time Series

    From: 2022, Until: present

    Organization: ENS Paris-Saclay
    Field: MSc in Computer Science (Master 2 MVA)

    Description:

    Teaching assistant (main coordinator: Laurent Oudre)

  • Introduction to R&D

    From: 2022, Until: present

    Organization: ENSIIE (Evry, France)
    Field: MSc in Informatics (M1 "Étudiants en Alternance")

    Description:

    Main coordinator: Mathilde Mougeot

  • Data Processing in e-Health

    From: 2022, Until: present

    Organization: Master Erasmus Mundus CYBER
    Field: MSc in Psychology and Medicine

    Description:

    Teaching assistant (main coordinator: Lise Haddouk)

Software and Data

  • Ruptures: Change point detection in Python

    date: 2021

    Description:
    • I maintain a change point detection library in Python called ruptures.

    • ruptures provides methods for the analysis and segmentation of non-stationary signals. Implemented algorithms include exact and approximate detection for various parametric and non-parametric models.

    • ruptures focuses on ease of use by providing a well-documented and consistent interface. In addition, thanks to its modular structure, different algorithms and models can be connected and extended within this package.

    • If you have any questions or if you feel like contributing, do not hesitate to reach me through the GitHub repository.

  • A Data Set for Fall Detection with Smart Floor Sensors

    date: 2021

    Description:
    • An online demo is available to skim through the data set without coding or downloading anything.
    • This article describes a data set of falls and activities of daily living recorded with a pressure floor sensor. The signals were recorded under two settings: one constrained, with volunteers following a predefined protocol, and one unconstrained, where data were collected in a partner nursing home. Overall, 157 hours of signal are made available, along with 563 manually annotated falls and 333 manually annotated activities (e.g. running, walking).
  • A Data Set for the Study of Human Locomotion with IMUs

    date: 2019

    Description:
    • An online demo is available to skim through the data set without coding or downloading anything.
    • This data set contains 1020 multivariate gait signals collected with two inertial measurement units (accelerometers and gyroscopes), from 230 subjects undergoing a fixed protocol: standing still, walking 10 m, turning around, walking back, and stopping.
    • In particular, the start and end timestamps of more than 40,000 footsteps are provided, as well as contextual information about each trial.
    • This exact data set was used in Oudre et al., 2018 (Template-based step detection with inertial measurement units) to design and evaluate a step detection procedure.

Supervisions

  • Valerio Guerrini

    Motif Discovery: Comprehensive evaluation and application to the multivariate case

    date: 2024 - 2024

    Degree: Master's Degree

    Description:

    M2 MVA Student
    Co-supervision with Laurent Oudre and Thibaut Germain

  • Nicolas Cecchi

    Trend filtering for change-point detection

    date: 2024 - 2024

    Degree: Master's Degree

    Description:

    M2 MVA Student
    Co-supervision with Laurent Oudre and Vincent Runge
    Funded by DATAIA

  • Even Matencio

    Covariance change point detection for graph signals

    date: 2024 - 2024

    Degree: Master's Degree

    Description:

    M2 MVA Student
    Co-supervision with Laurent Oudre and Fikri Hafid
    Funded by RTE

  • Bastien Lhopitallier

    Searching for typical sequences in symbolic time series. Application to behavioral neuroscience.

    date: 2024 - 2024

    Degree: Master's Degree

    Description:

    M2 MVA Student
    Co-supervision with Laurent Oudre and Lucile Benhaim

  • Yanis Gomes

    Convolutional Sparse Coding with Multipath Orthogonal Matching Pursuit

    date: 2024 - 2024

    Degree: Master's Degree

    Description:

    M1 Student from ENS Paris-Saclay
    Co-supervision with Laurent Oudre

  • Aloïs Vincent

    Video processing using deep neural networks. Application to neuroscience.

    date: 2024 - 2024

    Degree: Bachelor's Degree
    University: Université d'Evry Val d'Essonne

  • Erwann Gallois

    Time series approximation with trend filtering

    date: 2024 - 2024

    Degree: Bachelor's Degree
    University: Université d'Evry Val d'Essonne

  • Clémence Cochard

    Convolutional approaches for spike sorting

    date: 2024 - present

    Degree: Master's Degree

    Description:

    M1 Student from ENS Paris-Saclay
    Co-supervised with François Treussart

  • Rémi al Ajroudi

    Large-scale algorithms for convolutional dictionary learning

    date: 2024 - present

    Degree: Master's Degree

    Description:

    M1 Student from ENS Paris-Saclay

  • Simon Blotas

    Structured loss for deep change-point detection

    date: 2023 - 2023

    Degree: Master's Degree
    University: École Nationale des Ponts et Chaussées

    Description:

    M1 Student
    Published article at EUSIPCO 2024

Event Organization

  • Paris-Saclay Change-Point workshop

    date: 2023

    Organization: Paris-Saclay University

    Description:
    • I co-organized the event with Vincent Runge.
    • Two-day meeting dedicated to change-point detection.
    • Around 50 European researchers attended.
    • Funded by DATAIA and the IDAML Chair.
  • A.I. Cup, a Bavarian-French Artificial Intelligence Challenge

    date: 2022

    Organization: ENS Paris-Saclay and Passau University

    Description:
    • I was a member of the organization committee.
    • A.I. Cup was a data challenge for start-ups.
    • Prize money of 95,000€
  • Digital French-German Summer School with Industry

    date: 2020

    Description:
    • I was a member of the organization committee.
    • One-day virtual event between universities (ENS Paris-Saclay, Passau University) and industrial partners (Atos, Siemens, etc.)

Selected Talks

  • Everything you need to know about change-point detection

    Date: Apr 2024

    Event name: PyConDE & PyData Berlin 2024
    Location: Berlin, Germany

  • Efficient convolutional sparse coding with a L0 constraint

    Date: Dec 2023

    Event name: Computational and Methodological Statistics (CMStatistics)
    Location: Berlin, Germany

    Description:

    Invited talk

  • Supervised change-point detection with dimension reduction, applied to physiological signals

    Date: Dec 2022

    Event name: Learning from Time Series for Health (NeurIPS Workshop)
    Location: New Orleans, USA

    Description:

    Spotlight presentation

  • Automatic calibration of change-point detection method

    Date: Jun 2022

    Event name: IMS Annual Meeting
    Location: London, UK

Research Interests

  • Time Series
  • Statistics
  • Change Point Detection
  • Geometry
  • Motif Discovery in Time Series
  • Behavioural Neuroscience
  • Open-source software

Patents

  • Procédé de caractérisation de démarche (Method for characterizing gait)

    Date: Jan 2017

    Patent Number: WO2017021545A1/FR3039763A1
    Status: Issued

    Inventors:
    R. Dadashi, T. Moreau, C. Truong, C. de Waele, A. Yelnik, R. Barrois-Müller, N. Vayatis, L. Oudre, P.-P. Vidal, D. Ricard