Streaming constrained binary logistic regression with online standardized data. Application to scoring heart failure

Abstract : We study a stochastic gradient algorithm for performing constrained binary logistic regression online in the case of streaming or massive data. Assuming that the observed data are realizations of a random vector, these data are standardized online, in particular to avoid numerical explosion or when a shrinkage method such as LASSO is used. We prove the almost sure convergence of a variable step-size constrained stochastic gradient process with averaging when a varying number of new observations is introduced at each step. Twenty-four stochastic approximation processes are compared on real or simulated datasets: classical processes with raw data and processes with online standardized data, with or without averaging, and with variable or piecewise constant step-sizes. The best results are obtained by processes with online standardized data, averaging, and piecewise constant step-sizes. This approach can be used to update online an event rate score in heart failure patients.
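
The kind of process described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the class name `OnlineStandardizedLogit`, the helper `simulate_stream`, the step-size rule `step / t**0.75`, the L2-ball projection standing in for the constraint, and the synthetic data are all assumptions made for illustration. Features are standardized with running means and variances (Welford's update), each step uses a mini-batch of varying size, and the iterates are averaged.

```python
import math
import random


def sigmoid(u):
    """Numerically stable logistic function."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


class OnlineStandardizedLogit:
    """Hypothetical sketch: averaged, projected SGD for binary logistic
    regression on features standardized with running means and variances."""

    def __init__(self, dim, step=0.5, power=0.75, radius=10.0):
        self.n = 0                          # observations seen so far
        self.mean = [0.0] * dim             # running means
        self.m2 = [0.0] * dim               # running sums of squared deviations
        self.theta = [0.0] * (dim + 1)      # coefficients, intercept last
        self.theta_avg = [0.0] * (dim + 1)  # average of the iterates
        self.t = 0                          # number of update steps
        self.step, self.power, self.radius = step, power, radius

    def _update_moments(self, x):
        # Welford's update of the mean and the sum of squared deviations.
        self.n += 1
        for j, xj in enumerate(x):
            delta = xj - self.mean[j]
            self.mean[j] += delta / self.n
            self.m2[j] += delta * (xj - self.mean[j])

    def _standardize(self, x):
        z = []
        for j, xj in enumerate(x):
            var = self.m2[j] / self.n if self.n > 0 else 0.0
            sd = math.sqrt(var) if var > 0.0 else 1.0  # guard zero variance
            z.append((xj - self.mean[j]) / sd)
        return z + [1.0]                    # constant term for the intercept

    def partial_fit(self, batch):
        """One step on a mini-batch of (x, y) pairs, y in {0, 1}."""
        for x, _ in batch:
            self._update_moments(x)
        grad = [0.0] * len(self.theta)
        for x, y in batch:
            z = self._standardize(x)
            p = sigmoid(sum(t * zj for t, zj in zip(self.theta, z)))
            for j, zj in enumerate(z):
                grad[j] += (p - y) * zj / len(batch)
        self.t += 1
        a = self.step / self.t ** self.power  # assumed decreasing step size
        self.theta = [t - a * g for t, g in zip(self.theta, grad)]
        norm = math.sqrt(sum(t * t for t in self.theta))
        if norm > self.radius:              # assumed constraint: L2 ball
            self.theta = [t * self.radius / norm for t in self.theta]
        self.theta_avg = [m + (t - m) / self.t
                          for m, t in zip(self.theta_avg, self.theta)]

    def predict_proba(self, x):
        z = self._standardize(x)
        return sigmoid(sum(t * zj for t, zj in zip(self.theta_avg, z)))


def simulate_stream(seed=0, steps=2000):
    """Feed synthetic, badly scaled data in mini-batches of varying size."""
    rng = random.Random(seed)
    model = OnlineStandardizedLogit(dim=2)
    for _ in range(steps):
        batch = []
        for _ in range(rng.randint(1, 5)):
            x = [rng.gauss(100.0, 50.0), rng.gauss(0.0, 0.01)]
            y = 1 if rng.random() < sigmoid(2.0 * (x[0] - 100.0) / 50.0) else 0
            batch.append((x, y))
        model.partial_fit(batch)
    return model
```

Calling `simulate_stream()` returns a fitted model whose `theta_avg` holds the averaged coefficients on the standardized scale. In this sketch the two features differ in scale by several orders of magnitude, which is the situation where standardizing online lets a single step size serve all coordinates.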
Document type :
Preprints, Working Papers, ...

https://hal.archives-ouvertes.fr/hal-02156324
Contributor : Benoît Lalloué
Submitted on : Friday, June 14, 2019 - 12:13:40 PM
Last modification on : Saturday, June 15, 2019 - 1:24:32 AM

File

Article_online logistic regres...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02156324, version 1

Citation

Benoît Lalloué, Jean-Marie Monnez, Eliane Albuisson. Streaming constrained binary logistic regression with online standardized data. Application to scoring heart failure. 2019. ⟨hal-02156324⟩
