Deep Sturm–Liouville: Learnable orthogonal basis functions parameterized by neural networks

We introduce $\textit{Deep Sturm-Liouville}$ (DSL), a novel function approximator obtained by integrating the Sturm-Liouville theorem (SLT) into the deep learning framework. The Sturm-Liouville theorem deals with a class of eigenvalue problems that has a wide range of applications in physics, which motivates us to explore its use in machine learning tasks. The core idea of our work is to learn a vector field crossing the input space $\Omega \subset \mathbb{R}^n$, such that the ML task can be solved more easily along each of its field lines, owing to the regularity of the problem on these field lines. A Sturm-Liouville problem is solved along each field line to obtain orthogonal basis functions which, combined linearly, form the DSL function approximator. The vector field and the functions appearing in the SLT are parameterized by neural networks and are learned simultaneously. We also demonstrate that the DSL formulation appears naturally when solving a Rank-1 Parabolic Eigenvalue Problem. DSL is trained by stochastic gradient descent, which is made possible by the implicit differentiation theorem, and achieves performance comparable to that of neural networks on several multivariate datasets and on the $\texttt{MNIST}$ dataset.
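To make the phrase "orthogonal basis functions that, combined linearly, form the function approximator" concrete, here is a minimal sketch on the simplest textbook Sturm-Liouville problem, $-y'' = \lambda y$ on $[0, \pi]$ with $y(0) = y(\pi) = 0$, whose eigenfunctions are $\sin(kx)$. It is not the DSL method itself: there is no learned vector field, no neural network, and the coefficient functions are fixed constants. The function names, the target function, and the quadrature rule are assumptions chosen for illustration only.

```python
import math

def eigenfunction(k, x):
    """k-th eigenfunction of -y'' = lambda * y on [0, pi], y(0) = y(pi) = 0."""
    return math.sin(k * x)

def inner(f, g, n=2000):
    """Midpoint-rule approximation of the integral of f(x) * g(x) over [0, pi].

    The weight function is w(x) = 1 for this particular problem.
    """
    h = math.pi / n
    return h * sum(f((i + 0.5) * h) * g((i + 0.5) * h) for i in range(n))

def project(f, num_basis):
    """Coefficients of f on the first num_basis eigenfunctions.

    Because the basis is orthogonal, each coefficient is computed
    independently as <f, phi_k> / <phi_k, phi_k>.
    """
    coeffs = []
    for k in range(1, num_basis + 1):
        phi = lambda x, k=k: eigenfunction(k, x)
        coeffs.append(inner(f, phi) / inner(phi, phi))
    return coeffs

def approximate(coeffs, x):
    """Linear combination of eigenfunctions with the given coefficients."""
    return sum(c * eigenfunction(k, x) for k, c in enumerate(coeffs, start=1))

# Hypothetical target that satisfies the same boundary conditions.
target = lambda x: x * (math.pi - x)
coeffs = project(target, num_basis=5)
```

Calling `approximate(coeffs, x)` then evaluates the five-term approximation at any point of $[0, \pi]$. In DSL, by contrast, the abstract states that the functions appearing in the Sturm-Liouville problem are themselves neural networks, so the basis would have to be obtained numerically along each field line rather than in closed form.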

Keywords

Sturm-Liouville theory; Deep Learning; function approximation; orthogonal basis functions; eigenvalue problems; elliptic eigenvalue problem; parabolic eigenvalue problem

Domains

Artificial Intelligence [cs.AI]

Main file

main.pdf (426.09 KB)

Origin: Files produced by the author(s)

Contributor: David Vigouroux

https://hal.science/hal-04446268

Submitted on: Thursday, February 8, 2024, 11:45:57 AM

Last modification on: Thursday, May 2, 2024, 9:48:22 AM

Dates and versions

hal-04446268, version 1 (08-02-2024)

Identifiers

HAL Id: hal-04446268, version 1

Cite

David Vigouroux, Joseba Dalmau, Louis Béthune. Deep Sturm–Liouville: Learnable orthogonal basis functions parameterized by neural networks. 2024. ⟨hal-04446268⟩

Collections

UNIV-TLSE2 CNRS UT1-CAPITOLE IRT_SAINT-EXUPERY IRIT IRIT-ADRIA ANR ANITI IRIT-IA IRIT-UT3 TOULOUSE-INP UNIV-UT3 UT3-TOULOUSEINP
