A. Agarwal and J. C. Duchi, The Generalization Ability of Online Algorithms for Dependent Data, IEEE Transactions on Information Theory, vol.59, issue.1, pp.573-587, 2013.
DOI : 10.1109/TIT.2012.2212414

P. Alquier and X. Li, Prediction of Quantiles by Statistical Learning and Application to GDP Forecasting, 15th International Conference on Discovery Science 2012, pp.23-36, 2012.
DOI : 10.1007/978-3-642-33492-4_5
URL : https://hal.archives-ouvertes.fr/hal-00671982

P. Alquier and O. Wintenberger, Model selection for weakly dependent time series forecasting, Bernoulli, vol.18, issue.3, pp.883-913, 2012.
DOI : 10.3150/11-BEJ359
URL : https://hal.archives-ouvertes.fr/inria-00386733

P. Alquier, X. Li, and O. Wintenberger, Prediction of time series by statistical learning: general losses and fast rates, Dependence Modeling, vol.1, issue.2, pp.65-93, 2013.
DOI : 10.2478/demo-2013-0004
URL : https://hal.archives-ouvertes.fr/hal-00749729

J. Audibert, Fast learning rates in statistical inference through aggregation, The Annals of Statistics, pp.1591-1646, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00139030

J. Audibert and O. Catoni, Robust linear least squares regression, The Annals of Statistics, pp.2766-2794, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00522534

L. Bégin, P. Germain, F. Laviolette, and J. Roy, PAC-Bayesian bounds based on the Rényi divergence, Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, pp.435-444, 2016.

S. Boucheron, G. Lugosi, and P. Massart, Concentration Inequalities: A Nonasymptotic Theory of Independence, 2013.
DOI : 10.1093/acprof:oso/9780199535255.001.0001
URL : https://hal.archives-ouvertes.fr/hal-00794821

O. Catoni, Statistical Learning Theory and Stochastic Optimization, Saint-Flour Summer School on Probability Theory, Lecture Notes in Mathematics, issue.1, 2001.
URL : https://hal.archives-ouvertes.fr/hal-00104952

O. Catoni, PAC-Bayesian supervised classification: the thermodynamics of statistical learning, Institute of Mathematical Statistics Lecture Notes-Monograph Series, vol.56, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00206119

O. Catoni, Challenging the empirical mean and empirical variance: A deviation study, Annales de l'Institut Henri Poincaré, Probabilités et Statistiques, pp.1148-1185, 2012.
DOI : 10.1214/11-AIHP454
URL : https://hal.archives-ouvertes.fr/hal-00517206

O. Catoni, PAC-Bayesian bounds for the Gram matrix and least squares regression with a random design, arXiv preprint, 2016.

I. Csiszár and P. C. Shields, Information Theory and Statistics: A Tutorial, Foundations and Trends in Communications and Information Theory, vol.1, issue.4, 2004.
DOI : 10.1561/0100000004

J. Dedecker, P. Doukhan, G. Lang, J. R. León R., S. Louhichi et al., Weak dependence, Weak Dependence: With Examples and Applications, pp.9-20, 2007.
DOI : 10.1007/978-0-387-69952-3_2
URL : https://hal.archives-ouvertes.fr/hal-00686031

L. Devroye, L. Györfi, and G. Lugosi, A Probabilistic Theory of Pattern Recognition, 1996.
DOI : 10.1007/978-1-4612-0711-5

L. Devroye, M. Lerasle, G. Lugosi, and R. I. Oliveira, Sub-Gaussian mean estimators, arXiv preprint, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01204519

P. Doukhan, Mixing: Properties and Examples, Lecture Notes in Statistics, 1994.

I. Giulini, PAC-Bayesian bounds for Principal Component Analysis in Hilbert spaces, 2015.

P. D. Grünwald and N. A. Mehta, Fast rates with unbounded losses, arXiv preprint, 2016.

B. Guedj and P. Alquier, PAC-Bayesian estimation and prediction in sparse additive models, Electronic Journal of Statistics, vol.7, pp.264-291, 2013.
DOI : 10.1214/13-EJS771
URL : https://hal.archives-ouvertes.fr/hal-00722969

J. Honorio and T. Jaakkola, Tight bounds for the expected risk of linear classifiers and PAC-Bayes finite-sample guarantees, Proceedings of the 17th International Conference on Artificial Intelligence and Statistics, pp.384-392, 2014.

G. Lecué and S. Mendelson, Regularization and the small-ball method I: sparse recovery, arXiv preprint, 2016.

G. Lugosi and S. Mendelson, Risk minimization by median-of-means tournaments, arXiv preprint, 2016.

D. A. McAllester, Some PAC-Bayesian theorems, Proceedings of the Eleventh Annual Conference on Computational Learning Theory, COLT '98, pp.230-234, 1998.
DOI : 10.1145/279943.279989

D. A. McAllester, PAC-Bayesian model averaging, Proceedings of the Twelfth Annual Conference on Computational Learning Theory, COLT '99, pp.164-170, 1999.
DOI : 10.1145/307400.307435

S. Mendelson, Learning without Concentration, Journal of the ACM, vol.62, issue.3, pp.1-21, 2015.
DOI : 10.1145/2699439
URL : http://arxiv.org/abs/1401.0304

R. I. Oliveira, The lower tail of random quadratic forms, with applications to ordinary least squares and restricted eigenvalue properties, 2013.

E. Rio, Théorie asymptotique des processus aléatoires faiblement dépendants, Mathématiques & Applications, vol.31, 2000.

Y. Seldin, F. Laviolette, N. Cesa-Bianchi, J. Shawe-Taylor, and P. Auer, PAC-Bayesian inequalities for martingales, IEEE Transactions on Information Theory, vol.58, issue.12, pp.7086-7093, 2012.

J. Shawe-Taylor and R. C. Williamson, A PAC analysis of a Bayes estimator, Proceedings of the Tenth Annual Conference on Computational Learning Theory, pp.2-9, 1997.

I. Steinwart and A. Christmann, Fast learning from non-iid observations, Advances in Neural Information Processing Systems, pp.1768-1776, 2009.

N. N. Taleb, The black swan: The impact of the highly improbable, 2007.

L. G. Valiant, A theory of the learnable, Communications of the ACM, vol.27, issue.11, pp.1134-1142, 1984.
DOI : 10.1145/1968.1972

V. N. Vapnik, The Nature of Statistical Learning Theory, 2000.