M. Margulies, M. Egholm, W. Altman, S. Attiya, and J. Bader, Genome sequencing in microfabricated high-density picolitre reactors, Nature, vol.437, issue.7057, pp.376-380, 2005.
DOI : 10.1038/nature03959

A. Todd, R. Marsden, J. Thornton, and C. Orengo, Progress of Structural Genomics Initiatives: An Analysis of Solved Target Structures, Journal of Molecular Biology, vol.348, issue.5, pp.1235-1260, 2005.
DOI : 10.1016/j.jmb.2005.03.037

R. George, R. Spriggs, G. Bartlett, A. Gutteridge, and M. Macarthur, Effective function annotation through catalytic residue conservation, Proceedings of the National Academy of Sciences, vol.102, issue.35, pp.12299-12304, 2005.
DOI : 10.1073/pnas.0504833102

URL : http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1178014

I. Sillitoe, M. Dibley, J. Bray, S. Addou, and C. Orengo, Assessing strategies for improved superfamily recognition, Protein Science, vol.14, issue.7, pp.1800-1810, 2005.
DOI : 10.1110/ps.041056105

D. Lee, O. Redfern, and C. Orengo, Predicting protein function from sequence and structure, Nature Reviews Molecular Cell Biology, vol.8, issue.12, pp.995-1005, 2007.
DOI : 10.1038/nrm2281

K. Liolios, N. Tavernarakis, P. Hugenholtz, and N. Kyrpides, The Genomes On Line Database (GOLD) v.2: a monitor of genome projects worldwide, Nucleic Acids Research, vol.34, issue.90001, pp.332-334, 2006.
DOI : 10.1093/nar/gkj145

D. Benson, I. Karsch-Mizrachi, D. Lipman, J. Ostell, and L. Wheeler, GenBank, Nucleic Acids Research, vol.34, issue.90001, pp.16-20, 2006.
DOI : 10.1093/nar/gkj157

URL : http://doi.org/10.1093/nar/gkj157

B. Dessailly, R. Nair, L. Jaroszewski, J. Fajardo, and A. Kouranov, PSI-2: Structural Genomics to Cover Protein Domain Family Space, Structure, vol.17, issue.6, pp.869-881, 2009.
DOI : 10.1016/j.str.2009.03.015

URL : http://doi.org/10.1016/j.str.2009.03.015

J. Chandonia and S. Brenner, The Impact of Structural Genomics: Expectations and Outcomes, Science, vol.311, issue.5759, pp.347-351, 2006.
DOI : 10.1126/science.1121018

M. Levitt, Nature of the protein universe, Proceedings of the National Academy of Sciences, vol.106, issue.27, pp.11079-11084, 2009.
DOI : 10.1073/pnas.0905029106

A. Andreeva, D. Howorth, S. Brenner, J. Hubbard, and C. Chothia, SCOP database in 2004: refinements integrate structure and sequence family data, Nucleic Acids Research, vol.32, issue.90001, pp.226-229, 2004.
DOI : 10.1093/nar/gkh039

A. Andreeva, D. Howorth, J. Chandonia, S. Brenner, and T. Hubbard, Data growth and its impact on the SCOP database: new developments, Nucleic Acids Research, vol.36, issue.Database, pp.419-425, 2008.
DOI : 10.1093/nar/gkm993

F. Pearl, A. Todd, I. Sillitoe, M. Dibley, and O. Redfern, The CATH Domain Structure Database and related resources Gene3D and DHS provide comprehensive domain family information for genome analysis, Nucleic Acids Research, vol.33, issue.Database issue, pp.247-251, 2005.
DOI : 10.1093/nar/gki024

H. Berman, J. Westbrook, Z. Feng, G. Gilliland, and T. Bhat, The Protein Data Bank, Nucleic Acids Research, vol.28, issue.1, pp.235-242, 2000.
DOI : 10.1093/nar/28.1.235

M. Marti-Renom, A. Stuart, A. Fiser, R. Sanchez, and F. Melo, Comparative Protein Structure Modeling of Genes and Genomes, Annual Review of Biophysics and Biomolecular Structure, vol.29, issue.1, pp.291-325, 2000.
DOI : 10.1146/annurev.biophys.29.1.291

C. Venclovas, Comparative modeling of CASP4 target proteins: Combining results of sequence search with three-dimensional structure assessment, Proteins: Structure, Function, and Genetics, vol.45, issue.S5, pp.47-54, 2001.
DOI : 10.1002/prot.10008

T. Schwede, J. Kopp, N. Guex, and M. Peitsch, SWISS-MODEL: an automated protein homology-modeling server, Nucleic Acids Research, vol.31, issue.13, pp.3381-3385, 2003.
DOI : 10.1093/nar/gkg520

URL : http://doi.org/10.1093/nar/gkg520

S. Altschul, T. Madden, A. Schaffer, J. Zhang, and Z. Zhang, Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Research, vol.25, issue.17, pp.3389-3402, 1997.
DOI : 10.1093/nar/25.17.3389

A. Schaffer, L. Aravind, T. Madden, J. Shavirin, and S. Spouge, Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements, Nucleic Acids Research, vol.29, issue.14, pp.2994-3005, 2001.
DOI : 10.1093/nar/29.14.2994

K. Karplus, C. Barrett, M. Cline, M. Diekhans, and L. Grate, Predicting protein structure using only sequence information, Proteins, vol.3, pp.121-125, 1999.

A. Marchler-Bauer, J. Anderson, P. Cherukuri, C. DeWeese-Scott, and L. Geer, CDD: a Conserved Domain Database for protein classification, Nucleic Acids Research, vol.33, issue.Database issue, pp.192-196, 2005.
DOI : 10.1093/nar/gki069

A. Bateman, E. Birney, R. Durbin, S. Eddy, and R. Finn, Pfam 3.1: 1313 multiple alignments and profile HMMs match the majority of proteins, Nucleic Acids Research, vol.27, issue.1, pp.260-262, 1999.
DOI : 10.1093/nar/27.1.260

A. Bateman, L. Coin, R. Durbin, R. Finn, and V. Hollich, The Pfam protein families database, Nucleic Acids Research, vol.32, pp.138-141, 2004.
URL : https://hal.archives-ouvertes.fr/hal-01294685

R. Finn, J. Tate, J. Mistry, P. Coggill, and S. Sammut, The Pfam protein families database, Nucleic Acids Research, vol.36, issue.Database, pp.281-288, 2008.
DOI : 10.1093/nar/gkm960

URL : https://hal.archives-ouvertes.fr/hal-01294685

J. Gough, K. Karplus, R. Hughey, and C. Chothia, Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure, Journal of Molecular Biology, vol.313, issue.4, pp.903-919, 2001.
DOI : 10.1006/jmbi.2001.5080

M. Madera and J. Gough, A comparison of profile hidden Markov model procedures for remote homology detection, Nucleic Acids Research, vol.30, issue.19, pp.4321-4328, 2002.
DOI : 10.1093/nar/gkf544

M. Madera, C. Vogel, S. Kummerfeld, C. Chothia, and J. Gough, The SUPERFAMILY database in 2004: additions and improvements, Nucleic Acids Research, vol.32, issue.90001, pp.235-239, 2004.
DOI : 10.1093/nar/gkh117

A. Krogh, M. Brown, I. Mian, K. Sjolander, and D. Haussler, Hidden Markov Models in Computational Biology, Journal of Molecular Biology, vol.235, issue.5, pp.1501-1531, 1994.
DOI : 10.1006/jmbi.1994.1104

R. Finn, J. Mistry, B. Schuster-Böckler, S. Griffiths-Jones, and V. Hollich, Pfam: clans, web tools and services, Nucleic Acids Research, vol.34, issue.90001, pp.247-251, 2006.
DOI : 10.1093/nar/gkj149

URL : http://doi.org/10.1093/nar/gkj149

E. Sonnhammer, S. Eddy, E. Birney, A. Bateman, and R. Durbin, Pfam: multiple sequence alignments and HMM-profiles of protein domains, Nucleic Acids Research, vol.26, issue.1, pp.320-322, 1998.
DOI : 10.1093/nar/26.1.320

Y. Wang, R. Sadreyev, and N. Grishin, PROCAIN: protein profile comparison with assisting information, Nucleic Acids Research, vol.37, issue.11, pp.3522-3530, 2009.
DOI : 10.1093/nar/gkp212

X. Liu, K. Fang, and W. Wang, The number of protein folds and their distribution over families in nature, Proteins: Structure, Function, and Bioinformatics, vol.54, issue.3, pp.491-499, 2004.
DOI : 10.1002/prot.10514

A. Guerler and E. Knapp, Novel protein folds and their nonsequential structural analogs, Protein Science, vol.17, issue.8, pp.1374-1382, 2008.
DOI : 10.1110/ps.035469.108

P. Koehl and M. Levitt, De novo protein design. II. Plasticity in sequence space, Journal of Molecular Biology, vol.293, issue.5, pp.1183-1193, 1999.
DOI : 10.1006/jmbi.1999.3212

S. Larson, A. Garg, J. Desjarlais, and V. Pande, Increased detection of structural templates using alignments of designed sequences, Proteins: Structure, Function, and Genetics, vol.51, issue.3, pp.390-396, 2003.
DOI : 10.1002/prot.10346

S. Larson and V. Pande, Sequence Optimization for Native State Stability Determines the Evolution and Folding Kinetics of a Small Protein, Journal of Molecular Biology, vol.332, issue.1, pp.275-286, 2003.
DOI : 10.1016/S0022-2836(03)00832-5

S. Larson, J. England, J. Desjarlais, and V. Pande, Thoroughly sampling sequence space: Large-scale protein design of structural ensembles, Protein Science, vol.11, issue.12, pp.2804-2813, 2002.
DOI : 10.1110/ps.0203902

G. Dantas, B. Kuhlman, D. Callender, M. Wong, and D. Baker, A Large Scale Test of Computational Protein Design: Folding and Stability of Nine Completely Redesigned Globular Proteins, Journal of Molecular Biology, vol.332, issue.2, pp.449-460, 2003.
DOI : 10.1016/S0022-2836(03)00888-X

C. Saunders and D. Baker, Recapitulation of Protein Family Divergence using Flexible Backbone Protein Design, Journal of Molecular Biology, vol.346, issue.2, pp.631-644, 2005.
DOI : 10.1016/j.jmb.2004.11.062

H. Zhou and Y. Zhou, Fold recognition by combining sequence profiles derived from evolution and from depth-dependent structural alignment of fragments, Proteins: Structure, Function, and Bioinformatics, vol.58, issue.2, pp.321-328, 2005.
DOI : 10.1002/prot.20308

F. Ding and N. Dokholyan, Emergence of Protein Fold Families through Rational Design, PLoS Computational Biology, vol.2, issue.7, p.e85, 2006.
DOI : 10.1371/journal.pcbi.0020085

M. Schmidt am Busch, D. Mignon, and T. Simonson, Computational protein design as a tool for fold recognition, Proteins: Structure, Function, and Bioinformatics, vol.77, issue.1, pp.139-158, 2009.
DOI : 10.1002/prot.22426

URL : https://hal.archives-ouvertes.fr/hal-00488182

J. Ponder and F. Richards, Tertiary templates for proteins, Journal of Molecular Biology, vol.193, issue.4, pp.775-791, 1987.
DOI : 10.1016/0022-2836(87)90358-5

H. Hellinga and F. Richards, Optimal sequence selection in proteins of known structure by simulated evolution., Proceedings of the National Academy of Sciences, vol.91, issue.13, pp.5803-5807, 1994.
DOI : 10.1073/pnas.91.13.5803

B. Dahiyat and S. Mayo, Protein design automation, Protein Science, vol.5, issue.5, pp.895-903, 1996.
DOI : 10.1002/pro.5560050511

URL : http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2143401

P. Harbury, J. Plecs, B. Tidor, T. Alber, and P. Kim, High-Resolution Protein Design with Backbone Freedom, Science, vol.282, issue.5393, pp.1462-1467, 1998.
DOI : 10.1126/science.282.5393.1462

N. Dokholyan and E. Shakhnovich, Understanding hierarchical protein evolution from first principles, Journal of Molecular Biology, vol.312, issue.1, pp.289-307, 2001.
DOI : 10.1006/jmbi.2001.4949

J. Desjarlais and T. Handel, Side-chain and backbone flexibility in protein core design, Journal of Molecular Biology, vol.289, pp.305-318, 1999.

B. Kuhlman and D. Baker, Native protein sequences are close to optimal for their structures, Proceedings of the National Academy of Sciences, vol.97, issue.19, pp.10383-10388, 2000.
DOI : 10.1073/pnas.97.19.10383

L. Wernisch, S. Héry, and S. Wodak, Automatic protein design with all atom force-fields by exact and heuristic optimization, Journal of Molecular Biology, vol.301, issue.3, pp.713-736, 2000.
DOI : 10.1006/jmbi.2000.3984

A. Jaramillo, L. Wernisch, S. Héry, and S. Wodak, Folding free energy function selects native-like protein sequences in the core but not on the surface, Proceedings of the National Academy of Sciences, vol.99, issue.21, pp.13554-13559, 2002.
DOI : 10.1073/pnas.212068599

B. Kuhlman, G. Dantas, G. Ireton, G. Varani, and B. Stoddard, Design of a Novel Globular Protein Fold with Atomic-Level Accuracy, Science, vol.302, issue.5649, pp.1364-1368, 2003.
DOI : 10.1126/science.1089427

M. Dwyer, L. Looger, and H. Hellinga, Computational Design of a Biologically Active Enzyme, Science, vol.304, issue.5679, pp.1967-1971, 2004.
DOI : 10.1126/science.1098432

J. Havranek and P. Harbury, Automated design of specificity in molecular recognition, Nature Structural Biology, vol.10, issue.1, pp.45-52, 2003.
DOI : 10.1038/nsb877

S. Ventura and L. Serrano, Designing proteins from the inside out, Proteins: Structure, Function, and Bioinformatics, vol.56, issue.1, pp.1-10, 2004.
DOI : 10.1002/prot.20142

A. Wollacott, A. Zanghellini, P. Murphy, and D. Baker, Prediction of structures of multidomain proteins from structures of the individual domains, Protein Science, vol.16, issue.2, pp.165-175, 2007.
DOI : 10.1110/ps.062270707

J. Swift, W. Wehbi, B. Kelly, X. Stowell, and J. Saven, Design of Functional Ferritin-Like Proteins with Hydrophobic Cavities, Journal of the American Chemical Society, vol.128, issue.20, pp.6611-6619, 2006.
DOI : 10.1021/ja057069x

S. Kang and J. Saven, Computational protein design: structure, function and combinatorial diversity, Current Opinion in Chemical Biology, vol.11, issue.3, pp.329-334, 2007.
DOI : 10.1016/j.cbpa.2007.05.006

P. Koehl and M. Levitt, De novo protein design. I. In search of stability and specificity, Journal of Molecular Biology, vol.293, issue.5, pp.1161-1181, 1999.
DOI : 10.1006/jmbi.1999.3211

P. Koehl and M. Levitt, Structure-based conformational preferences of amino acids, Proceedings of the National Academy of Sciences, vol.96, issue.22, pp.12524-12529, 1999.
DOI : 10.1073/pnas.96.22.12524

I. Hubner, E. Deeds, and E. Shakhnovich, Understanding ensemble protein folding at atomic detail, Proceedings of the National Academy of Sciences, vol.103, issue.47, pp.17747-17752, 2006.
DOI : 10.1073/pnas.0605580103

URL : http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1635542

N. Pokala and T. Handel, Energy functions for protein design I: Efficient and accurate continuum electrostatics and solvation, Protein Science, vol.13, issue.4, pp.925-936, 2004.
DOI : 10.1110/ps.03486104

N. Pokala and T. Handel, Energy Functions for Protein Design: Adjustment with Protein-Protein Complex Affinities, Models for the Unfolded State, and Negative Design of Solubility and Specificity, Journal of Molecular Biology, vol.347, issue.1, pp.203-227, 2005.
DOI : 10.1016/j.jmb.2004.12.019

A. Chowdry, K. Reynolds, M. Hanes, M. Voorhies, and N. Pokala, An object-oriented library for computational protein design, Journal of Computational Chemistry, vol.28, issue.14, pp.2378-2388, 2007.
DOI : 10.1002/jcc.20727

K. Raha, A. Wollacott, M. Italia, and J. Desjarlais, Prediction of amino acid sequence from structure, Protein Science, vol.9, issue.6, pp.1106-1119, 2000.
DOI : 10.1110/ps.9.6.1106

A. Lopes, A. Aleksandrov, C. Bathelt, G. Archontis, and T. Simonson, Computational sidechain placement and protein mutagenesis with implicit solvent models, Proteins: Structure, Function, and Bioinformatics, vol.67, issue.4, pp.853-867, 2007.
DOI : 10.1002/prot.21379

M. Schmidt am Busch, A. Lopes, D. Mignon, and T. Simonson, Computational protein design: Software implementation, parameter optimization, and performance of a simple model, Journal of Computational Chemistry, vol.29, issue.7, pp.1092-1102, 2008.
DOI : 10.1002/jcc.20870

URL : https://hal.archives-ouvertes.fr/hal-00488192

M. Schmidt am Busch, A. Lopes, N. Amara, C. Bathelt et al., Testing the Coulomb/accessible surface area solvent model for protein stability, ligand binding, and protein design, BMC Bioinformatics, vol.9, pp.148-163, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00488191

A. Panchenko and S. Bryant, A comparison of position-specific score matrices based on sequence and structure alignments, Protein Science, vol.11, issue.2, pp.361-370, 2002.
DOI : 10.1110/ps.19902

M. Socolich, S. Lockless, W. Russ, H. Lee, and K. Gardner, Evolutionary information for specifying a protein fold, Nature, vol.437, issue.7058, pp.512-518, 2005.
DOI : 10.1038/nature03991

A. Wlodawer, J. Deisenhofer, and R. Huber, Comparison of two highly refined structures of bovine pancreatic trypsin inhibitor, Journal of Molecular Biology, vol.193, issue.1, pp.145-156, 1987.
DOI : 10.1016/0022-2836(87)90633-4

P. Tuffery, C. Etchebest, S. Hazout, and R. Lavery, A New Approach to the Rapid Determination of Protein Side Chain Conformations, Journal of Biomolecular Structure and Dynamics, vol.8, issue.6, pp.1267-1289, 1991.
DOI : 10.1080/07391102.1991.10507882

URL : https://hal.archives-ouvertes.fr/hal-00313445

B. Brooks, R. Bruccoleri, B. Olafson, D. States, and S. Swaminathan, CHARMM: A program for macromolecular energy, minimization, and dynamics calculations, Journal of Computational Chemistry, vol.4, issue.2, pp.187-217, 1983.
DOI : 10.1002/jcc.540040211

B. Lee and F. Richards, The interpretation of protein structures: Estimation of static accessibility, Journal of Molecular Biology, vol.55, issue.3, pp.379-400, 1971.
DOI : 10.1016/0022-2836(71)90324-X

A. Street and S. Mayo, Pairwise calculation of protein solvent-accessible surface areas, Folding and Design, vol.3, issue.4, pp.253-258, 1998.
DOI : 10.1016/S1359-0278(98)00036-4

A. Brünger, X-PLOR Version 3.1: A System for X-ray Crystallography and NMR, 1992.

D. Anderson, BOINC: A System for Public-Resource Computing and Storage, Fifth IEEE/ACM International Workshop on Grid Computing, 2004.
DOI : 10.1109/GRID.2004.14

R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Biological sequence analysis, 2002.
DOI : 10.1017/CBO9780511790492

L. Murphy, A. Wallqvist, and R. Levy, Simplified amino acid alphabets for protein fold recognition and implications for folding, Protein Engineering Design and Selection, vol.13, issue.3, pp.149-152, 2000.
DOI : 10.1093/protein/13.3.149

G. Launay, R. Mendez, S. Wodak, and T. Simonson, Recognizing protein-protein interfaces with empirical potentials and reduced amino acid alphabets, BMC Bioinformatics, vol.8, issue.1, pp.270-291, 2007.
DOI : 10.1186/1471-2105-8-270

URL : https://hal.archives-ouvertes.fr/hal-00488198

I. Halperin, H. Wolfson, and R. Nussinov, Correlated mutations: Advances and limitations. A study on fusion proteins and on the Cohesin-Dockerin families, Proteins: Structure, Function, and Bioinformatics, vol.63, issue.4, pp.832-845, 2006.
DOI : 10.1002/prot.20933

A. Marin, J. Pothier, K. Zimmermann, and J. Gibrat, FROST: A filter-based fold recognition method, Proteins: Structure, Function, and Genetics, vol.49, issue.4, pp.493-509, 2002.
DOI : 10.1002/prot.10231

D. Lin, G. Gish, Z. Songyang, and T. Pawson, The Carboxyl Terminus of B Class Ephrins Constitutes a PDZ Domain Binding Motif, Journal of Biological Chemistry, vol.274, issue.6, pp.3726-3733, 1999.
DOI : 10.1074/jbc.274.6.3726

R. Tonikian, Y. Zhang, S. Sazinsky, B. Currell, and J. Yeh, A specificity map for the PDZ domain family, PLoS Biology, vol.6, pp.2043-2059, 2008.

B. Shoichet, W. Baase, R. Kuroki, and B. Matthews, A relationship between protein stability and protein function., Proceedings of the National Academy of Sciences, vol.92, issue.2, pp.452-456, 1995.
DOI : 10.1073/pnas.92.2.452

A. Elcock, Prediction of functionally important residues based solely on the computed energetics of protein structure, Journal of Molecular Biology, vol.312, issue.4, pp.885-896, 2001.
DOI : 10.1006/jmbi.2001.5009