
About neXtProt

Developed in collaboration between the SIB Swiss Institute of Bioinformatics and Geneva Bioinformatics (GeneBio) SA, neXtProt will be a comprehensive human-centric discovery platform, offering its users a seamless integration of and navigation through protein-related data.

Background

In September 2008, the UniProt/Swiss-Prot group achieved the first complete manual annotation of what is believed to be the full set of human proteins (derived from about 20,000 genes). This was a major milestone, as this collection of data is already rich in information pertinent to modern biomolecular medical research. But there remains a large gap in our knowledge of human proteins in terms of functional information as well as protein characterization (PTMs, protein-protein interactions, subcellular locations, etc.).

The SIB and GeneBio have joined their efforts and expertise to bring you neXtProt, which is designed to help researchers make sense of what all these human proteins do in our bodies.

At the SIB, neXtProt is developed in the CALIPHO (Computer Analysis and Laboratory Investigation of Proteins of Human Origin) group that was created jointly with the University of Geneva. Headed by Amos Bairoch and Lydie Lane, CALIPHO is an interdisciplinary team which aims to use a variety of methodologies to help uncover the function of uncharacterized human proteins. The development of neXtProt is carried out in close collaboration with the Swiss-Prot group, which was led by Professor Bairoch for 23 years, and with GeneBio.

The team is currently working on three different challenges:

  • Adding more information to the corpus of data on human proteins that is already in Swiss-Prot. In addition to all the data available in Swiss-Prot on human proteins, data originating from a variety of high-throughput approaches (such as microarrays, antibody screens, proteomics, interactomics, structural genomics) are being added to neXtProt. All of these data sets are carefully selected so as to provide only high-quality data. More data will be integrated into neXtProt. In the short and medium term this will include:
    • High-quality proteomics experimental data
    • siRNA experimental data
    • 3D experimental data
    • Pathway data
    • Population-related variant information
    • Protein-protein and protein-drug interaction data
  • Organizing the data in such a way that it is possible to seamlessly build powerful queries in the most user-friendly way possible.
  • Developing software tools ranging from sequence analysis to text and data mining to be integrated in various research environments. These tools will meet the specific needs of both academic and industrial users.

Our vision

"A journey of a thousand miles begins with a single step."

Lao-Tzu, Chinese philosopher (604 BC - 531 BC)

You are now using the public beta release of neXtProt. This is the beginning of what we hope will be a long journey. We invite you to join us in our mission to build a comprehensive knowledge platform on human proteins.

neXtProt is both a new and an old resource: new, because we want to create an innovative integrative resource around human proteins, and old, because we are building it on top of the high-quality, solid work that has been the hallmark of UniProtKB/Swiss-Prot since its inception in 1986. The extensive efforts made by Swiss-Prot to functionally annotate human proteins and curate their sequences and many other features are the foundation on which neXtProt relies. However, this is clearly not enough to populate a resource that needs to address the complexity of the universe of human proteins.

Technological advances have made it feasible to accumulate huge amounts of data. This has led to the emergence of many research programs led by the omics, which have completely transformed the way we accumulate information on biological systems. Yet while we are amassing data, we are not advancing at the same rate in knowledge creation. This is due to the high noise level of most high-throughput methodologies, but also because data gathering is often perceived as a goal in itself rather than as a means to prove or disprove hypotheses.
Many different types of high-throughput data will need to be integrated in neXtProt, but this task requires a critical appraisal of the quality of such data. This is why the neXtProt data integration philosophy is based on a three-tier system:

  • Gold - included. By default, searches will only consider gold-level data.
  • Silver - included, but not shown in searches unless requested by the user.
  • Bronze - excluded.
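
The three tiers above can be pictured as a simple filter applied at two points: once when data is integrated, and once at search time. The following Python sketch is only an illustration of that rule, not neXtProt's actual implementation. The names `Quality`, `Annotation`, `integrate` and `search`, as well as the placeholder records, are assumptions made up for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Quality(Enum):
    """Hypothetical labels for the three tiers described above."""
    GOLD = "gold"
    SILVER = "silver"
    BRONZE = "bronze"


@dataclass(frozen=True)
class Annotation:
    """A made-up record type: one piece of data attached to a protein."""
    protein: str
    description: str
    quality: Quality


def integrate(annotations):
    """Keep gold and silver data; bronze data is excluded altogether."""
    return [a for a in annotations if a.quality is not Quality.BRONZE]


def search(annotations, include_silver=False):
    """Return gold data by default; add silver only when the user asks for it."""
    allowed = {Quality.GOLD}
    if include_silver:
        allowed.add(Quality.SILVER)
    return [a for a in annotations if a.quality in allowed]


# Placeholder records, invented purely for illustration.
records = [
    Annotation("PROTEIN_A", "placeholder annotation 1", Quality.GOLD),
    Annotation("PROTEIN_B", "placeholder annotation 2", Quality.SILVER),
    Annotation("PROTEIN_C", "placeholder annotation 3", Quality.BRONZE),
]

stored = integrate(records)                       # bronze is dropped here
default_hits = search(stored)                     # gold only
extended_hits = search(stored, include_silver=True)  # gold and silver
```

In this sketch the bronze tier never reaches the stored data, while the silver tier is stored but stays hidden until the search explicitly opts in, which mirrors the "included, but not shown" wording of the list.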

However, we cannot annotate all these high-throughput data sets alone. We need to capitalize on the high-quality databases and bioinformatics resources maintained by dedicated groups worldwide. These collaborations will be key to the success of neXtProt. We would already like to thank the groups of Mathias Uhlen (HPA from KTH-Stockholm) and Marc Robinson-Rechavi (Bgee from SIB Lausanne), who allowed us to make use of their excellent protein and mRNA expression resources. We are in discussion with many other groups that have carried out such filtering and clean-up tasks in fields such as interactomics, proteomics or phylogenomics, to name a few.

We are convinced that the comprehensive biocuration of human proteins is a participative endeavor. It is with this goal in mind that we have built neXtProt as a web-based participative platform and we will need to ask for your contribution very soon. We want to convince you that investing some of your time in an altruistic manner will be good for the community and therefore also good for you!

Data is useful, but we want neXtProt to be much more than a well-organized, comprehensive data repository. In addition to enhanced search capabilities, we want to offer tools that will help to make sense of the contents. Some of these tools will be sequence-based (we already offer BLAST and will soon add a multiple alignment option), but we want to go beyond sequence into the realm of the semantic analysis of protein lists. The big challenge is to understand how the proteins in a given set resemble or differ from one another in their various features. This is far from trivial, and here too we want to collaborate. We will also offer APIs to allow our users to develop tools that will benefit from neXtProt’s clean and high-quality data.

Ultimately, our goal is to be able to offer the capability of modeling hypotheses within neXtProt. This perhaps sounds a bit ambitious! Maybe it is, but we want you to take this first step and begin to take part in this collaborative journey. Start by using this beta release of neXtProt and tell us what features you like or what you dislike about it.

Enjoy your journey!

Amos Bairoch and the neXtProt team at the Swiss Institute of Bioinformatics and GeneBio.


Contact details

SIB Swiss Institute of Bioinformatics
CMU - Rue Michel-Servet 1
1211 Geneva
Switzerland
Tel: +41 22 379 5050

Geneva Bioinformatics (GeneBio) SA
25 avenue de Champel
1206 Geneva
Switzerland
Tel: +41 22 702 99 00