A stable, scalable and unbiased proteome set for sequence analysis
and functional annotation
Release 2015_02, February 04, 2015
Representative Proteomes (RPs) are proteomes selected from Representative Proteome Groups (RPGs), which contain similar proteomes grouped by their co-membership in UniRef50 clusters. A Representative Proteome is the proteome that best represents all the proteomes in its group in terms of the majority of the sequence space and information. RPs at 75%, 55%, 35% and 15% co-membership thresholds are provided so that users can decrease or increase the granularity of the sequence space to suit their requirements (Chen et al., 2011). Representative genomes (RGs) are also constructed based on the corresponding RPs.
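To make the idea of a co-membership threshold concrete, here is a minimal Python sketch. It assumes one simple definition of co-membership: the percentage of proteins in one proteome whose UniRef50 cluster also contains a protein from another proteome. The exact measure and grouping procedure are those described in Chen et al., 2011, and may differ from this. The function name, the protein and cluster labels, and the toy data are all hypothetical.

```python
def co_membership(cluster_of, proteome_a, proteome_b):
    """Percent of proteome_a proteins whose cluster also holds a proteome_b protein.

    cluster_of maps a protein label to its cluster label (illustrative only).
    """
    clusters_b = {cluster_of[p] for p in proteome_b if p in cluster_of}
    members_a = [p for p in proteome_a if p in cluster_of]
    if not members_a:
        return 0.0
    shared = sum(1 for p in members_a if cluster_of[p] in clusters_b)
    return 100.0 * shared / len(members_a)


# Toy data: two of the four proteins in proteome A share a cluster with proteome B.
cluster_of = {
    "a1": "C1", "a2": "C2", "a3": "C3", "a4": "C4",
    "b1": "C1", "b2": "C2", "b3": "C9",
}
proteome_a = ["a1", "a2", "a3", "a4"]
proteome_b = ["b1", "b2", "b3"]

score = co_membership(cluster_of, proteome_a, proteome_b)

# Thresholds from the text above; a pair counts as similar at a threshold
# when its score reaches that threshold (an assumption of this sketch).
thresholds_met = [t for t in (75, 55, 35, 15) if score >= t]
print(score, thresholds_met)
```

In this toy case the pair would count as similar only at the two lower thresholds, which illustrates why lower thresholds produce fewer, larger groups and therefore a coarser view of the sequence space.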
The RP set is updated every four weeks
(synchronized with each UniProtKB release), and the data are available for
browsing, downloading and BLAST search.
Starting with the 2014_09 release, we use UniProt Proteome identifiers as Representative Proteome identifiers instead of NCBI Taxonomy identifiers. For those using the RPG files below, the first column now contains UniProt Proteome identifiers. Previously, it contained NCBI Taxonomy identifiers.
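Scripts that read RPG files from both before and after this change may need to tell the two identifier types apart. The Python sketch below is one way to do that, under stated assumptions: that columns are separated by whitespace, that UniProt Proteome identifiers look like "UP" followed by digits (for example UP000005640), and that NCBI Taxonomy identifiers are plain integers (for example 9606). The RPG file layout is not specified here, so any other decoration on a line is not handled, and the function name is hypothetical.

```python
import re

# Assumed identifier shapes (not taken from the RPG file specification).
UNIPROT_PROTEOME_ID = re.compile(r"^UP\d+$")
NCBI_TAXONOMY_ID = re.compile(r"^\d+$")


def first_column_kind(line):
    """Classify the first whitespace-separated field of a line."""
    fields = line.split()
    if not fields:
        return "unknown"
    first = fields[0]
    if UNIPROT_PROTEOME_ID.match(first):
        return "uniprot_proteome"
    if NCBI_TAXONOMY_ID.match(first):
        return "ncbi_taxonomy"
    return "unknown"


print(first_column_kind("UP000005640\t9606"))
print(first_column_kind("9606\tHUMAN"))
```

A check like this on the first data line can let a script fail early with a clear message instead of silently joining on the wrong identifier type.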
BLAST sequence search
Browse RPs database
Download RPs files
* Sequence files for each cut-off include the sequences from model organisms with complete proteomes.
All sequence files have been filtered to contain one-protein-per-gene.
Make your own RP sequence file
Download RG files
Representative genomes (RGs) are constructed based on the corresponding RPs. UniProt taxonomy ids are mapped to NCBI genome project ids and RefSeq project ids. The RefSeq ids can be used to retrieve corresponding genomes and proteomes from NCBI.
Chen C, Natale DA, Finn RD, Huang H, Zhang J, Wu CH, Mazumder R.
Representative proteomes: a stable, scalable and unbiased proteome set for sequence analysis and functional annotation. PLoS One. 2011 Apr 27;6(4):e18910. PubMed PMID: 21556138; PubMed Central PMCID: PMC3083393.