
Cathy H. Wu, Ph.D.

Professor - Department of Biochemistry and Molecular & Cellular Biology
Department of Oncology
Director - Protein Information Resource
Georgetown University Medical Center
10/16/01 - Dr. Wu appears in The Scientist
Cathy Wu at the Crossroads: She saved the Protein Information Resource database and now aims to restore it to the world's best
Primary Expertise
Dr. Wu has conducted bioinformatics research since 1990 and has developed several protein classification systems and databases. She has managed large software and database projects, has led the bioinformatics effort of the Protein Information Resource (PIR) since 1999, and became the PIR Director in 2001. Her research interests include protein family classification and functional annotation, biological data integration, and literature mining.

Academic Appointments
1989-1994 Assistant Professor, Department of Computer Science, University of Texas at Tyler
1990-1999 Assistant Professor (90-94); Associate Professor (94-98); Professor (98-99) of Biomathematics, University of Texas Health Center at Tyler
1999-2002 Director of Bioinformatics, PIR (99-02); Vice President (00-02), National Biomedical Research Foundation, Washington, D.C.
2001-present Professor, Department of Biochemistry & Molecular Biology; Director, PIR, Georgetown University Medical Center (GUMC)
2002-present Professor, Department of Oncology; Member, Lombardi Comprehensive Cancer Center, GUMC

Professional Activities
Member, Advisory Committee, Protein Structure Initiative, NIGMS, NIH (2002-present).
Member, Board of Directors, International Society for Computational Biology (2002-2004).
Over 15 Conference Organizing/Program Committees, including: ISMB, PSB, EITC, CBGI, BIOKDD
Over 20 Grant Review Panels/Study Sections at NIH, NSF, and DOE
Over 70 Invited Presentations/Lectures at international conferences, workshops, academia, and industry

B.S., Plant Pathology, National Taiwan University, Taiwan, 1978
M.S., Plant Pathology, Purdue University, W. Lafayette, IN, 1982
Ph.D., Molecular Plant Pathology, Purdue University, W. Lafayette, IN, 1984
Postdoc, Molecular Biology, Michigan State University, E. Lansing, MI, 1986
M.S., Computer Science, University of Texas at Tyler, Tyler, TX, 1989

United States Patent No. 5,845,049, December 1, 1998, C. H. Wu. A neural network system with n-gram term weighting method for molecular sequence classification and motif identification

BOOK: Bioinformatics for Comparative Proteomics.
Wu CH, Chen C (Eds.).
Methods in Molecular Biology, Volume 694, Series Editor J.M. Walker, Humana Press. 2011.
BOOK: Computational Biology and Genome Informatics.
Wang J, Wu CH, Wang P (Eds.).
World Scientific. 2003.
BOOK: Neural Networks and Genome Informatics.
Wu CH, McLarty JM (Eds.).
Methods in Computational Biology and Biochemistry, Volume 1, Series Editor A. K. Konopka, Elsevier Science. 2000.
UniProt: a hub for protein information.
UniProt Consortium.
Nucleic Acids Res. Jan 28;43 (Database issue) (2015)
UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches.
Suzek BE, Wang Y, Huang H, McGarvey PB, Wu CH; the UniProt Consortium.
Bioinformatics. 2014 Nov 13. pii: btu739. [Epub ahead of print]. PubMed PMID: 25398609.
Protein Ontology: a controlled structured network of protein entities.
Natale DA, Arighi CN, Blake JA, Bult CJ, Christie KR, Cowart J, D'Eustachio P, Diehl AD, Drabkin HJ, Helfer O, Huang H, Masci AM, Ren J, Roberts NV, Ross K, Ruttenberg A, Shamovsky V, Smith B, Yerramalla MS, Zhang J, AlJanahi A, Celen I, Gan C, Lv M, Schuster-Lezell E, Wu CH.
Nucleic Acids Res. 2014 Jan;42(Database issue):D415-21. doi: 10.1093/nar/gkt1173. Epub 2013 Nov 21. PubMed PMID: 24270789; PubMed Central PMCID: PMC3964965.
Activities at the Universal Protein Resource (UniProt).
UniProt Consortium.
Nucleic Acids Res. 2014 Jan;42(Database issue):D191-8. doi: 10.1093/nar/gkt1140. Epub 2013 Nov 18. PubMed PMID: 24253303; PubMed Central PMCID: PMC3965022.
Transcription factors and genetic circuits orchestrating the complex, multilayered response of Clostridium acetobutylicum to butanol and butyrate stress.
Wang Q, Venkataramanan KP, Huang H, Papoutsakis ET, Wu CH.
BMC Syst Biol. 2013 Nov 6;7:120. doi: 10.1186/1752-0509-7-120. PubMed PMID: 24196194; PubMed Central PMCID: PMC3828012.
BioC: a minimalist approach to interoperability for biomedical text processing.
Comeau DC, Islamaj Dogan R, Ciccarese P, Cohen KB, Krallinger M, Leitner F, Lu Z, Peng Y, Rinaldi F, Torii M, Valencia A, Verspoor K, Wiegers TC, Wu CH, Wilbur WJ.
Database (Oxford). 2013 Sep 18;2013:bat064. doi: 10.1093/database/bat064. Print 2013. PubMed PMID: 24048470; PubMed Central PMCID: PMC3889917.
A fast Peptide Match service for UniProt Knowledgebase.
Chen C, Li Z, Huang H, Suzek BE, Wu CH; UniProt Consortium.
Bioinformatics. 2013 Nov 1;29(21):2808-9. doi: 10.1093/bioinformatics/btt484. Epub 2013 Aug 19. PubMed PMID: 23958731; PubMed Central PMCID: PMC3799477.
Construction of protein phosphorylation networks by data mining, text mining and ontology integration: analysis of the spindle checkpoint.
Ross KE, Arighi CN, Ren J, Huang H, Wu CH.
Database (Oxford). 2013 Jun 7;2013:bat038. doi: 10.1093/database/bat038. Print 2013. PubMed PMID: 23749465; PubMed Central PMCID: PMC3675891.
Use of the protein ontology for multi-faceted analysis of biological processes: a case study of the spindle checkpoint.
Ross KE, Arighi CN, Ren J, Natale DA, Huang H, Wu CH.
Front Genet. 2013 Apr 26;4:62. doi: 10.3389/fgene.2013.00062. eCollection 2013. PubMed PMID: 23637705; PubMed Central PMCID: PMC3636526.
Prediction of contact matrix for protein-protein interaction.
Gonzalez AJ, Liao L, Wu CH.
Bioinformatics. 2013 Apr 15;29(8):1018-25. doi: 10.1093/bioinformatics/btt076. Epub 2013 Feb 15. PubMed PMID: 23418186; PubMed Central PMCID: PMC3624801.
An overview of the BioCreative 2012 Workshop Track III: interactive text mining task.
Arighi CN, Carterette B, Cohen KB, Krallinger M, Wilbur WJ, Fey P, Dodson R, Cooper L, Van Slyke CE, Dahdul W, Mabee P, Li D, Harris B, Gillespie M, Jimenez S, Roberts P, Matthews L, Becker K, Drabkin H, Bello S, Licata L, Chatr-aryamontri A, Schaeffer ML, Park J, Haendel M, Van Auken K, Li Y, Chan J, Muller HM, Cui H, Balhoff JP, Chi-Yang Wu J, Lu Z, Wei CH, Tudor CO, Raja K, Subramani S, Natarajan J, Cejuela JM, Dubey P, Wu C.
Database (Oxford). 2013 Jan 17;2013:bas056. doi: 10.1093/database/bas056. Print 2013. PubMed PMID: 23327936; PubMed Central PMCID: PMC3625048.
Ribosomally synthesized and post-translationally modified peptide natural products: overview and recommendations for a universal nomenclature.
P.G. Arnison, M.J. Bibb, G. Bierbaum, A.A. Bowers, T.S. Bugni, G. Bulaj, J.A. Camarero, D.J. Campopiano, G.L. Challis, J. Clardy, P.D. Cotter, D.J. Craik, M. Dawson, E. Dittmann, S. Donadio, P.C. Dorrestein, K.D. Entian, M.A. Fischbach, J.S. Garavelli, U. Göransson, C.W. Gruber, D.H. Haft, T.K. Hemscheidt, C. Hertweck, C. Hill, A.R. Horswill, M. Jaspars, W.L. Kelly, J.P. Klinman, O.P. Kuipers, A.J. Link, W. Liu, M.A. Marahiel, D.A. Mitchell, G.N. Moll, B.S. Moore, R. Müller, S.K. Nair, I.F. Nes, G.E. Norris, B.M. Olivera, H. Onaka, M.L. Patchett, J. Piel, M.J. Reaney, S. Rebuffat, R.P. Ross, H.G. Sahl, E.W. Schmidt, M.E. Selsted, K. Severinov, B. Shen, K. Sivonen, L. Smith, T. Stein, R.D. Süssmuth, J.R. Tagg, G.L. Tang, A.W. Truman, J.C. Vederas, C.T. Walsh, J.D. Walton, S.C. Wenzel, J.M. Willey, W.A. van der Donk.
(2013) Nat. Prod. Rep. 30, 108-160 [PMID:23165928].
BioCreative-2012 virtual issue.
Wu CH, Arighi CN, Cohen KB, Hirschman L, Krallinger M, Lu Z, Mattingly C, Valencia A, Wiegers TC, Wilbur WJ.
Database (Oxford). 2012 Dec 5;2012:bas049. doi: 10.1093/database/bas049. Print 2012. PubMed PMID: 23221175; PubMed Central PMCID: PMC3514749.
The eFIP system for text mining of protein interaction networks of phosphorylated proteins.
Tudor CO, Arighi CN, Wang Q, Wu CH, Vijay-Shanker K.
Database (Oxford). 2012 Dec 5;2012:bas044. doi: 10.1093/database/bas044. Print 2012. PubMed PMID: 23221174; PubMed Central PMCID: PMC3514748.
Building a Classifier for Identifying Sentences Pertaining to Disease-Drug Relationships in Tardive Dyskinesia.
X. Bi, H. Huang, S. Matis-Mitchell, P. McGarvey, M. Torii, H. Shatkay and C. Wu.
Proc. of the IEEE Int. Conf. on Bioinformatics and Biomedicine (BIBM). November, 2012
Text mining for the biocuration workflow.
Hirschman L, Burns GA, Krallinger M, Arighi C, Cohen KB, Valencia A, Wu CH, Chatr-Aryamontri A, Dowell KG, Huala E, Lourenço A, Nash R, Veuthey AL, Wiegers T, Winter AG.
Database (Oxford). 2012 Apr 18;2012:bas020. doi: 10.1093/database/bas020. Print 2012. PubMed PMID: 22513129; PubMed Central PMCID: PMC3328793.
Community annotation and bioinformatics workforce development in concert--Little Skate Genome Annotation Workshops and Jamborees.
Wang Q, Arighi CN, King BL, Polson SW, Vincent J, Chen C, Huang H, Kingham BF, Page ST, Rendino MF, Thomas WK, Udwary DW, Wu CH; North East Bioinformatics Collaborative Curation Team.
Database (Oxford). 2012 Mar 20;2012:bar064. doi: 10.1093/database/bar064. Print 2012. PubMed PMID: 22434832; PubMed Central PMCID: PMC3308154.
Informatics and data quality at collaborative multicenter Breast and Colon Cancer Family Registries.
McGarvey PB, Ladwa S, Oberti M, Dragomir AD, Hedlund EK, Tanenbaum DM, Suzek BE, Madhavan S.
J Am Med Inform Assoc. 2012 Feb 9. [Epub ahead of print] PMID: 22323393.
BioCreative III interactive task: an overview.
Arighi CN, Roberts PM, Agarwal S, Bhattacharya S, Cesareni G, Chatr-Aryamontri A, Clematide S, Gaudet P, Giglio MG, Harrow I, Huala E, Krallinger M, Leser U, Li D, Liu F, Lu Z, Maltais LJ, Okazaki N, Perfetto L, Rinaldi F, Sætre R, Salgado D, Srinivasan P, Thomas PE, Toldo L, Hirschman L, Wu CH.
BMC Bioinformatics. 2011 Oct 3;12 Suppl 8:S4. doi: 10.1186/1471-2105-12-S8-S4. PubMed PMID: 22151968; PubMed Central PMCID: PMC3269939.
Overview of the BioCreative III Workshop.
Arighi CN, Lu Z, Krallinger M, Cohen KB, Wilbur WJ, Valencia A, Hirschman L, Wu CH.
BMC Bioinformatics. 2011 Oct 3;12 Suppl 8:S1. doi: 10.1186/1471-2105-12-S8-S1. PubMed PMID: 22151647; PubMed Central PMCID: PMC3269932.
The representation of protein complexes in the Protein Ontology (PRO).
Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, Ruttenberg A, D'Eustachio P, Smith B, Blake JA, Wu C.
BMC Bioinformatics. 2011 Sep 19;12:371. doi: 10.1186/1471-2105-12-371. PubMed PMID: 21929785; PubMed Central PMCID: PMC3189193.
Reorganizing the protein space at the Universal Protein Resource (UniProt).
UniProt Consortium.
Nucleic Acids Res. 40 (Database issue): D71-5 (2012). PMID: 22102590
Representative proteomes: a stable, scalable and unbiased proteome set for sequence analysis and functional annotation.
Chen C, Natale DA, Finn RD, Huang H, Zhang J, Wu CH, Mazumder R.
PLoS One. 2011 Apr 27;6(4):e18910. PMID: 21556138
A comprehensive protein-centric ID mapping service for molecular data integration.
Huang H, McGarvey PB, Suzek BE, Mazumder R, Zhang J, Chen Y, Wu CH.
Bioinformatics. Apr 15;27(8):1190-1. 2011.
Protein-centric data integration for functional analysis of comparative proteomics data.
McGarvey PB, Zhang J, Natale DA, Wu CH, Huang H.
Methods Mol Biol. 694:323-39. 2011.
Structure-guided rule-based annotation of protein functional sites in UniProt Knowledgebase.
Vasudevan S, Vinayaka CR, Natale DA, Huang H, Kahsay RY, Wu CH.
Methods Mol Biol. 694:91-105. 2011.
A tutorial on protein ontology resources for proteomic studies.
Arighi CN.
Methods Mol Biol. 694:77-90. 2011.
eFIP: a tool for mining functional impact of phosphorylation from literature.
Arighi CN, Siu AY, Tudor CO, Nchoutmboube JA, Wu CH, Shanker VK.
Methods Mol Biol. 694:63-75. 2011.
Protein bioinformatics databases and resources.
Chen C, Huang H, Wu CH.
Methods Mol Biol. 694:3-24. 2011.
Omics-Based Molecular Target and Biomarker Identification.
Zhang-Zhi Hu, Hongzhan Huang, Cathy H. Wu, Mira Jung, Anatoly Dritschilo, Anna T. Riegel, Anton Wellstein
Methods Mol Biol. 719, 547-571. 2011.
Ongoing and future developments at the Universal Protein Resource.
UniProt Consortium.
Nucleic Acids Res. 39(Database issue):D214-9. 2011.
The Protein Ontology: a structured representation of protein forms and complexes.
Natale DA, Arighi CN, Barker WC, Blake JA, Bult CJ, Caudy M, Drabkin HJ, D'Eustachio P, Evsikov AV, Huang H, Nchoutmboube J, Roberts NV, Smith B, Zhang J, Wu CH.
Nucleic Acids Res. 39(Database issue):D539-45. 2011.
Phylogenomic analysis of marine Roseobacters.
Tang K, Huang H, Jiao N, Wu CH.
PLoS One. 5(7):e11604. 2010.
Document classification for mining host pathogen protein-protein interactions.
Yin L, Xu G, Torii M, Niu Z, Maisog JM, Wu C, Hu Z, Liu H.
Artif Intell Med. 49(3):155-60. 2010.
Molecular mechanisms mediating the effect of mono-(2-ethylhexyl) phthalate on hormone-stimulated steroidogenesis in MA-10 mouse tumor Leydig cells.
Fan J, Traore K, Li W, Amri H, Huang H, Wu C, Chen H, Zirkin B, Papadopoulos V.
Endocrinology. 151(7):3348-62. 2010.
Prediction of Catalytic Residues in Proteins Using a Consensus of Prediction (CoP) Approach.
Petrova NV, Wu CH.
IEEE International Conference on Bioinformatics and Bioengineering, bibe, 226-231. 2010.
A database developed from information extracted from chemotherapy drug package inserts to enhance future prescriptions.
D'Souza MK, Alabed GJ, Wheatley JM, Roberts N, Veturi Y, Bi X, Continisio CH.
CISE2011, IEEE Conference Record #17768; IEEE Catalog Number: CFP1160F-PRT; ISBN: 978-1-4244-8361-7. 2010.
From protein sequences to 3D-structures and beyond: the example of the UniProt knowledgebase.
Hinz U; UniProt Consortium.
Cell Mol Life Sci. 67(7):1049-64. 2010.
Community annotation in biology.
Mazumder R, Natale DA, Julio JA, Yeh LS, Wu CH.
Biol Direct. 5:12. 2010.
Protein Bioinformatics Infrastructure for the Integration and Analysis of Multiple High-Throughput omics Data.
Chen C, McGarvey PB, Huang H, Wu CH.
Adv Bioinformatics. 2010; 2010:423589. 2010.
The Universal Protein Resource (UniProt) in 2010.
UniProt Consortium.
Nucleic Acids Res. 38(Database issue):D142-8. 2010.
Systems integration of biodefense omics data for analysis of pathogen-host interactions and identification of potential targets.
McGarvey PB, Huang H, Mazumder R, Zhang J, Chen Y, Zhang C, Cammer S, Will R, Odle M, Sobral B, Moore M, Wu CH.
PLoS One. 4(9):e7162. 2009.
Sequence signatures in envelope protein may determine whether flaviviruses produce hemorrhagic or encephalitic syndromes.
Barker WC, Mazumder R, Vasudevan S, Sagripanti JL, Wu CH.
Virus Genes. 39(1):1-9. 2009.
Infrastructure for the life sciences: design and implementation of the UniProt website.
Jain E, Bairoch A, Duvaud S, Phan I, Redaschi N, Suzek BE, Martin MJ, McGarvey P, Gasteiger E.
BMC Bioinformatics. 10:136. 2009.
TGF-beta signaling proteins and the Protein Ontology.
Arighi CN, Liu H, Natale DA, Barker WC, Drabkin H, Blake JA, Smith B, Wu CH.
BMC Bioinformatics. 10 Suppl 5:S3. 2009.
BioTagger-GM: a gene/protein name recognition system.
Torii M, Hu Z, Wu CH, Liu H.
J Am Med Inform Assoc. 16(2):247-55. 2009.
An improved ontological representation of dendritic cells as a paradigm for all cell types.
Masci AM, Arighi CN, Diehl AD, Lieberman AE, Mungall C, Scheuermann RH, Smith B, Cowell LG.
BMC Bioinformatics. 10:70. 2009.
InterPro: the integrative protein signature database.
Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Das U, Daugherty L, Duquenne L, Finn RD, Gough J, Haft D, Hulo N, Kahn D, Kelly E, Laugraud A, Letunic I, Lonsdale D, Lopez R, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Mulder N, Natale D, Orengo C, Quinn AF, Selengut JD, Sigrist CJ, Thimma M, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C.
Nucleic Acids Res. 37(Database issue):D211-5. 2009.
The Universal Protein Resource (UniProt) 2009.
UniProt Consortium.
Nucleic Acids Res. 37(Database issue):D169-74. 2009.
Structure-guided comparative analysis of proteins: principles, tools, and applications for predicting function.
Mazumder R, Vasudevan S.
PLoS Comput Biol. 4(9):e1000151. 2008.
Integrated Bioinformatics for Radiation-Induced Pathway Analysis from Proteomics and Microarray Data.
Hu ZZ, Huang H, Cheema A, Jung M, Dritschilo A, Wu CH.
J Proteomics Bioinform. 1(2):47-60. 2008.
Protein Bioinformatics.
McGarvey P, Huang H, Wu CH.
in: Medical Applications of Mass Spectrometry. Part III Biomolecules, Chapter 10:203-222. K Vekey, A Telekes, A Vertes (Eds.) Elsevier Science. 2008.
Protein functional annotation by homology.
Mazumder R, Vasudevan S, Nikolskaya AN.
Methods Mol Biol. 484:465-90. 2008.
An emerging cyberinfrastructure for biodefense pathogen and pathogen-host data.
Zhang C, Crasta O, Cammer S, Will R, Kenyon R, Sullivan D, Yu Q, Sun W, Jha R, Liu D, Xue T, Zhang Y, Moore M, McGarvey P, Huang H, Chen Y, Zhang J, Mazumder R, Wu C, Sobral B.
Nucleic Acids Res. 36(Database issue):D884-91. 2008.
Bioinformatic Databases.
Herbert KG, Spirollari J, Wang JTL, Piel WH, Westbrook J, Barker WC, Hu ZZ, Wu CH.
in: Wiley Encyclopedia of Computer Science and Engineering (Cassie Craig Assistant Editor), John Wiley & Sons, Ltd. 2007.
A comparison study on algorithms of detecting long forms for short forms in biomedical text.
Torii M, Hu ZZ, Song M, Wu CH, Liu H.
BMC Bioinformatics. 8 Suppl 9:S5. 2007.
Framework for a protein ontology.
Natale DA, Arighi CN, Barker WC, Blake J, Chang TC, Hu Z, Liu H, Smith B, Wu CH.
BMC Bioinformatics. 8 Suppl 9:S1. 2007.
Computational analysis and identification of amino acid sites in dengue E proteins relevant to development of diagnostics and vaccines.
Mazumder R, Hu ZZ, Vinayaka CR, Sagripanti JL, Frost SD, Kosakovsky Pond SL, Wu CH.
Virus Genes. 35(2):175-86. 2007.
Integration of bioinformatics resources for functional analysis of gene expression and proteomic data.
Huang H, Hu ZZ, Arighi CN, Wu CH.
Front Biosci. 12:5071-88. 2007.
Identification of Sensory and Signal-Transducing Domains in Two-Component Signaling Systems.
Galperin MY, Nikolskaya AN.
Methods in Enzymology 422:47-74. 2007.
UniRef: comprehensive and non-redundant UniProt reference clusters.
Suzek BE, Huang H, McGarvey P, Mazumder R, Wu CH.
Bioinformatics. 23(10):1282-8. 2007.
Challenges and solutions in proteomics.
Huang H, Shukla HD, Wu C, Saxena S.
Curr Genomics. 8(1):21-8. 2007.
PIRSF family classification system for protein functional and evolutionary analysis.
Nikolskaya AN, Arighi CN, Huang H, Barker WC, Wu CH.
Evol Bioinform Online. 2:197-209. 2007.
Dependence network modeling for biomarker identification.
Qiu P, Wang ZJ, Liu KJ, Hu ZZ, Wu CH.
Bioinformatics. 23(2):198-206. 2007.
Comparative Bioinformatics Analyses and Profiling of Lysosome-Related Organelle Proteomes.
Hu ZZ, Valencia JC, Huang H, Chi A, Shabanowitz J, Hearing VJ, Appella E, Wu C.
Int J Mass Spectrom. 259(1-3):147-160. 2007.
New developments in the InterPro database.
Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Buillard V, Cerutti L, Copley R, Courcelle E, Das U, Daugherty L, Dibley M, Finn R, Fleischmann W, Gough J, Haft D, Hulo N, Hunter S, Kahn D, Kanapin A, Kejariwal A, Labarga A, Langendijk-Genevaux PS, Lonsdale D, Lopez R, Letunic I, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Nikolskaya AN, Orchard S, Orengo C, Petryszak R, Selengut JD, Sigrist CJ, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C.
Nucleic Acids Res. 35(Database issue):D224-8. 2007.
The Universal Protein Resource (UniProt).
UniProt Consortium.
Nucleic Acids Res. 35(Database issue):D193-7. 2007.

Torii M., Liu H.F., Hu Z.Z. and Wu C.H. (2006). A comparison study of biomedical short form definition detection algorithms. Proceedings of ACM First International Workshop on Text Mining in Bioinformatics, TMBIO 2006.
Natale D.A., Arighi C.N., Barker W., Blake J., Chang T., Hu Z.Z., Liu H., Smith B., Wu C.H. (2006). Framework for a Protein Ontology. Proceedings of ACM First International Workshop on Text Mining in Bioinformatics, TMBIO 2006.

Qiu P., Wang J., Ray Liu K.J., Hu Z.Z., Wu C.H. (2006). Dependence network modeling for biomarker identification. Bioinformatics, 23:198-206.

Hu Z.Z., Valencia J.C., Huang H., Chi A., Shabanowitz J., Hearing V.J., Appella E., Wu C.H. (2006). Comparative bioinformatics analyses and profiling of lysosome-related organelle proteomes. Int J Mass Spec, 259:147-160.

Chi A., Valencia J.C., Hu Z.Z., Watabe H., Yamaguchi H., Mangini N.J., Huang H., Canfield V.A., Cheng K.C., Yang F., Abe R., Yamagishi S., Shabanowitz J., Hearing V.J., Wu C.H., Appella E., Hunt D.F. (2006). Proteomic and Bioinformatic Characterization of the Biogenesis and Function of Melanosomes. J Proteome Res, 5:3135-3144.

Liu H., Hu Z.Z., Torii M., Wu C.H., Friedman C. (2006). Quantitative Assessment of Dictionary-based Protein Named Entity Tagging. J Am Med Inform Assoc, 13:497-507.

Han B., Obradovic Z., Hu Z.Z., Wu C.H., Vucetic S. (2006). Substring selection for biomedical document classification. Bioinformatics, 22:2136-42.
Petrova N.V., Wu C.H. (2006). Prediction of catalytic residues using Support Vector Machine with selected protein sequence and structural properties. BMC Bioinformatics, 7:312.

Yuan X., Hu Z.Z., Wu H.T., Torii M., Narayanaswamy M., Ravikumar K.E., Vijay-Shanker K., Wu C.H. (2006). An online literature mining tool for protein phosphorylation. Bioinformatics, 22(13):1668-1669.

Nikolskaya A.N., Arighi C.N., Huang H., Barker W.C., Wu C.H. (2006). PIRSF Family Classification System for Protein Functional and Evolutionary Analysis. Evolutionary Bioinformatics Online, 2:209-221.

Liu, H.F., Hu, Z.Z., Zhang, J., Wu, C.H. (2006). BioThesaurus: a web-based thesaurus of protein and gene names. Bioinformatics, 22, 103-105.

Wu, C.H., Apweiler, R., Bairoch, A., Natale, D.A., Barker, W.C., Boeckmann, B., Ferro, S., Gasteiger, E., Huang, H., Lopez, R., Magrane, M., Martin, M.J., Mazumder, R., O'Donovan, C., Redaschi, N., Suzek, B. (2006). The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Research, 34, D187-91.

Liu, H., Hu, Z.Z., Wu, C.H. (2005). DynGO: a tool for visualizing and mining of Gene Ontology and its associations. BMC Bioinformatics, 6, 201.

Mazumder, R., Natale, D., Murthy, S., Thiagarajan, R., Wu, C.H. (2005). Computational identification of strain-, species- and genus-specific proteins. BMC Bioinformatics, 6, 279.

Schneider, M., Bairoch, A., Wu, C.H., Apweiler, R. (2005). Plant Protein Annotation in the UniProt Knowledgebase. Plant Physiology, 138, 59-66.
Hu, Z.Z., Narayanaswamy, M., Ravikumar, K.E., Vijay-Shanker, K., Wu, C.H. (2005). Literature mining and database annotation of protein phosphorylation using a rule-based system. Bioinformatics, 21(11), 2759-2765.
Mani I., Hu Z., Jang S.B., Samuel K., Krause M., Phillips J., Wu C.H. (2005). Protein name tagging guidelines: lessons learned. Comparative and Functional Genomics, 6(1-2), 72-76.

Natale, D. A., Vinayaka, C. R. and Wu, C. H. (2005). Large-scale, classification-driven, rule-based functional annotation of proteins. Wiley, New York.

Wu, C.H., Huang, H., Nikolskaya, A., Vinayaka, C. R., Chung, S., Zhang, J. (2005). Family Classification and Integrative Associative Analysis for Protein Functional Annotation in Bioinformatics: New Research. Nova Publishers, New York.

Bairoch, A., Apweiler, R., Wu, C. H., Barker, W. C., Boeckmann, B., Ferro, S., Gasteiger, E., Huang, H., Lopez, R., Magrane, M., Martin, M.J., Natale, D.A., O'Donovan, C., Redaschi, N., Yeh, L.S. (2005). The Universal Protein Resource (UniProt). Nucleic Acids Research, 33: D154-159.
Wu, C. H. and Nebert, D. W. (2004). Update on human genome completion and annotations: Protein Information Resource. Human Genomics, 1, 229-233.
Wu, C. H., Huang, H., Nikolskaya, A., Hu, Z. and Barker, W. C. (2004). The iProClass integrated database for protein functional analysis. Computational Biology and Chemistry, 28, 87-96.
Wu, C. H., Nikolskaya, A., Huang, H., Yeh, L.-S., Natale, D., Vinayaka, C. R., Hu, Z., Mazumder, R., Kumar, S., Kourtesis, P., Ledley, R. S., Suzek, B. E., Arminski, L., Chen, Y., Zhang, J., Cardenas, J. L., Chung, S., Castro-Alvear, J., Dinkov, G. and Barker, W. C. (2004). PIRSF family classification system at the Protein Information Resource. Nucleic Acids Research, 32, D112-114.
Apweiler, R., Bairoch, A. and Wu, C. H. (2004). Protein sequence databases. Current Opinion in Chemical Biology, 8, 76-80.
Apweiler, R., Bairoch, A., Wu, C. H., Barker, W. C., Boeckmann, B., Ferro, S., Gasteiger, E., Huang, H., Lopez, R., Magrane, M., Martin, M. J., Natale, D. A., O'Donovan, C., Redaschi, N., Yeh, L. S. (2004). UniProt: Universal Protein Knowledgebase. Nucleic Acids Research, 32, D115-119.
Hu, Z., Mani, I., Hermoso, V., Liu, H. and Wu, C. H. (2004). iProLINK: an integrated protein resource for literature mining. Computational Biology and Chemistry, 28, 409-416.
Wu, C. H., Yeh, L.-S., Huang, H., Arminski, L., Castro-Alvear, J., Chen, Y., Hu, Z., Kourtesis, P., Ledley, R. S., Suzek, B.E., Vinayaka, C.R., Zhang, J. and Barker, W.C. (2003). The Protein Information Resource. Nucleic Acids Research, 31, 345-347.
Huang, H., Barker, W. C., Chen, Y. and Wu, C. H. (2003). iProClass: An Integrated Database of Protein Family, Function, and Structure Information. Nucleic Acids Research, 31, 390-392.
Wu, C. H., Huang, H., Yeh, L.-S. and Barker, W. C. (2003). Protein family classification and functional annotation. Computational Biology and Chemistry, 27, 37-47.
Wu, C. H., Huang, H., Arminski, L., Castro-Alvear, J., Chen, Y., Hu, Z., Ledley, R. S., Lewis, K. C., Mewes, H. W., Orcutt, B. C., Suzek, B. E., Tsugita, A., Vinayaka, C. R., Yeh, L. S., Zhang, J. and Barker, W. C. (2002). The Protein Information Resource: an integrated public resource of functional annotation of proteins. Nucleic Acids Research, 30, 35-37.
Wu, C.H., Xiao, C., Hou, Z., Huang, H., and Barker, W. C. (2001). iProClass: An integrated and comprehensive protein classification database. Nucleic Acids Research, 29, 52-54.
McGarvey, P., Huang, H., Barker, W. C., Orcutt, B. C. and Wu, C. H. (2000). PIR Web site: New resource for bioinformatics. Bioinformatics, 16, 290-291.
Wu, C. H., Huang, H. and McLarty, J. (1999). Gene family identification network design for protein sequence analysis. International Journal of Artificial Intelligence Tools, 8, 419-432.
Wu, C. H., Shivakumar, S. and Huang, H. (1999). ProClass protein family database. Nucleic Acids Research, 27, 272-274.
Barker, W. C., Garavelli, J. S., McGarvey, P. B., Marzec, C. R., Orcutt, B. C., Srinivasarao, G. Y., Yeh, L. S., Ledley, R. S., Mewes, H. W., Pfeiffer, F., Tsugita, A. and Wu, C. H. (1999). The PIR-International Protein Sequence Database. Nucleic Acids Research, 27, 39-43.
Wu, C. H., S. Shivakumar, C. V. Shivakumar and S. Chen. (1998). GeneFIND web server for protein family identification and information retrieval. Bioinformatics, 14, 223-224.
Wu, C. H. (1997). Artificial neural networks for molecular sequence analysis. Computers & Chemistry, 21, 237-256.
Wu, C. H., Chen, H. L. and Chen, S. (1997). Counter-propagation neural networks for molecular sequence classification: Supervised LVQ and dynamic node allocation. Applied Intelligence, 7, 27-38.
Wu, C. H., Zhao, S. and Chen, H. L. (1996). A protein class database organized with ProSite protein groups and PIR superfamilies. Journal of Computational Biology, 3, 547-562.
Wu, C. H., Zhao, S., Chen, H. L., Lo, C. J. and McLarty, J. (1996). Motif identification neural design for rapid and sensitive protein family search. CABIOS, 12, 109-118.
Wu, C. H. (1996). Gene Classification Artificial Neural System. Methods in Enzymology, 266, 71-88.
Wu, C. H., Berry, M., Shivakumar, S. and McLarty, J. (1995). Neural networks for full-scale protein sequence classification: Sequence encoding with singular value decomposition. Machine Learning, 21, 177-193.
Wu, C. H. and Shivakumar, S. (1994). Back-propagation and counter-propagation neural networks for phylogenetic classification of ribosomal RNA sequences. Nucleic Acids Research, 22, 4291-4299.
Wu, C. H., Whitson, G., McLarty, J., Ermongkonchai, A. and Chang, T. (1992). Protein classification artificial neural system. Protein Science, 1, 667-677.
Wu, C. H., Caspar, T., Browse, J., Lindquist, S. and Somerville, C. (1988). Characterization of an HSP70 cognate gene family in Arabidopsis. Plant Physiology, 88, 731-740.
Wu, C. H., Warren, H. L., Sitaraman, K. and Tsai, C. Y. (1988). Translational alterations in maize leaves responding to pathogen infection, paraquat treatment or heat shock. Plant Physiology, 86, 1323-1329.

Revised 07/13/07

©2016 Protein Information Resource