Introduction to BioCyc and Pathway Tools
BioCyc and Pathway Tools accelerate science by providing comprehensive data for sequenced organisms
and a wide suite of bioinformatics tools.
BioCyc is a database collection and website with extensive search and analysis tools.
The Pathway Tools software can be installed at your site to compute metabolic reconstructions and
create BioCyc-like databases for genomes of interest.
BioCyc Database Collection
The BioCyc collection of Pathway/Genome Databases (PGDBs)
provides a reference on the genomes, metabolic pathways, and (in some cases) regulatory networks of
thousands of sequenced organisms. Each database combines information
from three sources:
- Computational inferences: Our Pathway Tools
software predicts the metabolic pathways of an
organism, predicts which genes code for missing enzymes in metabolic
pathways, predicts protein complexes, and predicts operons. We compute orthologs across BioCyc databases.
- Imported data: BioCyc integrates information
from other bioinformatics databases, such as protein feature and Gene
Ontology information from UniProt, gene-essentiality datasets from OGEE,
and regulatory information from RegTransBase.
- Manual curation:
The curated databases (called
Tier 1 and Tier 2 PGDBs)
have received literature-based curation to enter new gene functions,
pathways, protein complexes, regulation, and more. Curated PGDB
entries include mini-review summaries, thousands of literature
citations, and evidence codes.
  - The EcoCyc DB is the result of more than 20 person-years of effort to enter
    information from 40,000+ E. coli articles about gene function, metabolism, transport, and regulatory processes.
  - The MetaCyc DB describes metabolic pathways, enzymes, and metabolites from all domains of life, curated from 74,000+ publications.
The EcoCyc and MetaCyc databases are freely available to all users because their curation is supported by NIH funding.
Also free is the database for the cyanobacterium Arthrospira platensis NIES-39 as an example of a Tier 3 database.
The other BioCyc databases are available via subscription, which supports their curation.
Free access to the other BioCyc databases can be obtained for teaching purposes.
BioCyc data files may be downloaded to your site, and BioCyc
data can be queried via web services.
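As a rough illustration of what a web-services query can look like, the sketch below builds a request URL for the XML description of a single database object. The endpoint `https://websvc.biocyc.org/getxml`, the `id` and `detail` parameters, and the detail levels are assumptions taken from memory of the BioCyc web services documentation, not from this page; verify them there before use. The helper name `build_getxml_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL of the BioCyc "getxml" web service; check the current
# BioCyc web services documentation before relying on it.
GETXML_BASE = "https://websvc.biocyc.org/getxml"

# Assumed allowed values for the "detail" parameter.
DETAIL_LEVELS = ("none", "low", "full")


def build_getxml_url(org_id: str, object_id: str, detail: str = "low") -> str:
    """Build a URL asking for the XML description of one database object."""
    if detail not in DETAIL_LEVELS:
        raise ValueError(f"detail must be one of {DETAIL_LEVELS}, got {detail!r}")
    # Objects are addressed as ORGID:OBJECT-ID; keep the colon unescaped.
    query = urlencode({"id": f"{org_id}:{object_id}", "detail": detail}, safe=":")
    return f"{GETXML_BASE}?{query}"


if __name__ == "__main__":
    # "ECOLI" and "TRP" are used here only as illustrative identifiers.
    print(build_getxml_url("ECOLI", "TRP"))
```

The sketch only constructs the URL and performs no network access. Subscription databases may additionally require an authenticated session.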
BioCyc.org Bioinformatics Tools
BioCyc.org provides a suite of bioinformatics tools (see Tools menu) for accessing and analyzing the
BioCyc databases. The tools provide search and visualization, omics
data analysis, and comparative genomics and comparative pathway
analysis:
- Search:
Multiple search tools enable users to find genes, pathways, and metabolites of
interest, which are presented in corresponding information pages. Most searches apply to the currently selected organism database,
which can be changed with the "Change Current Database" button at the top of most pages. There are two ways to search across multiple databases:
(1) use Tools → Search → Cross Organism Search, or (2) in commands such as Tools → Search → Search Genes, Proteins, and RNAs,
select "Search across multiple organisms/databases" under the list of buttons.
- Visualization:
A variety of visualization tools are provided, such as metabolic-pathway diagrams and zoomable diagrams depicting the complete metabolic chart of each organism.
- Genome Browser: The BioCyc genome browser
enables visual genome exploration and analysis of positional genome datasets via tracks.
- Omics Data Analysis: Tools include statistical over-representation analysis
and visualization of gene expression, proteomics, or
metabolomics data on metabolic-chart diagrams
and on the Omics Dashboard.
- SmartTables: Provide biologist-friendly analysis capabilities for groups of genes or metabolites that are stored in your BioCyc account.
- Metabolic Route Search: Search for reaction paths connecting specified
metabolites in the metabolic network, with the option of adding new reactions
from the MetaCyc DB.
- Comparative Analysis: Tools include comparison of pathways, metabolites, transporters,
and regulatory networks -- see menu command Analysis → Comparative Analysis and the new Comparative Genome Dashboard at Analysis → Comparative Genome Dashboard.
- Sequence Analysis: Extract sequences, run BLAST searches and sequence pattern searches, and perform multiple alignments.
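The statistical over-representation analysis listed under Omics Data Analysis asks whether a gene set (for example, the genes of one pathway) appears in a list of interest more often than chance would predict. A common way to quantify this is a one-sided hypergeometric test. The sketch below shows that calculation only. It is not a description of the exact statistic or multiple-testing correction BioCyc applies, and the function name and parameters are hypothetical.

```python
from math import comb


def enrichment_pvalue(population: int, pathway_size: int,
                      sample_size: int, hits: int) -> float:
    """One-sided hypergeometric p-value, P(X >= hits).

    population   -- total number of genes considered
    pathway_size -- genes in the population that belong to the pathway
    sample_size  -- genes in the list of interest (e.g. differentially expressed)
    hits         -- genes in the list of interest that belong to the pathway
    """
    if not (0 <= pathway_size <= population and 0 <= sample_size <= population):
        raise ValueError("pathway_size and sample_size must lie within the population")
    max_hits = min(pathway_size, sample_size)
    start = max(hits, 0)
    # Number of ways to draw the sample with exactly k pathway genes,
    # summed over all k at least as large as the observed count.
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, sample_size - k)
        for k in range(start, max_hits + 1)
    )
    return tail / comb(population, sample_size)


if __name__ == "__main__":
    # Toy numbers: 4,000 genes, a 40-gene pathway, 200 genes of interest,
    # 8 of which fall in the pathway.
    print(enrichment_pvalue(4000, 40, 200, 8))
```

In practice the test is repeated for every pathway, so the resulting p-values are normally adjusted for multiple testing before interpretation.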
Pathway Tools Software
Pathway Tools is an enterprise genome and pathway data management tool and is among the most extensive
bioinformatics software packages. It is
the software used to create BioCyc databases and it powers the BioCyc.org website.
Pathway Tools can run as both a desktop application and as a web server.
Installing Pathway Tools at your site brings these advantages:
- Install a private local set of BioCyc PGDBs on your intranet
- Create new PGDBs from your own genome data, generating metabolic reconstructions, operon inferences, and more
- Apply its extensive search, visualization, and analysis tools to your own genome data
- Edit PGDBs interactively to add new gene functions and pathways
- Build quantitative metabolic flux models using Flux-Balance Analysis with the MetaFlux tool
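For readers unfamiliar with Flux-Balance Analysis, the standard textbook formulation is a linear program over reaction fluxes at steady state. The statement below is that generic formulation, not a description of MetaFlux internals. The symbols are assumptions introduced here: S is the stoichiometric matrix (metabolites by reactions), v the vector of reaction fluxes, c the objective weights (often selecting a biomass reaction), and l and u the lower and upper flux bounds.

```latex
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S v = 0, \\
& l \le v \le u
\end{aligned}
```

The constraint S v = 0 states that every internal metabolite is produced and consumed at equal rates, and the bounds encode reaction reversibility and nutrient uptake limits.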
How to Learn More About BioCyc
The following additional information exists about the BioCyc site:
Definitions of Terminology on the BioCyc Website
Here we define a few key terms. See the
glossary for
more definitions.
Pathway/Genome Database (PGDB). A database that describes
- The genome of an organism -- its chromosome(s), genes, and genome sequence
- The product of each gene
- The metabolic network of the organism -- its pathways,
reactions, enzymes, and metabolites
- The transporter complement of the organism
- The regulatory network of the organism, including its operons,
transcription factors, and the interactions between transcription
factors and their small-molecule ligands and DNA binding sites
Tier 1 PGDB.
PGDBs in Tier 1, such as EcoCyc, MetaCyc, and HumanCyc, have received at least one year of
literature-based curation by scientists.
More information about curation practices is available in the Curator Guide.
Tier 2 PGDB.
PGDBs in Tier 2 were generated by the PathoLogic
program, which predicted their metabolic pathways; their operons (for bacteria only); protein complexes; and
some missing enzymes in their predicted pathways (pathway hole fillers). The resulting PGDBs
underwent manual review to remove detectable false-positive pathway
predictions and to perform
refinements such as defining protein complexes. They
also underwent a period of literature-based curation, for example
to enter metabolic pathways that had been experimentally elucidated
in the organism but were not inferred by PathoLogic. [list of Tier 2 PGDBs]
Tier 3 PGDB.
PGDBs in Tier 3 were generated by PathoLogic, which
predicted metabolic pathways, operons (for bacteria only), pathway hole fillers,
and transport reactions.
The resulting PGDBs did not undergo manual review of the pathway predictions,
nor subsequent literature curation. Therefore, the pathway predictions should
be treated with due caution.
[list of Tier 3 PGDBs]
Pathway Tools Software. Pathway Tools is used to
construct, update, visualize, query, and analyze PGDBs, such as the BioCyc collection.
It is freely available to academics interested in creating PGDBs for
organisms of interest to them.
Components of Pathway Tools are:
- The Pathway/Genome Navigator supports querying, visualization, and analysis of PGDBs
- The Pathway/Genome Editors support interactive updating
and refinement of PGDBs
- PathoLogic performs computational inferences such as
pathway prediction
- MetaFlux enables creation of quantitative metabolic models from PGDBs
BioCyc. The collection of PGDBs at
https://BioCyc.org/ is called the BioCyc Database Collection.
EcoCyc and MetaCyc are component databases within the BioCyc collection.
BioCyc Organism Home Pages
BioCyc contains home pages for the following organisms. You can visit these pages to pre-select these organism databases
for searches.