A description of the condensed chromosomes of a eukaryote as they are seen at metaphase. Additional details are revealed by a variety of staining techniques that produce banded chromosomes. See the idiogram of the mouse karyotype at the Department of Pathology at the University of Washington.
In molecular biology, a "library" is a complex mixture of recombinant DNA molecules in a suitable cloning vector representing either the entire genome of an organism (a genomic library) or the messenger RNA population of a whole organism, cell type, or tissue type (a cDNA library).
A ligand is a molecule that a protein binds.
In molecular biology, to join two separate DNA or RNA segments to form a single DNA or RNA molecule enzymatically.
The property displayed by two genes that do not segregate independently of each other. Genes that are linked are on the same chromosome.
Literally, "place". The location of a gene or set of genes on a chromosome.
A statistical estimate of whether two loci are likely to lie near each other on a chromosome and are therefore likely to be inherited together. LOD stands for "log of the odds ratio." In this case, the odds ratio is the likelihood that two markers are linked divided by the likelihood that they are not linked, and the LOD score is the base-10 logarithm of that ratio. A LOD score of three or more (odds of at least 1000 to 1 in favor of linkage) is generally taken to indicate that the two loci are linked.
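The odds ratio described above can be sketched for the simplest case. This is a minimal illustration, assuming phase-known meioses and a simple binomial model in which unlinked loci recombine half the time; the function name `lod_score` and its parameters are hypothetical and are not taken from any particular linkage package.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score at recombination fraction theta.

    Assumes phase-known meioses: the likelihood under linkage is
    theta^k * (1 - theta)^(n - k), and under no linkage it is 0.5^n.
    """
    if not 0.0 < theta < 0.5:
        raise ValueError("theta must be strictly between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    non_recombinants = meioses - recombinants
    # Work in log space so long pedigrees do not underflow.
    log_linked = (recombinants * math.log10(theta)
                  + non_recombinants * math.log10(1.0 - theta))
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked


# One recombinant in 20 meioses, evaluated at theta = 0.05,
# gives a score above the conventional threshold of 3.
score = lod_score(recombinants=1, meioses=20, theta=0.05)
print(round(score, 2))
```

In practice the score is usually evaluated over a range of theta values and the maximum is reported; the single-value call here only shows how the ratio is formed.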
- Any biological feature that can be positioned with respect to other features on a chromosome, by genetic, physical or other mapping methods. For example, a gene, anonymous DNA segment, mutation, or phenotype.
- A feature that distinguishes a particular biological state. For example, an expression profile of natural or engineered genes, or a characteristic morphology.
A gene involved in controlling the expression of a large number of genes.
A pair of nuclear divisions forming gametes wherein the number of chromosomes is reduced from the diploid to the haploid number; resulting cells normally contain one member of each pair of homologous chromosomes.
- That type of inheritance in which a specific trait is affected by a set of alleles of a single gene.
- That type of inheritance in which genetic information is transmitted by one or more nuclear genes, as opposed to cytoplasmic or epigenetic mechanisms.
A metabolic cluster is a Pathway Tools term for a set of biochemical reactions that are biologically related, but are largely unconnected, and therefore do not constitute a pathway in the traditional sense of the word.
A poster-size depiction of the Cellular Overview diagram generated from a PGDB by Pathway Tools.
The Metabolite Tracing facility of Pathway Tools permits users to graphically trace the path of a metabolite through the metabolic network.
An array of hundreds or thousands of spots containing specific DNA sequences for the analysis of gene expression by hybridization. Microarrays are used to detect changes in gene expression by comparing radioactively- or chemically-labeled cDNA prepared from the total mRNA of an experimental sample to that of a control sample. The relative intensity of the signal corresponding to each spot in the microarray reveals whether the expression of a particular gene is increased, decreased, or unchanged in the experimental mRNA sample compared to the control mRNA sample.
MicroRNAs (miRNAs) are a group of noncoding RNA molecules, generally 21 to 24 nucleotides in length, which are usually cleaved from a larger hairpin-containing RNA (itself often processed from some portion of mRNA). miRNAs are conserved through evolution, and their abundance and expression patterns are suggestive of diverse regulatory roles.
A cytoskeletal element of eukaryotic cells that is a long, generally straight, hollow tube with an external diameter of 24 nm, consisting of polymerized monomers of tubulin. Microtubules make up the bulk of the spindle.
The organelles that generate energy in eukaryotic cells. Mitochondria have their own genome encoding a subset of the proteins found in mitochondria; the mitochondrial genome uses an alternate genetic code.
A gene contained within the mitochondrial genome of a eukaryote, transmitted independently of the nuclear genome. The mitochondrial genome is transmitted maternally (from the female parent).
The division of the replicated chromosomes of a eukaryotic cell into two daughter nuclei that are genetically identical to that of the original cell. See the Figure at NHGRI.
Model Organism Database
A database that describes
the genome of an organism, plus other information. A Pathway/Genome Database
is a type of model-organism database.
The organism may
or may not be a model for studying some biological process --- the
term "organism-specific database" could be used equally well, but "model-organism database"
(MOD) has been used historically in bioinformatics. MODs provide a central
resource where new computational and experimental findings about a
genome can be integrated and reviewed by experts for that organism.
MODs serve as a distributed collaboration framework that allows a set
of experts to collaboratively develop a complex DB that serves as a
platform for disseminating the genome and its analyses to the
scientific community. MODs are indispensable resources in the
day-to-day work of experimentalists for a given organism, who use them
as reference sources for genome information, gene annotations,
pathways, and regulatory information.
Refers to the tasks or activities characteristic of particular gene products. For example, transcription factor refers to one of a number of proteins performing similar tasks. In the GO Project vocabularies, Molecular Function is a primary class of terms. See the GO Consortium site for further information.
An antibody produced by cultured cells that have their origin in a single antibody-producing cell and which is therefore of a single molecular type, in contrast to the polyclonal antibodies normally found in the serum of an immunized animal.
An individual consisting of cells of two or more genotypes. One example is that of a normal female mammal heterozygous for different alleles of X-chromosome genes; because of the process of X-inactivation, such females consist of two cell types, each with a different X chromosome inactivated. This is an unusual example because there is no actual difference in genotype between the two cell types, but rather there is an epigenetic difference.
The Multi-Genome Browser displays chromosomal regions for multiple organisms centered on corresponding genes.
An assay that detects specific RNA molecules using a DNA or RNA probe with sequence similarity. Samples are subjected to electrophoresis on a slab gel. A replica of the gel is then made on a membrane by capillary transfer. Specific RNA sequences are then detected on the membrane with a radioactively- or chemically-labeled probe.
DNA or RNA. Each of these compounds consists of a backbone of sugar molecules (ribose for RNA and deoxyribose for DNA) linked by single phosphate groups. Attached to the sugars of the backbone are any of four nitrogenous bases: A, T, C, or G for DNA and A, U, C, or G for RNA. See the Figure at NHGRI.
A monomer unit of nucleic acid, consisting of a purine or pyrimidine base, a sugar molecule (ribose or deoxyribose), and phosphate group(s).
Nucleotide Repeat Expansion
A type of mutation in which a set of tandemly repeated sequences replicates inaccurately to increase the number of repeats. An example of this kind of mutation in humans is the expansion of the CGG repeat in the FMR1 gene, which is associated with fragile X syndrome.
The organelle in a eukaryotic cell that contains the chromosomes. In most types of eukaryotic cells, the nucleus breaks down as chromosomes condense during cell division. See the Figure at NHGRI.
In mathematics, a set with no members or of zero magnitude. If a field has a value of null, it means that the value is unknown. A null value is not the same as a value of zero. (To appreciate the difference, consider the terms "free" and "priceless." If something is free, it has a price of zero. If something is priceless, it has no known price. The difference between null and zero can be crucial; for example, when calculating the average value of a field among many records where one row contains a zero, the zero gets factored into the average. If the field has a null value, it does not get factored into the average.)
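The difference between null and zero when averaging can be seen in a small sketch. It uses Python's built-in `sqlite3` module; the table `items` and its rows are hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory database with a hypothetical table of prices.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [
        ("book", 10.0),
        ("pen", 20.0),
        ("free sample", 0.0),   # zero: the price is known and equals 0
        ("heirloom", None),     # null: the price is unknown
    ],
)

# AVG skips the null row, so it averages three values: (10 + 20 + 0) / 3.
(avg_known,) = conn.execute("SELECT AVG(price) FROM items").fetchone()

# Treating null as zero averages four values instead: (10 + 20 + 0 + 0) / 4.
(avg_null_as_zero,) = conn.execute(
    "SELECT AVG(COALESCE(price, 0)) FROM items"
).fetchone()

print(avg_known, avg_null_as_zero)
conn.close()
```

The zero-priced row lowers the average in both queries, while the null row affects the result only when it is explicitly converted to zero.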
The Pathway Tools omics viewer uses the Cellular, Regulatory and Genome Overviews to illustrate the results of high-throughput experiments in a global metabolic and genomic context by painting experimental data onto these diagrams.
Ontologies are used to structure biological knowledge such that the knowledge
can be manipulated computationally. One type of ontology is a controlled vocabulary,
meaning a list of biological terms, each of which contains an English definition,
and a unique identifier. Databases refer to that term by its unique identifier to
avoid ambiguity, such as the ambiguity that occurs when the same word has multiple meanings.
Another type of ontology is an "is-a" hierarchy in which terms in a controlled
vocabulary are arranged in a generalization hierarchy (or hierarchical classification) such that one term is a
parent of another term if the first term denotes a more general concept than the second term.
For example, the MetaCyc database contains a hierarchical classification of metabolic pathways.
Hierarchical classification systems allow users to retrieve
related sets of objects within the database, and to drill down from more general
to more specific classes of information.
Database schemas and knowledge representation systems are also types of ontologies.
A unit of genetic material that is expressed in a coordinated manner by means of an operator, a promoter, and one or more structural genes that are transcribed together.
One of a number of different kinds of membrane-bound substructures within a eukaryotic cell. Examples include the nucleus, mitochondria, and chloroplasts.
One of a set of homologous genes that have diverged from each other as a consequence of speciation. For example, the alpha globin genes of mouse and chick are orthologs. See the Figure at NCBI.
The relationship of any two homologous characters whose common ancestor lies in the most recent common ancestor of the taxa being considered.
A type of genetic cross in which an organism is crossed to a strain from which it was not recently derived.
P1 Artificial Chromosome. A type of cloning vector derived from bacteriophage P1 that allows foreign DNA segments to be cloned in bacteria. The capacity of a PAC is up to 100 kb of foreign DNA.
The inference component of the Pathway Tools software. PathoLogic contains four predictors:
a predictor of metabolic pathways; a predictor of missing enzymes in metabolic pathways
(the pathway hole filler); an operon predictor; and a program that predicts transport
reactions from transporter functional descriptions. See the Pathway Tools Overview for more information.
An interconnected set of biochemical reactions, where reactions are connected by sharing common reactants and products. In metabolic pathways, reactants and products are typically low-molecular-weight chemical compounds. In signaling pathways, reactants and products are typically proteins.
A Pathway/Genome Database (PGDB) is a database managed by SRI's Pathway Tools software
that describes an information space ranging from genomes to pathways.
A PGDB such as EcoCyc describes the genome of an organism, the product of
each gene, the biochemical reaction(s) catalyzed by each gene product,
the substrates of each reaction, and the organization of reactions
into pathways. The schema of a PGDB can also describe the regulatory
network of an organism.
Pathway Tools Overviews
A genome-scale depiction of information within a PGDB. There are three different Overviews: the Cellular Overview depicts the metabolic network, the Regulatory Overview depicts the regulatory network, and the Genome Overview depicts the full genome.
Pathway Tools Posters
Pathway Tools can generate PostScript and/or PDF files of a poster-size depiction of a genome map and of a metabolic map from a PGDB.
One of a set of homologous genes that have diverged from each other as a consequence of genetic duplication. For example, the mouse alpha globin and beta globin genes are paralogs. The relationship between mouse alpha globin and chick beta globin is also considered paralogous. See the Figure at NCBI.
The relationship of any two homologous characters that arose by a genetic duplication.
An item of information such as a name, a number, or a selected option passed to a program by a user or another program. Parameters affect the operation of the program receiving them. Parameters are values that you select or enter in the query form fields.
Pathway Evidence Report
PathoLogic can generate a “pathway evidence report” Web page that lists all pathways it has predicted in an organism and the evidence supporting each predicted pathway. This report provides a convenient way for a scientist to review the evidence for each pathway.
A pathway hole is a pathway reaction thought to occur in an organism for which no corresponding enzyme has been identified in the genome.
In BioCyc, this term refers to terms in a hierarchical controlled vocabulary such as those containing Gene Ontology (GO) terms. A "parent" of a term is one any number of levels above it in the hierarchy from which it is descended. For example, the GO term enzyme [GO:0003824] is a parent to the GO term alcohol dehydrogenase [GO:0004022].
Polymerase Chain Reaction. A method of amplifying specific DNA segments based on hybridization to a primer pair. A DNA sample is denatured by heating in the presence of a vast molar excess of short single-stranded DNA primers (around 20 nucleotides) whose sequence is chosen based on the target sequence. The reaction mixture also contains a thermostable DNA polymerase, dNTPs, and buffer. The primer sequences are selected so that they:
- are derived from opposite strands of the target sequence,
- have their 3' ends facing each other, and
- are separated by a length of DNA that can be reliably synthesized in vitro.
The sample is then cooled to a temperature that allows primer annealing and in vitro replication. The sample is subjected to multiple cycles of denaturation and cooling to allow multiple rounds of replication. The quantity of the target sequence doubles during each cycle, causing the target sequence to be amplified, while other DNA sequences in the sample remain unamplified. See the Figure at Access Excellence.
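The doubling of the target sequence in each cycle can be written as a one-line calculation. This is an idealized sketch that assumes perfect doubling in every cycle; real reactions are less efficient and eventually plateau. The function name `pcr_copies` is hypothetical.

```python
def pcr_copies(initial_copies: int, cycles: int) -> int:
    """Idealized number of target copies after a given number of PCR cycles.

    Assumes the target doubles exactly once per cycle, which is an upper
    bound rather than a prediction for a real reaction.
    """
    if initial_copies < 0 or cycles < 0:
        raise ValueError("initial_copies and cycles must be non-negative")
    return initial_copies * 2 ** cycles


# A single template molecule after 30 ideal cycles.
print(pcr_copies(1, 30))  # 1073741824
```

Non-target DNA is not primed at both ends, so it does not enter this exponential growth, which is why the target comes to dominate the sample.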
The fraction of individuals of a given genotype that show a particular phenotype, usually expressed as a percentage.
See also Expressivity.
The PGDB registry is an Internet-based mechanism for sharing PGDBs among Pathway Tools users.
- A bacteriophage, a virus capable of infecting bacteria.
- A type of cloning vector derived from a bacteriophage, usually capable of carrying an amount of foreign DNA that is at the upper range of that carried by a plasmid.
A type of cloning vector derived from a phage and a plasmid. Phagemids are capable of carrying an amount of foreign DNA comparable to a plasmid, but have some special feature such as the ability to produce single-stranded DNA.
The condition in which an individual displays a phenotype resembling that produced by a particular mutation, but brought about by some experimental treatment other than the presence of that mutation, e.g., drug treatment.
A description of the observable state of an individual with respect to some inherited characteristic. Often, individuals with different genotypes display the same phenotype. See dominant and recessive.
The detection of radioactivity using "phosphor" compounds that emit visible light when exposed to radiation. Phosphorimaging instruments produce images of, for example, Southern blots and Northern blots, that are comparable to those produced by autoradiography, with superior quantitation.
A map of DNA showing distances between and within genes or specified markers measured in base pairs of DNA. It is based on the direct measurement of DNA.
A type of cloning vector derived from autonomously-replicating extrachromosomal circular DNAs in bacteria. The amount of foreign DNA that can be carried in a plasmid is small, ranging up to about 20 kb.
A type of mutation in which a single nucleotide is changed to one of the other three possible nucleotides.
The process by which a series of adenosine (A) ribonucleotides is added to the 3' end of a spliced RNA to make a mature mRNA. This addition to the RNA is sometimes referred to as a poly-A tail, and commonly contains several hundred bases.
An instance of genotypic variation within a population.
Metabolism is the term used to describe all of the chemical reactions and interactions
that take place in a biological system. Primary metabolism encompasses reactions involving
those compounds which are formed as a part of the normal anabolic and catabolic processes
which result in assimilation, respiration, transport, and differentiation. These processes
take place in most, if not all, cells of the organism. Common examples of primary compounds
are sugars, amino acids, nucleotides etc. Primary metabolism is simply defined as the metabolism of primary compounds.
A single-stranded nucleic acid that can "prime" replication of a template. More specifically, a single-stranded nucleic acid capable of hybridizing to a template single-stranded nucleic acid in such a way as to leave part of the template to the 3' end of the primer single-stranded. DNA polymerase can then synthesize a new strand starting from the 3' end of the primer and adding nucleotides to the growing strand by base complementarity to the template.
See also PCR.
In molecular biology, a nucleic acid that has been labeled either radioactively or chemically that allows the detection of nucleic acids with sequence similarity in a sample by hybridization. Probes are used to detect DNA on membranes in Southern blots, to detect RNA on membranes in Northern blots, and either DNA or RNA in cytological preparations for in situ hybridization.
Cell or organism lacking a membrane-bound, structurally discrete nucleus and other subcellular compartments. Bacteria are prokaryotes.
See also eukaryote.
A region of a protein responsible for a particular function, as recognized experimentally and by the occurrence of similar segments in other proteins sharing that function, e.g., a DNA binding domain.
A gene whose product is a protein.
- A method of detecting a particular enzyme in a cell or tissue sample. A sample of cells or tissue is fixed, then treated with a chromogenic substrate for the enzyme to be detected. Microscopic examination reveals the presence of staining, and hence of the specific protein to be detected.
The complete collection of all proteins encoded by the genome of an organism.
Systematic analysis of protein expression of normal and diseased tissues that involves the separation, identification and characterization of all of the proteins in an organism.
The small region of homology shared between the X chromosome and the Y chromosome in mammals. All crossovers between the X and Y chromosomes occur in this region.
Quantitative Trait Locus (QTL)
- A heritable genetic region that affects a measurable characteristic of the animal (e.g., body weight or blood pressure).
- The type of marker described by statistical association to quantitative variation in a particular phenotypic trait that is thought to be controlled by the cumulative action of alleles at multiple loci.
A request for information submitted to a computerized database.
A Query Form is a web page allowing users to retrieve information from a database.
A DNA or protein sequence submitted to a computerized database for comparison, e.g., a BLAST search.
- Electromagnetic energy: gamma rays, X rays, ultraviolet light, visible light, infrared light, microwaves and radio waves. In mouse genetics, this term generally refers to gamma rays and X rays.
- Subatomic particles emitted by the decay of unstable isotopes: electrons (beta particles) and helium nuclei (alpha particles). Common unstable isotopes in molecular biology are tritium (3H), which emits low-energy beta particles, 35S, which emits beta particles of moderate energy, and 32P, which emits high-energy beta particles.
- Subatomic particles from a particle accelerator, such as protons, neutrons, and electrons.
Radiation Hybrid Mapping
A type of genetic mapping providing resolution between relatively low-resolution linkage analysis and high-resolution physical mapping by the assembly of contiguous cloned DNA segments. The method consists of fusing irradiated cultured cells of one species with cultured cells of a different species. A panel of hybrid cells is then tested for the occurrence of pairs of markers. The closer two markers are to each other, the more likely that both are present in an individual hybrid cell.
Radiation Induced Mutation
A mutation induced by irradiation, in mouse usually gamma-ray or X-ray.
Given a starting set of metabolites (called the nutrients), the Pathway Tools Reachability Analysis tool determines which reactions can fire, and which other metabolites are produced as a result of this qualitative simulation, in an automated and iterative manner.
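The iterative idea behind this kind of qualitative simulation can be sketched as a simple forward propagation. This is only an illustration of the general technique under stated assumptions, not the Pathway Tools implementation: a reaction is assumed to fire once all of its reactants are available, and the function name, data layout, and toy network are hypothetical.

```python
def reachable_metabolites(nutrients, reactions):
    """Forward-propagate from a nutrient set through a reaction network.

    reactions maps a reaction name to a (reactants, products) pair of sets.
    A reaction fires once all of its reactants are available; its products
    then become available to other reactions.
    """
    available = set(nutrients)
    fired = set()
    changed = True
    while changed:  # repeat until a full pass fires no new reaction
        changed = False
        for name, (reactants, products) in reactions.items():
            if name not in fired and reactants <= available:
                fired.add(name)
                available |= products
                changed = True
    return available, fired


# Toy network: r3 can never fire because metabolite "X" is not supplied.
toy_reactions = {
    "r1": ({"glucose", "ATP"}, {"G6P", "ADP"}),
    "r2": ({"G6P"}, {"F6P"}),
    "r3": ({"X"}, {"Y"}),
}
metabolites, fired_reactions = reachable_metabolites({"glucose", "ATP"}, toy_reactions)
print(sorted(metabolites), sorted(fired_reactions))
```

Metabolites that never appear in the returned set point to gaps in the network or to missing nutrients.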
One of three ways of reading a single strand of nucleic acid sequence as codons.
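The three ways of reading one strand can be illustrated by splitting a sequence into codons from each of the three possible starting offsets. This is a minimal sketch; the function name `reading_frames` and the example sequence are hypothetical, and incomplete trailing codons are simply dropped.

```python
def reading_frames(sequence: str) -> list[list[str]]:
    """Return the codons of a single strand in each of its three reading frames."""
    frames = []
    for offset in range(3):
        # Step through the sequence three bases at a time, keeping only
        # complete codons.
        codons = [sequence[i:i + 3] for i in range(offset, len(sequence) - 2, 3)]
        frames.append(codons)
    return frames


for frame_number, codons in enumerate(reading_frames("ATGGCCTAA"), start=1):
    print(frame_number, codons)
# 1 ['ATG', 'GCC', 'TAA']
# 2 ['TGG', 'CCT']
# 3 ['GGC', 'CTA']
```

A double-stranded DNA molecule has six reading frames in total, because the complementary strand can be read in three frames as well.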
One of a series of terms applied to the phenotypic effect of a particular allele in reference to another allele (usually the standard wild-type allele) with respect to a given trait. An allele "a" is said to be recessive with respect to the allele "A" if the A/A homozygote and the A/a heterozygote are phenotypically identical and different from the a/a homozygote. An example is the nonagouti (a) allele of the mouse. A(+)/A(+) and A(+)/a mice have identical agouti banding of individual hairs in the coat, while a/a mice have hairs of uniform color.
Transfer of information from one DNA molecule to another. Recombination may be reciprocal, in which case the products are equivalent to breakage of the two DNA molecules and rejoining of the broken ends to form new molecules. Recombination may also be nonreciprocal, in which case the product is equivalent to transfer of information from the donor DNA molecule to the recipient DNA molecule, with no change in the donor DNA molecule. Reciprocal recombination events are also called crossovers.
In computer science, a collection of fields.
Redox Half reactions
Redox half reactions are elementary reactions in which explicitly stated electrons are reducing an oxidized molecular species. These reactions do not stand alone, because electrons do not occur freely. Instead, a half reaction must be paired with another half reaction to form a complete, overall transformation.
A DNA sequence that is required for a gene on the same DNA molecule to be transcribed, or to be transcribed in the proper cell type(s) and developmental stage(s).
The Regulatory Overview is a whole-organism depiction of the transcriptional regulatory network.
A type of database in which information is organized into tables.
Relative Data Values, Omics Viewer
The numerical values in the data file include positive and negative values.
The process of synthesizing a copy of a DNA molecule from nucleotides using information contained within one strand of a template DNA molecule. The new strand of DNA is synthesized from the 5' end to the 3' end. See the Figure at NHGRI.
A gene whose product is easily detected and not ordinarily present in an organism or cell type under study that is expressed as part of a DNA construct introduced experimentally. Bacterial beta-galactosidase, whose activity can be detected using a staining reaction, is a commonly used reporter gene, as is green fluorescent protein.
A protein that recognizes specific, short nucleotide sequences and cuts DNA at those sites.
An enzyme that is able to synthesize DNA from information in RNA. It requires an RNA template and a DNA or RNA primer.
See also cDNA.
A mutation event that alters an allele conferring a mutant phenotype into one conferring a wild-type phenotype. The mutation need not restore the gene to its original nucleotide sequence to be considered a reversion event.
An RNA molecule with catalytic activity.
Ribonucleic acid. A nucleic acid that is the primary product of gene expression. Chemically, it differs from DNA by the substitution of ribose for deoxyribose in the sugar-phosphate backbone and by the substitution of the base uracil for thymine. See the Figure at NHGRI.
The alteration of the sequence of an RNA molecule by enzymatic modification of individual bases without normal splicing.
A gene whose product is an RNA.
A method of detecting the presence of a specific RNA in a sample. A radioactively-labeled RNA probe is prepared by transcribing the antisense strand of a DNA construct. The labeled probe is hybridized to the sample. The sample is then treated with RNAse, which is specific to single-stranded RNA. The sample is then subjected to electrophoresis and autoradiography. The presence of full-length probe that has not been cleaved by RNAse indicates the presence of the sense strand, and hence gene expression, in the sample.
A particular type of translocation in which the breakpoints in the two chromosomes occur at or near the centromere, followed by centric fusion such that the long arms now form a metacentric chromosome with a single centromere. Any small fragments generated in the exchange are usually lost.
See also Translocation.
Ribosomal RNA. The RNA molecules that are a structural and catalytic component of the ribosome.
Reverse-Transcription PCR. A method of amplifying mRNA by first synthesizing cDNA with reverse transcriptase, then amplifying the cDNA using PCR. A positive result is evidence of a particular mRNA, and hence of gene expression, in a sample.
SAM Output File
The Omics Viewer can import gene expression data from a spreadsheet generated by the SAM (Significance Analysis of Microarrays) Microsoft Excel plug-in. This package combines multiple expression experiments to produce a list of statistically significant positively and negatively regulated genes. The Omics Viewer displays the positively regulated genes in one color, and the negatively regulated genes in another color.
SAQP: Structured Advanced Query Page
The SAQP is a graphical user interface to formulate a query to a PGDB without knowing the underlying query language (BioVelo).
- An underlying organizational pattern or structure; conceptual framework.
- A collection of items that model part or all of a real world object, particularly in the context of a database, i.e., a database schema.
- The structure of a database system, described in a formal language supported by the database management system (DBMS). In a relational database, the schema defines the tables, the fields in each table, and the relationships between fields and tables. Schemas are generally stored in a data dictionary. Although a schema is defined in text database language, the term is often used to refer to a graphical depiction of the database structure.
- In computer science, a description of the logical organization, structure, and content of a database.
One of a series of terms applied to the phenotypic effect of a particular allele in reference to another allele (usually the standard wild-type allele) with respect to a given trait. An allele "A" is said to be semidominant with respect to the allele "a" if the A/A homozygote has a mutant phenotype, the A/a heterozygote has a less severe phenotype, while the a/a homozygote is wild-type. An example is Pmp22(Tr-J) in mouse. Pmp22(Tr-J)/Pmp22(Tr-J) animals display a myelination defect associated with a "trembler" phenotype, while Pmp22(Tr-J)/Pmp22(+) animals are less severely affected, and Pmp22(+)/Pmp22(+) animals are wild-type.
- n. Additional information added to genomic sequence to identify genes, delimit the intron and exon structures of those genes, identify regulatory elements, note the positions of allelic variation, etc.
- v. The analysis process used to create sequence annotations. The process relies heavily on the homology principle, whereby similarity to known genes is used to help identify new genes and propose functions for them.
Sequence ID (SeqID)
Sequence accession identifier. A unique alphanumeric character string that unambiguously identifies a sequence record in a database. Examples of genomic sequence providers are NCBI and Ensembl; examples of sequence IDs from these providers are 16590 and ENSMUSG00000053869, respectively.
The sequencing of a large DNA segment through the sequencing of randomly-derived subsegments whose order and orientation within the large segment is unknown until the assembly of overlapping sequences. The method works if all positions in the large segment are covered by multiple overlapping subsegments.
See also Whole-genome shotgun sequencing.
This term refers to terms in a hierarchical controlled vocabulary such as those containing Gene Ontology (GO) terms.
A "sibling" of a term is a term at the same level of the hierarchy sharing at least one ancestor. For example, the GO term
alcohol dehydrogenase [GO:0004022] is a sibling to the GO term aldehyde oxidase [GO:0004031]; they share the ancestor term
A protein component of RNA polymerase that determines the specific site on DNA where transcription begins.
Signal transduction pathway
Pathways that describe the chain of events, such as protein phosphorylation, that occurs during the propagation of a signal in a cell. These pathways start with the binding of ligand by a trans-membrane receptor, and proceed through a series of intermediate molecules until final regulatory molecules, such as transcription factors, are modified in response.
In comparison of nucleic acid sequences, the extent to which two nucleic acid sequences have identical bases at equivalent positions, usually expressed as a percentage. In comparison of protein sequences, the extent to which the amino acid sequences of two proteins have identical or functionally similar amino acids at equivalent positions, usually expressed as a percentage.
See also Identity.
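For nucleic acid sequences, the percentage of identical bases at equivalent positions can be computed directly once the sequences are aligned. The sketch below assumes two pre-aligned sequences of equal length and counts gap columns (written as "-") as mismatches; conventions for handling gaps differ between tools, so that choice is an assumption, as is the function name `percent_identity`.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of alignment columns in which two aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # A column counts as a match only if both symbols agree and are not gaps.
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


print(percent_identity("ACGTACGT", "ACGTTCGT"))  # 87.5
```

Scoring protein similarity additionally requires a substitution matrix to decide which amino acids count as functionally similar, which this sketch does not attempt.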
An instance of this Pathway Tools ontology class should be a single reaction involving only a small number of defined participants.
Simple Sequence Repeat (SSR)
A sequence consisting largely of a tandem repeat of a specific k-mer (such as (CA)15). Many SSRs are polymorphic and have been widely used in genetic mapping.
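A perfect tandem repeat such as (CA)15 can be located with a regular expression. This is a minimal sketch that finds only uninterrupted repeats of one given unit; the function name `find_ssrs`, its parameters, and the example sequence are hypothetical, and real SSR finders also tolerate imperfect repeats.

```python
import re


def find_ssrs(sequence: str, unit: str, min_repeats: int) -> list[tuple[int, int]]:
    """Find perfect tandem repeats of a unit in a sequence.

    Returns (start_position, repeat_count) for every run in which the unit
    is repeated at least min_repeats times in a row.
    """
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_repeats},}}")
    return [
        (match.start(), len(match.group()) // len(unit))
        for match in pattern.finditer(sequence)
    ]


example = "GGT" + "CA" * 15 + "TTG"
print(find_ssrs(example, "CA", 5))  # [(3, 15)]
```

Comparing the repeat count at the same locus across individuals is what reveals the length polymorphism used in genetic mapping.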
A language for writing chemical structures in terms of character strings. The chemical substructure searcher within Pathway Tools accepts substructures in the SMILES language.
Cells in an animal other than those that constitute the germ line.
Somatic Cell Hybrid
A type of mapping experiment permitting the assignment of markers to chromosomes. The method consists of fusing cultured cells of one species with cultured cells of a different species. The hybrid cells are unstable in karyotype during growth, with most chromosomes from one species typically being lost. Among clonal populations of hybrid cells following growth, different chromosomes are retained from one species. A panel of hybrid cell cultures can be assayed for which mouse chromosomes (for example) are retained, and simultaneously assayed for the presence of particular markers. The correlation of the presence of a particular marker across the panel with the presence of a particular mouse chromosome allows that marker to be assigned to that chromosome.
See also Radiation Hybrid Mapping.
An assay that detects specific DNA molecules using a DNA or RNA probe with sequence similarity. Samples are subjected to electrophoresis on a slab gel. A replica of the gel is then made on a membrane by capillary transfer following denaturation. Specific DNA sequences are then detected on the membrane with a radioactively- or chemically-labeled probe. See the Figure from Alberts, et al., Molecular Biology of the Cell.
As a type of mutation, one that has occurred in the absence of any experimental mutagenic treatment, such as irradiation or treatment with chemical mutagens.
Structured Query Language. SQL is used to communicate with a database. According to ANSI (American National Standards Institute), it is the standard language for relational database management systems.
SQL statements perform tasks such as updating data in or retrieving data from a
database. Some common relational database management systems that use SQL are:
Oracle, Sybase, Microsoft SQL Server, Access, Ingres, etc. Although most
database systems use SQL, most of them also have their own additional
proprietary extensions that are usually only used on their system.
Simple Sequence Length Polymorphism, a type of polymorphism that results from variation in the length of an SSR.
Simple Sequence Repeat, a DNA sequence consisting largely of a tandem repeat of a specific k-mer (such as (CA)15). Many SSRs are polymorphic and have been widely used in genetic mapping.
Strain is a low-level taxonomic rank used in three related ways. In microbiology, a strain is a genetic variant or subtype of a microorganism
(e.g., a virus, bacterium, or fungus). In plants, a strain is a designated group of offspring descended from a modified plant that was
produced by conventional breeding or by biotechnological means, or that arose through genetic mutation. In rodents, a strain is a group of
animals that is genetically uniform.
A protein that functions as a structural element of cells rather than as an enzyme, for example, collagen.
Structured data are data that have been represented in a manner that allows computation
with those data. Data become structured when they are carefully dissected and
assigned to distinct fields of a database with clearly defined meanings,
so that the data are independently queryable and computable. Therefore, we can
ask questions across the data such as "find all enzymes that use magnesium as a
cofactor" or "find all pathways in which pyruvate is an input substrate". See also Unstructured Data.
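To make the idea concrete, here is a minimal sketch in which enzyme and cofactor are stored in distinct fields, so that the magnesium question can be answered with a query. It uses Python's built-in `sqlite3` module; the table name `enzyme_cofactor` and its three rows are assumptions chosen for illustration and are not taken from any BioCyc database.

```python
import sqlite3

# In-memory database with one illustrative table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enzyme_cofactor (enzyme TEXT, cofactor TEXT)")
conn.executemany(
    "INSERT INTO enzyme_cofactor VALUES (?, ?)",
    [
        ("enolase", "magnesium"),
        ("hexokinase", "magnesium"),
        ("carbonic anhydrase", "zinc"),
    ],
)

# "Find all enzymes that use magnesium as a cofactor."
rows = conn.execute(
    "SELECT enzyme FROM enzyme_cofactor WHERE cofactor = ? ORDER BY enzyme",
    ("magnesium",),
).fetchall()
magnesium_enzymes = [row[0] for row in rows]
print(magnesium_enzymes)  # ['enolase', 'hexokinase']
```

The same facts buried in a free-text comment could not be retrieved this way, which is the contrast the definition draws.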
Structured Query Language
Structured Query Language (SQL) is used to communicate with a database. According to ANSI (American National Standards Institute), it is the standard language for relational database management systems. SQL statements perform tasks such as updating data in or retrieving data from a database. Some common relational database management systems that use SQL are: Oracle, Sybase, Microsoft SQL Server, Access, Ingres, etc. Although most database systems use SQL, most of them also have their own additional proprietary extensions that are usually only used on their system. The Query Forms at MGI extract information from databases by generating instructions in SQL.
Sequence Tagged Site. A short segment of unique sequence derived from genomic DNA. A large collection of STSs can be used to assemble a physical map of the genome from a collection of genomic clones (e.g., BACs or YACs) by testing each clone for the presence of each STS. Two clones that contain one or more STSs in common must overlap. For examples, see the physical maps of the mouse genome at MGI.
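The overlap rule ("two clones that contain one or more STSs in common must overlap") can be sketched as a set intersection over every pair of clones. The clone and STS names and the function `overlapping_clones` are hypothetical.

```python
from itertools import combinations


def overlapping_clones(sts_content):
    """Map each pair of clones to the STSs they share.

    sts_content: clone name -> set of STS names that tested positive.
    Pairs with no shared STS are omitted.
    """
    overlaps = {}
    for a, b in combinations(sorted(sts_content), 2):
        shared = sts_content[a] & sts_content[b]
        if shared:
            overlaps[(a, b)] = shared
    return overlaps


# Hypothetical test results for three clones.
content = {
    "BAC1": {"STS1", "STS2"},
    "BAC2": {"STS2", "STS3"},
    "BAC3": {"STS4"},
}
print(overlapping_clones(content))  # {('BAC1', 'BAC2'): {'STS2'}}
```

Chaining such pairwise overlaps is what lets a set of clones be ordered into a physical map; that ordering step is not shown here.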
A molecule acted upon by an enzyme.
Superpathways are a class of PGDB metabolic pathways that are constructed by combining and connecting individual pathways (which can be shown separately) to depict relationships between them. In some cases those individual pathways start from a common precursor, or produce a common product, but they can have other relationships as well. Superpathways can have individual reactions as their components in addition to other pathways. Superpathways can be defined recursively, that is, a component pathway of a superpathway can be a base pathway or can itself be a superpathway. Most superpathways will have an additional parent class within the pathway ontology to define their biological role.
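The recursive definition can be illustrated with a small sketch that expands a superpathway into its base pathways. This is not the Pathway Tools API: the dictionary layout, the pathway names, and the function `base_pathways` are all hypothetical, and individual reaction components are left out for simplicity.

```python
def base_pathways(name, components):
    """Recursively expand a pathway into the base pathways it contains.

    components: superpathway name -> list of component pathway names.
    Any name that is not a key of `components` is treated as a base pathway.
    """
    if name not in components:
        return [name]
    result = []
    for part in components[name]:
        for base in base_pathways(part, components):
            if base not in result:  # keep the first occurrence only
                result.append(base)
    return result


# Hypothetical hierarchy: super-A contains super-B and pathway-3.
hierarchy = {
    "super-A": ["super-B", "pathway-3"],
    "super-B": ["pathway-1", "pathway-2"],
}
print(base_pathways("super-A", hierarchy))
# ['pathway-1', 'pathway-2', 'pathway-3']
```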
A curated protein sequence database. See the SWISS-PROT site for more details.
A synonym is one of several names that are used, or have been used, in the scientific
literature or in public databases to refer to one object. For example,
2-phosphoglyceric acid, 2-PGA, and glycerate 2-phosphate are all synonyms of the
compound 2-phosphoglycerate. In BioCyc, one of the synonyms is designated as the
The state of being on the same chromosome. A gene is also said to be syntenic to a particular chromosome if it is known to be located on that chromosome but is otherwise unmapped.
See also Conserved Synteny.
The data dictionary of a DBMS. The system catalog stores metadata including the schemas of the databases. It is a mini-database, and is usually stored using the DBMS itself in special tables called system tables. It may be referred to as being "on line", as it is active, and users can query it like any other table.
A text file that uses tabs to separate adjacent fields. It is a common format for downloading information into a spreadsheet.
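A short sketch of reading such a file with Python's standard `csv` module. The column names and the two rows are made up for the example, and the text is read from a string rather than from a file on disk.

```python
import csv
import io

# Two-column tab-delimited text with a header line (illustrative content).
text = "symbol\tchromosome\nKit\t5\nTyr\t7\n"

# delimiter="\t" tells the reader to split fields on tabs.
rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
print(rows[0]["symbol"], rows[0]["chromosome"])  # Kit 5
print(len(rows))  # 2
```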
Refers to data arranged in rows and columns. A spreadsheet, for example, is a table. In relational databases, all information is stored in the form of tables.
A type of mutation in which a chromosomal gene is altered by the substitution of a DNA construct assembled in vitro.
The constructs are usually designed to eliminate gene function; such targeted mutations are often casually referred to as knock outs. Some DNA constructs are designed to alter gene function; such targeted mutations are often casually referred to as knock ins.
A specialized structure at the ends of linear chromosomes in eukaryotes. Telomeres confer stability on chromosome ends. Chromosome ends lacking telomeres, such as those generated from interstitial sites by chromosome breaks, are reactive, often fusing with other broken ends to generate chromosome rearrangements. Telomeres also permit the ends of linear chromosomes to replicate fully. See the Figure at NHGRI.
A DNA sequence that signals the end of transcription.
A type of cross in which individuals whose genotype with respect to one or more genes is unknown are crossed to a test strain homozygous for a recessive allele at the genes under study. For example, a cross of an individual that is either A/A or A/a (identical in phenotype) to a/a would reveal the genotype of the individual being tested: if it were A/A, all of the progeny would show the dominant phenotype, while if it were A/a, half of the progeny would show the dominant phenotype and half would show the recessive phenotype.
Used to describe an enzyme or other protein that is not denatured at temperatures that denature most other proteins.
A particular aspect of the phenotype that can be measured or observed directly, e.g., blood pressure or body weight.
An RNA molecule (or species of RNA molecule) that is the product of transcription.
The location at the 5′ end of a gene, adjacent to the promoter, at which the RNA polymerase complex binds to the DNA and initiates the process of transcription of that gene into mRNA. The precise context of the TSS depends on the gene, its host organism, the type of polymerase involved, and other factors.
A transcription unit is a sequence of nucleotides in DNA that codes for a single RNA molecule, along with the sequences necessary for its transcription; normally a transcription unit contains a promoter, an RNA-coding sequence, and a terminator. Transcription units are similar to operons; however, an operon containing multiple promoters and/or terminators corresponds to multiple transcription units.
Any DNA sequence or combination of sequences that has been introduced via a construct into the germ line of the animal by random integration.
A mouse that contains a stably inherited DNA which has been inserted randomly into the genome.
The inserted gene sequence (the transgene) may or may not be derived from mouse sequence.
A type of point mutation in which a purine is substituted for another purine or a pyrimidine for another pyrimidine. These substitutions include A for G, G for A, C for T, or T for C.
See also Transversion.
A type of mutation in which two nonhomologous chromosomes are each broken and then repaired in such a way that:
- the resulting chromosomes each contain material from the other chromosome (a reciprocal translocation, see the Figure at NHGRI)
- one of the chromosomes contains an insertion of material from the other chromosome, with the other chromosome containing a deletion (an insertional translocation, see the Figure at NHGRI) or
- the two chromosomes, each with breaks near the centromere, fuse to form a single chromosome with a single centromere (a Robertsonian translocation).
This Pathway Tools ontology class defines reactions in which at least one species is transported (passively or actively) across a membrane. The species may or may not be chemically modified in the course of the reaction. A transport reaction is assumed to occur physiologically in the direction written; if it proceeds in the reverse direction, this fact should be indicated in the enzymatic-reaction for a given transporter.
A type of mobile genetic element consisting of DNA that moves to new genomic locations conservatively (without replicating itself) or replicatively (moving a copy of itself).
A type of point mutation in which a purine is substituted for a pyrimidine or a pyrimidine for a purine. These substitutions include C or T for A, C or T for G, A or G for C, and A or G for T.
See also Transition.
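The distinction between the two kinds of point mutation reduces to whether the original and substituted bases belong to the same chemical class. A minimal sketch, assuming DNA bases only (the function name `classify_substitution` is illustrative):

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("not a substitution: bases are identical")
    # Same class (purine to purine, or pyrimidine to pyrimidine) is a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # transition
print(classify_substitution("C", "T"))  # transition
print(classify_substitution("A", "C"))  # transversion
```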
A protein sequence database that contains all the translations of EMBL nucleotide sequences. See the TrEMBL site for more details.
The condition of having three chromosomes of a particular type. Down Syndrome in humans is a trisomy for chromosome 21.
See also Monosomy.
A try-set is a mechanism used within the Pathway Tools MetaFlux module to allow MetaFlux to explore potential modifications to an FBA model. A try-set defines a set of reactions or metabolites that can be added to a base model that is considered incomplete. Try-sets can be specified for reactions, nutrients, secretions, and biomass metabolites.
A type of database link that links an object in one database to an object
in another database that represents the same biological object.
An experimental system for automatically partitioning GenBank sequences into a non-redundant set of gene-oriented clusters. Each UniGene cluster contains sequences that represent a unique gene, as well as related information such as the tissue types in which the gene has been expressed and map location. See the UniGene page at NCBI.
The inheritance, in a diploid organism, of both copies of a single chromosome from one parent. This may result from the union of a gamete bearing two copies of one chromosome with a gamete bearing no copy of that chromosome, or from the union of a gamete bearing two copies of one chromosome with a normal gamete, followed by the loss of one chromosome through an error in mitosis. Because of imprinting, uniparental disomy can have phenotypic consequences in mammals. See, for example, Prader-Willi Syndrome.
Data that are not structured, such as the free-text comments within a database. Such
comments are not structured because the computer cannot
compute with the data. Computers cannot reliably interpret free text, so they cannot
extract individual data elements from large text blocks such that the meanings of those data elements
are reliably known. See also Structured Data.
Uniform Resource Locator. An Internet address giving the protocol to be used for obtaining resources on the Internet such as "ftp:" for an FTP site or "http:" for a World Wide Web page. It also includes the server name and sometimes the path to the resource. The URL for BioCyc is "http://www.biocyc.org".
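The parts of a URL named above (protocol, server name, path) can be pulled apart with Python's standard `urllib.parse` module. The BioCyc address is the one given in this entry; the FTP address is a made-up example on a reserved domain.

```python
from urllib.parse import urlparse

web = urlparse("http://www.biocyc.org")
print(web.scheme)  # http
print(web.netloc)  # www.biocyc.org

# Hypothetical FTP address, used only to show a URL with a path.
ftp = urlparse("ftp://ftp.example.org/pub/data.txt")
print(ftp.scheme)  # ftp
print(ftp.netloc)  # ftp.example.org
print(ftp.path)    # /pub/data.txt
```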
Vertebrate Genome Annotation. The VEGA database is a central repository for high quality, frequently updated, manual annotation of vertebrate finished genome sequence. VEGA developed within Ensembl as a joint project between EMBL-EBI and the Sanger Institute.
A noncellular biological entity that requires a host cell for reproduction. Viruses consist of a nucleic acid genome that is either DNA or RNA (retroviruses, for example, have RNA genomes). The viral genome is covered with a protein coat; some viruses have a host-derived membrane over the protein coat.
A suite of programs and databases for comparative analysis of genomic sequences. Users can either submit sequences and alignments for analysis or examine precomputed whole-genome alignments of different species. See http://genome.lbl.gov/vista/index.shtml.
An assay that detects specific proteins within a protein mixture. Samples are subjected to electrophoresis on a slab gel. A replica of the gel is then made on a membrane by electrophoretic transfer. Specific proteins are then detected on the membrane using antibody staining.
Whole-genome Shotgun Sequencing
The sequencing of the entire genome of an organism through the sequencing of randomly-derived subsegments whose order and orientation are unknown until the assembly of overlapping sequences is performed computationally. The method works if all positions in the genome are covered by multiple overlapping subsegments.
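The coverage condition in the last sentence can be checked with a simple depth count. This sketch assumes the subsegments have already been placed on the genome as half-open (start, end) intervals, which in practice is only known after assembly; the function names and the `min_depth` threshold of 2 are assumptions for illustration.

```python
def coverage_depth(genome_length, reads):
    """Count how many reads cover each genome position.

    reads: list of (start, end) half-open intervals, 0-based.
    """
    depth = [0] * genome_length
    for start, end in reads:
        for pos in range(max(start, 0), min(end, genome_length)):
            depth[pos] += 1
    return depth


def fully_covered(genome_length, reads, min_depth=2):
    """True if every position is covered by at least `min_depth` reads."""
    return all(d >= min_depth for d in coverage_depth(genome_length, reads))


# Toy genome of 10 positions.
print(fully_covered(10, [(0, 6), (0, 10), (4, 10)]))  # True
print(fully_covered(10, [(0, 6), (4, 10)]))           # False
```

In the second call only positions 4 and 5 are covered twice, so the toy genome does not meet the multiple-coverage condition.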
- The phenotype with respect to a given inherited characteristic that is considered to be the "normal" type commonly found in natural populations.
- The allele of a particular gene that confers the phenotype considered to be the "normal" type commonly found in natural populations. N.B.: Because some DNA sequence polymorphisms do not produce different phenotypes, there can be multiple "wild-type" alleles of a gene.
Wild Type Allele
One of many possible versions of a gene that functions normally, as opposed to versions of a gene that are functionally abnormal (i.e., mutant alleles).
With respect to gene nomenclature, a withdrawn symbol or name was once the approved symbol or name for a marker; there is currently a different approved symbol or name for that marker.
One of a pair of chromosomes that is sexually dimorphic in mammals. Normal female mammals have two X chromosomes, while normal male mammals have an X chromosome and a Y chromosome.
The condensation of all but one of the X chromosomes of a mammal into a heterochromatic state, eliminating gene expression from all but the active X chromosome. This process ensures that male and female mammals have the same level of gene activity of X-chromosome genes.
Yeast Artificial Chromosome.
One of a pair of chromosomes that is sexually dimorphic in mammals. Normal female mammals have two X chromosomes, while normal male mammals have an X chromosome and a Y chromosome.