The Human Genome, DNA, and RNA (Robbins)

The sequencing of the human genome in the early 21st century was a landmark achievement in biomedical science. Since then, the rapidly falling cost of sequencing and the computational power to analyze large amounts of data promise to revolutionize our understanding of health and disease. At the same time, the emerging information has revealed a level of complexity far beyond the linear sequence of the genome. The potential of these powerful new tools to expand our understanding of pathogenesis and drive therapeutic innovation excites and inspires scientists and the general public alike.

Noncoding DNA

The human genome contains about 3.2 billion base pairs of DNA. However, within the genome there are only about 20,000 protein-coding genes, comprising only 1.5% of the genome. The proteins encoded by these genes are the fundamental constituents of cells, functioning as enzymes, structural elements, and signaling molecules. Although 20,000 underestimates the actual number of encoded proteins (many genes produce multiple RNA transcripts encoding different protein isoforms), it is nevertheless surprising that worms composed of fewer than 1,000 cells, and with genomes 30 times smaller, are also assembled from approximately 20,000 protein-coding genes. Perhaps even more unsettling is that many of these proteins are recognizable homologues of molecules expressed in humans. So what separates humans from worms?

The answer is not fully known, but the evidence supports the claim that the difference lies in the 98.5% of the human genome that does not encode proteins. The function of such long stretches of DNA (which has been called the “dark matter” of the genome) was mysterious for many years. However, it is now clear that more than 85% of the human genome is ultimately transcribed, and almost 80% is dedicated to the regulation of gene expression. It follows that while proteins provide the building blocks and machinery needed to assemble cells, tissues, and organisms, it is the noncoding regions of the genome that provide the critical “architectural planning.”

Fig. 1.1 The organization of nuclear DNA.

The major classes of functional non–protein-coding DNA sequences found in the human genome include (Fig. 1.1):

  • Promoter and enhancer regions that bind protein transcription factors
  • Binding sites for proteins that organize and maintain higher order chromatin structures
  • Noncoding regulatory RNAs. Of the 80% of the genome dedicated to regulatory functions, the vast majority is transcribed into RNAs (microRNAs and long noncoding RNAs, described later) that are never translated into protein but can regulate gene expression.
  • Mobile genetic elements (e.g., transposons). Remarkably, more than one-third of the human genome is composed of such “jumping genes.” These segments can move around the genome and are implicated in gene regulation and chromatin organization.
  • Special structural regions of DNA, including telomeres (chromosome ends) and centromeres (chromosome “tethers”).

Variation in gene regulation may prove more important in causing disease than structural changes in specific proteins. Another surprise that emerged from genome sequencing is that any two humans are typically more than 99.5% identical in DNA sequence (and are 99% identical in sequence to chimpanzees).
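
The percentages quoted so far are easier to appreciate as absolute numbers. The following is a minimal sketch that uses only the round figures given in the text (3.2 billion base pairs, 1.5% coding, about 20,000 protein-coding genes, 99.5% identity between two people). The function name and dictionary keys are illustrative, and the per-gene figure is a crude average that ignores introns and isoforms.

```python
# Back-of-the-envelope arithmetic using only round figures quoted in the text.
GENOME_BP = 3_200_000_000        # about 3.2 billion base pairs
CODING_PER_MILLE = 15            # about 1.5% of the genome encodes proteins
PROTEIN_CODING_GENES = 20_000    # approximate number of protein-coding genes
DIFFERENCE_PER_MILLE = 5         # two humans differ in at most ~0.5% of their DNA


def summarize_genome():
    """Convert the quoted percentages into base-pair counts (integer math)."""
    coding_bp = GENOME_BP * CODING_PER_MILLE // 1000
    noncoding_bp = GENOME_BP - coding_bp
    # Crude average of coding sequence per gene; ignores introns and isoforms.
    coding_bp_per_gene = coding_bp // PROTEIN_CODING_GENES
    # Upper bound on differing positions implied by >99.5% identity.
    max_differing_bp = GENOME_BP * DIFFERENCE_PER_MILLE // 1000
    return {
        "coding_bp": coding_bp,
        "noncoding_bp": noncoding_bp,
        "coding_bp_per_gene": coding_bp_per_gene,
        "max_differing_bp": max_differing_bp,
    }


if __name__ == "__main__":
    for name, value in summarize_genome().items():
        print(f"{name}: {value:,}")
```

Under these assumptions, roughly 48 million base pairs encode protein, and two individuals differ at no more than about 16 million positions, which is still a very large number of potential variants.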

The two most common forms of DNA variation in the human genome are single nucleotide polymorphisms (SNPs) and copy number variations (CNVs).

  • SNPs are variants at single nucleotide positions and are almost always biallelic (there are only two options at a given site within the population, such as A or T). More than 6 million human SNPs have been identified, and many show wide variation in frequency in different populations. The following characteristics are noteworthy:
      • SNPs are found throughout the genome, within exons, introns, intergenic regions, and coding regions.
      • About 1% of SNPs are found in coding regions, which is roughly what you would expect by chance, because coding regions comprise about 1.5% of the genome.
      • SNPs located in noncoding regions can appear in regulatory elements of the genome, thus altering gene expression; in such cases, the SNP may have a direct influence on disease susceptibility.
      • SNPs can also be “neutral” variants with no effect on gene function or carrier phenotype.
      • Even “neutral” SNPs can be useful markers if they turn out to be co-inherited with a disease-associated gene as a result of physical proximity. In other words, the SNP and the causative genetic factor are in linkage disequilibrium.
      • The effect of most SNPs on disease susceptibility is weak, and it remains to be seen whether the identification of such variants, alone or in combination, can be used to develop effective strategies for disease prediction or prevention.
  • CNVs are a form of genetic variation that consists of different numbers of large contiguous stretches of DNA; these can range from 1000 base pairs to millions of base pairs. In some cases, these loci are, like SNPs, biallelic and are simply duplicated or deleted in a subset of the population. In other cases, there are complex rearrangements of genomic material, with multiple alleles in the human population. CNVs are responsible for several million base pairs of sequence difference between any two individuals. About 50% of CNVs involve sequences that encode genes; thus, CNVs may underlie much of human phenotypic diversity.

It is important to note that DNA sequence alterations cannot by themselves explain the diversity of phenotypes in human populations; furthermore, classical genetic inheritance cannot explain the different phenotypes in monozygotic twins. The answers to these puzzles are likely to be found in epigenetics: heritable changes in gene expression that are not caused by alterations in DNA sequence (see below).
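
The linkage disequilibrium mentioned in the SNP list above can be put in numbers. The sketch below uses the standard population-genetics coefficients D and r² for two biallelic loci; these formulas are not given in the source text, and the haplotype labels ("AB", "Ab", "aB", "ab") and counts are hypothetical. Here "A/a" stands for the alleles at the marker SNP and "B/b" for the alleles at the disease-associated locus.

```python
def linkage_disequilibrium(haplotype_counts):
    """Return (D, r_squared) for two biallelic loci.

    haplotype_counts maps the four haplotypes "AB", "Ab", "aB", "ab"
    to how many times each was observed in a sample of chromosomes.
    """
    total = sum(haplotype_counts.values())
    if total == 0:
        raise ValueError("no haplotypes observed")

    p_ab_hap = haplotype_counts["AB"] / total                      # freq. of haplotype AB
    p_a = (haplotype_counts["AB"] + haplotype_counts["Ab"]) / total  # freq. of allele A
    p_b = (haplotype_counts["AB"] + haplotype_counts["aB"]) / total  # freq. of allele B

    # D is the excess of the AB haplotype over what independence predicts.
    d = p_ab_hap - p_a * p_b

    # r^2 normalizes D^2 by the allele-frequency variances at both loci.
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denominator if denominator > 0 else 0.0
    return d, r_squared


if __name__ == "__main__":
    # Marker and disease allele always travel together: complete LD.
    print(linkage_disequilibrium({"AB": 50, "Ab": 0, "aB": 0, "ab": 50}))
    # All four haplotypes equally common: the loci are independent.
    print(linkage_disequilibrium({"AB": 25, "Ab": 25, "aB": 25, "ab": 25}))
```

An r² near 1 means a “neutral” SNP is a reliable proxy for the causative variant, while an r² near 0 means it carries essentially no information about it.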

Chromatin organization

Histone Organization

Even though virtually all cells in the body have the same genetic composition, differentiated cells have distinct structures and functions that arise through lineage-specific programs of gene expression. Such cell type–specific differences in DNA transcription and translation are regulated by epigenetic modifications, a set of changes that profoundly influence gene expression, including:

  • Organization of chromatin (Fig. 1.2). Genomic DNA is packaged into nucleosomes, which are made up of 147 base pair DNA segments wrapped around a central core of proteins called histones. Nucleosomes resemble beads joined by short DNA linkers; the entire structure is generically called chromatin. Importantly, the winding and compaction of chromatin in any cell varies in different genomic regions. Thus, nuclear chromatin exists in two basic forms (visualizable by standard histology): (1) histochemically dense and transcriptionally inactive heterochromatin and (2) histochemically dispersed and transcriptionally active euchromatin. Because only euchromatin allows gene expression and therefore dictates cell identity and activity, there are a number of mechanisms that tightly regulate chromatin status (described below).
  • DNA methylation. High levels of DNA methylation in gene regulatory elements generally result in chromatin condensation and transcriptional silencing. Like histone modifications (see below), DNA methylation is tightly regulated by methyltransferases, demethylating enzymes, and methylated DNA-binding proteins.
  • Histone modifying factors. Nucleosomes are highly dynamic structures regulated by a series of nuclear proteins and post-translational modifications:
      • Chromatin remodeling complexes can reposition nucleosomes on DNA, exposing (or obscuring) gene regulatory elements such as promoters.
      • “Chromatin writer” complexes carry out more than 70 different covalent histone modifications, generically referred to as marks. These include methylation, acetylation, and phosphorylation of specific histone amino acid residues. Methylation of histone lysines and arginines is accomplished by specific writer enzymes; methylation of lysine residues can lead to transcriptional activation or repression, depending on which histone residue is marked. Acetylation of lysine residues (carried out by histone acetyltransferases) tends to open chromatin and increase transcription; histone deacetylases (HDACs) reverse this process, leading to chromatin condensation. Phosphorylation of serine residues can variably open or condense chromatin, increasing or decreasing transcription, respectively.
      • Histone marks are reversible through the activity of “chromatin erasers.” Other proteins function as “chromatin readers,” binding histones that carry particular marks and thereby regulating gene expression.

The mechanisms involved in the cell-specific epigenetic regulation of genomic organization and gene expression are undeniably complex. Despite the complexities, learning to manipulate these processes will likely bring important therapeutic benefits because many diseases are associated with inherited or acquired epigenetic abnormalities, and “epigenome” dysregulation plays a central role in the genesis of benign and malignant neoplasms (Chapter 6). Furthermore, unlike genetic changes, epigenetic alterations (e.g., histone acetylation and DNA methylation) are reversible and therefore amenable to intervention; in fact, HDAC inhibitors and DNA methylation inhibitors are already being used in the treatment of various forms of cancer.

Micro-RNA and Long Noncoding RNA

Another mechanism of gene regulation depends on the functions of noncoding RNAs. As the name implies, these are encoded by genes that are transcribed but not translated. Although there are many different families of noncoding RNAs, only two examples are discussed here: small RNA molecules called microRNAs, and long noncoding RNAs more than 200 nucleotides in length.

  • MicroRNAs (miRNAs) are relatively short RNAs (22 nucleotides on average) that function primarily to modulate the translation of target mRNAs into their corresponding proteins. Post-transcriptional silencing of gene expression by miRNA is a fundamental and evolutionarily conserved mechanism of gene regulation present in all eukaryotes (plants and animals). Even bacteria have a primitive version of the same general machinery that they use to protect themselves against foreign DNA (e.g., from phages and viruses).
  • The human genome contains almost 6,000 miRNA genes, only 3.5 times fewer than the number of protein-coding genes. In addition, individual miRNAs appear to regulate multiple protein-coding genes, allowing each miRNA to co-regulate entire programs of gene expression. Transcription of miRNA genes produces a primary transcript (pri-miRNA) that is processed into progressively smaller segments, including trimming by the enzyme Dicer. This generates mature single-stranded miRNAs of 21 to 30 nucleotides that associate with a multiprotein aggregate called the RNA-induced silencing complex (RISC; Fig. 1.3). Subsequent base pairing between the miRNA strand and its target mRNA directs the RISC to either cleave the mRNA or repress its translation. In this way, the target mRNA is post-transcriptionally silenced.
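
The base-pairing step described above can be illustrated with a toy search for complementary sites. This is a minimal sketch under stated assumptions: it considers only perfect Watson-Crick pairing, and it matches only nucleotides 2 to 8 of the miRNA (the so-called seed region, a convention that is not described in the source text). The sequences `TOY_MIRNA` and `TOY_MRNA` are invented for illustration and do not correspond to real transcripts.

```python
# Watson-Crick complement table for RNA (A pairs with U, G pairs with C).
COMPLEMENT = str.maketrans("AUGC", "UACG")

# Hypothetical sequences, written 5'->3'; not real transcripts.
TOY_MIRNA = "UACGGAUCCAGUUCAAGCUAGG"
TOY_MRNA = "AAGCGAUCCGUUUACC"


def reverse_complement(rna):
    """Return the strand that would base-pair with `rna`, written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def find_target_sites(mirna, mrna, seed_start=1, seed_end=8):
    """Return 0-based mRNA positions that pair perfectly with the miRNA seed.

    The default slice [1:8] selects nucleotides 2-8 of the miRNA; treating
    this region as sufficient for targeting is a simplifying assumption.
    """
    seed = mirna[seed_start:seed_end]
    site = reverse_complement(seed)
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)
    return positions


if __name__ == "__main__":
    print(find_target_sites(TOY_MIRNA, TOY_MRNA))
```

Because the matched stretch is short, a single miRNA can find sites in many different mRNAs, which is consistent with the observation that one miRNA can regulate multiple genes.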

Taking advantage of the same pathway, small interfering RNAs (siRNAs) are short RNA sequences that can be introduced into cells. These serve as substrates for Dicer and interact with the RISC in a manner analogous to endogenous miRNAs. Synthetic siRNAs that can target specific mRNA species are therefore powerful laboratory tools for studying gene function (so-called knockdown technology); they also show promise as therapeutic agents for silencing pathogenic genes, e.g., oncogenes involved in neoplastic transformation.

Long noncoding RNA (lncRNA). The human genome also contains a very large number of lncRNAs, at least 30,000, with the total number potentially exceeding coding mRNAs by 10- to 20-fold. LncRNAs modulate gene expression in many ways (Fig. 1.4); for example, they can bind to regions of chromatin, restricting RNA polymerase access to coding genes within the region. The best-known example of a repressive function involves XIST, which is transcribed from the X chromosome and plays an essential role in physiological X chromosome inactivation. XIST itself escapes X inactivation but forms a repressive “blanket” on the X chromosome from which it is transcribed, resulting in gene silencing. Conversely, it has been appreciated that many enhancers are sites of lncRNA synthesis, with the lncRNAs enhancing transcription from gene promoters.

Fig. 1.3 Generation of microRNAs (miRNA) and their mode of action in regulating gene function. miRNA genes are transcribed to produce a primary miRNA (pri-miRNA), which is processed within the nucleus to form pre-miRNA composed of a single RNA strand with secondary hairpin loop structures that form stretches of double-stranded RNA. After this pre-miRNA is exported out of the nucleus via specific transporter proteins, the cytoplasmic enzyme Dicer trims the pre-miRNA to generate mature double-stranded miRNAs of 21 to 30 nucleotides. The miRNA subsequently unwinds, and the resulting single strands are incorporated into the multiprotein RISC. Base pairing between the single-stranded miRNA and its target mRNA directs RISC to either cleave the mRNA target or to repress its translation.

Fig. 1.4 Roles of long noncoding RNAs (lncRNAs).

Gene Editing

Exciting new developments that permit exquisitely specific genome editing stand to usher in an era of molecular revolution. These advances come from a wholly unexpected source: the discovery of clustered regularly interspaced short palindromic repeats (CRISPRs) and Cas (CRISPR-associated) genes. These are linked genetic elements that endow prokaryotes with a form of acquired immunity against phages and plasmids. Bacteria use this system to sample the DNA of infecting agents, incorporating it into the host genome as CRISPRs. CRISPRs are transcribed and processed into an RNA sequence that binds and directs the nuclease Cas9 to a target sequence (e.g., that of a phage), leading to its cleavage and destruction of the phage. Gene editing repurposes this process by using artificial guide RNAs (gRNAs) that bind Cas9 and are complementary to a DNA sequence of interest.

Once directed to the target sequence by the gRNA, Cas9 induces double-stranded DNA breaks. Repair of the resulting highly specific cleavage sites can lead either to somewhat random disruptive mutations in the targeted sequences (through nonhomologous end joining [NHEJ]) or to the precise introduction of new sequences of interest (by homologous recombination). Both the gRNAs and the Cas9 enzyme can be delivered to cells with a single, easy-to-build plasmid.
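
The targeting step can be sketched as a simple string search. The example below makes several assumptions that go beyond the text: it models the commonly used Cas9 requirement for an "NGG" motif (the PAM) immediately after the target site, it places the cut 3 bases before that motif, it searches only the strand supplied, and it writes the guide as the DNA sequence it matches rather than as RNA. The sequences `TOY_GUIDE` and `TOY_DNA` are invented for illustration.

```python
# Hypothetical 20-nucleotide guide, written as the DNA sequence it matches.
TOY_GUIDE = "GCTAGCTTAACGGATCCATG"
# Hypothetical target region: flanking bases + target site + "AGG" motif.
TOY_DNA = "TTT" + TOY_GUIDE + "AGG" + "CCCAAA"


def find_cas9_cut_sites(dna, guide):
    """Return 0-based cut positions on the given strand.

    A site counts only if the guide-matching sequence is immediately
    followed by an NGG motif (assumed PAM requirement). The cut is placed
    3 bases before that motif (assumed blunt cut position).
    """
    dna = dna.upper()
    guide = guide.upper().replace("U", "T")  # accept an RNA-style guide too
    cut_sites = []
    start = dna.find(guide)
    while start != -1:
        motif_start = start + len(guide)
        motif = dna[motif_start:motif_start + 3]
        if len(motif) == 3 and motif[1:] == "GG":
            cut_sites.append(motif_start - 3)
        start = dna.find(guide, start + 1)
    return cut_sites


def cut(dna, position):
    """Split the sequence at `position`, modeling a double-stranded break."""
    return dna[:position], dna[position:]


if __name__ == "__main__":
    for site in find_cas9_cut_sites(TOY_DNA, TOY_GUIDE):
        left, right = cut(TOY_DNA, site)
        print(site, left, right)
```

What happens after the break is not modeled here: NHEJ would rejoin the two fragments with small, somewhat random insertions or deletions, whereas homologous recombination would copy in a supplied template sequence.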


Reference: Robbins Basic Pathology.

