Table of Contents

  1. The explanation of this graph
Fig. 01
(Please note that this is a distribution of genome sizes of different species. The vertical axis represents the relative complexity of the species, while the horizontal axis is the logarithmic value of genome size.)

> See details below in **Thought** section

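As species complexity increases, genome size (the effective nucleotide length) also tends to increase, though there are exceptions. The figure shows the wide range of genome sizes across the domains of life: some organisms, such as amoebae, have exceptionally large genomes, while others, such as bacteria, have small ones.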
  2. Why does gene density decrease (↓) and gene size increase (↑) in the genomes of complex eukaryotes?
Fig. 02
> See details below in **Thought** section

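In complex eukaryotes such as humans, genes are usually larger and contain more introns and regulatory regions, reflecting more complex gene regulation and the need for more sophisticated control mechanisms. In simple prokaryotes such as E. coli, by contrast, genes are smaller and usually lack introns because gene expression is more streamlined.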
  1. The difference between U and T is a single methyl group ($-\mathrm{CH}_3$): thymine is 5-methyluracil.

  2. Non-Watson-Crick base pairs can occur in RNA structures due to the flexibility and versatility of RNA molecules. RNA can form a variety of base pairing interactions beyond the traditional Watson-Crick base pairs (A-U, G-C) due to its ability to adopt alternative conformations and structures. These non-canonical base pairs contribute to the structural diversity and functional versatility of RNA molecules, enabling them to perform a wide range of biological functions such as catalysis, gene regulation, and protein synthesis.

  3. Prokaryotic cells typically carry a single copy of their genetic material (they are haploid), whereas most eukaryotic cells carry two or more copies of each chromosome (they are diploid or polyploid).

  4. DNA Supercoil

Figure: Circular DNA supercoiling
<center><small>By Richard Wheeler (Zephyris) - <a class="external free" href="https://upload.wikimedia.org/wikipedia/en/1/1e/Circular_DNA_Supercoiling.png">Source</a>, <a href="http://creativecommons.org/licenses/by-sa/3.0/" title="Creative Commons Attribution-Share Alike 3.0">CC BY-SA 3.0</a>, <a href="https://commons.wikimedia.org/w/index.php?curid=1937295">Link</a></small></center>

**DNA supercoiling** refers to the amount of twist in a particular DNA strand, which determines the amount of strain on it. A given strand may be "positively supercoiled" or "negatively supercoiled" (more or less tightly wound). The amount of a strand's supercoiling affects a number of biological processes, such as compacting DNA and regulating access to the genetic code (which strongly affects DNA metabolism and possibly gene expression). Certain enzymes, such as topoisomerases, change the amount of DNA supercoiling to facilitate functions such as DNA replication and transcription. The amount of supercoiling in a given strand is described by a mathematical formula that compares it to a reference state known as "relaxed B-form" DNA.
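
The "mathematical formula" mentioned above is conventionally written with the linking number, $Lk = Tw + Wr$, and the superhelical density $\sigma = (Lk - Lk_0)/Lk_0$, where $Lk_0$ is the linking number of the relaxed reference state. The sketch below is only an illustration: the function name, the plasmid size, and the value of about 10.5 bp per helical turn for relaxed B-form DNA are assumptions that do not come from these notes.

```python
def superhelical_density(n_bp: int, linking_number: float,
                         bp_per_turn: float = 10.5) -> float:
    """Return sigma = (Lk - Lk0) / Lk0 for a closed circular DNA.

    bp_per_turn is an assumed value for relaxed B-form DNA.
    """
    lk0 = n_bp / bp_per_turn  # linking number of the relaxed reference state
    return (linking_number - lk0) / lk0


# Hypothetical 5,250 bp plasmid: relaxed Lk0 = 5250 / 10.5 = 500 turns.
sigma = superhelical_density(5250, 470)
print(round(sigma, 3))  # -0.06, i.e. negatively supercoiled (underwound)
```

A negative value corresponds to negative supercoiling and a positive value to positive supercoiling.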

Vocabulary

| English | Chinese |
| --- | --- |
| methyl-uracil | 甲基尿嘧啶 |
| stem-loop | 茎环结构 |
| internal loop | 内环 |
| bulge | 鼓起 |
| junction | 会合处 |
| pseudoknot | 伪结 |
| aptamer | 核酸适配体 |
| riboswitch | 核糖开关 |
| RNase P | 核糖核酸酶 P |
| introns | 内含子 |
| viroids | 类病毒 |
| peptidyl transferase | 肽基转移酶 |
| chromatin | 染色质 |
| nonhistone proteins | 非组蛋白 |
| haploid set | 单倍体组 |
| diploid | 二倍体 |
| homolog | 同源染色体 |

Thoughts

The English content below was partially translated by gpt-3.5-turbo-0125.

Question mentioned in TOC-1

This image is a graphical representation of the genome sizes of various organisms, measured in nucleotide pairs per haploid genome. The x-axis represents the genome size on a logarithmic scale, ranging from 10^5 to 10^12 nucleotide pairs. The organisms are grouped into categories, such as mammals, birds, reptiles, amphibians, fishes, crustaceans, insects, nematode worms, plants, algae, fungi, protozoans, bacteria, and archaea. Each organism is represented by a green horizontal bar, which shows the range of genome sizes within that group or species. The yellow lines within the bars indicate specific genome sizes for certain species.

Key Observations:

  • Mammals, Birds, Reptiles: The human genome is highlighted with a yellow line, and the genome size range for this group is between 10^9 and 10^10 nucleotide pairs.
  • Amphibians, Fishes: Species like Fugu, zebrafish, frogs, and newts are shown, with newts having a particularly large genome size (close to 10^11 nucleotide pairs).
  • Crustaceans, Insects: The genome sizes of shrimp and Drosophila (fruit fly) are shown, with shrimp having a larger genome.
  • Nematode Worms: Caenorhabditis is represented with a genome size around 10^8 nucleotide pairs.
  • Plants, Algae: Arabidopsis, wheat, and lily are highlighted, with lily having one of the largest genomes among plants, close to 10^11 nucleotide pairs.
  • Fungi: Yeast is shown with a smaller genome size compared to other organisms, around 10^7 nucleotide pairs.
  • Protozoans: The amoeba has the largest genome size in the chart, approaching 10^12 nucleotide pairs.
  • Bacteria: Mycoplasma and E. coli are represented with relatively small genome sizes, around 10^6 to 10^7 nucleotide pairs.
  • Archaea: This group has a genome size ranging around 10^6 nucleotide pairs.

Conclusion:

The chart illustrates the vast range of genome sizes across different domains of life, with some organisms like amoebas having remarkably large genomes, while others like bacteria have much smaller ones.
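
To make the logarithmic axis concrete, the following sketch computes how many orders of magnitude separate the smallest and largest genomes mentioned above. The values are the rough powers of ten quoted in the observations, not measured data, and the names `approx_genome_size` and `log10_span` are made up for this illustration.

```python
import math

# Approximate genome sizes (nucleotide pairs per haploid genome), taken from
# the order-of-magnitude descriptions in the notes; not measured values.
approx_genome_size = {
    "archaea": 1e6,
    "yeast": 1e7,
    "Caenorhabditis": 1e8,
    "newt": 1e11,
    "lily": 1e11,
    "amoeba": 1e12,
}


def log10_span(sizes: dict) -> float:
    """Return the number of orders of magnitude between min and max size."""
    logs = [math.log10(size) for size in sizes.values()]
    return max(logs) - min(logs)


print(log10_span(approx_genome_size))  # about 6 orders of magnitude
```

On a linear axis the small genomes would be invisible next to the amoeba, which is why the chart uses a logarithmic scale.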

Question mentioned in TOC-2

The conclusion that “gene density decreases while gene size increases in complex eukaryotic genomes” can be deduced from the diagram through the following analysis:

  1. Relationship between Gene Number and Genome Length:
    • Escherichia coli shows 57 genes in the region depicted and has a relatively small genome, so its gene density is high.
    • Saccharomyces cerevisiae shows 31 genes and has a moderately sized genome; its gene density is lower than that of E. coli but still high.
    • Drosophila melanogaster shows 9 genes and has a larger genome, so its gene density is lower.
    • Human shows only 2 genes despite a very large genome, giving the lowest gene density of the four.
  2. Variation in Gene Size:
    • In complex eukaryotes (such as humans), genes are often larger, containing more introns and regulatory regions.
    • Conversely, genes in simple organisms (like Escherichia coli) are smaller, usually lacking introns, and with smaller intergenic regions.
  3. Definition of Gene Density:
    • Gene density is the number of genes per unit length of DNA. In complex organisms, genome size grows much faster than the number of genes, so gene density falls.

From these observations, the conclusion can be drawn that in complex eukaryotic genomes, gene density decreases while gene size increases.
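
The definition of gene density above can be sketched numerically using the gene counts listed for the diagram. This is a minimal sketch under one loud assumption: that each species is drawn over a window of the same length. The value `REGION_KB = 65.0` is a placeholder chosen for illustration and is not stated in these notes.

```python
# Gene counts per depicted region, as listed in the notes.
genes_in_region = {
    "Escherichia coli": 57,
    "Saccharomyces cerevisiae": 31,
    "Drosophila melanogaster": 9,
    "Human": 2,
}

# Assumption: every species is drawn over a window of the same length.
# 65 kb is a placeholder value, not taken from these notes.
REGION_KB = 65.0


def gene_density(n_genes: int, region_kb: float) -> float:
    """Return genes per kilobase for a region of the given length."""
    return n_genes / region_kb


for species, n_genes in genes_in_region.items():
    density = gene_density(n_genes, REGION_KB)
    spacing_kb = REGION_KB / n_genes  # average DNA available per gene
    print(f"{species}: {density:.3f} genes/kb, ~{spacing_kb:.1f} kb per gene")
```

Whatever window length is assumed, the ordering is the same: density falls from E. coli to human, while the average DNA available per gene rises.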

In complex eukaryotic organisms like humans, genes are often larger and contain more introns and regulatory regions due to the increased complexity of gene regulation and the need for more sophisticated control mechanisms. Eukaryotic genes require a variety of regulatory elements to control their expression, including enhancers, silencers, and promoters, which contribute to the larger size of genes. Introns play a role in alternative splicing, which allows a single gene to code for multiple protein isoforms with different functions.

On the other hand, in simple prokaryotic organisms like Escherichia coli (E. coli), genes are smaller and typically do not contain introns because they have more streamlined gene expression processes. Prokaryotic genomes are compact and efficient, with genes arranged closely together without much non-coding DNA between them. This compact organization allows for rapid gene expression and adaptation to changing environmental conditions.

Overall, the differences in gene size and organization between eukaryotic and prokaryotic organisms reflect their distinct evolutionary histories and the complexity of gene regulation in eukaryotes.

Other

  1. Structure and function are adapted to each other in living organisms, and RNA reflects this as well. Two examples:
    • RNA splicing: In eukaryotes, splicing removes non-coding sequences (introns) from the pre-mRNA and joins the coding sequences (exons) to form a mature mRNA. This process is crucial for correct gene expression and the production of functional proteins.
    • The role of ribosomal RNA (rRNA) in protein synthesis: Ribosomes are molecular machines composed of rRNA and proteins that translate mRNA into protein. The structure of rRNA is highly adapted to interact with mRNA and tRNAs in a coordinated manner, ensuring accurate and efficient protein synthesis.
  2. Living organisms show unity within diversity: histones, among the most conserved proteins in eukaryotes, bind about 147 bp of DNA per nucleosome in the same way across different eukaryotic organisms.
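
As a rough illustration of what the 147 bp figure implies, the sketch below estimates how many nucleosomes fit on a stretch of DNA. The linker length between nucleosomes (`linker_bp = 50`) and the function names are assumptions for illustration only; the notes give just the 147 bp core.

```python
CORE_BP = 147  # DNA bound by one histone octamer (figure from the notes)


def nucleosome_count(dna_bp: int, linker_bp: int = 50) -> int:
    """Estimate how many nucleosomes fit on dna_bp of DNA.

    linker_bp is an assumed spacer length between neighbouring nucleosomes.
    """
    return dna_bp // (CORE_BP + linker_bp)


def wrapped_fraction(linker_bp: int = 50) -> float:
    """Fraction of the DNA that sits on histone cores rather than in linkers."""
    return CORE_BP / (CORE_BP + linker_bp)


print(nucleosome_count(1970))        # 10 nucleosomes on 1,970 bp
print(round(wrapped_fraction(), 2))  # 0.75 of the DNA is wrapped
```

Because the core length is the same across eukaryotes, only the assumed linker length changes the estimate from one organism to another.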

Translation

The second and third slides correspond to Chapters 5 and 8 of the textbook.

Chapter 5 The Structure and Versatility of RNA

RNA 的结构和多功能性

Original Text

RNA differs from DNA in the following ways: Its backbone contains ribose rather than 2′-deoxyribose; it contains the pyrimidine uracil in place of thymine; and it usually exists as a single polynucleotide chain, without a complementary chain. As a consequence of being a single strand, RNA can fold back on itself to form short stretches of double helix between regions that are complementary to each other. RNA allows a greater range of base pairing than does DNA. Thus, as well as A:U and C:G pairing, non-Watson-Crick pairing is also seen, such as U pairing with G. This capacity to form noncanonical base pairs adds to the propensity of RNA to form double-helical segments. Freed of the constraint of forming long-range regular helices, RNA can form complex tertiary structures, which are often based on unconventional interactions between bases and the sugar-phosphate backbone.

Some RNAs act as enzymes—they catalyze chemical reactions in the cell and in vitro. These RNA enzymes are known as ribozymes. Most ribozymes act on phosphorous centers, as in the case of the ribonuclease RNase P. RNase P is composed of protein and RNA, but it is the RNA moiety that is the catalyst. The hammerhead is a self-cleaving RNA, which cuts the RNA backbone via the formation of a $2^\prime,3^\prime$ cyclic phosphate. Peptidyl transferase is an example of a ribozyme that acts on a carbon center. This ribozyme, which is responsible for the formation of the peptide bond, is one of the RNA components of the ribosome.
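
The base-pairing rules described in this excerpt (A:U and C:G, plus the non-Watson-Crick G:U pair) can be sketched as a small check of whether two RNA segments could form a double-helical stem when the strand folds back on itself. The function name `can_form_stem` and the example sequences are hypothetical, and the check ignores thermodynamics and loop geometry entirely.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}  # the non-Watson-Crick pair named in the text


def can_form_stem(strand_5p: str, strand_3p: str,
                  allow_wobble: bool = True) -> bool:
    """Return True if two segments (both written 5'->3') can pair antiparallel."""
    allowed = (WATSON_CRICK | WOBBLE) if allow_wobble else WATSON_CRICK
    if len(strand_5p) != len(strand_3p):
        return False
    # Antiparallel pairing: first base of one segment faces last base of the other.
    return all((a, b) in allowed
               for a, b in zip(strand_5p, reversed(strand_3p)))


print(can_form_stem("GGGU", "AUCU"))                      # True (uses G:U pairs)
print(can_form_stem("GGGU", "AUCU", allow_wobble=False))  # False
```

The second call shows why allowing G:U pairs increases the number of segments that can form helices.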

Translated Text

RNA 与 DNA 在以下几个方面有所不同:其骨架含有核糖,而不是 2′-脱氧核糖;它含有嘧啶尿嘧啶(uracil),而不是胸腺嘧啶(thymine);并且通常以单链多核苷酸的形式存在,没有互补链。由于是单链,RNA 可以自我折叠,形成短的双螺旋结构,这些结构位于互补区域之间。RNA 允许比 DNA 更广泛的碱基配对。因此,除了 A:U 和 C:G 配对外,还可以观察到非 Watson-Crick 配对,例如 U 与 G 的配对。这种形成非典型碱基对的能力增加了 RNA 形成双螺旋片段的倾向。摆脱了形成长程规则螺旋的限制,RNA 可以形成复杂的三级结构,这些结构通常基于碱基与糖-磷酸骨架之间的非常规相互作用。

一些 RNA 作为酶发挥作用——它们催化细胞内和体外的化学反应。这些 RNA 酶被称为 核酶(ribozymes)。大多数核酶作用于磷中心,例如核糖核酸酶 RNase P。RNase P 由蛋白质和 RNA 组成,但催化剂是 RNA 部分。锤头 RNA 是一种自切割 RNA,通过形成 $2^\prime,3^\prime$ 环状磷酸盐切割 RNA 骨架。肽酰转移酶是作用于碳中心的核酶的一个例子。这种核酶负责肽键的形成,是核糖体的 RNA 组分之一。

Chapter 8 Structure, Chromatin, and the Nucleosome

结构,染色质和核小体

Original Text

Within the cell, DNA is organized into large structures called chromosomes. Although the DNA forms the foundations for each chromosome, approximately half of each chromosome is composed of protein. Chromosomes can be either circular or linear; however, each cell has a characteristic number and composition of chromosomes. We now know the sequence of the entire genome of thousands of organisms. These sequences have revealed that the underlying DNA of each organism’s chromosomes is used more or less efficiently to encode proteins. Simple organisms tend to use the majority of DNA to encode protein; however, more complex organisms use only a small portion of their DNA to encode proteins. The increased complexity of regulatory sequences, the appearance of introns, and the presence of additional regulatory RNAs (e.g., miRNAs) all contribute to the expansion of the non-coding regions of the genomes of more complex organisms.

Cells must carefully maintain their complement of chromosomes as they divide. Each chromosome must have DNA elements that direct chromosome maintenance during cell division. All chromosomes must have one or more origins of replication. In eukaryotic cells, centromeres play a critical role in the segregation of chromosomes, and telomeres help to protect and replicate the ends of linear chromosomes. Eukaryotic cells carefully separate the events that duplicate and segregate chromosomes as cell division proceeds. Chromosome segregation can occur in one of two ways. During mitosis, a highly specialized apparatus ensures that one copy of each duplicated chromosome is delivered to each daughter cell. During meiosis, an additional round of chromosome segregation (without DNA replication) reduces the number of chromosomes in the resulting daughter cells by half to generate haploid gametes.

The combination of eukaryotic DNA and its associated proteins is referred to as chromatin. The fundamental unit of chromatin is the nucleosome, which is made up of two copies each of the core histones (H2A, H2B, H3, and H4) and $\sim$147 bp of DNA. This protein-DNA complex serves two important functions in the cell: it compacts the DNA to allow it to fit into the nucleus, and it restricts the accessibility of the DNA. This latter function is extensively exploited by cells to regulate many different DNA transactions including gene expression.

The atomic structure of the nucleosome shows that the DNA is wrapped about 1.7 times around the outside of a disc-shaped, histone protein core. The interactions between the DNA and the histones are extensive but uniformly base-nonspecific. The nature of these interactions explains both the bending of the DNA around the histone octamer and the ability of virtually all DNA sequences to be incorporated into a nucleosome. This structure also reveals the location of the amino-terminal tails of the histones and their role in directing the path of the DNA around the histones.

Once DNA is packaged into nucleosomes, it has the ability to form more complex structures that further compact the DNA. This process is facilitated by a fifth histone called H1. By binding the DNA both within and adjacent to the nucleosome, H1 causes the DNA to wrap more tightly around the octamer. A more compact form of chromatin, the 30-nm fiber, is readily formed by arrays of H1-bound nucleosomes. This structure is more repressive than DNA packaged into nucleosomes alone. The incorporation of DNA into this structure results in a dramatic reduction in its accessibility to the enzymes and proteins involved in transcription of the DNA.

The interaction of the DNA with the histones in the nucleosome is dynamic, allowing DNA-binding proteins intermittent access to the DNA. Nucleosome-remodeling complexes increase the accessibility of DNA incorporated into nucleosomes by increasing the mobility of nucleosomes. Two forms of mobility can be observed: sliding of the histone octamer along the DNA or complete release of the histone octamer from the DNA. In addition, these complexes facilitate the exchange of H2A/H2B dimers. Nucleosome-remodeling complexes are recruited to particular regions of the genome to facilitate alterations in chromatin accessibility. A subset of nucleosomes is restricted to fixed sites in the genome and is said to be “positioned.” Nucleosome positioning can be directed by DNA-binding proteins or particular DNA sequences.

Modification of the histone amino-terminal tails also alters the accessibility of chromatin. The types of modifications include acetylation and methylation of lysines, methylation of arginines, and phosphorylation of serines, threonines, and tyrosines. Acetylation of amino-terminal tails is frequently associated with regions of active gene expression and inhibits formation of the 30-nm fiber. Histone modifications alter the properties of the nucleosome itself, as well as acting as binding sites for proteins that influence the accessibility of the chromatin. In addition, these modifications recruit enzymes that perform the same modification, leading to similar modification of adjacent nucleosomes and facilitating the stable propagation of regions of modified nucleosomes/chromatin as the chromosomes are duplicated.

Nucleosomes are assembled immediately after the DNA is replicated, leaving little time during which the DNA is unpackaged. Assembly involves the function of specialized histone chaperones that escort the H3.H4 tetramers and H2A.H2B dimers to the replication fork. During the replication of the DNA, nucleosomes are transiently disassembled. Histone H3.H4 tetramers and H2A.H2B dimers are randomly distributed to one or the other daughter molecule. On average, each new DNA molecule receives half old and half new histones. Thus, both chromosomes inherit modified histones that can then act as “seeds” for the similar modification of adjacent histones.
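
The statement that each daughter molecule receives, on average, half old and half new histones follows from random distribution, which a short simulation can illustrate. This is a deliberately simplified sketch: it treats each parental nucleosome as a single unit that goes to one daughter with probability 1/2 (the excerpt describes H3.H4 tetramers and H2A.H2B dimers being distributed separately), and the function name and seed are arbitrary choices.

```python
import random


def old_histone_fraction(n_nucleosomes: int, seed: int = 0) -> float:
    """Simulate one daughter molecule and return its share of old histones.

    Simplification: each parental nucleosome is one unit that lands on this
    daughter with probability 1/2; the remaining sites are assumed to be
    filled with newly synthesized histones.
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    inherited_old = sum(rng.random() < 0.5 for _ in range(n_nucleosomes))
    return inherited_old / n_nucleosomes


fraction = old_histone_fraction(10_000)
print(f"old histones on this daughter: {fraction:.2%}")
```

With many nucleosomes the fraction stays close to one half, so both daughters reliably inherit some modified histones that can seed further modification.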

Translated Text

在细胞内,DNA 被组织成被称为染色体的大型结构。虽然 DNA 构成了每个染色体的基础,但大约一半的染色体由蛋白质组成。染色体可以是圆形或线性的;然而,每个细胞都有一定数量和组成的染色体特征。我们现在已经知道成千上万种生物的整个基因组序列。这些序列揭示了每个生物的染色体底层 DNA 被更或少有效地用于编码蛋白质。简单生物倾向于使用大部分 DNA 编码蛋白质;然而,更复杂的生物只使用少部分 DNA 编码蛋白质。调控序列的复杂性增加、内含子的出现以及额外的调控 RNA(例如 miRNA)的存在,都导致更复杂生物的基因组非编码区域的扩展。

细胞在分裂时必须仔细维持其染色体组合。每个染色体必须具有指导细胞分裂期间染色体维持的 DNA 元素。所有染色体必须具有一个或多个复制起点。在真核细胞中,着丝粒在染色体分离中起着关键作用,端粒有助于保护和复制线性染色体的末端。真核细胞在细胞分裂进行时仔细分离复制和分离染色体的事件。染色体分离可以通过两种方式之一进行。在有丝分裂期间,高度专门化的装置确保每个复制的染色体拷贝被传递给每个子细胞。在减数分裂期间,染色体分离的额外一轮(不进行 DNA 复制)将导致生成单倍体配子的结果子细胞染色体数量减半。

真核 DNA 及其相关蛋白质的组合被称为染色质。染色质的基本单位是核小体,由两份核心组蛋白(H2A、H2B、H3 和 H4)和约 147 bp 的 DNA 组成。这种蛋白质 - DNA 复合物在细胞中发挥两个重要功能:它压缩 DNA 以使其适应细胞核,同时限制 DNA 的可访问性。细胞广泛利用后者的功能来调控许多不同的 DNA 事务,包括基因表达。

核小体的原子结构显示 DNA 大约绕着一个盘状的组蛋白核心外部约 1.7 圈。 DNA 与组蛋白之间的相互作用广泛但均匀且与碱基无关。这些相互作用的性质解释了 DNA 围绕组蛋白八聚体弯曲的现象,以及几乎所有 DNA 序列都能被纳入核小体的能力。这种结构还揭示了组蛋白氨基末端尾部的位置及其在引导 DNA 绕组蛋白路径中的作用。

一旦 DNA 被包装成核小体,它就有能力形成更复杂的结构,进一步压缩 DNA。这一过程由第五种组蛋白 H1 促进。通过在核小体内部和相邻的 DNA 上结合,H1 使 DNA 更紧密地绕组蛋白八聚体。一种更紧凑的染色质形式,30nm 纤维,可以通过 H1 结合的核小体阵列轻松形成。这种结构比单独包装成核小体的 DNA 更具抑制性。将 DNA 纳入这种结构会显著降低其对参与 DNA 转录的酶和蛋白质的可访问性。

DNA 与核小体中组蛋白的相互作用是动态的,允许 DNA 结合蛋白间歇性地访问 DNA。核小体重塑复合物通过增加核小体的移动性来增加纳入核小体的 DNA 的可访问性。可以观察到两种形式的移动性:组蛋白八聚体沿 DNA 滑动或完全释放。此外,这些复合物促进 H2A/H2B 二聚体的交换。核小体重塑复合物被招募到基因组的特定区域,以促进染色质可访问性的改变。某些核小体受限于基因组中的固定位点,被称为“定位”。核小体定位可以由 DNA 结合蛋白或特定 DNA 序列指导。

组蛋白氨基末端尾部的修饰也会改变染色质的可访问性。这些修饰类型包括赖氨酸的乙酰化和甲基化、精氨酸的甲基化,以及丝氨酸、苏氨酸和酪氨酸的磷酸化。氨基末端尾部的乙酰化经常与活跃基因表达区域相关,并抑制 30nm 纤维的形成。组蛋白修饰改变了核小体本身的性质,同时作为影响染色质可访问性的蛋白质结合位点。此外,这些修饰招募执行相同修饰的酶,导致相邻核小体/染色质区域的类似修饰,促进染色体复制时修改核小体/染色质区域的稳定传播。

DNA 复制后立即装配核小体,使 DNA 解包的时间很短。装配涉及专门的组蛋白伴侣,它们将 H3.H4 四聚体和 H2A.H2B 二聚体引导到复制叉。在 DNA 复制过程中,核小体会暂时解体。组蛋白 H3.H4 四聚体和 H2A.H2B 二聚体会随机分布到一个或另一个子分子。平均而言,每个新的 DNA 分子会接收一半旧的和一半新的组蛋白。因此,两条染色体都会继承修饰的组蛋白,这些组蛋白随后可以作为相邻组蛋白类似修饰的“种子”。