A professor at the Immanuel Kant Baltic Federal University has shown that the genetic code and its evolution can be described with linguistic methods: its history is presented as semiosis, that is, the emergence of semiotic relations and the processing of linguistic information as communication. The scientist previously described the genetic code as a semiotic system governed by a set of rules (“grammar”), and this article continues that work. The results of the research, supported by a grant from the Russian Science Foundation, are published in Semiotica, a leading semiotics journal.

Newswise — From the very origin of genetics, scientists found a certain analogy between language and genetic information processing. Researchers began to compare the genome with texts also because the origin of the genetic code was an unsolved question in biology, as well as the origin of natural language in linguistics.

The genetic code has a dual nature: it performs not only biochemical but also informational functions, which can be described as a system of signs governed by their position, linear order, and context. Genes thus constitute a program for the embryonic development of a biological structure. This program resembles a linear text that is written according to specific rules and contains information about biochemical molecular structures and their functions.

All information in genes is written with four “letters”, the nucleotides guanine (G), cytosine (C), adenine (A), and uracil (U). They are grouped into “three-letter words”, triplets (codons), which code for amino acids. Genes can therefore be treated as information units of heredity, because codons differ not only in the composition of their elements but also in the linear sequence of symbols.

Different combinations of three letters from the set “A, U, G, C” yield 64 possible triplets (4 × 4 × 4). However, this is not random combinatorics but a system governed by certain rules, and it can be described by analogy with grammar. Semiotics, the science that studies the general rules of transferring information through signs, provides the methodological tools for this.
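The count of 64 can be checked directly. The following is a minimal Python sketch, not part of the published study, that enumerates every ordered three-letter word over the four-letter alphabet; the names ALPHABET and codons are chosen here purely for illustration.

```python
from itertools import product

ALPHABET = "UCAG"  # the four RNA "letters" named in the text

# Every ordered three-letter word over the four-letter alphabet.
codons = ["".join(triplet) for triplet in product(ALPHABET, repeat=3)]

print(len(codons))   # 64, i.e. 4 ** 3
print(codons[:4])    # ['UUU', 'UUC', 'UUA', 'UUG']
```

Order matters in this enumeration: “AUG” and “GUA” contain the same letters but count as different words, which is the point the text makes about linear sequence.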

Suren Zolyan, Doctor of Philology and Professor, a staff member of the Immanuel Kant Baltic Federal University (Kaliningrad), the National Academy of Sciences of Armenia (Yerevan), and the Institute of Scientific Information on Social Sciences of the Russian Academy of Sciences (Moscow), proposes an original conception of structural-semiotic analysis of the genetic code: it can be studied both as a semiotic system and as a process of transferring genetic information. This meta-representation of the well-known basics of molecular genetics provides a new explanation of informational genetic processes and mechanisms, based on their similarities to and differences from language.

The impetus for this theory comes from the following statement by the discoverer of the genetic code, Francis Crick: “The genetic code is the small dictionary, which relates the four-letter language of nucleic acids to the twenty-letter language of the proteins”. The author developed this idea and presented the genetic code as a language consisting of four blocks: an alphabet, a dictionary, a grammar in the form of word-formation rules, and correspondence rules that correlate dictionary units with grammatical categories. This treatment makes it possible to observe the systemic and structural characteristics of various genetic processes, such as protein synthesis.
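Crick's “small dictionary” can be written out literally. The sketch below is an illustration rather than the author's formalism: it builds the standard genetic code as a Python dictionary from codons to one-letter amino acid symbols, with “*” marking stop codons. The names BASES, AMINO_ACIDS, and CODON_TABLE are assumptions introduced for this example.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one symbol per codon in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# The "dictionary": each three-letter word is paired with its meaning.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

amino_acids = set(CODON_TABLE.values()) - {"*"}
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")

print(len(CODON_TABLE), len(amino_acids))  # 64 20
print(stop_codons)                         # ['UAA', 'UAG', 'UGA']
print(CODON_TABLE["AUG"])                  # M (methionine)
```

The output shows the two “languages” from the quotation side by side: 64 words over a four-letter alphabet on one side, 20 amino acid “letters” plus a stop signal on the other.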

Distinguishing between an alphabet (nucleotides), a lexicon (codons), and grammatical categories (empty positions within triplets) makes it possible to identify the rules by which the significant units of the genetic code (doublets and triplets) are formed and to explain their compositional semantics, that is, the rules of correspondence between codons and amino acids. These context-sensitive rules make it possible to describe cases in which a biochemically identical sequence of nucleotides acquires a different meaning and performs a different function depending on its position. This also helps to establish an individual profile for every nucleotide.
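One way to see why doublets as well as triplets can count as significant units is to ask, for each two-letter beginning, whether the third letter changes the meaning. The Python sketch below does this against the standard genetic code. It is only an illustration under stated assumptions: the function third_position_readings and the labels self_sufficient and context_dependent are hypothetical names, not the author's terminology or method.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one symbol per codon in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def third_position_readings(doublet):
    """Return the set of meanings a doublet can take as the third letter varies."""
    return {CODON_TABLE[doublet + third] for third in BASES}

doublets = ["".join(d) for d in product(BASES, repeat=2)]
# Doublets whose meaning is already fixed by the first two letters.
self_sufficient = [d for d in doublets if len(third_position_readings(d)) == 1]
# Doublets that need the third letter as disambiguating context.
context_dependent = [d for d in doublets if d not in self_sufficient]

print(self_sufficient)    # ['UC', 'CU', 'CC', 'CG', 'AC', 'GU', 'GC', 'GG']
print(context_dependent)  # the other eight doublets
print(sorted(third_position_readings("AU")))  # ['I', 'M']
```

In the standard code, half of the sixteen doublets already determine the amino acid, while for the other half the third position decides the reading, as with “AU”, which yields isoleucine or methionine.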

The researcher studied various hypotheses about the origin and evolution of the genetic code and concluded that it was based on the formation of the fundamental semiotic principle of the arbitrariness of the sign, that is, an unmotivated semantic connection between signifier and signified. This was also envisaged by Crick, who saw in these unmotivated connections the main difference between the genetic code and Mendeleev’s periodic table. In addition, features characteristic of information-coding processes, such as error minimization and disambiguation, also played a crucial role in the course of evolution.

Summing up his observations, Suren Zolyan introduces the concept of semiopoiesis, the final stage of the self-organization of biological systems (autopoiesis). Associations of material phenomena (in this case, nucleotides and amino acids) led to semiotic connections, and as a result, information mechanisms emerged. These processes make stable forms of life possible. The rising complexity of organization leads to the crystallization of informational and semiotic principles.

The dualism of genetic information may be explained by the fact that a biochemical substance acquires a semiotic form. In general, the evolutionary process should be treated as semiosis in action, which leads to the formation of new, more complex semiotic structures. These structures, however, use the same substance (the same minimal set of nucleotides).

Since genes have an inner structural hierarchy, information processing can be presented as follows. On the first, pre-textual level, the nucleotides in a gene are combined into triplets; on the second level, triplets encode amino acids. If one compares this principle to a natural language, nucleotides, triplets, and amino acids correspond, respectively, to phonemes, morphemes (parts of a word such as a prefix, root, or suffix), and words. On the third level, sequences of amino acids form information blocks for proteins, much as words form a sentence. As the biochemical regularities become more complex, genetic processing is supplemented by linguistic and semiotic principles. The meaning of such texts emerges from biochemical differences, namely differences in the sequence of “letters” (nucleotides). At this stage, communication appears as a systemic interactive relationship whose characteristics resemble those of semiotic systems more than those of biological ones.
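The three levels described above can be sketched as a short Python function. This is a simplified illustration, not a model of real protein synthesis: the function read_gene, its frame parameter, and the toy sequence are hypothetical, and the codon table is the standard genetic code.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one symbol per codon in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def read_gene(rna, frame=0):
    """Read an RNA string on three levels and return (triplets, amino acid chain)."""
    # Level 1: group the nucleotides into complete triplets, starting at `frame`.
    triplets = [rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)]
    # Level 2: look each triplet up in the codon dictionary.
    residues = [CODON_TABLE[t] for t in triplets]
    # Level 3: join the amino acids into a chain that ends at the first stop codon.
    chain = "".join(residues).split("*")[0]
    return triplets, chain

message = "AUGGCUUGGUAA"  # hypothetical toy sequence, not taken from the study

print(read_gene(message))           # (['AUG', 'GCU', 'UGG', 'UAA'], 'MAW')
print(read_gene(message, frame=1))  # (['UGG', 'CUU', 'GGU'], 'WLG')
```

Shifting the starting position by one letter regroups the same nucleotides into different triplets and yields a different chain, a small example of how an identical sequence can acquire a different meaning depending on position.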

The genetic code is not something that exists eternally and invariably; it is the product of a multi-stage evolution, which has led to various synchronic and diachronic variants of the genetic code. The evolution of the genetic code can be viewed as a process of semiopoiesis, or semiosis in action. The genetic code was born out of matter, just as the organic world grows out of the inorganic through the introduction of new organizational principles of autopoiesis. Thus, the association of material phenomena (in this case, nucleotides and amino acids) led to the creation of semiotic connections. As a final result of random processes, mechanisms for storing and transmitting information emerged, making stable forms of life possible. The increasing complexity of organization leads to the crystallization of informational and semiotic principles.

“Semiopoiesis, recursive auto-referential processing of semiotic system, becomes a form of organization of the bio-world when and while notions of meaning and aiming are introduced into it”, summarizes Suren Zolyan.

Journal Link: Semiotica 2022
