Sequence Optimization Consideration


Heterologous protein expression


You may benefit from redesigning an existing DNA sequence, or creating one de novo from the protein sequence, so that it is better suited to expression, using a computer-based reverse-translation approach. This work is included as part of our service.


Heterologous protein expression projects can prove demanding. Among the many factors that affect successful expression, the coding sequence itself has been shown in many instances to be a central issue.

Poor or no expression, truncated proteins and amino acid misincorporation are the most common consequences of codon bias and an unbalanced tRNA pool.

This issue has been studied extensively for E. coli expression (1) and is well exemplified by arginine. Of the codons that encode arginine, AGG and AGA are hardly ever found in E. coli ORFs that are highly and continuously expressed during exponential growth. However, these two arginine codons occur at significantly higher frequency in the coding DNA of many other organisms (see table below).

Thus, when using an E. coli expression system, it is important to replace AGG and AGA with arginine codons used more frequently in the host organism, e.g. CGT or CGC.
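The replacement described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by the service: the function name `replace_rare_arg` and the choice to alternate between CGT and CGC are assumptions made for the example. The sequence is read codon by codon, so an AGG or AGA that only appears out of frame is left untouched.

```python
# Rare arginine codons named in the text, and the replacements it suggests.
RARE_ARG = {"AGG", "AGA"}
PREFERRED_ARG = ("CGT", "CGC")


def replace_rare_arg(cds: str) -> str:
    """Swap in-frame AGG/AGA codons for CGT/CGC (hypothetical helper)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = []
    n_replaced = 0
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        if codon in RARE_ARG:
            # Alternating between the two preferred codons is an arbitrary
            # choice for this sketch.
            codon = PREFERRED_ARG[n_replaced % 2]
            n_replaced += 1
        codons.append(codon)
    return "".join(codons)


# Met-Arg-Lys-Arg-stop, with both arginines encoded by rare codons
print(replace_rare_arg("ATGAGGAAAAGATAA"))  # ATGCGTAAACGCTAA
```

The encoded protein is unchanged because all four codons involved specify arginine.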

Arginine codon frequency (% of all Arg codons)
Organisms compared: E. coli, A. thaliana, C. elegans, D. melanogaster, H. sapiens, S. cerevisiae

(1) Hénaut and Danchin, Analysis and Predictions from Escherichia coli sequences. Escherichia coli and Salmonella, Vol. 2, Ch. 114:2047-2066, 1996, Neidhardt FC ed., ASM Press, Washington, D.C.

(2) Recombinant gene source, all genes frequencies,


Gene design


Designing a gene that will express a recombinant protein in a particular organism boils down to choosing the most appropriate triplet for each amino acid. With 64 codons available for 20 amino acids plus termination, there is considerable flexibility to accommodate other constraints in addition to codon bias adjustment, such as:


  • Gene & protein engineering: addition or removal of specific motifs (e.g. restriction sites), tags for purification, multiple stops, etc.
  • Gene manufacturing: maintaining average GC content, avoiding long repeats, palindromes, etc.