Removing only seven amino acids from the C terminus and one from the N terminus destroys GFP function (
Yang
et al.
1996
). In the
GFP
coding sequence (
Figure S1
), premature termination codons (PTCs) can be induced by EMS only at codons for glutamine and tryptophan (
Table S1
). The fact that we retrieved mutations affecting the single tryptophan in the GFP protein (Trp57) and six of eight glutamine residues (only Gln80 and Gln204 were not mutated in our screens;
Figure S1
) suggests that the combined screens approached saturation. The truncations resulting from these PTCs far exceed those tolerated at the N and C termini of GFP, which is consistent with the lack of GFP protein accumulation (
) and visible fluorescence in PTC mutant seedlings (
).
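The codon logic behind this statement can be sketched computationally. The following minimal example assumes that EMS acts predominantly through G-to-A and C-to-T transitions on the coding strand, an assumption that is not spelled out in the text, and uses the standard genetic code. The function names are illustrative only. Across all 64 codons the sketch flags the glutamine codons (CAA, CAG), the tryptophan codon (TGG), and additionally the arginine codon CGA; the statement above concerns the codons actually present in the GFP coding sequence (Table S1), which is not reproduced here.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumption: EMS induces G->A and C->T transitions as read on the coding strand.
EMS = {"G": "A", "C": "T"}


def ems_mutants(codon):
    """Return all codons reachable from `codon` by one EMS-type transition."""
    return [
        codon[:i] + EMS[base] + codon[i + 1:]
        for i, base in enumerate(codon)
        if base in EMS
    ]


def ptc_sources():
    """Map each sense codon that can mutate to a stop codon to its amino acid."""
    return {
        codon: aa
        for codon, aa in CODON_TABLE.items()
        if aa != "*" and any(CODON_TABLE[m] == "*" for m in ems_mutants(codon))
    }


if __name__ == "__main__":
    for codon, aa in sorted(ptc_sources().items()):
        print(codon, aa)
```

Restricting the input to the codons found in a given coding sequence, rather than all 64, would give the sequence-specific list of possible PTC sites.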
Chromophore
From the chromogenic tripeptide (Thr65-Tyr66-Gly67), we recovered substitutions of Thr65 (T65I) and Gly67 (G67S and G67D) (
,
). Because the only possible EMS-induced change in tyrosine codons is silent (
Table S1
), it was not possible to obtain substitutions of highly conserved Tyr66 in our screen. Gly67 is also among the 23 highly conserved amino acids in GFP-related proteins (
Ong
et al.
2011
). Gly67 mutant plants did not show visible GFP fluorescence (
,
Figure S2
), although they did accumulate detectable, albeit reduced, levels of GFP protein (
), suggesting that the Gly67 substitutions compromise GFP protein stability to some extent. The lack of visible fluorescence despite detectable protein accumulation is consistent with previous results showing that substituting Gly67 with any other amino acid prevents chromophore formation (
Lemay
et al.
2008
;
Stepanenko
et al.
2013
). Glycine is unique in having a single hydrogen atom as its side chain, whereas the side chains of all other amino acids begin with a carbon atom; this endows glycine with exceptional conformational flexibility (
Betts and Russell, 2003
). Thus, glycine is the only amino acid at position 67 that permits formation of a kinked internal α-helix, which places Gly67 close to the residue at position 65 for nucleophilic attack during chromophore synthesis (
Lemay
et al.
2008
;
Stepanenko
et al.
2013
).
Position 65 is considered the most variable of the chromogenic tripeptide (
Stepanenko
et al.
2013
). Accordingly, the T65I substitution we identified reduced but did not eliminate visible fluorescence in mutant seedlings (
Figure S2
). Similar to the G67 substitutions, the amount of GFP protein in the T65I mutant was lowered but not abolished (
), implying reduced GFP protein stability in addition to probable inhibition of chromophore formation in this mutant.
Chromophore maturation
Chromophore synthesis requires a network of polar interactions involving amino acids of the chromogenic tripeptide as well as several closely apposed amino acids, including four that we identified in our screen: Glu222 and Arg96, which are among the 23 most highly conserved amino acids in GFP-related proteins (
Ong
et al.
2011
), as well as Thr62 and Ser205 (
Yang
et al.
1996
;
Ormö
et al.
1996
). In particular, Arg96, which acts as an electrostatic catalyst, and Glu222, which behaves as a base catalyst, are considered critical features for chromophore synthesis (
Stepanenko
et al.
2013
).
The E222K substitution we identified did not eliminate fluorescence but resulted in weak to moderate GFP fluorescence in seedlings (
,
Figure S2
). In accordance with this finding, an E222G mutant has been reported previously to fluoresce, indicating that Glu222 is not absolutely required for chromophore formation (
Lemay
et al.
2008
). Only low levels of GFP protein were observed in the E222K mutant (
), suggesting an impact of the E222K substitution on GFP protein stability. In addition to a catalytic role, Glu222 has been reported to have a stabilizing function by contributing to the rigidity of the chromophore cavity (
Royant and Noirclerc-Savoye 2011
;
Stepanenko
et al.
2013
).
The S205F mutant displayed weak to moderate GFP fluorescence in seedlings (
Table S1
) despite the accumulation of nearly wild-type levels of GFP protein (
). These findings are consistent with the role of Ser205 in the hydrogen bonding network required for chromophore maturation (
Yang
et al.
1996
).
The most extreme losses of fluorescence (
Figure S2
) accompanied by nearly wild-type levels of GFP protein (
) were observed with the Arg96 substitutions: R96C and R96H (
). Although an R96C substitution has been reported previously to fluoresce (
Lemay
et al.
2008
), our findings with this substitution as well as R96H suggest that Arg96 is essential for chromophore formation and hence GFP fluorescence in plants. Arg96 forms a hydrogen bond with Thr62 (
Ormö
et al.
1996
) and, in our study, a T62I substitution resulted in decreased fluorescence in mutant seedlings (
). However, the diminished fluorescence was likely a reflection of decreased protein stability because the T62I mutant did not accumulate appreciable amounts of GFP protein (
).
Highly conserved glycine residues and GFP protein stability
There are 22 glycine residues in the GFP protein (
Figure S1
). Along with substitutions of Gly67 in the chromogenic tripeptide, loss-of-function substitutions were recovered for seven additional glycines, all of which are among the 23 most highly conserved amino acids in GFP-related proteins (
Ong
et al.
2011
) (
). The roles of these glycine residues have so far been unclear (
Stepanenko
et al.
2013
), but our data suggest an essential role in GFP protein stability.
We recovered substitutions in three highly conserved "lid" residues in our screen: Gly91 is situated at the lid of the N and C termini, which are in close proximity at the same end of the barrel (
), whereas Gly20 and Gly127 are located on the opposite lid, which is referred to as the "top" of the barrel (
Ong
et al.
2011
;
Zimmer
et al.
2014
). Consistent with the fact that glycines can reside in parts of proteins that are unable to accommodate other amino acids, such as tight turns in structures (
Betts and Russell, 2003
), these three lid residues as well as Gly40, also identified in our screen, are all present in the vicinity of bends representing transitions from β-strands to loops in the folding pattern of GFP (
) (
Zimmer
et al.
2014
).
Additional essential glycines identified in the screen include Gly31, Gly33, and Gly35, which are located on the second β-strand of the 11 β-strand barrel (
). These three glycines are the only conserved residues residing on β-strands that are not involved in chromophore formation and their function is uncertain (
Ong
et al.
2011
;
Stepanenko
et al.
2013
). With the exception of G35S, which showed very weak fluorescence in seedlings (
Figure S2
) and very low levels of GFP protein (
), all of the glycine substitutions resulted in a lack of both visible fluorescence (
) and detectable GFP protein (
) despite normal transcription of
GFP
mRNA (
). Our results implicate these highly conserved glycine residues in facilitating GFP protein folding and stability, which is consistent with a previous computational analysis of the structures of GFP and related proteins (
Zimmer
et al.
2014
).
Nonconserved residues contributing to GFP protein folding or stability
Four loss-of-function mutations in residues that are not among the most highly conserved amino acids of GFP-related proteins were identified in our screen (
). P56L, A110V, and V112M substitutions are likely to represent residues important for folding or structural stability because GFP protein did not accumulate to detectable levels in the respective mutants (
). Pro56 is located at the beginning of the internal α-helix containing the chromogenic tripeptide (
) and, together with several other proline residues, is thought to be important for maintaining kinks in the α-helical backbone, which are necessary for chromophore synthesis (
Stepanenko
et al.
2013
). Ala110 and Val112 are not at transitions in the secondary structure (
), and they are not among the 23 most highly conserved amino acid residues (
) (
Ong
et al.
2011
). The details of their contributions to GFP protein folding and stability thus remain to be clarified.
A C70Y substitution also significantly decreased accumulation of GFP protein (
), resulting in substantially reduced visible fluorescence in seedlings (
Figure S2
). Cys70 is relatively close to the chromophore and to a transition between the internal α-helix and a loop (
). In contrast to the C70Y loss-of-function substitution, a C70V substitution was found to improve folding properties of a GFP variant (
Zapata-Hommer and Griesbeck 2003
). The different effects of the two mutations suggest that different substitutions of Cys70 can have either positive or negative effects on GFP folding and stability.
Highly conserved residues not identified in this genetic screen
We recovered loss-of-function substitutions in 11 of the 21 highly conserved residues found in the GFP protein used in this study (
Ong
et al.
2011
) (
,
Figure S1
). The remaining 10 highly conserved residues include Tyr66 of the chromogenic tripeptide, Phe27, Phe130, Leu53, and Ile136 (
Figure S1
). As described above, EMS-induced mutation of tyrosine codons produces only a silent change, and the same holds for the codons of phenylalanine, leucine, and isoleucine (
Table S1
). Therefore, EMS-induced mutagenesis cannot be used to probe the contributions of these highly conserved amino acids to GFP fluorescence and stability.
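The silent-only behavior of these codons can be checked with a short sketch. As an assumption not stated explicitly in the text, EMS is modeled as producing only G-to-A and C-to-T transitions on the coding strand, and the standard genetic code is used; the helper names are hypothetical. Under this model every tyrosine, phenylalanine, and isoleucine codon yields only silent changes. For leucine the outcome depends on the codon: TTA, TTG, CTA, and CTG give only silent changes, whereas CTT and CTC can be converted to phenylalanine codons, so the statement for leucine presumably reflects the specific codons used in this GFP sequence (Table S1).

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumption: EMS induces G->A and C->T transitions as read on the coding strand.
EMS = {"G": "A", "C": "T"}


def ems_mutants(codon):
    """Return all codons reachable from `codon` by one EMS-type transition."""
    return [
        codon[:i] + EMS[base] + codon[i + 1:]
        for i, base in enumerate(codon)
        if base in EMS
    ]


def nonsilent_outcomes(codon):
    """Amino acids (or '*') other than the original that one transition can give."""
    original = CODON_TABLE[codon]
    return {CODON_TABLE[m] for m in ems_mutants(codon)} - {original}


def summarize(amino_acid):
    """Per-codon non-silent outcomes for every codon of `amino_acid`."""
    return {
        codon: nonsilent_outcomes(codon)
        for codon, aa in CODON_TABLE.items()
        if aa == amino_acid
    }


if __name__ == "__main__":
    for aa in "YFLI":  # Tyr, Phe, Leu, Ile
        print(aa, summarize(aa))
```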
Assuming that conservation implies an essential function, it is not clear why we did not recover loss-of-function mutations in the final five highly conserved residues: Val55, Asp102, Gly104, Gly134, and Pro196 (
Figure S1
) (
Ong
et al.
2011
). The codons of these amino acids are in principle targets of EMS-induced point mutations that result in amino acid substitutions (
Table S1
). In particular, the failure to recover substitutions of Gly104 and Gly134 is unexpected, because mutations altering seven other highly conserved glycines were retrieved in the screen, in most cases more than once (
). These results may indicate that our screen is not yet saturated or that substitutions in these highly conserved residues do not lead to losses of GFP fluorescence that are detectable by our visual screening procedure.
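To illustrate why valine, aspartate, glycine, and proline codons are in principle mutable to other amino acids, the sketch below lists the substitutions reachable by a single transition. It assumes, beyond what the text states, that EMS produces only G-to-A and C-to-T changes on the coding strand, and it pools all standard-code codons for each amino acid rather than using the specific codons of the GFP sequence in Table S1. The function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumption: EMS induces G->A and C->T transitions as read on the coding strand.
EMS = {"G": "A", "C": "T"}


def ems_mutants(codon):
    """Return all codons reachable from `codon` by one EMS-type transition."""
    return [
        codon[:i] + EMS[base] + codon[i + 1:]
        for i, base in enumerate(codon)
        if base in EMS
    ]


def reachable_substitutions(amino_acid):
    """Amino acids reachable from any codon of `amino_acid` by one transition."""
    outcomes = set()
    for codon, aa in CODON_TABLE.items():
        if aa == amino_acid:
            outcomes |= {CODON_TABLE[m] for m in ems_mutants(codon)}
    return outcomes - {amino_acid}


if __name__ == "__main__":
    for aa in "VDGP":  # Val, Asp, Gly, Pro
        print(aa, sorted(reachable_substitutions(aa)))
```

Under these assumptions each of the four amino acids has at least one non-silent outcome, in line with the statement that their codons are potential targets; which of those outcomes apply to a given residue depends on the codon actually used at that position.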
Summary
We identified a collection of GFP loss-of-function mutations that provide information about amino acids important for chromophore maturation and GFP protein stability. The mutations we identified result in substitutions of 11 of the 21 most conserved amino acids in the GFP protein used in this study as well as in seven less conserved amino acids. Mutations leading to substitutions of highly conserved Arg96 required for chromophore formation appear to substantially decrease or eliminate GFP fluorescence by impairing chromophore formation without dramatically affecting protein stability. By contrast, other mutations result in amino acid substitutions that apparently compromise protein stability and accumulation, hence leading to visible reductions in fluorescence. These substitutions affect the absolutely conserved Gly67 of the chromogenic tripeptide, seven highly conserved glycines in the lids and secondary structural transitions of the β-barrel structure (Gly20, Gly40, Gly91, Gly127) and in the second β-strand (Gly31, Gly33, Gly35), and four nonconserved residues not previously implicated in GFP protein stability (Pro56, Cys70, Ala110, Val112). We also identified substitutions of amino acids involved in chromophore maturation (Thr62 and highly conserved Glu222) that are likely to influence GFP protein stability in addition to impairing fluorescence, presumably because of unsuccessful chromophore formation. These genetic findings support results from biochemical and structural analyses and contribute to a fuller understanding of amino acids, particularly numerous highly conserved glycine residues with previously unknown roles, that are essential for the function and stability of the GFP protein in a higher eukaryotic organism.