Evolutionary Trace Analysis of the RGS Family:
The regulators of G protein signaling (RGS) family of
proteins accelerates the GTPase activity of G protein α subunits and therefore
acts to down-regulate G protein signaling cascades. In vitro data
have shown that many different RGS proteins can accelerate the GTPase activity of a single species of Gα
and that a single RGS protein can do so for many Gα's.
These findings raise the question of how RGS-G protein specificity
is maintained in cells that have multiple RGS and Gα
proteins. To gain insight into this question, I have
conducted an Evolutionary Trace of the RGS family. What is an Evolutionary
Trace?
The Evolutionary Trace (ET):

ET uses patterns of sequence conservation and variation
to identify evolutionarily important residues within a protein. This is
done by first acquiring the sequences of all family members of interest using,
for example, a BLAST search. Next, the sequences are aligned with respect
to one another, forming a multiple sequence alignment (MSA) and the
corresponding sequence identity tree (dendrogram). The dendrogram is then
used to divide the family into an increasing number of subgroups, defined by the
branch points (nodes) in the tree. In the example above, the tree is
partitioned into 5 subgroups since there are 5 main nodes to the right of the
red dividing line. A class consensus sequence is then generated for each
of the subgroups. Residues that are identical in all members of the
subgroup are represented by the corresponding amino acid letter and residue
positions with varying identity are denoted with an underscore. These
class consensus sequences are then compared to one another to produce the
"Trace Sequence". Residue positions that have conserved amino
acids within each subgroup, yet have varying identities among the subgroups are
denoted by an "X" and are referred to as class specific
residues. Residue positions with identical amino acids for all members of
the entire protein family are called invariant and are assigned their
corresponding one-letter abbreviation.
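The consensus and Trace Sequence steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the published analysis: the function names `class_consensus` and `trace_sequence` and the toy sequences are invented here, the sequences are assumed to be already aligned and already split into subgroups, and positions that are variable within any subgroup (or contain a gap) are simply marked with an underscore.

```python
def class_consensus(seqs):
    """Consensus for one subgroup of aligned sequences.

    A column keeps its amino acid letter only if every member of the
    subgroup has that same letter; otherwise it becomes '_'.
    Gap columns ('-') are treated as variable (an assumption).
    """
    return "".join(
        col[0] if len(set(col)) == 1 and col[0] != "-" else "_"
        for col in zip(*seqs)
    )


def trace_sequence(subgroups):
    """Compare the class consensus sequences to build the Trace Sequence."""
    consensi = [class_consensus(group) for group in subgroups]
    trace = []
    for col in zip(*consensi):
        if "_" in col:
            trace.append("_")      # varies inside at least one subgroup
        elif len(set(col)) == 1:
            trace.append(col[0])   # invariant across the whole family
        else:
            trace.append("X")      # conserved within, differs between subgroups
    return "".join(trace)


# Toy alignment (hypothetical sequences), already divided into two subgroups.
toy_subgroups = [
    ["MKVLA", "MKVIA"],
    ["MRVLG", "MRVLG"],
]
print(class_consensus(toy_subgroups[0]))
print(trace_sequence(toy_subgroups))
```

In the toy example, the second position is conserved as K in one subgroup and as R in the other, so it is reported as class specific, while the first and third positions are invariant.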
The final step in ET is to map the Trace Sequence to a
representative 3-dimensional structure from the protein family. If Trace
residues (i.e. both class specific and invariant) cluster either inside
or on the surface of the protein, then the site where they cluster is likely
to be an active site.
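One simple way to make "cluster" concrete is to group Trace residues whose coordinates lie within a distance cutoff of one another. The sketch below is an assumption made for illustration: this page does not say how clustering was assessed, and the function name `spatial_clusters`, the 4.0 cutoff, the residue numbers, and the coordinates are all hypothetical.

```python
from math import dist


def spatial_clusters(coords, cutoff=4.0):
    """Group residues into spatial clusters.

    coords maps a residue number to an (x, y, z) position. Two residues
    end up in the same cluster if they are connected by a chain of
    neighbours, each within `cutoff` of the next (single linkage).
    Clusters are returned largest first.
    """
    unvisited = set(coords)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:
            current = frontier.pop()
            near = {r for r in unvisited
                    if dist(coords[current], coords[r]) <= cutoff}
            unvisited -= near
            cluster |= near
            frontier.extend(near)
        clusters.append(sorted(cluster))
    return sorted(clusters, key=len, reverse=True)


# Hypothetical Trace residues and positions, for illustration only.
toy_coords = {
    84: (0.0, 0.0, 0.0),
    85: (3.0, 0.0, 0.0),
    88: (6.0, 0.0, 0.0),
    128: (30.0, 0.0, 0.0),
}
print(spatial_clusters(toy_coords))
```

A large cluster alongside a few isolated residues corresponds to the picture described above: a candidate functional site plus scattered background positions.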
ET Analysis of the RGS domain:
(Note: The following work is published: Sowa ME, He W, Wensel TG, & Lichtarge O. "A regulator of G protein signaling interaction surface linked to effector specificity." Proc Natl Acad Sci U S A. 2000 Feb 15;97(4):1483-8.)
ET analysis was performed on the RGS catalytic core
domain from 42 members of the RGS family. The figure below shows the
results at 4 different functional resolutions or ranks mapped onto
the representative RGS4 structure. The rank of the Trace is the number of
branches into which the dendrogram has been cut.
Invariant residues are shown in red, while class specific residues are shown in
blue. There is one very large cluster of residues which emerges as the
rank increases, while the other side of the protein remains free of ET signal
until rank 23. Since rank 20 provides the largest cluster with the least
noise (considered to be randomly scattered Trace residues), this
resolution was chosen for all subsequent analysis.
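The idea of rank can be illustrated by cutting a small tree into a requested number of branches. This is a minimal sketch under assumptions that are not taken from this page: the tree is written as nested tuples `(height, left, right)` with sequence names as leaves, `height` is a hypothetical divergence value for each node, and the most divergent remaining node is always split first. The names `leaves` and `cut_at_rank` are invented for the example.

```python
def leaves(node):
    """Return the sequence names under a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return [node]
    _, left, right = node
    return leaves(left) + leaves(right)


def cut_at_rank(tree, rank):
    """Cut the dendrogram into `rank` branches.

    Starting from the whole family, the branch whose root node has the
    largest height is split in two until `rank` branches exist or only
    single sequences remain.
    """
    groups = [tree]
    while len(groups) < rank:
        candidates = [i for i, g in enumerate(groups) if not isinstance(g, str)]
        if not candidates:
            break
        i = max(candidates, key=lambda j: groups[j][0])
        _, left, right = groups[i]
        groups[i:i + 1] = [left, right]
    return [leaves(g) for g in groups]


# Toy dendrogram with five hypothetical sequences A-E.
toy_tree = (0.6, (0.2, "A", "B"), (0.4, "C", (0.1, "D", "E")))
for rank in (1, 2, 3):
    print(rank, cut_at_rank(toy_tree, rank))
```

Each additional rank splits one more branch, so higher ranks give smaller subgroups and therefore more positions that are conserved within every subgroup, which is why both signal and noise grow with rank.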

Although ET signal is spread throughout the sequence of the RGS domain (A), it
clusters on only one surface in the 3D structure of the protein (B). Since
the structure of RGS4 was solved in complex with the Gα
subunit Gi1α, the RGS residues responsible for
binding to the G protein are known. ET correctly identifies 10 of
these 11 contact residues at rank 20 (as shown in B). When the complete
binary structure is viewed, residues located in the upper right of the structure
(as displayed in panel B of the above figure) form a second evolutionarily
privileged site with no known function a priori. This is shown
below.

Panel A shows Gi1α in yellow, the RGS domain in
white, and the class specific residues forming site 2 in blue. In panel B,
part of the RGS domain structure is extracted for easy viewing of both the
cluster and the secondary structure. The α5-α6
connecting loop, which is now known to be a critical determinant of RGS activity,
is composed almost entirely of class specific residues. The question
arises as to the function of site 2, since it lies too far from the Gα
interacting site to participate directly in Gα
binding. One possible role for site 2 is in the effector-mediated regulation
of GTPase-activating protein (GAP) activity. The γ inhibitory subunit of
the cGMP phosphodiesterase (PDE), the effector which Gtα
activates, is known to be able to modulate the activity of several RGS
domains. In the case of RGS9, PDEγ enhances its
activity, whereas it inhibits the activity of RGS4, RGS16, GAIP, and the RGS9
subfamily members RGS6 and RGS7. An examination of the residue identity in
site 2 reveals a pattern of amino acid conservation that is consistent with the
difference in PDEγ effect between the proteins it inhibits and those it enhances.

Furthermore, when putative PDEγ
interacting residues from Gtα are mapped onto the
structure of Gi1α/RGS4, these residues lie in near
contiguity with the ET identified residues from site 2 (A and B). ET
analysis of the Gα family reveals that there is a
large evolutionarily privileged surface that is below the RGS binding site and
could function as a site for effector binding. This site on Gα
is also in close proximity to RGS site 2 and shares many residues with the
putative PDEγ interacting residues on Gtα
(C).

Residues in site 2 therefore form a likely region that
could help mediate RGS domain regulation by additional factors such as
effectors. These residues were subsequently selected for mutational
analysis.