The PSI-K subunit of photosystem I from barley (Hordeum vulgare L.). Evidence for a gene duplication of an ancestral PSI-G/K gene.

Photosystem I of barley contains a polypeptide with an apparent molecular mass of 7 kDa when isolated using the detergent n-decyl-beta-D-maltopyranoside. The 7-kDa polypeptide is lost from the PS I complex isolated using Triton X-100. The 7-kDa polypeptide and a corresponding full-length cDNA clone have been isolated. Based on high sequence similarity to an N-terminal sequence of PSI-K from spinach and to the deduced amino acid sequence of PsaK from Chlamydomonas reinhardtii, the 7-kDa barley polypeptide is identified as PSI-K. The cDNA clone encodes a precursor polypeptide of 131 amino acid residues with a calculated molecular mass of 13,726 Da. The transit peptide shows characteristics of polypeptides imported into the chloroplast. PSI-K has two hydrophobic regions predicted to be membrane-spanning alpha-helices. In vitro expressed prePSI-K polypeptide was imported into intact chloroplasts, whereas an in vitro expressed prePSI-K lacking 7 amino acid residues (Met-Ala-Ser-Gln-Leu-Ser-Ala) at the N-terminal end of the transit peptide failed to be imported. The mRNA encoding PSI-K increases during illumination. PsaK is located in a single locus in the genome. PSI-K has significant similarity to PSI-G. When the barley PSI-K and PSI-G are compared with the reported PSI-K sequence from Synechococcus vulcanus, the degree of similarity is equal, suggesting that an ancestral gene has been duplicated in a chloroplast progenitor but not in a cyanobacterial one.

Photosystem I (PS I) is a membrane-embedded pigment-protein complex that mediates electron transport from plastocyanin in the lumen of the thylakoids to ferredoxin in the stroma. The eukaryotic PS I complex is composed of at least 13 polypeptides, of which five are chloroplast encoded (PSI-A, -B, -C, -I, and -J) and eight are nuclear encoded (PSI-D, -E, -F, -G, -H, -K, -L, and -N) (Golbeck and Bryant, 1991; Andersen and Scheller, 1993; Knoetzel and Simpson, 1992). The PSI-G, -H, and -N polypeptides have not been found in the cyanobacterial PS I complex, which contains a subunit (PSI-M) not found in eukaryotic organisms. With this report all the PS I subunits have been identified in barley, and cDNA clones or genomic clones are available for all of them (Høj et al., 1987; Kjærulff and Okkels, 1993; Knoetzel and Simpson, 1992; Okkels et al., 1988, 1989, 1991, 1992; Scheller et al., 1989, 1990b).²,³ The heterodimer consisting of the two 80-kDa polypeptides PSI-A and PSI-B binds the reaction center P700, the electron acceptors A0, A1, and X, and all the pigments of PS I (Scheller and Møller, 1990). The terminal electron acceptors, the two [4Fe-4S] centers A and B, are bound to the PSI-C subunit (Høj et al., 1987). Cross-linking studies have demonstrated specific interactions between PSI-F, PSI-D, PSI-E, and the soluble electron transfer proteins plastocyanin (Wynn and Malkin, 1988; Hippler et al., 1989), ferredoxin (Merati and Zanetti, 1987; Zanetti and Merati, 1987; Zilber and Malkin, 1988; Andersen et al., 1990, 1992a), and ferredoxin:NADP+ oxidoreductase (Andersen et al., 1992b), respectively, indicating a function of these polypeptides in ensuring effective electron transport to and from PS I. The function of the remaining PS I subunits is still unknown.
The PSI-A and -B subunits, which are hydrophobic polypeptides predicted to span the membrane several times, form the basis to which all the small polypeptides bind. The PSI-C subunit is located on the stromal side of the membrane, covered and kept in the proper orientation by the hydrophilic subunits PSI-D and PSI-E (Oh-oka et al., 1989; Li et al., 1991; Andersen et al., 1992a). PSI-F is located on the lumenal side of the membrane (Steppuhn et al., 1988) and contains a segment of hydrophobic amino acids that could represent either an internal α-helix or a membrane anchor. The PSI-G, -H, -I, -J, -K, and -L subunits are hydrophobic polypeptides all predicted to contain membrane-spanning helices (Bryant, 1992). The PSI-N subunit has been shown to be a hydrophilic polypeptide located on the lumenal side of the membrane (He and Malkin, 1992). In analogy to the reaction center from Rhodopseudomonas viridis, the structure of the PS I complex is thought to be pseudosymmetrical, with the PS I subunits PSI-A/B, PSI-D/E, PSI-G/K, and PSI-I/J representing pairs of similar polypeptides (Ikeuchi, 1992; Andersen and Scheller, 1993).
² U. Karnahl, unpublished data. ³ P. Scott, unpublished data.
In this paper, we present the first nucleotide sequence of a full-length cDNA clone encoding the PSI-K subunit from a higher plant. N-terminal amino acid sequencing defines the maturation site. The in vitro transcribed and translated product of the cDNA clone is imported into isolated barley chloroplasts. Although the PSI-K subunit is hydrophobic and predicted to contain two membrane-spanning segments, the subunit is easily dissociated from the PS I complex. We propose that the PSI-G and PSI-K subunits are derived by gene duplication of an ancestral chloroplast gene.
SDS-Polyacrylamide Gel Electrophoresis and Amino Acid Sequencing-SDS-polyacrylamide gel electrophoresis was carried out using 8-25% high Tris gels (Fling and Gregerson, 1986) or 16% Tricine gels (Schagger and von Jagow, 1987). The SDS-gels used for analysis of polypeptide composition were stained with Coomassie Brilliant Blue R-250. The proteins separated on Tricine gels were electroblotted onto Pro-Blot (Applied Biosystems, Foster City, CA) according to Ploug et al. (1989). The proteins immobilized on the membrane were visualized with 0.1% Coomassie Brilliant Blue R-250 (Serva) in 50% methanol and destained in 40% methanol, 10% acetic acid. After several washes with water, the stained band was excised, dried, and stored at -20 °C until sequencing. The reagents used for electroblotting were of HPLC grade. N-terminal amino acid sequencing was performed with an Applied Biosystems sequenator model 470 A coupled to a phenylthiohydantoin analyzer model 120 A (Applied Biosystems, Ltd., Warrington, United Kingdom).
Import Experiments-The pBluescript plasmids from clones K1 and K2 were linearized and transcribed with an mCAP mRNA capping kit (Stratagene, La Jolla, CA) using T3 RNA polymerase. The labeled precursors were synthesized using a wheat germ translation system (Promega Corp., Madison, WI) and L-[35S]methionine (Amersham International plc., Buckinghamshire, United Kingdom). Seeds of barley were germinated and grown in vermiculite for 7 days at 18 °C under a 12-h light (75 µE m⁻² s⁻¹)/12-h dark cycle. Isolation of intact chloroplasts and import assays were performed essentially as described in Dahlin and Cline (1991).

FIG. 5. Southern blot analysis of genomic DNA isolated from barley. The genomic DNA was digested with SphI (lane 1) and XhoI (lane 2).
FIG. 6. Hydropathy plot of the PSI-K sequence. The plot was calculated with an averaging window of 7 amino acid residues using modified membrane helix parameters (Rao and Argos, 1986). The positively and negatively charged residues are marked (+) and (-). The predicted membrane-spanning α-helices (---) and the maturation site (↓) are indicated.
Isolation and Characterization of the cDNA Clone for the PSI-K [...] 1989), a mixture of 96 different oligonucleotides was synthesized on a Cyclone Plus DNA Synthesizer (Millipore Corp., Milford, MA). The oligonucleotides were used to screen the λZAPII cDNA library. Plaques hybridizing to the 5'-end-labeled oligonucleotide probe were isolated. The pBluescript plasmids containing the cDNA inserts were excised in vivo from the positive λZAPII phages into XL1-Blue with the helper phage M13K07 (Short et al., 1988). DNA sequencing was carried out by the dideoxy chain termination method (Sanger et al., 1977) using [35S]dATP (Amersham International plc.) and a Sequenase kit (United States Biochemical Corp.).
Northern and Southern Blotting-Total RNA was electrophoresed on a 1.2% agarose-formaldehyde gel and was transferred to Zetaprobe membranes using 10 mM NaOH as transfer buffer. Restriction endonuclease-digested DNA was electrophoresed on a 0.9% agarose gel. The Southern blotting was carried out using 0.4 M NaOH as transfer buffer. Hybridization of Northerns and Southerns was carried out at 68 °C in 10% dextran sulfate, 1.5× SSPE (0.27 M NaCl, 15 mM Na2HPO4, and 1.5 mM EDTA, pH 7.0), 1% SDS, and 0.5% skimmed milk. The probe for hybridization was labeled with [α-32P]dCTP (Amersham International plc.) using a random primer kit (Amersham International plc.).
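The component concentrations quoted for the hybridization buffer can be checked against its stated strength. The sketch below assumes the common laboratory definition of 1× SSPE (0.18 M NaCl, 10 mM sodium phosphate, 1 mM EDTA); that definition is an assumption and is not stated in the text.

```python
# Assumed 1x SSPE composition in mol/L (a common laboratory definition,
# not given in the text).
SSPE_1X_MOLAR = {"NaCl": 0.18, "Na2HPO4": 0.010, "EDTA": 0.001}


def sspe_concentrations(fold):
    """Return the molar concentration of each component at the given strength."""
    return {name: conc * fold for name, conc in SSPE_1X_MOLAR.items()}


# Strength used for hybridization in the text.
hybridization_buffer = sspe_concentrations(1.5)
for name, conc in hybridization_buffer.items():
    print(f"{name}: {conc * 1000:.1f} mM")
```

Under that assumption the values agree with those given in parentheses in the text (0.27 M NaCl, 15 mM Na2HPO4, and 1.5 mM EDTA).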
Computer Analyses-The nucleotide sequence data bases provided by EMBL (Stoehr and Cameron, 1991) were used to search for sequence similarities using the Genetics Computer Group sequence analysis software package, version 7.1 (Devereux et al., 1984). Hydropathy profile and secondary structure predictions were calculated with the Seqanal program package, version 2.0 (Crofts et al., 1990). The phylogenetic tree was calculated by the neighbor-joining method (Saitou and Nei, 1987) with the CLUSTAL V program (D. Higgins and P. Sharp, Department of Genetics, Trinity College, Dublin).
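A hydropathy profile of the kind described above is a sliding-window average of per-residue hydrophobicity values. The following is a minimal sketch only: it uses the Kyte-Doolittle scale as a stand-in for the modified Rao and Argos membrane helix parameters used in this work, it does not reproduce the Seqanal program, and the input sequence is a hypothetical toy sequence rather than barley PSI-K.

```python
# Kyte-Doolittle hydropathy values (an assumed stand-in for the Rao and Argos
# membrane helix parameters used in the paper).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Return (centre position, mean hydropathy) for every full window.

    Positions are 1-based and refer to the residue at the window centre.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    half = window // 2
    profile = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        profile.append((start + half + 1, mean))
    return profile


# Hypothetical toy sequence, not the barley PSI-K sequence.
toy_sequence = "MSDKRELLIVALGAVLLAIFGSDKRE"
for position, value in hydropathy_profile(toy_sequence, window=7):
    print(position, round(value, 2))
```

A window of 7 residues matches the averaging window quoted in the legend to Fig. 6. Stretches of roughly 20 consecutive positions with high positive values are the usual candidates for membrane-spanning helices.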

RESULTS
A barley cDNA library was screened with a mixture of 5'-end-labeled oligonucleotide probes specifying the partial amino acid sequence Thr-Asn-Leu-Ile-Met-Val of the photosystem I subunit PSI-K in spinach (Hoshina et al., 1989). After three rounds of plaque lifts, four positive plaques were isolated. Sequencing of one clone, K1, showed a 3'-poly(A) tail and a start codon ATG at the 5'-end. A short transit peptide of 35 amino acid residues and a short 5'-noncoding region indicated that the isolated clone was not full length. The in vitro transcribed and translated product from clone K1 failed to be imported into intact chloroplasts (Fig. 1, lanes 6-8), which further substantiated that the clone was not full length. The partial cDNA clone was used as probe for a new library screen. A newly selected clone, K2, was full length, as indicated by sequencing and by import of the in vitro transcribed and translated product from this longer cDNA clone into chloroplasts (Fig. 1, lanes 3-5). The predicted transit peptide starts with the Met-Ala amino acid residues found in all other barley PS I transit peptides. The sequencing strategy and restriction map are shown in Fig. 2. The nucleotide sequence of 1.3 kilobase pairs revealed a single open reading frame of 396 bp (Fig. 3). The open reading frame encodes a hydrophobic precursor protein with a calculated Mr of 13,726.
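The lengths quoted for the open reading frame and the transit peptides can be cross-checked with simple arithmetic. The sketch below assumes that the 396-bp figure includes the stop codon; that is an assumption, chosen because it reproduces the 131-residue precursor given in the abstract. All other numbers are taken from the text.

```python
ORF_BP = 396                # open reading frame length reported above
PRECURSOR_RESIDUES = 131    # precursor length reported in the abstract
PRECURSOR_MASS_DA = 13726   # calculated precursor mass reported above
FULL_TRANSIT_RESIDUES = 42  # transit peptide length stated later in RESULTS
K1_MISSING_RESIDUES = 7     # residues missing from the truncated precursor


def residues_encoded(orf_bp, includes_stop=True):
    """Number of amino acid residues encoded by an open reading frame."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # Assumption: the final codon is a stop codon and encodes no residue.
    return codons - 1 if includes_stop else codons


print("residues encoded:", residues_encoded(ORF_BP))
print("K1 transit peptide:", FULL_TRANSIT_RESIDUES - K1_MISSING_RESIDUES)
print("mean residue mass (Da):", round(PRECURSOR_MASS_DA / PRECURSOR_RESIDUES, 1))
```

The second line of output corresponds to the 35-residue transit peptide of the truncated clone K1, that is, the 42-residue transit peptide minus the 7 missing N-terminal residues.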

SDS-gel analysis of the polypeptide composition of the barley PS I complex isolated using n-decyl-β-D-maltopyranoside (DM) as detergent (Fig. 1, lane 2) showed that, in addition to the components found in the PS I complex isolated using Triton X-100 as detergent (Fig. 1, lane 1), this complex contained light-harvesting complex I (LHC I) polypeptides, ferredoxin:NADP+ oxidoreductase, and two additional polypeptides migrating with apparent molecular masses of 7 and 9.3 kDa. The 7-kDa polypeptide was identified as the PSI-K subunit by N-terminal amino acid sequencing (Fig. 3).

The N-terminal amino acid sequence of the mature polypeptide is identical to the amino acid sequence deduced from bp 298 to 357 of the open reading frame, showing that the mature polypeptide starts from the aspartate residue at position 298 bp and consists of 89 amino acid residues (Fig. 3). The mature PSI-K polypeptide from barley is 5 amino acid residues longer at the C terminus than the polypeptide from Chlamydomonas reinhardtii, but lacks 1 residue at position 10 and 2 residues in the middle of the polypeptide at positions 64 and 70 in the alignment shown in Fig. 7. The mature polypeptide has a calculated molecular mass of 9022 Da, and the total charge at pH 7 is estimated to be +7.29. The first 42 amino acid residues of the open reading frame represent the transit peptide. The maturation motif of nuclear encoded precursor polypeptides directed toward the chloroplast membrane is characterized by (Ile/Val)-X-(Ala/Cys)↓Ala and by the presence of 1 or more arginine residues 6-10 positions N-terminal to the maturation site (Gavel and von Heijne, 1990). This consensus motif is recognizable at the PSI-K maturation site, with an aspartic acid replacing the predicted alanine at the start site of the mature peptide (Fig. 3).
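The consensus described above, (Ile/Val)-X-(Ala/Cys) followed by the first residue of the mature protein, with an arginine 6-10 positions N-terminal to the cleavage site, can be expressed as a simple pattern search. The sketch below is illustrative only: the function name and the toy sequence are hypothetical, and the sequence is not the barley PSI-K precursor.

```python
import re


def find_maturation_sites(sequence, plus_one="A"):
    """Return candidate cleavage positions matching the consensus motif.

    A position is the number of residues N-terminal to the cut. `plus_one`
    lists the residues accepted as the first residue of the mature protein
    (Ala in the consensus).
    """
    # Zero-width lookahead so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=[IV].[AC][" + plus_one + "])")
    sites = []
    for match in pattern.finditer(sequence):
        cut = match.start() + 3
        # Residues at positions -10 to -6 relative to the cleavage site.
        upstream = sequence[max(0, cut - 10):max(0, cut - 5)]
        if "R" in upstream:
            sites.append(cut)
    return sites


# Hypothetical toy precursor with Asp at the start of the mature part.
toy_precursor = "MASSTTRSLPSKVEADGS"
print(find_maturation_sites(toy_precursor))                  # strict consensus
print(find_maturation_sites(toy_precursor, plus_one="AD"))   # also allow Asp
```

Allowing aspartic acid at the first mature position mirrors the deviation from the consensus noted for PSI-K, where an aspartic acid replaces the predicted alanine.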
Northern blot analysis shows that the mRNA for the PsaK gene is about 700 bp long, in agreement with the size of the cDNA clone including the poly(A) tail (Fig. 4). The gene is transcribed in the dark, and the relative level of mRNA increases after illumination. Transcription in the dark and light-induced accumulation of transcripts have been reported previously for psaC, PsaD, PsaE, PsaG, PsaH, psaI, and PsaL in barley (Okkels et al., 1991; Scheller et al., 1990a, 1992). Although mRNA can be detected in dark-grown plants, accumulation of PS I polypeptides takes place only after illumination. Therefore, the expression of the PS I genes seems to be regulated mostly at the translational or post-translational level. Southern blot analysis of barley DNA digested with different restriction enzymes (Fig. 5) resulted in two hybridization bands when digesting with SphI, as expected from the SphI restriction site in the PsaK cDNA clone (Fig. 2). Upon digestion with XhoI, only one hybridization band was detected, indicating that the barley genome contains a single copy of the PsaK gene.
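The reasoning behind the Southern blot interpretation can be illustrated with a toy model: an enzyme that cuts inside the region covered by the probe yields two hybridizing fragments from a single-copy gene, whereas an enzyme that cuts only outside it yields one. All coordinates below are hypothetical, and the model ignores introns and partial digestion.

```python
def hybridizing_fragments(region_length, cut_sites, probe_start, probe_end):
    """Return the restriction fragments that overlap the probed interval.

    Coordinates are positions on a linear genomic region; fragments are
    (start, end) tuples.
    """
    bounds = [0] + sorted(cut_sites) + [region_length]
    fragments = list(zip(bounds[:-1], bounds[1:]))
    return [(a, b) for a, b in fragments if a < probe_end and b > probe_start]


# Hypothetical single-copy locus: probe covers positions 4000-4700.
REGION = 10000
PROBE = (4000, 4700)
sph_sites = [1500, 4300, 8000]  # one hypothetical site inside the probed region
xho_sites = [2000, 9000]        # hypothetical sites flanking the probed region

sph_bands = hybridizing_fragments(REGION, sph_sites, *PROBE)
xho_bands = hybridizing_fragments(REGION, xho_sites, *PROBE)
print(len(sph_bands), "band(s) with the internal cut:", sph_bands)
print(len(xho_bands), "band(s) with flanking cuts only:", xho_bands)
```

A second gene copy at another locus would be expected to add further bands for the enzyme that does not cut within the probed region, which is why a single XhoI band is taken as evidence for a single copy.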

DISCUSSION
The amino acid sequence deduced from the isolated cDNA clone (Fig. 3) matches the N-terminal sequence obtained for the barley PS I polypeptide with an apparent molecular mass of 7 kDa (Fig. 1, lane 2). The amino acid sequence deduced from the open reading frame shows high similarity to the amino acid sequence deduced from a PsaK cDNA clone from C. reinhardtii (Franzén et al., 1989). It is concluded that the cDNA clone encodes the PSI-K precursor polypeptide from barley. The unusually long 5'-end of the cDNA clone has most likely arisen from cloning of two cDNA clones into the same λZAPII vector.
The nuclear encoded PS I polypeptides from higher plants generally have a longer transit peptide than the corresponding polypeptides from C. reinhardtii (Franzén et al., 1989). This is also the case with PSI-K, the barley transit peptide being 16 amino acid residues longer. In accordance with other chloroplast transit peptides (Schmidt and Mishkind, 1986; Keegstra et al., 1989), the transit peptide of PSI-K is rich in hydroxylated and positively charged amino acid residues and lacks acidic residues. The arginine residue at bp 292 is conserved in C. reinhardtii. It is our general observation that among transit peptides for the same chloroplast-located subunits, one or several positively charged amino acid residues are conserved at the cleavage site, even in organisms as distantly related as algae and higher plants. The hydrophobic thylakoid transfer domain characteristic of polypeptides directed to the thylakoid lumen is absent in the PSI-K precursor. In vitro import of an N-terminally truncated precursor of PSI-K (Fig. 1) demonstrates that lack of 7 amino acid residues (Met-Ala-Ser-Gln-Leu-Ser-Ala) in the N terminus of the transit peptide abolishes transport across the chloroplast envelope. A similar result was obtained using a precursor of plastocyanin where 11 amino acid residues (Met-Ala-Thr-Val-Thr-Ser-Ser-Ala-Ala-Val-Ala) were deleted in the N terminus (Hageman et al., 1990). The N terminus of the transit peptide is therefore important for the translocation process. A hydropathy plot (Fig. 6) shows two hydrophobic domains long enough to form membrane-spanning α-helices. Treatment of thylakoids with the proteases thermolysin and Pronase results in degradation of PSI-K (Zilber and Malkin, 1992), indicating that part of the PSI-K polypeptide is exposed on the stromal side of the thylakoid membrane.
The role of PSI-K in the PS I complex has not been established. Studies of the binding of PSI-K to PS I have produced contradictory results. PSI-K from spinach has been reported to be tightly associated with the PSI-A/PSI-B heterodimer, since PSI-K was not removed along with other low molecular mass polypeptides upon heat treatment in the presence of ethylene glycol (Hoshina et al., 1989). A similar result was obtained by Wynn and Malkin (1990), who quantitatively retained PSI-K as the single low molecular mass subunit in a PSI-A/B heterodimer preparation obtained using SDS. The addition of 200 mM dithiothreitol during SDS treatment resulted in loss of the PSI-K subunit. In contrast to these binding studies, Ikeuchi et al. (1990) have reported that the PSI-K subunit is depleted by methods used for separation of LHC I from PS I. The absence of the PSI-K polypeptide in the TX-PS I preparation (Fig. 1), which is devoid of LHC I, supports this result. The use of other nonionic detergents and subsequent separation of PS I and LHC I consistently resulted in barley PS I complexes devoid of PSI-K. It is conceivable that the membrane-spanning PSI-K subunit is located near the rim of the PS I complex, between PS I and LHC I, and is thus easily lost upon detergent treatment. Wynn and Malkin (1990) speculate that the PSI-K subunit is bound to the heterodimer via a disulfide bridge but reject this possibility, because the cysteine residue is lacking in the C. reinhardtii sequence. The discrepancies can be reconciled if disulfide bridge formation is a detergent-introduced artifact. Treatment of PS I with SDS results in denaturation of the iron-sulfur cluster X and in the generation of a PSI-A/B heterodimer containing 4 reactive cysteine residues close to the surface originally facing the stroma. From the hydropathy plot (Fig. 6) and application of the positive inside rule (Nilsson and von Heijne, 1990), the cysteine residue of PSI-K is also predicted to be located near the stromal surface. Therefore, the formation of a disulfide bridge between the cysteine residue of PSI-K and 1 of the 4 cysteines of the PSI-A/B heterodimer previously coordinating center X is very likely.
A significant sequence similarity between barley PSI-G and PSI-K from C. reinhardtii has been observed previously (Okkels et al., 1992). A computer comparison of PSI-G and PSI-K from barley shows 49/30% similarity/identity. A comparison of PSI-G and PSI-K from C. reinhardtii shows 62/38% similarity/identity. The suggested 2-fold symmetry of PS I with the PSI-A/PSI-B heterodimer as a backbone is supported by the significant sequence similarity between PSI-G and PSI-K and by the predicted presence of two membrane-spanning helices (Fig. 6) within each of these subunits.
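Similarity/identity percentages of the kind quoted above are computed from a pairwise alignment. The sketch below shows one simple way to do so, under stated assumptions: similarity is defined by a small set of conservative-substitution groups chosen here for illustration, and only ungapped columns are counted. The sequence analysis package used in this work applies its own scoring table, so this sketch would not necessarily reproduce the published values. The aligned sequences are hypothetical.

```python
# Assumed conservative-substitution groups (illustrative only).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def percent_similarity_identity(aligned_a, aligned_b):
    """Return (percent similarity, percent identity) over ungapped columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # columns with a gap are not scored
        compared += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * similar / compared, 100 * identical / compared


# Hypothetical aligned fragments, not real PSI-G or PSI-K sequences.
similarity, identity = percent_similarity_identity("ILKD-STG", "IVRE-AWG")
print(f"{similarity:.0f}/{identity:.0f}% similarity/identity")
```

The choice of denominator (ungapped columns, full alignment length, or the shorter sequence) changes the reported percentages, so values from different programs are comparable only when the same convention is used.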
The amino acid sequence derived from a genomic clone encoding a 6.5-kDa polypeptide in Synechococcus vulcanus has been reported to be PSI-K (Bryant, 1992). Computer comparison between PSI-K and PSI-G from barley and PSI-K from S. vulcanus shows a sequence similarity/identity of 53/29% and 50/21%, respectively (Fig. 7). No other PSI-G-like sequence has been published from cyanobacterial PS I preparations in spite of careful investigations (Bryant, 1992; Ikeuchi, 1992). This indicates that the cyanobacterial "PSI-K" is in fact a "PSI-G/K" polypeptide. Presumably the PsaG and PsaK genes in higher plants have evolved from a gene duplication of an ancestral PSI-G/K gene. To address the evolutionary question further, we prepared two unrooted phylogenetic trees based on nucleotide sequences (data not shown) and predicted amino acid sequences. The phylogenetic trees show that PSI-G and PSI-K from eukaryotes have approximately the same evolutionary distance to the predicted gene duplication as the evolutionary distance between the predicted gene duplication and the cyanobacterial PSI-K polypeptide (Fig. 8). An interesting question arises as to whether the gene duplication occurred before or after the endosymbiosis of the chloroplast progenitor. The photosynthetic organelle ("cyanelle") of Cyanophora paradoxa and the chloroplast of plants and green algae are believed to be results of different endosymbiosis events (Lockhart et al., 1992). Thus, analysis of PS I polypeptides in C. paradoxa homologous to PSI-G and PSI-K may allow conclusions as to whether the gene duplication leading to PsaG and PsaK in plants and algae occurred before or after the endosymbiosis event.