
WO2007110600A2 - Plant promoters, coding sequences and their uses - Google Patents

Plant promoters, coding sequences and their uses

Info

Publication number
WO2007110600A2
Authority
WO
WIPO (PCT)
Prior art keywords
endosperm
promoter
seq
gene
accession number
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/GB2007/001051
Other languages
English (en)
Other versions
WO2007110600A3 (fr)
Inventor
Roderick Scott
Melissa Spielman
Sushima Tiwari
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Bath
Original Assignee
University of Bath
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Bath
Publication of WO2007110600A2
Publication of WO2007110600A3
Anticipated expiration
Ceased

Links

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216 Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N15/8222 Developmentally regulated expression systems, tissue, organ specific, temporal or spatial regulation
    • C12N15/823 Reproductive tissue-specific promoters
    • C12N15/8234 Seed-specific, e.g. embryo, endosperm
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/415 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from plants

Definitions

  • the present invention relates to materials and methods for the expression of a gene of interest in a particular tissue and for the modification of plant phenotype by preferential expression of a gene of interest in a particular tissue.
  • the present invention relates to materials and methods for modification of seed and/or fruit size in a plant.
  • Endosperm, a second fertilization product in the seeds of flowering plants, accounts for a large proportion of human nutrition.
  • the extent of cell division in endosperm is also a crucial component in determining seed size.
  • a major role of endosperm is to transmit nutrients from the seed parent to the embryo (Brink and Cooper, 1947; Bewley and Black, 1978; Lopes and Larkins, 1993; Berger, 1999).
  • in species with persistent endosperms, such as cereals, the endosperm forms a large part of the mature seed and contains stored food reserves which are mobilized and supplied to the embryo during germination.
  • the number of endosperm cells is set in the first stage of seed development; in cereals this is known as the lag phase. This is followed by the storage (or linear fill) phase, when the seed accumulates storage products, and finally the maturation phase, when the rate of dry weight accumulation declines and the seed desiccates.
  • final grain weight is positively correlated with the number of endosperm cells formed during the lag phase, and treatments such as shading or water stress that affect seed weight are most effective during this time (e.g. Singh and Jenner, 1982; Chojecki et al., 1986; Ouattar et al., 1987; Jones et al., 1996; Yang et al., 2002). Therefore the extent of endosperm proliferation early in seed development is considered to have a significant effect on sink strength, i.e. the capacity of the seed to acquire nutrients from the seed parent.
  • Endosperm development begins with fusion of a haploid sperm and the diploid central cell to produce a triploid primary endosperm nucleus.
  • the primary endosperm nucleus divides without formation of a cell plate, initiating the free-nuclear stage of endosperm development.
  • micropylar endosperm (ME) surrounding the zygote; chalazal endosperm (CE) at the opposite pole of the seed; and peripheral endosperm (PE) lining the walls of the expanding embryo sac.
  • ME micropylar endosperm
  • CE chalazal endosperm
  • PE peripheral endosperm
  • the PE proliferates rapidly throughout the free-nuclear stage but division slows at the heart stage of embryogenesis (approximately 5 DAP), when the endosperm begins to cellularize from the micropylar pole. Increase in seed length follows a similar pattern, with growth levelling off at about 6 DAP (Alonso-Blanco et al., 1999).
  • Crosses between Arabidopsis plants of different ploidies illustrate the relationship between endosperm proliferation and seed size in a species with ephemeral endosperm.
  • Crosses between a diploid seed parent and a tetraploid pollen parent [2x X 4x] in the C24 accession produce seeds that are more than double the weight of seeds from [2x X 2x] crosses and over 40% heavier than those from [4x X 4x] crosses; these large seeds also contain large embryos. Seed development in these crosses is characterized by an increase in the rate and duration of division in the PE, delayed endosperm cellularization, and an increase in the size of the CE (Scott et al., 1998).
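
The genome dosage behind these interploidy crosses can be worked through as a short calculation. The following is a minimal sketch, assuming the standard case in which the central cell carries two copies of the seed parent's gametic genome set and the sperm carries one copy of the pollen parent's gametic genome set; the function name `endosperm_genomes` is hypothetical and used only for illustration.

```python
def endosperm_genomes(seed_parent_ploidy: int, pollen_parent_ploidy: int) -> tuple[int, int]:
    """Return (maternal, paternal) genome sets in the primary endosperm nucleus.

    Assumption: the central cell holds two gametic genome sets from the seed
    parent, and the fertilizing sperm holds one gametic set from the pollen parent.
    """
    if seed_parent_ploidy % 2 or pollen_parent_ploidy % 2:
        raise ValueError("ploidy must be even so that gametes carry whole genome sets")
    maternal = 2 * (seed_parent_ploidy // 2)  # central cell contribution
    paternal = pollen_parent_ploidy // 2      # single sperm contribution
    return maternal, paternal


# The three crosses discussed above: seed parent first, pollen parent second.
for seed, pollen in [(2, 2), (2, 4), (4, 4)]:
    m, p = endosperm_genomes(seed, pollen)
    print(f"[{seed}x X {pollen}x]: endosperm {m + p}x, maternal:paternal = {m}:{p}")
```

Under these assumptions the [2x X 2x] cross gives the triploid endosperm described earlier (2 maternal : 1 paternal), while the [2x X 4x] cross doubles the paternal contribution, which is the paternal excess condition.
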
  • RNA from flowers or siliques at three stages of development against the Affymetrix ATH1 array (Redman et al., 2004), ranging from flowers with fully developed ovules to young siliques containing seeds with an embryo proper of 4-8 cells, generating a set of 1,043 genes called 'reproduction related'.
  • Gene expression in seeds has been most studied in cereals.
  • a study of genes expressed in developing rice seeds aimed to identify promoters for high expression or specific spatial and temporal patterns of expression during seed development. Genes associated primarily with metabolism were found to be expressed during later stages of seed development, including in endosperm, and the activities of the promoters of some of these were characterised further (Qu and Takaiwa, 2004).
  • promoters that direct expression in early endosperm, during or soon after the normal proliferative phase.
  • promoters active in early endosperm are highly desirable as tools for altering endosperm and seed size.
  • Such promoters preferably have no activity elsewhere in the plant and/or preferably are active in different endosperm compartments, for example the chalazal endosperm region.
  • the inventors have identified endosperm-expressed genes not previously reported to be expressed in endosperm and have provided novel promoters that are active in endosperm.
  • the inventors have also identified genes and provided promoters that are active only in endosperm, in other words, that are endosperm-specific.
  • the inventors have further identified genes and provided promoters that are active in a particular compartment of endosperm, in other words, that are specific within the endosperm to that compartment.
  • the inventors have thus provided means for expressing a gene of interest in endosperm.
  • the inventors have also provided means for preventing expression of a gene in endosperm, and have provided means for modifying endosperm size.
  • the inventors have provided means for modifying seed size in plants.
  • the endosperm-expressed genes and promoters also provide means for identifying corresponding endosperm-expressed genes and promoters in further plant species.
  • the inventors achieved this by using a cross between 2x A. thaliana seed parents and 4x A. arenosa pollen parents, which produces seeds with a paternal excess phenotype (Bushell et al., 2003).
  • in seeds from this cross, the embryo arrests by the globular-heart transition but the endosperm overproliferates and survives for many days longer, resulting in seeds enriched for endosperm, and particularly for chalazal endosperm (CE).
  • arrays were constructed based on a developing seed cDNA library prepared from this cross. Of 1,317 randomly sequenced clones, 1,304 had significant homology to unique annotated A.
  • the seed cDNA library contains many hundreds of genes that have not been identified in other experiments designed to detect seed expression. For example, many genes identified here were not identified in the array experiments indicated above; 849 genes were not identified in Ruuska et al. (2002) and 197 genes were not identified in Hennig et al. (2004).
  • the arrays were probed with RNA from different tissues, both vegetative and reproductive, of A. thaliana, as well as endosperms extracted from Brassica napus seeds, to generate a list of 'endosperm-expressed' genes.
  • this array experiment based on endosperm-enriched hybrid seeds has provided a novel source of endosperm-expressed and endosperm-specific genes and novel promoters for driving gene expression during early endosperm development. Identification of such genes and promoters enables the provision of the corresponding genes and promoters from further plant species.
  • the inventors have provided such genes for expression under the control of the endosperm promoters for use in modifying endosperm size and seed size in plants.
  • they have provided genes for expression or overexpression in endosperm for increasing endosperm and seed size in plants.
  • novel endosperm promoters provided by the inventors now provide the ability to direct expression of genes of interest in proliferative phase endosperm, in particular, specifically in endosperm or specifically in chalazal endosperm. Such use of the promoters for directed expression allows the effect of endosperm expression of a gene of interest to be assessed in a plant.
  • such directed expression of a gene in endosperm, where the expression of that gene in endosperm will affect endosperm development or size, and the provision of suitable genes for this purpose, advantageously provides a means of modifying, for example increasing, endosperm size and seed size in plants.
  • the invention also advantageously provides and enables the provision of endosperm promoters suitable for directed expression of genes in endosperm of important crop plants, for example, oilseed rape, rice, maize and soybean. As such the invention provides a means of modifying, particularly increasing, endosperm size and seed size in crop plants.
  • the present invention lies in the identification of endosperm-expressed genes and the provision of novel endosperm promoters, more particularly, those that are expressed or active, respectively, in early or proliferative phase endosperm.
  • the present invention lies in particular in the identification of genes that are expressed specifically in endosperm, or are expressed specifically in a particular region or compartment within the endosperm.
  • the invention provides promoters that are active in endosperm, including those that are active only in endosperm.
  • the invention thus provides endosperm-specific, more particularly, proliferative phase endosperm-specific promoters.
  • the invention lies in the provision of endosperm promoters active in a particular endosperm region or compartment.
  • the invention also provides promoters that are active in only the chalazal endosperm compartment within the endosperm.
  • the invention thus provides proliferative phase chalazal endosperm promoters.
  • the invention also generally relates to the use of the identified endosperm-expressed genes and use of the promoters of the invention for the identification and provision of corresponding endosperm-expressed genes and endosperm-active promoters in other plant species, for example crop species.
  • the present invention concerns methods for identifying and obtaining corresponding endosperm-expressed genes and endosperm promoters in other plant species and also the promoters identified and obtained thereby.
  • the present invention lies in the provision of means for expressing a gene of interest in endosperm. More particularly, the invention provides means for expressing a gene of interest during early endosperm development, in early or proliferative phase endosperm.
  • the invention is concerned with means for expressing a gene of interest specifically in endosperm and/or for expressing a gene of interest in a particular compartment of endosperm, for example, in only the chalazal endosperm compartment within the endosperm.
  • the present invention concerns the use of promoters of the invention for expressing genes of interest in early or proliferative phase endosperm, including expression specifically in endosperm and expression specifically in a particular compartment of the endosperm.
  • the invention also concerns methods for expressing genes of interest in early or proliferative phase endosperm, including expression specifically in endosperm and expression specifically in a particular compartment of the endosperm, using promoters of the invention.
  • the present invention generally relates to means for expressing a recombinant protein in endosperm.
  • the present invention provides means for expressing a recombinant protein during early endosperm development, more particularly, in early or proliferative phase endosperm.
  • the invention is concerned with means for expressing a recombinant protein specifically in endosperm and/or for expressing a recombinant protein in a particular compartment of endosperm, for example, in only the chalazal endosperm compartment within the endosperm.
  • the invention provides use of promoters of the invention for the expression of recombinant proteins in early or proliferative phase endosperm and methods for the expression of recombinant proteins in early or proliferative phase endosperm using promoters of the invention, in any of the above ways.
  • the invention lies in providing means for modifying endosperm size and seed size.
  • the invention is concerned with providing means for increasing endosperm size and seed size.
  • the present invention concerns the use of promoters of the invention to increase endosperm and seed size.
  • the invention also concerns methods for increasing endosperm and seed size, using promoters of the invention.
  • the invention is further concerned with identifying and providing genes associated with endosperm development, in particular early endosperm development, for expression under the control of the endosperm promoters of the invention for use in modifying endosperm size and seed size in plants.
  • the invention provides genes for expression or overexpression in endosperm for increasing endosperm and seed size in plants.
  • the inventors have newly identified endosperm-expressed genes and have provided plant endosperm promoters and the present invention in various aspects and embodiments is based on the sequence information identified, obtained and provided herein.
  • the present invention thus provides promoters of the Arabidopsis genes identified for the first time by the inventors to be expressed in early or proliferative phase endosperm and their homologues in other plant species.
  • the present invention provides promoters of the Arabidopsis genes At5g46950 (accession numbers NC_003076 and NM_124066); At1g14520 (accession numbers NC_003070 and NM_101319); At2g38900 (accession numbers NC_003071 and NM_129447); At5g07210 (accession numbers NC_003076 and NM_120803); At2g41000 (accession numbers NC_003071 and NM_129665); At5g39260 (accession numbers NC_003076 and NM_123288); At1g62080 (accession numbers NC_003070 and NM_104889) and their homologues CAE04313 (accession number AL606658
  • the identifiers of Arabidopsis genes of the format Atxgxxxxx are unique identifiers assigned by TAIR (see http://www.arabidopsis.org/info/guidelines.jsp).
  • the GenBank accession numbers corresponding to (i) the chromosome on which the Atxgxxxxx gene is located (of the format NC_xxxxxx) and (ii) to the gene transcript (of the format NM_xxxxxx) are also provided for each.
  • the location of the Atxgxxxxx gene within the chromosome sequence is identified in the accession number annotations and the location of the start codon is also identified within both the chromosome and the transcript accession number annotations.
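
The identifier formats described above lend themselves to a simple validity check. The following is a minimal sketch, assuming that the chromosome character in an Atxgxxxxx identifier is one of 1 to 5, C or M, and that the accession formats are exactly NC_ or NM_ followed by six digits as written above; the helper names `parse_locus` and `is_consistent` are hypothetical. The chromosome-to-accession table contains only the three pairings that can be read off the gene list in this document.

```python
import re

# Assumed patterns: "At", a chromosome character, "g", then five digits;
# "NC_" or "NM_" followed by six digits, as in the formats quoted above.
AGI_LOCUS = re.compile(r"^At([1-5CM])g(\d{5})$", re.IGNORECASE)
ACCESSION = re.compile(r"^(NC|NM)_(\d{6})$")

# Chromosome accessions as paired with gene identifiers in this document.
CHROMOSOME_ACCESSION = {"1": "NC_003070", "2": "NC_003071", "5": "NC_003076"}


def parse_locus(identifier: str) -> tuple[str, str]:
    """Split an Atxgxxxxx identifier into (chromosome, five-digit serial)."""
    match = AGI_LOCUS.match(identifier)
    if match is None:
        raise ValueError(f"not a valid Atxgxxxxx identifier: {identifier!r}")
    return match.group(1).upper(), match.group(2)


def is_consistent(locus: str, chromosome_accession: str) -> bool:
    """Check that the chromosome accession matches the chromosome in the locus."""
    chromosome, _ = parse_locus(locus)
    if ACCESSION.match(chromosome_accession) is None:
        return False
    return CHROMOSOME_ACCESSION.get(chromosome) == chromosome_accession


print(parse_locus("At5g46950"))
print(is_consistent("At5g46950", "NC_003076"))
```

A check of this kind is mainly useful for catching transcription errors, for example a letter typed in place of a digit, before an identifier is used to look up the annotated start codon.
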
  • the homologous genes are identified by the protein they encode, in terms of the database accession number of the protein as identifier, and by the database accession number of the chromosome encoding that protein.
  • the location of the gene and protein within the chromosome sequence is identified in the accession number annotations and the location of the start codon is also identified within both the chromosome and the protein accession number annotations.
  • the chromosome sequence for some of the genes is the complement sequence (annotated as "complement").
  • the sense sequence of the gene and promoter can be readily determined from the complement sequence.
  • Homologous promoter sequences also have been identified herein by reference to the database accession numbers of the genomic DNA clones in which they are found, e.g. a BAC clone.
  • the database sequence is of the complement strand, from which the sense sequence of the gene and promoter can be readily determined.
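
Determining the sense sequence from a database entry annotated as "complement" amounts to taking the reverse complement of the stored strand. The following is a minimal sketch of that operation; the function name `reverse_complement` is chosen for illustration, and characters other than A, C, G and T (for example N) are passed through unchanged as a simplifying assumption.

```python
# Translation table for the four standard bases, preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGGCC"))  # prints GGCCAT
```

Applying the function twice returns the original sequence, which is a convenient sanity check when working from a complement-strand record.
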
  • At5g46950 is an invertase/pectin methylesterase inhibitor
  • At1g14520 is myo-inositol oxygenase (MIOX1)
  • At2g38900 is a serine protease inhibitor
  • At5g07210 is Arabidopsis response regulator 21 (ARR21)
  • At2g41000 is a DNAJ heat shock protein
  • At5g39260 is an expansin (ATEXPA21)
  • At1g62080 is unknown.
  • Preferred promoters of the invention are of endosperm-specific genes which are not detectably expressed in other tissues of the plant.
  • Preferred promoters of the invention are of the endosperm-specific Arabidopsis genes At5g46950; At1g14520; and At2g38900 as defined above and their rice homologues CAE04313, BAD07668, BAD07661; BAD53821; and ABA98883 as defined above, respectively.
  • Preferred promoters of the invention are of chalazal endosperm-specific genes which are not detectably expressed in other compartments of the endosperm.
  • Preferred promoters of the invention are of the chalazal endosperm-specific Arabidopsis gene At5g07210 as defined above, and its rice homologue BAD72541 as defined above.
  • the present invention provides an isolated nucleic acid comprising a proliferative phase endosperm promoter, in other words, a promoter of a gene expressed in proliferative phase endosperm, which is active as a promoter in proliferative phase endosperm according to the definitions provided herein.
  • the present invention provides an isolated nucleic acid comprising a promoter of a proliferative phase endosperm-expressed gene selected from the group consisting of At5g46950 (accession numbers NC_003076 and NM_124066), At1g14520 (accession numbers NC_003070 and NM_101319), At2g38900 (accession numbers NC_003071 and NM_129447), At5g07210 (accession numbers NC_003076 and NM_120803), At2g41000 (accession numbers NC_003071 and NM_129665), At5g39260 (accession numbers NC_003076 and NM_123288), At1g62080 (accession numbers NC_003070 and NM_104889), CAE04313 (accession number AL606658), BAD07668 (accession number AP004096), BAD07661 (accession number AP00
  • the present invention also provides promoters of the homologues of those genes in other plant species.
  • homologues in other plant species which are preferred are those in crop plant species, for example, oilseed rape (Brassica napus), rice (Oryza sativa), maize (Zea mays) and soybean (Glycine max).
  • a promoter of a proliferative phase endosperm-expressed gene may be that of a gene selected from the group consisting of: (i) At5g46950, At1g14520, At2g38900, CAE04313, BAD07668, BAD07661, BAD53821 and ABA98883; (ii) At5g07210 and BAD72541; (iii) At5g46950, At1g14520, At2g38900, At5g07210, At2g41000, At5g39260, At1g62080; (iv) At5g46950, At1g14520, At2g38900; or (v) At5g07210, all as defined above.
  • a promoter of a proliferative phase endosperm-expressed gene may be that of a gene which is At5g46950, CAE04313, BAD07668 or BAD07661; At5g07210 or BAD72541; At5g46950; or At5g07210.
  • the present invention provides an isolated nucleic acid comprising a promoter of a proliferative phase endosperm-expressed gene, which promoter comprises a nucleotide sequence which is 5' to the indicated position of the start codon of any of the above genes.
  • for At5g46950, the promoter may comprise a nucleotide sequence which is 5' to position 19077274 of NC_003076 on the complement strand; for At1g14520, the promoter may comprise a nucleotide sequence which is 5' to position 4968552 of NC_003070 on the complement strand; for At2g38900, the promoter may comprise a nucleotide sequence which is 5' to position 16250825 of NC_003071; for At5g07210, the promoter may comprise a nucleotide sequence which is 5' to position 2253221 of NC_003076; for At2g41000, the promoter may comprise a nucleotide sequence which is 5' to position 17117906 of NC_003071; for At5g39260, the promoter may comprise a nucleotide sequence which is 5' to position 15744273 of NC_003076 on the complement strand; for At1g62080, the promoter may comprise a
  • An endosperm promoter of the invention may comprise or consist essentially of a sequence of nucleotides extending at least about 500 bp to at least about 3.5 kb, for example, at least about 500, 600, 700, 800, 900 bp, or 1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3, 3.1, 3.2, 3.3, 3.4 or 3.5 kb upstream (5') of the translation start site (ATG codon) of a proliferative phase endosperm-expressed gene as disclosed herein.
  • the promoter comprises or consists essentially of a sequence of nucleotides extending at least about 500 bp to at least about 2100 bp upstream (5') of the translation start site.
  • the promoter comprises or consists essentially of a sequence of nucleotides extending at least about 700 bp to at least about 3300 bp upstream (5') of the translation start site.
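
The relationship between a start codon position, a promoter length and the resulting chromosome coordinates can be made concrete with a short calculation. The following is a minimal sketch, assuming 1-based inclusive database coordinates and assuming that the quoted start codon position is the base immediately adjacent to the promoter (an interpretation inferred from the position ranges given in this document, not stated by it); the function name `upstream_window` is hypothetical.

```python
def upstream_window(start_codon_position: int, length_bp: int,
                    on_complement: bool) -> tuple[int, int]:
    """Return (low, high) database coordinates of the region 5' of the start codon.

    Coordinates are 1-based and inclusive. For a gene on the complement
    strand, "upstream" in the sense orientation means higher coordinates.
    """
    if length_bp < 1:
        raise ValueError("length_bp must be at least 1")
    if on_complement:
        low = start_codon_position + 1
        high = start_codon_position + length_bp
    else:
        low = start_codon_position - length_bp
        high = start_codon_position - 1
    if low < 1:
        raise ValueError("window extends past the start of the sequence")
    return low, high


# Positions quoted in this document for At2g38900 (forward strand)
# and At5g46950 (complement strand).
print(upstream_window(16250825, 1096, on_complement=False))
print(upstream_window(19077274, 2126, on_complement=True))
```

With these inputs the function returns (16249729, 16250824) and (19077275, 19079400), matching the ranges this document gives for the At2g38900 and At5g46950 promoters, which span about 1.1 kb and about 2.1 kb respectively.
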
  • the promoter may be substantially free of or free of any nucleotide sequence extending downstream (3') of the translation start site (ATG codon) of a proliferative phase endosperm-expressed gene as disclosed herein.
  • the promoter may be substantially free of or free of any nucleotide sequence extending downstream (3') of the translation start site (ATG codon), including free of the start codon.
  • the promoter may be substantially free of or free of any coding sequence of a proliferative phase endosperm-expressed gene as disclosed herein.
  • the promoter may additionally comprise a sequence of nucleotides extending downstream (3') of the translation start site (ATG codon) of a proliferative phase endosperm-expressed gene as disclosed herein.
  • This sequence may comprise 3' sequence up to at least about the first to up to at least about the sixth intron, for example, up to at least about the first, second, third, fourth, fifth, or sixth intron of the gene.
  • the sequence may comprise part or substantially all of an intron.
  • the sequence may comprise 3' sequence of at least about the first, second, third, fourth, fifth, or sixth intron of the gene, including part or substantially all of an intron.
  • the downstream sequence may comprise the start codon.
  • the 3' sequence may be substantially free of or free of any coding sequence of the gene.
  • the promoter additionally comprises a sequence of nucleotides extending downstream (3') of the translation start site to at least about the first or second intron, including the start codon.
  • a promoter comprises or consists essentially of a sequence of nucleotides extending from at least about 1.1 kb upstream (5') of the translation start site (ATG codon) to about the end of the sixth intron, or at least about 1.9 kb upstream (5') of the translation start site (ATG codon), or at least about 650 bp upstream (5') of the translation start site (ATG codon) to about the end of the first intron, or at least about 1.2 kb upstream (5') of the translation start site (ATG codon), or at least about 2 kb upstream (5') of the translation start site (ATG codon) to about the end of the first intron, or at least about 1.7 kb upstream (5') of the translation start site (ATG codon) to about the end of the first intron, or at least about 2.1 kb upstream (5') of the translation start site (ATG codon).
  • a promoter comprises or consists essentially of a sequence of nucleotides extending at least about 700 bp, 1.2 kb, 2.1 kb, 2.2 kb, 2.5 kb, 2.6 kb, 2.8 kb, 2.9 kb, or 3.2 kb upstream (5') of the translation start site (ATG codon).
  • a promoter of the invention may comprise or consist essentially of a nucleotide sequence selected from position 19077275 to 19079400 of NC_003076 (At5g46950), from position 4968553 to 4970959 of NC_003070 (At1g14520), from position 16249729 to 16250824 of NC_003071 (At2g38900), from position 2250286 to 2253220 of NC_003076 (At5g07210), from position 17116764 to 17117908 of NC_003071 (At2g41000), from position 15744274 to 15746345 of NC_003076 (At5g39260), from position 22950551 to 22952414 of NC_003070 (for At1g62080), from position 13922 to 16505 of AL606658 (CAE04313), from position 65980 to 68660 of AP004096 (BAD07668), from position 33065 to 35229 of AP
  • a promoter of the invention may comprise or consist essentially of a nucleotide sequence which is from position 118 to 564 of BH965409; from position 28560 to 28917, or from position 29385 to 29771, of AC189499, or both; from position 55 to 442 of BH995823; from position 80604 to 82515 of AC189496; from position 157589 to 158507 of AC183495; from position 30088 to 31189 of AC189364; from position 57029 to 57640 of AC189487; or from position 47071 to 47995 of AC189214.
  • nucleotide sequence defined by the given positions of the database sequence is that of the sense sequence, which can of course readily be determined from the complement sequence.
  • a nucleotide sequence which is from position 118 to 564 of BH965409 is the sense sequence of the complement strand shown in BH965409.
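
Reading a promoter out of a database record from a pair of positions can be sketched as follows. This is a minimal illustration, assuming 1-based inclusive positions and a record held in memory as a plain string; the function name `extract_sense` and the toy sequence are hypothetical, and no real database sequence is reproduced here.

```python
# Translation table for the four standard bases, preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract_sense(database_sequence: str, first: int, last: int,
                  on_complement: bool) -> str:
    """Return the sense-strand sequence for positions first..last (1-based, inclusive).

    If the feature lies on the complement strand, the slice taken from the
    database sequence is reverse-complemented to give the sense sequence.
    """
    if not 1 <= first <= last <= len(database_sequence):
        raise ValueError("positions fall outside the database sequence")
    region = database_sequence[first - 1:last]
    if on_complement:
        region = region.translate(_COMPLEMENT)[::-1]
    return region


toy_record = "AAACCCGGGTTT"  # illustrative only
print(extract_sense(toy_record, 4, 9, on_complement=False))
print(extract_sense(toy_record, 1, 6, on_complement=True))
```

Because the positions are inclusive, the length of a region is last minus first plus one; the range from position 118 to 564, for example, spans 447 nucleotides.
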
  • Restriction enzymes or nucleases may be used to digest nucleic acid comprising the appropriate gene followed by an appropriate assay (for example as illustrated herein using β-glucuronidase (GUS) reporter constructs) to determine promoter activity as is apparent to those skilled in the art and as described elsewhere herein.
  • a nucleotide sequence may be amplified from isolated genomic DNA using the polymerase chain reaction (PCR) with appropriate primers based on the gene sequence, using techniques which are known to those skilled in the art, followed by an appropriate assay as above (for example as illustrated herein using β-glucuronidase (GUS) reporter constructs) to determine promoter activity.
  • the present invention provides an isolated nucleic acid comprising a promoter, the promoter comprising or consisting essentially of a nucleotide sequence selected from the group consisting of the nucleotide sequences shown as SEQ ID NOS: 1 to 14 and SEQ ID NOS: 61 to 70 and SEQ ID NOS: 73 to 81.
  • the present invention provides an isolated nucleic acid comprising a promoter, the promoter comprising or consisting essentially of a nucleotide sequence selected from the group consisting of the nucleotide sequences shown as SEQ ID NOS: 1 to 14.
  • the promoter may comprise or consist essentially of any one of the nucleotide sequences shown as SEQ ID NOS: 1, 2, 7, 8, 61, 62, 63, 66, 73, or 77.
  • the isolated nucleic acid sequence comprises a promoter, the promoter comprising or consisting essentially of a nucleotide sequence selected from the group consisting of the nucleotide sequences shown as SEQ ID NOS: 1 to 6 and SEQ ID NOS: 61 to 65 and SEQ ID NOS: 73 to 76.
  • the promoter comprises or consists essentially of a nucleotide sequence selected from the group consisting of the nucleotide sequences shown as SEQ ID NOS: 1 to 6.
  • the promoter may comprise or consist essentially of any one of the nucleotide sequences shown as SEQ ID NOS: 1, 2, 61, 62, 63, or 73.
  • the isolated nucleic acid sequence comprises a promoter, the promoter comprising or consisting essentially of a nucleotide sequence selected from the group consisting of the nucleotide sequences shown as SEQ ID NOS: 7 and 8 and SEQ ID NO: 66 and SEQ ID NO: 77.
  • the promoter comprises or consists essentially of a nucleotide sequence selected from the group consisting of the nucleotide sequences shown as SEQ ID NOS: 7 and 8.
  • the promoter may comprise or consist essentially of any one of the nucleotide sequences shown as SEQ ID NOS: 7, 8, 66 or 77.
  • a promoter sequence is substantially free of or free of any nucleotide sequence extending downstream (3') of the translation start site (ATG codon).
  • the promoter sequence may be substantially free of or free of any nucleotide sequence extending downstream (3') of the translation start site (ATG codon), including free of the start codon.
  • the promoter sequence may be substantially free of or free of any coding sequence of the gene.
  • a promoter sequence comprises or consists essentially of a nucleotide sequence as shown in any one of SEQ ID NOS: 1, 3, 5, 7, 9, 11 or 13.
  • a promoter sequence comprises or consists essentially of a nucleotide sequence as shown in any one of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14 or SEQ ID NOS: 61 to 70 or SEQ ID NOS: 73 to 81.
  • the promoter may comprise or consist essentially of the nucleotide sequence shown as SEQ ID NOS: 2 or 8.
  • Promoter as used herein means a sequence of nucleotides from which transcription may be initiated of DNA operably linked downstream (i.e. in the 3' direction on the sense strand of double-stranded DNA).
  • sequence of nucleotides is derived from any of the endosperm-expressed Arabidopsis genes and their homologues identified or described herein.
  • the nucleotide sequence may be identical to any one of those shown as SEQ ID NOS: 1 to 14 and SEQ ID NOS: 61 to 70 and SEQ ID NOS: 73 to 81 in both size and sequence, or it may differ or have been modified by insertion, addition, deletion or substitution of one or more nucleotides without fundamentally altering the essential activity of the promoter as described herein.
  • the term promoter encompasses a fragment, mutant, allele, homologue, orthologue, derivative or variant, by way of addition, insertion, deletion, substitution of one or more nucleotides which retains promoter activity and promotes gene expression as described herein.
  • oilseed rape (Brassica napus), rice (Oryza sativa), maize (Triticum spp.) and soybean (Glycine max).
  • oilseed rape (B. napus)
  • B. rapa and B. oleracea are constituent genomes of B. napus, as is understood by those skilled in the art.
  • Brassica is used, this may refer generally to the Brassica genus, or may refer to B. napus, B. rapa, B. oleracea, as will be apparent from the context, as understood by those skilled in the art.
  • the essential activity of the promoter may be determined by analysis of individual promoter elements as to their size and sequence as described herein, for example by an appropriate assay such as that illustrated herein using β-glucuronidase (GUS) reporter constructs.
  • GUS: β-glucuronidase
  • operably linked as used herein means joined as part of the same nucleic acid molecule, suitably positioned and orientated for transcription to be initiated from the promoter.
  • DNA operably linked to a promoter is under "transcriptional initiation regulation" of the promoter.
  • proliferative phase endosperm as used herein means endosperm which is in the free-nuclear phase of endosperm development and is interchangeably used with the term “early endosperm”, and may also simply be referred to as “endosperm”.
  • proliferative phase endosperm includes endosperm from the start of its development with fusion of a haploid sperm cell and the diploid central cell to produce a triploid primary endosperm nucleus up until the time the endosperm starts to cellularize.
  • proliferative phase endosperm promoter as used herein means a sequence of nucleotides from which transcription may be initiated of downstream, operably linked DNA in early or proliferative phase endosperm.
  • the sequence of nucleotides may be derived from any of the endosperm-expressed Arabidopsis genes and their homologues identified or described herein, providing the nucleotide sequence retains the essential activity of the promoter in endosperm as described herein. This term includes promoters from which transcription may be initiated also in one or more other tissues.
  • Promoters which do not have this property are referred to herein as "proliferative phase endosperm-specific” or simply “endosperm-specific” and are included within the term proliferative phase endosperm promoter.
  • an "endosperm-specific" promoter as used herein means a sequence of nucleotides from which transcription may be initiated of downstream, operably linked DNA in early or proliferative phase endosperm only. This term also includes promoters from which transcription may be initiated only in particular compartment(s) of endosperm, for example, chalazal endosperm (CE).
  • Such a promoter is referred to herein as “chalazal endosperm-specific” or “CE-endosperm specific”, or more generally, “endosperm compartment-specific”.
  • the specificity is with respect to the endosperm as a whole rather than the plant as a whole. Therefore, an endosperm compartment-specific promoter may also be active in other tissues of a plant. Alternatively, an endosperm compartment-specific promoter may not be active in other tissues of a plant, such that it is both endosperm-specific and endosperm compartment-specific as defined herein.
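  • By way of illustration only, the terminology defined above can be sketched as a small Python classifier. The compartment labels other than chalazal endosperm, the function name and its arguments are hypothetical and are not taken from the description; the inputs are assumed to come from an expression assay such as a reporter construct.

```python
from typing import Set

# Hypothetical compartment labels; only chalazal endosperm (CE) is named in the text.
ENDOSPERM_COMPARTMENTS = {"chalazal", "micropylar", "peripheral"}


def classify_promoter(endosperm_compartments: Set[str], other_tissues: Set[str]) -> Set[str]:
    """Return the terms, as defined above, that apply to a promoter.

    endosperm_compartments: compartments of proliferative phase endosperm
        in which expression was detected.
    other_tissues: non-endosperm tissues in which expression was detected.
    """
    labels: Set[str] = set()
    if not endosperm_compartments:
        return labels  # no activity in early endosperm: none of the terms apply
    labels.add("proliferative phase endosperm promoter")
    if not other_tissues:
        labels.add("endosperm-specific")
    # Compartment specificity is judged within the endosperm only,
    # so activity in other tissues does not exclude this label.
    if endosperm_compartments < ENDOSPERM_COMPARTMENTS:
        labels.add("endosperm compartment-specific")
    return labels
```

  • Under these assumptions, a promoter active in chalazal endosperm and in pollen is labelled an endosperm promoter and endosperm compartment-specific, but not endosperm-specific.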
  • Promoter activity as used herein means the ability to initiate transcription.
  • the level of promoter activity is quantifiable, for example, by assessment of the amount of mRNA produced by transcription from the promoter or by assessment of the amount of protein product produced by translation of mRNA produced by transcription from the promoter.
  • the amount of specific mRNA present in an expression system may be determined for example using specific oligonucleotides which are able to hybridize with the mRNA and which are labelled or may be used in a specific amplification reaction such as the polymerase chain reaction.
  • Use of a reporter gene as discussed further below facilitates determination of promoter activity by reference to protein production.
  • a promoter of the invention may comprise one or more fragments of a promoter of a proliferative phase endosperm-expressed gene as disclosed herein or of a promoter sequence shown in any one of SEQ ID NOS: 1 to 14 and SEQ ID NOS: 61 to 70 and SEQ ID NOS: 73 to 81, sufficient to have promoter activity and promote gene expression in endosperm.
  • the promoter may comprise one or more fragments as indicated as long as it retains the ability, when operably linked to a nucleotide sequence, to direct initiation of transcription of that sequence in endosperm.
  • the promoter retains the ability to direct expression in endosperm in the same manner as the promoter from which the fragment(s) was derived (the 'parent' promoter).
  • the parent promoter directed endosperm-specific expression
  • a promoter comprising one or more fragments of that endosperm-specific promoter retains the ability to direct expression specifically in endosperm, in other words, with no detectable expression in tissue other than endosperm.
  • a promoter comprising one or more fragments of a promoter of a proliferative phase endosperm-expressed gene as disclosed herein or of a promoter sequence of the invention may comprise or consist essentially of a sequence of nucleotides extending at least 20, 25, 30, 40, 50, 100, 150 or 200 base pairs (bp), or 250, 300, 350, 400, 450, 500 bp upstream (5') of the translation start site of the proliferative phase endosperm-expressed gene or of the promoter sequence shown in any one of SEQ ID NOS: 1 to 14 or SEQ ID NOS: 61 to 70, provided that promoter activity and promotion of gene expression in endosperm is retained.
  • a promoter comprising one or more fragments of a promoter of a proliferative phase endosperm-expressed gene as disclosed herein or of a promoter sequence of the invention may comprise or consist essentially of a sequence of nucleotides of at least 20, 25, 30, 40, 50, 100, 150 or 200 base pairs (bp), or 250, 300, 350, 400, 450, 500 bp of a promoter of a proliferative phase endosperm-expressed gene or of any one of the promoter sequences shown as SEQ ID NOS: 1 to 14 or SEQ ID NOS: 61 to 70 or SEQ ID NOS: 73 to 81, provided that promoter activity and promotion of gene expression in endosperm is retained.
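  • As a non-limiting sketch, candidate fragments extending a given number of base pairs upstream (5') of the translation start site could be generated in silico as below. The function name, its arguments and the toy sequence are hypothetical; each fragment would still need to be tested for promoter activity in endosperm, for example with a reporter construct.

```python
from typing import Dict, Sequence

# Fragment lengths (bp) listed in the description.
FRAGMENT_LENGTHS = (20, 25, 30, 40, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500)


def upstream_fragments(genomic: str, atg_index: int,
                       lengths: Sequence[int] = FRAGMENT_LENGTHS) -> Dict[int, str]:
    """Slice candidate promoter fragments ending just 5' of the start codon.

    genomic: sense-strand sequence written 5' to 3'.
    atg_index: 0-based position of the 'A' of the ATG start codon.
    Lengths exceeding the available upstream sequence are skipped.
    """
    if genomic[atg_index:atg_index + 3].upper() != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    fragments = {}
    for n in lengths:
        if n <= atg_index:
            # The slice stops before the ATG, so the start codon is excluded.
            fragments[n] = genomic[atg_index - n:atg_index]
    return fragments


# Toy example: 30 bp of upstream sequence followed by a short coding stretch.
toy = "C" * 30 + "ATGAAA"
print(sorted(upstream_fragments(toy, 30)))  # lengths that fit upstream of the ATG
```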
  • nucleotide sequence may be a fragment being 200 nucleotides or fewer in length, for example, 150, 100, 50, 40, 35, 30, 25 or 20 nucleotides in length. Restriction enzymes or nucleases may be used to digest the nucleic acid, followed by an appropriate assay (for example as illustrated herein using β-glucuronidase (GUS) reporter constructs) to determine the minimal sequence required for promoter activity as is apparent to those skilled in the art and as described elsewhere herein.
  • GUS: β-glucuronidase
  • the invention provides an isolated nucleic acid comprising a promoter, the promoter comprising the minimal sequence of nucleotides from any one of the nucleotide sequences shown as SEQ ID NOS: 1 to 14 and SEQ ID NOS: 61 to 70 and SEQ ID NOS: 73 to 81 required for promoter activity.
  • the present invention extends to a promoter which has a nucleotide sequence which is an allele, mutant, variant, or derivative, by way of nucleotide addition, insertion, substitution or deletion of a promoter sequence as provided herein, which has promoter activity and promotes gene expression in endosperm.
  • Systematic or random mutagenesis of nucleic acid to make an alteration to the nucleotide sequence may be performed using any technique known to those skilled in the art, or restriction enzymes or nucleases may be used to digest the nucleic acid, followed by an appropriate assay (for example as illustrated herein using β-glucuronidase (GUS) reporter constructs) to determine promoter activity as is apparent to those skilled in the art and as described fully elsewhere herein.
  • GUS: β-glucuronidase
  • One or more alterations to a promoter sequence according to the present invention may increase or decrease promoter activity, provided that promoter activity in directing expression in endosperm is not lost altogether.
  • modification of promoter activity or 'strength' of a promoter to be used in an expression system may be desirable as a means of affecting the level of expression of a gene operably linked to the promoter.
  • the promoter sequence of the invention which is 'parent' to the allele, mutant, variant, or derivative promoter drives expression not exclusively in endosperm, in other words is not endosperm-specific
  • the promoter activity in one or more other tissues may be altered in the allele, mutant, variant, or derivative promoter.
  • a promoter of the invention is active in endosperm and additionally in pollen (for example, promoters derived from the Arabidopsis genes At5g07210, At5g39260 and At1g62080)
  • an allele, mutant, variant, or derivative of such a promoter may have its activity in pollen reduced or removed altogether.
  • further endosperm-specific promoters may be derived from the promoters of the invention, for example.
  • the present invention extends to a promoter which is a homologue of a promoter sequence provided herein and which has promoter activity in endosperm.
  • Such homologues may be identified and provided as described herein.
  • the terms “homologue” and “homologous” used in respect of promoters are used to indicate promoters of homologous genes.
  • the invention extends to a promoter which is a promoter of a homologous gene as provided herein.
  • "Homologous promoters" or "promoter homologues" are thus promoters of genes homologous to the endosperm-expressed genes provided herein and which have the same promoter activity as the promoters of those homologous endosperm-expressed genes.
  • homologous genes can be readily identified based on homology between coding regions of the genes as is apparent to those skilled in the art and described further herein. Whilst homologous promoters have the endosperm promoter activity of the promoters of the endosperm-expressed genes provided herein, as is apparent to those skilled in the art, they may have low sequence identity with those promoters, at least in terms of sequence identity across the full length of promoter sequences provided herein. Also as is apparent to those skilled in the art, promoters of homologous genes from closely related species may share a higher level of sequence identity than those from less closely related species. For example, Arabidopsis and B.
  • promoters of homologous endosperm-expressed genes of each are expected to share a higher level of sequence identity than those from more remotely related species.
  • a homologous promoter has endosperm promoter activity, its level of sequence identity to promoter sequences provided herein is not critical.
  • a promoter which has a sequence that is a fragment, homologue, mutant, allele, derivative or variant, by way of addition, insertion, deletion, or substitution of one or more nucleotides, of a promoter of a proliferative phase endosperm-expressed gene as disclosed herein or of a promoter sequence shown as any one of SEQ ID NOS: 1 to 14 and SEQ ID NOS: 61 to 70 and SEQ ID NOS: 73 to 81, has at least about 30% sequence identity with one or more of the sequences shown as SEQ ID NOS: 1 to 14 and SEQ ID NOS: 61 to 70 and SEQ ID NOS: 73 to 81, preferably at least about 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity.
  • sequence identity may be found over a sequence of at least 10 nucleotides, preferably of at least 20, 30, 40 or 50 nucleotides. Such fragments themselves individually represent aspects of the present invention.
  • the promoter may comprise or consist essentially of any one of the nucleotide sequences shown as SEQ ID NOS: 61 to 70 and SEQ ID NOS: 73 to 81.
  • Percent (%) sequence identity with respect to promoter sequences is defined as the percentage of nucleotides in a candidate sequence that are identical with nucleotides in any one of the promoter sequences shown as SEQ ID NOS: 1 to 14 and SEQ ID NOS: 61 to 70, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity.
  • percent (%) sequence identity with respect to any sequence (whether gene, promoter or coding) is defined as the percentage of nucleotides in a candidate sequence that are identical with nucleotides in a reference sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity.
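  • The definition of percent sequence identity given above can be sketched as follows, assuming the two sequences have already been aligned (with gaps written as '-') by a program such as BLAST. The function name is hypothetical, and the choice of denominator (the number of nucleotides in the candidate sequence) is one reading of the definition; alignment programs may instead report identity over the alignment length.

```python
def percent_identity(aligned_candidate: str, aligned_reference: str, gap: str = "-") -> float:
    """Percentage of candidate nucleotides identical to the aligned reference.

    Both strings must come from the same alignment and have equal length.
    The denominator is the number of non-gap positions in the candidate.
    """
    if len(aligned_candidate) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    candidate_len = sum(1 for c in aligned_candidate if c != gap)
    if candidate_len == 0:
        raise ValueError("candidate contains no nucleotides")
    matches = sum(
        1
        for c, r in zip(aligned_candidate.upper(), aligned_reference.upper())
        if c != gap and c == r
    )
    return 100.0 * matches / candidate_len


# Eight candidate nucleotides, seven of which match the reference.
print(percent_identity("ACGT-ACGT", "ACGTTACGA"))  # 87.5
```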
  • Sequence alignment can be carried out by the skilled person using techniques well known in the art, for example using publicly available software such as BLAST, BLAST2 or Align software; see Altschul et al (Methods in Enzymology, 266:460-480 (1996); http://blast.wustl.edu/blast/README.html) or Pearson et al (Genomics, 46, 24-36, 1997).
  • the Align program is available from: http://molbiol.soton.ac.uk/compute/align.html.
  • the alignments and percentage sequence identities reported herein and in accordance with the present invention use BLAST programs with default settings. More generally, the skilled person can readily determine appropriate parameters for determining alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared.
  • the present invention thus provides an isolated nucleic acid comprising a promoter, the promoter comprising or consisting essentially of any one of the nucleotide sequences shown as: (i) SEQ ID NO: 2, 61, 62, 63 or 73; (ii) SEQ ID NO: 4, 64, 74 or 75; (iii) SEQ ID NO: 6, 65 or 76; (iv) SEQ ID NO: 8, 66 or 77; (v) SEQ ID NO: 10, 67, 78 or 79; (vi) SEQ ID NO: 12, 68, 69, 70 or 80; or (vii) SEQ ID NO: 14 or 81; or a fragment, homologue, mutant, allele, derivative or variant thereof, which is active as a promoter in proliferative phase endosperm according to the definitions provided herein.
  • the promoter may comprise or consist essentially of any one of the nucleotide sequences shown as SEQ ID NOS: 2, 61, 62, 63 or 73, or as SEQ ID NOS: 2, 61, 62 or 63, or as SEQ ID NO: 2, or a fragment, homologue, mutant, allele, derivative or variant thereof, which is active as a promoter in proliferative phase endosperm according to the definitions provided herein.
  • the promoter may comprise or consist essentially of any one of the nucleotide sequences shown as SEQ ID NOS: 8, 66 or 77, or as SEQ ID NOS: 8 or 66, or as SEQ ID NO: 8, or a fragment, homologue, mutant, allele, derivative or variant thereof, which is active as a promoter in proliferative phase endosperm according to the definitions provided herein.
  • the present invention also includes nucleic acid molecules which are capable of hybridising to one or more of the promoter sequences disclosed herein or a fragment thereof, or a complementary sequence thereof (since DNA is generally double-stranded).
  • the present invention extends to a nucleic acid that is capable of hybridising to one or more of the promoter sequences shown as SEQ ID NOS: 1 to 14 and SEQ ID NOS: 61 to 70 and SEQ ID NOS: 73 to 81, or a fragment thereof, or the complementary sequences thereof, and which has promoter activity in endosperm as defined herein.
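  • For illustration, the complementary sequence referred to above (the opposite strand, read 5' to 3') can be derived from a promoter sequence as sketched below. The helper name is hypothetical and the sketch handles only the four unambiguous bases.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of seq, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))  # GCAT
```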
  • nucleic acids having the appropriate level of sequence homology with the respective promoters may be identified by using hybridization and washing conditions of appropriate stringency.
  • Hybridisation generally depends on the ability of denatured DNA to re-anneal when complementary strands are present in an environment below their melting temperature. The higher the degree of desired homology between the probe and hybridisable sequence, the higher the relative temperature which can be used. As a result, it follows that higher relative temperatures would tend to make the reaction conditions more stringent, while lower temperatures less so.
  • stringency of hybridisation reactions see Ausubel et al, Current Protocols in Molecular Biology, Wiley Interscience Publishers, (1995).
  • a nucleic acid sequence will hybridise to a promoter sequence of the invention, or a complementary sequence thereof under "stringent conditions".
  • stringent conditions include those that: (1) employ low ionic strength and high temperature for washing, for example 0.015 M sodium chloride/0.0015 M sodium citrate/ 0.1% sodium dodecyl sulfate at 50°C; (2) employ during hybridisation a denaturing agent, such as formamide, for example, 50% (v/v) formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50mM sodium phosphate buffer at pH
  • hybridizations may be performed, according to a method of Sambrook, Fritsch and Maniatis, Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, 1989, using a hybridization solution comprising: 5X SSC, 5X Denhardt's reagent, 0.5-1.0% SDS, 100 µg/ml denatured, fragmented salmon sperm DNA, 0.05% sodium pyrophosphate and up to 50% formamide.
  • Hybridization may be carried out at 37-42°C for at least six hours.
  • filters may be washed as follows: (1) 5 minutes at room temperature in 2X SSC and 1% SDS; (2) 15 minutes at room temperature in 2X SSC and 0.1% SDS; (3) 30 minutes-1 hour at 37°C in 1X SSC and 1% SDS; (4) 2 hours at 42-65°C in 1X SSC and 1% SDS, changing the solution every 30 minutes.
  • the Tm is 57°C.
  • the Tm of a DNA duplex decreases by 1-1.5°C with every 1% decrease in homology.
  • targets with greater than about 75% sequence identity would be observed using a hybridization temperature of 42°C.
  • Such a sequence would be considered substantially homologous to the respective promoter sequences of the present invention.
  • oligonucleotide probes or primers may be designed and as such are included within the present invention. Suitable probes or primers for a given application may be designed by those skilled in the art. Generally specific probes for hybridisation are about 20 to 600 nucleotides in length. Those skilled in the art are well versed in the design of probes for use in hybridisation protocols. For example, a probe may be designed corresponding to a coding region of an endosperm-expressed gene identified herein and may be used to identify corresponding genes in further plant species, particularly crop species, as described further below.
  • a probe comprising all or a portion of a promoter sequence of the invention, for example, a promoter sequence as shown in any one of SEQ ID NOS: 1 to 14 and SEQ ID NOS: 61 to 70 and SEQ ID NOS: 73 to 81, may be designed for use in isolating corresponding promoter sequences from other plants, particularly crop plants, as described further below.
  • Generally specific primers are upwards of 14 nucleotides in length but not more than 35.
  • suitable primers for generating promoter fragments for incorporation into expression constructs may be designed by those skilled in the art using the sequences provided herein. Examples of such suitable primers are also described herein.
  • the present invention includes promoters of homologous genes to those identified herein identified by means of homology searching of databases using gene and promoter sequences provided herein.
  • Homologous genes from plant species for which the genome has been fully sequenced or there is good genomic sequence and/or EST or cDNA clone data available may be identified in such a way.
  • BLAST searches (see Altschul et al (Methods in Enzymology, 266:460-480 (1996); http://blast.wustl.edu/blast/README.html) or Pearson et al (Genomics, 46, 24-36, 1997)) may be carried out using the endosperm-expressed Arabidopsis genes identified herein against database sequences.
  • protein-protein BLAST searches using the Arabidopsis genes allow the identification of genomic DNA comprising the corresponding genes in other species, generally in the form of a BAC clone or region of a chromosome.
  • Such searches make use of the coding regions of the endosperm-expressed genes, the sequences of which coding regions are shown in the sequences of both the chromosome database accession numbers and the gene transcript database accession numbers for the endosperm-expressed genes, all of which are provided herein.
  • an amino acid identity of at least 30% and similarity (allowing for conservative amino acid changes) of at least 40% is considered to indicate a homologue.
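  • The identity and similarity thresholds stated above could be applied to a pairwise protein alignment roughly as sketched below. The grouping of conservative amino acid substitutions is a hypothetical choice made for this example only (the description does not define one, and BLAST derives similarity from a scoring matrix instead), and the percentages are computed over aligned columns without gaps, which is also an assumption.

```python
from typing import Tuple

# Hypothetical grouping of conservative substitutions; not defined in the text.
CONSERVATIVE_GROUPS = ("AVLIM", "FWY", "ST", "NQ", "DE", "KRH", "G", "P", "C")
GROUP_OF = {aa: i for i, group in enumerate(CONSERVATIVE_GROUPS) for aa in group}


def identity_and_similarity(aligned_a: str, aligned_b: str, gap: str = "-") -> Tuple[float, float]:
    """Return (% identity, % similarity) over aligned columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if a != gap and b != gap]
    if not columns:
        raise ValueError("no aligned residues to compare")
    identical = sum(1 for a, b in columns if a == b)
    # A column is similar if identical or if both residues share a group.
    similar = sum(1 for a, b in columns
                  if a == b or (a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b)))
    return 100.0 * identical / len(columns), 100.0 * similar / len(columns)


def is_homologue(aligned_a: str, aligned_b: str,
                 min_identity: float = 30.0, min_similarity: float = 40.0) -> bool:
    """Apply the thresholds of at least 30% identity and at least 40% similarity."""
    identity, similarity = identity_and_similarity(aligned_a, aligned_b)
    return identity >= min_identity and similarity >= min_similarity
```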
  • genomic sequence comprising the corresponding gene
  • this sequence can itself be used in a BLAST search to identify the corresponding EST or cDNA clone in the database allowing the location of the translation start codon within the gene sequence to be confirmed.
  • a promoter region of the corresponding gene comprising sequence 5' to the start codon of the gene as described herein, can be determined. Observation of the region of the alignment of the promoter sequence of the Arabidopsis gene with the genomic sequence allows the corresponding promoter sequence to be determined. Additionally or alternatively, alignment of only the promoter of the Arabidopsis gene with the genomic sequence may be carried out to compare the corresponding promoters.
  • BLAST searches may be carried out using promoter sequences of the invention provided herein against database sequences.
  • Preferred databases for searching include those comprising genomic sequence data from a specific plant species, preferably a specific crop species, for example, the rice sequence database at NCBI or the Brassica database at BBSRC (http://brassica.bbsrc.ac.uk/BrassicaDB). Crop plant sequence databases can be found at NCBI (www.ncbi.nlm.nih.gov) and TIGR (www.tigr.org).
  • Confirmation of the identified promoter as an endosperm-active promoter can readily be determined by those skilled in the art using an appropriate assay, for example using a promoter-reporter construct comprising the promoter operably linked to a β-glucuronidase (GUS) reporter gene as described herein.
  • GUS: β-glucuronidase
  • the invention provides means for identifying promoters corresponding to endosperm promoter sequences provided herein (i.e. the promoters of corresponding endosperm-expressed genes) in further plant species, for example, further crop species.
  • the invention provides the use of a promoter sequence provided herein for identifying corresponding promoters in further plant species. Promoters obtained in such a way are further examples of the promoters of the present invention.
  • the invention provides use of a promoter sequence as shown in any one of SEQ ID NOS: 1 to 14 and SEQ ID NOS: 61 to 70 and SEQ ID NOS: 73 to 81, or a fragment thereof, or a complementary sequence thereof, for identifying a corresponding promoter sequence from a plant.
  • the invention provides the use of an endosperm-expressed gene provided herein for identifying promoters of corresponding genes in further plant species. Promoters obtained in such a way are further examples of the promoters of the present invention.
  • the invention provides use of an endosperm-expressed gene as provided herein, or a fragment thereof, or a complementary sequence thereof, for identifying a corresponding promoter sequence from a plant.
  • the gene sequence and coding sequence of each of the endosperm-expressed genes provided by the invention are provided herein with reference to database accession numbers providing the gene sequence and its coding region.
  • the coding sequence information of such a gene is used.
  • the plant is oilseed rape, maize or soybean.
  • endosperm-expressed genes and their promoter sequences provided herein may be used to identify promoters of homologous genes by means of homology searching of databases as described above.
  • Such uses of endosperm-expressed gene and promoter sequences provided herein and methods of using endosperm-expressed gene and promoter sequences provided herein to identify promoters of homologous genes by homology searching are included within the present invention.
  • endosperm-expressed genes and their promoter sequences provided herein may be used to identify promoters of homologous genes by means of PCR techniques based on degenerate PCR primers.
  • homologues in other species may be identified using a PCR-based approach where the primers are designed using protein sequence information from the coding regions of the genes. This information is available from a number of species and sequence alignments between coding and protein sequences of the endosperm-expressed genes provided herein and their homologues allow identification of conserved regions.
  • Degenerate PCR primers may be made using this protein sequence information. Nucleotide sequences that could code for the determined sequence of amino acids can be deduced from the genetic code.
  • nucleotide sequences can encode the same peptide sequence.
  • the oligonucleotide is synthesized incorporating, where needed, multiple nucleotides.
  • the product is called a degenerate oligonucleotide.
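  • As an illustrative sketch of how a degenerate oligonucleotide may be deduced from a conserved peptide using the standard genetic code, the example below writes each codon position as an IUPAC ambiguity code. The function name is hypothetical. The per-position collapse is a simplification: for amino acids with six codons (Leu, Ser, Arg) the resulting pool also contains a few codons for other amino acids, so in practice such residues are often avoided or handled with separate primer pools.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ('*' marks a stop codon).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# IUPAC ambiguity codes for sets of bases.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def degenerate_oligo(peptide: str):
    """Return (IUPAC oligonucleotide, pool size) covering every codon of each residue."""
    oligo = []
    degeneracy = 1
    for aa in peptide.upper():
        codons = [codon for codon, residue in CODON_TABLE.items() if residue == aa]
        if not codons or aa == "*":
            raise ValueError(f"not an amino acid: {aa}")
        for pos in range(3):
            bases = frozenset(codon[pos] for codon in codons)
            oligo.append(IUPAC[bases])
            degeneracy *= len(bases)  # number of distinct sequences in the pool
    return "".join(oligo), degeneracy


print(degenerate_oligo("MKFD"))  # ('ATGAARTTYGAY', 8)
```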
  • An RT-PCR product amplified using degenerate primers from, for example, RNA extracted from seeds containing proliferative stage endosperm, or a degenerate oligonucleotide, may be used in a nucleic acid hybridization approach as described further below to screen the colonies or plaques of a cDNA library (made from seeds containing proliferative stage endosperm) for clones containing complementary sequences. Positive clones obtained from a cDNA library may be used as nucleic acid hybridization probes to screen a library of genomic DNA to identify clones containing the gene that can produce the corresponding mRNA.
  • Obtaining the promoter sequence upstream of the translation start codon from the genomic clone may be carried out as described herein.
  • degenerate PCR is described in PCR PROTOCOLS ISBN: 0-12-372180-6 Michael Innis, David Gelfand, John Sninsky, Thomas White.
  • Such uses of endosperm-expressed gene and promoter sequences provided herein and methods of using endosperm-expressed gene and promoter sequences provided herein to identify promoters of homologous genes using degenerate PCR and hybridisation techniques are included within the present invention.
  • corresponding promoters may nonetheless be identified using conventional hybridisation techniques. Endosperm-expressed genes and their promoter sequences provided herein may be used to identify corresponding promoters using any of a variety of hybridisation techniques well known in the art.
  • a cDNA library of a crop of interest may be prepared from a suitable tissue according to techniques well known in the art to screen for the gene coding region in that species corresponding to an Arabidopsis endosperm-expressed gene of the invention.
  • a suitable tissue is any which comprises proliferative phase endosperm, preferably developing seeds, or may be any other tissue in which the particular endosperm-expressed gene for which the corresponding promoter is sought is also expressed. This can be determined for the promoters of the invention according to the results provided herein.
  • the cDNA library may then be screened using the coding region of the Arabidopsis gene as a probe to identify hybridising cDNA clones which correspond to the Arabidopsis gene coding region.
  • Coding regions of the endosperm-expressed genes provided herein are provided herein with reference to the database accession numbers for the gene sequence and the transcript sequence, both of which identify the coding sequence. Therefore, for example, all or part of the coding region of any one of the Arabidopsis genes provided herein shown in the sequences provided in At5g46950 (accession numbers NC_003076 and NM_124066); At1g14520 (accession numbers NC_003070 and NM_101319); At2g38900 (accession numbers NC_003071 and NM_129447); At5g07210 (accession numbers NC_003076 and NM_120803); At2g41000 (accession numbers NC_003071 and NM_129665); At5g39260 (accession numbers NC_003076 and NM_123288); or At1g62080 may be used as a probe to screen a cDNA library. Again, such screening is routine in the art and can be carried out by those skilled in the art.
  • the identified cDNA clone may then be used to screen a genomic library of the crop, which again may be prepared and screened according to known techniques.
  • the corresponding coding region identified need not be full-length in order to successfully screen the genomic library.
  • the full-length coding sequence may be obtained using so-called "RACE" (rapid amplification of cDNA ends) in which cDNAs in the library are ligated to an oligonucleotide linker and PCR is performed using a primer which hybridises with sequence at the 5' end of the isolated cDNA clone and a primer which hybridises to the oligonucleotide linker.
  • RACE rapid amplification of cDNA ends
  • the genomic clones identified then may be sequenced and the location of the translational start codon within the genomic sequence identified using the cDNA sequence information and from alignment with the Arabidopsis gene sequence.
  • sequence 5' to the start codon of the gene can thus be determined and the promoter region of the corresponding gene can be determined.
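  • The final step described above, locating the translational start codon in the genomic clone from the cDNA sequence and taking the sequence 5' of it, might be sketched as follows. The function name, its parameters and the toy sequences are hypothetical, and the exact-match search assumes that no intron or sequence difference interrupts the first few coding nucleotides; a real analysis would use a spliced alignment.

```python
def promoter_region(genomic: str, cdna: str, cds_start_in_cdna: int,
                    upstream_bp: int, anchor_len: int = 12) -> str:
    """Return up to upstream_bp of genomic sequence 5' of the start codon.

    cds_start_in_cdna: 0-based index of the ATG within the cDNA.
    anchor_len: number of coding nucleotides, counted from the ATG, that are
        matched exactly against the genomic sequence.
    """
    genomic, cdna = genomic.upper(), cdna.upper()
    anchor = cdna[cds_start_in_cdna:cds_start_in_cdna + anchor_len]
    if not anchor.startswith("ATG"):
        raise ValueError("cds_start_in_cdna does not point at an ATG codon")
    atg = genomic.find(anchor)
    if atg == -1:
        raise ValueError("start of coding sequence not found in genomic sequence")
    if genomic.find(anchor, atg + 1) != -1:
        raise ValueError("anchor is not unique; increase anchor_len")
    # Everything returned lies 5' of the ATG; the start codon itself is excluded.
    return genomic[max(0, atg - upstream_bp):atg]


# Toy sequences: 12 bp upstream, 12 bp of first exon, then intron-like sequence.
toy_genomic = "GGCCTTAAGGCC" + "ATGGCTAAAGGT" + "GTAAGTCCC"
toy_cdna = "TTAAGGCC" + "ATGGCTAAAGGTCCC"
print(promoter_region(toy_genomic, toy_cdna, cds_start_in_cdna=8, upstream_bp=5))  # AGGCC
```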
  • Confirmation of the identified promoter as an endosperm-active promoter can readily be determined by those skilled in the art using an appropriate assay, for example using a promoter-reporter construct comprising the promoter operably linked to a β-glucuronidase (GUS) reporter gene as described herein.
  • GUS: β-glucuronidase
  • a promoter of the invention may be used to screen a genomic library of a crop of interest to isolate corresponding promoter sequences according to techniques well known in the art.
  • a promoter sequence of the invention may be used as a probe for hybridisation with a genomic library under medium to high stringency conditions.
  • stringency of hybridisation reactions see Ausubel et al, Current Protocols in Molecular Biology, Wiley Interscience Publishers, (1995), and Sambrook, Fritsch and Maniatis, Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press (1989).
  • Hybridising genomic clones may be isolated and purified.
  • promoter activity of the genomic fragment in endosperm can readily be determined by those skilled in the art using an appropriate assay, for example using a promoter-reporter construct comprising the genomic sequence operably linked to a β-glucuronidase (GUS) reporter gene as described herein.
  • GUS: β-glucuronidase
  • the present invention provides a nucleic acid construct or vector comprising an endosperm promoter of the invention as described above. Accordingly, in a further aspect, the invention provides a nucleic acid construct comprising an isolated nucleic acid which comprises a promoter of the invention, as described above.
  • the nucleic acid construct or vector is an "expression construct or vector", further comprising a heterologous gene or nucleotide sequence, for example a coding sequence, operably linked to the promoter sequence.
  • the present invention provides a nucleic acid construct comprising an isolated nucleic acid which comprises a promoter of the invention operably linked to a heterologous gene or nucleotide sequence.
  • the heterologous gene or sequence is a coding sequence.
  • the present invention provides an expression cassette comprising an isolated nucleic acid which comprises a promoter of the invention operably linked to a heterologous gene or nucleotide sequence.
  • the heterologous gene or sequence is a coding sequence.
  • the expression cassette is for integration into a plant genome.
  • Heterologous gene as used herein means a gene, or nucleotide sequence from a gene, other than the gene that the promoter directs transcription of, i.e. is operably linked to, in nature.
  • a heterologous gene may be any other than the gene from which the promoter was derived. Modified forms of those genes are generally excluded.
  • a heterologous gene may be an isolated or recombinant form of the gene that the promoter directs transcription of in nature.
  • the heterologous gene may be transcribed into mRNA which may be translated into a peptide or polypeptide product which may be detected and preferably quantitated or its effects assessed following expression.
  • a heterologous gene may be part of a gene or coding sequence sufficient to produce a functional gene product when expressed.
  • the heterologous gene may be a reporter gene.
  • a gene whose encoded product may be assayed following expression is termed a "reporter gene", i.e. a gene which "reports" on promoter activity.
  • An expression construct comprising as the heterologous gene a reporter gene may be referred to as a "promoter-reporter construct".
  • the reporter gene preferably encodes an enzyme which catalyses a reaction which produces a detectable signal, preferably a visually detectable signal, such as a coloured product.
  • Many examples are known, including β-glucuronidase (GUS), β-galactosidase and luciferase.
  • β-glucuronidase (GUS) activity may be assayed as described elsewhere herein.
  • β-galactosidase activity may be assayed by production of blue colour on substrate, the assay being by eye or by use of a spectrophotometer to measure absorbance.
  • Luminescence, for example that produced as a result of luciferase activity, may be quantitated using a spectrophotometer.
  • Radioactive assays may be used, for instance using chloramphenicol acetyltransferase, which may also be used in non-radioactive assays.
  • the presence and/or amount of gene product resulting from expression from the reporter gene may be determined using a molecule able to bind the product, such as an antibody or fragment thereof.
  • the binding molecule may be labelled directly or indirectly using any standard technique.
  • Reporter genes such as GFP and YFP, which are not enzymes but encode fluorescent proteins, are also commonly used.
  • Any suitable reporter/assay may be used and it should be appreciated that no particular choice is essential to or a limitation of the present invention.
  • Expression of a reporter gene from the promoter may be in an in vitro expression system or may be intracellular (in vivo). Determination of expression of a reporter gene thus allows determination of the activity of the promoter driving that reporter gene.
  • the reporter construct is expressed and assayed in the same plant species as that from which the promoter in the construct was derived.
  • promoter activity may be determined using a promoter-reporter construct and assaying for reporter activity in a plant.
  • suitable promoter-reporter constructs and assay of the reporter gene are apparent to those skilled in the art using known techniques. More particularly, for example, a promoter-reporter construct comprising a promoter of the invention may be constructed and assayed as described below.
  • a genomic fragment comprising the promoter may be amplified by the polymerase chain reaction (PCR) from genomic DNA using appropriate primers specific to the particular promoter and gene as described elsewhere herein and which introduce, for example, a SalI and an XhoI site at the 5' and 3' ends of the PCR fragment respectively.
  • the PCR fragment may be A-tailed and ligated into pGEMT, for example, then excised with, for example, SalI and XhoI and ligated into the SalI and XhoI sites of BJ60, for example, 5' to the uidA reporter which includes a terminator signal, forming a promoter-BJ60 construct containing the reporter cassette.
  • a binary vector for transformation into a plant may be constructed as follows.
  • the reporter cassette may be excised with NotI, for example, from the promoter-BJ60 construct and ligated into the NotI sites of the binary vector BJ40, for example, forming the construct for transformation, promoter-uidA-BJ40.
  • the binary vector may be transformed into Agrobacterium tumefaciens and then into a plant as described elsewhere herein.
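The primer-tailing step described above can be sketched in Python. This is a minimal illustration, not the inventors' primer design: the recognition sequences for SalI (GTCGAC) and XhoI (CTCGAG) are the standard ones, but the 20-nucleotide annealing length, the "GC" clamp and the `promoter` sequence are assumptions made up for the example.

```python
# Recognition sequences of the enzymes named in the text (both are palindromic).
SITES = {"SalI": "GTCGAC", "XhoI": "CTCGAG"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tailed_primers(template: str, anneal_len: int = 20, clamp: str = "GC"):
    """Build primers that add a SalI site at the 5' end and an XhoI site at the 3' end.

    anneal_len and clamp are illustrative defaults, not values from the text.
    """
    template = template.upper()
    if len(template) < anneal_len:
        raise ValueError("template shorter than the annealing region")
    forward = clamp + SITES["SalI"] + template[:anneal_len]
    reverse = clamp + SITES["XhoI"] + reverse_complement(template[-anneal_len:])
    return forward, reverse


def expected_product(template: str, clamp: str = "GC") -> str:
    """Top-strand sequence of the PCR product made with the tailed primers."""
    five_tail = clamp + SITES["SalI"]
    three_tail = reverse_complement(clamp + SITES["XhoI"])
    return five_tail + template.upper() + three_tail


# Hypothetical 60 bp stand-in for a promoter fragment (not a real sequence).
promoter = "ATGCATTTGACCGTAAGCTA" * 3
fwd, rev = tailed_primers(promoter)
product = expected_product(promoter)
print(fwd)
print(rev)
print(len(product))
```

The same pattern would apply to the other enzyme pairs mentioned in the text by swapping the entries in `SITES`.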
  • the uidA gene encodes β-glucuronidase (GUS), which may be assayed as described herein using standard protocols (e.g. Jefferson, 1987).
  • Seedlings and immature seeds dissected from siliques may be assayed by placing them in GUS staining buffer (100 mM KPO4 pH 7.0, 1 mg/ml X-Gluc (5-bromo-4-chloro-3-indolyl-β-D-glucuronide sodium salt), 0.25 mM each K3Fe(CN)6 and K4Fe(CN)6, 0.1% Triton X-100) and incubating overnight at 37 °C. Observation of blue staining indicates GUS activity, which indicates activity of the promoter.
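As a worked example of making up the GUS staining buffer, the sketch below converts the final concentrations given above into pipetting volumes. The final concentrations come from the text; the stock concentrations are assumptions chosen for illustration and would need to be replaced with those actually on the bench.

```python
# Final concentrations taken from the staining buffer described in the text.
FINAL = {
    "KPO4 pH 7.0 (mM)": 100.0,
    "X-Gluc (mg/ml)": 1.0,
    "K3Fe(CN)6 (mM)": 0.25,
    "K4Fe(CN)6 (mM)": 0.25,
    "Triton X-100 (%)": 0.1,
}

# ASSUMED stock concentrations, in the same units as FINAL (not from the text).
STOCK = {
    "KPO4 pH 7.0 (mM)": 1000.0,
    "X-Gluc (mg/ml)": 50.0,
    "K3Fe(CN)6 (mM)": 50.0,
    "K4Fe(CN)6 (mM)": 50.0,
    "Triton X-100 (%)": 10.0,
}


def staining_buffer_volumes(total_ml: float) -> dict:
    """Return the volume (ml) of each stock, plus water, for total_ml of buffer."""
    volumes = {name: total_ml * FINAL[name] / STOCK[name] for name in FINAL}
    water = total_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute to reach the final concentrations")
    volumes["water"] = water
    return volumes


for component, ml in staining_buffer_volumes(10.0).items():
    print(f"{component}: {ml:.2f} ml")
```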
  • suitable linkers, restriction sites, vectors and reporter genes may be used as is apparent to those skilled in the art, and
  • a heterologous gene generally requires the presence, in addition to the promoter which initiates transcription, of a translational initiation region and transcriptional and translational termination regions.
  • One or more introns may be present in the gene, along with mRNA processing signals (e.g. splice sites).
  • a construct or expression cassette of the invention may further comprise a suitably positioned restriction site or other means for insertion into the construct of a sequence heterologous to the promoter to be operably linked thereto, appropriate regulatory sequences, including promoter sequences, terminator sequences, polyadenylation sequences, enhancer sequences, selectable marker genes and other sequences as appropriate.
  • the heterologous gene is preferably any gene for which expression in endosperm is desired, for example to cause a desired effect in a plant or simply to assess the effect of expression in endosperm of that gene on the plant.
  • Expression of such a gene from the promoter is thus preferably in a plant cell, preferably in vivo.
  • the expression construct or expression cassette is for expression in the same plant species as that from which the promoter was derived.
  • an expression construct or expression cassette may be introduced and expressed in any plant in which the promoter retains activity in endosperm, i.e. retains faithful promoter activity.
  • Arabidopsis promoters may be used with faithful promoter activity in oilseed rape (B. napus) (for example see Byzova, M.
  • a construct or expression cassette comprising a heterologous gene of interest for expression in endosperm may comprise further nucleotide sequences, depending on the manner of introduction of the construct into a plant.
  • a construct or cassette of the invention may further comprise sufficient Agrobacterium T-DNA sequences for transfer of the construct to a plant by Agrobacterium-mediated transformation .
  • the heterologous gene for expression in a plant is one the expression of which in endosperm will modify endosperm size and seed size, preferably increase endosperm size and seed size.
  • constructs or expression cassettes of the invention comprising a heterologous gene of interest for expression in endosperm for modifying endosperm size and seed size are described further below.
  • suitable heterologous genes for expression in endosperm to increase endosperm size include, but are not limited to, Arabidopsis genes At1g55600 (WRKY10 transcription factor MINISEED3, accession number NM_104436), At4g31900 (chromatin remodelling factor, accession number NM_119341), At5g14960 (DEL2/E2Fd/E2L1 transcription factor, accession number NM_121500), At5g39260 (EXPA21 expansin, accession number NM_123288), At1g01530 (AGL28 MADS-box transcription factor, accession number NM_100035), At1g65330 (PHERES1 (PHE1) MADS-box transcription factor, accession number NM_105207), At4g37750 (AINTEGUMENTA (ANT) transcription factor, accession number NM_119937), At1g77210 (a sugar transporter, accession number NM_106370), At4g24650 (adenylate isopentenyltransferase (IPT4), accession number NM_118598), At3g25280 (proton-dependent oligopeptide transport family protein, accession number NM_113434) and At4g21480 (glucose transporter, accession number NM_118268).
  • the heterologous gene is selected from the group consisting of Arabidopsis genes: At1g55600 (WRKY10 transcription factor MINISEED3, accession number NM_104436), At4g31900 (chromatin remodelling factor, accession number NM_119341), At5g14960
  • the heterologous gene is selected from the group consisting of Arabidopsis genes: At1g77210 (a sugar transporter, accession number NM_106370), At4g24650 (adenylate isopentenyltransferase (IPT4), accession number NM_118598), At3g25280 (proton-dependent oligopeptide transport family protein, accession number NM_113434), and At4g21480 (glucose transporter, accession number NM_118268).
  • Other suitable genes as ascertained by those skilled in the art may be used, including homologues of the above Arabidopsis genes in other plants .
  • the coding region of such a gene is used.
  • a part of the gene or coding region may be used provided it is sufficient to produce a functional gene product.
  • the coding sequence used may be all or part of any of those provided under the accession numbers indicated above as necessary to produce a functional gene product, which accession numbers and sequences are herein incorporated by reference .
  • an expression cassette comprising a promoter of the invention operably linked to a heterologous gene or nucleotide sequence as indicated above.
  • the heterologous sequence is a coding sequence of a gene desired to be expressed in endosperm, for example those indicated above.
  • An expression cassette of the invention may comprise further nucleotide sequences depending on the manner of its introduction into a plant, for example T-DNA sequences, as described above.
  • the present invention provides host cells and methods comprising culturing host cells comprising a promoter, nucleic acid construct comprising a promoter, or expression cassette comprising a promoter, of the invention as described above.
  • Host cells of the invention may be bacteria, for example for use during construction and manipulation of constructs. Introduction of a construct of the invention into a bacterial host cell and culturing of such host cells may be achieved using any of the many techniques and protocols available in the art.
  • a promoter of the present invention may be used to drive expression of a heterologous gene in a selected host cell.
  • expression of a heterologous gene under control of a promoter is in a plant such that a host cell is preferably a plant cell.
  • Introduction of a construct or expression cassette of the invention into a plant cell may be achieved using any of the many techniques and protocols available in the art for the transformation of plants with DNA, including for example, transformation employing Ti-plasmid DNA and Agrobacterium, protoplast fusion, injection, electroporation and the like.
  • the construct generally comprises a selectable marker that allows for the selection of transformed cells.
  • Suitable markers are well known in the art, for example, genes conferring resistance to antibiotics such as kanamycin (nptII), neomycin, bleomycin, chloramphenicol and others, and genes conferring resistance to herbicides such as glufosinate.
  • further nucleotide sequences may be required in a construct or cassette, such as sufficient Agrobacterium T-DNA for transfer to a plant cell.
  • Such constructs and cassettes containing further sequences to enable their introduction into and detection in plants are included within the constructs and expression cassettes of the invention and such constructs may be referred to as "plant transformation constructs" as well as "expression constructs”.
  • Transformation protocols for transforming a variety of plants are available, including for Arabidopsis (Clough and Bent (1998)); for rice (Supartana, P. (2005) Journal of Bioscience and Bioengineering
  • the invention provides a method of culturing a host cell under conditions for transcription of the heterologous gene from the promoter.
  • the invention provides such methods comprising culturing under conditions for expression of the encoded polypeptide product where the heterologous gene is a coding sequence.
  • Such methods may further comprise detection of transcription of the heterologous gene and/or detection of expression of the encoded product.
  • Any plant cell for which it is desired to introduce a construct or expression cassette of the invention may be a plant host cell in accordance with the invention.
  • Preferred plants are Arabidopsis, oilseed rape, rice, maize and soybean. Of particular interest are crop plants.
  • a plant host cell of the invention includes a plant cell into which a construct is first introduced during a transformation procedure, as well as a plant cell in which the construct is present following regeneration of plants after transformation.
  • a plant host cell may be of a cell type which is preferred for use in a transformation protocol or may be any cell type within a plant.
  • the construct or expression cassette comprising promoter and heterologous gene may be integrated into the genome (e.g. chromosome) of the host cell. Integration may be promoted by inclusion in the construct or cassette of sequences which promote recombination with the genome, in accordance with standard techniques.
  • Nucleic acid molecules, constructs and vectors according to the present invention may be provided isolated and/or purified (i.e. from their natural environment) , in substantially pure or homogeneous form, free or substantially free of a gene coding sequence, or free or substantially free of nucleic acid or genes of the species of interest or origin other than the promoter sequence.
  • Nucleic acid according to the present invention may be wholly or partially synthetic or may be produced using recombinant DNA technology, according to techniques well known in the art. The term "isolate" encompasses all these possibilities.
  • the present invention provides use of a promoter, construct or expression cassette of the invention for promoting transcription of an operatively linked heterologous gene in proliferative phase endosperm and use for driving expression of the encoded product of an operably linked heterologous gene in endosperm.
  • the present invention provides use of a promoter, construct or expression cassette of the invention for expressing a recombinant protein in endosperm.
  • the heterologous gene may be a reporter gene and the uses and methods of the invention enable the activity of a promoter sequence to be determined.
  • An expression construct in which the heterologous gene is a reporter gene may be generated and activity of the reporter gene assayed as described elsewhere herein.
  • the heterologous gene may be any gene which is desired to be expressed in endosperm, preferably any gene which when expressed in endosperm has the effect of modifying endosperm and seed size.
  • a heterologous gene is one the expression of which in endosperm has the effect of increasing endosperm and seed size.
  • the present invention provides use of a promoter, construct or expression cassette of the invention for modifying endosperm and seed size in a plant.
  • Further aspects of the invention provide a method for modifying endosperm and seed size in a plant, comprising the step of introducing a construct or expression cassette of the invention into a plant and allowing transcription of the heterologous gene.
  • the invention provides such methods comprising allowing expression of the encoded product of the heterologous gene where the heterologous gene is a coding sequence.
  • modifying endosperm and seed size is increasing endosperm and seed size.
  • the invention provides uses and methods as described above for increasing seed size in a plant.
  • a plant in which endosperm and seed size is modified may be Arabidopsis and is preferably a crop plant, for example, oilseed rape, rice, maize or soybean.
  • the invention provides a plant cell, and provides a plant comprising such a host cell, comprising a promoter, construct or expression cassette of the invention.
  • the invention provides a plant cell and a plant expressing a heterologous gene under the control of a promoter, construct or expression cassette of the invention.
  • the invention provides a plant cell and a plant expressing a heterologous gene in endosperm.
  • the promoter and heterologous gene, or expression cassette are integrated into the plant genome.
  • a plant cell may be from, or a plant may be Arabidopsis and is preferably a crop plant, for example, oilseed rape, rice, maize or soybean.
  • Promoters, constructs and heterologous genes preferred for use according to the invention, including for use in modifying, for example increasing, endosperm and seed size in plants have been described in detail elsewhere herein.
  • promoters for use according to the invention are endosperm-specific promoters of the invention.
  • chalazal endosperm-specific promoters of the invention are preferred promoters for use according to the invention.
  • a heterologous gene for use for modifying endosperm and seed size in a plant may be any gene which when expressed in endosperm affects endosperm size and seed size, and preferably has the effect of increasing endosperm size and seed size. Genes with potential roles in processes such as cell division, biogenesis, signalling, transcriptional regulation, and transport are likely to be useful in modifying endosperm and seed size.
  • genes involved in response to auxin, cytokinin, gibberellin, ethylene, and brassinosteroids, genes involved in calcium signalling, kinases, sugar transporters, MADS-box, MYB, and bZIP transcription factors, and expansins were found in the microarray experiments reported herein.
  • the list of endosperm- expressed genes identified in the macroarray experiments reported herein includes many candidate open reading frames for creating recombinant genes that may increase endosperm and seed size. Genes previously reported to promote cell division may also be used.
  • transcription factors AINTEGUMENTA and ARGOS which increase organ size due to prolonged cell proliferation when expressed under the 35S promoter
  • E2Fa and DPa which occupy a key regulatory position in the cell cycle, and also increase cell division when ectopically expressed
  • Preferred heterologous genes for expression in endosperm to increase endosperm size and seed size include, but are not limited to, Arabidopsis genes At1g55600 (WRKY10 transcription factor MINISEED3, accession number NM_104436), At4g31900 (chromatin remodelling factor, accession number NM_119341), At5g14960 (DEL2/E2Fd/E2L1 transcription factor, accession number NM_121500), At5g39260 (EXPA21 expansin, accession number NM_123288), At1g01530 (AGL28 MADS-box transcription factor, accession number NM_100035), At1g65330 (PHERES1 (PHE1) MADS-box transcription factor, accession number NM_105207), At4g37750 (AINTEGUMENTA (ANT) transcription factor, accession number NM_119937), At1g77210 (a sugar transporter, accession number NM_106370), At
  • the coding region of such a gene is used.
  • a part of the gene or coding region may be used provided it is sufficient to produce a functional gene product.
  • the coding sequence used may be all or part of any of those provided under the accession numbers indicated above as necessary to produce a functional gene product, which accession numbers and sequences are herein incorporated by reference.
  • Preferred for use with an endosperm-specific promoter of the invention is a heterologous gene selected from the group consisting of Arabidopsis genes: At1g55600 (WRKY10 transcription factor MINISEED3, accession number NM_104436), At4g31900 (chromatin remodelling factor, accession number NM_119341), At5g14960 (DEL2/E2Fd/E2L1 transcription factor, accession number NM_121500), At5g39260 (EXPA21 expansin, accession number NM_123288), At1g01530 (AGL28 MADS-box transcription factor, accession number NM_100035), At1g65330 (PHERES1 (PHE1) MADS-box transcription factor, accession number NM_105207), At4g37750 (AINTEGUMENTA (ANT) transcription factor, accession number NM_119937).
  • the heterologous gene is selected from the group consisting of Arabidopsis genes: At1g77210 (a sugar transporter, accession number NM_106370), At4g24650 (adenylate isopentenyltransferase (IPT4), accession number NM_118598), At3g25280 (proton-dependent oligopeptide transport family protein, accession number NM_113434), and At4g21480 (glucose transporter, accession number NM_118268).
  • At1g55600 is a WRKY10 transcription factor (also known as MINISEED3).
  • in mutants of this gene, endosperm cellularizes early and is small, and seeds are small (Luo et al., 2005, PNAS 102, 17531). Accordingly, overexpression is expected to generate large endosperm and large seeds.
  • At4g31900 is a chromatin remodelling factor, which is potentially involved in transcriptional control.
  • At5g14960 is the DEL2/E2Fd/E2L1 transcription factor. Some DEL genes are associated with increased cell proliferation. At5g39260 is EXPA21, an expansin, predicted to be involved in cell wall loosening and identified as endosperm-expressed herein.
  • At1g65330 is PHERES1 (PHE1), a MADS-box transcription factor. It is repressed by the FIS complex (Kohler et al., 2005, Nature Genetics 37, 28) and fis mutants cause overproliferation of endosperm. Accordingly, overexpression of PHE1 is expected to increase endosperm size.
  • At1g01530 is AGL28, a MADS-box transcription factor. It is co-expressed and interacts with PHE1 (de Folter et al., 2005, Plant Cell 17, 1424).
  • At4g37750 is AINTEGUMENTA (ANT), a transcription factor. Overexpression of ANT prolongs cell proliferation (Mizukami and
  • At1g77210 is a sugar transporter, and is one of the silique-preferred genes identified in the microarray experiments reported herein.
  • At4g24650 is adenylate isopentenyltransferase 4 (IPT4), which catalyses a rate-limiting step in cytokinin biosynthesis. This member of the IPT gene family is expressed in chalazal endosperm
  • At3g25280 is a proton-dependent oligopeptide transport family protein, similar to NTL1 (inducible nitrate transporter).
  • At4g21480 is a glucose transporter, and is one of the silique-preferred genes identified in the experiments reported herein.
  • an expression construct comprising an endosperm promoter of the invention for transformation into a plant to effect expression of an operably linked heterologous gene, for example to increase endosperm and seed size, may be constructed as described below.
  • a genomic fragment including the promoter may be amplified by PCR from genomic DNA using appropriate primers specific to the particular promoter and gene as can readily be determined by those skilled in the art as described elsewhere herein and which introduce, for example, a SalI and a KpnI site at the 5' and 3' ends of the PCR fragment respectively.
  • the PCR fragment may be A-tailed and ligated into pGEMT, for example, and then excised with SalI and KpnI and ligated into the SalI and KpnI sites of BJ36, for example, 5' to the ocs terminator signal, forming an endosperm promoter-BJ36 construct.
  • Introduction into the expression construct of an operably linked heterologous gene may be achieved as follows.
  • the coding region of the heterologous gene to be driven by the promoter may be amplified by PCR from cDNA, or other suitable source, using appropriate primers specific to that cDNA and gene and which introduce, for example, a KpnI and an XmaI site at the 5' and 3' ends of the PCR fragment respectively.
  • the PCR fragment may be A-tailed and ligated into pGEMT, for example, and then excised with KpnI and XmaI and ligated into the KpnI and XmaI sites of the endosperm promoter-BJ36 construct, 3' to the endosperm promoter and 5' to the ocs terminator signal, forming an endosperm promoter::gene coding sequence-BJ36 construct.
  • Binary vectors for transformation into plants using Agrobacterium may then be constructed as follows.
  • Expression cassettes including the ocs terminator may be excised with NotI from the promoter::gene coding sequence-BJ36 construct and ligated into the NotI sites of the binary vector BJ40, for example, forming a promoter::gene coding sequence-BJ40 construct for transformation.
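Because the assembly above relies on excising fragments with SalI, KpnI, XmaI and NotI, a promoter or coding sequence that happens to contain one of these sites internally would be cut apart during cloning. The following sketch shows one way such a check might be scripted; it is an illustration rather than part of the described method, and the example sequence is invented.

```python
# Standard recognition sequences of the enzymes named in the text.
# All are palindromic, so scanning one strand is sufficient.
RECOGNITION_SITES = {
    "SalI": "GTCGAC",
    "KpnI": "GGTACC",
    "XmaI": "CCCGGG",
    "NotI": "GCGGCCGC",
}


def find_sites(seq: str, enzymes=RECOGNITION_SITES) -> dict:
    """Map each enzyme to the 0-based start positions of its site in seq."""
    seq = seq.upper()
    hits = {}
    for name, site in enzymes.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            # Step by one base so overlapping occurrences are also found.
            start = seq.find(site, start + 1)
        hits[name] = positions
    return hits


# Hypothetical insert containing one KpnI and one NotI site.
insert = "AAAGGTACCTTTGCGGCCGCAAA"
for enzyme, positions in find_sites(insert).items():
    status = f"cuts at {positions}" if positions else "no internal site"
    print(f"{enzyme}: {status}")
```

If an internal site is found, the text's own remark that other suitable restriction sites and vectors may be used would apply.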
  • the binary vectors may be transformed into Agrobacterium tumefaciens using standard techniques known in the art.
  • Agrobacterium may then be used to transform the constructs into plants, for example, Arabidopsis thaliana, oilseed rape, rice, maize or soybean using standard techniques known in the art as referred to above.
  • other suitable linkers, restriction sites and vectors may be used as described above.
  • Successful transformation of a plant may be determined by assaying for the presence of the expression construct and/or expression of the heterologous gene using techniques well known in the art, for example, Southern and northern blot analysis and PCR. Analysis of the effect of endosperm expression of the heterologous gene on plant phenotype in plants which have been demonstrated to express the heterologous gene may be assessed and may be compared against control plants. The effect of endosperm expression of the heterologous gene may be determined by assessing endosperm size and/or seed size in samples of endosperm and/or seeds from a transformed plant (test plant) and comparison with a control plant in which the heterologous gene is not present and/or is not expressed.
  • endosperm size may be assessed by counting endosperm nuclei and/or measuring the volume or cross-sectional area of chalazal endosperm during seed development.
  • seed size may be assessed by weighing mature seeds.
  • endosperm and seed size is increased by up to about 5, 10, 15, 20, 30, 40, 50, 75 or 100%, more preferably by at least 5, 10, 15, 20, 30, 40, 50, 75, or
  • control plant for comparison with a test plant expressing a heterologous gene is the same accession or cultivar of the same species as the test plant that does not contain the relevant construct or expression cassette and/or does not express the heterologous gene.
  • endosperm and seed size may be compared between test and control plants in any manner apparent to those skilled in the art.
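One simple way to express the comparison between test and control plants is as a percentage change in mean seed weight. The sketch below assumes seed weight is used as the measure of seed size, as suggested above; the weights are invented placeholder values, not data from the experiments.

```python
from statistics import mean


def percent_change(test_values, control_values) -> float:
    """Percentage change of the test mean relative to the control mean."""
    control_mean = mean(control_values)
    if control_mean == 0:
        raise ValueError("control mean is zero; percentage change is undefined")
    return (mean(test_values) - control_mean) / control_mean * 100.0


# Hypothetical mature seed weights in micrograms (illustrative only).
control_seed_weights = [20.0, 21.0, 19.0]
test_seed_weights = [24.0, 25.0, 23.0]

change = percent_change(test_seed_weights, control_seed_weights)
print(f"Seed weight change relative to control: {change:.1f}%")
```

The same function could be applied to endosperm nuclei counts or chalazal endosperm cross-sectional areas.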
  • Figure 1 shows the strategy for array construction, filtering, and validation of endosperm-expressed promoters (see Methods for details) .
  • A Diploid (2x) Arabidopsis thaliana seed parents were crossed with tetraploid (4x) A. arenosa pollen parents; this produces endosperm-enriched seeds in which peripheral and chalazal endosperm overproliferate, endosperm cellularization does not occur, and the embryo arrests by globular-heart transition (Bushell et al., 2003) .
  • Seeds from the above cross at 2 to 11 DAP were dissected out of siliques and used to prepare a whole seed cDNA library.
  • Figure 2 shows a comparison of functional categories of seed-expressed genes and the entire annotated A. thaliana genome. Only categories to which >15 seed genes were assigned are shown. The most abundant category, 'unclassified proteins' (57.0% of seed genes and 73.7% of the whole genome), is also omitted. The data for all functional categories is provided in Table 2.
  • Figure 3 shows GUS expression in plants transformed with reporter constructs directed by putative early endosperm promoters (see Table 3 for details) .
  • A GUS activity assayed in seeds from self- pollinations at 3 to 5 DAP, seedlings, and anthers. The seven constructs pictured were expressed in endosperm but not seedlings; three also showed activity in pollen. The construct based on the putative promoter of At2g41000 was also expressed in the embryo sac before fertilization.
  • B Expression of four constructs in early seeds following transmission through the pollen parent only, confirming expression in endosperm rather than seed coat.
  • Figure 4 shows RT-PCR analysis of endogenous genes corresponding to six of the constructs.
  • the GAPC gene was used as a control.
  • Figure 5 shows the vectors BJ60, BJ36 and BJ40 used in the construction of expression constructs and plant transformation constructs as described herein.
  • Figure 6 shows construction of examples of promoter-reporter constructs for transformation into a plant for determining promoter activity.
  • the examples show use of promoters of the Arabidopsis genes At5g07210, At5g46950 and At1g14520 with the uidA (GUS) reporter gene as described in the Examples and Experiments.
  • Figure 7 shows construction of examples of expression constructs for transformation into a plant to increase endosperm and seed size.
  • the examples show use of promoters of the Arabidopsis gene At5g07210 with the heterologous Arabidopsis genes At1g77210, At4g24650, At3g25280 and At4g21480, and use of promoters of the Arabidopsis genes At5g46950 and At1g14520 each with each of the heterologous Arabidopsis genes At1g55600, At4g31900, At5g14960, At5g39260, At1g01530, At1g65330 and At4g37750 as described in the Examples and Experiments.
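The promoter and gene pairings described for Figure 7 can be enumerated programmatically. The sketch below is only a bookkeeping aid: the gene identifiers and accession numbers are those listed in the text, while the `promoter::gene` naming and the group names are assumptions made for the example.

```python
# Heterologous genes paired with the At5g07210 promoter (accessions from the text).
TRANSPORT_AND_CYTOKININ_GENES = {
    "At1g77210": "NM_106370",
    "At4g24650": "NM_118598",
    "At3g25280": "NM_113434",
    "At4g21480": "NM_118268",
}

# Heterologous genes paired with the At5g46950 and At1g14520 promoters.
REGULATORY_AND_EXPANSIN_GENES = {
    "At1g55600": "NM_104436",
    "At4g31900": "NM_119341",
    "At5g14960": "NM_121500",
    "At5g39260": "NM_123288",
    "At1g01530": "NM_100035",
    "At1g65330": "NM_105207",
    "At4g37750": "NM_119937",
}

PAIRINGS = {
    "At5g07210": TRANSPORT_AND_CYTOKININ_GENES,
    "At5g46950": REGULATORY_AND_EXPANSIN_GENES,
    "At1g14520": REGULATORY_AND_EXPANSIN_GENES,
}


def list_constructs(pairings=PAIRINGS):
    """Return (construct name, accession) tuples for every promoter/gene pairing."""
    return [
        (f"{promoter}pro::{gene}", accession)
        for promoter, genes in pairings.items()
        for gene, accession in genes.items()
    ]


constructs = list_constructs()
print(len(constructs))
for name, accession in constructs:
    print(name, accession)
```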
  • Figure 8 shows representative alignments of promoters of Arabidopsis endosperm-expressed genes (bottom line) with those of corresponding rice genes (top line) .
  • A Alignment of promoter of At1g14520 with that of BAD53821;
  • B Alignment of promoter of At2g38900 with that of ABA98883.
  • the ATG of the Arabidopsis sequence is immediately after the 3' end of the sequence.
  • the ATG of the rice sequence is marked as a block.
  • Figure 9 shows examples of promoter-reporter constructs for use in determining promoter activity in a plant.
  • (a) shows use of the Arabidopsis gene At5g07210 promoter ('CZE promoter') transcriptionally fused with the uidA (GUS) reporter gene as described in the Examples and Experiments.
  • (b) shows use of the Arabidopsis gene At5g46950 promoter ('peripheral promoter') transcriptionally fused with the uidA (GUS) reporter gene as described in the Examples and Experiments.
  • Figure 10 shows GUS expression in plants transformed with reporter constructs corresponding to those described for Figure 9. GUS activity assayed in seeds driven by promoters of (a) Arabidopsis gene At5g07210 and (b) Arabidopsis gene At5g46950.
  • Figure 11 shows examples of alignments of promoters of Arabidopsis endosperm-expressed genes (top line "Query" sequence) with corresponding Brassica promoter sequences (bottom line "Sbjct" sequence).
  • A Alignment of promoter of At5g07210 with BAC clone AC189496 of Brassica rapa, shown over positions 80604-82515 of AC189496;
  • B Alignment of promoter of At1g41000 with BAC clone AC183495 of Brassica oleracea, shown over positions 157589-158507 of AC183495 (NB: the alignment shows the reverse complement of the promoter sequence due to the strand of the BAC clone that has been sequenced).
  • RNA was purified 2x through an oligo(dT) column prepared by suspending the oligo(dT) in a binding buffer (0.5 M NaCl, 0.01 M Tris-HCl pH 7.5, 0.5% SDS, 0.1 mM EDTA).
  • PolyA RNA was eluted in 0.01 M Tris-HCl pH 7.5, 1 mM EDTA.
  • A cDNA library from the polyA RNA of developing seeds was constructed using a SuperScript Plasmid System for cDNA Synthesis and Cloning (Invitrogen, Paisley, UK) following the manufacturer's instructions.
  • The polyA RNA was NotI-poly(dT) primed and the cDNA was size selected by passing through a column. Fractions containing 0.7-12 kb were pooled for cloning.
  • cDNA was directionally cloned with 5' NotI and 3' SalI adapters into the pSPORT1 vector.
  • PCR amplification of cDNAs and macroarray construction
  • The 2000 sequenced clones along with 4000 randomly picked clones were amplified by PCR in 96-well plates using M13 forward and reverse primers.
  • A typical 100 µl reaction contained 3 µl of template, 10 µl of 10X PCR buffer (Sigma-Aldrich, Gillingham, Dorset, UK), 4 µl each of forward and reverse primers (10 µM), and 0.5 µl of Taq polymerase (Sigma-Aldrich). PCR conditions were as follows: 94°C for 4 min;
  • PCR products were separated on a 1% agarose gel to verify amplification. PCR reactions were dried down and resuspended in 40 µl of sdH2O. The 96-well plates were transferred and concatenated into 384 conical well plates (Genetix, New Milton, Hampshire, UK).
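As a rough check on the 100 µl PCR reaction described above, the final concentration of a component follows from the dilution relation C_final = C_stock × V_added / V_total. The sketch below is illustrative only: the function name is hypothetical, and the only inputs are the volumes and stock concentrations stated in the text.

```python
def final_concentration(stock_conc, added_volume_ul, total_volume_ul):
    """Solve C1 * V1 = C2 * V2 for C2 (result is in the units of stock_conc)."""
    if total_volume_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock_conc * added_volume_ul / total_volume_ul


# Values stated in the text: 100 µl reaction, 4 µl of each 10 µM primer,
# 10 µl of 10X PCR buffer.
primer_uM = final_concentration(10.0, 4.0, 100.0)  # µM per primer
buffer_x = final_concentration(10.0, 10.0, 100.0)  # fold concentration (X)
print(f"primer: {primer_uM:.2f} µM each; buffer: {buffer_x:.1f}X")
```

Under these figures each primer is present at 0.4 µM and the buffer at 1X.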
  • 25 µg of total RNA in 10 µl dH2O was combined with 2 µl of oligo(dT) primer (0.5 µg/µl) and heated at 70°C for 5 min, then cooled on ice.
  • The following components were added: 3 µl of 0.1 M DTT, 5 µl of 5X first-strand buffer, 3 µl of 1 mM ChromaTide fluorescein-12-dUTP (Invitrogen), 1 µl of 15 mM dNTPs, 1.2 µl of RNasin, 1 µl of SuperScript II reverse transcriptase (Invitrogen), and 2 µl of dH2O.
  • The RNA mix was incubated for 1 h at 42°C.
  • EDTA was added to a final concentration of 20 mM to terminate the reaction.
  • Membranes were prehybridized in 20 ml of formamide/Church buffer (1 mM EDTA, 0.5 M sodium phosphate pH 7.2, 7% SDS, 30% formamide) at 45°C for 1 h. The probe was denatured at 95°C for 5 min and snap cooled on ice. The prehybridization solution was replaced with 5 ml of formamide/Church buffer and the denatured probe added. Hybridisation was carried out overnight at 45°C. Membranes were washed twice at 45°C for 20 minutes in 0.1X SSC and then in 1% SDS.
  • The array membranes were scanned DNA side down at 488 nm excitation and 540 nm emission with a FLA-3000 Phosphoimager (Fuji Life Sciences, Bedford, UK). Hybridization intensities of the spots were quantified using ArrayVision 6 software (Amersham) after direct import of the 16-bit Phosphoimager files. Background was subtracted using the 'surrounding the spot' feature of ArrayVision. Normalization of signals was performed with reference to all spots on the array using the reference setting of ArrayVision.
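The two processing steps named above, local background subtraction and normalization against all spots, can be sketched generically as follows. This is not ArrayVision's algorithm, which is not described here. It assumes that each spot has its own local background estimate, that negative corrected values are clipped to zero, and that normalization divides by the mean of all corrected spots. The function name and the intensity values are hypothetical.

```python
def normalize_spots(raw, background):
    """Subtract per-spot local background, then scale by the mean of all spots."""
    if len(raw) != len(background):
        raise ValueError("raw and background must have the same length")
    # Clip at zero so a spot dimmer than its surroundings is not negative.
    corrected = [max(r - b, 0.0) for r, b in zip(raw, background)]
    mean_signal = sum(corrected) / len(corrected)
    if mean_signal == 0:
        raise ValueError("all spots are at or below background")
    return [c / mean_signal for c in corrected]


# Hypothetical intensities for four spots (arbitrary units), illustration only.
raw = [1200.0, 300.0, 800.0, 150.0]
background = [200.0, 100.0, 200.0, 150.0]
normalized = normalize_spots(raw, background)
```

After this scaling the normalized values average 1.0, so arrays hybridized with different probes can be compared on a common scale.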
  • Genomic DNA was extracted from leaves of A. thaliana (Col-0 accession) according to the method of Aljanabi and Martinez (1997).
  • A genomic fragment was amplified by PCR using a high fidelity Taq polymerase (KOD Hi Fi Taq, Merck Biosciences, Nottingham, UK) using the primers indicated below.
  • Primers were designed to include restriction sites suitable for cloning. The primers used with the appropriate restriction sites are given below.
  • Amplification products were gel eluted and cloned into pGEMT (Promega, Southampton, UK). After sequencing to confirm the correct amplification product, the desired fragments were excised from the vector using appropriate restriction enzymes and cloned upstream of the uidA (GUS) reporter gene in the vector BJ60.
  • The promoter::GUS cassette was lifted out of BJ60 with a NotI digest and cloned into the NotI site of the binary vector BJ40.
  • The binary vector containing the promoter::GUS cassette was transformed into Agrobacterium strain GV3101. A. thaliana plants (Col-0 accession) were transformed following the protocol of Clough and Bent (1998).
  • A reporter vector was constructed, transformed into Arabidopsis and analysed to test for chalazal endosperm-specific or endosperm-preferred promoter activity to determine its suitability for driving chalazal endosperm-specific or endosperm-preferred expression of an operably linked nucleic acid capable of increasing endosperm and seed size.
  • The cloning strategy is shown in Figure 6 and the BJ60, BJ36, and BJ40 vectors used in the cloning strategies described in this and following examples are shown in Figure 5.
  • A reporter vector based on the promoter of At5g07210 was constructed as described below.
  • A 2.93 kb genomic fragment including the At5g07210 promoter (proAt5g07210/1, SEQ ID NO: 7) was amplified by the polymerase chain reaction (PCR) from Arabidopsis thaliana genomic DNA using the primers At5g07210-BJ60-F and At5g07210-BJ60-R (as indicated in above table) which introduce a SalI and an XhoI site at the 5' and 3' ends of the proAt5g07210 PCR fragment respectively:
  • At5g07210-BJ60-R: 5' TTTCTCGAGCTTCGGATCGTCTACAGCTATAACTGC 3' (SEQ ID NO: 32).
  • The proAt5g07210/1 PCR fragment was A-tailed and ligated into pGEMT, then excised with SalI and XhoI and ligated into the SalI and XhoI sites of BJ60, 5' to the uidA reporter which includes a terminator signal, forming the vector proAt5g07210/1-BJ60.
  • The reporter cassette was excised with NotI from the vector proAt5g07210/1-BJ60 and ligated into the NotI sites of the binary vector BJ40, forming the vector for transformation, proAt5g07210/1-uidA-BJ40.
  • The binary vector was transformed into Agrobacterium tumefaciens and then into Arabidopsis thaliana as described herein.
  • The uidA gene encodes β-glucuronidase (GUS), which was assayed as described herein using standard protocols (e.g. Jefferson, 1987).
  • Two reporter vectors were constructed, transformed into Arabidopsis and analysed to test for endosperm-specific or endosperm-preferred promoter activity to determine their suitability for driving endosperm-specific or endosperm-preferred expression of an operably linked nucleic acid capable of increasing endosperm and seed size.
  • The cloning strategy is shown in Figure 6.
  • A reporter vector based on the promoter of At5g46950 was constructed as described below.
  • A 2.12 kb genomic fragment including the At5g46950 promoter (proAt5g46950/1, SEQ ID NO: 1) was amplified by PCR from Arabidopsis thaliana genomic DNA using the primers At5g46950-BJ60-F and At5g46950-BJ60-R (as indicated in above table) which introduce a SalI and an XhoI site at the 5' and 3' ends of the proAt5g46950 PCR fragment respectively:
  • At5g46950-BJ60-F: 5' AAAGTCGACAATTGATGGAAATAAAATTTCGC 3' (SEQ ID NO: 19)
  • At5g46950-BJ60-R: 5' AAACTCGAGCAAGAAAAGAGAGAAGATCACCAATGAAAC 3' (SEQ ID NO:
  • The proAt5g46950/1 PCR fragment was A-tailed and ligated into pGEMT, then excised with SalI and XhoI and ligated into the SalI and XhoI sites of BJ60, 5' to the uidA reporter which includes a terminator signal, forming the vector proAt5g46950/1-BJ60.
  • A reporter vector based on the promoter of At1g14520 was constructed as described below.
  • A 2.39 kb genomic fragment including the At1g14520 promoter was amplified by PCR from Arabidopsis thaliana genomic DNA using the primers At1g14520-BJ60-F and At1g14520-BJ60-R (as indicated in above table) which introduce a SalI and an XhoI site at the 5' and 3' ends of the proAt1g14520 PCR fragment respectively:
  • At1g14520-BJ60-F: 5' TTTGTCGACGTTGACTGGTTGGTCTTTTCTCTAC 3' (SEQ ID NO: 15)
  • At1g14520-BJ60-R: 5' TTTCTCGAGGCTTTTGCTGTAGAGATCATACTTGTTGAACAC 3' (SEQ ID NO: 16).
  • The proAt1g14520/1 PCR fragment was A-tailed and ligated into pGEMT, then excised with SalI and XhoI and ligated into the SalI and XhoI sites of BJ60, 5' to the uidA reporter which includes a terminator signal, forming the vector proAt1g14520/1-BJ60.
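The reporter primers above carry their cloning sites near the 5' end, after a three-nucleotide clamp. A minimal sketch of a check that each primer contains the intended site, assuming the standard recognition sequences GTCGAC (SalI) and CTCGAG (XhoI). The helper name is hypothetical; the primer sequences are the At1g14520 pair given in the text.

```python
# Standard recognition sequences for the two enzymes used in the reporter cloning.
SITES = {"SalI": "GTCGAC", "XhoI": "CTCGAG"}


def site_offset(primer, enzyme):
    """Return the 0-based offset of the enzyme's site in the primer, or -1 if absent."""
    return primer.upper().find(SITES[enzyme])


fwd = "TTTGTCGACGTTGACTGGTTGGTCTTTTCTCTAC"          # At1g14520-BJ60-F
rev = "TTTCTCGAGGCTTTTGCTGTAGAGATCATACTTGTTGAACAC"  # At1g14520-BJ60-R

for name, primer, enzyme in (("forward", fwd, "SalI"), ("reverse", rev, "XhoI")):
    print(name, enzyme, "site at offset", site_offset(primer, enzyme))
```

The same check applies to the other reporter primer pairs, since they use the same two enzymes.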
  • The reporter cassettes were excised with NotI from the vectors proAt5g46950/1-BJ60 and proAt1g14520/1-BJ60 and ligated into the NotI sites of the binary vector BJ40, forming the vectors for transformation, proAt5g46950/1-uidA-BJ40 and proAt1g14520/1-uidA-BJ40, respectively.
  • Each binary vector was transformed into Agrobacterium tumefaciens and then into Arabidopsis thaliana as described herein.
  • The uidA gene encodes β-glucuronidase (GUS), which was assayed as described herein using standard protocols (e.g. Jefferson, 1987).
  • Seedlings and immature seeds were assayed for uidA expression in GUS staining buffer as described in the example above, except that concentrations of K3Fe(CN)6 and K4Fe(CN)6 of up to 3 mM were used.
  • A reporter vector was constructed, transformed into Arabidopsis and analysed to test for chalazal endosperm-specific or chalazal endosperm-preferred promoter activity to determine the promoter's suitability for driving chalazal endosperm-specific or chalazal endosperm-preferred expression of an operably linked nucleic acid capable of increasing endosperm and seed size.
  • The cloning strategy was essentially as already described for Example 1.
  • A reporter vector based on the promoter of At5g07210 was constructed essentially as already described above for Example 1, except that SalI and KpnI restriction sites were used instead of SalI and XhoI.
  • A genomic fragment including the At5g07210 promoter was amplified by PCR from Arabidopsis thaliana genomic DNA using primers which introduce a SalI and a KpnI site at the 5' and 3' ends of the proAt5g07210/2 PCR fragment respectively (fragment shown as SEQ ID NO: 72).
  • The SalI/KpnI digested fragment was ligated into the SalI and KpnI sites of BJ60, 5' to the uidA reporter, forming the vector shown in Figure 9a.
  • The reporter cassette containing the promoter and uidA gene was excised from this vector and transferred into the binary vector BJ40 to form the vector for transformation into Agrobacterium and then Arabidopsis as already described. β-Glucuronidase (GUS) activity was assayed as already described.
  • A reporter vector was constructed, transformed into Arabidopsis and analysed to test for endosperm-specific or endosperm-preferred promoter activity to determine the promoter's suitability for driving endosperm-specific or endosperm-preferred expression of an operably linked nucleic acid capable of increasing endosperm and seed size.
  • The cloning strategy was essentially as already described for Example 2.
  • A reporter vector based on the promoter of At5g46950 was constructed essentially as already described above for Example 2, except that SalI and KpnI restriction sites were used instead of SalI and XhoI.
  • A genomic fragment including the At5g46950 promoter was amplified by PCR from Arabidopsis thaliana genomic DNA using primers which introduce a SalI and a KpnI site at the 5' and 3' ends of the proAt5g46950/2 PCR fragment respectively (fragment shown as SEQ ID NO: 71).
  • The SalI/KpnI digested fragment was ligated into the SalI and KpnI sites of BJ60, 5' to the uidA reporter, forming the vector shown in Figure 9b.
  • The reporter cassette containing the promoter and uidA gene was excised from this vector and transferred into the binary vector BJ40 to form the vector for transformation into Agrobacterium and then Arabidopsis as already described. β-Glucuronidase (GUS) activity was assayed as already described.
  • RT-PCR
  • RNA from root, stem, leaf, flower, seedling, pollen, and silique was extracted using an RNeasy Plant Minikit (Qiagen) following the manufacturer's instructions.
  • First strand cDNA was synthesized using a Reverse-iT 1st Strand Synthesis Kit (ABgene, Epsom, UK) with oligo(dT) and random primers, followed by PCR with Taq polymerase (Sigma-Aldrich) using gene-specific primers to detect expression (see below).
  • GapC was used as an internal control for amplification.
  • Examples of constructing and transforming into plants expression vectors comprising particular examples of endosperm promoters of the invention and particular examples of genes provided by the invention for expression or overexpression in endosperm for increasing endosperm and seed size are provided herein.
  • The cloning strategies are shown in Figure 7 and the BJ60, BJ36, and BJ40 vectors used in the cloning strategies described in this and following examples are shown in Figure 5.
  • Examples of expression vectors comprising a chalazal endosperm promoter with a heterologous gene for increasing endosperm size are provided (Example 3), as well as examples of expression vectors comprising an endosperm-specific promoter with a heterologous gene for increasing endosperm size (Example 4).
  • An expression vector based on the promoter of At5g07210 (accession numbers NC_003076 and NM_120803) is constructed as described below.
  • A 1.95 kb fragment including the At5g07210 promoter (proAt5g07210/2, SEQ ID NO: 8) with SalI and KpnI linkers is amplified by PCR from Arabidopsis thaliana genomic DNA using the primers At5g07210F and At5g07210R which introduce a SalI and a KpnI site at the 5' and 3' ends of the proAt5g07210 PCR fragment respectively:
  • At5g07210F: 5' tttgtcgacATGGGATGATCTCCGTTACC 3' (SEQ ID NO: 33)
  • At5g07210R: 5' tttggtaccTCTAATAATCTTTGCAAAGAG 3' (SEQ ID NO: 34).
  • The proAt5g07210/2 PCR fragment is A-tailed and ligated into pGEMT, and then excised with SalI and KpnI and ligated into the SalI and KpnI sites of BJ36, 5' to the ocs terminator signal, forming the vector proAt5g07210/2-BJ36.
  • The At1g77210 cDNA (accession no. NM_106370) is amplified by PCR from Arabidopsis thaliana cDNA using the primers At1g77210F and At1g77210R which introduce a KpnI and an XmaI site at the 5' and 3' ends of the At1g77210 PCR fragment respectively:
  • At1g77210F: 5' tttggtaccATGGCCGGTGGAGCTCTTACCG 3' (SEQ ID NO: 39)
  • At1g77210R: 5' tttcccgggTTATTCATCAACATCTTCGACATATTTC 3' (SEQ ID NO: 40).
  • The At1g77210 PCR fragment is A-tailed and ligated into pGEMT, and then excised with KpnI and XmaI and ligated into the KpnI and XmaI sites of the proAt5g07210/2-BJ36 vector, 3' to the At5g07210 promoter and 5' to the ocs terminator signal, forming the vector proAt5g07210/2::At1g77210-BJ36.
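The cDNA primers in this example share one layout: a short 5' clamp (lower case), the restriction site, then gene-specific sequence that starts at the ATG in the forward primer and at the reverse complement of the stop codon in the reverse primer. A minimal sketch of a check for that layout, assuming the standard recognition sequences GGTACC (KpnI) and CCCGGG (XmaI). The function names are hypothetical; the sequences are the At1g77210 primers given above.

```python
KPNI, XMAI = "GGTACC", "CCCGGG"
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def forward_ok(primer):
    """True if a KpnI site is immediately followed by an ATG start codon."""
    p = primer.upper()
    i = p.find(KPNI)
    return i != -1 and p[i + 6:i + 9] == "ATG"


def reverse_ok(primer):
    """True if an XmaI site is immediately followed by the reverse complement of a stop codon."""
    p = primer.upper()
    i = p.find(XMAI)
    return i != -1 and reverse_complement(p[i + 6:i + 9]) in STOP_CODONS


fwd_ok = forward_ok("tttggtaccATGGCCGGTGGAGCTCTTACCG")          # At1g77210F
rev_ok = reverse_ok("tttcccgggTTATTCATCAACATCTTCGACATATTTC")    # At1g77210R
print(fwd_ok, rev_ok)
```

Because the reverse primer anneals to the sense strand, the TTA that follows its XmaI site reads as the stop codon TAA in the coding orientation.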
  • The At4g24650 cDNA (accession no. NM_118598) is amplified by PCR from Arabidopsis thaliana cDNA using the primers At4g24650F and At4g24650R which introduce a KpnI and an XmaI site at the 5' and 3' ends of the At4g24650 PCR fragment respectively:
  • At4g24650F: 5' aaaggtaccATGAAGTGTAATGACAAAATGG 3' (SEQ ID NO: 41)
  • At4g24650R: 5' aaacccgggCTAGTTAAGACTTAAAAATC 3' (SEQ ID NO: 42).
  • The At4g24650 PCR fragment is A-tailed and ligated into pGEMT, and then excised with KpnI and XmaI and ligated into the KpnI and XmaI sites of the proAt5g07210/2-BJ36 vector, 3' to the At5g07210 promoter and 5' to the ocs terminator signal, forming the vector proAt5g07210/2::At4g24650-BJ36.
  • The At3g25280 cDNA (accession no. NM_113434) is amplified by PCR from Arabidopsis thaliana cDNA using the primers At3g25280F and At3g25280R which introduce a KpnI and an XmaI site at the 5' and 3' ends of the At3g25280 PCR fragment respectively:
  • At3g25280F: 5' tttggtaccATGGAGAATGATATGGAAGAG 3' (SEQ ID NO: 43)
  • At3g25280R: 5' tttcccgggTTAATATCTCTTTGCCC 3' (SEQ ID NO: 44).
  • The At3g25280 PCR fragment is A-tailed and ligated into pGEMT, and then excised with KpnI and XmaI and ligated into the KpnI and XmaI sites of the proAt5g07210/2-BJ36 vector, 3' to the At5g07210 promoter and 5' to the ocs terminator signal, forming the vector proAt5g07210/2::At3g25280-BJ36.
  • At4g21480
  • The At4g21480 cDNA (accession no. NM_118268) is amplified by PCR from Arabidopsis thaliana cDNA using the primers At4g21480F and At4g21480R which introduce a KpnI and an XmaI site at the 5' and 3' ends of the At4g21480 PCR fragment respectively:
  • At4g21480F: 5' tttggtaccATGCCTTCAGTCGGAATTG 3' (SEQ ID NO: 45)
  • At4g21480R: 5' tttcccgggCTAGATAACAACTTTCGTTAGATTC 3' (SEQ ID NO: 46).
  • The At4g21480 PCR fragment is A-tailed and ligated into pGEMT, and then excised with KpnI and XmaI and ligated into the KpnI and XmaI sites of the proAt5g07210/2-BJ36 vector, 3' to the At5g07210 promoter and 5' to the ocs terminator signal, forming the vector proAt5g07210/2::At4g21480-BJ36.
  • Construction of binary vectors and transformation into Arabidopsis thaliana
  • Expression cassettes (including the ocs terminator) are excised with
  • The binary vectors are transformed into Agrobacterium tumefaciens and then into Arabidopsis thaliana. Transformed plants are assessed for the effect of expression of the heterologous gene as described below.
  • An expression vector based on the promoter of At5g46950 (accession numbers NC_003076 and NM_124066) is constructed as described below.
  • A 2.08 kb fragment including the At5g46950 promoter (proAt5g46950/2, SEQ ID NO: 2) with SalI and KpnI linkers is amplified by PCR from Arabidopsis thaliana genomic DNA using the primers At5g46950F and At5g46950R which introduce a SalI and a KpnI site at the 5' and 3' ends of the proAt5g46950/2 PCR fragment respectively:
  • At5g46950F: 5' tttgtcgacAATTGATGGAAATAAAATTTCGC 3' (SEQ ID NO: 35)
  • At5g46950R: 5' tttggtaccTTTTTTGTTTTTTTACTTTGAGAAGAAG 3' (SEQ ID NO: 36).
  • The proAt5g46950/2 PCR fragment is A-tailed and ligated into pGEMT, and then excised with SalI and KpnI and ligated into the SalI and KpnI sites of BJ36, 5' to the ocs terminator signal, forming the vector proAt5g46950/2-BJ36.
  • An expression vector based on the promoter of At1g14520 (accession numbers NC_003070 and NM_101319) is constructed as described below.
  • A 1.01 kb fragment including the At1g14520 promoter (proAt1g14520/2, SEQ ID NO: 4) with SalI and KpnI linkers is amplified by PCR from Arabidopsis thaliana genomic DNA using the primers At1g14520F and At1g14520R which introduce a SalI and a KpnI site at the 5' and 3' ends of the proAt1g14520/2 PCR fragment respectively:
  • At1g14520F: 5' tttgtcgacGTTGACTGGTTGGTCTTTTCTC 3' (SEQ ID NO: 37)
  • At1g14520R: 5' tttggtaccGGAGATTCTTCAGATCCAGAG 3' (SEQ ID NO: 38).
  • The proAt1g14520/2 PCR fragment is A-tailed and ligated into pGEMT, and then excised with SalI and KpnI and ligated into the SalI and KpnI sites of BJ36, 5' to the ocs terminator signal, forming the vector proAt1g14520/2-BJ36.
  • The At1g55600 cDNA (accession no. NM_104436) is amplified by PCR from Arabidopsis thaliana cDNA using the primers At1g55600F and At1g55600R which introduce a KpnI and an XmaI site at the 5' and 3' ends of the At1g55600 PCR fragment respectively:
  • At1g55600F: 5' tttggtaccATGAGTGATTTTGATGAAAACTTCATCG 3' (SEQ ID NO:
  • At1g55600R: 5' aaacccgggCTACATGTCGACACCAAACTTAAAATACAGGCG 3' (SEQ ID NO: 1
  • The At1g55600 PCR fragment is A-tailed and ligated into pGEMT, and then excised with KpnI and XmaI and ligated into the KpnI and XmaI sites of the proAt5g07210/2-BJ36 vector, 3' to the At5g07210 promoter and 5' to the ocs terminator signal, forming the vector proAt5g07210/2::At1g55600-BJ36.
  • The At4g31900 cDNA (accession no. NM_119341) is amplified by PCR from Arabidopsis thaliana cDNA using the primers At4g31900F and At4g31900R which introduce a KpnI and an XmaI site at the 5' and 3' ends of the At4g31900 PCR fragment respectively:
  • At4g31900F: 5' aaaggtaccATGGCTAATCTGTTGCAAAGGC 3' (SEQ ID NO: 49)
  • At4g31900R: 5' tttcccgggTCAATCCAGCACAATGATGTTATCC 3' (SEQ ID NO: 50).
  • The At4g31900 PCR fragment is A-tailed and ligated into pGEMT, and then excised with KpnI and XmaI and ligated into the KpnI and XmaI sites of the proAt5g07210/2-BJ36 vector, 3' to the At5g07210 promoter and 5' to the ocs terminator signal, forming the vector proAt5g07210/2::At4g31900-BJ36.
  • The At5g14960 cDNA (accession no. NM_121500) is amplified by PCR from Arabidopsis thaliana cDNA using the primers At5g14960F and At5g14960R which introduce a KpnI and an XmaI site at the 5' and 3' ends of the At5g14960 PCR fragment respectively:
  • At5g14960F: 5' tttggtaccATGGATTCTCTCGCTCTCGCTCC 3' (SEQ ID NO: 51)
  • At5g14960R: 5' aaacccgggTCATTTCTCCCGACCAAACTCTTC 3' (SEQ ID NO: 52).
  • The At5g14960 PCR fragment is A-tailed and ligated into pGEMT, and then excised with KpnI and XmaI and ligated into the KpnI and XmaI sites of the proAt5g07210/2-BJ36 vector, 3' to the At5g07210 promoter and 5' to the ocs terminator signal, forming the vector proAt5g07210/2::At5g14960-BJ36.
  • The At5g39260 cDNA (accession no. NM_123288) is amplified by PCR from Arabidopsis thaliana cDNA using the primers At5g39260F and At5g39260R which introduce a KpnI and an XmaI site at the 5' and 3' ends of the At5g39260 PCR fragment respectively:
  • At5g39260F: 5' aaaggtaccATGAAATTGCTAGAAAAAATGAC 3' (SEQ ID NO: 53)
  • At5g39260R: 5' tttcccgggTTAAAAGTTAGTCTTTCCATC 3' (SEQ ID NO: 54).
  • The At5g39260 PCR fragment is A-tailed and ligated into pGEMT, and then excised with KpnI and XmaI and ligated into the KpnI and XmaI sites of the proAt5g07210/2-BJ36 vector, 3' to the At5g07210 promoter and 5' to the ocs terminator signal, forming the vector proAt5g07210/2::At5g39260-BJ36.
  • The At1g01530 cDNA (accession no. NM_100035) is amplified by PCR from Arabidopsis thaliana cDNA using the primers At1g01530F and At1g01530R which introduce a KpnI and an XmaI site at the 5' and 3' ends of the At1g01530 PCR fragment respectively:
  • At1g01530F: 5' tttggtaccATGGCGAGAAAGAATCTTGG 3' (SEQ ID NO: 55)
  • At1g01530R: 5' aaacccgggCTAATAGTAACGAGCCCAATAC 3' (SEQ ID NO: 56).
  • The At1g01530 PCR fragment is A-tailed and ligated into pGEMT, and then excised with KpnI and XmaI and ligated into the KpnI and XmaI sites of the proAt5g07210/2-BJ36 vector, 3' to the At5g07210 promoter and 5' to the ocs terminator signal, forming the vector proAt5g07210/2::At1g01530-BJ36.
  • At1g65330
  • The At1g65330 cDNA (accession no. NM_105207) is amplified by PCR from Arabidopsis thaliana cDNA using the primers At1g65330F and At1g65330R which introduce a KpnI and an XmaI site at the 5' and 3' ends of the At1g65330 PCR fragment respectively:
  • At1g65330F: 5' aaaggtaccATGAGGGGGAAGATGAAG 3' (SEQ ID NO: 57)
  • At1g65330R: 5' tttcccgggCTAGAGATCATTGATGATGTTAGG 3' (SEQ ID NO: 58).
  • The At1g65330 PCR fragment is A-tailed and ligated into pGEMT, and then excised with KpnI and XmaI and ligated into the KpnI and XmaI sites of the proAt5g07210/2-BJ36 vector, 3' to the At5g07210 promoter and 5' to the ocs terminator signal, forming the vector proAt5g07210/2::At1g65330-BJ36.
  • The At4g37750 cDNA (accession no. NM_119937) is amplified by PCR from Arabidopsis thaliana cDNA using the primers At4g37750F and At4g37750R which introduce a KpnI and an XmaI site at the 5' and 3' ends of the At4g37750 PCR fragment respectively:
  • At4g37750F: 5' tttggtaccATGAAGTCTTTTTGTGATAATGATG 3' (SEQ ID NO: 59)
  • At4g37750R: 5' aaacccgggTCAAGAATCAGCCCAAGCAGCG 3' (SEQ ID NO: 60).
  • The At4g37750 PCR fragment is A-tailed and ligated into pGEMT, and then excised with KpnI and XmaI and ligated into the KpnI and XmaI sites of the proAt5g07210/2-BJ36 vector, 3' to the At5g07210 promoter and 5' to the ocs terminator signal, forming the vector proAt5g07210/2::At4g37750-BJ36.
  • The At1g55600 cDNA with KpnI and XmaI linkers is amplified by PCR from Arabidopsis thaliana cDNA as described above.
  • The At1g55600 PCR fragment is A-tailed and ligated into pGEMT, and then excised with
  • The At4g31900 cDNA with KpnI and XmaI linkers is amplified by PCR from Arabidopsis thaliana cDNA as described above.
  • The At4g31900 PCR fragment is A-tailed and ligated into pGEMT, and then excised with KpnI and XmaI and ligated into the KpnI and XmaI sites of the proAt1g14520/2-BJ36 vector, 3' to the At1g14520 promoter and 5' to the ocs terminator signal, forming the vector proAt1g14520/2::At4g31900-BJ36.
  • proAt1g14520/2::At5g14960
  • The At5g14960 cDNA with KpnI and XmaI linkers is amplified by PCR from Arabidopsis thaliana cDNA as described above.
  • The At5g14960 PCR fragment is A-tailed and ligated into pGEMT, and then excised with KpnI and XmaI and ligated into the KpnI and XmaI sites of the proAt1g14520/2-BJ36 vector, 3' to the At1g14520 promoter and 5' to the ocs terminator signal, forming the vector proAt1g14520/2::At5g14960-BJ36.
  • The At5g39260 cDNA with KpnI and XmaI linkers is amplified by PCR from Arabidopsis thaliana cDNA as described above.
  • The At5g39260 PCR fragment is A-tailed and ligated into pGEMT, and then excised with KpnI and XmaI and ligated into the KpnI and XmaI sites of the proAt1g14520/2-BJ36 vector, 3' to the At1g14520 promoter and 5' to the ocs terminator signal, forming the vector proAt1g14520/2::At5g39260-BJ36.
  • The At1g01530 cDNA with KpnI and XmaI linkers is amplified by PCR from Arabidopsis thaliana cDNA as described above.
  • The At1g01530 PCR fragment is A-tailed and ligated into pGEMT, and then excised with
  • The At1g65330 cDNA with KpnI and XmaI linkers is amplified by PCR from Arabidopsis thaliana cDNA as described above.
  • The At1g65330 PCR fragment is A-tailed and ligated into pGEMT, and then excised with KpnI and XmaI and ligated into the KpnI and XmaI sites of the proAt1g14520/2-BJ36 vector, 3' to the At1g14520 promoter and 5' to the ocs terminator signal, forming the vector proAt1g14520/2::At1g65330-BJ36.
  • proAt1g14520/2::At4g37750
  • The At4g37750 cDNA with KpnI and XmaI linkers is amplified by PCR from Arabidopsis thaliana cDNA as described above.
  • The At4g37750 PCR fragment is A-tailed and ligated into pGEMT, and then excised with KpnI and XmaI and ligated into the KpnI and XmaI sites of the proAt1g14520/2-BJ36 vector, 3' to the At1g14520 promoter and 5' to the ocs terminator signal, forming the vector proAt1g14520/2::At4g37750-BJ36.
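The vector names in this series follow a regular pattern: promoter fragment, a double colon, the coding sequence, then the BJ36 backbone. As an illustration only, the names for the At1g14520 promoter series can be generated as below. The helper name is hypothetical, and the gene list is the set of seven heterologous genes named in this example.

```python
def construct_name(promoter_fragment, gene, backbone="BJ36"):
    """Compose a vector name in the promoter::gene-backbone pattern used in the text."""
    return f"{promoter_fragment}::{gene}-{backbone}"


genes = [
    "At1g55600", "At4g31900", "At5g14960", "At5g39260",
    "At1g01530", "At1g65330", "At4g37750",
]
names = [construct_name("proAt1g14520/2", gene) for gene in genes]
for name in names:
    print(name)
```

Listing the names this way makes it easy to confirm that every promoter and gene pairing has a corresponding vector.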
  • Expression cassettes (including the ocs terminator) are excised with
  • The binary vectors are transformed into Agrobacterium tumefaciens and then into Arabidopsis thaliana. Transformed plants are assessed for the effect of expression of the heterologous gene as described below.
  • The effect of endosperm expression of the heterologous gene is determined by assessing endosperm size and/or seed size in samples of endosperm and/or seeds from a transformed plant (test plant) and comparison with a control plant in which the heterologous gene is not present and/or is not expressed.
  • Endosperm size may be assessed by counting endosperm nuclei and/or measuring the volume or cross-sectional area of chalazal endosperm during seed development. Seed size may be assessed by weighing mature seeds.
  • The control plant for comparison with a test plant expressing a heterologous gene is the same accession or cultivar of the same species as the test plant that does not contain the relevant construct or expression cassette and/or does not express the heterologous gene.
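The comparison between test and control plants can be summarised as a relative change in mean seed weight. The sketch below is one simple way to express it and is not a method stated in the text. The function name is hypothetical, and the weights are placeholder values chosen for illustration, not measurements.

```python
from statistics import mean


def percent_change(test_values, control_values):
    """Percentage difference of the test mean relative to the control mean."""
    control_mean = mean(control_values)
    if control_mean == 0:
        raise ValueError("control mean must be non-zero")
    return 100.0 * (mean(test_values) - control_mean) / control_mean


# Hypothetical placeholder seed weights (µg per seed); not data from this work.
control = [20.0, 21.0, 19.0]
test = [22.0, 23.0, 21.0]
change = percent_change(test, control)
print(f"mean seed weight change: {change:+.1f}%")
```

In practice a statistical test on replicate plants would accompany such a summary; that step is left out here.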
  • Constructs have been produced for analysing the effect of chalazal endosperm expression of a sucrose transporter gene on endosperm and seed size using the chalazal endosperm promoter of the At5g07210 gene (promoter fragment proAt5g07210/2 (SEQ ID NO: 8)) and the cDNA encoding the sucrose transporter gene At1g77210 (accession number NM_106370), essentially as described above for Example 3.
  • Constructs have been produced for analysing the effect of chalazal endosperm expression of an adenylate isopentenyltransferase (IPT4) gene on endosperm and seed size using the chalazal endosperm promoter of the At5g07210 gene (promoter fragment proAt5g07210/2 (SEQ ID NO: 8)) and the cDNA encoding the IPT4 gene At4g24650 (accession number NM_118598), essentially as described above for Example 3.
  • Expression cassettes have been produced for analysing the effect of endosperm-specific expression of a transcription factor gene on endosperm and seed size using the endosperm-specific promoter of the At5g46950 gene (promoter fragment proAt5g46950/2 (SEQ ID NO: 2)) and the cDNA encoding the AINTEGUMENTA transcription factor gene At4g37750 (accession number NM_119937), essentially as described above for Example 4.
  • Expression cassettes have been produced for analysing the effect of endosperm-specific expression of a MADS-box transcription factor gene on endosperm and seed size using the endosperm-specific promoter of the At5g46950 gene (promoter fragment proAt5g46950/2 (SEQ ID NO: 2)) and the cDNA encoding the PHERES1 (PHE1) MADS-box transcription factor gene At1g65330 (accession number NM_105207), essentially as described above for Example 4.
  • Expression cassettes have been produced for analysing the effect of endosperm-specific expression of a MADS-box transcription factor gene on endosperm and seed size using the endosperm-specific promoter of the At5g46950 gene (promoter fragment proAt5g46950/2 (SEQ ID NO: 2)) and the cDNA encoding the AGL28 MADS-box transcription factor gene At1g01530 (accession number NM_100035), essentially as described above for Example 4.
  • NCBI rice genomic sequence database
  • protein-protein BLAST blastp
  • Positive hits obtained from the searches which represent BAC clones or chromosome fragments comprising the corresponding rice genes, were chosen for a detailed homology search of the 5' sequence of the rice genes.
  • The location of the corresponding rice gene within the overall sequence of the BAC or chromosome fragment provided by the BLASTp search was identified from the alignment of the Arabidopsis gene with the BAC or chromosome fragment.
  • A Brassica genomic sequence database was searched (http://brassica.bbsrc.ac.uk/BrassicaDB/blast_form.html) using each of the Arabidopsis endosperm promoters provided herein using BLAST (blastn) (Altschul et al. (1997)), which compares nucleic acid with nucleic acid sequence. Positive hits obtained from the searches represent BAC clones or chromosome fragments comprising corresponding Brassica promoter sequences.
  • a cDNA macroarray based on seeds enriched for both proliferative free-nuclear peripheral endosperm and for chalazal endosperm. Tissue-specific arrays such as this ensure a large number of true positive signals (Herwig et al., 2001).
  • endosperm-expressed genes contains hundreds of unique annotated genes that have not previously been identified in other microarray experiments to detect seed-expressed transcripts (see below) .
  • Of eight genes we selected for validation by promoter-reporter fusions and RT-PCR, seven were expressed in proliferating endosperm, and one emerged as a chalazal endosperm-specific marker that has not been identified in other seed array experiments.
  • 1,302 unique sequences to have significant homology to the A. thaliana genome (i.e. identity over at least 70% of a sequence at least 200 nucleotides in length) .
  • 1,253 represented annotated genes with At identifiers as of January 2005.
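The homology criterion quoted above (identity over at least 70% of a sequence at least 200 nucleotides in length) can be written as a simple filter. The sketch assumes one reading of that criterion: the query must be at least 200 nt long, and the number of identical aligned positions must be at least 70% of the query length. The function and parameter names are hypothetical.

```python
MIN_LENGTH_NT = 200       # minimum sequence length stated in the text
MIN_IDENTITY_PERCENT = 70  # minimum share of the sequence that must be identical


def is_significant_hit(query_length, identical_positions):
    """Apply the length and identity thresholds using integer arithmetic."""
    if query_length < MIN_LENGTH_NT:
        return False
    # Integer comparison avoids floating-point edge cases at exactly 70%.
    return identical_positions * 100 >= MIN_IDENTITY_PERCENT * query_length


print(is_significant_hit(200, 140))  # exactly at both thresholds
print(is_significant_hit(199, 199))  # too short, regardless of identity
```

Other readings are possible, for example 70% identity within the aligned region only, and would need a different comparison.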
  • The 'whole seed' gene list is available in Supplemental Data online (Table S1).
  • Approximately 3% of the ESTs had roughly equal similarity to more than one A. thaliana sequence, possibly because they were derived from A. arenosa transcripts, or because of polymorphisms between the Col-0 accession (used as the source of published A. thaliana genomic sequence) and C24 (the accession used in this experiment). Where this occurred we have provided the alternative At identifiers.
  • RNA from leaf, stem, root, flower, and siliques containing seeds at 2-7 DAP from A. thaliana, and free-nuclear endosperm extracted from Brassica napus seeds up to the early heart stage of embryogenesis.
  • A. thaliana seeds are too small (approximately 300 µm in length) to extract sufficient endosperm for the amount of RNA required
  • B. napus was chosen as a source of endosperm because it has larger seeds and is a crop plant closely related to A. thaliana.
  • Each hybridization experiment was performed in duplicate using independent RNA samples. Signal intensities were quantified and data were normalized using ArrayVision software (see Methods). The correlation coefficients between replicates ranged from 0.89 to 0.96 depending on the tissue; these are within recommended limits for cDNA nylon arrays (Herwig et al., 2001). The entire data set is available in Supplemental Data online (Table S1).
  • At identifiers were represented by more than one spot on the array. Three-quarters of these occurred only two or three times but several were found at a greater frequency, most likely reflecting high expression in the A. thaliana × A. arenosa seeds.
  • The most frequently occurring single gene (23 spots) was At4g12960, annotated as a gamma interferon responsive lysosomal thiol reductase (GILT) family protein.
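The replicate agreement reported above is a correlation coefficient between the normalized signals of two hybridizations. A minimal sketch of the Pearson coefficient follows, assuming that this is the statistic meant; the text does not name it. The function name is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length signal lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Two perfectly proportional replicates give a coefficient of 1.
r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Applied per tissue to the two replicate arrays, this yields one coefficient per tissue, as in the range quoted in the text.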
  • GILT gamma interferon responsive lysosomal thiol reductase
  • GILT proteins may facilitate protein unfolding by reducing disulfide bonds in acidic conditions; the only known role is for antigen processing in humans and mice, although GILT homologues have been cloned from fruit flies, nematodes, and zebrafish as well as Arabidopsis (Phan et al., 2001).
  • silique-preferred if it was absent in root, shoot, or leaf, and if the minimum signal for hybridization to A. thaliana siliques was at least 1.5-fold greater than the maximum signal for these tissues.
  • the silique-preferred set could represent genes expressed in embryos or seed coat rather than endosperm, but could also include Arabidopsis endosperm genes that did not hybridize to Brassica endosperm RNA, for example because of lack of sequence homology or low transcript abundance.
  • RNA from flowers or siliques at three stages of development against the Affymetrix ATH1 array (Redman et al., 2004), ranging from flowers with fully developed ovules to young siliques containing seeds with an embryo proper of 4-8 cells. 1,043 genes were called 'reproduction related'; of these, 914 were in our whole seed list, but our list also contained 197 others.
  • our endosperm/silique-preferred list contained 13 genes not identified by Ruuska et al. (2002) or Hennig et al. (2004).
  • RNA prepared from flowers hybridized to a high proportion of the cDNAs we called endosperm-expressed; we also compared our endosperm list with data from a recent transcription profiling experiment for A. thaliana pollen (Honys and Twell, 2004). In this experiment, RNA was isolated from male gametophytes at four stages of
  • endosperm-expressed genes contained many with previously reported expression patterns or mutant phenotypes indicating activity in this tissue. These included seven genes previously classed by seed-lethal mutant phenotypes as emb (embryo defective) (Tzafrir et al., 2003; www.seedgenes.org): (At1g60170, At1g80070, At2g38280, At4g28210, At4g39090, At5g55940). In some cases embryo lethality in emb mutants may reflect a primary effect in endosperm rather than embryo. We detected a sequence with homology to either PHERES1
  • MADS-box transcription factors expressed in seeds and expression of PHE1 in endosperm is regulated by a Polycomb repressive complex containing FERTILISATION INDEPENDENT SEED (FIS) class proteins
  • FIS FERTILISATION INDEPENDENT SEED
  • ASK11 encodes one of the Arabidopsis SKP1-like proteins involved in ubiquitin-mediated proteolysis; the endosperm-expressed list also includes ASK1 (At1g75950), ASK16 (At2g03190), and ASK12 (At4g34470). Of these four, ASK16 and ASK11 were previously reported to show silique-preferred expression (Zhao et al., 2003).
  • LTPs lipid transfer family proteins
  • At2g38540 (LTP1), At3g08770 (LTP6), At3g22600, At3g43720, At3g57310, At4g15160, At5g09370, At5g59310 (LTP4), At5g59320 (LTP3).
  • LTPs are capable of binding fatty acids and transferring phospholipids between membranes; proposed biological functions include membrane biogenesis, cutin and wax assembly, and defense against pathogen attack (Arondel et al., 2000).
  • the endosperm list also contained five genes encoding enzymes involved in pigmentation of the inner layer of the maternally derived seed coat: BANYULS (BAN) (dihydroflavonol 4-reductase) (At1g61720), TRANSPARENT TESTA 18 (TT18)/LDOX (leucoanthocyanidin dioxygenase) (At4g22880), TT4 (chalcone synthase) (At5g13930), TT5 (chalcone flavanone isomerase) (At3g55120), and TT19/ATGSTF12 (glutathione S-transferase) (At5g17220).
  • TRANSPARENT TESTA genes are expressed in many tissues (Shirley et al., 1995; Pelletier et al., 1997; Kitamura et al., 2004), but BAN has been shown by in situ hybridization to be restricted to the inner layer of the seed coat (Devic et al., 1999). This suggests that the extracted B. napus endosperm used in preparing RNA for hybridization was contaminated to some extent by surrounding tissue.
  • transcription factors AINTEGUMENTA and ARGOS which increase organ size due to prolonged cell proliferation when expressed under the 35S promoter
  • E2Fa and DPa, which occupy a key regulatory position in the cell cycle, and also increase cell division when ectopically expressed (De Veylder et al., 2002).
  • Preferred heterologous genes for expression in endosperm to increase endosperm size and seed size include At1g55600 (WRKY10 transcription factor MINISEED3, accession number NM_104436), At4g31900 (chromatin remodelling factor, accession number NM_119341), At5g14960 (DEL2/E2Fd/E2L1 transcription factor, accession number NM_121500), At5g39260 (EXPA21 expansin, accession number NM_123288), At1g01530 (AGL28 MADS-box transcription factor, accession number NM_100035), At1g65330 (PHERES1 (PHE1) MADS-box transcription factor, accession number NM_105207), At4g37750 (AINTEGUMENTA (ANT) transcription factor, accession number NM_119937), At1g77210 (a sugar transporter, accession number NM_106370), At4g24650 (aden
  • A major goal of the project was to identify promoters to drive endosperm-specific gene expression.
  • To this end we made promoter::reporter constructs based on ten genes from the endosperm/silique-preferred shortlist; these were chosen to represent a variety of predicted activities, including some with unknown function. Endosperm expression had not previously been reported for any of the ten genes tested.
  • a genomic promoter fragment as described was translationally fused to a uidA (GUS) reporter gene and transformed into A. thaliana.
  • GUS uidA
  • GUS activity was assayed in seeds at the globular and heart stages of embryogenesis (i.e. containing proliferating free-nuclear endosperm), pollen, and whole seedlings (Fig. 3A) as well as flowers and mature leaves (not shown).
  • For each construct we assayed three to five independently transformed lines; we observed variation in staining intensity but not in spatial patterns of expression. It can be difficult to distinguish staining in the endosperm and in the surrounding seed coat; therefore for some reporters we also crossed a wild-type seed parent with a transgenic pollen parent (Fig. 3B). Since the seed coat is of maternal origin, staining in the seed cavity would show only endosperm expression of the reporter. However, staining of the seed coat following self-pollination does not necessarily
  • proAt5g46950/1::uidA and proAt1g14520/1::uidA constructs are active in seedlings.
  • For the proAt5g46950/1::uidA construct, immature seeds at the globular stage of embryogenesis showed GUS staining throughout the endosperm, indicating endosperm activity in seeds.
  • no GUS staining was observed elsewhere in the plant, indicating endosperm-specific activity in seeds.
  • For the proAt1g14520/1::uidA construct, immature seeds at the globular/heart stage of embryogenesis showed GUS staining throughout the endosperm, indicating endosperm activity in seeds.
  • For this construct also, no GUS staining was observed elsewhere in the plant, including none in the embryo, indicating endosperm-specific activity in seeds.
  • GUS activity was assayed in seeds containing proliferating free nuclear endosperm as described above, including for Examples 1 and 2.
  • GUS expression patterns in seeds for the At5g07210 and At5g46950 promoter transcriptional fusion reporter constructs are shown in Figure 10. In both cases, the expression patterns observed in seeds were the same as those observed for the At5g07210 and At5g46950 promoter translational fusion reporter constructs (Figure 3A).
  • the seed expression pattern was the same as that obtained for the At5g07210 promoter translational fusion reporter construct of Example 1.
  • immature seeds at the globular stage of embryogenesis showed GUS staining only in chalazal endosperm (Figure 10a), confirming chalazal endosperm-specific activity for the At5g07210 promoter.
  • the seed expression pattern was the same as that obtained for the At5g46950 promoter translational fusion reporter construct of Example 2.
  • immature seeds at the globular stage of embryogenesis showed GUS staining throughout the endosperm (Figure 10b), confirming endosperm activity for the At5g46950 promoter.
  • At5g46960 we were able to design primers predicted to be specific to each gene, suggesting it is unlikely that we amplified different endogenous genes (i.e. from the same gene family) than those used for promoter selection. Comparisons of endogenous gene expression and reporter activity suggest that the upstream sequences we chose to drive reporter expression did not necessarily confer expression patterns identical to those of the endogenous gene. Therefore it should be feasible to augment the number of endosperm-specific promoters gained from this experiment by promoter deletion. In particular, identifying the region of the ARR21 promoter that gives endosperm but not pollen expression would provide a unique tool.
  • Promoter motifs: We searched the putative promoters used in the reporter constructs for motifs that could be involved in endosperm or pollen expression with the PLACE (Higo et al., 1999; http://www.dna.affrc.go.jp/PLACE) and PlantCARE (Lescot et al., 2002; http://intra.psb.ugent.be:8080/PlantCARE/) facilities (data not shown).
  • This identified sequences previously linked to endosperm expression such as the prolamin box (-300 element), GCN4, and AACA motifs which cooperate to confer endosperm activity in some cereal genes (Marzabal et al., 1998; Wu et al.
  • Each of the Arabidopsis endosperm-expressed genes At5g46950, At1g14520, At2g38900, At5g07210, At2g41000 and At5g39260 was used to identify corresponding genes and promoters in rice.
  • Corresponding rice genes and their homology with the Arabidopsis genes are shown in Table 4.
  • At5g07210: rice homologue BAD72541; score 131 bits (330); E-value 8.00E-29; identities 93/283; positives 146/283
  • Corresponding rice promoters are shown as SEQ ID NOS: 61 to 70 and examples of representative alignments with corresponding Arabidopsis promoters are shown in Figure 8.
  • Promoter sequences from Brassica corresponding to the promoters of Arabidopsis endosperm-expressed genes
  • Corresponding Brassica promoter sequences and their homology with the Arabidopsis promoters are shown in Table 5.
  • Corresponding Brassica promoter sequences are shown as SEQ ID NOS: 73 to 81 and examples of alignments with corresponding Arabidopsis promoters are shown in Figure 11.
  • endosperm accounts for a large proportion of human nutrition and is also a major determinant of seed viability and size, not only in cereals but in species with ephemeral endosperms such as soybean and oilseed rape.
  • the extent of endosperm proliferation early in seed development is a crucial component in setting seed size; therefore a biotechnological approach to modifying this trait requires promoters active in early endosperm. To find such promoters we constructed an array based on cDNAs extracted from developing Arabidopsis seeds enriched for proliferating endosperm.
  • Hybridization with RNA extracted from vegetative and reproductive tissues, including endosperm, and subsequent data filtering yielded sets of endosperm-expressed and endosperm-preferred genes, including many hundreds not previously identified in array experiments designed to detect genes expressed in Arabidopsis seeds.
  • Of the eight promoters selected for validation, seven were active in early endosperm, three of them with no detected activity elsewhere in the plant, and one specifically in the chalazal endosperm. Therefore this strategy has yielded proliferative phase endosperm promoters, providing means for altering seed size in plants.
  • Lipid transfer proteins are encoded by a small multigene family in Arabidopsis thaliana. Plant Science 157, 1-12.
  • Floral dip: a simplified method for Agrobacterium-mediated transformation of Arabidopsis thaliana. Plant Journal 16, 735-743.
  • Plant formin AtFH5 is an evolutionarily conserved actin nucleator involved in cytokinesis. Nature Cell Biology 7, 374-380. Jones, R. J., Schriber, B. M. N. and Roessler, J. A. (1996) Kernel sink capacity in maize: genotypic and maternal regulation. Crop Science 36, 301-306.
  • TRANSPARENT TESTA 19 is involved in the accumulation of both anthocyanins and proanthocyanidins in Arabidopsis. Plant Journal 37, 104-114.
  • Arabidopsis MSI1 is a component of the MEA/FIE Polycomb group complex and required for seed development. EMBO Journal 22, 4804-4814.
  • Ruuska S. A., Girke, T., Benning, C. and Ohlrogge, J. B. (2002) Contrapuntal networks of gene expression during Arabidopsis seed filling. Plant Cell 14, 1191-1206.
  • MIPS Arabidopsis thaliana Database (MatDB): an integrated biological knowledge resource for plant genomics. Nucleic Acids Research 32, D373-D376.
  • Arabidopsis gene At5g46950 (accession numbers NC_003076 and NM_124066) promoter sequence. Start codon shown in bold. Coding sequence shown in uppercase. proAt5g46950/1 aattgatggaaataaaatttcgctttacatatatgatctcttttaatttttggtactatatacatata atatttactatttagaaaccttacaagaaaacaaactatcttttattaaagtagcaaatacgcgattttt gttaattagagtgtatataaaaattggttcaaattcagtaattaaaagtttgttcaacca atagaacataaataagtcagaatctatagttttgaatggttgcaaatattggcatggactaatcagcta gcggacaaaccattct
  • SEQ ID NO: 2 Arabidopsis gene At5g46950 (accession numbers NC_003076 and NM_124066) promoter sequence.
  • proAt5g46950/2 aattgatggaaataaaatttcgctttacatatatgatctcttttaatttttggtactatatacatata
  • SEQ ID NO: 3 Arabidopsis gene At1g14520 (accession numbers NC_003070 and
  • SEQ ID NO: 4 Arabidopsis gene At1g14520 (accession numbers NC_003070 and
  • Arabidopsis gene At2g38900 (accession numbers NC_003071 and NM_129447) promoter sequence. Start codon shown in bold. Coding sequence shown in uppercase. promoter At2g38900/1 cgggcaatgatatatatgtcttgggtgcgttacaaggcatcgtttgcatgttgagttggataagtcaac tgtctttttttttggtttgtagtagctgcctttttttttcctttgttttaagaaatagcccgaaaaaaagaatgttctacatttcggagcagaaaactaaccgaatgagtttttggtcggatcatcggatcgat cagatatattttgagttacgaactgttataaaaaaagccataattttgtgtgtgt
  • Arabidopsis gene At2g38900 accession numbers NC_003071 and
  • promoter sequence promoter At2g38900/2 cgggcaatgatatatatgtcttgggtgcgttacaaggcatcgtttgcatgttgagttggataagtcaac tgtcttcttttttggttgtagtagctgccttttttttttttccttttgttttaagaaatagcccgaaaaaaaagaatgttctacatttcggagcagaaaactaaccgaatgagtttttggtcggatcatcggatcgat cagatatattttgagttacgaactgttataaaaaaagccataattttgtgtgtgagtttgcaaaatacct tataacttgttatttgagattgcacctccatatatattaatt
  • SEQ ID NO: 7 Arabidopsis gene At5g07210 (accession numbers NC_003076 and
  • Arabidopsis gene At2g41000 (accession numbers NC_003071 and NM_129665) promoter sequence. Start codon shown in bold. Coding sequence shown in uppercase. proAt2g41000/1 tctggtgtatgctgttgtcaaatttaaatacaagtataagcaaaaaataaaccaatttactatttggt atctgcttggaaattaacgtatcaatatttatcaaatgttttgattgtcctatacatctgttattatta ttttacgtcgaaattgagatatattgattaaaaatgattaaagtcatagtacaattgttcaattatcc cagaggatagacaatcagaattgatggaaagaaaaagacacatacagagttgaaattacattataga cacccaaatttacgttttt
  • Arabidopsis gene At2g41000 (accession numbers NC_003071 and NM_129665) promoter sequence.
  • SEQ ID NO: 11 Arabidopsis gene At5g39260 (accession numbers NC_003076 and
  • Arabidopsis gene At5g39260 (accession numbers NC_003076 and NM_123288) promoter sequence.
  • SEQ ID NO: 13 Arabidopsis gene At1g62080 (accession numbers NC_003070 and NM_104889) promoter sequence. Start codon shown in bold. Coding sequence shown in uppercase. proAt1g62080/1 gtcttgtcaaggattttgctttatttttgaatttctgaaaccttgattttagacttcaaaagctatcc aacttggaaatgtaactttttaatgtaatgttgaatcttttatgatatattaccaagattggtagcttag aaacggcaatgcagccaattagtagacatgtgaatccaaactatgtgtttgctgactttagttttctaa ataatttttcttctctcttgaccttgttcgtcttgaattttttttttgaat
  • SEQ ID NO: 14 Arabidopsis gene At1g62080 (accession numbers NC_003070 and
  • At1g14520-F (At1g14520-BJ60-F)
  • SEQ ID NO: 19 Primer for amplifying promoter fragment of Arabidopsis gene At5g46950 At5g46950-F (At5g46950-BJ60-F) AAAGTCGACAATTGATGGAAATAAAATTTCGC
  • SEQ ID NO: 20 Primer for amplifying promoter fragment of Arabidopsis gene At5g46950 At5g46950-R (At5g46950-BJ60-R) AAACTCGAGCAAGAAAAGAGAGAAGATCACCAATGAAAC
  • SEQ ID NO: 21 Primer for amplifying promoter fragment of Arabidopsis gene At2g41000 At2g41000-F TCTGGTGTATGCTGTTGTCA
  • SEQ ID NO: 22 Primer for amplifying promoter fragment of Arabidopsis gene At2g41000 At2g41000-R CCATGGGTTGATTGCCCTTTTTTATG
  • SEQ ID NO: 23 Primer for amplifying promoter fragment of Arabidopsis gene At1g62060 At1g62060-F TTTGTCGACTGTTTTTGGTTTATAATTTATTTTTAAGATTTTGG
  • SEQ ID NO: 24 Primer for amplifying promoter fragment of Arabidopsis gene At1g62060 At1g62060-R TTTCTCGAGTTTGGACACTTCTTCGACCTGCCTTGC
  • SEQ ID NO: 25 Primer for amplifying promoter fragment of Arabidopsis gene At1g62000 At1g62000-F AAAGTCGACATCAGAAGGCACACCTACAAC
  • SEQ ID NO: 26 Primer for amplifying promoter fragment of Arabidopsis gene At1g62000 At1g62000-R AAACTCGAGAAACTTTGTGGCATTCAT
  • At5g07210-F (At5g07210-BJ60-F) TTTGTCGACATGGGATGATCTCCGTTACC
  • Primer for amplifying promoter fragment of Arabidopsis gene At5g07210 At5g07210F tttgtcgacATGGGATGATCTCCGTTACC
  • At1g55600R aaacccgggCTACATGTCGACACCAAACTTAAAATACAGGCG
  • At4g31900F aaaggtaccATGGCTAATCTGTTGCAAAGGC
  • SEQ ID NO: 51 Primer for amplifying cDNA of Arabidopsis gene At5g14960
  • At5g14960R aaacccgggTCATTTCTCCCGACCAAACTCTTC
  • At5g39260F aaaggtaccATGAAATTGCTAGAAAAAATGAC
  • At1g01530R aaacccgggCTAATAGTAACGAGCCCAATAC
  • At1g65330F aaaggtaccATGAGGGGGAAGATGAAG
  • SEQ ID NO: 60 Primer for amplifying cDNA of Arabidopsis gene At4g37750 At4g37750R aaacccgggTCAAGAATCAGCCCAAGCAGCG
  • Rice promoter sequence (BAD07661 within AP004096) corresponding to promoter of Arabidopsis gene At5g46950
  • Rice promoter sequence (BAD53821 within AP003762) corresponding to promoter of Arabidopsis gene At1g14520
  • Rice promoter sequence (ABA98883 within DP000011) corresponding to promoter of Arabidopsis gene At2g38900
  • Rice promoter sequence (BAD72541 within AP007226) corresponding to promoter of Arabidopsis gene At5g07210
  • SEQ ID NO: 70 Rice promoter sequence (BAD81125 within AP000837) corresponding to promoter of Arabidopsis gene At5g39260
  • Arabidopsis gene At5g46950 (accession numbers NC_003076 and NM_124066) promoter sequence with SalI and KpnI linkers (in italics).
  • proAt5g46950/2 (SalI/KpnI) tttgrtcgacaattgatggaaataaaatttcgctttacatatatgatctctttttaattttttggtactat atacatataatatttactatttagaaacctttacaagaaaacaaactatcttttattaaagtagcaaatac
  • SEQ ID NO: 72 Arabidopsis gene At5g07210 (accession numbers NC_003076 and
  • Brassica promoter sequence (within BH965409) corresponding to promoter sequence of Arabidopsis gene At5g46950 NB: this is the reverse complement of the promoter sequence
  • Brassica promoter sequence (within AC189499) corresponding to promoter sequence of Arabidopsis gene At1g14520 NB: this is the reverse complement of the promoter sequence
  • Brassica promoter sequence (within AC189499) corresponding to promoter sequence of Arabidopsis gene At1g14520
  • Brassica promoter sequence (within BH995823) corresponding to promoter sequence of Arabidopsis gene At2g38900
  • Brassica promoter sequence (within AC183495) corresponding to promoter sequence of Arabidopsis gene At2g41000
  • Brassica promoter sequence (within AC189364) corresponding to promoter sequence of Arabidopsis gene At2g41000
  • Brassica promoter sequence (within AC189487) corresponding to promoter sequence of Arabidopsis gene At5g39260
  • Brassica promoter sequence (within AC189214) corresponding to promoter sequence of Arabidopsis gene At1g62080
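
Two of the Brassica entries in this listing are flagged as the reverse complement of the promoter sequence. As a minimal sketch (not part of the patent), the helper below shows how such an entry could be returned to promoter orientation before it is aligned with the corresponding Arabidopsis promoter. The function name `reverse_complement` and the test strings are illustrative assumptions; the second string is simply the first nine bases of proAt5g46950 as printed in this listing.

```python
# Illustrative helper (an assumption, not taken from the patent):
# reverse-complement a DNA string while preserving letter case.
_COMPLEMENT = str.maketrans("acgtnACGTN", "tgcanTGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of seq, ignoring embedded whitespace."""
    # Sequence listings are often wrapped with spaces or line breaks.
    cleaned = "".join(seq.split())
    unknown = set(cleaned) - set("acgtnACGTN")
    if unknown:
        raise ValueError(f"unexpected characters: {sorted(unknown)}")
    # Complement every base, then reverse the string.
    return cleaned.translate(_COMPLEMENT)[::-1]


# The SalI recognition site GTCGAC is palindromic, so it maps to itself.
print(reverse_complement("gtcgac"))
# First nine bases of proAt5g46950 as printed in the listing.
print(reverse_complement("aattgatgg"))
```

Applying the function twice returns the cleaned input, which gives a quick sanity check when reorienting a longer entry.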

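The silique-preferred criterion quoted earlier (absent in root, shoot, or leaf, with the minimum silique signal at least 1.5-fold greater than the maximum signal in those tissues) can be sketched as a simple filter. This is a hedged illustration rather than the authors' analysis code: the `detection_threshold` used to call a signal absent, the tissue keys, and all numeric values are hypothetical, and "absent" is assumed to mean below the threshold in every replicate of all three vegetative tissues.

```python
from typing import Mapping, Sequence

# Vegetative tissues named in the text; the dictionary keys are assumptions.
VEGETATIVE = ("root", "shoot", "leaf")


def is_silique_preferred(
    signals: Mapping[str, Sequence[float]],
    detection_threshold: float,
    fold: float = 1.5,
) -> bool:
    """Apply the two stated conditions to replicate signals for one cDNA."""
    vegetative = [v for tissue in VEGETATIVE for v in signals[tissue]]
    # Condition 1: signal absent (below the assumed threshold) in root, shoot and leaf.
    if any(v >= detection_threshold for v in vegetative):
        return False
    # Condition 2: minimum silique signal is at least `fold` times the
    # maximum vegetative signal.
    return min(signals["silique"]) >= fold * max(vegetative)


# Hypothetical duplicate hybridization signals (arbitrary units).
example = {
    "root": [10.0, 12.0],
    "shoot": [8.0, 9.0],
    "leaf": [11.0, 7.0],
    "silique": [40.0, 25.0],
}
print(is_silique_preferred(example, detection_threshold=20.0))
```

Using the minimum over silique replicates and the maximum over vegetative replicates makes the comparison conservative: a single weak silique replicate or a single strong vegetative replicate is enough to exclude a cDNA.
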
Abstract

The present invention relates to materials and methods for expressing a gene of interest in a particular tissue and for modifying a plant phenotype through preferential expression of a gene of interest in a particular tissue. In particular, the present invention relates to materials and methods for expressing a gene of interest in the endosperm and for modifying the seed and/or fruit size of a plant, as well as to plant promoters active in the endosperm and their uses.
PCT/GB2007/001051 2006-03-24 2007-03-23 Plant promoters, coding sequences and their uses Ceased WO2007110600A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0606033A GB0606033D0 (en) 2006-03-24 2006-03-24 Plant Promoters, Coding Sequence And Their Uses
GB0606033.9 2006-03-24

Publications (2)

Publication Number Publication Date
WO2007110600A2 true WO2007110600A2 (fr) 2007-10-04
WO2007110600A3 WO2007110600A3 (fr) 2008-03-20

Family

ID=36384199

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2007/001051 Ceased WO2007110600A2 (fr) 2006-03-24 2007-03-23 Plant promoters, coding sequences and their uses

Country Status (2)

Country Link
GB (1) GB0606033D0 (fr)
WO (1) WO2007110600A2 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2314703A1 (fr) * 2009-10-23 2011-04-27 Friedrich-Alexander-Universität Erlangen-Nürnberg Tissue-specific plant promoter for the endosperm and embryo of a plant seed
WO2017178318A1 (fr) 2016-04-11 2017-10-19 Bayer Cropscience Nv Seed-specific and endosperm-preferential promoters and uses thereof
WO2017178322A1 (fr) 2016-04-11 2017-10-19 Bayer Cropscience Nv Seed-specific and endosperm-preferential promoters and uses thereof
US11124801B2 (en) 2018-04-18 2021-09-21 Pioneer Hi-Bred International, Inc. Genes, constructs and maize event DP-202216-6
US12371702B2 (en) 2018-04-18 2025-07-29 Pioneer Hi-Bred International, Inc. Improving agronomic characteristics in maize by modification of endogenous mads box transcription factors

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU3781897A (en) * 1996-08-30 1998-03-19 Danny N. P. Doan Endosperm and nucellus specific genes, promoters and uses thereof
FR2799203B1 (fr) * 1999-10-01 2003-03-21 Biogemma Fr Promoters specific to the albumen of plant seeds

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2314703A1 (fr) * 2009-10-23 2011-04-27 Friedrich-Alexander-Universität Erlangen-Nürnberg Tissue-specific plant promoter for the endosperm and embryo of a plant seed
WO2017178318A1 (fr) 2016-04-11 2017-10-19 Bayer Cropscience Nv Seed-specific and endosperm-preferential promoters and uses thereof
WO2017178322A1 (fr) 2016-04-11 2017-10-19 Bayer Cropscience Nv Seed-specific and endosperm-preferential promoters and uses thereof
US10975380B2 (en) 2016-04-11 2021-04-13 Basf Agricultural Solutions Seed, Us Llc Seed-specific and endosperm-preferental promoters and uses thereof
US11124801B2 (en) 2018-04-18 2021-09-21 Pioneer Hi-Bred International, Inc. Genes, constructs and maize event DP-202216-6
US11421242B2 (en) 2018-04-18 2022-08-23 Pioneer Hi-Bred International, Inc. Genes, constructs and maize event DP-202216-6
US12234470B2 (en) 2018-04-18 2025-02-25 Pioneer Hi-Bred International, Inc. Genes, constructs and maize event DP-202216-6
US12371702B2 (en) 2018-04-18 2025-07-29 Pioneer Hi-Bred International, Inc. Improving agronomic characteristics in maize by modification of endogenous mads box transcription factors

Also Published As

Publication number Publication date
WO2007110600A3 (fr) 2008-03-20
GB0606033D0 (en) 2006-05-03

Similar Documents

Publication Publication Date Title
US6225529B1 (en) Seed-preferred promoters
Zhang et al. Characterization of Arabidopsis MYB transcription factor gene AtMYB17 and its possible regulation by LEAFY and AGL15
CA2401858A1 (fr) Embryo sac-specific genes
CN102933712A (zh) Regulatory sequences for modulating transgene expression in plants
Chen et al. Isolation and heterologous transformation analysis of a pollen-specific promoter from wheat (Triticum aestivum L.)
US9139839B2 (en) Plant egg cell transcriptional control sequences
US7659448B2 (en) Plant regulatory sequences for selective control of gene expression
WO2007110600A2 (fr) Promoteurs végétaux, séquences codantes et leurs utilisations
US20020152500A1 (en) Tissue-preferred promoter from maize
Federico et al. The complex developmental expression of a novel stress-responsive barley Ltp gene is determined by a shortened promoter sequence
Kusano et al. Molecular characterization of ONAC300, a novel NAC gene specifically expressed at early stages in various developing tissues of rice
CA2880151A1 (fr) Soybean CCP1 promoter and its use in constitutive expression of transgenic genes in plants
Tiwari et al. Proliferative phase endosperm promoters from Arabidopsis thaliana
US8044263B2 (en) Cytokinin oxidase promoter from maize
WO2014025858A1 (fr) Soybean ADF1 promoter and its use in constitutive expression of transgenic genes in plants
Sakakibara et al. Identification of the gene structure and promoter region of H+-translocating inorganic pyrophosphatase in rice (Oryza sativa L.)
US20040187176A1 (en) Methods for improving plant agronomical traits by altering the expression or activity of plant G-protein alpha and beta subunits
WO2016044090A1 (fr) Soybean IF5A promoter and its use in constitutive expression of transgenic genes in plants
CN1869233A (zh) Gene promoter derived from cotton and its use
CA2906461A1 (fr) Soybean AGB1 promoter and its use in tissue-specific expression of transgenic genes in plants
US9556447B2 (en) Soybean HRP1 promoter and its use in tissue-specific expression of transgenic genes in plants
WO2008052285A1 (fr) Transcriptional regulatory sequences

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07732113

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 07732113

Country of ref document: EP

Kind code of ref document: A2