WO2015117265A1 - LONG NON-CODING RNAs FOR MODULATING PHOSPHATE USE EFFICIENCY IN PLANTS - Google Patents
LONG NON-CODING RNAs FOR MODULATING PHOSPHATE USE EFFICIENCY IN PLANTS
- Publication number
- WO2015117265A1 (application PCT/CN2014/071873)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- seq
- lncrna
- molecule
- rna
- plant
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8241—Phenotypically and genetically modified plants via recombinant DNA technology
- C12N15/8261—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
- C12N15/8271—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8216—Methods for controlling, regulating or enhancing expression of transgenes in plant cells
- C12N15/8218—Antisense, co-suppression, viral induced gene silencing [VIGS], post-transcriptional induced gene silencing [PTGS]
Definitions
- the present invention relates to the fields of agriculture and molecular biology. More particularly, the invention is directed at methods for identification of long non-coding RNAs in plants, particularly non-polyadenylated long non-coding RNAs which are differentially expressed under phosphate starvation conditions, such as long non-coding RNAs whose transcription is upregulated in plants during phosphate starvation. Modulation of the expression of such long non-coding RNAs can be used to alter phosphate use efficiency in plants.
- lncRNAs have been found to fulfill essential roles in different species [8]. Because their expression levels and evolutionary conservation are much lower than those of protein-coding mRNAs [9, 10], and because their functions were uncertain, they were initially considered transcriptional noise [11, 12]. This hypothesis was soon challenged by the discovery that many lncRNAs are detected during specific developmental stages [13, 14]. Although most lncRNAs have not been well characterized because they are difficult to clone, some have been studied extensively and are well known, such as Xist, an inactive X-specific lncRNA [15, 16], and COLDAIR, an intronic lncRNA [17].
- Sequencing of small RNA libraries from stressed A. thaliana seedlings uncovered 26 new miRNAs and 102 novel endogenous small RNAs, suggesting that they have roles in stress responses [3].
- nutrient homeostasis e.g. phosphorus balance
- Pi-dependent miR-399 targets and cleaves PHO2 mRNA, which encodes the E2 conjugase that decreases the Pi content and weakens Pi remobilization [5].
- IPS1 INDUCED BY PHOSPHATE STARVATION 1
- AT4 act as target mimics to inhibit miRNA-399 activity [6, 7].
- RNA-seq RNA-sequencing
- the invention provides long non-coding RNA (lncRNA) molecules which are differentially expressed in a plant grown under phosphate starvation conditions when compared to a plant grown under normal conditions with adequate phosphate supply and which are unpolyadenylated, such as lncRNA molecules comprising a nucleotide sequence having at least 80% identity with any one of SEQ ID No. 4; SEQ ID No. 5; SEQ ID No. 6; SEQ ID No. 15; SEQ ID No. 18; SEQ ID No. 26; SEQ ID No. 28; SEQ ID No. 32; SEQ ID No. 34; SEQ ID No. 36; SEQ ID No. 40; SEQ ID No. 42; SEQ ID No. 47; SEQ ID No.
- the invention provides single-stranded DNA molecules which, when transcribed, yield RNA molecules comprising such lncRNA, the complement of such single-stranded DNA molecules, or double-stranded DNA molecules.
- the expression of the lncRNA molecule may be upregulated, as is the case for lncRNA molecules comprising a nucleotide sequence having at least 80% identity with any one of SEQ ID Nos. 34, 107, 139, 194 or 195.
- the lncRNA molecule may comprise at least two conserved hairpin loops, preferably further comprising two co-variance base pairings, as is the case for lncRNA molecules comprising a nucleotide sequence having at least 80% identity with SEQ ID No. 149.
- the invention provides use of such lncRNA molecules, or DNA molecules which when transcribed yield such lncRNA molecules, to modulate inorganic phosphate use efficiency in a plant.
- the invention provides the use of a lncRNA molecule which is differentially expressed in a plant grown under phosphate starvation conditions when compared to a plant grown under normal conditions with adequate phosphate supply, or a DNA molecule which when transcribed yields such lncRNA molecule, or the complement of the DNA molecule, to modulate inorganic phosphate use efficiency in a plant.
- This may also comprise lncRNA molecules which are polyadenylated, such as lncRNA molecules comprising a nucleotide sequence having at least 80% sequence identity to SEQ ID No. 216; SEQ ID No. 235; SEQ ID No. 240; SEQ ID No. 250; SEQ
- the lncRNA molecule may also comprise a nucleotide sequence of any one of Tables 5 to 10.
- the invention further provides a method for modulating the phosphate use efficiency in a plant comprising the step of increasing or decreasing the transcription and/or concentration of an lncRNA molecule as herein described in a plant cell.
- the method may comprise a step of decreasing transcription and/or concentration of an lncRNA molecule, which may be achieved by mutating the DNA region from which the lncRNA molecule is transcribed, or by expression of an inhibitory RNA molecule specifically recognizing the lncRNA molecule.
- the method may comprise a step of increasing transcription and/or concentration of an lncRNA molecule, which may be achieved by mutating the DNA region from which the lncRNA molecule is transcribed, or by introducing a recombinant gene comprising: a. a plant-expressible promoter operably linked to b. a DNA region which when transcribed yields an RNA molecule comprising or consisting of said lncRNA molecule; and optionally c. an appropriate transcription termination region and/or polyadenylation region.
- the invention also provides modified plant cells comprising a modulated concentration or transcription of an lncRNA molecule as herein described, when compared to an unmodified plant cell, and modified plant, or plant part, tissue or organ comprising a multitude of or consisting essentially of modified plant cells, as well as seeds of such plants comprising the mutation, genetic alteration or recombinant leading to or resulting in the modulated concentration or transcription of an lncRNA molecule as herein described.
- the invention also provides a method for isolating further lncRNA molecules involved in phosphate use efficiency in a plant comprising the step of identifying a nucleotide sequence having a degree of homology to the lncRNA molecules as herein described and isolating or synthesizing such RNA molecule comprising or consisting of such nucleotide sequences.
- Figure 1 Flowchart of RNA sequencing and data processing suitable to identify non-polyadenylated long non-coding RNAs.
- Panel (A) Purification of non-polyA RNAs and total RNAs.
- Panel (B) Construction of strand-specific cDNA libraries and sequencing; the short and long read-length workflows differ at the size selection and Illumina sequencing steps.
- Figure 2 Comparison of total and non-polyA RNA-seq with short and long read length.
- Panel (A) Length distribution of clean-read assembled fragments from three sequencing data.
- Panel (B) Length distribution of novel lncRNAs from three sequencing data.
- Panel (C) Numbers of novel lncRNAs at different expressed values (RPKM) from non-polyA RNA-seq (short reads), gray bars. Overlap ratio with novel lncRNAs from total RNA-seq, black line.
- Panel (E) Validation of novel lncRNA candidates from non-polyA RNA-seq by RT-PCR.
- Non-polyA and polyA candidates were amplified from two cDNA libraries: non-polyA RNAs and polyA RNAs; RT(-), negative control without reverse transcriptase; Left box, polyA RNA (control); Right box, non-polyA RNA.
- FIG. 3 Identification of polyA and non-polyA lncRNAs associated with inorganic phosphate (Pi) starvation.
- Panel (A) Flowchart of polyA and non-polyA lncRNA prediction and characterization.
- Panel (B) Differential expression of polyA and non-polyA lncRNAs between plants grown under low Pi conditions and the control plants.
- Panel (C) Comparison of the proportion of differentially expressed polyA and non-polyA lncRNAs. !, χ² test p-value < 0.01; Gray box, differentially expressed polyA lncRNAs; Dotted box, differentially expressed polyA lncRNAs; gray plus dotted boxes, all identified polyA lncRNAs.
- FIG. 4 Characterization of identified polyA and non-polyA lncRNAs associated with Pi starvation.
- Panel (A) DNA conservation of novel lncRNAs in multiple plant species. Compared to coding genes and unexpressed intergenic control regions, lncRNAs show moderate DNA conservation.
- Panel (B) Comparison of DNA and protein conservation, RNA structure conservation and structure stability (free energy) of lncRNAs and coding genes. RNA structure conservation was normalized by their DNA conservation scores.
- Panel (C) GO enrichment of non-polyA lncRNA co-located (on cis-regulatory region) with coding genes. Pi-starvation associated GO terms are shown.
- FIG. 5 Validation of novel non-polyA lncRNAs in Pi starvation.
- Panel (C) An example of conserved secondary structure of the 5' end of a non-polyA lncRNA (lnc-149). Two conserved short hairpins are displayed in the top and left region, and one variable long hairpin on the right. Covariance base-pairings are encircled with dark dots and conserved loops are indicated by darker boxes (darkest box means conserved in three genomes and lightest box in two genomes). The multiple alignments from lnc-146 as can be found in three plant genomes (A. thaliana SEQ ID No. 1328; A. lyrata SEQ ID No. 1329; and T. halophila SEQ ID No. 1330) are also provided to illustrate the conserved bases and loops in the primary sequences.
- Figure 6 Illustration of the read quality by FastQC. Raw read quality is represented by uploading FastQ files of sequencing reads to the FastQC program (see Example 1/Methods).
- Figure 7 Novel non-polyA and polyA lncRNAs (both total and differentially expressed) were sub-typed as Transgenic elements, pseudogene, antisense, ambiguous and intergenic by aligning to TAIR10 genome (see Example 1/Methods).
- Figure 8 GO enrichment of polyA lncRNAs co-located (on the cis-regulatory region) with coding genes.
- the current invention is based upon the identification of long non-coding RNAs, including novel unpolyadenylated long non-coding RNAs, which are differentially expressed upon growth of plants under conditions of inorganic phosphate starvation.
- the lncRNAs have been detected by processing raw sequencing data based upon an integrative computational model. Validation by RT-PCR indicates that the lncRNAs are bona fide transcripts rather than transcriptional noise.
- the invention provides long non-coding RNA (lncRNA) molecules which are differentially expressed in a plant grown under phosphate starvation conditions when compared to a plant grown under normal conditions with adequate phosphate supply and which are unpolyadenylated.
- "unpolyadenylated", "non-polyadenylated" or "non-polyA" lncRNA molecules refers to lncRNA molecules which usually lack a polyA-tail.
- a lncRNA molecule may be classified as unpolyadenylated when, upon RNA sequencing according to methods described herein, particularly in Example 1, the maximum expression value (Reads per Kilobase per Million mapped reads, or RPKM value) of each lncRNA, determined separately in the polyA reads data and the non-polyA reads data, is four times greater in the non-polyA data than in the polyA data.
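The RPKM value used in this criterion can be computed from three quantities: the number of reads assigned to a transcript, the transcript length, and the total number of mapped reads in the library. The following is a minimal Python sketch of the standard RPKM formula; the function name `rpkm` and the example numbers are illustrative assumptions and are not taken from Example 1.

```python
def rpkm(read_count: float, transcript_length_nt: int, total_mapped_reads: int) -> float:
    """Reads per Kilobase of transcript per Million mapped reads.

    Standard definition: reads / (length in kb * mapped reads in millions),
    written here as reads * 1e9 / (length in nt * total mapped reads).
    """
    if transcript_length_nt <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_nt * total_mapped_reads)


# Hypothetical numbers: 500 reads on a 2,000-nt transcript in a library
# of 10 million mapped reads.
example_value = rpkm(500, 2000, 10_000_000)
```

Computing the value separately for the polyA library and the non-polyA library gives the two numbers that the four-fold comparison is applied to.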
- a lncRNA molecule is considered as "differentially expressed" upon Pi starvation compared to normally grown plants when read counts assigned to lncRNAs, e.g. using the DEGseq package [30], normalized against the total mapped reads and analyzed using e.g. the MA-plot based method with random sampling with a p-value of 0.05, have a fold-change of 2 or more using RNA of the Pi-starved plants compared to normally grown plants.
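The fold-change part of this criterion can be illustrated with a short Python sketch. It only normalizes read counts against the total mapped reads and applies the two-fold cutoff; it does not reproduce the DEGseq MA-plot based method or its p-value, which would have to come from that package. The function names are hypothetical, and treating a two-fold decrease as meeting the cutoff is an assumption made here for symmetry.

```python
def counts_per_million(read_count: float, total_mapped_reads: int) -> float:
    """Normalize a raw read count against the library's total mapped reads."""
    if total_mapped_reads <= 0:
        raise ValueError("total mapped reads must be positive")
    return read_count * 1e6 / total_mapped_reads


def fold_change(count_starved: float, total_starved: int,
                count_control: float, total_control: int) -> float:
    """Ratio of normalized expression: Pi-starved over normally grown."""
    control = counts_per_million(count_control, total_control)
    if control == 0:
        raise ValueError("control expression is zero; fold change is undefined")
    return counts_per_million(count_starved, total_starved) / control


def passes_fold_change_cutoff(fc: float, min_fold: float = 2.0) -> bool:
    """True for a change of at least min_fold in either direction (assumption)."""
    return fc >= min_fold or fc <= 1.0 / min_fold
```

For example, 400 reads in a 20-million-read starved library against 100 reads in a 10-million-read control library gives a normalized fold change of 2, which meets the cutoff.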
- Table 4 lists long non-coding unpolyadenylated RNAs identified in Arabidopsis thaliana and indicates whether these molecules are differentially expressed under Pi starvation conditions.
- the invention provides long non-coding RNA molecules comprising a nucleotide sequence which has at least 80% identity with any one of SEQ ID No. 4; SEQ ID No. 5; SEQ ID No. 6; SEQ ID No. 15; SEQ ID No. 18; SEQ ID No. 26; SEQ ID No. 28; SEQ ID No. 32; SEQ ID No. 34; SEQ ID No. 36; SEQ ID No. 40; SEQ ID No. 42; SEQ ID No. 47; SEQ ID No. 52; SEQ ID No. 55; SEQ ID No. 57; SEQ ID No. 59; SEQ ID No. 60; SEQ ID No. 62; SEQ ID No. 64; SEQ ID No.
- sequence identity may be higher than 80%, such as at least 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or the sequence may be identical with any one of SEQ ID No. 4; SEQ ID No. 5; SEQ ID No. 6; SEQ ID No. 15; SEQ ID No. 18; SEQ ID No. 26; SEQ ID No. 28; SEQ ID No. 32; SEQ ID No. 34; SEQ ID No. 36; SEQ ID No. 40; SEQ ID No. 42; SEQ ID No. 47; SEQ ID No. 52; SEQ ID No. 55; SEQ ID No.
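As an illustration of how a percent identity threshold such as 80% could be checked, the following Python sketch computes identity over two sequences that have already been aligned (gaps written as "-"). The alignment step itself is not shown, and the choice of denominator (all aligned columns that are not gap-only) is an assumption; this section does not state which convention the identity figure follows. The function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    U and T are treated as equivalent so RNA and DNA spellings compare equal.
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    a = aligned_a.upper().replace("U", "T")
    b = aligned_b.upper().replace("U", "T")
    matches = 0
    compared = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, minimum: float = 80.0) -> bool:
    return percent_identity(aligned_a, aligned_b) >= minimum
```

With this convention, two aligned 10-nt toy sequences that differ at one position are 90% identical and would meet the 80% threshold.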
- the invention further provides DNA molecules which when transcribed yield an RNA molecule comprising said lncRNAs as herein described. Also provided by the invention are the complements of such DNA molecules. DNA molecules may be single-stranded or double-stranded. Such DNA molecules may be referred to herein as encoding the lncRNA molecules.
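The relationship between an lncRNA sequence and the DNA strands that encode it can be sketched in a few lines of Python. The sense strand has the same sequence as the RNA with T in place of U, and the complementary (template) strand is its reverse complement. The function name and the six-nucleotide toy sequence are hypothetical and do not correspond to any SEQ ID No.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def encoding_dna(lncrna_seq: str) -> tuple[str, str]:
    """Return (sense_strand, template_strand) DNA for an RNA sequence.

    Both strands are written 5' to 3'. Transcribing from the template
    strand yields an RNA with the sequence of the sense strand (U for T).
    """
    rna = lncrna_seq.upper()
    if set(rna) - set("ACGU"):
        raise ValueError("RNA sequence may only contain A, C, G and U")
    sense = rna.replace("U", "T")
    template = sense.translate(_COMPLEMENT)[::-1]
    return sense, template


# Toy example: a hypothetical 6-nt RNA.
sense_strand, template_strand = encoding_dna("AUGGCU")
```

A double-stranded DNA molecule is then simply the pairing of the two returned strands.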
- RNA molecules or the encoding DNA molecules whose expression is upregulated during growth under inorganic phosphate starvation conditions such as lnc-34, lnc-107, lnc-139, lnc-194 or lnc-195.
- the invention provides lncRNA molecules comprising a nucleotide sequence which has at least 80% identity with any one of SEQ ID No. 34, SEQ ID No. 107, SEQ ID No. 139, SEQ ID No. 194 or SEQ ID No.
- RNA molecules of particular interest are those long non-coding, unpolyadenylated RNA molecules, differentially expressed under Pi starvation conditions, which comprise a secondary structure, particularly at least two conserved hairpin loops, preferably further comprising two co-variance base pairings, such as lnc-149 (SEQ ID No. 149).
- These non-polyA lncRNA molecules, or DNA molecules encoding such non-polyA lncRNA molecules, may be used to modulate inorganic phosphate use efficiency in a plant.
- a lncRNA molecule may be classified as polyadenylated (polyA) when, upon RNA sequencing according to methods described herein, particularly in Example 1, the maximum expression value (Reads per Kilobase per Million mapped reads, or RPKM value) of each lncRNA, determined separately in the polyA reads data and the non-polyA reads data, is four times greater in the polyA data than in the non-polyA data, or is only identified in the polyA data.
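The two four-fold rules (non-polyA and polyA) can be combined into a single classification step. The Python sketch below is an illustration under stated assumptions: "four times greater" is read as "at least four times", a transcript detected in only one library is assigned to that library's class, and anything in between is returned as "unclassified" because this section does not define a category for it. The function name is hypothetical.

```python
def classify_polyadenylation(max_rpkm_polya: float,
                             max_rpkm_nonpolya: float,
                             fold: float = 4.0) -> str:
    """Classify an lncRNA from its maximum RPKM in each library.

    Returns "non-polyA", "polyA" or "unclassified".
    """
    if max_rpkm_polya < 0 or max_rpkm_nonpolya < 0:
        raise ValueError("RPKM values cannot be negative")
    if max_rpkm_polya == 0 and max_rpkm_nonpolya == 0:
        return "unclassified"  # not detected in either library
    if max_rpkm_nonpolya >= fold * max_rpkm_polya:
        return "non-polyA"
    if max_rpkm_polya >= fold * max_rpkm_nonpolya:
        return "polyA"  # also covers "only identified in the polyA data"
    return "unclassified"
```

For example, a transcript with a maximum RPKM of 2 in the polyA data and 10 in the non-polyA data would be called non-polyA, while values of 3 and 2 would fall under neither rule.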
- poly A polyadenylated
- Table 4 also lists long non-coding polyadenylated RNAs identified in Arabidopsis thaliana and indicates whether these molecules are differentially expressed under Pi starvation conditions.
- the invention also relates to the use of a lncRNA molecule, whether polyadenylated or unpolyadenylated or "bimorphic", which is differentially expressed in a plant grown under phosphate starvation conditions when compared to a plant grown under normal conditions with adequate phosphate supply, or a DNA molecule which when transcribed yields such lncRNA molecule, or the complement of said DNA molecule, to modulate inorganic phosphate use efficiency in a plant.
- the invention relates to the use of lncRNA molecules (or encoding DNA molecules) wherein a polyadenylated lncRNA molecule is used comprising a nucleotide sequence having at least 80% sequence identity, or at least 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% sequence identity or being identical with any one of SEQ ID No. 216; SEQ ID No. SEQ ID No. 240; SEQ ID No. 250; SEQ ID No. 266; SEQ ID No. 267; SEQ ID No.
- SEQ ID No. 571; SEQ ID No. 572; SEQ ID No. 600; SEQ ID No. 601; SEQ ID No.
- SEQ ID No. 998; SEQ ID No. 1020; SEQ ID No. 1025; SEQ ID No. 1031; SEQ ID No. 1033; SEQ ID No. 1047; SEQ ID No. 1048; SEQ ID No. 1068 or SEQ ID No. 1080.
- lncRNA molecules for phosphate use modulation in a plant comprising a nucleotide sequence of any one of Tables 5 to 10, listing similar sequences found in other plant species, including rice, soybean, wheat, millet, sorghum or corn/maize.
- the invention thus provides a method of modulating the phosphate use efficiency in a plant comprising the step of increasing or decreasing the transcription and/or concentration of an lncRNA molecule as herein described, in a plant cell.
- mutagenesis refers to the process in which plant cells (e.g., seed or tissues, such as pollen, etc.) are exposed one or more times to a mutagenic agent, such as a chemical substance (such as ethylmethylsulfonate (EMS), ethylnitrosourea (ENU), etc.) or ionizing radiation (neutrons (such as in fast neutron mutagenesis, etc.), gamma rays (such as that supplied by a Cobalt 60 source), X-rays, etc.), or a combination of the foregoing.
- mutations created by irradiation are often large deletions or other gross lesions such as translocations or complex rearrangements
- mutations created by chemical mutagens are often more discrete lesions such as point mutations.
- EMS alkylates guanine bases, which results in base mispairing: an alkylated guanine will pair with a thymine base, resulting primarily in G/C to A/T transitions.
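The G/C to A/T transition described for EMS can be modelled on a single DNA strand: a G is replaced by A, and a C is replaced by T (the same event seen from the complementary strand). The Python sketch below applies one such transition at a chosen position. It is a toy model of the mutation outcome only, not of mutagenesis itself; the function name and the example sequence are hypothetical.

```python
def ems_transition(dna: str, position: int) -> str:
    """Apply one EMS-type transition (G to A, or C to T) at a 0-based position."""
    seq = dna.upper()
    base = seq[position]
    if base == "G":
        new_base = "A"
    elif base == "C":
        new_base = "T"  # G-to-A on the opposite strand reads as C-to-T here
    else:
        raise ValueError("EMS-type transitions in this model only affect G or C")
    return seq[:position] + new_base + seq[position + 1:]


# Toy example: mutate the G at index 2 of a hypothetical sequence.
mutated = ems_transition("ATGCC", 2)
```

Because every change is a single-base substitution, the sketch also reflects why chemically induced lesions are described as more discrete than radiation-induced deletions.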
- plants can be regenerated from the treated cells using known techniques. For instance, the resulting seeds may be planted in accordance with conventional growing procedures and following self-pollination seed is formed on the plants. Alternatively, doubled haploid plantlets may be extracted to immediately form homozygous plants.
- Deleteagene™ (Delete-a-gene; Li et al., 2001, Plant J 27: 235-242)
- PCR polymerase chain reaction
- the lncRNA expression may be downregulated by introducing a chimeric DNA construct which yields a sense RNA molecule capable of down-regulating lncRNA expression by co-suppression.
- the transcribed DNA region will yield upon transcription a so-called sense RNA molecule capable of reducing the concentration of the lncRNA molecule in the target plant or plant cell in a transcriptional or post-transcriptional manner.
- the transcribed DNA region (and resulting RNA molecule) comprises at least 20 consecutive nucleotides having at least 95% sequence identity to the nucleotide sequence of the lncRNA-encoding DNA region present in the plant cell or plant.
- the lncRNA expression may be downregulated by introducing a chimeric DNA construct which yields an antisense RNA molecule capable of down-regulating lncRNA expression by co-suppression.
- the transcribed DNA region will yield upon transcription a so-called antisense RNA molecule capable of reducing the concentration of the lncRNA molecule in the target plant or plant cell in a transcriptional or post-transcriptional manner.
- the transcribed DNA region (and resulting RNA molecule) comprises at least 20 consecutive nucleotides having at least 95% sequence identity to the complement of the nucleotide sequence of the lncRNA molecule in the plant cell or plant.
- the minimum nucleotide sequence of the antisense or sense RNA region of about 20 nt of the lncRNA molecule may be comprised within a larger RNA molecule, varying in size from 20 nt to a length equal to the size of the target lncRNA molecule.
- the mentioned antisense or sense nucleotide regions may thus be from about 21 nt to about 5000 nt long, such as 21 nt, 40 nt, 50 nt, 100 nt, 200 nt, 300 nt, 500 nt or 1000 nt in length.
- the nucleotide sequence of the used inhibitory RNA molecule or the encoding region of the transgene is completely identical or complementary to the lncRNA molecule, the expression of which is targeted to be reduced in the plant cell.
- the sense or antisense regions may have an overall sequence identity of about 40 % or 50 % or 60 % or 70 % or 80 % or 90 % or 100 % to the nucleotide sequence of the endogenous parpl gene or the complement thereof.
- antisense or sense regions should comprise a nucleotide sequence of 20 consecutive nucleotides having about 95 to about 100% sequence identity to the nucleotide sequence of the lncRNA molecule.
- the stretch of about 95 to about 100% sequence identity may be about 50, 75 or 100 nt.
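The requirement that a sense or antisense region contain a stretch of 20 consecutive nucleotides with about 95 to 100% identity to the target (or to its complement) can be checked with a sliding window. The Python sketch below is a simplified, ungapped comparison; all function names and the toy target sequence used in the usage note are hypothetical, and a real design would normally rely on a proper alignment tool.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def _as_dna(seq: str) -> str:
    """Upper-case and spell RNA as DNA so U and T compare equal."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq: str) -> str:
    return _as_dna(seq).translate(_COMPLEMENT)[::-1]


def best_window_identity(region: str, target: str, window: int = 20) -> float:
    """Highest ungapped identity between any window of region and of target."""
    region, target = _as_dna(region), _as_dna(target)
    best = 0.0
    for i in range(len(region) - window + 1):
        for j in range(len(target) - window + 1):
            matches = sum(a == b for a, b in
                          zip(region[i:i + window], target[j:j + window]))
            best = max(best, matches / window)
    return best


def has_targeting_stretch(region: str, target: str, antisense: bool = False,
                          window: int = 20, min_identity: float = 0.95) -> bool:
    """True if region has a window meeting min_identity against the target.

    For an antisense region the comparison is made against the target's
    complement, done here by reverse-complementing the region first.
    """
    query = reverse_complement(region) if antisense else region
    return best_window_identity(query, target, window) >= min_identity
```

With the default settings, a 20-nt window may carry at most one mismatch (19 of 20 matching positions is 95%), and any region shorter than the window is rejected.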
- lncRNA molecule transcription or concentration may be down-regulated by introducing a chimeric DNA construct which yields a double-stranded RNA molecule capable of down-regulating lncRNA expression. Upon transcription of the DNA region, the RNA is able to form a dsRNA molecule through conventional base pairing between a sense and an antisense region, whereby the sense and antisense regions are nucleotide sequences as hereinbefore described.
- dsRNA-encoding parpl expression-reducing chimeric genes according to the invention may further comprise an intron, such as a heterologous intron, located e.g. in the spacer sequence between the sense and antisense RNA regions in accordance with the disclosure of WO 99/53050 (incorporated herein by reference).
- lncRNA molecule transcription or concentration can be down-regulated by introducing a chimeric DNA construct which yields a pre-miRNA RNA molecule which is processed into a miRNA capable of guiding the cleavage of the lncRNA molecule.
- miRNAs are small endogenous RNAs that regulate gene expression in plants, but also in other eukaryotes. In plants, these about 21 nucleotide long RNAs are processed from the stem-loop regions of long endogenous pre-miRNAs by the cleavage activity of DICER-LIKE 1 (DCL1). Plant miRNAs are highly complementary to conserved target mRNAs, and guide the cleavage of their targets. miRNAs appear to be key components in regulating the gene expression of complex networks of pathways involved inter alia in development.
- a "miRNA" is an RNA molecule of about 20 to 22 nucleotides in length which can be loaded into a RISC complex and direct the cleavage of a target RNA molecule, wherein the target RNA molecule comprises a nucleotide sequence essentially complementary to the nucleotide sequence of the miRNA molecule whereby one or more of the following mismatches may occur:
- a "pre-miRNA" molecule is an RNA molecule of about 100 to about 200 nucleotides, preferably about 100 to about 130 nucleotides, which can adopt a secondary structure comprising a dsRNA stem and a single-stranded RNA loop and further comprising the nucleotide sequence of the miRNA and its complementary sequence, the miRNA*, in the double-stranded RNA stem.
- the miRNA and its complement are located about 10 to about 20 nucleotides from the free ends of the miRNA dsRNA stem.
- the length and sequence of the single stranded loop region are not critical and may vary considerably, e.g. between 30 and 50 nt in length.
- the difference in free energy between unpaired and paired RNA structure is between -20 and -60 kcal/mole, particularly around -40 kcal/mole.
- the complementarity between the miRNA and the miRNA* does not need to be perfect, and about 1 to 3 bulges of unpaired nucleotides can be tolerated.
- the secondary structure adopted by an RNA molecule can be predicted by computer algorithms conventional in the art such as mfold, UNAFold and RNAfold.
- the particular strand of the dsRNA stem from the pre-miRNA which is released by DCL activity and loaded onto the RISC complex is determined by the degree of complementarity at the 5' end, whereby the strand which at its 5' end is the least involved in hydrogen bonding between the nucleotides of the different strands of the cleaved dsRNA stem is loaded onto the RISC complex and will determine the sequence specificity of the target RNA molecule degradation.
- miRNA molecules may be comprised within their naturally occurring pre-miRNA molecules but they can also be introduced into existing pre-miRNA molecule scaffolds by exchanging the nucleotide sequence of the miRNA molecule normally processed from such existing pre-miRNA molecule for the nucleotide sequence of another miRNA of interest.
- the scaffold of the pre-miRNA can also be completely synthetic.
- synthetic miRNA molecules may be comprised within, and processed from, existing pre-miRNA molecule scaffolds or synthetic pre-miRNA scaffolds.
- Increase of transcription and/or concentration of a lncRNA molecule as herein described can be achieved by providing the plant cells with a recombinant gene, wherein the recombinant gene comprises a plant expressible promoter operably linked to a DNA region which when transcribed yields an RNA molecule comprising or consisting of that lncRNA molecule and optionally, an appropriate transcription termination region and/or polyadenylation region.
- "plant-operative promoter" or "plant-expressible promoter" means a promoter which is capable of driving transcription in a plant, plant tissue, plant organ, plant part, or plant cell. This includes any promoter of plant origin, but also any promoter of non-plant origin which is capable of directing transcription in a plant cell.
- Promoters that may be used in this respect are constitutive promoters, such as the promoter of the cauliflower mosaic virus (CaMV) 35S transcript (Harpster et al., 1988, Mol. Gen. Genet. 212: 182-190), the CaMV 19S promoter (U.S. Pat. No. 5,352,605; WO 84/02913; Benfey et al., 1989, EMBO J. 8: 2195-2202), the subterranean clover virus promoter No 4 or No 7 (WO 96/06932), the Rubisco small subunit promoter (U.S. Pat. No. 4,962,028), the ubiquitin promoter (Holtorf et al., 1995, Plant Mol.
- CaMV cauliflower mosaic virus
- CaMV 19S promoter U.S. Pat. No. 5,352,605; WO 84/02913; Benfey et al, 1989, EMBO J. 8:2195-2202
- T-DNA gene promoters such as the octopine synthase (OCS) and nopaline synthase (NOS) promoters from Agrobacterium, and further promoters of genes whose constitutive expression in plants is known to the person skilled in the art.
- OCS octopine synthase
- NOS nopaline synthase
- tissue-specific or organ-specific promoters, preferably seed-specific promoters, such as the 2S albumin promoter (Josefsson et al., 1987, J. Biol. Chem. 262: 12196-12201), the phaseolin promoter (U.S. Pat. No. 5,504,200; Bustos et al., 1989, Plant Cell 1(9): 839-53), the legumin promoter (Shirsat et al., 1989, Mol. Gen. Genet. 215(2): 326-331), the "unknown seed protein" (USP) promoter (Baumlein et al., 1991, Mol. Gen. Genet.
- tissue-specific or organ-specific promoters like organ primordia-specific promoters (An et al., 1996, Plant Cell 8: 15-30), stem-specific promoters (Keller et al., 1988, EMBO J. 7(12): 3625-3633), leaf-specific promoters (Hudspeth et al., 1989, Plant Mol. Biol. 12: 579-589), mesophyll-specific promoters (such as the light-inducible Rubisco promoters), root-specific promoters (Keller et al., 1989, Genes Dev.
- tuber-specific promoters Keil et al., 1989, EMBO J. 8(5): 1323-1330
- vascular tissue-specific promoters Peleman et al., 1989, Gene 84: 359-369
- stamen-selective promoters WO 89/10396, WO 92/13956
- dehiscence zone-specific promoters WO 97/13865
- In addition to promoters recognized by RNA polymerase II, promoters recognized by RNA polymerase I or RNA polymerase III may also be used, including Type 3 Pol III promoters, which can be found e.g. associated with the genes encoding 7SL RNA, U3 snRNA and U6 snRNA. Other nucleotide sequences for type 3 Pol III promoters can be found in nucleotide sequence databases under the entries for the A. thaliana gene AT7SL-1 for 7SL RNA (X72228), A. thaliana gene AT7SL-2 for 7SL RNA (X72229), A. thaliana gene AT7SL-3 for 7SL RNA (AJ290403), Humulus lupulus H17SL-1 gene (AJ236706), Humulus lupulus H17SL-2 gene (AJ236704), Humulus lupulus H17SL-3 gene (AJ236705), Humulus lupulus H17SL-4 gene (AJ236703), A. thaliana U6-1 snRNA gene (X52527), A. thaliana U6-26 snRNA gene (X52528), A. thaliana U6-29 snRNA gene (X52529), A. thaliana U6-1 snRNA gene (X52527), Zea mays U3 snRNA gene (Z29641), Solanum tuberosum U6 snRNA gene (Z17301; X60506; S83742), Tomato U6 small nuclear RNA gene (X51447), A. thaliana U3C snRNA gene (X52630), A. thaliana U3B snRNA gene (X52629), Oryza sativa U3 snRNA promoter (X79685), Tomato U3 small nuclear RNA gene (X14411), Triticum aestivum U3 snRNA gene (X63065), Triticum aestivum U6 snRNA gene (X63066).
- the recombinant DNA molecules as herein described optionally comprise a DNA region involved in transcription termination and/or polyadenylation.
- a variety of DNA regions involved in transcription termination and/or polyadenylation that are functional in plants are known in the art, and those skilled in the art will be aware of terminator and polyadenylation sequences that may be suitable in performing the methods herein described.
- the polyadenylation region may be derived from a natural gene, from a variety of other plant genes, from T-DNA genes or even from plant viral genomes.
- the 3' end sequence to be added may be derived from, for example, the nopaline synthase or octopine synthase genes, or alternatively from another plant gene, or from any other eukaryotic gene.
- Terminator regions for Pol III promoters include a so-called "oligo dT stretch", which is a stretch of consecutive T-residues that serves as a terminator for the RNA polymerase III activity. It should comprise at least 4 T-residues, but may contain more T-residues.
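Because a run of four or more consecutive T-residues can act as a Pol III terminator, it can be useful to scan a DNA region intended for Pol III-driven transcription for such runs, for example to spot a stretch that might end transcription early. The following Python sketch uses a regular expression for this; the function name and the toy sequences in the usage note are hypothetical.

```python
import re


def find_oligo_dt_stretches(dna: str, min_len: int = 4) -> list[tuple[int, int]]:
    """Return (start_index, length) for every run of at least min_len T's.

    Indices are 0-based positions on the sense strand, written 5' to 3'.
    """
    pattern = re.compile("T{%d,}" % min_len)
    return [(m.start(), m.end() - m.start())
            for m in pattern.finditer(dna.upper())]
```

For the toy sequence "ACGTTTTGCATTTTTTA" the function reports a 4-T run starting at index 3 and a 6-T run starting at index 10, whereas a sequence containing only a 3-T run returns an empty list.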
- Providing a recombinant DNA molecule may refer to introduction of an exogenous DNA molecule into a plant cell by transformation, optionally followed by regeneration of a plant from the transformed plant cell.
- The term may also refer to introduction of the recombinant DNA molecule by crossing a transgenic plant comprising the recombinant DNA molecule with another plant and selecting progeny plants which have inherited the recombinant DNA molecule or transgene.
- Yet another alternative meaning of providing refers to introduction of the recombinant DNA molecule by techniques such as protoplast fusion, optionally followed by regeneration of a plant from the fused protoplasts.
- Transformation of plants is now a routine technique.
- Any of several transformation methods may be used to introduce the nucleic acid/gene of interest into a suitable ancestor cell. Transformation methods include the use of liposomes, electroporation, chemicals that increase free DNA uptake, injection of the DNA directly into the plant, particle gun bombardment, transformation using viruses or pollen and microprojection. Methods may be selected from the calcium/polyethylene glycol method for protoplasts (Krens et al. (1982) Nature 296: 72-74; Negrutiu et al. (1987) Plant. Mol. Biol.
- Transgenic rice plants can be produced via Agrobacterium-mediated transformation using any of the well-known methods for rice transformation, such as described in any of the following: European patent application EP 1198985 A1; Aldemita and Hodges (1996) Planta 199: 612-617; Chan et al. (1993) Plant. Mol. Biol. 22 (3): 491-506; Hiei et al. (1994) Plant J. 6 (2): 271-282, which disclosures are incorporated by reference herein as if fully set forth.
- A suitable method is as described in either Ishida et al. (1996) Nat. Biotechnol. 14(6): 745-50 or Frame et al.
- The recombinant DNA molecules according to the invention may be introduced into plants in a stable manner or in a transient manner using methods well known in the art.
- The chimeric genes may be introduced into plants, or may be generated inside the plant cell as described e.g. in EP 1339859.
- Gametes, seeds, embryos (either zygotic or somatic), progeny or hybrids of plants comprising such a modulated lncRNA molecule concentration, which are produced by traditional breeding methods, are also included within the scope of the present invention.
- The methods and means described herein are believed to be suitable for all plant cells and plants, both dicotyledonous and monocotyledonous plant cells and plants, including but not limited to cotton, Brassica vegetables, oilseed rape, wheat, corn or maize, barley, sunflowers, sorghum, rice, oats, sugarcane, soybean, vegetables (including chicory, lettuce, tomato), tobacco, potato, sugarbeet, papaya, pineapple, mango, Arabidopsis thaliana, but also plants used in horticulture, floriculture or forestry.
- The invention also provides a method for isolating further lncRNA molecules involved in phosphate use efficiency in a plant or plant cell, comprising the step of identifying a nucleotide sequence having a degree of homology to the lncRNA molecules herein described and isolating or synthesizing such RNA molecule comprising or consisting of such nucleotide sequence.
- The identification can occur via hybridization under stringent conditions in plants using probes having the nucleotide sequence of the lncRNA molecules.
- Sequence databases (of genomes or transcriptomes) can be searched using software such as BLASTN for sequences that share a defined degree of sequence identity with the sequences of the lncRNA molecules herein described.
- Sequence identity of two related nucleotide or amino acid sequences, expressed as a percentage, refers to the number of positions in the two optimally aligned sequences which have identical residues (×100) divided by the number of positions compared.
- A gap, i.e. a position in an alignment where a residue is present in one sequence but not in the other, is regarded as a position with non-identical residues.
- The alignment of the two sequences is performed by the Needleman and Wunsch algorithm (Needleman and Wunsch (1970) J. Mol. Biol. 48: 443-453).
- When RNA sequences are said to be essentially similar or to have a certain degree of sequence identity with DNA sequences, thymine (T) in the DNA sequence is considered equal to uracil (U) in the RNA sequence.
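The identity definition above can be illustrated with a small global alignment. This is a sketch under stated assumptions rather than the scoring used in the source: the match, mismatch and gap scores (1, -1, -1) are chosen only for illustration, the "number of positions compared" is taken to be the number of alignment columns (so that gaps count as non-identical positions), and the function names are hypothetical. U is converted to T before alignment, following the T/U equivalence stated above.

```python
def needleman_wunsch(a: str, b: str, match=1, mismatch=-1, gap=-1):
    """Global alignment with a linear gap penalty.
    Returns the two aligned strings, using '-' for gaps."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(seq1: str, seq2: str) -> float:
    """Identical positions x 100 / alignment columns; gaps are non-identical."""
    a = seq1.upper().replace("U", "T")  # treat RNA U as DNA T
    b = seq2.upper().replace("U", "T")
    al_a, al_b = needleman_wunsch(a, b)
    if not al_a:
        return 0.0
    identical = sum(1 for x, y in zip(al_a, al_b) if x == y)
    return 100.0 * identical / len(al_a)


print(percent_identity("ACGU", "ACGT"))
print(percent_identity("ACGT", "ACT"))
```

With a threshold such as the 80% identity recited elsewhere in this document, a candidate sequence would be compared against a reference by calling `percent_identity` and testing the result against 80.0.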
- nucleic acid comprising a sequence of nucleotides
- A chimeric gene as will be described further below which comprises a nucleic acid which is functionally or structurally defined may comprise additional nucleic acids etc.
- The term “comprising” also includes “consisting of”.
- nucleic acid comprising a certain nucleotide sequence
- Terminology relating to a nucleic acid “comprising” a certain nucleotide sequence refers to a nucleic acid or protein including or containing at least the described sequence, so that other nucleotide or amino acid sequences can be included at the 5' (or N-terminal) and/or 3' (or C-terminal) end, e.g. (the nucleotide sequence of) a selectable marker protein, (the nucleotide sequence of) a transit peptide, and/or a 5' leader sequence or a 3' trailer sequence.
- SEQ ID Nos. 1-211 nucleotide sequences of NonpolyA_lnc 1 to NonpolyA_lnc 211, respectively, from Arabidopsis thaliana.
- SEQ ID Nos. 212-1081 nucleotide sequences of PolyA_lnc 1 to PolyA_lnc 870, respectively, from Arabidopsis thaliana.
- SEQ ID Nos. 1082-1096 nucleotide sequences of orthologous lncRNA molecules from
- SEQ ID Nos. 1097-1110 nucleotide sequences of orthologous lncRNA molecules from
- SEQ ID Nos. 1111-1121 nucleotide sequences of orthologous lncRNA molecules from
- SEQ ID Nos. 1122-1130 nucleotide sequences of orthologous lncRNA molecules from
- SEQ ID Nos. 1131-1239 nucleotide sequences of orthologous lncRNA molecules from Triticum aestivum.
- SEQ ID Nos. 1240-1297 nucleotide sequences of orthologous lncRNA molecules from
- SEQ ID Nos. 1298-1327 nucleotide sequence of RT-PCR primers listed in Table 3.
- SEQ ID No. 1328 nucleotide sequence of lnc-149 from Arabidopsis thaliana.
- SEQ ID No. 1329 nucleotide sequence of lnc-149 like from Arabidopsis lyrata.
- SEQ ID No. 1330 nucleotide sequence of lnc-149 from Thellungiella halophila.
- For purification of non-polyA RNA and total RNA (rRNA removed), RNA was extracted from thirteen-day-old seedlings using the QIAGEN RNeasy Plant Mini Kit and then quantified by NanoDrop 1000 and 1% agarose gel electrophoresis.
- To enrich non-polyA RNA, 10 µg of RNA was subjected four times to polyA RNA capture using an Oligo dT30 probe (Oligotex mRNA Mini Kit, QIAGEN). 5 µl of probe and 80 µl of binding buffer were used each time. After centrifugation, probes binding polyA RNAs were precipitated and the supernatant mainly consisted of non-polyA RNA and short-polyA RNA.
- Non-polyA RNAs and total RNAs were quantified using an Agilent 2100 Bioanalyzer (which shows the zonal distribution of total RNAs after rRNA removal) and then stored at -80 °C.
- RNA-seq longest reads
- 320-620 bp fragments containing 200-500 bp cDNA fragments and 120 bp barcodes were purified, amplified and then sequenced as 100 nt single-end reads using Illumina HiSeq 2000. For each sample, ~20M raw reads were collected.
- Mapped reads were assembled by Cufflinks (v2.0.1) and re-assembled by Cuffcompare (v2.0.1) following the protocol from [28].
- Transcripts labeled CUFF are considered novel transcripts (without annotation). These transcripts were collected and then filtered to remove those overlapping with coding genes and known ncRNAs.
- CPC tools [29] were used to calculate coding potential, and those with a CPC score < 0 (low coding potential) were retained. After selection of transcripts longer than 200 nt, the remaining transcripts were designated as novel lncRNAs.
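The filtering steps above (removing transcripts that overlap annotated genes, keeping low coding potential, keeping lengths above 200 nt) can be sketched as a single pass over assembled transcripts. This is an illustrative sketch only: the `Transcript` record, its field names and the example values are hypothetical, the CPC scores are assumed to have been computed beforehand by the CPC tool, and the garbled comparison in the text is read here as "CPC score below 0".

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    transcript_id: str         # e.g. a Cufflinks "CUFF" identifier
    length_nt: int             # assembled transcript length in nucleotides
    cpc_score: float           # coding potential score computed beforehand
    overlaps_annotation: bool  # overlaps a coding gene or a known ncRNA


def select_novel_lncrnas(transcripts, min_length_nt=200, max_cpc=0.0):
    """Keep unannotated transcripts with low coding potential that are
    longer than `min_length_nt`."""
    return [
        t
        for t in transcripts
        if not t.overlaps_annotation
        and t.cpc_score < max_cpc
        and t.length_nt > min_length_nt
    ]


# Hypothetical input records for illustration.
candidates = [
    Transcript("CUFF.1", 850, -1.2, False),  # kept
    Transcript("CUFF.2", 150, -0.8, False),  # too short
    Transcript("CUFF.3", 900, 2.4, False),   # coding potential too high
    Transcript("CUFF.4", 700, -1.0, True),   # overlaps known annotation
]
print([t.transcript_id for t in select_novel_lncrnas(candidates)])
```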
- Novel lncRNAs were classified by genomic location as transposable elements (TE) (>1 nt overlap with TE, no strand-specificity), pseudogene (>1 nt overlap with pseudogene, no strand-specificity), antisense (>50% overlap with mRNA, on opposite strand), intronic (100% overlap with intron, on the same strand), ambiguous (>1 nt overlap with known ncRNAs or coding genes) and intergenic region (the remainder).
- TE transposable elements
- pseudogene >1 nt overlap with pseudogene, no strand-specificity
- antisense >50% overlap with mRNA, on opposite strand
- intronic 100% overlap with intron, on the same strand
- ambiguous >1 nt overlap with known ncRNAs or coding genes
- intergenic region the remainder.
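The location categories listed above can be sketched as an ordered set of overlap tests. Several points here are assumptions not stated in the source: the categories are tested in the order in which they are listed, overlap percentages are measured relative to the lncRNA length, the ">1 nt" threshold is applied literally, and the `Interval` type, the function names and the coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


def overlap_nt(a: Interval, b: Interval) -> int:
    """Number of overlapping nucleotides, ignoring strand."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def classify_lncrna(lnc, te=(), pseudogenes=(), mrnas=(), introns=(), annotated=()):
    """Assign one location category, testing categories in listed order."""
    length = lnc.end - lnc.start
    if any(overlap_nt(lnc, f) > 1 for f in te):
        return "TE"
    if any(overlap_nt(lnc, f) > 1 for f in pseudogenes):
        return "pseudogene"
    if any(f.strand != lnc.strand and overlap_nt(lnc, f) > 0.5 * length for f in mrnas):
        return "antisense"
    if any(f.strand == lnc.strand and overlap_nt(lnc, f) == length for f in introns):
        return "intronic"
    if any(overlap_nt(lnc, f) > 1 for f in annotated):
        return "ambiguous"
    return "intergenic"


# Hypothetical coordinates for illustration.
lnc = Interval("Chr1", 1000, 1300, "+")
print(classify_lncrna(lnc, mrnas=[Interval("Chr1", 900, 1200, "-")]))
print(classify_lncrna(lnc))
```

Because a transcript can satisfy more than one criterion, the order of the tests determines the label; a different precedence would only require reordering the `if` statements.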
- DNA conservation analysis: A phylogenetic tree based on the plant species' divergence times (www.timetree.org) was constructed using MEGA5.0. DNA conservation scores between Arabidopsis thaliana and 16 other organisms were calculated using BLASTn, followed by calculation of the average DNA conservation score.
- DNA conservation scores were calculated across 31 plant species (genomes downloaded from PlantGDB) using BLASTn with default parameters. The maximum bit score was used as the feature score. Protein conservation scores were calculated using BLASTx in a similar manner. The coding potential of each bin was calculated using RNAcode with default parameters [32]. RNA secondary structure stability of each bin was calculated using RandFold [33] with 1000 dinucleotide random shufflings, and the p-value was used as the feature score.
- RNA structure conservation scores were denoted by SCI (structure conservation index) scores, which were calculated using RNAz based on multiple alignments between Arabidopsis thaliana, Arabidopsis lyrata, Carica papaya, Thellungiella halophila and Citrus clementina (downloaded from the VISTA database).
- SCI structure conservation index
- Dozens of RNA-seq and tiling array data sets [34-40] were used, and a set of unexpressed intergenic regions was defined as the negative control.
- The genomic regions with an expression level lower than the mean expression level of all genomic elements across all RNA-seq and array samples [31] were defined as unexpressed intergenic regions.
- Example 2 Development of a non-polyA RNA sequencing method to identify lncRNAs in the Arabidopsis thaliana genome
- Three RNA-sequencing methods were used to identify novel lncRNAs in Arabidopsis: (i) sequencing of total RNAs (rRNA depleted) as short reads 36 nt in length; (ii) sequencing of non-polyA RNAs, separated from total RNAs by a general RNA-seq protocol, as short reads 36 nt in length; (iii) sequencing of non-polyA RNAs as long reads 100 nt in length (see Example 1). These three methods were compared by the length of assembled transcripts and the best one was selected for genome-wide lncRNA identification.
- The identification of novel lncRNAs comprised three steps: (i) purifying specific RNA components; (ii) constructing a strand-specific cDNA library and sequencing; (iii) processing the high-throughput sequencing data to identify novel lncRNAs (Figure 1).
- LncRNAs IPS1 and At4 act as target mimics to inhibit the activity of miRNA-399, and are involved in response to Pi starvation in Arabidopsis [6].
- A systematic understanding of the roles of lncRNAs in the Pi starvation response is still lacking.
- We utilized the non-polyA RNA-seq (long read) method (described above, see Examples 1 and 2) to identify lncRNAs which are differentially expressed in Arabidopsis seedlings under normal and Pi-deficient conditions.
- The chromosomal locations of differentially expressed lncRNA candidates are listed in Table 4. Table 4 also includes a cross reference to the corresponding SEQ ID No. entry in the sequence listing. Also included are GO terms associated with coding regions co-localized with the lncRNA as described below.
- Non-polyA lncRNAs are mainly located in antisense and intergenic regions (Figure ID), with limited clues to infer their potential cellular functions.
- Previous studies suggested that lncRNAs located around protein coding genes (whether upstream, internal or downstream, on the sense or antisense strand) can regulate the expression of their 'host' coding genes in Arabidopsis [17, 47]. Therefore we analyzed potential lncRNA functions based on genomic co-location with protein coding genes. Regions of 1 kb downstream and upstream of the protein coding genes were defined as the potential cis-regulatory regions. lncRNAs that are differentially expressed under Pi starvation conditions and located in these cis-regulatory regions could potentially regulate their 'host' protein coding genes.
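The co-location analysis above can be sketched as a window test around each coding gene. This is a minimal sketch with assumptions: the window is taken to span the gene body plus 1 kb on each side (the text mentions upstream, internal and downstream locations), strand is ignored, and the `Feature` type, the function name and the gene names and coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def cis_host_genes(lnc: Feature, genes, flank_nt: int = 1000):
    """Return names of genes whose body plus `flank_nt` on either side
    overlaps the lncRNA by at least one nucleotide."""
    hosts = []
    for gene in genes:
        if gene.chrom != lnc.chrom:
            continue
        window_start = max(0, gene.start - flank_nt)
        window_end = gene.end + flank_nt
        if lnc.start < window_end and lnc.end > window_start:
            hosts.append(gene.name)
    return hosts


# Hypothetical genes and lncRNA for illustration.
genes = [
    Feature("geneA", "Chr1", 5000, 7000),
    Feature("geneB", "Chr1", 20000, 22000),
]
lnc = Feature("lnc-example", "Chr1", 4200, 4600)
print(cis_host_genes(lnc, genes))
```

The resulting gene lists could then be passed to a GO enrichment tool; that step is not shown here.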
- lnc-34 is transcribed from a region located upstream of the gene AT1G74670, which encodes a protein whose expression is responsive to gibberellins and sugar, two signals which function in Pi starvation responses [48, 49].
- Differential expression of lnc-34 can be found between control and low Pi samples, while no similar difference was found in polyA sequencing.
- The conserved structure is composed of two conserved short hairpins and one variable long hairpin located at the 5' end (3-70 nt) of lnc-149. Moreover, the three parts of the local structure are connected by a multi-branch loop and an additional stem. The key points of the connection, one in the start region of the first conserved hairpin and the other in the start region of the variable hairpin, are constrained by base pairs with covariance (the bases change but the structure is retained), implying that the conserved structure is favored during evolution and may have a function in the stress response to low Pi conditions. Furthermore, the loop regions in the two conserved hairpins are also maintained in three or two genomes, suggesting that the loops may have a function, for instance, to bind RNA-binding proteins.
- Example 5 Identification of orthologues of lncRNAs in other crop species.
- a) a plant-expressible promoter such as a CaMV35S promoter; b) a DNA region which when transcribed yields an RNA comprising a nucleotide sequence of a lncRNA, preferably a non-polyA lncRNA upregulated under Pi starvation conditions, including a non-polyA lncRNA selected from lnc-24, lnc-107, lnc-139, lnc-194, lnc-195 or lnc-149
- The recombinant genes are introduced into plants, particularly Arabidopsis plants, through transformation methods known in the art, and transgenic plants are identified.
- RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome research, 2008. 18(9): p. 1509-17.
- RNAcode: robust discrimination of coding and noncoding regions in comparative sequence data.
- At-TAX: a whole genome tiling array resource for developmental expression analysis and transcript identification in Arabidopsis thaliana. Genome biology, 2008. 9(7): p. R112.
- AtmtPNPase is required for multiple aspects of the 18S rRNA metabolism in Arabidopsis thaliana mitochondria. Nucleic acids research, 2004. 32(17): p. 5174-82.
- RNA-seq short reads
- non-polyA RNA-seq long reads
- Low Pi samples were sequenced as 100 nt single-end reads. Reads were aligned to TAIR10 rRNA, and the remaining reads were mapped to the TAIR10 genome using Tophat with two mismatches.
- abscisic acid mediated signaling pathway, cell communication, cell death, cellular membr fusion, cytoplasm, defense response to bacterium, incompatible interaction, defense res to fungus, embryo development, endoplasmic reticulum unfolded protein response, ethyl mediated signaling pathway, ethylene mediated signaling pathway, glycolysis, Golgi organization, Golgi vesicle transport, hyperosmotic response, jasmonic acid mediated sig pathway, jasmonic acid mediated signaling pathway, lateral root morphogenesis, MAPK cascade, NAD+ ADP-ribosyltransferase activity, NAD+ ADP-ribosyltransferase activity, NA ADP-ribosyltransferase activity, negative regulation of defense response, negative regula of programmed cell death, nitric oxide biosynthetic
Abstract
Long non-coding RNAs in plants are provided, particularly non-polyadenylated long non-coding RNAs, which are differentially expressed under phosphate starvation conditions, such as long non-coding RNAs whose transcription is upregulated in plants during phosphate starvation. Modulation of the expression of such long non-coding RNAs can be used to alter phosphate use efficiency in plants.
Description
Long non-coding RNAs for modulating phosphate use efficiency in plants
Field of the invention
[001] The present invention relates to the fields of agriculture and molecular biology. More particularly, the invention is directed at methods for identification of long non-coding RNAs in plants, particularly non-polyadenylated long non-coding RNAs which are differentially expressed under phosphate starvation conditions, such as long non-coding RNAs whose transcription is upregulated in plants during phosphate starvation. Modulation of the expression of such long non-coding RNAs can be used to alter phosphate use efficiency in plants.
Background of the invention
[002] Recently, many novel lncRNAs have been found to fulfill essential roles in different species [8]. Because of their much lower expression levels and evolutionary conservation compared with protein-coding mRNAs [9, 10], as well as the uncertain functions of these transcripts, they were initially considered to be transcriptional noise [11, 12]. This hypothesis was soon challenged by the discovery that many lncRNAs were detected during specific developmental stages [13, 14]. Although most lncRNAs have not been well characterized due to difficulties of cloning, some have been extensively studied and are well known, such as Xist, an inactive X-specific lncRNA [15, 16], and COLDAIR, an intronic lncRNA [17].
[003] Plants possess an elaborate physiological system to respond to external stimuli and stress conditions [1]. Non-coding RNAs are an indispensable part of signal transduction networks during stress responses [2]. Sequencing of small RNA libraries from stressed A. thaliana seedlings has uncovered 26 new miRNAs and 102 novel endogenous small RNAs, suggesting that they have roles in stress responses [3].
[004] In addition, nutrient homeostasis (e.g. phosphorus balance) is tightly regulated by many complex pathways, in which miRNAs are important members [4]. For example, Pi-dependent miR-399 targets and cleaves PHO2 mRNA, which encodes the E2 conjugase that decreases the Pi content and weakens Pi remobilization [5]. While searching the Arabidopsis genome for sequences complementary to miR-399, biologists found a new class of Pi starvation-associated long non-coding RNAs such as IPS1 (INDUCED BY PHOSPHATE STARVATION 1) and AT4, which act as target mimics to inhibit miRNA-399 activity [6, 7].
[005] Recent advances in next-generation sequencing technology, known as cDNA sequencing or RNA-sequencing (RNA-seq), have been successfully applied in whole-transcriptome research [18]. Several well-developed experimental protocols for strand-specific RNA-seq (such as SMART [19] and Bisulfite [20]) are powerful in identifying many natural antisense transcripts (NATs). Nearly 10,000 lncRNAs have been identified in human [21, 22]. Although most lncRNAs were suggested to be synthesized by RNA polymerase II and processed with polyadenylation [23], an increasing number of non-polyadenylated lncRNAs as well as those with short polyA tails have been discovered recently [24].
[006] Although some lncRNAs have been discovered to function in fundamental biological processes during Pi starvation, the role of others remains to be elucidated. Systematic identification of lncRNAs involved in Pi starvation in plants is warranted, especially identification of the less well known, non-polyadenylated ones.
[007] The prior art thus remains deficient in providing lncRNAs which can be used to modulate phosphate use efficiency in plants.
[008] These and other problems are solved as described hereinafter in the different embodiments, examples and claims.
Summary of the invention
[009] In one embodiment, the invention provides long non-coding RNA (lncRNA) molecules which are differentially expressed in a plant grown under phosphate starvation conditions when compared to a plant grown under normal conditions with adequate phosphate supply and which are unpolyadenylated, such as lncRNA molecules comprising a nucleotide sequence having at least 80% identity with any one of SEQ ID No. 4; SEQ ID No. 5; SEQ ID No. 6; SEQ ID No. 15; SEQ ID No. 18; SEQ ID No. 26; SEQ ID No. 28; SEQ ID No. 32; SEQ ID No. 34; SEQ ID No. 36; SEQ ID No. 40; SEQ ID No. 42; SEQ ID No. 47; SEQ ID No. 52; SEQ ID No. 55; SEQ ID No. 57; SEQ ID No. 59; SEQ ID No. 60; SEQ ID No. 62; SEQ ID No. 64; SEQ ID No. 70; SEQ ID No. 72; SEQ ID No. 77; SEQ ID No. 79; SEQ ID No. 82; SEQ ID No. 83; SEQ ID No. 84; SEQ ID No. 86; SEQ ID No. 90; SEQ ID No. 91; SEQ ID No. 98; SEQ ID No. 99; SEQ ID No. 102; SEQ ID No. 107; SEQ ID No. 114; SEQ ID No. 115; SEQ ID No. 118; SEQ ID No. 119; SEQ ID No. 120; SEQ ID No. 121; SEQ ID No. 128; SEQ ID No. 132; SEQ ID No. 137; SEQ ID No. 139; SEQ ID No. 140; SEQ ID No. 145; SEQ ID No. 149; SEQ ID No. 151; SEQ ID No. 152; SEQ ID No. 158; SEQ ID No. 160; SEQ ID No. 162; SEQ ID No. 172; SEQ ID No. 174; SEQ ID No. 176; SEQ ID No. 178; SEQ ID No. 179; SEQ ID No. 180; SEQ ID No. 181; SEQ ID No. 183; SEQ ID No. 187; SEQ ID No. 194; SEQ ID No. 195; SEQ ID No. 200; SEQ ID No. 201; SEQ ID No. 204; SEQ ID No. 209 or SEQ ID No. 210, or a nucleotide sequence identical to the mentioned sequences.
[010] In another embodiment, the invention provides single stranded DNA molecules which when transcribed yield RNA molecules comprising such lncRNA or the complement of such single stranded DNA molecules or double stranded DNA molecules.
[011] The expression of the lncRNA molecule may be upregulated, as is the case for lncRNA molecules comprising a nucleotide sequence having at least 80% identity with any one of SEQ ID Nos. 34, 107, 139, 194 or 195.
[012] The lncRNA molecule may comprise at least two conserved hairpin loops, preferably further comprising two co-variance base pairings, as is the case for lncRNA molecules comprising a nucleotide sequence having at least 80% identity with SEQ ID No. 149.
[013] In yet another embodiment, the invention provides use of such lncRNA molecules or DNA molecules which when transcribed yield such lncRNA molecules, to modulate inorganic phosphate use efficiency in a plant.
[014] In still another embodiment, the invention provides the use of a lncRNA molecule which is differentially expressed in a plant grown under phosphate starvation conditions when compared to a plant grown under normal conditions with adequate phosphate supply, or a DNA molecule which when transcribed yields such lncRNA molecule, or the complement of the DNA molecule, to modulate inorganic phosphate use efficiency in a plant. This may also comprise lncRNA molecules which are polyadenylated, such as lncRNA molecules comprising a nucleotide sequence having at least 80% sequence identity to SEQ ID No. 216; SEQ ID No. 235; SEQ ID No. 240; SEQ ID No. 250; SEQ ID No. 266; SEQ ID No. 267; SEQ ID No. 278; SEQ ID No. 281; SEQ ID No. 283; SEQ ID No. 315; SEQ ID No. 320; SEQ ID No. 321; SEQ ID No. 325; SEQ ID No. 330; SEQ ID No. 361; SEQ ID No. 364; SEQ ID No. 368; SEQ ID No. 370; SEQ ID No. 378; SEQ ID No. 384; SEQ ID No. 388; SEQ ID No. 393; SEQ ID No. 396; SEQ ID No. 414; SEQ ID No. 420; SEQ ID No. 424; SEQ ID No. 429; SEQ ID No. 432; SEQ ID No. 433; SEQ ID No. 436; SEQ ID No. 437; SEQ ID No. 448; SEQ ID No. 459; SEQ ID No. 460; SEQ ID No. 461; SEQ ID No. 462; SEQ ID No. 463; SEQ ID No. 486; SEQ ID No. 504; SEQ ID No. 512; SEQ ID No. 533; SEQ ID No. 570; SEQ ID No. 571; SEQ ID No. 572; SEQ ID No. 600; SEQ ID No. 601; SEQ ID No. 617; SEQ ID No. 618; SEQ ID No. 649; SEQ ID No. 651; SEQ ID No. 652; SEQ ID No. 656; SEQ ID No. 660; SEQ ID No. 662; SEQ ID No. 707; SEQ ID No. 709; SEQ ID No. 712; SEQ ID No. 713; SEQ ID No. 715; SEQ ID No. 716; SEQ ID No. 725; SEQ ID No. 734; SEQ ID No. 783; SEQ ID No. 785; SEQ ID No. 786; SEQ ID No. 787; SEQ ID No. 788; SEQ ID No. 798; SEQ ID No. 817; SEQ ID No. 850; SEQ ID No. 857; SEQ ID No. 918; SEQ ID No. 921; SEQ ID No. 945; SEQ ID No. 946; SEQ ID No. 955; SEQ ID No. 957; SEQ ID No. 972; SEQ ID No. 985; SEQ ID No. 986; SEQ ID No. 990; SEQ ID No. 997; SEQ ID No. 998; SEQ ID No. 1020; SEQ ID No. 1025; SEQ ID No. 1031; SEQ ID No. 1033; SEQ ID No. 1047; SEQ ID No. 1048; SEQ ID No. 1068 or SEQ ID No. 1080. The lncRNA molecule may also comprise a nucleotide sequence of any one of Tables 5 to 10.
[015] The invention further provides a method for modulating the phosphate use efficiency in a plant comprising the step of increasing or decreasing the transcription and/or concentration of a lncRNA molecule as herein described in a plant cell. The method may comprise a step of decreasing transcription and/or concentration of a lncRNA molecule, which may be achieved by mutating the DNA region from which the lncRNA molecule is transcribed, or by expression of an inhibitory RNA molecule specifically recognizing the lncRNA molecule. The method may comprise a step of increasing transcription and/or concentration of a lncRNA molecule, which may be achieved by mutating the DNA region from which the lncRNA molecule is transcribed, or by introducing a recombinant gene comprising:
a. a plant-expressible promoter;
b. a DNA region which when transcribed encodes an RNA molecule comprising or consisting of said lncRNA molecule; and optionally
c. an appropriate transcription termination region, and/or polyadenylation region.
[016] The invention also provides modified plant cells comprising a modulated concentration or transcription of a lncRNA molecule as herein described, when compared to an unmodified plant cell, and a modified plant, or plant part, tissue or organ comprising a multitude of or consisting essentially of modified plant cells, as well as seeds of such plants comprising the mutation, genetic alteration or recombinant leading to or resulting in the modulated concentration or transcription of a lncRNA molecule as herein described.
[017] The invention also provides a method for isolating further lncRNA molecules involved in phosphate use efficiency in a plant comprising the step of identifying a nucleotide sequence having a degree of homology to the lncRNA molecules as herein described and isolating or synthesizing such RNA molecule comprising or consisting of such nucleotide sequences.
Brief description of the drawings
[018] Figure 1: Flowchart of RNA sequencing and data processing suitable to identify non-polyadenylated long non-coding RNAs. Panel (A): Purification of non-polyA RNAs and total RNAs. Panel (B): Construction of strand-specific cDNA libraries and sequencing; sequencing at short and long read lengths differs at the size selection and Illumina sequencing steps. Panel (C): Sequencing data processing: trimming and assembling reads into transcripts.
[019] Figure 2: Comparison of total and non-polyA RNA-seq with short and long read length. Panel (A): Length distribution of clean-read assembled fragments from the three sequencing data sets. Panel (B): Length distribution of novel lncRNAs from the three sequencing data sets. Panel (C): Numbers of novel lncRNAs at different expression values (RPKM) from non-polyA RNA-seq (short reads), gray bars. Overlap ratio with novel lncRNAs from total RNA-seq, black line. Panel (D): Genome location of novel lncRNAs identified by the three methods. Panel (E): Validation of novel lncRNA candidates from non-polyA RNA-seq by RT-PCR. M: DNA ladder; ACT2: AT3G18780; Non-polyA and PolyA, candidates were amplified from two cDNA libraries: non-polyA RNAs and polyA RNAs; RT(-), negative control without reverse transcriptase; Left box, polyA RNA (control); Right box, non-polyA RNA.
[020] Figure 3: Identification of polyA and non-polyA lncRNAs associated with inorganic phosphate (Pi) starvation. Panel (A): Flowchart of polyA and non-polyA lncRNA prediction and characterization. Panel (B): Differential expression of polyA and non-polyA lncRNAs between plants grown under low Pi conditions and the control plants. Panel (C): Comparison of the proportion of differentially expressed polyA and non-polyA lncRNAs. *, χ² test p-value < 0.01; Gray box, differentially expressed polyA lncRNAs; Dotted box, differentially expressed polyA lncRNAs; gray plus dotted boxes, all identified polyA lncRNAs.
[021] Figure 4: Characterization of identified polyA and non-polyA lncRNAs associated with Pi starvation. Panel (A): DNA conservation of novel lncRNAs in multiple plant species. Compared to coding genes and unexpressed intergenic control regions, lncRNAs show moderate DNA conservation. Panel (B): Comparison of DNA and protein conservation, RNA structure conservation and structure stability (free energy) of lncRNAs and coding genes. RNA structure conservation was normalized by the DNA conservation scores. Panel (C): GO enrichment of non-polyA lncRNAs co-located (on the cis-regulatory region) with coding genes. Pi-starvation associated GO terms are shown.
[022] Figure 5: Validation of novel non-polyA lncRNAs in Pi starvation. Panel (A): Up-regulated expression of novel non-polyA lncRNAs in low Pi samples. Relative expression of selected non-polyA lncRNAs from Table 4 was measured by real-time PCR. Panel (B): A non-polyA lncRNA (lnc-34) supported by expression and epigenetics. H3K9ac and H2A.Z are raw reads from ChIP-seq data (GSE28398) and chAP-seq data (GSM954614), and inputs are their background. Signals in non-polyA and polyA RNA-seq tracks are raw reads from four RNA-seq data sets. The signal is normalized by the total mapped reads of each sample. Panel (C): An example of conserved secondary structure of the 5' end of a non-polyA lncRNA (lnc-149). Two conserved short hairpins are displayed in the top and left regions, and one variable long hairpin on the right. Covariance base-pairings are encircled with dark dots and conserved loops are indicated by darker boxes (darkest means conserved in three genomes and lightest means conserved in two genomes). The multiple alignments of lnc-149 as found in three plant genomes (A. thaliana SEQ ID No. 1328; A. lyrata SEQ ID No. 1329; and T. halophila SEQ ID No. 1330) are also provided to illustrate the conserved bases and loops in the primary sequences.
[023] Figure 6: Illustration of read quality by FastQC. Raw read quality is represented by uploading FastQ files of sequencing reads to the FastQC program (see Example 1/Methods).
[024] Figure 7: Novel non-polyA and polyA lncRNAs (both total and differentially expressed) were sub-typed as transposable elements, pseudogene, antisense, ambiguous and intergenic by aligning to the TAIR10 genome (see Example 1/Methods).
[025] Figure 8: GO enrichment of polyA lncRNAs co-located (on the cis-regulatory region) with coding genes.
Detailed description of different embodiments
[026] The current invention is based upon the identification of long non-coding RNAs, including novel unpolyadenylated long non-coding RNAs, which are differentially expressed upon growth of plants under conditions of inorganic phosphate starvation. The lncRNAs have been detected by processing raw sequencing data based upon an integrative computational model. Validation by RT-PCR indicates that the lncRNAs are bona fide transcripts rather than transcriptional noise.
[027] Accordingly, in a first embodiment the invention provides long non-coding RNA (lncRNA) molecules which are differentially expressed in a plant grown under phosphate starvation conditions when compared to a plant grown under normal conditions with adequate phosphate supply and which are unpolyadenylated.
[028] As used herein, "unpolyadenylated" or "non-polyadenylated" or "non-polyA" lncRNA molecules refer to lncRNA molecules which usually lack a polyA-tail. A lncRNA molecule may be classified as unpolyadenylated when, upon RNA sequencing according to methods described herein, particularly in Example 1, the maximum expression value (Reads per Kilobase per Million mapped reads or RPKM value) of each lncRNA, determined separately in the polyA reads data and non-polyA reads data, is four times greater in the non-polyA data than in the polyA data.
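By way of non-limiting illustration, the RPKM value referred to above may be computed as sketched below. The sketch uses the standard RPKM definition; the function and parameter names are hypothetical and are not part of the methods of Example 1.

```python
def rpkm(read_count, transcript_length_nt, total_mapped_reads):
    """Reads per Kilobase of transcript per Million mapped reads."""
    if transcript_length_nt <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # reads / (length in kb) / (mapped reads in millions)
    return read_count * 1e9 / (transcript_length_nt * total_mapped_reads)


def max_rpkm(samples):
    """Maximum RPKM of one lncRNA across the samples of one library type.

    `samples` is an iterable of (read_count, transcript_length_nt,
    total_mapped_reads) tuples; an empty iterable gives 0.0.
    """
    return max((rpkm(*s) for s in samples), default=0.0)
```

Under this sketch, the maximum would be taken once over the polyA libraries and once over the non-polyA libraries before the two values are compared.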
[029] A lncRNA molecule is considered as "differentially expressed" upon Pi starvation compared to normally grown plants when read counts assigned to lncRNAs, e.g. using the DEGseq package [30], normalized against the total mapped reads and analyzed using e.g. the MA-plot based method with random sampling with a p-value of 0.05, have a fold-change of 2 or more using RNA of the Pi-starved plants compared to normally grown plants.
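The thresholds above may be illustrated with the following minimal sketch. It does not reimplement the DEGseq random sampling model: the p-value is assumed to be supplied by that package, and only the normalization, the M and A values of the MA-plot, and the two cut-offs are shown. All names are hypothetical, and it is an assumption of the sketch that a two-fold change in either direction qualifies and that the p-value must be strictly below 0.05.

```python
import math


def ma_values(count_starved, count_control, total_starved, total_control):
    """Return (M, A) for one lncRNA after normalizing by total mapped reads."""
    # Normalize each read count to reads per million mapped reads.
    x = count_starved * 1e6 / total_starved
    y = count_control * 1e6 / total_control
    if x <= 0 or y <= 0:
        raise ValueError("zero counts need a pseudocount before taking logs")
    m = math.log2(x) - math.log2(y)        # log2 fold-change
    a = (math.log2(x) + math.log2(y)) / 2  # mean log2 abundance
    return m, a


def is_differentially_expressed(m, p_value, min_fold=2.0, alpha=0.05):
    """Apply the fold-change and p-value cut-offs to a precomputed M value."""
    return abs(m) >= math.log2(min_fold) and p_value < alpha
```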
[030] Table 4 lists long non-coding unpolyadenylated RNAs identified in Arabidopsis thaliana and indicates whether these molecules are differentially expressed under Pi starvation conditions.
[031] Thus, in another embodiment, the invention provides long non-coding RNA molecules comprising a nucleotide sequence which has at least 80% sequence identity with any one of SEQ ID No. 4; SEQ ID No. 5; SEQ ID No. 6; SEQ ID No. 15; SEQ ID No. 18; SEQ ID No. 26; SEQ ID No. 28; SEQ ID No. 32; SEQ ID No. 34; SEQ ID No. 36; SEQ ID No. 40; SEQ ID No. 42; SEQ ID No. 47; SEQ ID No. 52; SEQ ID No. 55; SEQ ID No. 57; SEQ ID No. 59; SEQ ID No. 60; SEQ ID No. 62; SEQ ID No. 64; SEQ ID No. 70; SEQ ID No. 72; SEQ ID No. 77; SEQ ID No. 79; SEQ ID No. 82; SEQ ID No. 83; SEQ ID No. 84; SEQ ID No. 86; SEQ ID No. 90; SEQ ID No. 91; SEQ ID No. 98; SEQ ID No. 99; SEQ ID No. 102; SEQ ID No. 107; SEQ ID No. 114; SEQ ID No. 115; SEQ ID No. 118; SEQ ID No. 119; SEQ ID No. 120; SEQ ID No. 121; SEQ ID No. 128; SEQ ID No. 132; SEQ ID No. 137; SEQ ID No. 139; SEQ ID No. 140; SEQ ID No. 145; SEQ ID No. 149; SEQ ID No. 151; SEQ ID No. 152; SEQ ID No. 158; SEQ ID No. 160; SEQ ID No. 162; SEQ ID No. 172; SEQ ID No. 174; SEQ ID No. 176; SEQ ID No. 178; SEQ ID No. 179; SEQ ID No. 180; SEQ ID No. 181; SEQ ID No. 183; SEQ ID No. 187; SEQ ID No. 194; SEQ ID No. 195; SEQ ID No. 200; SEQ ID No. 201; SEQ ID No. 204; SEQ ID No. 209 or SEQ ID No. 210.
[032] The sequence identity may be larger than 80%, such as at least 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or the sequence may be identical with any one of SEQ ID No. 4; SEQ ID No. 5; SEQ ID No. 6; SEQ ID No. 15; SEQ ID No. 18; SEQ ID No. 26; SEQ ID No. 28; SEQ ID No. 32; SEQ ID No. 34; SEQ ID No. 36; SEQ ID No. 40; SEQ ID No. 42; SEQ ID No. 47; SEQ ID No. 52; SEQ ID No. 55; SEQ ID No. 57; SEQ ID No. 59; SEQ ID No. 60; SEQ ID No. 62; SEQ ID No. 64; SEQ ID No. 70; SEQ ID No. 72; SEQ ID No. 77; SEQ ID No. 79; SEQ ID No. 82; SEQ ID No. 83; SEQ ID No. 84; SEQ ID No. 86; SEQ ID No. 90; SEQ ID No. 91; SEQ ID No. 98; SEQ ID No. 99; SEQ ID No. 102; SEQ ID No. 107; SEQ ID No. 114; SEQ ID No. 115; SEQ ID No. 118; SEQ ID No. 119; SEQ ID No. 120; SEQ ID No. 121; SEQ ID No. 128; SEQ ID No. 132; SEQ ID No. 137; SEQ ID No. 139; SEQ ID No. 140; SEQ ID No. 145; SEQ ID No. 149; SEQ ID No. 151; SEQ ID No. 152; SEQ ID No. 158; SEQ ID No. 160; SEQ ID No. 162; SEQ ID No. 172; SEQ ID No. 174; SEQ ID No. 176; SEQ ID No. 178; SEQ ID No. 179; SEQ ID No. 180; SEQ ID No. 181; SEQ ID No. 183; SEQ ID No. 187; SEQ ID No. 194; SEQ ID No. 195; SEQ ID No. 200; SEQ ID No. 201; SEQ ID No. 204; SEQ ID No. 209 or SEQ ID No. 210.
[033] The invention further provides DNA molecules which when transcribed yield an RNA molecule comprising said lncRNAs as herein described. Also provided by the invention are the complements of such DNA molecules. DNA molecules may be single-stranded or double-stranded. Such DNA molecules may be referred to herein as encoding the lncRNA molecules.
[034] Of particular interest are long non-coding RNA molecules (or the encoding DNA molecules) whose expression is upregulated during growth under inorganic phosphate starvation conditions, such as lnc-34, lnc-107, lnc-139, lnc-194 or lnc-195. Thus in another embodiment, the invention provides lncRNA molecules comprising a nucleotide sequence which has at least 80% sequence identity with any one of SEQ ID No. 34, SEQ ID No. 107, SEQ ID No. 139, SEQ ID No. 194 or SEQ ID No. 195, or having at least 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% sequence identity or being identical with any one of SEQ ID No. 34, SEQ ID No. 107, SEQ ID No. 139, SEQ ID No. 194 or SEQ ID No. 195.
[035] Other long non-coding RNA molecules of particular interest are those long non-coding, unpolyadenylated RNA molecules which are differentially expressed under Pi starvation conditions and which comprise a secondary structure, particularly which comprise at least two conserved hairpin loops, preferably further comprising two covariance base pairings, such as lnc-149 (SEQ ID No. 149).
[036] These non-polyA lncRNA molecules, or DNA molecules encoding such non-polyA lncRNA molecules, may be used to modulate inorganic phosphate use efficiency in a plant.
[037] It has also been determined that in addition to the unpolyadenylated lncRNA molecules, there are also polyadenylated lncRNA molecules which are differentially expressed under phosphate starvation conditions, and these polyadenylated lncRNA molecules may be used, like the unpolyadenylated ones, to modulate inorganic phosphate use efficiency in a plant.
[038] A lncRNA molecule may be classified as polyadenylated (polyA) when, upon RNA sequencing according to methods described herein, particularly in Example 1, the maximum expression value (Reads per Kilobase per Million mapped reads or RPKM value) of each lncRNA, determined separately in the polyA reads data and non-polyA reads data, is four times greater in the polyA data than in the non-polyA data, or is only identified in the polyA data.
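The four-fold rule used for both classes may be sketched as follows. This is an illustration under stated assumptions only: the names are hypothetical, a ratio of exactly four is treated as meeting the threshold, a lncRNA detected only in the non-polyA data is treated as non-polyA, and the remaining category is labelled "bimorphic" as an assumption about how that term is used in this description.

```python
def classify_by_polya_status(max_rpkm_polya, max_rpkm_nonpolya, fold=4.0):
    """Classify one lncRNA from its maximum RPKM in each library type."""
    if max_rpkm_polya < 0 or max_rpkm_nonpolya < 0:
        raise ValueError("RPKM values cannot be negative")
    if max_rpkm_polya == 0 and max_rpkm_nonpolya == 0:
        return "not detected"
    if max_rpkm_nonpolya == 0:
        return "polyA"          # only identified in the polyA data
    if max_rpkm_polya >= fold * max_rpkm_nonpolya:
        return "polyA"
    if max_rpkm_nonpolya >= fold * max_rpkm_polya:
        return "non-polyA"
    return "bimorphic"          # assumed label for neither class
```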
[039] Table 4 also lists long non-coding polyadenylated RNAs identified in Arabidopsis thaliana and indicates whether these molecules are differentially expressed under Pi starvation conditions.
[040] Thus, the invention also relates to the use of a lncRNA molecule, whether polyadenylated or unpolyadenylated or "bimorphic", which is differentially expressed in a plant grown under phosphate starvation conditions when compared to a plant grown under normal conditions with adequate phosphate supply, or a DNA molecule which when transcribed yields such lncRNA molecule, or the complement of said DNA molecule, to modulate inorganic phosphate use efficiency in a plant.
[041] In particular the invention relates to the use of lncRNA molecules (or encoding DNA molecules) wherein a polyadenylated lncRNA molecule is used comprising a nucleotide sequence having at least 80% sequence identity, or at least 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% sequence identity or being identical with any one of SEQ ID No. 216; SEQ ID No.
SEQ ID No. 240; SEQ ID No. 250; SEQ ID No. 266; SEQ ID No. 267; SEQ ID No.
SEQ ID No. 281; SEQ ID No. 283; SEQ ID No. 315; SEQ ID No. 320; SEQ ID No.
SEQ ID No. 325; SEQ ID No. 330; SEQ ID No. 361; SEQ ID No. 364; SEQ ID No.
SEQ ID No. 370; SEQ ID No. 378; SEQ ID No. 384; SEQ ID No. 388; SEQ ID No.
SEQ ID No. 396; SEQ ID No. 414; SEQ ID No. 420; SEQ ID No. 424; SEQ ID No.
SEQ ID No. 432; SEQ ID No. 433; SEQ ID No. 436; SEQ ID No. 437; SEQ ID No.
SEQ ID No. 459; SEQ ID No. 460; SEQ ID No. 461; SEQ ID No. 462; SEQ ID No.
SEQ ID No. 486; SEQ ID No. 504; SEQ ID No. 512; SEQ ID No. 533; SEQ ID No.
SEQ ID No. 571; SEQ ID No. 572; SEQ ID No. 600; SEQ ID No. 601; SEQ ID No.
SEQ ID No. 618; SEQ ID No. 649; SEQ ID No. 651; SEQ ID No. 652; SEQ ID No.
SEQ ID No. 660; SEQ ID No. 662; SEQ ID No. 707; SEQ ID No. 709; SEQ ID No.
SEQ ID No. 713; SEQ ID No. 715; SEQ ID No. 716; SEQ ID No. 725; SEQ ID No.
SEQ ID No. 783; SEQ ID No. 785; SEQ ID No. 786; SEQ ID No. 787; SEQ ID No.
SEQ ID No. 798; SEQ ID No. 817; SEQ ID No. 850; SEQ ID No. 857; SEQ ID No.
SEQ ID No. 921; SEQ ID No. 945; SEQ ID No. 946; SEQ ID No. 955; SEQ ID No.
SEQ ID No. 972; SEQ ID No. 985; SEQ ID No. 986; SEQ ID No. 990; SEQ ID No.
SEQ ID No. 998; SEQ ID No. 1020; SEQ ID No. 1025; SEQ ID No. 1031; SEQ ID No. 1033; SEQ ID No. 1047; SEQ ID No. 1048; SEQ ID No. 1068 or SEQ ID No. 1080.
[042] Also of particular interest is the use of lncRNA molecules for phosphate use modulation in a plant, comprising a nucleotide sequence of any one of Tables 5 to 10, listing similar sequences found in other plant species, including rice, soybean, wheat, millet, sorghum or corn/maize.
[043] The invention thus provides a method of modulating the phosphate use efficiency in a plant comprising the step of increasing or decreasing the transcription and/or concentration of a lncRNA molecule as herein described, in a plant cell.
[044] Decrease of transcription and/or concentration of a lncRNA molecule as herein described can be achieved by mutation of the DNA region from which the lncRNA molecule is transcribed. Mutagenesis methods to achieve such mutations, including substitutions, small or large insertions or deletions, in plants are well known in the art.
[045] "Mutagenesis", as used herein, refers to the process in which plant cells (e.g., seed or tissues, such as pollen, etc.) are contacted one or more times with a mutagenic agent, such as a chemical substance (such as ethylmethylsulfonate (EMS), ethylnitrosourea (ENU), etc.) or ionizing radiation (neutrons (such as in fast neutron mutagenesis, etc.), gamma rays (such as that supplied by a Cobalt 60 source), X-rays, etc.), or a combination of the foregoing. While mutations created by irradiation are often large deletions or other gross lesions such as translocations or complex rearrangements, mutations created by chemical mutagens are often more discrete lesions such as point mutations. For example, EMS alkylates guanine bases, which results in base mispairing: an alkylated guanine will pair with a thymine base, resulting primarily in G/C to A/T transitions. Following mutagenesis, plants can be regenerated from the treated cells using known techniques. For instance, the resulting seeds may be planted in accordance with conventional growing procedures, and following self-pollination seed is formed on the plants. Alternatively, doubled haploid plantlets may be extracted to immediately form homozygous plants. Additional seed which is formed as a result of such self-pollination in the present or a subsequent generation may be harvested and screened for the presence of mutant lncRNA-encoding regions. Several techniques are known to screen for specific mutant sequences, e.g., Deleteagene™ (Delete-a-gene; Li et al., 2001, Plant J 27: 235-242) uses polymerase chain reaction (PCR) assays to screen for deletion mutants generated by fast neutron mutagenesis, TILLING (targeted induced local lesions in genomes; McCallum et al., 2000, Nat Biotechnol 18: 455-457) identifies EMS-induced point mutations, etc.
[046] Decrease of transcription and/or concentration of lncRNA molecules as herein described can also be achieved using inhibitory RNA molecules specifically recognizing lncRNA molecules, introduced or expressed in the plant cells.
[047] In one embodiment, the lncRNA expression may be downregulated by introducing a chimeric DNA construct which yields a sense RNA molecule capable of down-regulating lncRNA expression by co-suppression. The transcribed DNA region will yield upon transcription a so-called sense RNA molecule capable of reducing the concentration of the lncRNA molecule in the target plant or plant cell in a transcriptional or post-transcriptional manner. The transcribed DNA region (and resulting RNA molecule) comprises at least 20 consecutive nucleotides having at least 95% sequence identity to the nucleotide sequence of the lncRNA-encoding DNA region present in the plant cell or plant.
[048] In another embodiment, the lncRNA expression may be downregulated by introducing a chimeric DNA construct which yields an anti-sense RNA molecule capable of down-regulating lncRNA expression by co-suppression. The transcribed DNA region will yield upon transcription a so-called antisense RNA molecule capable of reducing the concentration of the lncRNA molecule in the target plant or plant cell in a transcriptional or post-transcriptional manner. The transcribed DNA region (and resulting RNA molecule) comprises at least 20 consecutive nucleotides having at least 95% sequence identity to the complement of the nucleotide sequence of the lncRNA molecule in the plant cell or plant.
[049] However, the minimum nucleotide sequence of the antisense or sense RNA region of about 20 nt of the lncRNA molecule may be comprised within a larger RNA molecule, varying in size from 20 nt to a length equal to the size of the target lncRNA molecule. The mentioned antisense or sense nucleotide regions may thus be from about 21 nt to about 5000 nt long, such as 21 nt, 40 nt, 50 nt, 100 nt, 200 nt, 300 nt, 500 nt or 1000 nt in length. Moreover, it is not required for the purpose of the invention that the nucleotide sequence of the inhibitory RNA molecule used, or of the encoding region of the transgene, is completely identical or complementary to the lncRNA molecule whose expression is targeted to be reduced in the plant cell. The longer the sequence, the less stringent the requirement for the overall sequence identity is. Thus, the sense or antisense regions may have an overall sequence identity of about 40 % or 50 % or 60 % or 70 % or 80 % or 90 % or 100 % to the nucleotide sequence of the lncRNA molecule or the complement thereof. However, as mentioned, antisense or sense regions should comprise a nucleotide sequence of 20 consecutive nucleotides having about 95 to about 100 % sequence identity to the nucleotide sequence of the lncRNA molecule. The stretch of about 95 to about 100% sequence identity may be about 50, 75 or 100 nt.
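The requirement for a stretch of 20 consecutive nucleotides with at least 95% identity may be checked as in the sketch below. It is an illustration only: the names are hypothetical, the comparison is ungapped (so 95% of 20 nt means at most one mismatch), only A, C, G, U and T are accepted, and for an antisense region the comparison is made against the complement by first taking the reverse complement of the inhibitory sequence.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def to_rna(seq):
    """Upper-case the sequence and treat thymine as uracil."""
    return seq.upper().replace("T", "U")


def reverse_complement(rna):
    return "".join(COMPLEMENT[b] for b in reversed(to_rna(rna)))


def has_qualifying_stretch(inhibitory, target, antisense=False,
                           window=20, min_identity_pct=95):
    """True if some `window`-nt stretch of the inhibitory RNA matches a
    stretch of the target with at least `min_identity_pct` % identity."""
    inhibitory = to_rna(inhibitory)
    target = to_rna(target)
    if antisense:
        inhibitory = reverse_complement(inhibitory)
    # Integer ceiling of window * pct / 100, e.g. 19 of 20 positions.
    needed = -(-window * min_identity_pct // 100)
    for i in range(len(inhibitory) - window + 1):
        a = inhibitory[i:i + window]
        for j in range(len(target) - window + 1):
            b = target[j:j + window]
            if sum(x == y for x, y in zip(a, b)) >= needed:
                return True
    return False
```

The exhaustive window comparison is quadratic in sequence length, which is acceptable for a sketch at lncRNA scale; a production screen would more likely rely on an alignment tool.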
[050] In yet another embodiment, lncRNA molecule transcription or concentration may be down-regulated by introducing a chimeric DNA construct which yields a double-stranded RNA molecule capable of down-regulating lncRNA expression. Upon transcription of the DNA region the RNA is able to form a dsRNA molecule through conventional base pairing between a sense and antisense region, whereby the sense and antisense region are nucleotide sequences as hereinbefore described. dsRNA-encoding lncRNA expression-reducing chimeric genes according to the invention may further comprise an intron, such as a heterologous intron, located e.g. in the spacer sequence between the sense and antisense RNA regions in accordance with the disclosure of WO 99/53050 (incorporated herein by reference). To achieve the construction of such a transgene, use can be made of the vectors described in WO 02/059294 A1.
[051] In still another embodiment, lncRNA molecule transcription or concentration can be down-regulated by introducing a chimeric DNA construct which yields a pre-miRNA RNA molecule which is processed into a miRNA capable of guiding the cleavage of the lncRNA molecule. miRNAs are small endogenous RNAs that regulate gene expression in plants, but also in other eukaryotes. In plants, these about 21 nucleotide long RNAs are processed from the stem-loop regions of long endogenous pre-miRNAs by the cleavage activity of DICER-LIKE 1 (DCL1). Plant miRNAs are highly complementary to conserved target mRNAs, and guide the cleavage of their targets. miRNAs appear to be key components in regulating the gene expression of complex networks of pathways involved inter alia in development.
[052] As used herein, a "miRNA" is an RNA molecule of about 20 to 22 nucleotides in length which can be loaded into a RISC complex and direct the cleavage of a target RNA molecule, wherein the target RNA molecule comprises a nucleotide sequence essentially complementary to the nucleotide sequence of the miRNA molecule whereby one or more of the following mismatches may occur:
- A mismatch between the nucleotide at the 5' end of said miRNA and the corresponding nucleotide sequence in the target RNA molecule;
- A mismatch between any one of the nucleotides in position 1 to position 9 of said miRNA and the corresponding nucleotide sequence in the target RNA molecule;
- Three mismatches between any one of the nucleotides in position 12 to position 21 of said miRNA and the corresponding nucleotide sequence in the target RNA molecule provided that there are no more than two consecutive mismatches.
No mismatch is allowed at positions 10 and 11 of the miRNA (all miRNA positions are indicated starting from the 5' end of the miRNA molecule).
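The mismatch rules listed above may be expressed as the following check, offered as one possible reading rather than a definitive implementation. The names are hypothetical. It is assumed that both sequences are written 5' to 3' and have equal length, that G:U pairs count as mismatches, that a mismatch at position 1 is tolerated in addition to at most one mismatch in positions 2 to 9, and that for a 22 nt miRNA position 22 is treated like positions 12 to 21.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def mismatch_positions(mirna, target_site):
    """1-based miRNA positions (from the 5' end) lacking a Watson-Crick pair."""
    mirna = mirna.upper().replace("T", "U")
    target_site = target_site.upper().replace("T", "U")
    if len(mirna) != len(target_site):
        raise ValueError("miRNA and target site must have the same length")
    # The duplex is antiparallel: miRNA position 1 faces the 3' end of the site.
    facing = target_site[::-1]
    return [i for i, pair in enumerate(zip(mirna, facing), start=1)
            if pair not in WATSON_CRICK]


def is_acceptable_target(mirna, target_site):
    mm = set(mismatch_positions(mirna, target_site))
    if mm & {10, 11}:
        return False                      # no mismatch at positions 10 and 11
    if len([p for p in mm if 2 <= p <= 9]) > 1:
        return False                      # one mismatch besides position 1
    tail = [p for p in mm if p >= 12]
    if len(tail) > 3:
        return False                      # at most three in the 3' region
    # No more than two consecutive mismatches in the 3' region.
    return not any(p + 1 in mm and p + 2 in mm for p in tail)
```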
[053] As used herein, a "pre-miRNA" molecule is an RNA molecule of about 100 to about 200 nucleotides, preferably about 100 to about 130 nucleotides, which can adopt a secondary structure comprising a dsRNA stem and a single-stranded RNA loop and further comprising the nucleotide sequence of the miRNA and its complement sequence, the miRNA*, in the double-stranded RNA stem. Preferably, the miRNA and its complement are located about 10 to about 20 nucleotides from the free ends of the miRNA dsRNA stem. The length and sequence of the single-stranded loop region are not critical and may vary considerably, e.g. between 30 and 50 nt in length. Preferably, the difference in free energy between unpaired and paired RNA structure is between -20 and -60 kcal/mole, particularly around -40 kcal/mole. The complementarity between the miRNA and the miRNA* does not need to be perfect and about 1 to 3 bulges of unpaired nucleotides can be tolerated. The secondary structure adopted by an RNA molecule can be predicted by computer algorithms conventional in the art such as mFold, UNAFold and RNAfold. The particular strand of the dsRNA stem from the pre-miRNA which is released by DCL activity and loaded onto the RISC complex is determined by the degree of complementarity at the 5' end, whereby the strand which at its 5' end is the least involved in hydrogen bonding between the nucleotides of the different strands of the cleaved dsRNA stem is loaded onto the RISC complex and will determine the sequence specificity of the target RNA molecule degradation. However, if empirically the miRNA molecule from a particular synthetic pre-miRNA molecule is not functional because the "wrong" strand is loaded on the RISC complex, it will be immediately evident that this problem can be solved by exchanging the position of the miRNA molecule and its complement on the respective strands of the dsRNA stem of the pre-miRNA molecule.
As is known in the art, binding between A and U involving two hydrogen bonds, or G and U involving two hydrogen bonds, is less strong than that between G and C involving three hydrogen bonds.
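The strand selection principle described above may be illustrated by counting hydrogen bonds at the 5' end of each strand of the duplex. The sketch below is a simplification under stated assumptions: the names are hypothetical, the duplex is treated as blunt and fully aligned (3' overhangs and bulges are ignored), and the number of terminal base pairs examined (four) is an arbitrary illustrative choice.

```python
# Hydrogen bonds per base pair, as stated in the description.
H_BONDS = {frozenset("AU"): 2, frozenset("GU"): 2, frozenset("GC"): 3}


def end_bond_count(strand, partner, n_pairs=4):
    """Hydrogen bonds in the first `n_pairs` pairs at the 5' end of `strand`.

    Both strands are written 5' to 3' and must have equal length.
    """
    strand = strand.upper().replace("T", "U")
    partner = partner.upper().replace("T", "U")
    if len(strand) != len(partner) or len(strand) < n_pairs:
        raise ValueError("strands must be equally long and cover n_pairs")
    total = 0
    for i in range(n_pairs):
        pair = frozenset((strand[i], partner[-1 - i]))
        total += H_BONDS.get(pair, 0)   # unpaired positions contribute 0
    return total


def predicted_loaded_strand(mirna, mirna_star, n_pairs=4):
    """Name the strand whose 5' end is least involved in hydrogen bonding."""
    a = end_bond_count(mirna, mirna_star, n_pairs)
    b = end_bond_count(mirna_star, mirna, n_pairs)
    if a < b:
        return "miRNA"
    if b < a:
        return "miRNA*"
    return "undetermined"
```

If such a check returned "miRNA*" for a synthetic design, the positions of the miRNA and its complement on the stem could be exchanged as the description suggests.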
[054] miRNA molecules may be comprised within their naturally occurring pre-miRNA molecules but they can also be introduced into existing pre-miRNA molecule scaffolds by exchanging the nucleotide sequence of the miRNA molecule normally processed from such existing pre-miRNA molecule for the nucleotide sequence of another miRNA of interest. The scaffold of the pre-miRNA can also be completely synthetic. Likewise, synthetic miRNA molecules may be comprised within, and processed from, existing pre-miRNA molecule scaffolds or synthetic pre-miRNA scaffolds.
[055] Increase of transcription and/or concentration of a lncRNA molecule as herein described can be achieved by providing the plant cells with a recombinant gene, wherein the recombinant gene comprises a plant expressible promoter operably linked to a DNA region which when transcribed yields an RNA molecule comprising or consisting of that lncRNA molecule and optionally, an appropriate transcription termination region and/or polyadenylation region.
[056] For the purpose of the invention, the term "plant-operative promoter" or "plant-expressible promoter" means a promoter which is capable of driving transcription in a plant, plant tissue, plant organ, plant part, or plant cell. This includes any promoter of plant origin, but also any promoter of non-plant origin which is capable of directing transcription in a plant cell.
[057] Promoters that may be used in this respect are constitutive promoters, such as the promoter of the cauliflower mosaic virus (CaMV) 35S transcript (Harpster et al., 1988, Mol. Gen. Genet. 212: 182-190), the CaMV 19S promoter (U.S. Pat. No. 5,352,605; WO 84/02913; Benfey et al., 1989, EMBO J. 8: 2195-2202), the subterranean clover virus promoter No 4 or No 7 (WO 96/06932), the Rubisco small subunit promoter (U.S. Pat. No. 4,962,028), the ubiquitin promoter (Holtorf et al., 1995, Plant Mol. Biol. 29: 637-649), T-DNA gene promoters such as the octopine synthase (OCS) and nopaline synthase (NOS) promoters from Agrobacterium, and further promoters of genes whose constitutive expression in plants is known to the person skilled in the art.
[058] Further promoters that may be used in this respect are tissue-specific or organ-specific promoters, preferably seed-specific promoters, such as the 2S albumin promoter (Josefsson et al., 1987, J. Biol. Chem. 262: 12196-12201), the phaseolin promoter (U.S. Pat. No. 5,504,200; Bustos et al., 1989, Plant Cell 1(9): 839-53), the legumin promoter (Shirsat et al., 1989, Mol. Gen. Genet. 215(2): 326-331), the "unknown seed protein" (USP) promoter (Baumlein et al., 1991, Mol. Gen. Genet. 225(3): 459-67), the napin promoter (U.S. Pat. No. 5,608,152; Stalberg et al., 1996, Planta 199: 515-519), the Arabidopsis oleosin promoter (WO 98/45461), the Brassica Bce4 promoter (WO 91/13980), and further promoters of genes whose seed-specific expression in plants is known to the person skilled in the art.
[059] Other promoters that can be used are tissue-specific or organ-specific promoters like organ primordia-specific promoters (An et al., 1996, Plant Cell 8: 15-30), stem-specific promoters (Keller et al., 1988, EMBO J. 7(12): 3625-3633), leaf-specific promoters (Hudspeth et al., 1989, Plant Mol. Biol. 12: 579-589), mesophyll-specific promoters (such as the light-inducible Rubisco promoters), root-specific promoters (Keller et al., 1989, Genes Dev. 3: 1639-1646), tuber-specific promoters (Keil et al., 1989, EMBO J. 8(5): 1323-1330), vascular tissue-specific promoters (Peleman et al., 1989, Gene 84: 359-369), stamen-selective promoters (WO 89/10396, WO 92/13956), dehiscence zone-specific promoters (WO 97/13865), and the like.
[060] In addition to promoters recognized by RNA polymerase II, promoters recognized by RNA polymerase I or RNA polymerase III may also be used, including Type 3 Pol III promoters, which can be found e.g. associated with the genes encoding 7SL RNA, U3 snRNA and U6 snRNA. Other nucleotide sequences for type 3 Pol III promoters can be found in nucleotide sequence databases under the entries for the A. thaliana gene AT7SL-1 for 7SL RNA (X72228), A. thaliana gene AT7SL-2 for 7SL RNA (X72229), A. thaliana gene AT7SL-3 for 7SL RNA (AJ290403), Humulus lupulus H17SL-1 gene (AJ236706), Humulus lupulus H17SL-2 gene (AJ236704), Humulus lupulus H17SL-3 gene (AJ236705), Humulus lupulus H17SL-4 gene (AJ236703), A. thaliana U6-1 snRNA gene (X52527), A. thaliana U6-26 snRNA gene (X52528), A. thaliana U6-29 snRNA gene (X52529), Zea mays U3 snRNA gene (Z29641), Solanum tuberosum U6 snRNA gene (Z17301; X60506; S83742), Tomato U6 small nuclear RNA gene (X51447), A. thaliana U3C snRNA gene (X52630), A. thaliana U3B snRNA gene (X52629), Oryza sativa U3 snRNA promoter (X79685), Tomato U3 small nuclear RNA gene (X14411), Triticum aestivum U3 snRNA gene (X63065), Triticum aestivum U6 snRNA gene (X63066).
[061] The recombinant DNA molecules as herein described optionally comprise a DNA region involved in transcription termination and/or polyadenylation. A variety of DNA regions involved in transcription termination and/or polyadenylation functional in plants are known in the art and those skilled in the art will be aware of terminator and polyadenylation sequences that may be suitable in performing the methods herein described. The polyadenylation region may be derived from a natural gene, from a variety of other plant genes, from T-DNA genes or even from plant viral genomes. The 3' end sequence to be added may be derived from, for example, the nopaline synthase or octopine synthase genes, or alternatively from another plant gene, or from any other eukaryotic gene. Terminator regions for Pol III promoters include a so-called "oligo dT stretch", which is a stretch of consecutive T-residues that serves as a terminator for the RNA polymerase III activity. It should comprise at least 4 T-residues, but obviously may contain more T-residues.
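As a small illustration of the oligo dT criterion, the sketch below locates every run of at least four consecutive T-residues in a DNA sequence. The function name is hypothetical and the minimum run length is taken from the paragraph above.

```python
import re


def oligo_dt_runs(dna, min_t=4):
    """Return (start, length) for each run of at least `min_t` T-residues.

    `start` is the 0-based index of the first T of the run.
    """
    pattern = re.compile("T{%d,}" % min_t)
    return [(m.start(), len(m.group())) for m in pattern.finditer(dna.upper())]
```

Such a scan could be used to confirm that a designed Pol III terminator contains a qualifying stretch.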
[062] As used herein the term "providing a recombinant DNA molecule" may refer to introduction of an exogenous DNA molecule to a plant cell by transformation, optionally followed by regeneration of a plant from the transformed plant cell. The term may also refer to introduction of the recombinant DNA molecule by crossing of a transgenic plant comprising the recombinant DNA molecule with another plant and selecting progeny plants which have inherited the recombinant DNA molecule or transgene. Yet another alternative meaning of providing refers to introduction of the recombinant DNA molecule by techniques such as protoplast fusion, optionally followed by regeneration of a plant from the fused protoplasts.
[063] It will be clear that the methods of transformation used are of minor relevance to the current invention. Transformation of plants is now a routine technique. Advantageously, any of several transformation methods may be used to introduce the nucleic acid/gene of interest into a suitable ancestor cell. Transformation methods include the use of liposomes, electroporation, chemicals that increase free DNA uptake, injection of the DNA directly into the plant, particle gun bombardment, transformation using viruses or pollen and microprojection. Methods may be selected from the calcium/polyethylene glycol method for protoplasts (Krens et al. (1982) Nature 296: 72-74; Negrutiu et al. (1987) Plant Mol. Biol. 8: 363-373); electroporation of protoplasts (Shillito et al. (1985) Bio/Technol. 3: 1099-1102); microinjection into plant material (Crossway et al. (1986) Mol. Gen. Genet. 202: 179-185); DNA or RNA-coated particle bombardment (Klein et al. (1987) Nature 327: 70); infection with (non-integrative) viruses and the like. Transgenic rice plants can be produced via Agrobacterium-mediated transformation using any of the well-known methods for rice transformation, such as described in any of the following: European patent application EP 1198985 A1; Aldemita and Hodges (1996) Planta 199: 612-617; Chan et al. (1993) Plant Mol. Biol. 22(3): 491-506; Hiei et al. (1994) Plant J. 6(2): 271-282, which disclosures are incorporated by reference herein as if fully set forth. In the case of corn transformation, a suitable method is as described in either Ishida et al. (1996) Nat. Biotechnol. 14(6): 745-50 or Frame et al. (2002) Plant Physiol. 129(1): 13-22, which disclosures are incorporated by reference herein as if fully set forth. In the case of canola, a suitable transformation method is that disclosed in De Block et al. (Plant Physiol. (1989) 91: 694-701), which disclosure is incorporated by reference herein as if fully set forth.
Methods to transform cotton plants are also well known in the art. Agrobacterium-mediated transformation of cotton has been described e.g. in US patent 5,004,863 or in US patent 6,483,013 and cotton transformation by particle bombardment is reported e.g. in WO 92/15675. Other suitable cotton transformation methods are disclosed e.g. in WO00071733 and US 5,159,135, which disclosures are incorporated by reference herein as if fully set forth.
[064] The recombinant DNA molecules according to the invention may be introduced into plants in a stable manner or in a transient manner using methods well known in the art. The chimeric genes may be introduced into plants, or may be generated inside the plant cell as described e.g. in EP 1339859.
[065] Other embodiments of the invention relate to the recombinant DNA molecules as herein described, as well as to plants, plant cells, plant tissues or seeds comprising the recombinant DNA molecules as herein described.
[066] It is also an object of the invention to provide plant cells and plants comprising a modulated, increased or decreased, concentration of a lncRNA molecule as herein described, when compared to an unmodified plant cell. Gametes, seeds, embryos, either zygotic or somatic, progeny or hybrids of plants which comprise such modulated lncRNA molecule concentration, and which are produced by traditional breeding methods, are also included within the scope of the present invention.
[067] The methods and means described herein are believed to be suitable for all plant cells and plants, both dicotyledonous and monocotyledonous plant cells and plants including but not limited to cotton, Brassica vegetables, oilseed rape, wheat, corn or maize, barley, sunflowers, sorghum, rice, oats, sugarcane, soybean, vegetables (including chicory, lettuce, tomato), tobacco, potato, sugarbeet, papaya, pineapple, mango, Arabidopsis thaliana, but also plants used in horticulture, floriculture or forestry.
[068] The invention also provides a method for isolating further lncRNA molecules involved in phosphate use efficiency, in a plant or plant cell, comprising the step of identifying a nucleotide sequence having a degree of homology to the lncRNA molecules herein described and isolating or synthesizing such RNA molecule comprising or consisting of such nucleotide sequence. The identification can occur via hybridization under stringent conditions in plants using probes having the nucleotide sequence of the lncRNA molecules. Alternatively, sequence databases (of genomes or transcriptomes) can be searched using software such as BLASTN for sequences that share a defined degree of sequence identity with the sequences of the lncRNA molecules herein described.
[069] As used herein "sequence identity" of two related nucleotide or amino acid sequences, expressed as a percentage, refers to the number of positions in the two optimally aligned sequences which have identical residues (x100) divided by the number of positions compared. A gap, i.e., a position in an alignment where a residue is present in one sequence but not in the other, is regarded as a position with non-identical residues. The alignment of the two sequences is performed by the Needleman and Wunsch algorithm (Needleman and Wunsch (1970) J. Mol. Biol. 48: 443-453). The computer-assisted sequence alignment above can be conveniently performed using a standard software program such as GAP, which is part of the Wisconsin Package Version 10.1 (Genetics Computer Group, Madison, Wisconsin, USA), using the default scoring matrix with a gap creation penalty of 50 and a gap extension penalty of 3. It is clear that when RNA sequences are to be essentially similar or have a certain degree of sequence identity with DNA sequences, thymine (T) in the DNA sequence is considered equal to uracil (U) in the RNA sequence.
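The percentage defined above may be computed from two already aligned sequences as sketched below. The alignment step itself (Needleman and Wunsch, as performed by GAP) is not reproduced; the function name and the use of "-" as gap character are assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two aligned sequences of equal aligned length.

    Gapped positions count as compared, non-identical positions, and
    T is treated as equal to U.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    a = aligned_a.upper().replace("U", "T")
    b = aligned_b.upper().replace("U", "T")
    compared = 0
    identical = 0
    for x, y in zip(a, b):
        if x == gap and y == gap:
            continue              # a column of two gaps compares nothing
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no positions to compare")
    return 100.0 * identical / compared
```

For example, an RNA aligned to a DNA with one gap and one substitution over ten positions gives 80% identity under this definition.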
[070] As used herein, the term "comprising" is to be interpreted as specifying the presence of the stated features, integers, steps or components as referred to, but does not preclude the presence or addition of one or more features, integers, steps or components, or groups thereof. Thus, e.g., a nucleic acid comprising a sequence of nucleotides may comprise more nucleotides than the actually cited ones, i.e., be embedded in a larger nucleic acid. A chimeric gene as will be described further below which comprises a nucleic acid which is functionally or structurally defined may comprise additional nucleic acids etc. However, in context with the present disclosure, the term "comprising" also includes "consisting of". In other words, the terminology relating to a nucleic acid "comprising" a certain nucleotide sequence, as used throughout the text, refers to a nucleic acid or protein including or containing at least the described sequence, so that other nucleotide or amino acid sequences can be included at the 5' (or N-terminal) and/or 3' (or C-terminal) end, e.g. (the nucleotide sequence of) a selectable marker protein, (the nucleotide sequence of) a transit peptide, and/or a 5' leader sequence or a 3' trailer sequence.
[071] The following non-limiting Examples describe the identification of lncRNA molecules differentially expressed under inorganic phosphate starvation conditions as well as the use thereof for modulating phosphate use efficiency in plants.
[072] Unless stated otherwise in the Examples, all recombinant DNA techniques can be carried out according to standard protocols as described in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, NY and in Volumes 1 and 2 of Ausubel et al. (1994) Current Protocols in Molecular Biology, Current Protocols, USA. Standard materials and methods for plant molecular work are described in Plant Molecular Biology Labfax (1993) by R.D.D. Croy, jointly published by BIOS Scientific Publications Ltd (UK) and Blackwell Scientific Publications, UK. Other references for standard molecular biology techniques include Sambrook and Russell (2001) Molecular Cloning: A Laboratory Manual, Third Edition, Cold Spring Harbor Laboratory Press, NY, and Volumes I and II of Brown (1998) Molecular Biology LabFax, Second Edition, Academic Press (UK). Standard materials and methods for polymerase chain reactions can be found in Dieffenbach and Dveksler (1995) PCR Primer: A Laboratory Manual, Cold Spring Harbor Laboratory Press, and in McPherson et al. (2000) PCR - Basics: From Background to Bench, First Edition, Springer Verlag, Germany.
[073] Throughout the description and Examples, reference is made to the following sequences:
SEQ ID Nos. 1-211: nucleotide sequence of respectively NonpolyA_lnc 1 to NonpolyA_lnc 211 from Arabidopsis thaliana.
SEQ ID Nos. 212-1081: nucleotide sequence of respectively PolyA_lnc 1 to PolyA_lnc 870 from Arabidopsis thaliana.
SEQ ID Nos. 1082-1096: nucleotide sequence of orthologous lncRNA molecules from Glycine max.
SEQ ID Nos. 1097-1110: nucleotide sequence of orthologous lncRNA molecules from Oryza sativa.
SEQ ID Nos. 1111-1121: nucleotide sequence of orthologous lncRNA molecules from Sorghum bicolor.
SEQ ID Nos. 1122-1130: nucleotide sequence of orthologous lncRNA molecules from Setaria italica.
SEQ ID Nos. 1131-1239: nucleotide sequence of orthologous lncRNA molecules from Triticum aestivum.
SEQ ID Nos. 1240-1297: nucleotide sequence of orthologous lncRNA molecules from Zea mays.
SEQ ID Nos. 1298-1327: nucleotide sequence of RT-PCR primers listed in Table 3.
SEQ ID No. 1328: nucleotide sequence of lnc-149 from Arabidopsis thaliana.
SEQ ID No. 1329: nucleotide sequence of lnc-149 like from Arabidopsis lyrata.
SEQ ID No. 1330: nucleotide sequence of lnc-149 from Thellungiella halophila.
[074] The sequence listing contained in the file named "BCS14-2001_ST25.txt", which is 740 kilobytes (size as measured in Microsoft Windows®) and contains 1330 sequences, SEQ ID NO: 1 through SEQ ID NO: 1330, is filed herewith by electronic submission and is incorporated by reference herein.
Example 1: Materials and methods
[075] Plant materials and phosphate-starvation treatment. Wild-type Arabidopsis thaliana seedlings were used in this study. Col-0 seeds were sterilized in household disinfectant and then washed in sterilized water five times. They were then grown on full-strength MS medium supplemented with 1% sucrose and 1.2% agar under long-day conditions (24 °C, 16:8 h light/dark, at a light intensity of 100 μmol m-2 s-1). For Pi deficiency, 0.6 mM K2SO4 was used to substitute for the 1.2 mM KH2PO4 of P+ MS medium. Thirteen-day-old seedlings were collected, frozen in liquid nitrogen immediately and then stored at -80 °C for subsequent treatment.
[076] Non-polyA RNA and total RNA (rRNA removed) purification. Total RNA was extracted from thirteen-day-old seedlings using the QIAGEN RNeasy Plant Mini Kit and then quantified by NanoDrop 1000 and 1% agarose gel electrophoresis. To enrich non-polyA RNA, we used 10 μg RNA and captured polyA RNA with Oligo dT30 probes (Oligotex mRNA Mini Kit, QIAGEN) four times. 5 μl probes and 80 μl binding buffer were used each time. After centrifugation, probes binding polyA RNAs were precipitated and the supernatant mainly consisted of non-polyA RNA and short-polyA RNA. Then ~8 μg non-polyA RNA was collected and all of it was subjected to rRNA removal (RiboMinus™ Plant Kit, Invitrogen) two times. For total RNA collection, only the rRNA-removal protocol was performed, two times, on 10 μg RNA. Non-polyA RNAs and total RNAs were quantified using an Agilent 2100 Bioanalyzer (which shows the zonal distribution of total RNAs after rRNA removal) and then stored at -80 °C.
[077] Strand-specific cDNA library construction and high-throughput sequencing. ~^g RNA was used to construct a cDNA library. Sheared RNA was linked with different barcodes at its 3' and 5' ends, respectively, and reverse transcribed to cDNA to construct a strand-specific cDNA library according to the SMART method [26]. For total RNA-seq and non-polyA RNA-seq (short reads), 320-420 bp fragments (containing 200-300 bp cDNA fragments and 120 bp barcodes) were collected by gel recovery. After amplification, the cDNA library was sequenced as 36 nt paired-end reads using Illumina HiSeq 2000. For non-polyA RNA-seq (long reads) in WT and Pi-starvation samples, 320-620 bp fragments containing 200-500 bp cDNA fragments and 120 bp barcodes were purified, amplified and then sequenced as 100 nt single-end reads using Illumina HiSeq 2000. For each sample, ~20M raw reads were collected.
[078] Reads trimming, assembling and filtering. The sequencing quality of the reads was tested using FastQC v0.10.1 (www.bioinformatics.babraham.ac.uk/projects/fastqc/) (Figure 6). The 2 and 8 5'-terminal nucleotides with poor quality were trimmed from short reads and long reads, respectively, using the FASTX-toolkit. rRNA was filtered out from these trimmed reads to get clean reads using Bowtie (v0.12.7) with one mismatch [27]. The reads were mapped to the Arabidopsis thaliana reference genome (TAIR10) using TopHat (v2.0.5) with two mismatches. Mapped reads were assembled by Cufflinks (v2.0.1) and re-assembled by Cuffcompare (v2.0.1) following the protocol from [28]. Transcripts labeled CUFF are considered novel transcripts (without annotation). These transcripts were collected and then filtered again to remove those overlapping with coding genes and known ncRNAs. CPC tools [29] were used to calculate coding potential and those transcripts with CPC<0 (low coding potential) were retained. After selecting transcripts longer than 200 nt, the remaining transcripts were designated as novel lncRNAs. Novel lncRNAs were assigned to transposable elements (TE) (>1 nt overlap with TE, no strand-specificity), pseudogene (>1 nt overlap with pseudogene, no strand-specificity), antisense (>50% overlap with mRNA, on opposite strand), intronic (100% overlap with intron, on the same strand), ambiguous (>1 nt overlap with known ncRNAs or coding genes) and intergenic region (the remainder).
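As a non-limiting sketch, the final filtering steps described above (novel CUFF-labeled transcripts, no overlap with annotated coding genes or known ncRNAs, CPC<0, length above 200 nt) could be expressed as below. The Transcript record, its field names and the function select_novel_lncrnas are hypothetical and are introduced only for illustration; the overlap flag and the CPC score are assumed to have been computed beforehand by the tools named above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Transcript:
    transcript_id: str          # e.g. an identifier starting with "CUFF"
    length_nt: int              # transcript length in nucleotides
    cpc_score: float            # coding potential score from CPC
    overlaps_annotation: bool   # overlaps a coding gene or known ncRNA


def select_novel_lncrnas(
    transcripts: List[Transcript],
    min_length_nt: int = 200,
    cpc_cutoff: float = 0.0,
) -> List[Transcript]:
    """Keep unannotated, low-coding-potential transcripts longer than the cutoff."""
    return [
        t
        for t in transcripts
        if t.transcript_id.startswith("CUFF")   # novel (unannotated) transcript
        and not t.overlaps_annotation           # no coding gene / known ncRNA overlap
        and t.cpc_score < cpc_cutoff            # low coding potential
        and t.length_nt > min_length_nt         # longer than 200 nt
    ]
```

The subsequent assignment of each retained candidate to the TE, pseudogene, antisense, intronic, ambiguous or intergenic category is not covered by this sketch.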
[079] RT-PCR validation of novel lncRNAs. Nine novel lncRNA candidates from non-polyA RNA-seq (long reads) were selected. PolyA and non-polyA RNAs were prepared as described above, and cDNAs were synthesized using the Superscript™ III First-Strand Synthesis System (Invitrogen) with random hexamers as primers. qRT-PCR was performed with SYBR Premix Ex Taq (ABI StepOnePlus™ system, TaKaRa). PCR reactions were performed for cDNAs and a negative control using specific primers (Table 3).
[080] Classification into polyA and non-polyA lncRNAs. A published polyA RNA-seq database [25] and the 100 nt reads non-polyA RNA-seq generated in this study were used to classify polyA and non-polyA lncRNAs. The maximum expression value (Reads Per Kilobase per Million mapped reads, RPKM) of each lncRNA in polyA data and non-polyA data was determined separately. If a transcript's polyA RPKM was four times greater than its maximum non-polyA RPKM, or if the transcript was only identified in the polyA data, we defined the transcript as a polyA transcript, and vice versa. If a transcript could not be assigned to either the polyA class or the non-polyA class, the transcript was defined as a "biomorphic" transcript. For lncRNAs assembled from polyA RNA-seq, only the ones defined as polyA lncRNAs were retained. Using the same strategy a set of non-polyA lncRNAs was also defined.
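The classification rule above may be illustrated by the following minimal sketch. It assumes that "four times greater" means strictly more than four times the other value, and that a transcript absent from a dataset has an RPKM of 0 in that dataset; the function name classify_transcript is hypothetical.

```python
def classify_transcript(
    max_polya_rpkm: float,
    max_nonpolya_rpkm: float,
    fold: float = 4.0,
) -> str:
    """Assign a transcript to the polyA, non-polyA or biomorphic class."""
    if max_polya_rpkm < 0 or max_nonpolya_rpkm < 0:
        raise ValueError("RPKM values must be non-negative")
    if max_polya_rpkm == 0 and max_nonpolya_rpkm == 0:
        raise ValueError("transcript is not detected in either dataset")
    # A value of 0 in the other dataset ("only identified in ...") also
    # satisfies the fold comparison, so one test covers both cases.
    if max_polya_rpkm > fold * max_nonpolya_rpkm:
        return "polyA"
    if max_nonpolya_rpkm > fold * max_polya_rpkm:
        return "non-polyA"
    return "biomorphic"
```

A transcript with a polyA RPKM of 10 and a non-polyA RPKM of 1 is thus called polyA, whereas values of 3 and 1 leave it unassigned to either class.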
[081] Differential expression analysis. To find the differentially expressed lncRNAs responsive to Pi starvation, we first assigned read counts to lncRNAs from the RNA-seq data using the DEGseq package [30]. The raw read counts were normalized against the total mapped reads in each sample, and then the MA-plot-based method with random sampling model was used. A p-value of 0.05 and a fold change of 2 were set as the cutoffs. When comparing the ratios of differentially expressed lncRNAs, we performed a χ2 test with p-value < 0.01.
[082] DNA conservation analysis. A phylogenetic tree based on the divergence times of the plant species (www.timetree.org) was constructed using MEGA5.0. DNA conservation scores between Arabidopsis thaliana and 16 other organisms were calculated using BLASTn, followed by the calculation of the average DNA conservation score.
[083] Calculation of sequence and structure features. For convenient computational analysis, we split the entire Arabidopsis thaliana genome into 4,765,850 small bins (each of 100 bp), with neighboring bins overlapping by 50 nt [31], and calculated sequence and structure features (DNA sequence conservation, protein sequence conservation, RNA secondary structure stability and RNA structure conservation) for each genomic bin (every 100 bp of genome). When measuring features for genomic elements (e.g. lncRNA), we overlapped the genomic elements with bins and used the maximum or minimum (only for RNA structure stability) bin score for the comparison.
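The binning scheme can be sketched as follows, purely as an illustration. The sketch assumes 0-based, half-open coordinates on a single sequence and simply leaves any trailing bases that do not fill a complete bin uncovered; the function names make_bins and element_score are hypothetical, and the per-bin scores are assumed to come from the tools described in this Example.

```python
from typing import List, Sequence, Tuple

Bin = Tuple[int, int]  # half-open interval [start, end)


def make_bins(seq_length: int, bin_size: int = 100, step: int = 50) -> List[Bin]:
    """Tile a sequence with fixed-size bins whose starts are `step` apart."""
    if seq_length < bin_size:
        return []
    return [(s, s + bin_size) for s in range(0, seq_length - bin_size + 1, step)]


def element_score(
    start: int,
    end: int,
    bins: Sequence[Bin],
    scores: Sequence[float],
    use_min: bool = False,
) -> float:
    """Score a genomic element by the max (or min) score of overlapping bins."""
    hits = [
        score
        for (b_start, b_end), score in zip(bins, scores)
        if b_start < end and start < b_end  # intervals overlap by >= 1 nt
    ]
    if not hits:
        raise ValueError("element does not overlap any bin")
    return min(hits) if use_min else max(hits)
```

With use_min=True the same helper gives the minimum bin score, which in the text is used only for RNA structure stability.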
[084] DNA conservation scores were calculated across 31 plant species (genomes downloaded from PlantGDB) using BLASTn with default parameters. The maximum Bitscore was used as the feature score. Protein conservation scores were calculated using BLASTx in a similar manner. The coding potential of each bin was calculated using RNAcode with default parameters [32]. RNA secondary structure stability of each bin was calculated using RandFold [33], with 1000 dinucleotide random shufflings, and the p-value was used as the feature score. The RNA structure conservation scores were denoted by SCI (structure conservation index) scores, which were calculated using RNAz based on multiple alignments between Arabidopsis thaliana, Arabidopsis lyrata, Carica papaya, Thellungiella halophila and Citrus clementina (downloaded from VISTA database).
[085] Dozens of RNA-seq and tiling array datasets [34-40] were collected, and a set of unexpressed intergenic regions was defined as a negative control. The genomic regions with an expression level lower than the mean expression level of all genomic elements across all RNA-seq and array samples [31] were defined as unexpressed intergenic regions.
[086] Gene Ontology Enrichment Analysis. The 1 kb regions downstream or upstream of the coding genes were defined as the potential cis-regulatory regions. Four gene sets were defined: two sets with polyA or non-polyA lncRNAs located in the cis-regulatory region; two sets with polyA or non-polyA lncRNAs located on the opposite strand (the antisense lncRNAs). The GOtoolbox (genome.crg.es/GOToolBox/) was used to find enriched Gene Ontology terms in these four gene sets using 0.01 as the p-value cutoff.
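One possible way to test whether a lncRNA falls in the potential cis-regulatory region of a coding gene is sketched below. It assumes 0-based, half-open coordinates on the same chromosome, treats any overlap of at least 1 nt with either 1 kb flank as a hit, and ignores strand because both the upstream and the downstream flank are considered; the function name in_cis_regulatory_region is hypothetical.

```python
def in_cis_regulatory_region(
    lnc_start: int,
    lnc_end: int,
    gene_start: int,
    gene_end: int,
    flank: int = 1000,
) -> bool:
    """True if the lncRNA overlaps the 1 kb flank on either side of the gene."""

    def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
        return a_start < b_end and b_start < a_end

    left_flank = (max(gene_start - flank, 0), gene_start)
    right_flank = (gene_end, gene_end + flank)
    return overlaps(lnc_start, lnc_end, *left_flank) or overlaps(
        lnc_start, lnc_end, *right_flank
    )
```

A lncRNA lying entirely inside the gene body returns False here; antisense lncRNAs of that kind are handled as a separate gene set in the text.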
[087] Identification of conserved local structures. After calculating SCI (structure conservation index) scores of each bin (as described above), the maximum SCI score of the overlapping bins was assigned to all non-polyA lncRNA candidates. Those with a high SCI value (>= 0.7) and multiple alignments across five plants (including Arabidopsis thaliana, Arabidopsis lyrata, Thellungiella halophila, Carica papaya and Citrus clementina) were considered as highly conserved in both DNA sequence and RNA secondary structure. Based on the multiple genome alignments, the conserved RNA secondary structures were calculated using RNAalifold [41] and then visualized using VARNA [42].
Example 2: Development of a non-polyA RNA sequencing method to identify IncRNAs in the Arabidopsis thaliana genome
[088] High-throughput RNA sequencing technology can provide deeper insights into the transcriptome. Here we applied three kinds of RNA-sequencing methods to identify novel lncRNAs in Arabidopsis: (i) sequencing of total RNAs (rRNA depleted) for short reads of 36 nt in length; (ii) sequencing of non-polyA RNAs separated from total RNAs by a general RNA-seq protocol for short reads of 36 nt in length; (iii) sequencing of non-polyA RNAs for long reads of 100 nt in length (see Example 1). These three methods were compared by the length of assembled transcripts and the best one was selected for genome-wide lncRNA identification.
[089] The identification of novel lncRNAs comprised three steps: (i) purifying specific RNA components; (ii) constructing a strand-specific cDNA library and sequencing; (iii) high-throughput sequencing data processing to identify novel lncRNAs (Figure 1).
[090] As a first step, we improved the polyA RNA and rRNA removal protocol by utilizing fewer oligo dT30 probes and rRNA probes with more replicates. As a result, we found that only 13.04% and 18.72% of the non-polyA and total RNA-seq reads were mapped to TAIR10 rDNA regions, respectively (Table 1). Previous studies revealed that polyadenylation regulates RNA stability and degradation. Polyadenylation of 18S rRNA also regulates its degradation in mitochondria [43]. We therefore assumed that the polyA RNA selection could also remove some rRNAs.
[091] High-throughput sequencing generated ~20 million reads for each of these three methods (Table 1). First, we trimmed nucleotides with low quality scores in raw sequencing reads (Figure 6, see Example 1). After removing rRNA-matched reads, assembling transcripts, filtering annotated transcripts, selecting non-coding candidates and discarding short ones (see Example 1), we finally identified 97 novel lncRNAs in total RNA-seq with short read length, 559 in non-polyA RNA-seq with short read length and 177 in non-polyA RNA-seq with long read length.
[092] Comparing the length distribution of the assembled transcript fragments, we found that 62.27% of the fragments are longer than 200 nt for long read RNA-seq library, while the percentages are only 43.14% and 16.33% for two short read RNA-seq libraries, respectively (Figure 2A). After filtering out transcript fragments with known genomic annotations and high coding potential (CPC>0) (see Example 1), the long read RNA-seq library and the short read total RNA-seq library still had the longest and shortest average read length, respectively (Figure 2B). Although the long read RNA-seq method results in a lower number of transcripts before and after IncRNA screening, it yields longer IncRNA candidates.
[093] Using the same RNA-seq protocol but with different RNAs, we identified 97 and 559 novel lncRNA candidates from the total RNA-seq library and the non-polyA RNA-seq library, respectively. Of these, 64 out of 97 and 35 out of 559 are shared (with >50% nucleotides overlapped in at least one dataset). We examined these candidates using IGB [44] and found that several candidates from the total RNA-seq library were pieces of longer transcripts from the non-polyA RNA-seq library (data not shown). The non-polyA RNA-seq assay could enrich more reads derived from intergenic regions and with higher expression level. This is supported by the percentage of overlapping candidates in the two short-read RNA-seq libraries. 95.5% of lncRNAs were expressed at low levels (RPKM 0-5) in the non-polyA RNA-seq library, while only 3.4% of them were rediscovered in the total RNA-seq library. 1.25% of lncRNAs had the highest expression levels (RPKM 10^3-10^5) in the non-polyA RNA-seq library, and 60% of them could also be detected in the total RNA-seq library (Figure 2C). Because most lncRNAs are characterized by their extremely low expression level [9, 21], more robust experimental methods are needed to detect these low-expressed lncRNAs, and the non-polyA RNA-seq assay could be an ideal choice.
[094] Furthermore, we investigated the sources of these newly identified lncRNAs in the three RNA-seq libraries. Constructing a strand-specific cDNA library makes identification of novel lncRNAs more sensitive, especially for those derived from antisense regions. Locating these lncRNA candidates on the TAIR10 genome, we found that the majority are derived from the antisense strand of coding genes and the second largest set are transcribed from intergenic regions (Figure 2D). Finally, 9 randomly selected novel non-polyA lncRNA candidates were validated by RT-PCR. Because some non-polyA RNA remains during polyA RNA selection, we also detected these lncRNAs in polyA samples, but at much lower levels (Figure 2E).
Example 3: Identification of non-polyA and polyA IncRNAs associated with Pi starvation
[095] LncRNAs IPS1 and At4 act as target mimics to inhibit the activity of miRNA-399, and are involved in the response to Pi starvation in Arabidopsis [6]. However, a systematic understanding of the roles of lncRNAs in the Pi starvation response is still lacking. We utilized the non-polyA RNA-seq (long read) method (described above, see Examples 1 and 2) to identify lncRNAs which are differentially expressed in Arabidopsis seedlings under normal and Pi-deficient conditions. In addition, we also used published polyA RNA-seq data generated from Pi-stressed Arabidopsis for comparison (Table 2) [25].
[096] A computational framework was developed to detect and characterize lncRNAs specifically associated with Pi starvation in both polyA and non-polyA RNA-seq libraries (Figure 3A). We re-assembled the transcriptome, selected transcript fragments with low coding potential (CPC<0) and detected 870 and 210 lncRNAs after cross filtering in polyA and non-polyA RNA-seq libraries, respectively (see Example 1). 91 out of 870 (10.5%) and 68 out of 210 (32.2%) lncRNAs were differentially expressed between Pi-deficient and normal conditions (Figure 3B and 3C; see also Table 4). Most of the lncRNAs detected in polyA and non-polyA RNA-seq libraries, whether differentially expressed or not, were derived from intergenic regions (Figure 3C). Antisense lncRNAs were more enriched in the non-polyA RNA-seq library (Figure 7).
[097] Next, novel lncRNAs (in both polyA and non-polyA RNA-seq libraries) were characterized by their sequence and structural features, and by Gene Ontology (GO) enrichment, to identify lncRNA subgroups that are associated with Pi starvation (Figure 3A). We found that novel lncRNAs in both polyA and non-polyA RNA-seq libraries are more conserved than unexpressed intergenic regions, but less conserved than coding genes (Figure 4A and 4B). Additionally, we compared RNA structural stability, RNA structure conservation and protein conservation between novel lncRNAs and coding genes. We observed that most novel lncRNAs have extremely low protein conservation, which supports that the detected novel lncRNAs are potentially non-protein coding. Overall, novel lncRNAs have a slightly lower RNA structure free energy than coding genes, which reflects higher RNA structural stability.
[098] We also found that only some novel lncRNAs had higher RNA structural conservation than coding genes, which might be due to the extremely high DNA sequence conservation of coding genes, even though we had normalized the structure conservation to DNA conservation scores (Figure 4B). These three features of lncRNAs described above are consistent with previous reports [45, 46].
[099] The chromosomal locations of differentially expressed lncRNA candidates are listed in Table 4. Table 4 also includes a cross reference to the corresponding SEQ ID No. entry in the sequence listing. Also included are GO terms associated with coding regions co-localized with the lncRNA as described below.
[0100] Non-polyA lncRNAs are mainly located in antisense and intergenic regions (Figure 1D), with limited clues to infer their potential cellular functions. Previous studies suggested that lncRNAs located around protein coding genes (whether upstream, internal or downstream, on the sense or antisense strand) can regulate the expression of their 'host' coding genes in Arabidopsis [17, 47]. Therefore we analyzed potential lncRNA functions based on genomic co-location with protein coding genes. The 1 kb regions downstream and upstream of the protein coding genes were identified as the potential cis-regulatory regions. lncRNAs differentially expressed under Pi starvation conditions which are located in the cis-regulatory regions could potentially regulate their 'host' protein coding genes.
[0101] Gene Ontology Enrichment Analysis was performed on non-polyA and polyA IncRNAs which were differentially expressed in control and low Pi samples to determine their putative cellular functions. Several GO biological processes (BP) terms are more enriched for non-polyA IncRNAs, such as gibberellin signaling transduction and root epidermal morphology (Figure 4C). Notably, GO BP terms for polyA IncRNAs exhibit strong preference for plant hormone signaling transduction and responses (Figure 8). Our findings are supported by previous studies that many plant hormones (e.g. gibberellins) are deeply involved in Pi starvation responses [48]. Therefore, part of the non-polyA IncRNAs in cis-regulatory regions might be involved in hormone signaling transduction during Pi starvation.
Example 4: Validation of novel non-polyA lncRNAs responding to Pi starvation
[0102] Five intergenic non-polyA lncRNAs which were up-regulated under low Pi conditions according to the sequencing data were selected. The upregulated expression pattern of these intergenic non-polyA lncRNAs was validated by real-time PCR (Figure 5A).
[0103] The lnc-34 is transcribed from a region located upstream of a gene AT1G74670, which encodes a protein whose expression is responsive to gibberellins and sugar, two signals which function in Pi starvation responses [48, 49]. In non-polyA RNA sequencing data, differential expression for lnc-34 can be found between control and low Pi samples, while no similar phenomenon was found in polyA sequencing.
[0104] Apart from expression, we downloaded ChIP-seq data for H3K9ac (GSE28398) and chAP-seq data (GSM954614) for H2A.Z. Signals of H3K9ac were observed near lnc-34 and AT1G74670, and H2A.Z showed signal peaks at their 5' and 3' ends (Figure 5B). H3K9ac is a general activator [50], while H2A.Z was reported to regulate the expression of several classes of Pi starvation response genes (e.g. PHO1) [51] and always shows a strong signal peak at the 5' end and a smaller peak at the 3' end of most genes [52]. These observations indicate that the lncRNA may function as a regulator in Pi starvation.
[0105] The conserved secondary structures of all novel non-polyA lncRNAs were also determined. Interestingly, a Pi starvation related lncRNA, lnc-149, was identified with a common local structure conserved in three plant genomes (Figure 5C, see Example 1).
[0106] The conserved structure is composed of two conserved short hairpins and one variable long hairpin located at the 5' end (3-70 nt) of lnc-149. Moreover, the three parts of the local structure are connected together by a multi-branch loop and an additional stem. The key points of the connection, one in the start region of the first conserved hairpin and the other in the start region of the variable hairpin, are constrained by base pairs with covariance (the bases change but the structure is retained), implying that the conserved structure is favored during evolution and may have a function in the stress response to low Pi conditions. Furthermore, the loop regions in the two conserved hairpins are also maintained in three or two genomes, suggesting that the loops may have a function, for instance, to bind RNA-binding proteins.
Example 5: Identification of orthologues of IncRNA in other crop species.
[0107] Sequences with a high degree of similarity to several of the lncRNAs from Arabidopsis thaliana of Table 4 were identified in soybean (Table 5), rice (Table 6), sorghum (Table 7), foxtail millet (Table 8), wheat (Table 9) and maize/corn (Table 10). The Tables indicate the identity of the lncRNA in Arabidopsis and the reference to the appropriate sequence entry for the sequence in Arabidopsis, and the reference to the sequence entry for the crop species as well as chromosome location and % sequence identity.
Example 6: Modulation of Pi-starvation related IncRNAs in plants
[0108] Recombinant genes are made comprising the following DNA regions:
a) a plant-expressible promoter such as a CaMV35S promoter;
b) a DNA region which when transcribed yields an RNA comprising a nucleotide sequence of a lncRNA, preferably a non-polyA lncRNA upregulated under Pi starvation conditions, including a non-polyA lncRNA selected from lnc-24, lnc-107, lnc-139, lnc-194, lnc-195 or lnc-149.
[0109] The recombinant genes are introduced into plants, particularly Arabidopsis plants through transformation methods known in the art and transgenic plants are identified.
[0110] The behavior of the transgenic plants under Pi-starvation conditions is analyzed in comparison to transgenic plants under normal Pi conditions, as well as in comparison to wild-type non-transgenic plants under normal or Pi-starvation conditions.
[0111]
References
1. Hirayama, T. and K. Shinozaki, Research on plant abiotic stress responses in the post-genome era: past, present and future. The Plant journal : for cell and molecular biology, 2010. 61(6): p. 1041-52.
2. Chen, W., et al., Expression profile matrix of Arabidopsis transcription factor genes suggests their putative functions in response to environmental stresses. The Plant cell, 2002. 14(3): p. 559-74.
3. Sunkar, R. and J.K. Zhu, Novel and stress-regulated microRNAs and other small RNAs from Arabidopsis. The Plant cell, 2004. 16(8): p. 2001-19.
4. Chiou, T.J., et al., Regulation of phosphate homeostasis by MicroRNA in Arabidopsis. The Plant cell, 2006. 18(2): p. 412-21.
5. Bari, R., et al., PHO2, microRNA399, and PHR1 define a phosphate-signaling pathway in plants. Plant physiology, 2006. 141(3): p. 988-99.
6. Franco-Zorrilla, J.M., et al., Target mimicry provides a new mechanism for regulation of microRNA activity. Nature genetics, 2007. 39(8): p. 1033-7.
7. Shin, H., et al., Loss of At4 function impacts phosphate distribution between the roots and the shoots during phosphate starvation. The Plant journal : for cell and molecular biology, 2006. 45(5): p. 712-26.
8. Rinn, J.L. and H.Y. Chang, Genome regulation by long noncoding RNAs. Annual review of biochemistry, 2012. 81: p. 145-66.
9. Ravasi, T., et al., Experimental validation of the regulated expression of large numbers of non-coding RNAs from the mouse genome. Genome research, 2006. 16(1): p. 11-9.
10. Ponjavic, J., C.P. Ponting, and G. Lunter, Functionality or transcriptional noise? Evidence for selection within long noncoding RNAs. Genome research, 2007. 17(5): p. 556-65.
11. Struhl, K., Transcriptional noise and the fidelity of initiation by RNA polymerase II. Nature structural & molecular biology, 2007. 14(2): p. 103-5.
12. Ebisuya, M., et al., Ripples from neighbouring transcription. Nature cell biology, 2008. 10(9): p. 1106-13.
Dinger, M.E., et al., Long noncoding RNAs in mouse embryonic stem eel pluripotency and differentiation. Genome research, 2008. 18(9): p. 1433-45.
Guttman, M., et al., Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs. Nature biotechnology, 2010. 28(5): p. 503-10.
Borsani, G., et al., Characterization of a murine gene expressed from the inactive X chromosome. Nature, 1991. 351(6324): p. 325-9.
Brown, C.J., et al., The human XIST gene: analysis of a 17 kb inactive X-specific RNA that contains conserved repeats and is highly localized within the nucleus. Cell, 1992. 71(3): p. 527-42.
Heo, J.B. and S. Sung, Vernalization-mediated epigenetic silencing by a long intronic noncoding RNA. Science, 201 1. 331(6013): p. 76-9.
Marioni, J.C., et al., RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome research, 2008. 18(9): p. 1509- 17.
Zhu, Y.Y., et al., Reverse transcriptase template switching: a SMART approach for full-length cDNA library construction. BioTechniques, 2001. 30(4): p. 892-7. Schaefer, M., et al., RNA cytosine methylation analysis by bisulfite sequencing. Nucleic acids research, 2009. 37(2): p. el2.
Cabili, M.N., et al., Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes & development, 201 1. 25(18): p. 1915-27.
Khalil, A.M., et al., Many human large intergenic noncoding RNAs associate with chromatin-modifying complexes and affect gene expression. Proceedings of the National Academy of Sciences of the United States of America, 2009. 106(28): p. 1 1667-72.
Guttman, M., et al., Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature, 2009. 458(7235): p. 223- 7.
Yang, L., et al., Genome wide characterization of non-polyadenylated RNAs Genome biology, 201 1. 12(2): p. R16.
Lan, P., W. Li, and W. Schmidt, Complementary proteome and transcriptome profiling in phosphate-deficient Arabidopsis roots reveals multiple levels of gene regulation. Molecular & cellular proteomics : MCP, 2012. 11(1 1): p. 1 156-66. Levin, J.Z., et al., Comprehensive comparative analysis of strand- specific RNA sequencing methods. Nature methods, 2010. 7(9): p. 709-15.
Langmead, B., et al., Ultrafast and memory- efficient alignment of short DNA sequences to the human genome. Genome biology, 2009. 10(3): p. R25.
Trapnell, C, et al., Differential gene and transcript expression analysis of RNA- seq experiments with TopHat and Cufflinks. Nature protocols, 2012. 7(3): p. 562- 78.
Kong, L., et al., CPC: assess the protein-coding potential of transcripts using sequence features and support vector machine. Nucleic acids research, 2007. 35(Web Server issue): p. W345-9.
Wang, L., et al., DEGseq: an R package for identifying differentially expressed genes from RNA-seq data. Bioinformatics, 2010. 26(1): p. 136-8.
Lu, Z.J., et al., Prediction and characterization of noncoding RNAs in C. elegans by integrating conservation, secondary structure, and high-throughput sequencing and array data. Genome research, 201 1. 21(2): p. 276-85.
Washietl, S., et al., RNAcode: robust discrimination of coding and noncoding regions in comparative sequence data. RNA, 201 1. 17(4): p. 578-94.
Bonnet, E., et al., Evidence that microRNA precursors, unlike other non-coding RNAs, have lower folding free energies than random sequences. Bioinformatics, 2004. 20(17): p. 291 1 -7.
Zeller, G., et al., Stress-induced changes in the Arabidopsis thaliana transcriptome analyzed using whole-genome tiling arrays. The Plant journal for cell and molecular biology, 2009. 58(6): p. 1068-82.
Becker, C, et al., Spontaneous epigenetic variation in the Arabidopsis thaliana methylome. Nature, 201 1. 480(7376): p. 245-9.
Filichkin, S.A., et al., Genome-wide mapping of alternative splicing in Arabidopsis thaliana. Genome research, 2010. 20(1): p. 45-58.
German, M.A., et al., Global identification of microRNA-target RNA pairs by parallel analysis of RNA ends. Nature biotechnology, 2008. 26(8): p. 941 -6.
Gregory, B.D., et al., A link between RNA metabolism and silencing affecting Arabidopsis development. Developmental cell, 2008. 14(6): p. 854-66.
Laubinger, S., et al., At-TAX: a whole genome tiling array resource for developmental expression analysis and transcript identification in Arabidopsis thaliana. Genome biology, 2008. 9(7): p. Rl 12.
Lister, R., et al., Highly integrated single-base resolution maps of the epigenome in Arabidopsis. Cell, 2008. 133(3): p. 523-36.
Bernhart, S.H., et al., RNAalifold: improved consensus structure prediction for RNA alignments. BMC bioinformatics, 2008. 9: p. 474.
Darty, K., A. Denise, and Y. Ponty, VARNA: Interactive drawing and editing of the RNA secondary structure. Bioinformatics, 2009. 25(15): p. 1974-5.
Perrin, R., et al., AtmtPNPase is required for multiple aspects of the 18S rRNA metabolism in Arabidopsis thaliana mitochondria. Nucleic acids research, 2004. 32(17): p. 5174-82.
Nicol, J.W., et al., The Integrated Genome Browser: free software for distribution and exploration of genome-scale datasets. Bioinformatics, 2009. 25(20): p. 2730-1.
Guttman, M. and J.L. Rinn, Modular regulatory principles of large non-coding RNAs. Nature, 2012. 482(7385): p. 339-46.
Marques, A.C. and C.P. Ponting, Catalogues of mammalian long noncoding RNAs: modest conservation and incompleteness. Genome biology, 2009. 10(11): p. R124.
Swiezewski, S., et al., Cold-induced silencing by long antisense transcripts of an Arabidopsis Polycomb target. Nature, 2009. 462(7274): p. 799-802.
Devaiah, B.N., et al., Phosphate starvation responses and gibberellic acid biosynthesis are regulated by the MYB62 transcription factor in Arabidopsis. Molecular plant, 2009. 2(1): p. 43-58.
Karthikeyan, A.S., et al., Phosphate starvation responses are mediated by sugar signaling in Arabidopsis. Planta, 2007. 225(4): p. 907-18.
He, G., A.A. Elling, and X.W. Deng, The epigenome and plant development. Annual review of plant biology, 2011. 62: p. 411-35.
Smith, A.P., et al., Histone H2A.Z regulates the expression of several classes of phosphate starvation response genes but not as a transcriptional activator. Plant physiology, 2010. 152(1): p. 217-25.
Coleman-Derr, D. and D. Zilberman, Deposition of histone variant H2A.Z within gene bodies regulates responsive genes. PLoS genetics, 2012. 8(10): p. e1002988.
Chiou, T.J. and S.I. Lin, Signaling network in sensing phosphate availability in plants. Annual review of plant biology, 2011. 62: p. 185-206.
Jain, A., V.K. Nagarajan, and K.G. Raghothama, Transcriptional regulation of phosphate acquisition by higher plants. Cell Mol Life Sci, 2012. 69(19): p. 3207-24.
Wu, P., et al., Phosphate starvation triggers distinct alterations of genome expression in Arabidopsis roots and leaves. Plant physiology, 2003. 132(3): p. 1260-71.
Jiang, C., et al., Phosphate starvation root architecture and anthocyanin accumulation responses are modulated by the gibberellin-DELLA signaling pathway in Arabidopsis. Plant physiology, 2007. 145(4): p. 1460-70.
Chodroff, R.A., et al., Long noncoding RNA genes: conservation of sequence and brain expression among diverse amniotes. Genome biology, 2010. 11(7): p. R72.
Table 1. Total and non-polyA RiboMinus RNA-seq data
Total RNA-seq (short reads) and non-polyA RNA-seq (short reads) were sequenced at 36 nt single-end; non-polyA RNA-seq (long reads) for control and low Pi samples were sequenced at 100 nt single-end. Reads were first aligned to TAIR10 rRNA sequences, and the remaining reads were mapped to the TAIR10 genome using TopHat with two mismatches.
Table 2. Downloaded Poly A RNA-seq data
The data were downloaded from Ref [45]. Raw reads from polyA RNA-seq were mapped to the TAIR10 genome using TopHat with two mismatches.
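The two-step read processing described for Tables 1 and 2 (discard reads that match rRNA, then map the remaining reads to the genome while tolerating up to two mismatches) can be illustrated with a toy sketch. This is not how TopHat works internally (TopHat is a splice-aware aligner built on Bowtie); the function names `hamming`, `find_hits` and `filter_then_map` are hypothetical and only show the filtering logic on short strings.

```python
from typing import Dict, List


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def find_hits(read: str, reference: str, max_mismatches: int = 2) -> List[int]:
    """Return every ungapped start position where the read matches the
    reference with at most `max_mismatches` mismatches (assumed toy model)."""
    k = len(read)
    return [
        start
        for start in range(len(reference) - k + 1)
        if hamming(read, reference[start:start + k]) <= max_mismatches
    ]


def filter_then_map(
    reads: List[str], rrna: str, genome: str, max_mismatches: int = 2
) -> Dict[str, List[int]]:
    """Drop reads that hit the rRNA reference, then map the rest to the genome."""
    mapped: Dict[str, List[int]] = {}
    for read in reads:
        if find_hits(read, rrna, max_mismatches):
            continue  # read is attributed to rRNA and discarded
        hits = find_hits(read, genome, max_mismatches)
        if hits:
            mapped[read] = hits
    return mapped
```

A brute-force scan like this is only workable for very small inputs; real pipelines rely on indexed aligners.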
Table 3. RT-PCR primers
Table 4: Long non-coding RNA from Arabidopsis thaliana
| lncRNA ID | Chr | Strand | Tr start | Tr end | Exon num | Exon start | Exon end | Differential expression | SEQ ID NO | Potential functions |
|---|---|---|---|---|---|---|---|---|---|---|
| PolyA_lnc-1 | Chr1 | | 11616594 | 11617128 | | 11616594, | 11617128, | | 212 | abscisic acid mediated signaling pathway, cell communication, cell death, cellular membr fusion, cytoplasm, defense response to bacterium, incompatible interaction, defense res to fungus, embryo development, endoplasmic reticulum unfolded protein response, ethyl mediated signaling pathway, ethylene mediated signaling pathway, glycolysis, Golgi organization, Golgi vesicle transport, hyperosmotic response, jasmonic acid mediated sig pathway, jasmonic acid mediated signaling pathway, lateral root morphogenesis, MAPK cascade, NAD+ ADP-ribosyltransferase activity, NAD+ ADP-ribosyltransferase activity, NA ADP-ribosyltransferase activity, negative regulation of defense response, negative regula of programmed cell death, nitric oxide biosynthetic process, nucleus, nucleus, nucleus, nucleus, programmed cell death, protein binding, protei binding, protein binding, protein binding, protein glycosylation, protein targeting to membrane, protein targeting to vacuole, regulation of plant-type hypersensitive response, regulation of reactive oxygen species metabolic process, response to abscisic a stimulus, response to auxin stimulus, response to cadmium ion, response to carbohydrat stimulus, response to cold, response to ethylene stimulus, response to external stimulus, response to osmotic stress, response to oxidative stress, response to ozone, res to salicylic acid stimulus, response to salt stress, response to salt stress, response to superoxide, response to superoxide, response to temperature stimulus, response to wate deprivation, salicylic acid mediated signaling pathway, systemic acquired resistance, salic acid mediated signaling pathway, vesicle-mediated transport, water transport |
Table 5. Orthologs in Glycine max
Table 6. Orthologs in Oryza sativa
Table 7. Orthologs in Sorghum bicolor
Table 8. Orthologs in Setaria italica
Table 9. Orthologs in Triticum aestivum
Table 10. Orthologs in Zea mays
Claims
1. A long non-coding RNA (lncRNA) molecule which is differentially expressed in a plant grown under phosphate starvation conditions when compared to a plant grown under normal conditions with adequate phosphate supply and which is unpolyadenylated or a DNA molecule which when transcribed yields an RNA molecule comprising said lncRNA or the complement of said DNA molecule.
2. The lncRNA molecule according to claim 1 which comprises a nucleotide sequence having at least 80% identity with any one of SEQ ID No. 4; SEQ ID No. 5; SEQ ID No. 6; SEQ ID No. 15; SEQ ID No. 18; SEQ ID No. 26; SEQ ID No. 28; SEQ ID No. 32; SEQ ID No. 34; SEQ ID No. 36; SEQ ID No. 40; SEQ ID No. 42; SEQ ID No. 47; SEQ ID No. 52; SEQ ID No. 55; SEQ ID No. 57; SEQ ID No. 59; SEQ ID No. 60; SEQ ID No. 62; SEQ ID No. 64; SEQ ID No. 70; SEQ ID No. 72; SEQ ID No. 77; SEQ ID No. 79; SEQ ID No. 82; SEQ ID No. 83; SEQ ID No. 84; SEQ ID No. 86; SEQ ID No. 90; SEQ ID No. 91; SEQ ID No. 98; SEQ ID No. 99; SEQ ID No. 102; SEQ ID No. 107; SEQ ID No. 114; SEQ ID No. 115; SEQ ID No. 118; SEQ ID No. 119; SEQ ID No. 120; SEQ ID No. 121; SEQ ID No. 128; SEQ ID No. 132; SEQ ID No. 137; SEQ ID No. 139; SEQ ID No. 140; SEQ ID No. 145; SEQ ID No. 149; SEQ ID No. 151; SEQ ID No. 152; SEQ ID No. 158; SEQ ID No. 160; SEQ ID No. 162; SEQ ID No. 172; SEQ ID No. 174; SEQ ID No. 176; SEQ ID No. 178; SEQ ID No. 179; SEQ ID No. 180; SEQ ID No. 181; SEQ ID No. 183; SEQ ID No. 187; SEQ ID No. 194; SEQ ID No. 195; SEQ ID No. 200; SEQ ID No. 201; SEQ ID No. 204; SEQ ID No. 209 or SEQ ID No. 210 or a DNA molecule which when transcribed yields an RNA molecule comprising said lncRNA.
3. The lncRNA molecule according to claim 1 having a nucleotide sequence comprising any one of SEQ ID No. 4; SEQ ID No. 5; SEQ ID No. 6; SEQ ID No. 15; SEQ ID No. 18; SEQ ID No. 26; SEQ ID No. 28; SEQ ID No. 32; SEQ ID No. 34; SEQ ID No. 36; SEQ ID No. 40; SEQ ID No. 42; SEQ ID No. 47; SEQ ID No. 52; SEQ ID No. 55; SEQ ID No. 57; SEQ ID No. 59; SEQ ID No. 60; SEQ ID No. 62; SEQ ID No. 64; SEQ ID No. 70; SEQ ID No. 72; SEQ ID No. 77; SEQ ID No. 79; SEQ ID No. 82; SEQ ID No. 83; SEQ ID No. 84; SEQ ID No. 86; SEQ ID No. 90; SEQ ID No. 91; SEQ ID No. 98; SEQ ID No. 99; SEQ ID No. 102; SEQ ID No. 107; SEQ ID No. 114; SEQ ID No. 115; SEQ ID No. 118; SEQ ID No. 119; SEQ ID No. 120; SEQ ID No. 121; SEQ ID No. 128; SEQ ID No. 132; SEQ ID No. 137; SEQ ID No. 139; SEQ ID No. 140; SEQ ID No. 145; SEQ ID No. 149; SEQ ID No. 151; SEQ ID No. 152; SEQ ID No. 158; SEQ ID No. 160; SEQ ID No. 162; SEQ ID No. 172; SEQ ID No. 174; SEQ ID No. 176; SEQ ID No. 178; SEQ ID No. 179; SEQ ID No. 180; SEQ ID No. 181; SEQ ID No. 183; SEQ ID No. 187; SEQ ID No. 194; SEQ ID No. 195; SEQ ID No. 200; SEQ ID No. 201; SEQ ID No. 204; SEQ ID No. 209 or SEQ ID No. 210 or a DNA molecule which when transcribed yields an RNA molecule comprising said lncRNA.
4. The lncRNA molecule according to claim 1, wherein said differential expression is upregulation, or a DNA molecule which when transcribed yields an RNA molecule comprising said lncRNA or comprises the complement of said nucleotide sequence.
5. The lncRNA molecule according to claim 4, wherein said lncRNA molecule comprises a nucleotide sequence having at least 80% identity with any one of SEQ ID Nos. 34, 107, 139, 194 or 195 or a DNA molecule which when transcribed yields an RNA molecule comprising said lncRNA.
6. The lncRNA molecule according to claim 4, having a nucleotide sequence comprising any one of SEQ ID Nos. 34, 107, 139, 194 or 195 or a DNA molecule which when transcribed yields an RNA molecule comprising said lncRNA.
7. The lncRNA molecule according to claim 1, wherein said lncRNA molecule comprises at least two conserved hairpin loops, preferably further comprising two co-variance base pairings or a DNA molecule which when transcribed yields an RNA molecule comprising said lncRNA.
8. The lncRNA molecule according to claim 7, wherein said lncRNA molecule comprises a nucleotide sequence having at least 80% sequence identity with SEQ ID No. 149 or a DNA molecule which when transcribed yields an RNA molecule comprising said lncRNA.
9. The lncRNA molecule according to claim 8, having a nucleotide sequence comprising SEQ ID No. 149 or a DNA molecule which when transcribed yields an RNA molecule comprising said lncRNA.
10. Use of an lncRNA molecule according to any one of claims 1 to 3, or a DNA molecule which when transcribed yields such lncRNA molecule, to modulate inorganic phosphate use efficiency in a plant.
11. Use of an lncRNA molecule which is differentially expressed in a plant grown under phosphate starvation conditions when compared to a plant grown under normal conditions with adequate phosphate supply, or a DNA molecule which when transcribed yields such lncRNA molecule, or the complement of said DNA molecule, to modulate inorganic phosphate use efficiency in a plant.
12. The use according to claim 11, wherein said lncRNA molecule comprises a nucleotide sequence having at least 80% sequence identity to SEQ ID No. 216; SEQ ID No. 235; SEQ ID No. 240; SEQ ID No. 250; SEQ ID No. 266; SEQ ID No. 267; SEQ ID No. 278; SEQ ID No. 281; SEQ ID No. 283; SEQ ID No. 315; SEQ ID No. 320; SEQ ID No. 321; SEQ ID No. 325; SEQ ID No. 330; SEQ ID No. 361; SEQ ID No. 364; SEQ ID No. 368; SEQ ID No. 370; SEQ ID No. 378; SEQ ID No. 384; SEQ ID No. 388; SEQ ID No. 393; SEQ ID No. 396; SEQ ID No. 414; SEQ ID No. 420; SEQ ID No. 424; SEQ ID No. 429; SEQ ID No. 432; SEQ ID No. 433; SEQ ID No. 436; SEQ ID No. 437; SEQ ID No. 448; SEQ ID No. 459; SEQ ID No. 460; SEQ ID No. 461; SEQ ID No. 462; SEQ ID No. 463; SEQ ID No. 486; SEQ ID No. 504; SEQ ID No. 512; SEQ ID No. 533; SEQ ID No. 570; SEQ ID No. 571; SEQ ID No. 572; SEQ ID No. 600; SEQ ID No. 601; SEQ ID No. 617; SEQ ID No. 618; SEQ ID No. 649; SEQ ID No. 651; SEQ ID No. 652; SEQ ID No. 656; SEQ ID No. 660; SEQ ID No. 662; SEQ ID No. 707; SEQ ID No. 709; SEQ ID No. 712; SEQ ID No. 713; SEQ ID No. 715; SEQ ID No. 716; SEQ ID No. 725; SEQ ID No. 734; SEQ ID No. 783; SEQ ID No. 785; SEQ ID No. 786; SEQ ID No. 787; SEQ ID No. 788; SEQ ID No. 798; SEQ ID No. 817; SEQ ID No. 850; SEQ ID No. 857; SEQ ID No. 918; SEQ ID No. 921; SEQ ID No. 945; SEQ ID No. 946; SEQ ID No. 955; SEQ ID No. 957; SEQ ID No. 972; SEQ ID No. 985; SEQ ID No. 986; SEQ ID No. 990; SEQ ID No. 997; SEQ ID No. 998; SEQ ID No. 1020; SEQ ID No. 1025; SEQ ID No. 1031; SEQ ID No. 1033; SEQ ID No. 1047; SEQ ID No. 1048; SEQ ID No. 1068 or SEQ ID No. 1080.
13. The use according to claim 5, wherein said lncRNA molecule comprises a nucleotide sequence of SEQ ID No. 216; SEQ ID No. 235; SEQ ID No. 240; SEQ ID No. 250; SEQ ID No. 266; SEQ ID No. 267; SEQ ID No. 278; SEQ ID No. 281; SEQ ID No. 283; SEQ ID No. 315; SEQ ID No. 320; SEQ ID No. 321; SEQ ID No. 325; SEQ ID No. 330; SEQ ID No. 361; SEQ ID No. 364; SEQ ID No. 368; SEQ ID No. 370; SEQ ID No. 378; SEQ ID No. 384; SEQ ID No. 388; SEQ ID No. 393; SEQ ID No. 396; SEQ ID No. 414; SEQ ID No. 420; SEQ ID No. 424; SEQ ID No. 429; SEQ ID No. 432; SEQ ID No. 433; SEQ ID No. 436; SEQ ID No. 437; SEQ ID No. 448; SEQ ID No. 459; SEQ ID No. 460; SEQ ID No. 461; SEQ ID No. 462; SEQ ID No. 463; SEQ ID No. 486; SEQ ID No. 504; SEQ ID No. 512; SEQ ID No. 533; SEQ ID No. 570; SEQ ID No. 571; SEQ ID No. 572; SEQ ID No. 600; SEQ ID No. 601; SEQ ID No. 617; SEQ ID No. 618; SEQ ID No. 649; SEQ ID No. 651; SEQ ID No. 652; SEQ ID No. 656; SEQ ID No. 660; SEQ ID No. 662; SEQ ID No. 707; SEQ ID No. 709; SEQ ID No. 712; SEQ ID No. 713; SEQ ID No. 715; SEQ ID No. 716; SEQ ID No. 725; SEQ ID No. 734; SEQ ID No. 783; SEQ ID No. 785; SEQ ID No. 786; SEQ ID No. 787; SEQ ID No. 788; SEQ ID No. 798; SEQ ID No. 817; SEQ ID No. 850; SEQ ID No. 857; SEQ ID No. 918; SEQ ID No. 921; SEQ ID No. 945; SEQ ID No. 946; SEQ ID No. 955; SEQ ID No. 957; SEQ ID No. 972; SEQ ID No. 985; SEQ ID No. 986; SEQ ID No. 990; SEQ ID No. 997; SEQ ID No. 998; SEQ ID No. 1020; SEQ ID No. 1025; SEQ ID No. 1031; SEQ ID No. 1033; SEQ ID No. 1047; SEQ ID No. 1048; SEQ ID No. 1068 or SEQ ID No. 1080.
14. The use according to claim 5, wherein said lncRNA molecule comprises a nucleotide sequence of any one of Tables 5 to 10.
15. A method for modulating the phosphate use efficiency in a plant comprising the step of increasing or decreasing the transcription and/or concentration of an lncRNA molecule according to any one of claims 1 to 14 in a plant cell.
16. The method according to claim 15, comprising the step of increasing the transcription and/or concentration of an lncRNA molecule according to any one of claims 1 to 14 in a plant cell.
17. The method according to claim 16, wherein said phosphate use efficiency is increased.
18. The method according to claim 15, comprising the step of decreasing the transcription and/or concentration of an lncRNA molecule according to any one of claims 1 to 14 in a plant cell.
19. The method according to claim 16, wherein said phosphate use efficiency is increased.
20. The method according to claim 18, wherein said decrease in transcription or concentration is achieved by mutating the DNA region from which said lncRNA molecule is transcribed.
21. The method according to claim 18, wherein said decrease in transcription or concentration is achieved by expression of an inhibitory RNA molecule specifically recognizing said lncRNA molecule.
22. The method according to claim 16, wherein said increase in transcription or concentration is achieved by mutating the DNA region from which said lncRNA molecule is transcribed.
23. The method according to claim 16, comprising introducing a recombinant gene comprising:
a. a plant-expressible promoter
b. a DNA region which when transcribed encodes an RNA molecule comprising or consisting of said lncRNA molecule; and optionally
c. an appropriate transcription termination region, and/or polyadenylation region.
24. A modified plant cell comprising a modulated concentration or transcription of an lncRNA molecule according to any one of claims 1 to 14 when compared to an unmodified plant cell.
25. A modified plant, or plant part, tissue or organ comprising a multitude of or consisting essentially of modified plant cells according to claim 24.
26. A seed of the modified plant according to claim 25.
27. A method for isolating further lncRNA molecules involved in phosphate use efficiency in a plant comprising the step of identifying a nucleotide sequence having a degree of homology to the lncRNA molecules of any one of claims 1 to 14 and isolating or synthesizing such RNA molecule comprising or consisting of such nucleotide sequence.
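Several of the claims above rely on a threshold of at least 80% sequence identity and on DNA molecules that yield the lncRNA when transcribed. As a rough illustration only, the sketch below computes percent identity from a simple global alignment and applies the 80% cut-off. The scoring scheme (match +1, mismatch -1, gap -1), the definition of identity as matching columns divided by alignment length, and all function names are assumptions made for this example; the definition of sequence identity given in the specification governs, and established alignment tools would be used in practice.

```python
from typing import Tuple

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(sense_dna: str) -> str:
    """RNA with the same sequence as the sense DNA strand (T replaced by U)."""
    return sense_dna.upper().replace("T", "U")


def reverse_complement(dna: str) -> str:
    """Complementary DNA strand, written 5' to 3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(dna.upper()))


def global_align(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -1) -> Tuple[str, str]:
    """Needleman-Wunsch global alignment with linear gap penalties (assumed scoring)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical aligned columns as a percentage of the alignment length."""
    aligned_a, aligned_b = global_align(a, b)
    if not aligned_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(candidate: str, reference: str,
                             threshold: float = 80.0) -> bool:
    """True if the candidate reaches the identity threshold against the reference."""
    return percent_identity(candidate, reference) >= threshold
```

For example, a candidate DNA sequence could be passed through `transcribe` and then compared against a reference lncRNA sequence with `meets_identity_threshold`. The quadratic-time alignment is adequate for short sequences only.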
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2014/071873 WO2015117265A1 (en) | 2014-02-07 | 2014-02-07 | LONG NON-CODING RNAs FOR MODULATING PHOSPHATE USE EFFICIENCY IN PLANTS |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2015117265A1 (en) | 2015-08-13 |
Family
ID=53777109
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2014/071873 Ceased WO2015117265A1 (en) | 2014-02-07 | 2014-02-07 | LONG NON-CODING RNAs FOR MODULATING PHOSPHATE USE EFFICIENCY IN PLANTS |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2015117265A1 (en) |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2013090620A1 (en) * | 2011-12-13 | 2013-06-20 | Genomedx Biosciences, Inc. | Cancer diagnostics using non-coding transcripts |
Non-Patent Citations (4)
| Title |
|---|
| ANDRZEJ T WIERZBICKI.: "The role of long non-coding RNA in transcriptional gene silencing", CURRENT OPINION IN PLANT BIOLOGY, vol. 15, no. 5, 6 September 2012 (2012-09-06), pages 517 - 522, XP028958937 * |
| BESMA BEN AMOR ET AL.: "Novel long non-protein coding RNAs involved in Arabidopsis differentiation and stress responses", GENOME RES., vol. 19, no. 1, 31 January 2009 (2009-01-31), pages 57 - 69, XP055216531 * |
| JEREMY E. WILUSZ ET AL.: "A triple helix stabilizes the 3' ends of long noncoding RNAs that lack poly(A) tails", GENES DEV., vol. 26, no. 21, 1 November 2012 (2012-11-01), pages 2392 - 2407, XP055194808 * |
| SHU YONG-JUN ET AL.: "Computational identification and functional analysis of long non- coding RNA in Triticum aestivum", CHINESE JOURNAL OF BIOINFORMATICS, vol. 11, no. 2, 30 June 2013 (2013-06-30), pages 153 - 157 * |
Cited By (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2019233863A1 (en) | 2018-06-04 | 2019-12-12 | Bayer Aktiengesellschaft | Herbicidally active bicyclic benzoylpyrazoles |
| CN111304197A (en) * | 2018-12-11 | 2020-06-19 | 东北农业大学 | Long-chain non-coding RNA gene of beet under alkali stress resistance and preparation method and application thereof |
| CN110699355A (en) * | 2019-07-30 | 2020-01-17 | 中山大学 | Long non-coding RNA gene ROVULE and its application in regulating rice endosperm development |
| CN110699355B (en) * | 2019-07-30 | 2023-09-22 | 中山大学 | Long non-coding RNA gene ROVULE and its application in regulating rice endosperm development |
| CN114807137A (en) * | 2021-07-07 | 2022-07-29 | 忻州师范学院 | Potato high temperature response lncRNA and its application |
| CN114214334A (en) * | 2022-01-12 | 2022-03-22 | 山东农业大学 | Application of the gene EsH2A.3 from salt mustard in regulating plant salt tolerance |
| CN114214334B (en) * | 2022-01-12 | 2023-08-04 | 山东农业大学 | The application of the gene EsH2A.3 derived from the salt mustard in the regulation of plant salt tolerance |
| CN116024218A (en) * | 2023-02-06 | 2023-04-28 | 山东农业大学 | A kind of Arabidopsis LncRNA29 and its application |
| CN119955806A (en) * | 2025-02-05 | 2025-05-09 | 华中农业大学 | A citrus CsMYB62 gene and its application in softening citrus branch thorns |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 14882100; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 14882100; Country of ref document: EP; Kind code of ref document: A1 |