HK1229378A1

HK1229378A1 - Gene gw7 for controlling grain shape, exterior quality and yield of rice and applications of gene gw7

Info

Publication number: HK1229378A1
Application number: HK17103131.5A
Authority: HK
Inventors: 傅向东; 王少奎; 吴昆; 刘倩; 李姗
Original assignee: 中国科学院遗传与发育生物学研究所
Filing date: 2017-03-27
Publication date: 2017-11-17

Description

Gene GW7 for controlling rice grain type, appearance quality and yield and application thereof

Technical Field

The present invention belongs to the field of botany and biotechnology. Specifically, the invention relates to a gene GW7 for controlling rice grain type and appearance quality and application thereof. The invention also relates to the polypeptide coded by the gene and the promoter sequence of the gene.

Background

Rice (Oryza sativa L.) is the leading food crop in China and one of the major food crops in the world; it is the staple food of nearly half of the global population. In recent years the area planted to rice has declined steadily, so raising yield per unit area has become the most important way to increase total grain output and address grain shortages. Semi-dwarf breeding and the exploitation of heterosis have greatly increased rice yield and contributed substantially to China's food security. For a long time, however, the need to feed a very large population led rice breeding in China to pursue high yield as its main target while neglecting quality, with the result that Chinese rice varieties are generally high-yielding but not of high quality. With rising living standards and the opening of the rice market, rice is traded as a commodity and consumers increasingly demand better appearance, cooking and eating quality. Improving rice quality has therefore become increasingly urgent, and breeding new varieties that combine high quality with high yield has become an important direction for rice breeding in China.

Rice grain shape (grain length, grain width and aspect ratio) is directly related to rice yield and quality. Research on grain shape traits therefore has considerable practical and theoretical value for breeding rice with improved yield and quality. Grain shape is a typical quantitative trait: it is regulated by multiple genes and has a complex genetic basis. Dissecting, mapping and cloning the QTLs that control such quantitative traits by means of genetic crosses and molecular marker technology is an effective way to study grain shape genes. In recent years a number of genes controlling rice grain shape have been cloned by this approach, for example the grain width genes GW2 (Song et al., 2007), GW5 (Shomura et al., 2008; Weng et al., 2008), GS5 (Li et al., 2011) and GW8 (Wang et al., 2012), and the grain length gene GS3 (Fan et al., 2006; Mao et al., 2010). The cloning and functional analysis of these grain shape genes provide the material basis for molecular design breeding aimed at improving grain shape, yield and quality.

Disclosure of Invention

The invention relates to a gene for controlling rice yield and quality and to its applications. Its object is to isolate and clone from rice an important functional gene that controls rice grain shape (grain width and grain length) and quality, and to provide applications of that gene. In the present invention, this gene is referred to as GW7.

In the first aspect of the invention, a segregating population was constructed from Taifeng A, a high-quality sterile line used in hybrid rice breeding, and Hua nong nonglutinous rice 74; the GW7 gene was fine-mapped to a 2.6 kb physical interval at the end of the long arm of rice chromosome 7; and a near-isogenic line, NIL-GW7, was constructed in the Hua nong nonglutinous rice 74 background.

The second aspect of the present invention relates to gene GW7 for controlling rice grain type and quality, which is an isolated polynucleotide comprising a nucleotide sequence selected from the group consisting of:

(1) SEQ ID NOs: 1-4;

(2) a nucleotide sequence that hybridizes under moderately stringent conditions, preferably highly stringent hybridization conditions, to the complement of the nucleotide sequence of (1);

(3) a nucleotide sequence having at least 70%, preferably at least 80%, more preferably at least 90%, especially at least 95% or 98% or 99% identity to the nucleotide sequence of (1);

(4) a nucleotide sequence which encodes a protein of the same amino acid sequence as the nucleotide sequence of (1) but differs in sequence due to the degeneracy of the genetic code;

(5) a nucleotide sequence encoding one of the following amino acid sequences: the amino acid sequence set forth in any one of SEQ ID NOs: 7 and 8; an amino acid sequence that differs from the amino acid sequence set forth in any one of SEQ ID NOs: 7 and 8; or an amino acid sequence having preferably at least 80%, more preferably at least 90%, especially at least 95% or 98% identity to the amino acid sequence set forth in any one of SEQ ID NOs: 7 and 8;

(6) an active fragment of the nucleotide sequence of any one of (1) to (5); or

(7) a nucleotide sequence complementary to the nucleotide sequence of any one of (1) to (5).

Wherein the sequences of GW7 and of its allelic variant, including their cDNA sequences, are shown in SEQ ID NOs: 1-4, the nucleotide sequences of the related promoters are shown in SEQ ID NOs: 5-6, and the amino acid sequences of the encoded proteins are shown in SEQ ID NOs: 7-8. See Table 1 below for details.

Table 1. SEQ ID NOs: 1-8 and their sources

Preferably, the gene GW7 for controlling rice grain width, grain length and appearance quality has the sequence shown in SEQ ID NO: 2.

The invention also relates to a promoter sequence of GW7, comprising a nucleotide sequence selected from the group consisting of:

(1) SEQ ID NOs: 5 and 6;

(3) a nucleotide sequence having at least 70%, preferably at least 80%, more preferably at least 90%, especially at least 95% or 98% identity to the nucleotide sequence of (1);

(4) an active fragment of the nucleotide sequence of any one of (1) to (3); or

(5) a nucleotide sequence complementary to the nucleotide sequence of any one of (1) to (4).

The invention also relates to a construct comprising the promoter sequence, and a cell comprising the construct, wherein the cell is a plant cell, preferably a rice cell.

The third aspect of the present invention relates to an isolated polypeptide (also called protein) encoded by the GW7 gene according to the present invention, comprising an amino acid sequence selected from the group consisting of:

(1) the amino acid sequence set forth in any one of SEQ ID NOs: 7 and 8,

(2) an amino acid sequence that differs from the sequence set forth in any one of SEQ ID NOs: 9, 10 or 11, or a pharmaceutically acceptable salt thereof,

(3) an amino acid sequence having preferably at least 80%, more preferably at least 90%, in particular at least 95% or 98% or 99% identity to the amino acid sequence set forth in any one of SEQ ID NOs: 7 and 8,

(4) an active fragment of the amino acid sequence of (1) or (2) or (3),

(5) an amino acid sequence encoded by a polynucleotide molecule of the invention.

Wherein the amino acid sequences of the GW7 protein and of its variant protein are shown in SEQ ID NOs: 7 and 8; see Table 1 for details.

The fourth aspect of the present invention relates to a recombinant construct containing the polynucleotide sequence of the gene GW7 for controlling rice grain type and quality. Wherein the vector used for the construct is a cloning vector or an expression vector for expressing the polynucleotide.

The fifth aspect of the invention relates to a recombinant host cell which contains the recombinant construct, or which has the polynucleotide sequence of the gene GW7 for controlling rice grain shape and quality integrated into its genome. The host cell may be selected from plant cells and microbial cells, such as E. coli cells or Agrobacterium cells; it is preferably a plant cell, most preferably a rice cell. The cell may be isolated, ex vivo, cultured, or part of a plant.

A sixth aspect of the invention relates to the use of a polynucleotide (i.e., gene GW7 that controls rice grain type and quality) or a polypeptide of the invention or a recombinant construct of the invention or a recombinant host cell of the invention for improving crop plant traits (e.g., increased crop yield).

The present invention also relates to a method of improving traits of a crop plant (e.g., increasing crop yield), which comprises preparing a crop plant comprising the polynucleotide sequence of the gene GW7 of the present invention for controlling rice grain shape and quality, or a construct of the present invention. For example, the method may comprise: regenerating a transgenic plant from a recombinant plant cell containing the gene GW7 and allelic variants thereof; or crossing a plant containing the gene GW7 and allelic variants thereof with another plant; or transfecting a crop plant with a recombinant Agrobacterium cell containing GW7. Such traits include, but are not limited to, grain width, grain length, aspect ratio, grain-filling rate, yield and quality. The plant is preferably one with altered grain shape, an increased aspect ratio, no significant change in thousand-grain weight, and improved grain quality; preferably, the crop plant is rice.

In a seventh aspect, the present invention provides uses of the gene encoding GW7: controlling the grain shape of crops (more preferably, improving the appearance quality of a crop by controlling grain width and changing the aspect ratio); regulating processes including, but not limited to, the rate and direction of cell division; and serving as a molecular marker for identifying slender-grain and high-quality crop varieties.

An eighth aspect of the invention relates to a method of improving a crop. The method comprises: transfecting a crop plant with a recombinant Agrobacterium cell containing GW7 to obtain a transgenic crop plant; or appropriately adjusting the expression level of the GW7 gene in the crop plant; or appropriately altering the biological activity of the GW7 protein; or crossing a plant containing the grain shape gene of the invention with another plant. The plant is preferably one with an altered grain aspect ratio, increased grain length and improved quality; preferably, the crop plant is rice.

In a more preferred embodiment of the present invention, on the basis of more detailed experimental verification, the present inventors found that rice varieties with improved quality and yield can be bred using the GW7 gene and its allelic variants in three ways:

(1) improving rice grain shape and quality by combining GW7 with other grain shape genes through crossing;

(2) increasing the expression level of the GW7 gene in rice; or

(3) increasing the content of the protein encoded by the GW7 gene in the crop.

The other grain shape genes mentioned in (1) include, but are not limited to, the gs3 gene. Thus, the present invention provides a method for breeding a rice variety with improved rice quality and yield, the method comprising: transfecting a rice plant with a recombinant host cell (e.g., an Agrobacterium cell) containing the GW7 gene to obtain a transgenic rice plant containing the GW7 gene.

The method of breeding rice varieties with improved rice quality and yield may also be carried out by crossing a rice plant containing the GW7 gene and allelic variants thereof with another rice plant to obtain a hybrid rice plant containing the GW7 gene. The other rice plant may comprise the gs3 gene.

In a preferred embodiment, the present invention provides a marker-assisted selection pyramiding method for breeding rice varieties with improved rice quality and yield, said method comprising: crossing a rice parent containing the GW7 gene and allelic variants thereof with a rice parent containing the gs3 gene, and selecting from the progeny a line or variety in which GW7 and gs3 are pyramided.
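As a rough illustration of why marker-assisted selection is useful in the two-gene pyramiding described above, the following sketch enumerates F2 genotypes from a cross between a GW7 parent and a gs3 parent. It assumes, and this is an assumption not stated in the text, that the two loci segregate independently and that the desired line is homozygous at both loci. The allele labels `G`/`g` (GW7 locus) and `S`/`s` (GS3 locus) are hypothetical placeholders.

```python
from fractions import Fraction
from itertools import product


def f2_genotype_frequencies():
    """Enumerate F2 genotypes from selfing an F1 that is heterozygous at two loci."""
    # Assumption: independent assortment, so the F1 makes four equally likely gametes.
    gametes = [a + b for a, b in product("Gg", "Ss")]  # GS, Gs, gS, gs
    freqs = {}
    for egg, pollen in product(gametes, repeat=2):
        locus1 = "".join(sorted(egg[0] + pollen[0]))  # e.g. "Gg"
        locus2 = "".join(sorted(egg[1] + pollen[1]))  # e.g. "Ss"
        key = (locus1, locus2)
        freqs[key] = freqs.get(key, Fraction(0)) + Fraction(1, 16)
    return freqs


freqs = f2_genotype_frequencies()
# Hypothetical target: homozygous for the desired GW7 allele ("GG") and for gs3 ("ss").
target = freqs[("GG", "ss")]
print("Expected fraction of F2 plants fixed at both loci:", target)
```

Under these assumptions only one F2 plant in sixteen carries both target alleles in the homozygous state, which is why genotyping with markers for each locus helps identify the pyramided lines among the progeny.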

In summary, the present invention provides the following embodiments:

1. An isolated, cloned gene GW7 for controlling rice grain width, grain length and rice appearance quality, which is an isolated polynucleotide comprising a nucleotide sequence selected from the group consisting of:

(1) SEQ ID NOs: 1-4;

(5) a nucleotide sequence encoding one of the following amino acid sequences: the amino acid sequence set forth in any one of SEQ ID NOs: 7 and 8; an amino acid sequence that differs from the amino acid sequence set forth in any one of SEQ ID NOs: 7 and 8; or an amino acid sequence having preferably at least 80%, more preferably at least 90%, especially at least 95% or 98% identity to the amino acid sequence set forth in any one of SEQ ID NOs: 7 and 8;

(6) an active fragment of the nucleotide sequence of any one of (1) to (5); or

2. An isolated protein encoded by the gene GW7 of item 1 for controlling rice grain width, grain length and rice appearance quality, comprising an amino acid sequence selected from the group consisting of:

(1) the amino acid sequence set forth in any one of SEQ ID NOs: 7 and 8,

(2) an amino acid sequence that differs from the sequence set forth in any one of SEQ ID NOs: 9-11, or a pharmaceutically acceptable salt thereof,

(3) an amino acid sequence having preferably at least 80%, more preferably at least 90%, especially at least 95% or 98% identity to the amino acid sequence set forth in any one of SEQ ID NOs: 7 and 8,

(4) an active fragment of the amino acid sequence of (1) or (2) or (3),

(5) an amino acid sequence encoded by the polynucleotide molecule of item 1.

3. The protein of item 2, wherein said protein has a TRM conserved domain.

4. A recombinant construct comprising the polynucleotide sequence of gene GW7 of item 1, wherein the vector used for said construct is a cloning vector or an expression vector for expressing said polynucleotide.

5. A recombinant host cell comprising the polynucleotide sequence of the gene GW7 of item 1 or the recombinant construct of item 4, or having integrated into its genome the polynucleotide sequence of the gene GW7 of item 1, wherein said cell is selected from the group consisting of a plant cell and a microbial cell, wherein said microbial cell is preferably an E.coli or Agrobacterium cell.

6. The recombinant host cell of item 5, wherein the cell is a rice cell.

7. A method of growing a crop with increased yield, the method comprising: obtaining a transgenic crop plant by transfecting a crop plant with a recombinant Agrobacterium cell comprising the gene GW7 of item 1, or crossing a plant comprising the gene GW7 of item 1 for controlling rice grain shape and rice quality, preferably a plant with altered grain shape, increased aspect ratio and improved quality, with another crop plant, the crop plant preferably being rice.

8. A promoter sequence of gene GW7 controlling rice grain length and grain width, comprising a nucleotide sequence selected from the group consisting of:

(1) SEQ ID NOs: 5 and 6;

(4) an active fragment of the nucleotide sequence of any one of (1) to (3); or

9. A construct comprising the promoter sequence of item 8.

10. A marker-assisted selection pyramiding method for breeding rice varieties with improved rice quality and yield, the method comprising: crossing a rice parent comprising the gene GW7 of item 1 and allelic variants thereof with a rice parent comprising the gs3 gene, and selecting from later generations a line or variety in which GW7 and gs3 are pyramided.

The following are definitions of some terms used in the present invention. Unless otherwise indicated, terms used herein have meanings known to those of ordinary skill in the art.

"associated"/"operably linked" refers to two nucleic acid sequences that are physically or functionally related. For example, a promoter or regulatory DNA sequence is said to be "associated with" a DNA sequence encoding an RNA or protein if the promoter or regulatory DNA sequence and the DNA sequence encoding the RNA or protein are operably linked or positioned such that the regulatory DNA sequence will affect the level of expression of the coding or structural DNA sequence.

A "chimeric gene" is a recombinant nucleic acid sequence in which a promoter or regulatory nucleic acid sequence is operably linked to, or associated with, a nucleic acid sequence that encodes mRNA or is expressed as a protein, such that the regulatory nucleic acid sequence is capable of regulating the transcription or expression of the associated nucleic acid sequence. The regulatory nucleic acid sequences of the chimeric gene are not normally operably linked to the relevant nucleic acid sequences as found in nature.

A "coding sequence" is a nucleic acid sequence that is transcribed into RNA, e.g., mRNA, rRNA, tRNA, snRNA, sense RNA or antisense RNA. Preferably, the RNA is subsequently translated in the organism to produce a protein.

In the context of the present invention, "corresponding to" means that when the nucleic acid coding sequences or amino acid sequences of different GW7 genes or proteins are compared to each other, the nucleic acids or amino acids "corresponding to" some of the enumerated positions are those aligned with these positions, but not necessarily those in these exact numerical positions relative to the respective nucleic acid coding sequence or amino acid sequence of the particular GW7. Likewise, when a coding or amino acid sequence of a particular GW7 is aligned with a coding or amino acid sequence of a reference GW7, the nucleic acids or amino acids in that particular GW7 sequence that "correspond to" some numbered positions of the reference GW7 sequence are those aligned with those positions of the reference GW7 sequence, but not necessarily those in the exact numbered positions of the respective nucleic acid coding sequence or amino acid sequence of that particular GW7 protein.

As used herein, an "expression cassette" is intended to mean a nucleic acid sequence capable of directing the expression of a particular nucleotide sequence in a suitable host cell, comprising a promoter operably linked to a nucleotide sequence of interest operably linked to a termination signal. Typically, it also comprises sequences required for proper translation of the nucleotide sequence. An expression cassette comprising a nucleotide sequence of interest may be chimeric, meaning that at least one of its components is heterologous with respect to at least one of its other components. The expression cassette may also be naturally occurring, but obtained in recombinant form for heterologous expression. However, in general, the expression cassette is heterologous with respect to the host, i.e., the particular nucleic acid sequence of the expression cassette does not occur naturally in the host cell and must be introduced into the host cell or a precursor of the host cell by a transformation event. Expression of the nucleotide sequence in the expression cassette may be controlled by a constitutive promoter or an inducible promoter, wherein transcription is initiated by the inducible promoter only when the host cell is exposed to some specific external stimulus. In the case of multicellular organisms, such as plants, the promoter may also be specific to a particular tissue, or organ or developmental stage.

A "gene" is a defined region within the genome which, in addition to the aforementioned coding nucleic acid sequences, comprises other, mainly regulatory, nucleic acid sequences which are responsible for the expression of the coding part, i.e., transcriptional and translational control. The gene may also contain other 5' and 3' untranslated sequences and termination sequences. Further elements that may be present are, for example, introns.

A "heterologous" nucleic acid sequence is a nucleic acid sequence that is not naturally associated with the host cell into which it is introduced, including non-naturally occurring multiple copies of a naturally occurring nucleic acid sequence.

A "homologous" nucleic acid sequence is a nucleic acid sequence that is naturally associated with the host cell into which it is introduced.

"homologous recombination" is the interchange of nucleic acid fragments between homologous nucleic acid molecules.

A nucleic acid sequence is "isocoding" with a reference nucleic acid sequence when the nucleic acid sequence encodes a polypeptide having the same amino acid sequence as the polypeptide encoded by the reference nucleic acid sequence.
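The definition above rests on the degeneracy of the genetic code: two different coding sequences can specify the same polypeptide. A minimal sketch of such a check follows, using the standard genetic code. The function names and the two short toy sequences are hypothetical illustrations and are not taken from SEQ ID NOs: 1-8.

```python
from itertools import product

# Standard genetic code, with codons ordered by base in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a DNA coding sequence (length a multiple of 3) to one-letter protein."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def encodes_same_protein(seq_a, seq_b):
    """True when both coding sequences encode the same amino acid sequence."""
    return translate(seq_a) == translate(seq_b)


# Toy sequences (hypothetical): different codons, same peptide M-A-L-R-stop.
seq_a = "ATGGCTTTGAGATAA"
seq_b = "ATGGCCCTCCGTTGA"
print(translate(seq_a), translate(seq_b), encodes_same_protein(seq_a, seq_b))
```

The two toy sequences differ at several nucleotide positions yet translate to the same peptide, which is the situation covered by the sequences that differ only through the degeneracy of the genetic code.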

An "isolated" nucleic acid molecule or isolated protein is one that exists artificially isolated from its natural environment and is therefore not a natural product. An isolated nucleic acid molecule or protein may exist in purified form, or may exist in a non-natural environment such as, for example, a recombinant host cell or a transgenic plant.

"native gene" refers to a gene that is present in the genome of an untransformed cell.

The term "naturally occurring" is used to describe a subject that can be found in nature, as opposed to an artificially produced subject. For example, a protein or nucleotide sequence present in an organism (including viruses) that has been isolated from a natural source and that has not been intentionally artificially modified in the laboratory is "naturally-occurring".

A "nucleic acid molecule" or "nucleic acid sequence" is a linear fragment of single or double stranded DNA or RNA that can be isolated from any source. In the context of the present invention, preferably, the nucleic acid molecule is a DNA fragment. A "nucleic acid molecule" is also referred to as a polynucleotide molecule.

A "plant" is any plant, particularly a seed plant, at any developmental stage.

A "plant cell" is the structural and physiological unit of a plant, comprising protoplasts and a cell wall. Plant cells may be in the form of isolated individual cells or cultured cells, or as a higher organized unit such as, for example, a plant tissue, a plant organ, or a portion of a whole plant.

By "plant cell culture" is meant a culture of plant units of various developmental stages such as, for example, protoplasts, cell culture cells, cells in plant tissue, pollen tubes, ovules, embryo sacs, zygotes and embryos.

"plant material" refers to leaves, stems, roots, flowers or parts of flowers, fruits, pollen, egg cells, zygotes, seeds, cuttings, cell or tissue cultures, or any other part or product of a plant.

A "plant organ" is a distinct and well-structured and differentiated part of a plant, such as a root, stem, leaf, flower bud or embryo.

As used herein, "plant tissue" means a group of plant cells organized into structural and functional units. It includes any tissue of a plant, whether in the plant or in culture. The term includes, but is not limited to, whole plants, plant organs, plant seeds, tissue cultures, and any group of plant cells organized into structural and/or functional units. The use of this term in combination or alone with any particular type of plant tissue enumerated above or encompassed by this definition is not meant to exclude any other type of plant tissue.

A "promoter" is an untranslated DNA sequence upstream of a coding region that contains a binding site for RNA polymerase II and initiates transcription of the DNA. The promoter region may also contain other elements that act as regulators of gene expression.

A "protoplast" is an isolated plant cell that has no cell wall or only a partial cell wall.

"regulatory element" refers to a sequence involved in controlling the expression of a nucleotide sequence. The regulatory elements comprise a promoter operably linked to the nucleotide sequence of interest and a termination signal. Usually they also comprise sequences required for correct translation of the nucleotide sequence.

A "shuffled" nucleic acid is a nucleic acid produced by a shuffling process, such as any of the shuffling processes described herein. Shuffled nucleic acids are produced by recombining (physically or actually) two or more nucleic acids (or character strings) in an artificial and optionally cyclic manner. Typically, one or more screening steps are employed in a shuffling process to identify nucleic acids of interest; this screening step may be performed before or after any recombination step. In some (but not all) shuffling embodiments, it is desirable to perform multiple rounds of recombination prior to screening to increase the diversity of the libraries to be screened. Alternatively, the entire process of recombination and screening may be repeated cyclically. Depending on the context, shuffling may refer to the entire process of recombination and screening, or alternatively, may refer to only the recombined part of the entire process.

The phrase "substantially identical" in an alignment of two nucleic acid or protein sequences refers to two or more sequences or subsequences that have at least 60%, preferably 80%, more preferably 90%, even more preferably 95% and most preferably at least 99% nucleotide or amino acid residue identity, when compared and aligned for maximum correspondence, as determined using one of the following sequence comparison algorithms or visual inspection. Preferably, substantial identity exists over a region of the sequence that is at least about 50 residues in length, more preferably over a region of at least about 100 residues, and most preferably, the sequences are substantially identical over at least about 150 residues. In a particularly preferred embodiment, the sequence is substantially the same throughout the length of the coding region. Moreover, substantially identical nucleic acid or protein sequences have substantially identical functions.

For sequence comparison, typically, one sequence is compared to the test sequence as a reference sequence. When using a sequence comparison algorithm, the test and reference sequences are input into a computer, the coordinates of the subsequences are specified, if necessary, and the parameters of the sequence algorithm program are specified. The sequence comparison algorithm will then calculate the percent sequence identity of the test sequence relative to the reference sequence based on the selected program parameters.
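To make the notion of percent identity concrete, the sketch below computes it for two sequences that have already been aligned (gaps written as `-`) and reports the highest identity threshold met. The denominator convention (all aligned columns, gaps included) is an assumption, since alignment programs differ on this point; the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_ref, aligned_test):
    """Percent identity over the columns of a pairwise alignment ('-' marks a gap)."""
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [
        (r, t)
        for r, t in zip(aligned_ref.upper(), aligned_test.upper())
        if not (r == "-" and t == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns")
    matches = sum(1 for r, t in columns if r == t)
    return 100.0 * matches / len(columns)


def highest_threshold_met(pid, thresholds=(70, 80, 90, 95, 98, 99)):
    """Return the largest identity threshold (in percent) that pid reaches, or None."""
    met = [t for t in thresholds if pid >= t]
    return max(met) if met else None


# Toy alignment (hypothetical): one mismatch and one gap over 12 columns.
ref = "ATGGCTTTGAGA"
test_seq = "ATGGCC-TGAGA"
pid = percent_identity(ref, test_seq)
print(round(pid, 1), highest_threshold_met(pid))
```

In practice the alignment itself would come from one of the algorithms cited in the text; this sketch only covers the final counting step.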

Optimal alignment of sequences for comparison can be conducted, for example, by the algorithm of Smith & Waterman, Adv. Appl. Math. 2: 482 (1981), by the algorithm of Needleman & Wunsch, J. Mol. Biol. 48: 443 (1970), by the method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85: 2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by visual inspection (see generally Ausubel et al., infra).

An example of an algorithm suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al., J. Mol. Biol. 215: 403-410 (1990). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). The algorithm first identifies high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence that either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., 1990). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. For nucleotide sequences, cumulative scores are calculated using the parameters M (reward score for a pair of matching residues; always greater than zero) and N (penalty score for mismatching residues; always less than zero). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when the cumulative alignment score falls off by the quantity X from its maximum achieved value, when the cumulative score goes to zero or below due to the accumulation of one or more negative-scoring residue alignments, or when the end of either sequence is reached. The BLAST algorithm parameters W, T and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a word length (W) of 11, an expectation (E) of 10, a cutoff of 100, M = 5, N = -4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a word length (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89: 10915 (1989)).
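The seed-and-extend procedure described above can be sketched for nucleotide sequences as follows. This is a simplified illustration, not the BLAST implementation: seeds are exact word matches only (the neighborhood threshold T is not modelled), extension is ungapped, and the toy word length `w=4` and drop-off `x=10` are assumptions chosen to suit the short example; only M = 5 and N = -4 follow the defaults quoted in the text. All function names are hypothetical.

```python
def find_seeds(query, subject, w):
    """Return (query_pos, subject_pos) pairs where a word of length w matches exactly."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    return [
        (i, j)
        for i in range(len(query) - w + 1)
        for j in index.get(query[i:i + w], [])
    ]


def _extend(query, subject, qi, si, step, m, n, x, base):
    """Extend in one direction; return (best score gain, number of columns kept)."""
    running = best = best_len = length = 0
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        running += m if query[qi] == subject[si] else n
        length += 1
        if running > best:
            best, best_len = running, length
        # Halt on a drop of x from the best score, or when the total reaches zero.
        if best - running >= x or base + running <= 0:
            break
        qi += step
        si += step
    return best, best_len


def extend_hit(query, subject, qi, si, w, m=5, n=-4, x=10):
    """Ungapped extension of one seed; returns (score, q_start, q_end, s_start)."""
    seed_score = w * m
    right_gain, right_len = _extend(
        query, subject, qi + w, si + w, 1, m, n, x, seed_score)
    left_gain, left_len = _extend(
        query, subject, qi - 1, si - 1, -1, m, n, x, seed_score + right_gain)
    score = seed_score + right_gain + left_gain
    return score, qi - left_len, qi + w + right_len, si - left_len


# Toy sequences (hypothetical) sharing the 8-base segment ACGTACGT.
query = "AAACGTACGTCC"
subject = "TTACGTACGTGG"
seeds = find_seeds(query, subject, 4)
score, q_start, q_end, s_start = extend_hit(query, subject, 2, 2, 4)
print(score, query[q_start:q_end])  # 40 ACGTACGT
```

Raising `w` reduces the number of seeds and speeds up the search at the cost of sensitivity, while a larger `x` lets an extension run through more mismatches before it stops, which is the trade-off the text attributes to W, T and X.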

In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90: 5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a test nucleic acid sequence is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid sequence to the reference nucleic acid sequence is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.

Another indication that two nucleic acid sequences are substantially identical is that the two molecules hybridize to each other under stringent conditions. The phrase "specifically hybridizes" refers to the binding, duplexing or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex mixture of (e.g., total cellular) DNA or RNA. "Substantial binding" refers to complementary hybridization between a probe nucleic acid and a target nucleic acid, and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization medium to achieve the desired detection of the target nucleic acid sequence.

"Stringent hybridization conditions" and "stringent hybridization rinse conditions" in the context of nucleic acid hybridization experiments such as Southern and Northern hybridizations are sequence dependent and differ under different environmental parameters. Longer sequences hybridize specifically at higher temperatures. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid Probes, Part I, Chapter 2, "Overview of principles of hybridization and the strategy of nucleic acid probe assays", Elsevier, New York. Generally, high stringency hybridization and rinse conditions are selected to be about 5 °C lower than the thermal melting point (T_m) for the specific sequence at a defined ionic strength and pH. Typically, under "stringent conditions" a probe will hybridize to its target subsequence, but not to other sequences.

The T_m is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Very stringent conditions are selected to be equal to the T_m for a particular probe. An example of stringent hybridization conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on a filter in a Southern or Northern blot is 50% formamide with 1 mg of heparin at 42 °C, with the hybridization being carried out overnight. An example of high stringency rinse conditions is 0.15 M NaCl at 72 °C for about 15 minutes. An example of stringent rinse conditions is a 0.2x SSC rinse at 65 °C for 15 minutes (see Sambrook, infra, for a description of SSC buffer). Often, a high stringency rinse is preceded by a low stringency rinse to remove background probe signal. An example of a medium stringency rinse for a duplex of, e.g., more than 100 nucleotides is 1x SSC at 45 °C for 15 minutes. An example of a low stringency rinse for a duplex of, e.g., more than 100 nucleotides is 4-6x SSC at 40 °C for 15 minutes. For short probes (e.g., about 10 to 50 nucleotides), stringent conditions typically involve salt concentrations of less than about 1.0 M Na ion, typically about 0.01 to 1.0 M Na ion at pH 7.0 to 8.3. Generally, a signal-to-noise ratio of 2x (or higher) relative to that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization.

The following are examples of sets of hybridization/rinse conditions that may be used to clone homologous nucleotide sequences that are substantially identical to a reference nucleotide sequence of the present invention. The homologous nucleotide sequence preferably hybridizes to the reference nucleotide sequence at 50 ℃ in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA, with rinsing in 2x SSC, 0.1% SDS at 50 ℃; more desirably at 50 ℃ in 7% SDS, 0.5 M NaPO₄, 1 mM EDTA, with rinsing in 1x SSC, 0.1% SDS at 50 ℃; still more desirably at 50 ℃ in 7% SDS, 0.5 M NaPO₄, 1 mM EDTA, with rinsing in 0.5x SSC, 0.1% SDS at 50 ℃; preferably at 50 ℃ in 7% SDS, 0.5 M NaPO₄, 1 mM EDTA, with rinsing in 0.1x SSC, 0.1% SDS at 50 ℃; and more preferably at 50 ℃ in 7% SDS, 0.5 M NaPO₄, 1 mM EDTA, with rinsing in 0.1x SSC, 0.1% SDS at 65 ℃.
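The rinse series above is graded by SSC concentration. As a rough illustration only, the following Python sketch converts an SSC dilution into molar salt concentrations. It assumes the common laboratory recipe in which 20x SSC is 3 M NaCl plus 0.3 M trisodium citrate; that recipe and the helper name `ssc_salt` are assumptions and are not taken from this description.

```python
# Assumption: 20x SSC = 3.0 M NaCl + 0.3 M trisodium citrate (common lab recipe).
NACL_M_PER_X = 3.0 / 20
CITRATE_M_PER_X = 0.3 / 20

# SSC strengths of the 50 degree C rinse series listed in the text.
RINSE_SERIES = [2.0, 1.0, 0.5, 0.1]


def ssc_salt(fold):
    """Return (NaCl molarity, citrate molarity) for an SSC dilution such as 0.1 for 0.1x SSC."""
    return (round(fold * NACL_M_PER_X, 6), round(fold * CITRATE_M_PER_X, 6))


if __name__ == "__main__":
    for fold in RINSE_SERIES:
        nacl, citrate = ssc_salt(fold)
        print(f"{fold}x SSC: {nacl} M NaCl, {citrate} M citrate")
```

Under this assumed recipe, lower SSC strength means lower salt and therefore a more stringent rinse, which matches the order of preference in the series.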

Another indication that two nucleic acid sequences or proteins are substantially identical is that the protein encoded by the first nucleic acid immunologically cross reacts with or specifically binds to the protein encoded by the second nucleic acid. Thus, a protein is typically substantially identical to a second protein, e.g., where the two proteins differ only by conservative substitutions.

"synthetic" refers to a nucleotide sequence that contains structural features not found in the native sequence. For example, artificial sequences that are said to more closely resemble the G + C content and normal codon distribution of dicotyledonous and/or monocotyledonous plant genes are synthetic.

"transformation" is the process of introducing a heterologous nucleic acid into a host cell or organism, and in particular "transformation" means the stable integration of a DNA molecule into the genome of an organism of interest.

"transformed/transgenic/recombinant" refers to a host organism, such as a bacterium or plant, into which a heterologous nucleic acid molecule has been introduced. The nucleic acid molecule may be stably integrated into the host genome or the nucleic acid molecule may also be present as an extrachromosomal molecule. Such extrachromosomal molecules may be autonomously replicating. Transformed cells, tissues, or plants are understood to encompass not only the end product of the transformation process, but also transgenic progeny thereof. A "non-transformed", "non-transgenic", or "non-recombinant" host refers to a wild-type organism, such as a bacterium or plant, that does not contain a heterologous nucleic acid molecule.

The terms "polynucleotide", "polynucleotide molecule", "polynucleotide sequence", "coding sequence", "Open Reading Frame (ORF)" and the like as used herein include single- or double-stranded DNA and RNA molecules, which may comprise one or more prokaryotic sequences, cDNA sequences, genomic DNA sequences comprising exons and introns, chemically synthesized DNA and RNA sequences, as well as sense and corresponding antisense strands. Methods for producing and manipulating the polynucleotide molecules and oligonucleotide molecules disclosed herein are known to those skilled in the art and can be performed according to recombinant techniques already described (see Maniatis et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York; Ausubel et al., 1989, Current Protocols in Molecular Biology, Greene Publishing Associates & Wiley Interscience, NY; Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; Innis et al. (eds.), 1995, PCR Strategies, Academic Press, Inc., San Diego; and Erlich (ed.), 1992, PCR Technology, Oxford University Press, New York).

Drawings

The above features and advantages of the present invention will become more apparent from the following detailed description when taken in conjunction with the accompanying drawings, in which:

FIG. 1 shows grain-shape QTL analysis using the F₂ population constructed from Taifeng A and Zhenshan 97B: (a) grain width QTLs; (b) grain length QTLs.

FIG. 2 shows a phenotypic comparison of the near-isogenic lines NIL-GW7 and NIL-gw7: (a) plant architecture; (b) plant height; (c) heading date; (d) tiller number; (e) grain number per panicle; (f) grain width; (g) grain length; (h) grain length-to-width ratio; (i) grain filling rate; (j) thousand-grain weight; (k) yield per plant.

FIG. 3 shows the results of genetic complementation validation of GW7 function. (a) GW7-RNAi transgenic positive plants show significantly increased grain width and reduced grain length; (b) GW7-overexpressing transgenic positive plants show significantly increased grain length and reduced grain width.

FIG. 4 shows the interaction of GW7 with OsTON1b and OsTON2. (a) GW7 interacts with OsTON2; (b) GW7 interacts with OsTON1b; (c) OsTON2 interacts with OsTON1b, as a positive control; (d) GW7 does not interact with OsSPL16, as a negative control. These results indicate that GW7, a member of the TRM family in rice, may be involved in biological processes such as cell division and regulation of the division direction.

Detailed Description

Through intensive research, the inventors identified a grain-shape gene that changes rice grain shape and improves rice quality without reducing the number of grains per panicle or the thousand-grain weight; the gene is located at the end of the long arm of rice chromosome 7. The promoter sequence of the gene can be bound by the GW8/OsSPL16 protein, and increased expression of the gene increases grain length and reduces grain width, thereby improving the appearance quality of rice. The inventors named this gene, which controls rice grain shape without changing thousand-grain weight or panicle number, GW7.

Plant transformation

In a particularly preferred embodiment, at least one protein of the invention which controls grain width and grain weight is expressed in higher organisms, such as plants. The nucleotide sequence of the gene controlling grain width and grain weight of the present invention may be inserted into an expression cassette, which is then preferably stably integrated in the plant genome. In another preferred embodiment, the nucleotide sequence of the gene controlling grain width and grain weight is comprised in a non-pathogenic self-replicating virus. Plants transformed according to the invention may be monocotyledonous or dicotyledonous plants, including but not limited to maize, wheat, barley, rye, sweet potato, beans, peas, chicory, lettuce, cabbage, cauliflower, broccoli, turnip, radish, spinach, asparagus, onion, garlic, pepper, celery, squash, pumpkin, hemp, zucchini, apple, pear, quince, melon, plum, cherry, peach, nectarine, apricot, strawberry, grape, raspberry, blackberry, pineapple, avocado, papaya, mango, banana, soybean, tomato, sorghum, sugarcane, sugar beet, sunflower, rapeseed, clover, tobacco, carrot, cotton, alfalfa, rice, potato, eggplant, cucumber, arabidopsis and woody plants such as conifers and deciduous trees. Particularly preferred is rice, wheat, barley, corn, oats, or rye.

Once the desired nucleotide sequence has been transformed into a particular plant species, it may be propagated in that species or transferred into other varieties of the same species, including particularly commercial varieties, using conventional breeding techniques.

Preferably, the nucleotide sequences according to the invention are expressed in transgenic plants, which result in the biosynthesis of the corresponding grain width protein in the transgenic plants. In this way, transgenic plants with improved traits can be produced. In order to express the nucleotide sequence of the present invention in transgenic plants, the nucleotide sequence of the present invention may need to be modified and optimized. All organisms have a particular codon usage preference, which is known in the art, and the codons can be changed to conform to plant preferences while maintaining the amino acids encoded by the nucleotide sequences of the present invention. Moreover, high levels of expression in plants can best be achieved from coding sequences having at least about 35%, preferably more than about 45%, more preferably more than 50%, and most preferably more than about 60% GC content. Although preferred gene sequences can be expressed adequately in monocot and dicot species, the sequences can be modified to accommodate the specific codon preferences and GC content preferences of monocots or dicots, as these preferences have been shown to be different (Murray et al, Nucl. acids Res.17: 477-498 (1989)). In addition, the nucleotide sequence can be screened for the presence of non-canonical splice sites that cause truncation of the message. All changes that need to be made in these nucleotide sequences, such as those described above, are carried out using the methods described in published patent applications EP 0385962 (Monsanto), EP 0359472 (Lubrizol) and WO 93/07278(Ciba-Geigy) using site-directed mutagenesis techniques, PCR and synthetic gene construction well known in the art.
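The GC-content thresholds stated above can be checked with a few lines of code. The following is a minimal Python sketch, not part of the described method; the function names `gc_content` and `gc_tier` are hypothetical, and the short sequence in the usage lines is only a demonstration string.

```python
def gc_content(seq):
    """Percentage of G and C among the A/C/G/T/U bases of seq (case-insensitive)."""
    s = seq.upper()
    gc = s.count("G") + s.count("C")
    total = gc + s.count("A") + s.count("T") + s.count("U")
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T/U bases")
    return 100.0 * gc / total


def gc_tier(percent):
    """Map a GC percentage to the preference tiers given in the text."""
    if percent > 60:
        return "most preferred (more than about 60%)"
    if percent > 50:
        return "more preferred (more than 50%)"
    if percent > 45:
        return "preferred (more than about 45%)"
    if percent >= 35:
        return "meets the minimum (at least about 35%)"
    return "below the stated minimum"


if __name__ == "__main__":
    demo = "ATGCCGGCGAAT"  # demonstration string only, not a sequence of the invention
    pct = gc_content(demo)
    print(f"{pct:.1f}% GC: {gc_tier(pct)}")
```

In practice the calculation would be run on the full coding sequence before and after codon optimization, so the two values can be compared against the tiers.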

The invention will be further illustrated with reference to the following specific examples. It is to be understood that the following examples are intended only to further illustrate the present invention and are not intended to limit the spirit and scope of the present invention.

It should be noted that, unless otherwise specified, the reagents, enzymes and the like used in the following examples are those commercially available from reagent companies as analytical grade reagents or enzymes.

Example 1: Identification of the major rice grain-shape QTL qGW7

Minghui 63 and Zhenshan 97 are parent lines widely used in hybrid rice breeding in China: Minghui 63 is currently the restorer line with the largest application area, and Zhenshan 97A is the sterile line with the widest application area. Shanyou 63, bred from these two parents, is the hybrid rice variety with the largest planting area in China; it has excellent yield but poor rice quality. Taifeng A is a sterile line of excellent quality developed in South China in recent years, and its important appearance-quality traits, such as the grain length-to-width ratio and chalkiness degree, reach the first grade of the national standard.

The inventors crossed Zhenshan 97B (bred by the Wenzhou Institute of Agricultural Sciences and provided by the College of Life Sciences, South China Agricultural University, Wushan) as the female parent with Taifeng A (a high-quality indica sterile line obtained by the Rice Research Institute of the Guangdong Academy of Agricultural Sciences by crossing the high-quality rice line 31 with a superior F8 line of (BoB/Zhe 9248), followed by 8 generations of backcross transfer; provided by Wang Feng of the Rice Research Institute, Guangdong Academy of Agricultural Sciences, Tianhe, Guangzhou) to construct an F₂ mapping population. QTL analysis of the F₂ population detected 2 major QTLs controlling rice grain width (FIG. 1, a) and 3 QTLs controlling rice grain length (FIG. 1, b), among which qGW7, at the end of the long arm of chromosome 7, controls both grain length and grain width (FIG. 1, a, b).

The molecular markers used in this study are PCR-based markers, including SSR markers and self-designed InDel markers. The SSR markers are all from the microsatellite linkage maps published by McCouch et al. (2001, 2002). PSM markers were obtained by analyzing clone sequences with the SSR analysis tool (http://www.gramene.org/gramene/searchs/ssrtol) to screen for SSR target sequences with good microsatellite repeats, and primers for the target sequences were then designed with the Primer5 analysis software. From these primers, 95 marker pairs that are evenly distributed across the chromosomes and show good polymorphism between the parents were selected for genetic background screening (see Table 2 for the primers used in grain-shape QTL analysis and their sequences).

The PCR procedure followed the method of Panaud et al. (1996) with minor modifications. Each 20 μl amplification reaction contained 0.15 μM SSR primers, 200 μM dNTPs, 1x PCR reaction buffer (50 mM KCl, 10 mM Tris-HCl pH 8.3, 1.5 mM MgCl₂, 0.01% gelatin), 50-100 ng template DNA and 1 U Taq polymerase. The reaction program was: denaturation at 94 ℃ for 5 minutes; 35 cycles of 94 ℃ for 1 minute, 55 ℃ for 1 minute and 72 ℃ for 1 minute; and a final extension at 72 ℃ for 5 minutes. The amplified PCR products were electrophoresed on a 6% denaturing polyacrylamide gel at 300 V at room temperature for about 3 hours. After electrophoresis, banding patterns were recorded by silver staining or gel imaging.
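As a worked illustration of the reaction set-up and cycling program described above, the following Python sketch encodes the stated values and derives two quantities: the total programmed hold time and the absolute amount of primer and dNTP per tube. The class and function names are hypothetical, and ramp times between temperatures are ignored, so the result is a lower bound on the run time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int      # block temperature in degrees Celsius
    minutes: float   # hold time in minutes


# Values taken from the program described in the text.
INITIAL_DENATURATION = Step(94, 5)
CYCLE = (Step(94, 1), Step(55, 1), Step(72, 1))
N_CYCLES = 35
FINAL_EXTENSION = Step(72, 5)
REACTION_VOLUME_UL = 20


def total_hold_minutes():
    """Sum of all programmed hold times (ramping between steps is not included)."""
    per_cycle = sum(step.minutes for step in CYCLE)
    return INITIAL_DENATURATION.minutes + N_CYCLES * per_cycle + FINAL_EXTENSION.minutes


def pmol_in_reaction(conc_um, volume_ul=REACTION_VOLUME_UL):
    """Amount in pmol: micromolar concentration times volume in microlitres."""
    return conc_um * volume_ul


if __name__ == "__main__":
    print("hold time (min):", total_hold_minutes())
    print("primer per tube (pmol):", pmol_in_reaction(0.15))
    print("dNTP per tube (pmol):", pmol_in_reaction(200))
```

The program therefore holds for 5 + 35 × 3 + 5 = 115 minutes, and each 20 μl tube contains about 3 pmol of primer and 4 nmol of dNTPs.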

Table 2: Primers for grain-shape QTL analysis and their sequences

Example 2: Cloning of the rice grain-shape gene GW7

The inventors crossed Taifeng A (a high-quality indica sterile line obtained by the Guangdong Academy of Agricultural Sciences by crossing the high-quality rice line 31 with a superior F8 line of (BoB/Zhe 9248), followed by 8 generations of backcross transfer) with Huajingxian 74 (a high-yield rice variety bred by the College of Agriculture, South China Agricultural University, approved in Guangdong Province in 2000, and provided by Zhangyi of the College of Agriculture, South China Agricultural University, Wushan, Tianhe, Guangzhou), and constructed a BC₃F₂ population through successive generations of backcrossing. On this basis, the GW7 gene was cloned by map-based cloning. The sequences of the polymorphic marker primers used for map-based cloning are shown in Table 3, and the polymorphic markers were detected as described in Example 1.

Table 3: primers and sequences for map-based cloning

Example 3: Construction of the near-isogenic lines NIL-GW7 and NIL-gw7

From the BC₃F₂ population described in Example 2, the inventors selected individual plants carrying qGW7 with a long, slender grain type and backcrossed them to Huajingxian 74 for further generations up to BC₆F₂. Individual plants were selected by the elongated, narrowed grain phenotype; the target segment was tracked with molecular markers and the genome outside the target segment was background-scanned, finally yielding the near-isogenic line NIL-GW7, whose genetic background is very close to that of Huajingxian 74 (NIL-gw7), for subsequent research. Molecular markers were detected as described in Example 1.

Example 4: GW7 can improve rice quality while ensuring rice yield

The inventors sowed NIL-GW7 and NIL-gw7 (for the construction of these near-isogenic lines, see Example 3) in an experimental field of the test base in Lingshui County, Hainan Province, and recorded the heading date and grain filling rate during rice growth. After the plants matured, the number of tillers per plant, plant height, number of grains per panicle, thousand-grain weight (1,000-grain weight) and other traits were recorded; see FIG. 2 for details.

The specific statistical methods were as follows. Heading date: starting from the heading of the first plants in the population, individual plants were observed daily, and the heading date was recorded when 50% of the plants had headed. Grain filling rate: when the rice flowered, spikelets in the middle and upper part of the main panicle were marked; samples were then taken at regular intervals until the grains were fully ripe, dried, and weighed for dry weight. At each sampling, 30 intact developing grains were stripped, in 3 replicates of 10 grains each. Number of grains per panicle: after the rice matured, 30 panicles from main tillers were collected in the field for each line, and the number of grains on each panicle was counted and recorded.

Measurement of grain length, grain width and grain weight: after the grains were naturally dried, shrivelled grains were removed by water flotation; the remaining grains were dried at a constant temperature of 37 ℃ and then stored at room temperature for 3 months or more, to ensure that the grains were fully dry and that moisture content was relatively uniform across all lines. Normally shaped seeds were randomly selected from each line, and grain width and grain length were measured with a vernier caliper as the indexes of grain shape. Thousand-grain weight was determined from 1000 randomly selected full grains, whose total weight was measured on an electronic balance; the mean of 20 measurements was taken as the thousand-grain weight. The permitted deviation of the test results is: less than 0.4 g for a thousand-grain weight below 20 g; less than 0.7 g for a thousand-grain weight of 20.1-50 g; and less than 1.0 g for a thousand-grain weight above 50.1 g.
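The thousand-grain-weight tolerance rule above can be expressed as a small check. The following Python sketch is illustrative only. It assumes that the permitted deviation applies to the spread (maximum minus minimum) of replicate weighings, and it assigns the unstated boundary values (20 to 20.1 g and 50 to 50.1 g) to the lower band; both choices, the function names and the sample numbers are assumptions rather than part of the described method.

```python
from statistics import mean


def allowed_difference_g(tgw_g):
    """Permitted deviation in grams for a given thousand-grain weight (bands from the text)."""
    if tgw_g <= 20:      # assumption: boundary values fall in the lower band
        return 0.4
    if tgw_g <= 50:
        return 0.7
    return 1.0


def thousand_grain_weight(replicate_weights_g):
    """Return (mean weight, within_tolerance) for replicate 1000-grain weighings."""
    tgw = mean(replicate_weights_g)
    spread = max(replicate_weights_g) - min(replicate_weights_g)
    return tgw, spread < allowed_difference_g(tgw)


if __name__ == "__main__":
    # Placeholder weighings for illustration; not measured data.
    weights = [25.0, 25.2, 24.8]
    tgw, ok = thousand_grain_weight(weights)
    print(f"thousand-grain weight = {tgw:.2f} g, within tolerance: {ok}")
```

A set of weighings that fails the check would be repeated before the mean is reported.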

Rice quality was analyzed and evaluated according to the standard of the Ministry of Agriculture of the People's Republic of China, "Method for Determination of Rice Quality" (NY 147-88).

Differences were tested for significance as follows: (1) a null hypothesis is established, namely that the two materials do not differ; (2) the probability P of obtaining the observed difference if the hypothesis holds is determined by statistical calculation; (3) whether the hypothesis holds is judged from the P value. The significance threshold used in these experiments was P < 0.05. The statistics show no significant difference between NIL-GW7 and NIL-gw7 in plant height, tiller number, heading date, thousand-grain weight and so on (FIG. 2, a, b, c, d, e, j and k); however, the grain shapes of the two materials differ greatly, and the grain filling rate of NIL-GW7 is slightly slower than that of NIL-gw7 (FIG. 2, i). The inventors also found that the thousand-grain weight of NIL-gw7 differs significantly from that of NIL-gw8^Basmati385 (Wang et al., 2012), and that the yield per plant of NIL-gw7 is about 14.9% higher than that of NIL-gw8^Basmati385 (FIG. 2, j and k). Rice quality analysis of NIL-GW7 and NIL-gw7 showed that the rice quality of NIL-GW7 was significantly better than that of NIL-gw7 (Table 4). These results show that NIL-gw7 improves rice quality while maintaining yield, and has broad application prospects in high-yield, high-quality rice breeding.
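The three-step significance procedure above can be illustrated in code. The description does not name the statistical test that was used, so the following Python sketch substitutes an exact two-sided permutation test on the difference of group means purely as an example of the null-hypothesis logic. The function name `exact_permutation_p` and the sample values are hypothetical placeholders, not experimental data.

```python
from itertools import combinations

ALPHA = 0.05  # significance threshold stated in the text


def exact_permutation_p(a, b):
    """Exact two-sided permutation P value for the difference between two group means."""
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(a))
    splits = extreme = 0
    # Enumerate every way of relabelling the pooled values into groups of the same sizes.
    for idx in combinations(range(len(pooled)), n_a):
        splits += 1
        if abs_mean_diff(sum(pooled[i] for i in idx)) >= observed - 1e-12:
            extreme += 1
    return extreme / splits


if __name__ == "__main__":
    # Placeholder values for illustration only.
    line_a = [1, 2, 3, 4]
    line_b = [5, 6, 7, 8]
    p = exact_permutation_p(line_a, line_b)
    print(f"P = {p:.4f}; reject null hypothesis: {p < ALPHA}")
```

Full enumeration is only practical for small groups; for larger samples a conventional parametric test or a random subsample of relabellings would normally be used instead.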

Table 4: Rice quality comparison of NIL-GW7 and NIL-gw7

Example 5: Application of GW7 and gs3 pyramiding breeding to improve rice quality and yield

The inventors crossed the near-isogenic lines NIL-GW7 and NIL-gs3 in the Huajingxian 74 background (NIL-gs3 was constructed as described in Example 3, by 7 successive generations of backcrossing) and selected the two-gene pyramided line NIL-GW7-gs3 from the progeny. In January 2014, NIL-GW7, NIL-gs3 and NIL-GW7-gs3 were each planted as single plants in the experimental field of the test base in Lingshui County, Hainan Province. After the plants matured, yield traits such as the number of grains per panicle and thousand-grain weight were examined, and traits such as rice grain length and width were analyzed, using the same measurement and analysis methods as in Example 4. The results show that the NIL-GW7-gs3 pyramided material is greatly improved in both yield and rice appearance quality. This shows that, in breeding practice, rice quality and yield can be improved together by pyramiding GW7 with gs3 (Table 5).

Table 5: Yield and rice quality analysis of GW7 and gs3 pyramided lines

Example 6: molecular marker design and application for controlling excellent allelic variation of slender granular GW7 gene

Based on the nucleotide sequence differences in the GW7 gene between Taifeng A and Zhenshan 97B, the inventors designed 2 pairs of InDel primers (SEQ ID NOs: 9, 10 and SEQ ID NOs: 11, 12) in the GW7 promoter region. The size of the Taq PCR amplification products indicates whether a tested rice variety (line) carries the superior allelic variation of the GW7 gene from Taifeng A, so the markers can be used for molecular marker-assisted selection breeding. For example, genomic DNA of NIL-GW7, which has a large grain length-to-width ratio, and of Huajingxian 74, which has a small grain length-to-width ratio, is amplified by Taq PCR, and the PCR products are separated by 4% agarose electrophoresis to detect the size differences caused by the base insertions/deletions, thereby determining the type of allelic variation carried by the rice variety. In this example, for Huajingxian 74 the amplification product of primer pair 1 (SEQ ID NOs: 9, 10) is large and that of primer pair 2 (SEQ ID NOs: 11, 12) is small; for NIL-GW7 the pattern is reversed. This can serve as a specific molecular marker for identifying the superior allelic variation of the slender-grain gene GW7: when the amplification product of primer pair 1 is small and that of primer pair 2 is large, the material has narrow grains and a large grain length-to-width ratio. During variety breeding, the marker can be used to rapidly screen progeny carrying the narrow-grain GW7 allele from populations derived from crosses between wide-grain and narrow-grain GW7 varieties.
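The marker rule in this example amounts to a two-input lookup. The following Python sketch encodes it, assuming each band has already been scored as "small" or "large" relative to the parental controls on the gel. The function name and return strings are hypothetical, and plants showing any other pattern (for example heterozygotes with both bands) are simply reported as inconclusive.

```python
def call_gw7_allele(primer1_band, primer2_band):
    """Classify a sample from the relative sizes of the two InDel amplification products.

    primer1_band: 'small' or 'large' for primer pair 1 (SEQ ID NOs: 9, 10)
    primer2_band: 'small' or 'large' for primer pair 2 (SEQ ID NOs: 11, 12)
    """
    pattern = (primer1_band, primer2_band)
    if pattern == ("small", "large"):
        # Pattern reported for NIL-GW7: narrow grain, large length-to-width ratio.
        return "superior GW7 allele (slender grain)"
    if pattern == ("large", "small"):
        # Pattern reported for the wide-grain recurrent parent.
        return "wide-grain allele"
    return "inconclusive"


if __name__ == "__main__":
    for bands in [("small", "large"), ("large", "small"), ("large", "large")]:
        print(bands, "->", call_gw7_allele(*bands))
```

Requiring both primer pairs to agree reduces the chance of a miscall from a single failed or ambiguous amplification.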

Example 7: GW7 rice transgenic experiment and transgenic material grain type analysis

In this example, the full-length cDNA of the GW7 gene was obtained by reverse transcription and amplification from young panicles of the near-isogenic line NIL-GW7 (constructed in Example 3).

(1) Construction and transformation of GW7 antisense overexpression transgenic vector

Designing a primer according to the cDNA sequence of the GW7 gene:

GW7pucc-BglIIF：5’-AgATCTAAACTgTTACCAAgAgCTCC-3’

GW7pucc-XhoIR：5’-CTCgAggTTCCACTgTCCACCTTgCATC-3’

GW7pucc-XbaIF：5’-TCTAgAAAACTgTTACCAAgAgCTCC-3’

GW7pucc-SalIR：5’-gTCgACgTTCCACTgTCCACCTTgCATC-3’

Using the full-length cDNA of the GW7 gene as a template, a 406 bp sequence of the GW7 gene was amplified with primers GW7pucc-BglIIF and GW7pucc-XhoIR and, separately, with primers GW7pucc-XbaIF and GW7pucc-SalIR. The amplification products were double-digested with BglII and XhoI or with XbaI and SalI, respectively, and recovered. First, the fragment double-digested with BglII and XhoI was ligated into plasmid pUCCRNAi (an intermediate vector for constructing RNAi vectors; for the detailed construction method see Huang et al., Nature Genetics, 2009, 41(4): 494-497) that had been double-digested with BglII and XhoI. After sequencing confirmed the insert, the ligated plasmid was double-digested with XbaI and SalI, and the plasmid portion was recovered and ligated with the fragment recovered after XbaI and SalI double digestion, so that two copies of the same GW7 cDNA fragment were inserted into pUCCRNAi in opposite orientations, giving pUCCRNAi::GW7-RNAi. This plasmid was digested with PstI alone and ligated into PstI-digested plasmid pCAMBIA-2300-Actin to construct the RNAi vector pAct::GW7-RNAi. The vector was introduced into NIL-GW7 by Agrobacterium-mediated transformation. The grain shape of the transgenic positive plants was significantly altered, with grains wider and shorter than those of NIL-GW7 (FIG. 3, a).
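The cloning strategy above relies on each primer carrying the recognition site of the enzyme named in its label, and on both primer pairs amplifying the same fragment. The following Python sketch checks both points for the four primers listed in this example. The recognition sequences are the standard ones for these enzymes; the helper names are hypothetical.

```python
# Standard recognition sequences of the enzymes named in the primer labels.
SITES = {"BglII": "AGATCT", "XhoI": "CTCGAG", "XbaI": "TCTAGA", "SalI": "GTCGAC"}

# Primer sequences as printed in this example (lowercase g kept as in the text).
PRIMERS = {
    "GW7pucc-BglIIF": ("BglII", "AgATCTAAACTgTTACCAAgAgCTCC"),
    "GW7pucc-XhoIR": ("XhoI", "CTCgAggTTCCACTgTCCACCTTgCATC"),
    "GW7pucc-XbaIF": ("XbaI", "TCTAgAAAACTgTTACCAAgAgCTCC"),
    "GW7pucc-SalIR": ("SalI", "gTCgACgTTCCACTgTCCACCTTgCATC"),
}


def has_5prime_site(name):
    """True if the primer starts with the recognition site of its labelled enzyme."""
    enzyme, seq = PRIMERS[name]
    return seq.upper().startswith(SITES[enzyme])


def gene_specific_part(name):
    """Primer sequence with the 5' restriction site removed."""
    enzyme, seq = PRIMERS[name]
    return seq.upper()[len(SITES[enzyme]):]


if __name__ == "__main__":
    for name in PRIMERS:
        print(name, has_5prime_site(name), gene_specific_part(name))
```

The two forward primers share one gene-specific portion and the two reverse primers share another, which is consistent with the same GW7 fragment being inserted twice with different flanking sites.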

(2) Construction and transformation of GW7 overexpression transgenic vector

Designing a primer according to the cDNA sequence of the GW7 gene:

GW7-F-SalI：5’-gCgTCgACATgCCTCCggCgAgggTgCTC-3’

GW7-R-KpnI：5’-cggggTACCTCAgCTTgTACTACTAAATgACAgC-3’

Using the full-length cDNA of the GW7 gene as a template, the full-length GW7 sequence was amplified. The PCR product was double-digested with SalI and KpnI, and the target fragment was recovered and ligated into the p1301Ubinos vector that had been double-digested with SalI and KpnI and recovered, to construct the overexpression vector pUbi::GW7. The vector was introduced into NIL-gw7 (i.e., Huajingxian 74) by Agrobacterium-mediated transformation. In transgenic positive plants, the expression level of GW7 was greatly up-regulated and the grain shape was significantly changed, with grains more slender than those of Huajingxian 74 (FIG. 3, b).
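For a full-length overexpression construct, the forward primer would be expected to place the start codon directly after the SalI site, and the reverse primer to place a stop codon directly before the KpnI site on the coding strand. The following Python sketch checks this for the two primers listed above. It is a sanity check under that assumption; the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer sequences as printed in this example.
FWD = "gCgTCgACATgCCTCCggCgAgggTgCTC"        # GW7-F-SalI
REV = "cggggTACCTCAgCTTgTACTACTAAATgACAgC"   # GW7-R-KpnI
SALI, KPNI = "GTCGAC", "GGTACC"              # standard recognition sequences


def revcomp(seq):
    """Reverse complement of a DNA sequence (case-insensitive input, uppercase output)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def triplet_after_site(primer, site):
    """The three bases immediately 3' of the first occurrence of site, or None if absent."""
    s = primer.upper()
    i = s.find(site)
    if i < 0:
        return None
    return s[i + len(site): i + len(site) + 3]


if __name__ == "__main__":
    print("after SalI in forward primer:", triplet_after_site(FWD, SALI))
    # The reverse primer anneals to the coding strand, so its triplet is read as a reverse complement.
    print("before KpnI on coding strand:", revcomp(triplet_after_site(REV, KPNI)))
```

The forward primer carries ATG right after the SalI site and the reverse primer encodes TGA right before the KpnI site, as expected for a primer pair that spans a complete open reading frame.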

The above examples show that the phenotypes such as grain width, grain length and aspect ratio can be changed by properly regulating the expression of GW7 gene, thereby affecting the quality of rice grains.

Example 8: interaction protein analysis of GW7

The full-length GW7 ORF (2823 bp) from Taifeng A obtained in Example 7 was cloned into the pSY735 vector via SalI and SpeI to construct the nYFP-GW7 vector. The protein interactions of GW7 with OsTON1b and OsTON2 were analyzed by the BiFC method. The BiFC method used in this example is briefly as follows: rice protoplasts were obtained by digesting 9-day-old rice seedlings, grown in the dark at 28 ℃, with an enzymatic solution (components: 0.6 M mannitol, 10 mM MES (pH 5.7), 1.5% Cellulase RS, 0.75% Macerozyme, 0.1% BSA, 1 mM CaCl₂, 50 μg/mL carbenicillin); the nYFP-GW7 vector to be tested was co-transformed into rice protoplasts together with cYFP-OsTON1b or cYFP-OsTON2 by the PEG/Ca-mediated method (components: 0.6 M mannitol, 100 mM CaCl₂, PEG4000); the transformed protoplasts were cultured overnight at 28 ℃ in the dark and observed by laser scanning confocal microscopy. The results show that GW7 interacts with both OsTON1b and OsTON2 (FIG. 4), indicating that the function of GW7 as a TRM protein is conserved in rice, and suggesting that GW7 is involved in the regulation of plant cell division, such as the direction of cell division and elongation.

All documents mentioned herein are incorporated by reference. It should also be understood that, after reading the above description of the present invention, those skilled in the art may make various changes or modifications to the invention, and such equivalents also fall within the scope defined by the appended claims.

Claims

(1) SEQ ID NOs: 1-4;

(6) an active fragment of the nucleotide sequence of any one of (1) to (5); or

2. An isolated protein encoded by the gene GW7 for controlling rice grain width, grain length and rice appearance quality of claim 1, comprising an amino acid sequence selected from the group consisting of:

(1) SEQ ID NOs: 7 and 8,

(4) an active fragment of the amino acid sequence of (1) or (2) or (3),

(5) an amino acid sequence encoded by the polynucleotide molecule of claim 1.

3. The protein of claim 2, wherein said protein has a TRM conserved domain.

4. A recombinant construct characterized by comprising the polynucleotide sequence of gene GW7 of claim 1, the vector used for said construct being a cloning vector or an expression vector for expressing said polynucleotide.

5. A recombinant host cell characterized in that it comprises the polynucleotide sequence of the gene GW7 of claim 1 or the recombinant construct of claim 4, or its genome has integrated therein the polynucleotide sequence of the gene GW7 of claim 1, wherein said cell is selected from a plant cell or a microbial cell, wherein said microbial cell is preferably an e.

6. The recombinant host cell of claim 5, wherein the cell is a rice cell.

7. A method of growing a crop with increased yield, the method comprising: transfecting a crop plant with a recombinant Agrobacterium cell comprising the gene GW7 of claim 1 to obtain a transgenic crop plant; or crossing a plant comprising the gene GW7 for controlling rice grain shape and rice quality of claim 1, preferably a plant with altered grain shape, increased length-to-width ratio and improved quality, with another crop plant, the crop plant preferably being rice.

(1) SEQ ID NOs: 5 and 6;

(4) an active fragment of the nucleotide sequence of any one of (1) to (3); or

9. A construct comprising the promoter sequence of claim 8.

10. A molecular marker-assisted pyramiding breeding method for rice varieties with improved rice quality and yield, the method comprising: crossing a rice parent containing the gene GW7 of claim 1 and allelic variation thereof with a rice parent containing the gs3 gene, and selecting lines or varieties with pyramided GW7 and gs3 genes from the progeny.