WO2025089036A1 - Method for producing property determination model, property determination model, property determination method, and property determination device
- Publication number
- WO2025089036A1 (application PCT/JP2024/035869)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sample
- expression levels
- small rnas
- disease
- subject
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12M—APPARATUS FOR ENZYMOLOGY OR MICROBIOLOGY; APPARATUS FOR CULTURING MICROORGANISMS FOR PRODUCING BIOMASS, FOR GROWING CELLS OR FOR OBTAINING FERMENTATION OR METABOLIC PRODUCTS, i.e. BIOREACTORS OR FERMENTERS
- C12M1/00—Apparatus for enzymology or microbiology
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/48—Biological material, e.g. blood, urine; Haemocytometers
- G01N33/50—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
- G01N33/53—Immunoassay; Biospecific binding assay; Materials therefor
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/10—Gene or protein expression profiling; Expression-ratio estimation or normalisation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/20—Supervised data analysis
Definitions
- This disclosure relates to a method for generating a property determination model, a property determination model, a property determination method, and a property determination device.
- sample data is obtained that includes the expression levels of multiple types of microRNA in blood samples measured by a next-generation sequencer (hereinafter referred to as NGS), and the presence or absence of a disease is determined using a trained model that has been machine-learned using the microRNA expression level data of disease-free samples and multiple disease samples as training data (see Non-Patent Document 1).
- Non-Patent Document 1: Suzuki, K., Igata, H., Abe, M., Yamamoto, Y., small RNA based cancer classification project. Multiple cancer type classification by small RNA expression profiles with plasma samples from multiple facilities. Cancer Sci. 113, 2144-2166 (26 February 2022).
- small RNAs such as microRNAs are present in trace amounts in samples and are also very unstable. Therefore, if the conditions between when a sample is collected from a subject and when the expression levels of multiple small RNAs in the sample are measured are inappropriate, the measured value of the small RNA expression level may not reflect the expression level of the small RNAs contained in the sample at the time of collection.
- the values expressed as small RNA data indicating the measured values for each sample may differ between samples, even if the measured values reflect the expression levels of the small RNAs contained in the sample at the time of collection.
- various conditions include the measurement device itself that measures the expression levels of multiple small RNAs, the measurement conditions of the measurement device, and the processing conditions in the preprocessing performed before measuring the expression levels of multiple small RNAs (e.g., the reagents used in the preprocessing).
- If a property determination model for determining a subject's property (e.g., disease) is generated using the expression levels of small RNAs in samples measured under inappropriate conditions, or the expression levels of small RNAs measured under conditions that differ between samples, the model will be based on information (i.e., small RNA expression levels) that does not reflect the small RNAs in the sample at the time of collection, and hence does not reflect the subject. A model generated in this way may have low determination accuracy.
- a method for generating a property determination model is a method for generating a property determination model for determining the property of a subject based on the expression levels of multiple small RNAs, in which a sample from which the multiple small RNAs are derived is selected, and a sample that satisfies predetermined conditions from the time the sample is collected from the subject to the time the expression levels of the multiple small RNAs in the sample are measured is selected as a standard sample, and the property determination model is generated using the expression levels of the multiple small RNAs in the standard sample.
- FIG. 1 is a flow chart showing each step of a method for generating a disease trained model according to this embodiment.
- FIG. 2 is a flow chart showing each step of the disease determination method according to the present embodiment.
- FIG. 3 is a schematic block diagram of an example of a computer that functions as the disease determination device of this embodiment.
- A further figure is a block diagram showing the functional configuration of the disease determination device according to this embodiment.
- A further figure is a table showing the determination results in this embodiment.
- Fig. 1 is a schematic diagram showing each step of the disease trained model generation method 10 according to the present embodiment.
- the disease trained model generation method 10 is an example of a property determination model generation method.
- the disease-trained model generation method 10 is a generation method for selecting a specimen from which multiple small RNAs are derived when generating a disease-trained model for determining a disease in a subject based on the expression levels of multiple small RNAs, and selects a specimen that satisfies predetermined conditions from the time the specimen is collected from the subject until the expression levels of multiple small RNAs in the specimen are measured as a standard specimen, and generates a disease-trained model using the expression levels of multiple small RNAs in the standard specimen.
- selection is not limited to selecting some samples from multiple samples, but also includes selecting all samples from multiple samples.
- a standard sample is a sample that serves as a reference for generating a disease-trained model, and is a sample in which the conditions from when the sample is collected from a subject to when the expression levels of multiple small RNAs in the sample are measured satisfy predetermined conditions.
- a standard sample is a sample in which at least one of the collection conditions for collecting a sample from a subject, the storage conditions for storing the collected sample, and the measurement conditions for measuring the expression levels of multiple small RNAs in the subject satisfy predetermined conditions.
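The definition of a standard sample can be sketched as a simple filter. This is only an illustration under stated assumptions: the record type `SampleConditions`, its boolean fields, and the function names are hypothetical, and the reading that a configurable set of condition categories (at least one) is checked, with every checked category having to be satisfied, is one possible interpretation of the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleConditions:
    """Hypothetical record: was each handling condition met for a sample?"""
    collection_ok: bool
    storage_ok: bool
    measurement_ok: bool


def is_standard_sample(conditions, checked=("collection", "storage", "measurement")):
    """Return True if every checked condition category is satisfied.

    `checked` must name at least one category, mirroring "at least one of
    the collection, storage, and measurement conditions" (assumed reading).
    """
    if not checked:
        raise ValueError("at least one condition category must be checked")
    return all(getattr(conditions, f"{name}_ok") for name in checked)


def select_standard_samples(samples, checked=("collection", "storage", "measurement")):
    """Return the IDs of samples that qualify as standard samples."""
    return [
        sample_id
        for sample_id, conditions in samples.items()
        if is_standard_sample(conditions, checked)
    ]
```

If every sample qualifies, the result contains all of them, which matches the statement that selection also includes selecting all samples.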
- Specific examples of the collection conditions include various conditions in the various processes in the collection step 11 described below.
- Specific examples of the storage conditions include various conditions in the various processes in the storage step 13 described below.
- Specific examples of the measurement conditions include various conditions in the various processes in the measurement step 12 described below.
- specimens collected from subjects include body fluids, cells, extracellular vesicles, and tissue fragments.
- body fluids include blood, serum, urine, tears, saliva, sweat, semen, lymph, tissue fluid, body cavity fluid (e.g., pleural fluid, ascites, etc.), cerebrospinal fluid, amniotic fluid, vaginal fluid, and nasal mucus.
- cells include red blood cells, white blood cells, and platelets.
- extracellular vesicles include exosomes and liposomes.
- tissue fragments include FFPE (Formalin Fixed Paraffin Embedded) specimens, biopsy specimens, and frozen specimens.
- specimens may be specimens from which multiple small RNAs are derived, in other words, specimens from which the expression levels of multiple small RNAs can be measured.
- subject refers to a concept that indicates the subject from whom a specimen is collected.
- the subject from whom a specimen is collected may be a human or a non-human animal.
- Non-human animals include non-human mammals (monkeys, dogs, cats, mice, rats, rabbits, cows, horses, pigs, sheep, etc.), birds (chickens, quails, etc.), etc.
- An example of a small RNA is microRNA.
- the small RNA may also be a small RNA other than microRNA (e.g., piRNA and tsRNA).
- the disease trained model generation method 10 is a method for generating a disease trained model used in the disease assessment method 100 described below, which determines the presence or absence of a disease, by inputting small RNA data indicating the results of measuring the expression levels of multiple small RNAs in a sample from a subject into the disease trained model.
- the generation method 10 is a generation method for generating a disease trained model as an example of a property assessment model for determining the property of a subject, but the property of the subject to be assessed is not limited to the subject's disease. In other words, the property may be any property that can be assessed based on the expression levels of multiple small RNAs. Examples of property assessment include confirming the efficacy of the subject's drug, determining the possibility of disease recurrence, confirming/assessing lifestyle habits such as checking the presence or absence of a smoking history or drinking history, and predicting physical age.
- the collecting step 11 is a step of collecting a body fluid as a specimen from a subject.
- the collecting step 11 is performed before the execution of the storage step 13 and the measurement step 12.
- the collection process is performed as follows.
- blood is collected using a vacuum blood collection tube containing a serum separating agent (blood collection process).
- blood is mixed by inversion.
- the blood is then allowed to coagulate at room temperature for at least 30 minutes.
- the mixture is centrifuged and the serum is separated (centrifugation treatment).
- the serum is separated and stored at -80°C (storage treatment).
- the process from collecting bodily fluids from the subject to completing the centrifugation process (specifically, the process from the blood collection process to the preservation process described above) is performed within a predetermined time (e.g., 2 hours).
- the longer the processing time from collection of bodily fluids from the subject to completion of the centrifugation operation (hereinafter, sometimes referred to as collection processing time), the more the bodily fluids deteriorate, and the more the measurement results in the measurement step 12 change.
- For example, as the collection processing time becomes longer, small RNAs such as microRNAs contained in the bodily fluid are degraded, and the expression level of small RNAs measured in the measurement step 12 may decrease.
- Conversely, as the collection processing time becomes longer, small RNAs may leak from the formed components (e.g., blood cells and platelets) that settle by centrifugation, and the amount of small RNAs present in the bodily fluid increases.
- When the bodily fluid is serum, as the time allowed for blood coagulation or the processing time from centrifugation to separation of the serum becomes longer, small RNAs leak from the formed components, and the expression level of small RNAs measured in the measurement step 12 may increase.
- Furthermore, as the collection processing time becomes longer, small RNAs are degraded, and the length of the base sequence to be measured may fall below the threshold, so that the small RNA becomes undetectable. In this way, degradation of small RNA may change the amount of small RNA measured in the measurement step 12. This phenomenon becomes noticeable when the collection processing time exceeds a predetermined time, and it affects the measurement results obtained in the measurement step 12. For this reason, the collection step 11 is required to be completed within the predetermined time (e.g., 2 hours).
- the collection process 11 may include at least a process of collecting bodily fluid from the subject and a process of centrifuging the bodily fluid.
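The time limit on the collection step can be expressed as a small check. The sketch below assumes that timestamps are recorded for the moment of collection and for the completion of centrifugation; the function names and the constant are hypothetical, and the 2-hour value is only the example given in the text.

```python
from datetime import datetime, timedelta

# Example limit from the text ("e.g., 2 hours"); an assumption, not a fixed rule.
PREDETERMINED_TIME = timedelta(hours=2)


def collection_processing_time(collected_at, centrifugation_done_at):
    """Elapsed time from collection to completion of centrifugation."""
    if centrifugation_done_at < collected_at:
        raise ValueError("centrifugation cannot finish before collection")
    return centrifugation_done_at - collected_at


def within_predetermined_time(collected_at, centrifugation_done_at,
                              limit=PREDETERMINED_TIME):
    """True if the collection processing time does not exceed the limit."""
    return collection_processing_time(collected_at, centrifugation_done_at) <= limit


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0)
    print(within_predetermined_time(start, start + timedelta(minutes=90)))
    print(within_predetermined_time(start, start + timedelta(hours=24)))
```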
- the preservation step 13 is a step of preserving the collected body fluid (e.g., serum) at a predetermined preservation temperature (e.g., -80°C).
- the preservation temperature is not limited to -80°C, and can be set as appropriate, for example, to a temperature lower than 4°C.
- the body fluid collected from the subject may be transported, for example, to a measurement location where the measurement step 12 is performed, while being preserved at a predetermined preservation temperature.
- the measurement step 12 may include a pretreatment step performed before measuring the expression levels of the multiple small RNAs in the subject's body fluid.
- the pretreatment step may include various pretreatment steps performed on the sample (e.g., centrifugation, storage, library preparation, etc.).
- the various pretreatment steps may be performed using various reagents.
- each step from collecting a sample from a subject to measuring the expression levels of multiple small RNAs in the sample is performed under the same conditions as those in each step in the production method 10 (specifically, collection step 11, storage step 13, and measurement step 12) (in other words, the predetermined conditions in the selection step 14).
- the small RNA data is small RNA data obtained from a sample in which each step from collection of the sample from the subject to measurement of the expression levels of multiple small RNAs in the sample satisfies predetermined conditions (i.e., the same conditions as the predetermined conditions in selection step 14 of production method 10).
- the disease trained model is a disease trained model generated by the above-mentioned generation method 10. That is, the disease trained model is generated based on the expression levels of multiple small RNAs, and a sample from which the multiple small RNAs are derived is selected during the generation, and the sample that satisfies predetermined conditions from the time the sample is collected from the subject to the time the expression levels of the multiple small RNAs in the sample are measured is selected as a standard sample, and the disease trained model is generated using the expression levels of the multiple small RNAs in the standard sample.
- the disease trained model is an example of a property determination model that determines the property of a subject.
- In the disease determination process 116, the presence or absence of a disease is determined for the specimen selected in the selection process 114.
- the disease determination process 116 is not performed for specimens that were not selected in the selection process 114, i.e., specimens that do not satisfy the selection conditions.
- the fact that the selection conditions were not satisfied may be presented to the user (i.e., the person executing the disease determination method 100).
- the disease determined by the disease trained model is the same as the disease used for training in the processing time trained model, storage condition trained model, and degree of hemolysis trained model.
- For example, when the disease used for training in the processing time trained model, storage condition trained model, and degree of hemolysis trained model is cancer, the disease determined by the disease discrimination trained model is also cancer.
- the measurement data used in the disease discrimination trained model is the same as the measurement data used in the processing time trained model, storage condition trained model, degree of hemolysis trained model, and preservative addition time trained model.
- the disease determination step 116 was not performed for samples that were not selected in the selection step 114, but this is not limited to the above.
- the disease determination step 116 may be performed for samples that were not selected in the selection step 114, for example, if the determination result is to be obtained as reference data. In this case, for example, the user may be informed that a determination was made even though the selection conditions were not satisfied.
- the disease determination step 116 was performed after the selection step 114, but it may be performed before the selection step 114. In this case, if the selection conditions are not satisfied in the selection step 114, the determination result of the disease determination step 116 is treated as, for example, reference data.
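The relationship between the selection step 114 and the disease determination step 116, including the reference-data variant, can be sketched as follows. The names `Determination`, `run_determination`, `passes_selection`, and `disease_model` are hypothetical placeholders; the selection check and the disease-trained model are passed in as plain callables because the text does not specify their interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Determination:
    selected: bool                   # did the specimen pass selection?
    disease_present: Optional[bool]  # None when no determination was made
    note: str                        # message that could be shown to the user


def run_determination(
    small_rna_data: Sequence[float],
    passes_selection: Callable[[Sequence[float]], bool],
    disease_model: Callable[[Sequence[float]], bool],
    allow_reference: bool = False,
) -> Determination:
    """Gate the disease determination on the selection result."""
    if passes_selection(small_rna_data):
        return Determination(True, disease_model(small_rna_data),
                             "selection conditions satisfied")
    if allow_reference:
        # Variant: determine anyway, but flag the result as reference data.
        return Determination(False, disease_model(small_rna_data),
                             "reference only: selection conditions not satisfied")
    return Determination(False, None,
                         "not determined: selection conditions not satisfied")
```

Returning the note together with the result keeps the "inform the user" behaviour in one place, whichever order the two steps are run in.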
- the property determination system 20 includes a measurement device 21 and a disease determination device 30, as shown in FIG.
- the measurement device 21 is a device that executes the above-mentioned measurement step 112. That is, the measurement device 21 measures the expression levels of a plurality of small RNAs contained in each of a plurality of samples.
- As the measurement device 21, for example, an NGS is used.
- the disease determination device 30 executes the above-mentioned selection step 114. That is, the disease determination device 30 selects, from among a plurality of samples in which the expression levels of a plurality of small RNAs have been measured by the measurement device 21, a sample in which the conditions from collection of the sample from the subject to measurement of the expression levels of a plurality of small RNAs in the sample satisfy a predetermined condition as a sample to be determined.
- the disease determination device 30 is an example of a property determination device.
- the disease determination device 30 is an example of a property determination device, but the property determination system 20 including the measurement device 21 may be understood as an example of a property determination device.
- the disease assessment device 30 further executes the disease assessment step 116 described above. That is, the presence or absence of a disease is assessed by inputting small RNA data indicating the expression levels of multiple small RNAs in the selected specimen into the disease-trained model generated by the generation method 10 described above.
- the disease assessment device 30 functions as a computer, and as shown in FIG. 3, has a CPU (Central Processing Unit) 31, a ROM (Read Only Memory) 32, a RAM (Random Access Memory) 33, storage 34, an input unit 35, a display unit 36, and a communication interface (I/F) 37. Each component is connected to each other via a bus 39 so that they can communicate with each other.
- CPU 31 (an example of a processor) is a central processing unit that executes various programs and controls each part. That is, CPU 31 reads a program from ROM 32 or storage 34, and executes the program using RAM 33 as a working area. CPU 31 controls each of the above components and performs various calculation processes according to the program stored in ROM 32 or storage 34.
- ROM 32 records various programs and various data.
- RAM 33 temporarily stores programs or data as a working area.
- Storage 34 is composed of a HDD (Hard Disk Drive) or SSD (Solid State Drive), and records various programs including the operating system, and various data.
- a disease assessment program for executing a disease assessment process that performs the disease assessment method described above is recorded in storage 34.
- the disease assessment program may be a single program, or a group of programs consisting of multiple programs or modules.
- the disease assessment program may be recorded in ROM 32. ROM 32 and storage 34 function as an example of a non-transitory recording medium.
- processors are not limited to the aforementioned CPU, which is a general-purpose processor, but may be, for example, a dedicated processor made up of a circuit designed specifically to execute a specific process. Also, an example of a processor is not limited to a single processor, but may be a processor made up of multiple processors working together at physically separate locations.
- the input unit 35 includes a pointing device such as a mouse and a keyboard, and is used to perform various inputs.
- the input unit 35 also receives as input information on the expression levels of multiple small RNAs measured by the measurement device 21.
- the display unit 36 is, for example, a liquid crystal display, and displays various information.
- In the disease assessment device 30, for example, the selection results and the results of the assessment of the presence or absence of disease can be presented to the user through the display unit 36.
- the display unit 36 may also function as the input unit 35 by employing a touch panel system.
- the communication interface 37 is an interface for communicating with other devices, and uses standards such as Ethernet (registered trademark), FDDI (Fiber Distributed Data Interface), and Wi-Fi (registered trademark).
- the CPU 31 executes the disease assessment program to function as a selection unit 160 and a disease assessment unit 170.
- the selection unit 160 executes the selection step 114 described above. That is, the selection unit 160 selects, from among a plurality of samples in which the expression levels of a plurality of small RNAs have been measured by the measurement device 21, a sample for which the conditions from when the sample is collected from the subject to when the expression levels of a plurality of small RNAs in the sample are measured satisfy predetermined conditions, as a sample to be evaluated.
- the disease determination unit 170 executes the disease determination step 116 described above. That is, the disease determination unit 170 determines the presence or absence of a disease by inputting small RNA data indicating the expression levels of multiple small RNAs in the selected specimen into the disease-trained model generated by the generation method 10 described above.
- the property determination system 20 includes a measurement device 21 and a disease determination device 30, but the property determination system 20 may be configured with a single device.
- the disease assessment device 30 may also be composed of multiple devices.
- the disease assessment device 30 may be composed of multiple (e.g., two) devices that share the tasks of performing the above-mentioned selection process 114 and the disease assessment process 116.
- the method for generating a disease-trained model according to this embodiment is, as described above, a method for selecting a sample from which multiple small RNAs are derived when generating a disease-trained model for determining a disease in a subject based on the expression levels of multiple small RNAs, in which a sample that satisfies predetermined conditions (hereinafter referred to as selection conditions) from the time the sample is collected from the subject to the time the expression levels of multiple small RNAs in the sample are measured is selected as a standard sample, and a disease-trained model is generated using the expression levels of multiple small RNAs in the standard sample. Therefore, it is possible to generate a disease-trained model with higher accuracy of judgment compared to a case where a disease-trained model is generated using the expression levels of multiple small RNAs in a sample without selecting the standard sample.
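The idea of generating the model from standard samples only can be illustrated with a deliberately simple classifier. The sketch below uses a nearest-centroid classifier as a stand-in; the text does not state which learning algorithm is used, so the classifier, the function names, and the `(vector, label, is_standard)` tuple layout are all assumptions.

```python
import math
from statistics import fmean


def fit_centroids(samples):
    """Fit one centroid per label, using standard samples only.

    `samples` is an iterable of (expression_vector, label, is_standard).
    Non-standard samples are skipped, mirroring the selection step.
    """
    by_label = {}
    for vector, label, is_standard in samples:
        if not is_standard:
            continue
        by_label.setdefault(label, []).append(vector)
    return {
        label: tuple(fmean(column) for column in zip(*vectors))
        for label, vectors in by_label.items()
    }


def predict(centroids, vector):
    """Return the label whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vector))
```

Because the filter runs inside `fit_centroids`, a mishandled sample cannot shift a class centroid even if it is present in the input.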
- the selection condition is the collection processing time from when a bodily fluid is collected from a subject as a sample to when centrifugation is completed, and in this generation method, a sample with an appropriate collection processing time is selected as a standard sample, and a disease-trained model is generated using the expression levels of multiple small RNAs in the standard sample. Therefore, a disease-trained model with higher determination accuracy can be generated compared to a case in which a sample is selected as a standard sample regardless of the appropriateness of the collection processing time and a disease-trained model is generated using the expression levels of multiple small RNAs in that sample.
- the small RNA data is input to a collection processing time trained model that has been machine-learned to learn the correlation between measurement data showing the results of measuring the expression levels of multiple small RNAs in body fluids collected from multiple subjects, including subjects suffering from a disease, and collection processing time data showing the collection processing time of the body fluids collected from the multiple subjects, thereby testing the appropriateness of the collection processing time and selecting samples with appropriate processing times as standard samples.
- a collection processing time trained model is used that uses not only measurement data from healthy subjects but also measurement data from subjects suffering from a disease as training data, so that the appropriateness of the collection processing time can be tested with high accuracy and samples with inappropriate collection processing times can be prevented from being selected as standard samples.
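One way to picture a collection processing time trained model is as a regressor from expression levels to processing time, followed by a threshold test. The sketch below uses k-nearest-neighbour regression purely as a stand-in, since the text does not name the algorithm. All names are hypothetical, the 2-hour limit is the example value from the text, and the numbers in the usage section are synthetic toy values, not data from the source.

```python
import math


def predict_processing_hours(train_vectors, train_hours, query, k=3):
    """Estimate the collection processing time (hours) of `query` as the
    mean time of its k nearest training samples (Euclidean distance)."""
    if not 1 <= k <= len(train_vectors):
        raise ValueError("k must be between 1 and the number of training samples")
    ranked = sorted(
        (math.dist(vector, query), hours)
        for vector, hours in zip(train_vectors, train_hours)
    )
    return sum(hours for _, hours in ranked[:k]) / k


def processing_time_is_appropriate(train_vectors, train_hours, query,
                                   limit_hours=2.0, k=3):
    """True if the estimated processing time is within the limit."""
    return predict_processing_hours(train_vectors, train_hours, query, k) <= limit_hours


if __name__ == "__main__":
    # Synthetic toy values for illustration only.
    vectors = [(10.0, 5.0), (9.5, 5.2), (6.0, 8.0), (5.5, 8.5)]
    hours = [1.0, 1.5, 24.0, 24.0]
    print(processing_time_is_appropriate(vectors, hours, (9.8, 5.1), k=2))
    print(processing_time_is_appropriate(vectors, hours, (5.8, 8.2), k=2))
```

Per the text, the training data would include samples from subjects with the disease as well as healthy subjects, so that disease-related expression changes are not mistaken for handling effects.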
- the selection conditions are the storage temperature of the bodily fluid collected from the subject as a sample and the storage time at that storage temperature, and in this generation method, a sample with an appropriate storage temperature and storage time is selected as a standard sample, and a disease-trained model is generated using the expression levels of multiple small RNAs in the standard sample. Therefore, a disease-trained model with higher determination accuracy can be generated compared to a case in which a sample is selected as a standard sample regardless of the appropriateness of the storage temperature and storage time and a disease-trained model is generated using the expression levels of multiple small RNAs in that sample.
- the small RNA data is input into a storage condition learned model that has been machine-learned to learn the correlation between measurement data indicating the results of measuring the expression levels of multiple small RNAs in body fluids collected from multiple subjects, including subjects suffering from a disease, and storage condition data indicating the storage temperature of the body fluids collected from the multiple subjects and the storage time at that storage temperature, thereby inspecting the appropriateness of the storage temperature and storage time at that storage temperature, and selecting samples with appropriate storage temperatures and storage times as standard samples.
- a storage condition trained model is used that uses not only measurement data from samples from healthy subjects but also measurement data from samples from diseased subjects as training data, so that the appropriateness of the storage temperature and storage time can be inspected with high accuracy, and samples with inappropriate storage temperatures and times can be prevented from being selected as standard samples.
- the selection condition is the degree of hemolysis of the blood sample collected from the subject, and in this generation method, a sample with an appropriate degree of hemolysis is selected as a standard sample, and a disease-trained model is generated using the expression levels of multiple small RNAs in the standard sample. Therefore, a disease-trained model with higher determination accuracy can be generated compared to a case in which a sample is selected as a standard sample regardless of the appropriateness of the degree of hemolysis and a disease-trained model is generated using the expression levels of multiple small RNAs in that sample.
- the degree of hemolysis is tested by inputting small RNA data into a hemolysis degree trained model that has been machine-learned to learn the correlation between measurement data showing the results of measuring the expression levels of multiple small RNAs in blood collected from multiple subjects and hemolysis degree data showing the degree of hemolysis of the blood collected from the multiple subjects, and a sample with an appropriate degree of hemolysis is selected as a standard sample.
- Example 1: The collection step 11 and the storage step 13 were appropriately performed for 64 lung cancer patients to obtain hemolysis-free serum. That is, for each of the 64 lung cancer patients, hemolysis-free serum that had been collected and processed within 2 hours and stored at -80°C under appropriate storage conditions (storage time and storage temperature) was obtained as a standard specimen. The expression levels of multiple small RNAs in the serum were measured using NGS, and small RNA data showing the measurement results were obtained.
- the collection step 11 and the storage step 13 were appropriately performed for 133 healthy subjects to obtain serum free of hemolysis. That is, serum free of hemolysis was obtained as a standard specimen from 133 healthy subjects, which was collected and processed within 2 hours and stored at -80°C under appropriate storage conditions (storage time and storage temperature). The expression levels of multiple small RNAs in the serum were measured using NGS, and small RNA data showing the measurement results was obtained. Then, these small RNA data were used as training data to generate a disease-trained model 1 that determines the disease of a subject.
- Comparative Example 1: Serum was obtained from 64 lung cancer patients in the same manner as in Example 1. That is, hemolysis-free serum was obtained from 64 lung cancer patients, with a collection processing time within 2 hours and storage at -80°C under appropriate storage conditions (storage time and storage temperature). The expression levels of multiple small RNAs in the serum were measured using NGS, and small RNA data showing the measurement results were obtained.
- serum was obtained from 109 healthy subjects in the same manner as in Example 1. That is, serum was obtained from 109 healthy subjects, with the collection and processing time being within 2 hours, and the serum was stored at -80°C under appropriate storage conditions (storage time and storage temperature), and free of hemolysis.
- In Comparative Example 1, hemolysis-free serum was also obtained from 24 healthy subjects; for this serum, the collection processing time was 24 hours (longer than 2 hours).
- the serum was stored at -80°C, which is an appropriate storage condition (storage time and storage temperature).
- the serum was measured for the expression levels of multiple small RNAs using NGS, and small RNA data showing the measurement results was obtained. Then, these small RNA data were used as training data to generate a disease-trained model 2 that determines the disease of the subject.
- Comparative Example 2 serum was obtained from 64 lung cancer patients in the same manner as in Example 1. That is, serum was obtained from 64 lung cancer patients, with the collection and processing time being within 2 hours, and the serum was stored under appropriate storage conditions (storage time and storage temperature) at -80°C, and free of hemolysis. Expression levels of multiple small RNAs were measured for the serum using NGS, and small RNA data showing the measurement results were obtained.
- serum was obtained from 109 healthy subjects in the same manner as in Example 1. That is, serum was obtained from 109 healthy subjects, with the collection and processing time being within 2 hours, and the serum was stored at -80°C under appropriate storage conditions (storage time and storage temperature), and free of hemolysis.
- In Comparative Example 2, hemolysis-free serum was also obtained from 24 healthy subjects and stored under inappropriate storage conditions (a storage temperature of 4°C for 24 hours). The serum was collected and processed within 2 hours. The expression levels of multiple small RNAs in the serum were measured using NGS, and small RNA data showing the measurement results were obtained. These small RNA data were then used as training data to generate a disease-trained model 3 that determines the disease of the subject.
- Comparative Example 3 serum was obtained from 64 lung cancer patients in the same manner as in Example 1. That is, serum was obtained from 64 lung cancer patients, with the collection processing time being within 2 hours, and the serum was stored under appropriate storage conditions (storage time and storage temperature) at -80°C, and free of hemolysis. Expression levels of multiple small RNAs were measured for the serum using NGS, and small RNA data showing the measurement results were obtained.
- serum was obtained from 109 healthy subjects in the same manner as in Example 1. That is, serum was obtained from 109 healthy subjects, with the collection and processing time being within 2 hours, and the serum was stored at -80°C under appropriate storage conditions (storage time and storage temperature), and free of hemolysis.
- <Example of learning model> In generating a disease trained model, various linear and nonlinear algorithms known as machine learning algorithms, or a combination of multiple algorithms, can be used. For example, the following algorithms can be used:
- a method for generating a property determination model that determines a property of a subject based on the expression levels of a plurality of small RNAs, in which a sample from which the plurality of small RNAs are derived is selected, the method comprising: selecting, as a standard sample, a sample for which the conditions from collection of the sample from the subject to measurement of the expression levels of the plurality of small RNAs in the sample satisfy a predetermined condition; and generating the property determination model using the expression levels of the plurality of small RNAs in the standard sample.
- the condition is a storage temperature of the body fluid as the sample collected from the subject and a storage time at the storage temperature;
- a sample having the appropriate storage temperature and storage time is selected as the standard sample;
- (Aspect 6) small RNA data showing the results of measuring the expression levels of multiple small RNAs in the body fluid of the subject are input into a storage condition trained model, which has been trained by machine learning on the correlation between measurement data showing the results of measuring the expression levels of multiple small RNAs in the body fluids of multiple subjects, including subjects suffering from a disease, and storage condition data showing the storage temperature of the body fluids collected from the multiple subjects and the storage time at that storage temperature, thereby examining the suitability of the storage temperature and the storage time at that storage temperature;
- a sample having the appropriate storage temperature and storage time is selected as the standard sample; generating a trait determination model for determining a disease as the trait of a subject using the expression levels of the plurality of small RNAs in the standard specimen;
- the condition is the degree of hemolysis of the blood collected from the subject as the sample; a sample having an appropriate degree of hemolysis is selected as the standard sample; and the property determination model is generated using the expression levels of the plurality of small RNAs in the standard sample. The method for generating a property determination model according to any one of aspects 2 to 6.
- small RNA data showing the results of measuring the expression levels of a plurality of small RNAs in the body fluid of the subject are input into a hemolysis degree trained model, which has been trained by machine learning on the correlation between measurement data showing the results of measuring the expression levels of a plurality of small RNAs in blood collected from a plurality of subjects and hemolysis degree data showing the degree of hemolysis of the blood collected from the plurality of subjects, thereby examining the degree of hemolysis;
- a sample having an appropriate degree of hemolysis is selected as the standard sample; generating the property determination model using the expression levels of the plurality of small RNAs in the standard specimen;
- (Aspect 9) The method for generating a property determination model according to any one of aspects 1 to 8, wherein a property determination model for determining a disease as the property of the subject is generated.
- a property determination model for determining a property of a subject, the property determination model being generated based on the expression levels of a plurality of small RNAs, with a sample from which the plurality of small RNAs are derived being selected during the generation, wherein a sample for which the conditions from collection of the sample from the subject to measurement of the expression levels of the plurality of small RNAs in the sample satisfy predetermined conditions is selected as a standard sample, and the property determination model is generated using the expression levels of the plurality of small RNAs in the standard sample.
- a method for determining a property comprising inputting small RNA data showing the results of measuring expression levels of a plurality of small RNAs in a specimen of a subject into the property determination model according to aspect 10, thereby determining the property.
- the property determination method according to aspect 11, wherein the small RNA data are data obtained from a sample for which the conditions from collection of the sample from the subject to measurement of the expression levels of the plurality of small RNAs in the sample satisfy the predetermined conditions.
- a property determination device comprising the property determination model according to aspect 10, wherein the property determination device determines the property by inputting small RNA data showing the results of measuring the expression levels of a plurality of small RNAs in the body fluid of the subject into the property determination model.
Abstract
Description
This disclosure relates to a method for generating a property determination model, a property determination model, a property determination method, and a property determination device.
A known determination method acquires sample data that includes the expression levels of multiple types of microRNA in blood samples measured by a next-generation sequencer (hereinafter referred to as NGS), and determines the presence or absence of a disease using a trained model obtained by machine learning with the microRNA expression level data of disease-free samples and multiple disease samples as training data (see Non-Patent Document 1).
Non-Patent Document 1: Suzuki, K., Igata, H., Abe, M., Yamamoto, Y., small RNA based cancer classification project, Multiple cancer type classification by small RNA expression profiles with plasma samples from multiple facilities. Cancer Sci. 113, 2144-2166 (26 February 2022).
Small RNAs such as microRNAs are present in only trace amounts in a sample and are very unstable. Therefore, if, for example, the conditions between collecting a sample from a subject and measuring the expression levels of multiple small RNAs in that sample are inappropriate, the measured expression levels may not reflect the expression levels of the small RNAs contained in the sample at the time of collection.
In addition, even if the conditions from collection of a sample to measurement are appropriate, when various conditions differ among multiple samples, the values expressed as small RNA data may differ between samples, even though the measured value for each sample reflects the expression levels of the small RNAs contained in that sample at the time of collection. Examples of such conditions include the measurement device itself that measures the expression levels of the multiple small RNAs, the measurement conditions of the measurement device, and the processing conditions of the preprocessing performed before the measurement (e.g., the reagents used in the preprocessing).
Therefore, if a property determination model for determining a property of a subject (e.g., a disease) is generated using the small RNA expression levels of samples for which these conditions were inappropriate, or small RNA expression levels measured under conditions that differ between samples, the model is generated from information (i.e., small RNA expression levels) that does not reflect the small RNAs in the sample at the time of collection, that is, in the subject. The resulting property determination model may then have low determination accuracy.
The objective of this disclosure is to make it possible to generate a property determination model with high determination accuracy.
A method for generating a property determination model according to one aspect of the present disclosure is a method in which, when generating a property determination model that determines a property of a subject based on the expression levels of multiple small RNAs, the samples from which the multiple small RNAs are derived are selected. In this method, a sample for which the conditions from collection of the sample from the subject to measurement of the expression levels of the multiple small RNAs in the sample satisfy predetermined conditions is selected as a standard sample, and the property determination model is generated using the expression levels of the multiple small RNAs in the standard sample.
According to this disclosure, it is possible to generate a property determination model with high determination accuracy.
An example of an embodiment of the technology of the present disclosure is described below with reference to the drawings. Components and processes that perform the same operation, action, or function are given the same reference numerals throughout the drawings, and duplicated explanations may be omitted as appropriate. Each drawing is only a schematic illustration, sufficient to allow the technology of the present disclosure to be understood. The technology of the present disclosure is therefore not limited to the illustrated examples. Furthermore, in this embodiment, explanations of configurations that are not directly related to the present disclosure, and of well-known configurations, may be omitted.
<Disease trained model generation method 10>
First, the disease trained model generation method 10 according to the present embodiment will be described. Fig. 1 is a schematic diagram showing each step of the disease trained model generation method 10 according to the present embodiment. The disease trained model generation method 10 is an example of a method for generating a property determination model.
The disease trained model generation method 10 is a generation method in which, when generating a disease trained model that determines a disease of a subject based on the expression levels of multiple small RNAs, the samples from which the multiple small RNAs are derived are selected. A sample for which the conditions from collection of the sample from the subject to measurement of the expression levels of the multiple small RNAs in the sample satisfy predetermined conditions is selected as a standard sample, and the disease trained model is generated using the expression levels of the multiple small RNAs in the standard sample.
Note that "selection" is not limited to choosing some of the multiple samples; it also includes choosing all of them.
A standard sample is a sample that serves as a reference for generating the disease trained model, and is a sample for which the conditions from collection of the sample from the subject to measurement of the expression levels of the multiple small RNAs in the sample satisfy predetermined conditions. Specifically, a standard sample is a sample for which at least one of the following satisfies a predetermined condition: the collection conditions under which the sample is collected from the subject, the storage conditions under which the collected sample is stored, and the measurement conditions under which the expression levels of the multiple small RNAs of the subject are measured. The collection conditions correspond, for example, to the conditions of the various processes in the collection step 11 described below. The storage conditions correspond, for example, to the conditions of the various processes in the storage step 13 described below. The measurement conditions correspond, for example, to the conditions of the various processes in the measurement step 12 described below.
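The selection of standard samples described above can be sketched as a simple filter. This is a minimal illustration, not the method of the disclosure: the `Specimen` record, its field names, and the function names are hypothetical, and the thresholds (collection processing within 2 hours, storage below 4°C) are borrowed from the example values given elsewhere in this description. The sketch also assumes that both checked conditions must hold, whereas the text only requires that at least one kind of condition be checked.

```python
from dataclasses import dataclass


@dataclass
class Specimen:
    # Hypothetical record of how one specimen was handled.
    specimen_id: str
    collection_hours: float  # collection to end of centrifugation, in hours
    storage_temp_c: float    # storage temperature in degrees Celsius


# Illustrative thresholds (assumptions based on the example values in the text).
MAX_COLLECTION_HOURS = 2.0
STORAGE_TEMP_LIMIT_C = 4.0  # storage must be below this temperature


def is_standard_specimen(s: Specimen) -> bool:
    """Return True if the specimen satisfies the predetermined conditions."""
    return (s.collection_hours <= MAX_COLLECTION_HOURS
            and s.storage_temp_c < STORAGE_TEMP_LIMIT_C)


def select_standard_specimens(specimens):
    """Keep only specimens that qualify as standard specimens."""
    return [s for s in specimens if is_standard_specimen(s)]
```

Only the small RNA data of the specimens returned by `select_standard_specimens` would then be passed on as training data.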
Examples of samples collected from a subject include body fluids, cells, extracellular vesicles, and tissue fragments. Examples of body fluids include blood, serum, urine, tears, saliva, sweat, semen, lymph, tissue fluid, body cavity fluid (e.g., pleural fluid and ascites), cerebrospinal fluid, amniotic fluid, vaginal fluid, and nasal mucus. Examples of cells include red blood cells, white blood cells, and platelets. Examples of extracellular vesicles include exosomes and liposomes. Examples of tissue fragments include FFPE (Formalin Fixed Paraffin Embedded) specimens, biopsy specimens, and frozen specimens. Any sample from which multiple small RNAs are derived, in other words any sample in which the expression levels of multiple small RNAs can be measured, may be used. The term "subject" denotes the individual from whom a sample is collected. The subject may be a human or a non-human animal. Examples of non-human animals include non-human mammals (monkeys, dogs, cats, mice, rats, rabbits, cows, horses, pigs, sheep, etc.) and birds (chickens, quails, etc.).
An example of a small RNA is microRNA. The small RNA may also be a small RNA other than microRNA (e.g., piRNA or tsRNA).
In this embodiment, the disease trained model generation method 10 generates the disease trained model used in the disease determination method 100 described below, which determines the presence or absence of a disease by inputting, into the disease trained model, small RNA data showing the results of measuring the expression levels of multiple small RNAs in a sample from a subject. The generation method 10 generates a disease trained model as an example of a property determination model for determining a property of a subject, but the property to be determined is not limited to the subject's disease. The property may be any property that can be determined based on the expression levels of multiple small RNAs. Examples of property determination include confirming the efficacy of a drug in the subject, determining the possibility of disease recurrence, confirming or determining lifestyle habits such as the presence or absence of a smoking or drinking history, and predicting physical age.
Specifically, as shown in Fig. 1, the disease trained model generation method 10 includes a collection step 11, a storage step 13, a measurement step 12, a selection step 14, and a model generation step 17. In this embodiment, the collection step 11, the storage step 13, the measurement step 12, the selection step 14, and the model generation step 17 are performed in this order, as an example. In this embodiment, the disease trained model generation method 10 is an example of a method for generating a property determination model, but the selection step 14 and the model generation step 17, which are part of the disease trained model generation method 10, may also be regarded as an example of a method for generating a property determination model.
The following describes each step of the disease trained model generation method 10, the disease determination method 100 using the disease trained model, and the property determination system 20 that executes the disease determination method 100.
<Collection step 11>
The collection step 11 is a step of collecting a body fluid as a sample from a subject. The collection step 11 is performed before the storage step 13 and the measurement step 12. When serum is collected as the body fluid in the collection step 11, the collection is performed, for example, as follows.
First, blood is collected using a vacuum blood collection tube containing a serum separating agent (blood collection process).
Next, immediately after the blood collection process, the blood is mixed by inversion.
Next, the blood is left at room temperature for at least 30 minutes to coagulate.
Next, the blood is centrifuged and the serum is separated (centrifugation process).
Then, the separated serum is stored at -80°C (storage process).
In the collection step 11, the steps from collecting the body fluid from the subject to completing the centrifugation operation (specifically, the steps from the blood collection process to the storage process described above) are performed within a predetermined time (e.g., 2 hours). The centrifugation operation is, for example, the operation from the centrifugation process to the storage process described above.
The longer the processing time from collection of the body fluid from the subject to completion of the centrifugation operation (hereinafter sometimes referred to as the collection processing time), the more the body fluid deteriorates, and the measurement results in the measurement step 12 may change. Specifically, as the collection processing time becomes longer, small RNAs such as microRNAs contained in the body fluid are degraded, so the expression levels of small RNAs measured in the measurement step 12 may decrease. In addition, the longer the collection processing time, the more small RNAs leak from the formed components of the body fluid that are sedimented by centrifugation (e.g., blood cells and platelets), increasing the amount of small RNAs present in the body fluid. Specifically, when the body fluid is serum, a longer blood coagulation time or a longer processing time up to centrifugation and separation of the serum causes small RNAs to leak from the formed components, so the expression levels of small RNAs measured in the measurement step 12 may increase. Furthermore, the longer the collection processing time, the more the small RNAs are degraded, and a small RNA is no longer detected once the length of its measured base sequence falls below a threshold.
In this way, degradation of small RNAs may change the amount of small RNAs measured in the measurement step 12. These phenomena become pronounced when the collection processing time exceeds a predetermined time, and they affect the small RNA measurement results obtained in the measurement step 12. For this reason, in the collection step 11, the collection process is required to be performed within the predetermined time (e.g., 2 hours).
The collection step 11 need only include at least a step of collecting the body fluid from the subject and a step of performing centrifugation.
<Storage step 13>
The storage step 13 is a step of storing the collected body fluid at a predetermined storage temperature. In this embodiment, the body fluid (e.g., serum) collected in the collection step 11 is stored at a predetermined storage temperature (e.g., -80°C) to suppress deterioration of the body fluid before the measurement step 12 is performed. The storage temperature is not limited to -80°C and can be set as appropriate, for example, to a temperature lower than 4°C. In the storage step 13, the body fluid collected from the subject may also be transported, for example, to the measurement location where the measurement step 12 is performed, while being kept at the predetermined storage temperature.
If the storage temperature is high and the storage time is long, the body fluid (e.g., serum) may deteriorate and the measurement results in the measurement step 12 may change. Specifically, when the storage temperature is high and the storage time is long, small RNAs such as microRNAs contained in the body fluid are easily degraded, and the expression levels of small RNAs measured in the measurement step 12 may decrease. In addition, when the small RNAs are degraded, a small RNA is no longer detected once the length of its measured base sequence falls below a threshold. In this way, degradation of small RNAs may change the amount of small RNAs measured in the measurement step 12. These phenomena become pronounced when the storage temperature and storage time exceed predetermined values, and they affect the small RNA measurement results obtained in the measurement step 12. For this reason, in the storage step 13, the body fluid is required to be stored with the storage temperature and storage time within the predetermined values.
Storage temperatures can be divided into a first temperature range (e.g., below 4°C) in which small RNAs change very little regardless of storage time, a second temperature range (e.g., from 4°C to 25°C inclusive) in which small RNAs change depending on the storage time, and a third temperature range (e.g., above 25°C) in which small RNAs change regardless of storage time.
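The three temperature ranges above can be expressed as a small classification routine. The following is only a sketch: the function names are hypothetical, the boundaries of 4°C and 25°C are the example values from the text, and the permissible storage time in the second range is not specified in the text, so it is passed in as a caller-supplied assumption.

```python
def storage_temperature_range(temp_c: float) -> str:
    """Classify a storage temperature into the first, second, or third range."""
    if temp_c < 4.0:
        return "first"   # little change regardless of storage time
    if temp_c <= 25.0:
        return "second"  # change depends on storage time
    return "third"       # change regardless of storage time


def storage_is_suitable(temp_c: float, hours: float,
                        max_hours_second_range: float) -> bool:
    """Judge storage suitability.

    max_hours_second_range is a hypothetical limit chosen by the caller;
    the text does not give a value for it.
    """
    temp_range = storage_temperature_range(temp_c)
    if temp_range == "first":
        return True
    if temp_range == "second":
        return hours <= max_hours_second_range
    return False
```

With a hypothetical limit of 12 hours for the second range, for instance, serum kept at 4°C for 24 hours would be judged unsuitable, while serum kept at -80°C would be judged suitable regardless of storage time.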
<Measurement step 12>
The measurement step 12 is a step of measuring the expression levels of multiple small RNAs in the subject's body fluid. Specifically, in the measurement step 12, a next-generation sequencer (NGS) is used as the measurement device to measure the multiple small RNAs contained in the subject's body fluid and identify the base sequence of each small RNA. Next, the identified small RNAs are counted for each base sequence to obtain the number of reads of each small RNA in NGS. This number of reads corresponds to the expression level (specifically, the absolute expression level) of the small RNA. In other words, the number of reads constitutes small RNA data showing the results of measuring the expression levels of multiple small RNAs in the subject's body fluid. However, although the expression levels of multiple small RNAs (e.g., microRNAs) in a body fluid (e.g., blood) are absolute values originating in the living body, they can only be quantified after passing through a measurement device and reagent treatment, so it is difficult to quantify the expression levels of small RNAs in a body fluid as absolute values. Therefore, the expression levels of small RNAs may be expressed as relative values by processing the NGS measurement results, and such relative values also constitute small RNA data.
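The read counting described above, in which identified small RNAs are tallied per base sequence, can be sketched as follows. This is a minimal illustration under the assumption that the sequencer output has already been reduced to a list of read sequences; the function name and the example sequences are hypothetical placeholders, not real small RNA sequences.

```python
from collections import Counter


def count_reads(read_sequences):
    """Count how many reads were observed for each distinct base sequence."""
    return Counter(read_sequences)


# Hypothetical reads from one specimen (placeholder sequences only).
reads = ["UGAGGUAG", "UAGCUUAU", "UGAGGUAG", "UGAGGUAG"]
read_counts = count_reads(reads)
```

Here `read_counts` maps each base sequence to its number of reads, which plays the role of the small RNA data for that specimen.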
Furthermore, when the absolute expression level is measured, the small RNA read counts output by the NGS may be normalized and used as relative expression levels. Usable normalization methods include, for example, RPM (reads per million) normalization and normalization against an internal-standard small RNA. The small RNA data may thus indicate normalized relative expression levels. The small RNA data may also be absolute quantitative values, that is, values quantified as absolute values.
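As an illustration of the RPM normalization mentioned above, the following is a minimal sketch rather than the method of the embodiment. It assumes that read counts are held in a dictionary keyed by small RNA name; the function name `rpm_normalize` and the sample names and counts are hypothetical.

```python
def rpm_normalize(read_counts):
    """Scale raw read counts so that they sum to one million (RPM)."""
    total_reads = sum(read_counts.values())
    if total_reads == 0:
        raise ValueError("sample contains no reads")
    # Each count becomes its share of the total, expressed per million reads.
    return {name: count * 1_000_000 / total_reads
            for name, count in read_counts.items()}


# Hypothetical read counts for one sample (names and values are illustrative).
sample_counts = {"small_rna_A": 600, "small_rna_B": 300, "small_rna_C": 100}
sample_rpm = rpm_normalize(sample_counts)
```

Because every sample is rescaled to the same total, RPM values can be compared across samples sequenced to different depths, which is the sense in which they are relative expression levels.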
In addition, NGS can measure the expression levels of multiple small RNAs in multiple body fluids (e.g., blood collected from multiple subjects) in a single batch. That is, in measurement step 12, the expression levels of multiple small RNAs are measured for multiple body fluids in the same process.
Measurement devices other than a next-generation sequencer, such as DNA chips, quantitative PCR, and flow cytometers, can also be used, provided that the expression levels of multiple small RNAs can be measured. As described above, as long as the expression levels of multiple small RNAs can be measured and small RNA data showing the measurement results can be obtained, various methods, including publicly known methods, can be adopted for measuring the expression levels of multiple small RNAs.
Measurement step 12 may include pretreatment performed before the expression levels of the multiple small RNAs in the subject's body fluid are measured. Examples of the pretreatment include various pretreatments performed on the sample (e.g., centrifugation, storage, and library preparation). These pretreatments can be performed using various reagents.
<Selection step 14>
Selection step 14 is a step of selecting, as a standard sample, a sample for which the conditions from collection of the sample from the subject to measurement of the expression levels of multiple small RNAs in the sample (hereinafter referred to as selection conditions) satisfy predetermined conditions.
<First selection condition>
The selection condition is the processing time from collection of a body fluid as a sample from the subject to completion of the centrifugation operation (hereinafter sometimes referred to as the collection processing time). That is, in selection step 14, from among the multiple samples whose expression levels of multiple small RNAs were measured in measurement step 12, samples with an appropriate collection processing time are selected as standard samples.
Small RNAs are present in samples only in trace amounts and are very unstable. Consequently, if the collection processing time is inappropriate, the measured small RNA expression levels may not accurately reflect the expression levels of the small RNAs contained in the sample.
Specifically, as described above, when the collection processing time exceeds a predetermined time (e.g., 2 hours), changes in the measured small RNA expression levels due to sample degradation and the like become pronounced. Therefore, in selection step 14, as one example, samples whose collection processing time is within the predetermined time (e.g., 2 hours) are selected as standard samples.
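The time-based selection described above can be sketched as a simple filter. This is an illustrative sketch only: the `CollectedSample` record, the function `select_standard_samples`, and the sample values are hypothetical, and the 2-hour limit is the example value given in the text.

```python
from dataclasses import dataclass


@dataclass
class CollectedSample:
    sample_id: str
    collection_hours: float  # time from collection to end of centrifugation


def select_standard_samples(samples, limit_hours=2.0):
    """Keep samples whose collection processing time is within the limit."""
    return [s for s in samples if s.collection_hours <= limit_hours]


# Hypothetical batch: S3 exceeds the 2-hour example limit and is excluded.
batch = [
    CollectedSample("S1", 1.5),
    CollectedSample("S2", 2.0),
    CollectedSample("S3", 3.25),
]
standard = select_standard_samples(batch)
```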
Standard samples are selected, for example, by having the creator of the disease-trained model (i.e., the person who executes production method 10) manage the collection processing time and select samples whose collection processing time is within the predetermined time (e.g., 2 hours). Selection can thus be performed by a creator-dependent method in which the creator manages the collection processing time, but it is not limited to this method. For example, standard samples may be selected using a collection processing time trained model.
Specifically, small RNA data is input into a collection processing time trained model obtained by machine learning of the correlation between measurement data, which shows the results of measuring the expression levels of multiple small RNAs in body fluids collected from multiple subjects including subjects suffering from a disease, and collection processing time data, which shows the collection processing time of those body fluids. The model is used to check whether the collection processing time is appropriate, and body fluids with an appropriate collection processing time are selected as standard samples. The small RNA data is the data showing the small RNA expression levels measured in measurement step 12.
Specifically, the measurement data is the result of measuring multiple small RNAs in body fluids collected from multiple subjects, including subjects suffering from a disease and subjects not suffering from a disease (i.e., healthy subjects). The measurement data can be obtained by the same measurement method as in measurement step 12 described above. Specifically, the collection processing time data is data indicating whether the collection processing time, from collection of the body fluid underlying the measurement data to completion of the centrifugation operation, is appropriate.
Specifically, the collection processing time trained model is generated using, as the measurement data for machine learning (hereinafter sometimes referred to as training data), for example: first measurement data obtained when collection processing of a body fluid collected from a healthy subject was completed within the predetermined time (e.g., 2 hours); second measurement data obtained when collection processing of a body fluid collected from a healthy subject took longer than the predetermined time (e.g., 2 hours); and third measurement data obtained when collection processing of a body fluid collected from a subject suffering from a disease was completed within the predetermined time (e.g., 2 hours). The model may also be generated from four sets of measurement data, namely the first, second, and third measurement data plus measurement data obtained when collection processing of a body fluid collected from a subject suffering from a disease took longer than the predetermined time (e.g., 2 hours). In this way, in this embodiment, measurement data from the body fluids of subjects suffering from a disease, and not only from healthy subjects, are used as training data.
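The composition of the training data described above can be sketched as follows. This is a hedged illustration, not the embodiment's implementation: the `Measurement` record and both function names are hypothetical, and the choice of learning algorithm is deliberately left out because the text does not specify one. The sketch only shows how appropriateness labels could be attached and how the presence of the three required groups could be checked.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    expression: dict         # small RNA name -> expression level
    diseased: bool           # True if the subject suffers from the disease
    collection_hours: float  # collection processing time


def label_collection_time(measurements, limit_hours=2.0):
    """Pair each expression profile with 1 (within limit) or 0 (over limit)."""
    return [(m.expression, 1 if m.collection_hours <= limit_hours else 0)
            for m in measurements]


def has_required_groups(measurements, limit_hours=2.0):
    """Check that the first, second, and third measurement data are all present."""
    present = {(m.diseased, m.collection_hours <= limit_hours)
               for m in measurements}
    required = {
        (False, True),   # healthy, within the limit (first measurement data)
        (False, False),  # healthy, over the limit (second measurement data)
        (True, True),    # diseased, within the limit (third measurement data)
    }
    return required <= present
```

The label depends only on the collection processing time, while disease status varies within the training set; including diseased subjects is what lets a model separate time-related changes from disease-related ones.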
<Second selection condition>
A further selection condition is the storage temperature of the body fluid collected from the subject as a sample and the storage time at that storage temperature. That is, in selection step 14, from among the multiple samples whose expression levels of multiple small RNAs were measured in measurement step 12, samples with an appropriate storage temperature and storage time are selected as standard samples. In this manner, multiple conditions may be set as the selection conditions.
As with the collection processing time described above, if the storage temperature of the sample collected from the subject and the storage time at that storage temperature are inappropriate, the measured small RNA expression levels may not accurately reflect the expression levels of the small RNAs contained in the sample.
Specifically, as described above, when the storage temperature and storage time exceed predetermined values, changes in the measured small RNA expression levels due to sample degradation and the like become pronounced. Therefore, in selection step 14, samples whose storage temperature and storage time are at or below the predetermined values are selected as standard samples.
Possible storage temperatures fall into three ranges: a first temperature range (e.g., below 4°C) in which small RNAs undergo almost no change regardless of storage time; a second temperature range (e.g., from 4°C to 25°C inclusive) in which small RNAs change depending on storage time; and a third temperature range (e.g., above 25°C) in which small RNAs change regardless of storage time.
Specifically, in selection step 14, for example, samples stored at a temperature in the first temperature range (e.g., -80°C) are selected as standard samples. In this case, the storage time does not matter.
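The three temperature ranges and the resulting selection rule can be sketched as below, using the example boundaries of 4°C and 25°C from the text. Two points are assumptions made for illustration and are not stated in the source: the parameter `max_hours_second_range` (the text says only that changes in the second range depend on storage time, without giving a limit) and the decision to reject every sample stored in the third range.

```python
def temperature_range(temp_c):
    """Classify a storage temperature using the example boundaries in the text."""
    if temp_c < 4.0:
        return "first"   # little change regardless of storage time
    if temp_c <= 25.0:
        return "second"  # change depends on storage time
    return "third"       # change occurs regardless of storage time


def storage_is_acceptable(temp_c, storage_hours, max_hours_second_range):
    """Decide whether a sample's storage conditions qualify it as a standard sample.

    max_hours_second_range is a hypothetical limit; the source gives no value.
    """
    rng = temperature_range(temp_c)
    if rng == "first":
        return True  # storage time does not matter
    if rng == "second":
        return storage_hours <= max_hours_second_range
    return False     # assumption: third-range storage is always rejected
```

For example, `storage_is_acceptable(-80.0, 10000.0, 6.0)` accepts a sample kept at -80°C for any length of time, matching the statement that storage time does not matter in the first range.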
Standard samples are selected, for example, by having the creator of the disease-trained model manage the storage temperature and storage time and select samples whose storage temperature and storage time are at or below the predetermined values. Selection can thus be performed by a creator-dependent method in which the creator manages the storage temperature and storage time, but it is not limited to this method. For example, standard samples may be selected using a storage condition trained model.
Specifically, small RNA data is input into a storage condition trained model obtained by machine learning of the correlation between measurement data, which shows the results of measuring the expression levels of multiple small RNAs in body fluids collected from multiple subjects including subjects suffering from a disease, and storage condition data, which shows the storage temperature of those body fluids and the storage time at that storage temperature. The model is used to check whether the storage temperature and the storage time at that temperature are appropriate, and body fluids with an appropriate storage temperature and storage time are selected as standard samples. The small RNA data is the data showing the small RNA expression levels measured in measurement step 12.
Specifically, the measurement data is the result of measuring multiple small RNAs in samples collected from multiple subjects, including subjects suffering from a disease and subjects not suffering from a disease (i.e., healthy subjects). The measurement data can be obtained by the same measurement method as in measurement step 12 described above. Specifically, the storage condition data is data indicating whether the storage conditions are appropriate, taking into account both the storage temperature of the sample underlying the measurement data and the storage time at that storage temperature.
Specifically, the storage condition trained model is generated using, as the measurement data for machine learning (hereinafter sometimes referred to as training data): first measurement data obtained from a body fluid collected from a healthy subject and stored under appropriate storage conditions; second measurement data obtained from a body fluid collected from a healthy subject and stored under inappropriate storage conditions; and third measurement data obtained from a body fluid collected from a subject suffering from a disease and stored under appropriate storage conditions. In this way, in this embodiment, measurement data from samples of subjects suffering from a disease, and not only from samples of healthy subjects, are used as training data. The model may also be generated from four sets of measurement data, namely the first, second, and third measurement data plus measurement data obtained from a body fluid collected from a subject suffering from a disease and stored under inappropriate storage conditions.
The appropriate storage conditions mentioned above are, as one example, storage at a temperature in the first temperature range (e.g., -80°C), and the inappropriate storage conditions are, as one example, storage at a temperature higher than those in the first temperature range.
<Third selection condition>
Furthermore, when blood is collected from the subject as the sample, a selection condition is the degree of hemolysis of the blood collected from the subject. That is, in selection step 14, from among the multiple samples whose expression levels of multiple small RNAs were measured in measurement step 12, samples with an appropriate degree of hemolysis are selected as standard samples.
As with the collection processing time, storage temperature, and storage time described above, if the degree of hemolysis of the blood collected from the subject as a sample is inappropriate, the measured small RNA expression levels may not accurately reflect the expression levels of the small RNAs contained in the sample.
Specifically, when the degree of hemolysis exceeds a predetermined value, changes in the measured small RNA expression levels become pronounced. Therefore, in this production method, samples whose degree of hemolysis is at or below the predetermined value are selected as standard samples.
Standard samples are selected, for example, by having the creator of the disease-trained model manage the degree of hemolysis and select samples whose degree of hemolysis is at or below the predetermined value. Selection can thus be performed by a creator-dependent method in which the creator manages the degree of hemolysis, but it is not limited to this method. For example, standard samples may be selected using a hemolysis degree trained model.
Specifically, small RNA data is input into a hemolysis degree trained model obtained by machine learning of the correlation between measurement data, which shows the results of measuring the expression levels of multiple small RNAs in blood collected from multiple subjects, and hemolysis degree data, which shows the degree of hemolysis of that blood. The model is used to check the degree of hemolysis, and samples with an appropriate degree of hemolysis are selected as standard samples. The small RNA data is the data showing the small RNA expression levels measured in measurement step 12.
Specifically, the measurement data is the result of measuring multiple small RNAs in blood collected from multiple subjects. The multiple subjects may include both subjects suffering from a disease and subjects not suffering from a disease (i.e., healthy subjects), or may consist of only one of these two groups. The measurement data can be obtained by the same measurement method as in measurement step 12 described above.
Specifically, the hemolysis degree data is data indicating whether the degree of hemolysis of the blood underlying the measurement data is appropriate. The degree of hemolysis in the hemolysis degree data is based on whether it is appropriate for determination by the disease-trained model used in disease determination method 100. In other words, the hemolysis degree data treats a degree of hemolysis at which disease determination method 100 can make a correct determination as appropriate (good), and a degree of hemolysis at which it cannot make a correct determination as inappropriate (bad).
Specifically, the hemolysis degree trained model is generated using, as the measurement data for machine learning (hereinafter sometimes referred to as training data), measurement data obtained from blood collected from multiple subjects with an appropriate degree of hemolysis and measurement data obtained from blood collected from multiple subjects with an inappropriate degree of hemolysis.
<Other selection conditions>
Furthermore, the following may also be set as selection conditions: conditions from blood collection to storage, such as the type and lot of the blood collection tube and the centrifugation speed; storage conditions; measurement pretreatment conditions, such as the type and lot of the small RNA measurement reagent and the reaction temperature; and measurement conditions, such as the type of measurement device and the sample concentration at the time of measurement.
As described above, the selection conditions are the conditions from collection of a sample from a subject to measurement of the expression levels of multiple small RNAs in the sample. Here, the measurement includes acquisition of the measurement data that shows the measured expression levels and is used to generate the disease-trained model. Therefore, if the measurement data is processed, for example, before the disease-trained model is generated, that processing can also be set as a selection condition.
In addition, in selection step 14, as described above, samples whose selection conditions satisfy the predetermined conditions are selected as standard samples. The predetermined conditions may be that the conditions from collection of a sample from a subject to measurement of the expression levels of multiple small RNAs in the sample are constant across multiple samples. In other words, in selection step 14, multiple samples whose selection conditions are constant across those samples may be selected as standard samples. Specifically, for example, multiple samples whose expression levels of multiple small RNAs were measured with the same measurement device under the same measurement conditions can be selected as standard samples.
The selection conditions are set so that the measurement data used to generate the disease-trained model reflects the expression levels of the small RNAs contained in the sample at the time of collection, and so that the values representing those expression levels are expressed consistently across multiple samples. In other words, the selection conditions are the conditions that make it possible to acquire such data as the measurement data.
<Model generation step 17>
Model generation step 17 is a step of generating a disease-trained model using the expression levels of multiple small RNAs in the standard samples selected in selection step 14. In model generation step 17, as one example, the disease-trained model is generated by machine learning of the correlation between measurement data, which shows the results of measuring the expression levels of multiple small RNAs in the standard samples, and morbidity data, which shows whether the subject from whom each standard sample was collected has the disease.
Specifically, the disease-trained model is generated using, as the measurement data for machine learning (hereinafter sometimes referred to as training data), measurement data obtained from the standard samples of subjects suffering from a disease and measurement data obtained from the standard samples of subjects not suffering from a disease (i.e., healthy subjects).
In addition, in model generation step 17, the training data is, for example, measurement data for the 100 small RNAs with the highest expression levels in the standard samples of multiple subjects.
In this way, this embodiment uses measurement data from which small RNAs with relatively low expression levels have been excluded. Small RNAs with relatively low expression levels can readily function as explanatory variables, because even a small change in expression level produces a large fold change. However, they are also easily affected by variation due to factors other than the element of interest (specifically, the collection processing time) and by variation due to measurement error, so they have the drawback of poor robustness. Using only small RNAs with relatively high expression levels as training data therefore improves robustness.
In addition, in model generation step 17, the training data is, for example, measurement data from which small RNAs whose expression level was zero in the standard sample of any subject have been removed. A change from zero gives an infinite fold change and therefore has a large influence, so any small RNA whose expression level is zero in any of the samples used as training data is excluded from the training data.
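The two feature-selection rules above (dropping small RNAs with a zero expression level in any sample, and keeping the 100 most highly expressed small RNAs) can be sketched as follows. This is a minimal sketch under stated assumptions: the source does not say how expression is ranked across samples, so ranking by mean expression is an assumption, and the function name `select_features` is hypothetical.

```python
def select_features(samples, top_n=100):
    """Pick small RNAs to use as training features.

    samples: list of dicts mapping small RNA name -> expression level,
             one dict per standard sample.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    candidates = set().union(*(s.keys() for s in samples))
    # Exclude any small RNA that is zero (or missing) in at least one sample.
    nonzero = [name for name in candidates
               if all(s.get(name, 0) > 0 for s in samples)]
    # Assumption: rank by mean expression across samples, highest first;
    # ties are broken by name so the result is deterministic.
    nonzero.sort(key=lambda name: (-sum(s[name] for s in samples) / len(samples),
                                   name))
    return nonzero[:top_n]
```

For example, given two samples `{"a": 10, "b": 0, "c": 5}` and `{"a": 20, "b": 3, "c": 1}`, small RNA `b` is dropped because it is zero in the first sample, and the remaining features are ranked `a`, then `c`.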
As described above, in this embodiment, the disease-trained model is generated by having a learning model learn the presence or absence of the disease qualitatively. The disease-trained model determines whether the disease is present and outputs that result, and the presence or absence of the disease is determined accordingly. Therefore, disease determination method 100, described below, yields either a result indicating that the disease is present or a result indicating that it is absent.
In this embodiment, the disease-trained model is generated by having a learning model learn the presence or absence of the disease qualitatively, but the embodiment is not limited to this. For example, the disease-trained model may be generated by having the learning model learn the determination target quantitatively. In this case, the disease-trained model determines, for example, the probability that the disease is present and outputs the result as a determination value. A determination unit then compares the output determination value with a threshold, and the presence or absence of the disease is determined on the basis of the determination unit's result.
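The quantitative variant above, in which a determination unit compares the model's output with a threshold, can be sketched as follows. The default threshold of 0.5 and the use of "greater than or equal to" are illustrative assumptions; the source specifies neither, and the function name `judge_disease` is hypothetical.

```python
def judge_disease(determination_value, threshold=0.5):
    """Return True (disease present) if the model's output reaches the threshold.

    determination_value: probability of disease output by the trained model.
    threshold: hypothetical cut-off; the source does not give a value.
    """
    if not 0.0 <= determination_value <= 1.0:
        raise ValueError("determination value must be a probability in [0, 1]")
    return determination_value >= threshold
```

Keeping the threshold separate from the model means the cut-off can be adjusted without retraining the model.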
<Disease determination method 100>
Disease determination method 100 according to this embodiment will now be described. Fig. 2 is a schematic diagram showing the steps of disease determination method 100 according to this embodiment. Disease determination method 100 is an example of a property determination method.
Disease determination method 100 is a method of determining the presence or absence of a disease by inputting small RNA data, which shows the results of measuring the expression levels of multiple small RNAs in a subject's sample, into the disease-trained model.
Specifically, as shown in Fig. 2, disease determination method 100 has a collection step 111, a storage step 113, a measurement step 112, a selection step 114, and a disease determination step 116. In this embodiment, collection step 111, storage step 113, measurement step 112, selection step 114, and disease determination step 116 are performed in this order, as one example. In this embodiment, disease determination method 100 is an example of a property determination method, but disease determination step 116, which is part of disease determination method 100, may also be regarded as an example of a property determination method.
Disease determination method 100 determines the presence or absence of a disease as one example of a property of the subject, but the property to be determined is not limited to disease. The property may be any property that can be determined from the expression levels of multiple small RNAs. As described above, examples of property determination include confirming the efficacy of a drug for the subject, determining the likelihood of disease recurrence, confirming or determining lifestyle habits such as the presence or absence of a smoking or drinking history, and predicting body age.
<Collection step 111, storage step 113, measurement step 112, selection step 114>
As an example, the collection step 111, the storage step 113, the measurement step 112, and the selection step 114 are performed in the same manner as the collection step 11, the storage step 13, the measurement step 12, and the selection step 14 in the generation method 10.
That is, in the disease determination method 100, each step from collecting a sample from the subject to measuring the expression levels of the plurality of small RNAs in the sample (specifically, the collection step 111, the storage step 113, and the measurement step 112) is performed under the same conditions as the corresponding steps of the generation method 10 (specifically, the collection step 11, the storage step 13, and the measurement step 12), in other words, under the predetermined conditions of the selection step 14.
Then, in the selection step 114, from among the plurality of samples whose small RNA expression levels were measured in the measurement step 112, each sample for which the conditions from collection of the sample from the subject to measurement of the expression levels of the plurality of small RNAs satisfy the predetermined conditions (that is, the same conditions as the predetermined conditions in the selection step 14 of the generation method 10) is selected as a sample to be determined in the disease determination step 116.
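The selection in step 114 can be pictured as a filter over per-sample pre-analytical records. The following is a minimal sketch, not the disclosed implementation: the `Sample` fields, the helper names, and the thresholds (a collection processing time of 2 hours or less, storage at -80°C, and no hemolysis, mirroring the conditions reported in Example 1) are assumptions made for illustration. It also checks recorded metadata directly, whereas the embodiment describes checking these conditions with trained models.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical pre-analytical record for one sample."""
    sample_id: str
    processing_hours: float  # collection to end of centrifugation
    storage_temp_c: float    # storage temperature in degrees Celsius
    hemolyzed: bool


# Assumed thresholds, mirroring the conditions reported in Example 1.
MAX_PROCESSING_HOURS = 2.0
MAX_STORAGE_TEMP_C = -80.0


def meets_conditions(sample: Sample) -> bool:
    """Return True when every predetermined condition is satisfied."""
    return (
        sample.processing_hours <= MAX_PROCESSING_HOURS
        and sample.storage_temp_c <= MAX_STORAGE_TEMP_C
        and not sample.hemolyzed
    )


def select_samples(samples):
    """Split samples into (selected, rejected), preserving input order."""
    selected = [s for s in samples if meets_conditions(s)]
    rejected = [s for s in samples if not meets_conditions(s)]
    return selected, rejected
```

Returning the rejected samples alongside the selected ones keeps enough information to tell the user which samples failed the conditions.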
<Disease determination step 116>
In the disease determination step 116, the presence or absence of a disease is determined by inputting the small RNA data on the plurality of small RNAs in the sample selected in the selection step 114 into the disease-trained model.
Therefore, the small RNA data is obtained from a sample for which each step from collection of the sample from the subject to measurement of the expression levels of the plurality of small RNAs in the sample satisfies the predetermined conditions (that is, the same conditions as the predetermined conditions in the selection step 14 of the generation method 10).
The disease-trained model is the disease-trained model generated by the generation method 10 described above. That is, it is a disease-trained model that is generated based on the expression levels of a plurality of small RNAs, with the samples from which the small RNAs are derived being selected at the time of generation. Samples for which the conditions from collection from the subject to measurement of the expression levels of the plurality of small RNAs satisfy predetermined conditions are selected as standard samples, and the model is generated using the expression levels of the plurality of small RNAs in those standard samples. Note that the disease-trained model is an example of a property determination model that determines a property of a subject.
In the disease determination step 116, the presence or absence of a disease is determined for the samples selected in the selection step 114. In other words, the disease determination step 116 is not performed for samples that were not selected in the selection step 114, that is, samples that did not satisfy the selection conditions. In this case, for example, the user (that is, the person performing the disease determination method 100) may be notified that the selection conditions were not satisfied.
It is desirable that the disease determined by the disease-trained model be the same as the disease used for training the processing time trained model, the storage condition trained model, and the hemolysis degree trained model. For example, if the disease used for training those models is cancer, the disease determined by the disease-trained model is also cancer. Specifically, it is desirable that the measurement data used for the disease-trained model be the same as the measurement data used for the processing time trained model, the storage condition trained model, the hemolysis degree trained model, and the preservative addition time trained model.
As described above, the disease determination step 116 is not performed for samples that were not selected in the selection step 114, but the method is not limited to this. For example, the disease determination step 116 may also be performed for samples that were not selected in the selection step 114 when a determination result is wanted as reference data. In this case, for example, the user may be notified that the determination was made even though the selection conditions were not satisfied. Also, in this embodiment, the disease determination step 116 is performed after the selection step 114, but it may be performed before the selection step 114. In that case, if the selection conditions are not satisfied in the selection step 114, the determination result of the disease determination step 116 is treated as, for example, reference data.
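The gating between the selection step 114 and the disease determination step 116, including the reference-data variation, can be sketched as follows. This is an illustrative sketch under assumptions: the function name, the `allow_reference` flag, the status strings, and the representation of the disease-trained model as a plain callable are all hypothetical and do not come from the disclosure.

```python
from typing import Callable, Optional, Sequence


def determine(
    expression: Sequence[float],
    selected: bool,
    model: Callable[[Sequence[float]], bool],
    allow_reference: bool = False,
) -> dict:
    """Run the determination only when permitted by the selection result.

    `model` stands in for the disease-trained model and returns True when
    the disease is judged present (an assumed interface).
    """
    result: Optional[bool]
    if selected:
        # Normal case: the sample satisfied the selection conditions.
        result, status = model(expression), "determined"
    elif allow_reference:
        # Variation: still determine, but flag the result as reference data.
        result, status = model(expression), "reference_only"
    else:
        # Default: skip the determination and report why.
        result, status = None, "conditions_not_met"
    return {"result": result, "status": status}
```

Carrying a status next to the result lets a display layer tell the user whether a determination was skipped or was made despite unmet conditions.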
<Property determination system 20>
Next, a property determination system 20 will be described as a system that executes the disease determination method 100 described above, which is an example of a property determination method. As shown in FIG. 3, the property determination system 20 includes a measurement device 21 and a disease determination device 30.
<Measurement device 21>
The measurement device 21 is a device that executes the measurement step 112 described above. That is, the measurement device 21 measures the expression levels of a plurality of small RNAs contained in each of a plurality of samples. As the measurement device 21, for example, a next-generation sequencer (NGS) is used.
<Disease determination device 30>
The disease determination device 30 executes the selection step 114 described above. That is, from among a plurality of samples whose small RNA expression levels have been measured by the measurement device 21, the disease determination device 30 selects, as samples to be determined, those for which the conditions from collection of the sample from the subject to measurement of the expression levels of the plurality of small RNAs in the sample satisfy predetermined conditions. In this embodiment, the disease determination device 30 is an example of a property determination device, but the property determination system 20 including the measurement device 21 may also be regarded as an example of a property determination device.
The disease determination device 30 further executes the disease determination step 116 described above. That is, it determines the presence or absence of a disease by inputting small RNA data, which indicates the expression levels of the plurality of small RNAs in the selected sample, into the disease-trained model generated by the generation method 10 described above.
The disease determination device 30 functions as a computer and, as shown in FIG. 3, includes a CPU (Central Processing Unit) 31, a ROM (Read Only Memory) 32, a RAM (Random Access Memory) 33, a storage 34, an input unit 35, a display unit 36, and a communication interface (I/F) 37. These components are communicably connected to one another via a bus 39.
The CPU 31 (an example of a processor) is a central processing unit that executes various programs and controls each part. That is, the CPU 31 reads a program from the ROM 32 or the storage 34 and executes the program using the RAM 33 as a work area. The CPU 31 controls the above components and performs various kinds of arithmetic processing in accordance with the program stored in the ROM 32 or the storage 34.
The ROM 32 stores various programs and various data. The RAM 33 temporarily stores programs or data as a work area. The storage 34 is composed of an HDD (Hard Disk Drive) or an SSD (Solid State Drive) and stores various programs, including an operating system, and various data.
In this embodiment, for example, a disease determination program for executing disease determination processing that carries out the disease determination method described above is stored in the storage 34. The disease determination program may be a single program or a group of programs composed of a plurality of programs or modules. The disease determination program may instead be stored in the ROM 32. The ROM 32 and the storage 34 each function as an example of a non-transitory recording medium.
The processor is not limited to the CPU described above, which is a general-purpose processor, and may be, for example, a dedicated processor composed of circuitry designed specifically to execute particular processing. Further, the processor is not limited to a single processor and may be formed by a plurality of processors, provided at physically separate locations, working in cooperation.
The input unit 35 includes a keyboard and a pointing device such as a mouse, and is used to perform various inputs. The input unit 35 also accepts, as input, information on the expression levels of the plurality of small RNAs measured by the measurement device 21.
The display unit 36 is, for example, a liquid crystal display and displays various information. The disease determination device 30 can present, for example, the selection result and the result of determining the presence or absence of a disease to the user through the display unit 36. The display unit 36 may employ a touch panel and also function as the input unit 35.
The communication interface 37 is an interface for communicating with other devices and uses a standard such as Ethernet (registered trademark), FDDI (Fiber Distributed Data Interface), or Wi-Fi (registered trademark).
As shown in FIG. 4, in the disease determination device 30, the CPU 31 functions as a selection unit 160 and a disease determination unit 170 by executing the disease determination program.
The selection unit 160 executes the selection step 114 described above. That is, from among a plurality of samples whose small RNA expression levels have been measured by the measurement device 21, the selection unit 160 selects, as samples to be determined, those for which the conditions from collection of the sample from the subject to measurement of the expression levels of the plurality of small RNAs in the sample satisfy predetermined conditions.
The disease determination unit 170 executes the disease determination step 116 described above. That is, the disease determination unit 170 determines the presence or absence of a disease by inputting small RNA data, which indicates the expression levels of the plurality of small RNAs in the selected sample, into the disease-trained model generated by the generation method 10 described above.
In this embodiment, the property determination system 20 includes the measurement device 21 and the disease determination device 30, but the property determination system 20 may instead be configured as a single device.
The disease determination device 30 may also be composed of a plurality of devices. For example, the disease determination device 30 may be composed of a plurality of (for example, two) devices that divide the selection step 114 and the disease determination step 116 between them.
<Effects of this embodiment>
As described above, the method for generating a disease-trained model according to this embodiment is a generation method in which the samples from which a plurality of small RNAs are derived are selected when generating a disease-trained model that determines a disease of a subject based on the expression levels of the small RNAs. In this method, samples for which the conditions from collection of the sample from the subject to measurement of the expression levels of the plurality of small RNAs in the sample (hereinafter referred to as selection conditions) satisfy predetermined conditions are selected as standard samples, and the disease-trained model is generated using the expression levels of the plurality of small RNAs in the standard samples.
Therefore, a disease-trained model with higher determination accuracy can be generated than when the model is generated using the expression levels of a plurality of small RNAs in samples without selecting standard samples.
In addition, in this embodiment, the selection condition is the collection processing time from when a body fluid is collected from the subject as a sample to when centrifugation is completed. In this generation method, samples with an appropriate collection processing time are selected as standard samples, and the disease-trained model is generated using the expression levels of the plurality of small RNAs in the standard samples.
Therefore, a disease-trained model with higher determination accuracy can be generated than when samples are selected as standard samples regardless of whether the collection processing time is appropriate and the model is generated using the expression levels of the plurality of small RNAs in those samples.
Furthermore, in this embodiment, the appropriateness of the collection processing time is tested by inputting small RNA data into a collection processing time trained model, and samples with an appropriate collection processing time are selected as standard samples. The collection processing time trained model has learned, by machine learning, the correlation between measurement data indicating the results of measuring the expression levels of a plurality of small RNAs in body fluids collected from a plurality of subjects, including subjects suffering from a disease, and collection processing time data indicating the collection processing time of those body fluids. Because the collection processing time trained model uses as training data not only measurement data from samples of healthy subjects but also measurement data from samples of subjects suffering from a disease, the appropriateness of the collection processing time can be tested with high accuracy, and samples with an inappropriate collection processing time are less likely to be selected as standard samples.
In addition, in this embodiment, the selection conditions are the storage temperature of the body fluid collected from the subject as a sample and the storage time at that storage temperature. In this generation method, samples with an appropriate storage temperature and storage time are selected as standard samples, and the disease-trained model is generated using the expression levels of the plurality of small RNAs in the standard samples.
Therefore, a disease-trained model with higher determination accuracy can be generated than when samples are selected as standard samples regardless of whether the storage temperature and storage time are appropriate and the model is generated using the expression levels of the plurality of small RNAs in those samples.
Furthermore, in this embodiment, the appropriateness of the storage temperature and of the storage time at that temperature is tested by inputting small RNA data into a storage condition trained model, and samples with an appropriate storage temperature and storage time are selected as standard samples. The storage condition trained model has learned, by machine learning, the correlation between measurement data indicating the results of measuring the expression levels of a plurality of small RNAs in body fluids collected from a plurality of subjects, including subjects suffering from a disease, and storage condition data indicating the storage temperature of those body fluids and the storage time at that temperature.
Because the storage condition trained model uses as training data not only measurement data from samples of healthy subjects but also measurement data from samples of subjects suffering from a disease, the appropriateness of the storage temperature and storage time can be tested with high accuracy, and samples with an inappropriate storage temperature or storage time are less likely to be selected as standard samples.
In addition, in this embodiment, the selection condition is the degree of hemolysis of the blood collected from the subject as a sample. In this generation method, samples with an appropriate degree of hemolysis are selected as standard samples, and the disease-trained model is generated using the expression levels of the plurality of small RNAs in the standard samples. Therefore, a disease-trained model with higher determination accuracy can be generated than when samples are selected as standard samples regardless of whether the degree of hemolysis is appropriate and the model is generated using the expression levels of the plurality of small RNAs in those samples.
Furthermore, in this embodiment, the degree of hemolysis is tested by inputting small RNA data into a hemolysis degree trained model, and samples with an appropriate degree of hemolysis are selected as standard samples. The hemolysis degree trained model has learned, by machine learning, the correlation between measurement data indicating the results of measuring the expression levels of a plurality of small RNAs in blood collected from a plurality of subjects and hemolysis degree data indicating the degree of hemolysis of that blood. Because the degree of hemolysis is tested in this way, it can be tested with higher accuracy than, for example, by visual inspection or by measuring the Hb concentration with a measurement kit or the like, and samples with an inappropriate degree of hemolysis are less likely to be selected as standard samples.
<Examples>
Next, examples will be described. The examples merely illustrate the technology of the present disclosure, and the technology of the present disclosure is not limited to the contents of the examples.
[Example 1]
In Example 1, the collection step 11 and the storage step 13 were performed appropriately for 64 lung cancer patients to obtain serum free of hemolysis. That is, for the 64 lung cancer patients, hemolysis-free serum with a collection processing time of 2 hours or less, stored at -80°C, an appropriate storage condition (storage time and storage temperature), was obtained as standard samples. For this serum, the expression levels of a plurality of small RNAs were measured using NGS, and small RNA data indicating the measurement results were obtained.
In Example 1, the collection step 11 and the storage step 13 were also performed appropriately for 133 healthy subjects to obtain serum free of hemolysis. That is, for the 133 healthy subjects, hemolysis-free serum with a collection processing time of 2 hours or less, stored at -80°C, an appropriate storage condition (storage time and storage temperature), was obtained as standard samples. For this serum, the expression levels of a plurality of small RNAs were measured using NGS, and small RNA data indicating the measurement results were obtained.
These small RNA data were then used as training data to generate disease-trained model 1, which determines a disease of a subject.
[Comparative Example 1]
In Comparative Example 1, serum was obtained from 64 lung cancer patients in the same manner as in Example 1. That is, for the 64 lung cancer patients, hemolysis-free serum with a collection processing time of 2 hours or less, stored at -80°C, an appropriate storage condition (storage time and storage temperature), was obtained. For this serum, the expression levels of a plurality of small RNAs were measured using NGS, and small RNA data indicating the measurement results were obtained.
In Comparative Example 1, serum was also obtained from 109 healthy subjects in the same manner as in Example 1. That is, for the 109 healthy subjects, hemolysis-free serum with a collection processing time of 2 hours or less, stored at -80°C, an appropriate storage condition (storage time and storage temperature), was obtained.
Furthermore, in Comparative Example 1, hemolysis-free serum with a collection processing time of 24 hours (that is, exceeding 2 hours) was obtained from 24 healthy subjects. This serum was stored at -80°C, an appropriate storage condition (storage time and storage temperature). For this serum, the expression levels of a plurality of small RNAs were measured using NGS, and small RNA data indicating the measurement results were obtained.
These small RNA data were then used as training data to generate disease-trained model 2, which determines a disease of a subject.
[Comparative Example 2]
In Comparative Example 2, serum was obtained from 64 lung cancer patients in the same manner as in Example 1. That is, for the 64 lung cancer patients, hemolysis-free serum with a collection processing time of 2 hours or less, stored at -80°C, an appropriate storage condition (storage time and storage temperature), was obtained. For this serum, the expression levels of a plurality of small RNAs were measured using NGS, and small RNA data indicating the measurement results were obtained.
In Comparative Example 2, serum was also obtained from 109 healthy subjects in the same manner as in Example 1. That is, for the 109 healthy subjects, hemolysis-free serum with a collection processing time of 2 hours or less, stored at -80°C, an appropriate storage condition (storage time and storage temperature), was obtained.
Furthermore, in Comparative Example 2, hemolysis-free serum stored under an inappropriate storage condition (24 hours at a storage temperature of 4°C) was obtained from 24 healthy subjects. This serum had a collection processing time of 2 hours or less. For this serum, the expression levels of a plurality of small RNAs were measured using NGS, and small RNA data indicating the measurement results were obtained.
These small RNA data were then used as training data to generate disease-trained model 3, which determines a disease of a subject.
[Comparative Example 3]
In Comparative Example 3, serum was obtained from 64 lung cancer patients in the same manner as in Example 1. That is, for the 64 lung cancer patients, hemolysis-free serum with a collection processing time of 2 hours or less, stored at -80°C, an appropriate storage condition (storage time and storage temperature), was obtained. For this serum, the expression levels of a plurality of small RNAs were measured using NGS, and small RNA data indicating the measurement results were obtained.
In Comparative Example 3, serum was also obtained from 109 healthy subjects in the same manner as in Example 1. That is, for the 109 healthy subjects, hemolysis-free serum with a collection processing time of 2 hours or less, stored at -80°C, an appropriate storage condition (storage time and storage temperature), was obtained.
Furthermore, in Comparative Example 3, hemolyzed serum was obtained from 24 healthy subjects. This serum had a collection processing time of 2 hours or less and was stored at -80°C, an appropriate storage condition (storage time and storage temperature). For this serum, the expression levels of a plurality of small RNAs were measured using NGS, and small RNA data indicating the measurement results were obtained.
These small RNA data were then used as training data to generate disease-trained model 4, which determines a disease of a subject.
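The four training sets are matched in size and differ only in 24 healthy-subject samples, which the following short tabulation makes explicit. The counts are taken from Example 1 and Comparative Examples 1 to 3; the dictionary keys and the `summarize` helper are illustrative names introduced for this sketch.

```python
# Training-set composition per model, as stated in Example 1 and
# Comparative Examples 1 to 3. Key names are illustrative only.
COHORTS = {
    "model_1_example_1": {
        "lung_cancer": 64, "healthy_standard": 133, "healthy_nonstandard": 0},
    "model_2_processing_24h": {
        "lung_cancer": 64, "healthy_standard": 109, "healthy_nonstandard": 24},
    "model_3_storage_4c_24h": {
        "lung_cancer": 64, "healthy_standard": 109, "healthy_nonstandard": 24},
    "model_4_hemolyzed": {
        "lung_cancer": 64, "healthy_standard": 109, "healthy_nonstandard": 24},
}


def summarize(cohort: dict) -> tuple:
    """Return (total sample count, fraction of non-standard samples)."""
    total = sum(cohort.values())
    return total, cohort["healthy_nonstandard"] / total


for name, cohort in COHORTS.items():
    total, fraction = summarize(cohort)
    print(f"{name}: n={total}, non-standard fraction={fraction:.1%}")
```

Each training set contains 197 samples, and in each comparative example 24 of them (about 12%) fail exactly one selection condition, so differences between the models can be attributed to that one condition.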
<Examples of learning models>
In generating the disease-trained model, various linear and nonlinear algorithms known as machine learning algorithms can be used, either alone or in combination. For example, the following algorithms can be used.
Random forest
Dropout Additive Regression Trees
Gradient boosted trees
Extreme gradient boosted trees
Light-gradient boosted machine
Neural networks
Regularized regression
Elastic-net
K-Nearest neighbors
Support vector machine
Generalized additive model
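As a minimal sketch of one of the listed algorithms, K-nearest neighbors, the following pure-Python classifier labels a sample by majority vote among its nearest training samples in expression space. The expression values, labels, and the function name `knn_predict` are hypothetical illustrations and are not taken from this application.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    # Euclidean distance from the query to every training vector, nearest first.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical normalized expression levels of two small RNAs per sample.
train_x = [(0.9, 0.1), (0.8, 0.2), (0.85, 0.15),
           (0.2, 0.9), (0.1, 0.8), (0.15, 0.85)]
train_y = ["disease", "disease", "disease",
           "healthy", "healthy", "healthy"]

print(knn_predict(train_x, train_y, (0.7, 0.3)))
```

Any of the other listed algorithms could be substituted behind the same train-and-predict interface; K-nearest neighbors is used here only because it fits in a few lines without external libraries.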
<Determination results for disease trained models 1 to 4>
With disease trained model 1 of Example 1, when the learning model "Generalized additive model" was used, for example, the AUC was 0.957, allowing better determination than disease trained model 2 of Comparative Example 1 (0.819), disease trained model 3 of Comparative Example 2 (0.896), and disease trained model 4 of Comparative Example 3 (0.900) (see FIG. 5). As shown in FIG. 5, when other types of learning models were used, disease trained model 1 of Example 1 likewise allowed better determination than disease trained model 2 of Comparative Example 1, disease trained model 3 of Comparative Example 2, and disease trained model 4 of Comparative Example 3.
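The AUC values compared above are areas under the ROC curve. As a rough sketch of how such a value can be computed, the function below uses the equivalent pairwise definition: the fraction of (diseased, healthy) sample pairs in which the diseased sample receives the higher score. The labels and scores are hypothetical and are not the data behind FIG. 5.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive sample outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    # Count correctly ordered positive/negative pairs; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = diseased, 0 = healthy) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]

print(roc_auc(labels, scores))
```

In this toy input, 8 of the 9 diseased/healthy pairs are ordered correctly, so the function returns 8/9 (about 0.889). An AUC of 1.0 would mean every diseased sample scores above every healthy one, and 0.5 corresponds to chance-level ranking.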
The present invention is not limited to the above-described embodiment, and various modifications, changes, and improvements are possible without departing from the spirit of the invention. Two or more of the above-described modifications may be combined as appropriate.
<Additional Notes>
(Aspect 1)
A method for generating a property determination model, in which samples from which a plurality of small RNAs are derived are selected when generating a property determination model that determines a property of a subject based on expression levels of the plurality of small RNAs, the method comprising:
selecting, as a standard sample, a sample for which conditions from collection of the sample from a subject to measurement of the expression levels of the plurality of small RNAs in the sample satisfy predetermined conditions; and
generating the property determination model using the expression levels of the plurality of small RNAs in the standard sample.
(Aspect 2)
The method for generating a property determination model according to aspect 1, wherein
the standard sample is a sample for which at least one of a collection condition for collecting the sample from the subject, a storage condition for storing the collected sample, and a measurement condition for measuring the expression levels of the plurality of small RNAs of the subject satisfies the predetermined conditions.
(Aspect 3)
The method for generating a property determination model according to aspect 2, wherein
the condition is a processing time from collection of a body fluid as the sample from the subject to completion of a centrifugation operation,
a sample for which the processing time is appropriate is selected as the standard sample, and
the property determination model is generated using the expression levels of the plurality of small RNAs in the standard sample.
(Aspect 4)
The method for generating a property determination model according to aspect 3, wherein
the suitability of the processing time is examined by inputting small RNA data, which indicates results of measuring the expression levels of a plurality of small RNAs in a body fluid of a subject, into a processing time trained model that has machine-learned a correlation between measurement data indicating results of measuring the expression levels of a plurality of small RNAs in body fluids collected from a plurality of subjects, including subjects suffering from a disease, and processing time data indicating the processing times of the body fluids collected from the plurality of subjects,
a sample for which the processing time is appropriate is selected as the standard sample, and
the property determination model, which determines a disease as the property of the subject, is generated using the expression levels of the plurality of small RNAs in the standard sample.
(Aspect 5)
The method for generating a property determination model according to any one of aspects 2 to 4, wherein
the condition is a storage temperature of a body fluid as the sample collected from the subject and a storage time at the storage temperature,
a sample for which the storage temperature and the storage time are appropriate is selected as the standard sample, and
the property determination model is generated using the expression levels of the plurality of small RNAs in the standard sample.
(Aspect 6)
The method for generating a property determination model according to aspect 5, wherein
the suitability of the storage temperature and of the storage time at the storage temperature is examined by inputting small RNA data, which indicates results of measuring the expression levels of a plurality of small RNAs in a body fluid of a subject, into a storage condition trained model that has machine-learned a correlation between measurement data indicating results of measuring the expression levels of a plurality of small RNAs in body fluids collected from a plurality of subjects, including subjects suffering from a disease, and storage condition data indicating the storage temperatures of the body fluids collected from the plurality of subjects and the storage times at those storage temperatures,
a sample for which the storage temperature and the storage time are appropriate is selected as the standard sample, and
the property determination model, which determines a disease as the property of the subject, is generated using the expression levels of the plurality of small RNAs in the standard sample.
(Aspect 7)
The method for generating a property determination model according to any one of aspects 2 to 6, wherein
the condition is a degree of hemolysis of blood as the sample collected from the subject,
a sample for which the degree of hemolysis is appropriate is selected as the standard sample, and
the property determination model is generated using the expression levels of the plurality of small RNAs in the standard sample.
(Aspect 8)
The method for generating a property determination model according to aspect 7, wherein
the degree of hemolysis is examined by inputting small RNA data, which indicates results of measuring the expression levels of a plurality of small RNAs in a body fluid of a subject, into a hemolysis degree trained model that has machine-learned a correlation between measurement data indicating results of measuring the expression levels of a plurality of small RNAs in blood collected from a plurality of subjects and hemolysis degree data indicating the degrees of hemolysis of the blood collected from the plurality of subjects,
a sample for which the degree of hemolysis is appropriate is selected as the standard sample, and
the property determination model is generated using the expression levels of the plurality of small RNAs in the standard sample.
(Aspect 9)
The method for generating a property determination model according to any one of aspects 1 to 8, wherein
a property determination model that determines a disease as the property of the subject is generated.
(Aspect 10)
A property determination model that determines a property of a subject, the model being generated based on expression levels of a plurality of small RNAs, with samples from which the plurality of small RNAs are derived being selected at the time of generation, wherein
a sample for which conditions from collection of the sample from a subject to measurement of the expression levels of the plurality of small RNAs in the sample satisfy predetermined conditions is selected as a standard sample, and
the property determination model is generated using the expression levels of the plurality of small RNAs in the standard sample.
(Aspect 11)
A property determination method comprising determining the property by inputting small RNA data, which indicates results of measuring the expression levels of a plurality of small RNAs in a sample of a subject, into the property determination model according to aspect 10.
(Aspect 12)
The property determination method according to aspect 11, wherein
the small RNA data is data obtained from a sample for which the conditions from collection of the sample from the subject to measurement of the expression levels of the plurality of small RNAs in the sample satisfy the predetermined conditions.
(Aspect 13)
A property determination device comprising the property determination model according to aspect 10, wherein
the device determines the property by inputting small RNA data, which indicates results of measuring the expression levels of a plurality of small RNAs in a body fluid of a subject, into the property determination model.
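The selection step described in these aspects can be sketched as a simple rule-based filter, shown below under stated assumptions. The `Sample` fields, the function names, and the treatment of hemolysis as a yes/no flag are hypothetical. The 2-hour and -80°C thresholds follow the conditions described for the examples, and no storage-time limit is modeled because none is specified in this section. The aspects that examine the conditions with a trained model are not covered by this sketch.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    sample_id: str
    processing_hours: float  # collection to completion of centrifugation
    storage_temp_c: float    # storage temperature in degrees Celsius
    hemolyzed: bool          # simplified yes/no stand-in for degree of hemolysis
    expression: tuple        # small RNA expression levels (hypothetical values)

def is_standard(s, max_hours=2.0, max_temp_c=-80.0):
    """Return True if the sample meets every assumed predetermined condition."""
    return (
        s.processing_hours <= max_hours
        and s.storage_temp_c <= max_temp_c
        and not s.hemolyzed
    )

def select_standard_samples(samples):
    """Keep only the samples that qualify as standard samples."""
    return [s for s in samples if is_standard(s)]

samples = [
    Sample("A", 1.5, -80.0, False, (0.9, 0.1)),
    Sample("B", 6.0, -80.0, False, (0.7, 0.4)),  # processing took too long
    Sample("C", 1.0, 4.0, False, (0.6, 0.5)),    # stored too warm
    Sample("D", 1.0, -80.0, True, (0.2, 0.9)),   # hemolyzed
]

standard = select_standard_samples(samples)
# Only the standard samples contribute expression data for model generation.
training_matrix = [s.expression for s in standard]
print([s.sample_id for s in standard])
```

Only sample "A" passes all three checks here, so only its expression levels would reach the model-generation step. This sketch requires every condition to hold, whereas aspect 2 requires only that at least one of the collection, storage, and measurement conditions be satisfied.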
The disclosure of Japanese Patent Application No. 2023-182776, filed on October 24, 2023, is incorporated herein by reference in its entirety. All documents, patent applications, and technical standards described herein are incorporated herein by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually indicated to be incorporated by reference.
Claims (13)
1. A method for generating a property determination model, in which samples from which a plurality of small RNAs are derived are selected when generating a property determination model that determines a property of a subject based on expression levels of the plurality of small RNAs, the method comprising:
selecting, as a standard sample, a sample for which conditions from collection of the sample from a subject to measurement of the expression levels of the plurality of small RNAs in the sample satisfy predetermined conditions; and
generating the property determination model using the expression levels of the plurality of small RNAs in the standard sample.
2. The method for generating a property determination model according to claim 1, wherein
the standard sample is a sample for which at least one of a collection condition for collecting the sample from the subject, a storage condition for storing the collected sample, and a measurement condition for measuring the expression levels of the plurality of small RNAs of the subject satisfies the predetermined conditions.
3. The method for generating a property determination model according to claim 2, wherein
the condition is a processing time from collection of a body fluid as the sample from the subject to completion of a centrifugation operation,
a sample for which the processing time is appropriate is selected as the standard sample, and
the property determination model is generated using the expression levels of the plurality of small RNAs in the standard sample.
4. The method for generating a property determination model according to claim 3, wherein
the suitability of the processing time is examined by inputting small RNA data, which indicates results of measuring the expression levels of a plurality of small RNAs in a body fluid of a subject, into a processing time trained model that has machine-learned a correlation between measurement data indicating results of measuring the expression levels of a plurality of small RNAs in body fluids collected from a plurality of subjects, including subjects suffering from a disease, and processing time data indicating the processing times of the body fluids collected from the plurality of subjects,
a sample for which the processing time is appropriate is selected as the standard sample, and
the property determination model, which determines a disease as the property of the subject, is generated using the expression levels of the plurality of small RNAs in the standard sample.
5. The method for generating a property determination model according to claim 2, wherein
the condition is a storage temperature of a body fluid as the sample collected from the subject and a storage time at the storage temperature,
a sample for which the storage temperature and the storage time are appropriate is selected as the standard sample, and
the property determination model is generated using the expression levels of the plurality of small RNAs in the standard sample.
6. The method for generating a property determination model according to claim 5, wherein
the suitability of the storage temperature and of the storage time at the storage temperature is examined by inputting small RNA data, which indicates results of measuring the expression levels of a plurality of small RNAs in a body fluid of a subject, into a storage condition trained model that has machine-learned a correlation between measurement data indicating results of measuring the expression levels of a plurality of small RNAs in body fluids collected from a plurality of subjects, including subjects suffering from a disease, and storage condition data indicating the storage temperatures of the body fluids collected from the plurality of subjects and the storage times at those storage temperatures,
a sample for which the storage temperature and the storage time are appropriate is selected as the standard sample, and
the property determination model, which determines a disease as the property of the subject, is generated using the expression levels of the plurality of small RNAs in the standard sample.
7. The method for generating a property determination model according to claim 2, wherein
the condition is a degree of hemolysis of blood as the sample collected from the subject,
a sample for which the degree of hemolysis is appropriate is selected as the standard sample, and
the property determination model is generated using the expression levels of the plurality of small RNAs in the standard sample.
8. The method for generating a property determination model according to claim 7, wherein
the degree of hemolysis is examined by inputting small RNA data, which indicates results of measuring the expression levels of a plurality of small RNAs in a body fluid of a subject, into a hemolysis degree trained model that has machine-learned a correlation between measurement data indicating results of measuring the expression levels of a plurality of small RNAs in blood collected from a plurality of subjects and hemolysis degree data indicating the degrees of hemolysis of the blood collected from the plurality of subjects,
a sample for which the degree of hemolysis is appropriate is selected as the standard sample, and
the property determination model is generated using the expression levels of the plurality of small RNAs in the standard sample.
9. The method for generating a property determination model according to claim 1, wherein
a property determination model that determines a disease as the property of the subject is generated.
10. A property determination model that determines a property of a subject, the model being generated based on expression levels of a plurality of small RNAs, with samples from which the plurality of small RNAs are derived being selected at the time of generation, wherein
a sample for which conditions from collection of the sample from a subject to measurement of the expression levels of the plurality of small RNAs in the sample satisfy predetermined conditions is selected as a standard sample, and
the property determination model is generated using the expression levels of the plurality of small RNAs in the standard sample.
11. A property determination method comprising determining the property by inputting small RNA data, which indicates results of measuring the expression levels of a plurality of small RNAs in a sample of a subject, into the property determination model according to claim 10.
12. The property determination method according to claim 11, wherein
the small RNA data is data obtained from a sample for which the conditions from collection of the sample from the subject to measurement of the expression levels of the plurality of small RNAs in the sample satisfy the predetermined conditions.
13. A property determination device comprising the property determination model according to claim 10, wherein
the device determines the property by inputting small RNA data, which indicates results of measuring the expression levels of a plurality of small RNAs in a body fluid of a subject, into the property determination model.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2023182776 | 2023-10-24 | | |
| JP2023-182776 | 2023-10-24 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025089036A1 (en) | 2025-05-01 |
Family
ID=95515192
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2024/035869 (Pending; WO2025089036A1 (en)) | 2023-10-24 | 2024-10-07 | Method for producing property determination model, property determination model, property determination method, and property determination device |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025089036A1 (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2017146033A1 (en) * | 2016-02-22 | 2017-08-31 | Toray Industries, Inc. | METHOD FOR EVALUATING QUALITY OF miRNA DERIVED FROM BODY FLUID |
| WO2021132547A1 (en) * | 2019-12-25 | 2021-07-01 | Toray Industries, Inc. | Test method, test device, learning method, learning device, test program and learning program |
| JP2022512890A (en) * | 2018-10-30 | 2022-02-07 | SomaLogic Operating Co., Inc. | Sample quality evaluation method |
2024-10-07: WO application PCT/JP2024/035869 (WO2025089036A1, en); status: active, pending
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| EP2016405B1 (en) | | Methods and apparatus for identifying disease status using biomarkers |
| Boguski et al. | | Biomedical informatics for proteomics |
| JP6681337B2 (en) | | Device, kit and method for predicting the onset of sepsis |
| CN110890137A (en) | | Modeling method, device and application of compound toxicity prediction model |
| JP7361187B2 (en) | | Automated validation of medical data |
| CN113053535B (en) | | Medical information prediction system and medical information prediction method |
| JP7467447B2 (en) | | Sample quality assessment method |
| CN115144599A (en) | | Application of protein combination in preparation of kit for carrying out prognosis stratification on thyroid cancer of children, and kit and system thereof |
| CN119804871A (en) | | A lupus anticoagulant detection system and method based on conventional coagulation reagents |
| WO2025089036A1 (en) | | Method for producing property determination model, property determination model, property determination method, and property determination device |
| CN119626506A (en) | | A method for constructing a systemic lupus erythematosus activity assessment model and its application |
| EP2701579A2 (en) | | Stratifying patient populations through characterization of disease-driving signaling |
| WO2025089033A1 (en) | | Testing method, testing device, testing system, testing program, recording medium, and method for generating processing time-trained model |
| WO2025089035A1 (en) | | Testing method, testing device, testing system, testing program, recording medium, and method for generating storage condition-trained model |
| WO2025089028A1 (en) | | Assessment method, assessment device, assessment system, assessment program, recording medium, and method for generating trained model for degree of hemolysis |
| WO2025089029A1 (en) | | Determination method, determination device, determination system, determination program, and recording medium |
| CN119920311B (en) | | Single-cell data quality control processing method and system |
| CN118538425B (en) | | Diagnosis model of VHL syndrome kidney cancer and application thereof |
| WO2025089057A1 (en) | | Method of preparing and method of storing blood sample for small rna measurement, method of measuring small rna expression level, disease assessment method, evaluation method, model generation method, evaluation device, evaluation system, evaluation program, and recording medium |
| CN116287175B (en) | | Application of marker in preparation of related products for predicting intrahepatic cholestasis in gestation period |
| WO2025094604A1 (en) | | Detection method, detection device, detection system, detection program, and recording medium |
| WO2025027987A1 (en) | | Detection method, presentation device, detection device, detection system, detection program, and recording medium |
| RU2449281C1 (en) | | Method of diagnosing pathologies of hemostasis system by means of neural network |
| WO2025253811A1 (en) | | Property determination model generation method, property determination model, property determination method, property determination device, program, and recording medium |
| CN120164621A (en) | | Coagulation disease auxiliary analysis equipment |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 24882153; Country of ref document: EP; Kind code of ref document: A1 |