CN114566285A - Early screening model for bladder cancer, construction method thereof, kit and use method thereof - Google Patents

Info

Publication number
CN114566285A
Authority
CN
China
Prior art keywords
chr
gene
bladder cancer
sample
chrx
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210447648.8A
Other languages
Chinese (zh)
Other versions
CN114566285B (en)
Inventor
楼峰
王云凯
周涛
刘磊
朱帅鹏
孙宏
曹善柏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiangxin Biotechnology Co ltd
Tianjin Xiangxin Medical Instrument Co ltd
Tianjin Xiangxin Medical Laboratory Co ltd
Original Assignee
Tianjin Xiangxin Biotechnology Co ltd
Tianjin Xiangxin Medical Instrument Co ltd
Beijing Xiangxin Biotechnology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Xiangxin Biotechnology Co ltd, Tianjin Xiangxin Medical Instrument Co ltd, Beijing Xiangxin Biotechnology Co ltd filed Critical Tianjin Xiangxin Biotechnology Co ltd
Priority to CN202210447648.8A priority Critical patent/CN114566285B/en
Publication of CN114566285A publication Critical patent/CN114566285A/en
Application granted granted Critical
Publication of CN114566285B publication Critical patent/CN114566285B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/50 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/50 Mutagenesis
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Theoretical Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Pathology (AREA)
  • Biophysics (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Public Health (AREA)
  • Immunology (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Oncology (AREA)
  • Hospice & Palliative Care (AREA)
  • Evolutionary Computation (AREA)
  • Microbiology (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Biochemistry (AREA)
  • Epidemiology (AREA)

Abstract

The application relates to the technical field of bladder cancer screening, and specifically discloses an early screening model for bladder cancer, a construction method thereof, a kit, and a method of using the kit. The construction method comprises the following steps: S1, obtaining gDNA sequencing results and cfDNA sequencing results for healthy samples and cancer samples; S2, processing the cfDNA sequencing results to obtain SNP/INDEL features, processing the gDNA sequencing results to obtain CNV features, classifying according to the SNP/INDEL features and the CNV features, and plotting ROC curves; S3, establishing the early screening model for bladder cancer. The kit screens for bladder cancer based on the model and has the advantage of screening bladder cancer accurately.

Description

Early screening model for bladder cancer, construction method thereof, kit and use method thereof
Technical Field
The application relates to the technical field of bladder cancer screening, in particular to a bladder cancer early-stage screening model, a construction method and a kit thereof.
Background
Bladder cancer is one of the most common malignant tumors of the urinary system. Current bladder cancer monitoring methods rely on repeated cystoscopy, needle biopsy, and imaging examinations. Cystoscopy is considered the current gold standard for bladder cancer diagnosis, but these procedures are time consuming, costly, less sensitive to carcinoma in situ, and may lead to complications such as urinary tract infection, urinary tract injury, bladder injury, and the like. Needle biopsy methods can be traumatic to tissue due to their high invasiveness. The imaging examination is accompanied by radiation injury, and the above conventional examination methods all bring pain to patients.
Bladder cancer is biologically characterized by multicentric onset, frequent recurrence, invasiveness, and a tendency toward drug resistance. The recurrence rate of non-muscle-invasive bladder cancer is high, and regular intravesical instillation and follow-up cystoscopy are required; the metastasis rate of muscle-invasive bladder cancer is high, and the five-year survival rate is only 50%-60% even after radical resection. Given this poor prognosis, early screening of bladder cancer urgently needs to be realized and popularized.
Because of the particular nature of bladder cancer lesions, the urine of bladder cancer patients often contains large numbers of tumor cells shed from bladder cancer tissue, as well as small fragments of free DNA released by apoptosis and rupture of cancer cells. Urine is therefore an ideal sample for diagnosing bladder cancer, and noninvasive urine-based diagnosis is expected to become a mainstream direction of research and development. Existing noninvasive urine-based diagnostic techniques, such as exfoliative cytology, fluorescence in situ hybridization (FISH), and bladder tumor antigen (BTA) testing, suffer from low sensitivity and/or specificity and are prone to missed diagnosis and misdiagnosis; developing a highly sensitive noninvasive diagnostic technique for bladder cancer can therefore reduce patient suffering and improve current medical conditions.
The current clinical "gold standard" for bladder cancer screening is cystoscopy combined with pathological biopsy, for which sampling is painful and traumatic to the urinary system. A series of noninvasive screening methods for bladder cancer also exist, mainly based on detecting urine sediment cells and markers in the urine supernatant. Screening techniques based on urine sediment cells include: detecting gene copy number variation by low-depth whole-genome sequencing, detecting gene mutations by targeted capture sequencing, detecting chromosome copy number variation by fluorescence in situ hybridization (FISH), detecting DNA methylation levels, and detecting the mRNA expression of overexpressed marker genes. Screening techniques based on urine supernatant mainly detect cfDNA point mutations.
However, the screening accuracy of the existing models for screening bladder cancer is not high; the detection and analysis method corresponding to the screening model also adopts a single-dimension linear classification method, and the method still has the problems of low sensitivity and specificity.
Disclosure of Invention
In order to improve the accuracy of early screening of bladder cancer and improve the sensitivity and specificity of screening, the application provides an early screening model of bladder cancer, a construction method thereof, a kit and a use method thereof.
In a first aspect, the present application provides a method for constructing a bladder cancer early screening model, which adopts the following technical scheme:
a construction method of a bladder cancer early screening model comprises the following steps:
S1, obtaining gDNA sequencing results of healthy samples and cancer samples from their respective urine sediment samples; obtaining cfDNA sequencing results of healthy samples and cancer samples from their respective urine supernatant samples;
S2, obtaining, from the cfDNA sequencing results, the coverage of bladder cancer-related mutation markers in all healthy samples and cancer samples, and thereby obtaining the SNP/INDEL features; using the gDNA sequencing results of a portion of the healthy samples to construct a baseline for CNV events, and processing the information in the baseline to obtain the CNV features; classifying according to the SNP/INDEL features and the CNV features, and plotting ROC curves;
S3, integrating the SNP/INDEL features and the CNV features, and establishing the early screening model for bladder cancer through a support vector machine (SVM) machine learning algorithm.
With the wide application of NGS in early cancer screening, more and more new biomarkers are being discovered. Current research in the field focuses mainly on DNA methylation, mutations, and copy number variation, with a smaller number of studies on protein and RNA markers. The model construction method of this application combines multiple genomic features, namely polymorphic site mutations and copy number variation, establishes a training set and a test set, and builds a cancer screening model by machine learning so as to improve the accuracy of (early) cancer screening.
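Step S2 mentions classifying on a feature and plotting an ROC curve. The following is a minimal sketch of how ROC points and the area under the curve can be computed from per-sample scores; the labels and scores are invented toy values, not data from the application, and the function names are hypothetical.

```python
from typing import List, Tuple


def roc_curve(labels: List[int], scores: List[float]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) points, sweeping the decision threshold from high to low."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        # Consume all samples tied at this score before emitting one point.
        current = scores[order[i]]
        while i < len(order) and scores[order[i]] == current:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data (assumption): 1 = cancer sample, 0 = healthy sample.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_curve(labels, scores)
roc_auc = auc(points)
```

In this toy case eight of the nine cancer/healthy pairs are ranked correctly, so the area is 8/9.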
Optionally, in step S2, the step of constructing the baseline for CNV events includes: randomly selecting urine sediment samples from a portion of the healthy samples; processing regions that have no signal or a large amount of noise, which include centromeres, telomeres, and repeat regions; and then calculating the expected coverage and the median segment variance of each bin, so that this information can be used to normalize subsequent test samples.
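A rough sketch of the baseline idea follows, under several assumptions that the text does not confirm: noisy bins are simply excluded, each healthy sample is normalized by its own mean coverage over the remaining bins, and the per-bin spread is summarized by a sample standard deviation rather than the median segment variance mentioned above. The function name and the toy counts are hypothetical.

```python
from statistics import mean, stdev


def build_baseline(healthy_counts, masked_bins):
    """Per-bin (expected ratio, standard deviation) from healthy samples.

    healthy_counts: one list of GC-corrected read counts per healthy sample.
    masked_bins: indices of bins to skip (e.g. centromere, telomere, repeats).
    """
    n_bins = len(healthy_counts[0])
    kept = [b for b in range(n_bins) if b not in masked_bins]
    # Normalize every sample by its own mean coverage over the kept bins.
    ratios = []
    for counts in healthy_counts:
        sample_mean = mean(counts[b] for b in kept)
        ratios.append({b: counts[b] / sample_mean for b in kept})
    # Summarize each kept bin across the healthy samples.
    baseline = {}
    for b in kept:
        values = [r[b] for r in ratios]
        baseline[b] = (mean(values), stdev(values))
    return baseline


# Toy data (assumption): three healthy samples, four bins, bin 1 is masked.
healthy_counts = [
    [100, 0, 110, 90],
    [200, 5, 180, 220],
    [150, 1, 150, 150],
]
baseline = build_baseline(healthy_counts, masked_bins={1})
```

Here bin 2 ends up with an expected ratio of 1.0 and a standard deviation of 0.1, and the masked bin is absent from the result.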
Optionally, in step S2, the step of processing the information in baseline to obtain the CNV feature includes:
normalizing the information in the baseline and calculating the log2ratio and Z-Score of each bin interval, using the following formulas:
$$\mathrm{ratio} = \frac{RC_{\text{gc-bin}}}{\operatorname{mean}(RC_{\text{gc-all-bin}})}$$
where $RC_{\text{gc-bin}}$ is the number of reads in each bin interval after GC correction, and $\operatorname{mean}(RC_{\text{gc-all-bin}})$ is the mean of the GC-corrected read counts over all bin intervals;
$$\log_2\mathrm{ratio} = \log_2(\mathrm{ratio})$$
that is, the base-2 logarithm of the ratio;
$$Z\text{-Score} = \frac{\mathrm{ratio} - E(\text{ref-ratio})}{\operatorname{std}(\text{ref-ratio})}$$
where $E(\text{ref-ratio})$ is the expected coverage of each bin in the CNV baseline, and $\operatorname{std}(\text{ref-ratio})$ is the standard deviation of the coverage of each bin in the CNV baseline;
the log2ratio and Z-Score obtained for each bin are then segmented using the Circular Binary Segmentation (CBS) method, yielding the final log2ratio and Z-Score of each segment; cutoff_1 = 0.08 is used for segment filtering.
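The per-bin quantities defined above can be sketched as follows. This is an illustration only: the function name and the toy numbers are assumptions, the baseline mean and standard deviation are taken as given, and the CBS segmentation and cutoff_1 filtering steps are not shown.

```python
import math
from statistics import mean


def bin_stats(rc_gc, ref_mean, ref_std):
    """Return (ratio, log2ratio, z_score) for every bin of one test sample.

    rc_gc: GC-corrected read count per bin.
    ref_mean, ref_std: baseline expected ratio and standard deviation per bin.
    """
    mean_rc = mean(rc_gc)  # mean GC-corrected read count over all bins
    out = []
    for rc, mu, sd in zip(rc_gc, ref_mean, ref_std):
        ratio = rc / mean_rc
        log2ratio = math.log2(ratio) if ratio > 0 else float("-inf")
        z_score = (ratio - mu) / sd if sd > 0 else float("nan")
        out.append((ratio, log2ratio, z_score))
    return out


# Toy data (assumption): four bins.
rc_gc = [100, 200, 100, 400]
ref_mean = [0.5, 1.0, 1.0, 1.0]
ref_std = [0.1, 0.1, 0.1, 0.25]
stats = bin_stats(rc_gc, ref_mean, ref_std)
```

In the toy data the last bin has twice the mean coverage, so its ratio is 2.0, its log2ratio is 1.0, and its Z-Score is (2.0 - 1.0) / 0.25 = 4.0.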
Optionally, in step S2, cutoff_2 = 8700000 is used for classification of the CNV features.
Optionally, the genes on which the bladder cancer-related mutation markers are located include any one or more of TERT, TP53, ERBB2, ERCC2, FGFR3, KDM6A, PIK3CA, ARID1A, ERBB3, GATA3, BRCA2, CREBBP, CTNNB1, ELF3, and FH.
Alternatively, the bladder cancer-associated mutant markers distributed on gene TP53 include chr 17: 7577127 and 7577127, chr 17: 7577505 and 7577505, chr 17: 7577527 and 7577527, chr 17: 7577535 and 7577535, chr 17: 7577538 and 7577538, chr 17: 7577539 and 7577539, chr 17: 7577545 and 7577545, chr 17: 7577548 and 7577548, chr 17: 7577559 and 7577559, chr 17: 7577568 and 7577568, chr 17: 7579313 and 7579313, chr 17: 7579328 and 7579328, chr 17: 7579365 and 7579365, chr 17: 7579391-: 7579406 and 7579406, chr 17: 7579415 and 7579415, chr 17: 7579431 and 7579431, chr 17: 7573982 and 7573982, chr 17: 7573983 and 7573983chr 17: 7577574 and 7577574, chr 17: 7577596 and 7577596, chr 17: 7577599 and 7577599, chr 17: 7578382 and 7578382, chr 17: 7578419 and 7578419, chr 17: 7578437 and 7578437, chr 17: 7578442 and 7578442, chr 17: 7578513 and 7578513, chr 17: 7578524-7578524, chr 17: 7579340 and chr 17: 7577571 and 7577584;
bladder cancer-associated mutant markers distributed on the gene ERBB2 include chr 17: 37873691-37873691, chr 17: 37879658 and 37879658, chr 17: 37880220 and 37880220, chr 17: 37880257 and 37880257, chr 17: 37880261 and 37880261, chr 17: 37880265, chr 17: 37880981, chr 17: 37881329, chr 17: 37883131-37883131, chr 17: 37883158 and 37883158, chr 17: 37883660-37883660, chr 17: 37884073 and 37884073, chr 17: 37863323 and 37863323, chr 17: 37864656-chr 17: 37864665 and 37864665;
bladder cancer-associated mutant markers distributed on gene ERCC2 include chr 19: 45855817-45855817, chr 19: 45855824-45855824, chr 19: 45855835-45855835, chr 19: 45858086-: 45860556-45860556, chr 19: 45860733-45860733, chr 19: 45864859-45864881, chr 19: 45867571-45867571, chr 19: 45867584 + 45867584, chr 19: 45867687-: 45872189-45872189, chr 19: 45872213-45872213, chr 19: 45872219-45872219, chr 19: 45872362-45872362, chr 19: 45872380, chr 19: 45873425-45873425, chr 19: 45873455-45873455 and chr 19: 45873456 and 45873456;
bladder cancer-associated mutant markers distributed on the gene FGFR3 include chr 4: 1803564, chr 4: 1803568, chr 4: 1806089, chr 4: 1806092, chr 4: 1806099, chr 4: 1806119-1806119, chr 4: 1806153-1806153, chr 4: 1807859-1807859, chr 4: 1807889, chr 4: 1807890, chr 4: 1808916-1808916 and chr 4: 1808937-1808937;
bladder cancer-associated mutant markers distributed on gene KDM6A include chrX: 44732952 and 44732952, chrX: 44732955 and 44732955, chrX: 44733198 and 44733198, chrX: 44733200, chrX: 44833924 and 44833924, chrX: 44870233-44870240, chrX: 44913136 and 44913136, chrX: 44918316-44918316, chrX: 44918532-: 44918582-: 44918668-44918669, chrX: 44920630-44920633, chrX: 44949991-: 44950033-44950034, chrX: 44950066-44950066, chrX: 44969323-44969323 and chrX: 44969369 and 44969369;
bladder cancer-associated mutant markers distributed on the gene PIK3CA include chr 3: 178916992 9-: 1789169936-1789169936, chr 3: 178936091-178936091, chr 3: 178936094-178936094, chr 3: 178936095-178936095, chr 3: 178936103-178936103, chr 3: 178938886-178938886, chr 3: 178951994-: 178952039-178952039, chr 3: 178952085-178952085, chr 3: 178952090-178952090, chr 3: 178919256 and chr 3: 178921553 and 178921553;
bladder cancer-associated mutant markers distributed over gene ARID1A include chr 1: 27099915, chr 1: 27101380-27101398, chr 1: 27101427-27101449, chr 1: 27101551-27101551, chr 1: 27101586-: 27101645 and 27101645, chr 1: 27102132-: 27102137-27102137, chr 1: 27105648 and 27105649, chr 1: 27106081-27106087, chr 1: 27106240-27106240, chr 1: 27106279-27106279, chr 1: 27106354-27106354, chr 1: 27106539-: 27107081-: 27107134-27107135;
bladder cancer-associated mutant markers distributed on the gene ERBB3 include chr 12: 56478817-: 56478851-: 56478854-: 56481649-: 56481660-: 56482341-: 56482537-: 56489571-: 56489582-: 56490408-: 56492623-: 56495696-: 56495711-;
bladder cancer-associated mutant markers distributed over the gene GATA3 include chr 10: 8100619-8100619;
bladder cancer-associated mutant markers distributed on gene BRCA2 include chr 13: 32900694, chr 13: 32910686-32910686, chr 13: 32910940-32910940, chr 13: 32912514-32912514 and chr 13: 32972638 and 32972638;
bladder cancer-associated mutant markers distributed on the gene CREBBP include chr 16: 3786710-3786710, chr 16: 3786726, 3786730, chr 16: 3786740-: 3786775 and 3786775;
bladder cancer-associated mutant markers distributed over the gene CTNNB1 include chr 3: 41266113-41266113;
bladder cancer-associated mutant markers distributed on gene ELF3 include chr 1: 201981484. 201981494, chr 1: 201981500-201981500, chr 1: 201981537-201981537, chr 1: 201983034-201983034, chr 1: 201983035 and chr 1: 201984418 and 201984418;
bladder cancer-associated mutant markers distributed on gene FH include chr 1: 241663871-241663871.
By adopting the above technical scheme, suitable markers are selected for model construction. The markers selected in this application include marker data drawn from the public TCGA bladder cancer SNP/INDEL data, and the related mutations include polymorphic site mutations and chromosomal mutations; the data source of the markers is therefore more comprehensive, and a more accurate, representative, and comprehensive marker panel can be obtained. The model construction method combines multiple genomic features, namely polymorphic site mutations and copy number variation, establishes a training set and a test set, and builds the cancer screening model by machine learning, which can significantly improve the accuracy of (early) cancer screening.
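As an illustration of how a marker list of this kind might be checked against called variants, the sketch below parses a few of the well-formed intervals listed above and counts, per gene, the variants that fall inside a marker interval. The text does not specify how marker coverage is turned into the SNP/INDEL feature, so the counting rule, the function names, and the example variants are all assumptions.

```python
from collections import Counter

# A small subset of the marker intervals listed above, written without spaces.
MARKERS = {
    "FGFR3": ["chr4:1806119-1806119", "chr4:1806153-1806153"],
    "KDM6A": ["chrX:44870233-44870240"],
    "GATA3": ["chr10:8100619-8100619"],
    "FH": ["chr1:241663871-241663871"],
}


def parse_region(region):
    """Split 'chrN:start-end' into (chrom, start, end) with integer bounds."""
    chrom, span = region.split(":")
    start, end = span.split("-")
    return chrom, int(start), int(end)


def marker_hits(variants, markers=MARKERS):
    """Count, per gene, the variants lying inside any marker interval."""
    hits = Counter()
    for chrom, pos in variants:
        for gene, regions in markers.items():
            for region in regions:
                r_chrom, start, end = parse_region(region)
                if chrom == r_chrom and start <= pos <= end:
                    hits[gene] += 1
                    break  # count each variant at most once per gene
    return hits


# Hypothetical variant calls as (chromosome, position) pairs.
variants = [("chr4", 1806119), ("chrX", 44870236), ("chr2", 12345)]
hits = marker_hits(variants)
```

The third variant lies outside every interval, so only FGFR3 and KDM6A receive a count.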
In a second aspect, the present application provides an early screening model for bladder cancer, which adopts the following technical scheme:
a bladder cancer early screening model is obtained by the construction method.
Optionally, 70% of the obtained data is used as the training set and 30% as the test set; the model uses a radial basis function (RBF) kernel; the kernel parameter gamma is 0.001, and the penalty coefficient C of the objective function is 1000.
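To make these parameters concrete, here is a small standard-library sketch of the RBF kernel, K(x, y) = exp(-gamma * ||x - y||^2), and of a 70/30 split. It does not train an SVM; the shuffling seed, the function names, and the feature vectors are assumptions, and the sample count of 150 is taken from the 83 cancer and 67 healthy samples described in the Examples.

```python
import math
import random

GAMMA = 0.001  # kernel parameter stated in the text
C = 1000       # penalty coefficient stated in the text (used by an SVM solver, not below)


def rbf_kernel(x, y, gamma=GAMMA):
    """Radial basis function kernel between two feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def split_70_30(samples, seed=0):
    """Shuffle reproducibly, then split into 70% training and 30% test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


samples = list(range(150))  # placeholder IDs for 83 cancer + 67 healthy samples
train, test = split_70_30(samples)

k_same = rbf_kernel([1.0, 2.0], [1.0, 2.0])    # identical vectors
k_far = rbf_kernel([0.0, 0.0], [30.0, 40.0])   # squared distance 2500
```

With gamma = 0.001 the kernel decays slowly: two vectors 50 units apart still have a similarity of exp(-2.5), about 0.082. The text does not name a software library; in scikit-learn, for instance, the same settings would correspond to `SVC(kernel="rbf", gamma=0.001, C=1000)`.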
By adopting the technical scheme, the model is combined with a certain kit to screen the bladder cancer, particularly the early screening of the bladder cancer, the screening result is better in accuracy, and the specificity and the sensitivity are excellent.
In a third aspect, the present application provides a kit for early screening of bladder cancer based on the above model, which adopts the following technical scheme:
a kit for early screening of bladder cancer based on the above model, the kit comprising a reagent composition for extracting cfDNA, a reagent composition for gene library construction, and a reagent composition for extracting gDNA;
the cfDNA of the sample to be screened is extracted with the reagent composition for extracting cfDNA, a cfDNA library of the sample is constructed with the reagent composition for gene library construction, and the cfDNA sequencing result of the sample is obtained after sequencing on the instrument;
the gDNA of the sample to be screened is extracted with the reagent composition for extracting gDNA, a gDNA library of the sample is constructed with the reagent composition for gene library construction, and the gDNA sequencing result of the sample is obtained after sequencing on the instrument;
the obtained cfDNA sequencing result and gDNA sequencing result are input into the model to obtain the early bladder cancer screening result.
In a fourth aspect, the application provides a use method of the kit, which adopts the following technical scheme:
the method of using the kit comprises the following steps:
extracting the cfDNA of the sample to be screened with the reagent composition for extracting cfDNA, constructing a cfDNA library of the sample with the reagent composition for gene library construction, and obtaining the cfDNA sequencing result of the sample after sequencing on the instrument;
extracting the gDNA of the sample to be screened with the reagent composition for extracting gDNA, constructing a gDNA library of the sample with the reagent composition for gene library construction, and obtaining the gDNA sequencing result of the sample after sequencing on the instrument;
and inputting the obtained cfDNA sequencing result and gDNA sequencing result into the model to obtain the early bladder cancer screening result.
In summary, the present application has the following beneficial effects:
1. In this application, the CNV features are obtained from the gDNA sequencing results of urine sediment samples and the SNP/INDEL features are obtained from the cfDNA sequencing results of urine supernatant samples; the SNP/INDEL features and the CNV features are then integrated, and the early screening model for bladder cancer is established through a support vector machine learning algorithm. Because the model integrates both the SNP/INDEL features and the CNV features, the resulting screening model shows excellent sensitivity and specificity when used for early screening of bladder cancer, and the final screening result is more accurate.
2. In the application, a specific marker is preferably adopted to obtain SNP/INDEL characteristics, so that the accuracy of an evaluation model is further improved.
Drawings
FIG. 1 is a ROC curve for SNP/INDEL characteristics;
FIG. 2 is a ROC curve for CNV characteristics;
FIG. 3 is a training set ROC curve for a support vector machine model;
FIG. 4 is a test set ROC curve for a support vector machine model.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples.
Examples
A bladder cancer early screening model based on urine cfDNA and a construction method thereof are disclosed, the construction method comprises the following steps:
s1, obtaining gDNA sequencing results of the healthy sample and the cancer sample from the urine sediment samples of the healthy sample and the cancer sample respectively; cfDNA sequencing results were obtained from the urine supernatant samples of the healthy and cancer samples, respectively.
According to the experimental process, 83 bladder cancer samples and 67 healthy samples are respectively subjected to cfDNA collection, extraction, library building and sequencing; and respectively carrying out gDNA collection, extraction, library building and sequencing.
The method comprises the following specific steps:
gDNA collection, extraction, library construction and sequencing in urine sediment cell
1 collection and transportation of urine
1.1 Using a 50 mL centrifuge tube pre-filled with Urine Conditioning Buffer (Zymo Research), one tube of 50 mL fasting midstream morning urine was collected for gDNA extraction from urine sediment cells.
1.2 Using Kangji century Urine DNA Storage Tubes (cat. no. CW2657), 20 mL of fasting midstream morning urine (10 mL × 2 tubes) was collected for cfDNA extraction from the urine supernatant.
2. Extraction of gDNA from urine sediment cell
2.1 pretreatment of urine
2.1.1 place the urine collected in step 1.1 in a room-temperature centrifuge, centrifuge at 1600 rpm for 10 min at room temperature, and keep the pellet;
2.1.2 add 1000 μL PBS to the pellet in the tube, resuspend by pipetting, transfer to a prepared 1.5 mL centrifuge tube, and centrifuge at 1600 rpm for 5 min at room temperature;
2.1.3 after centrifugation, aspirate and discard the supernatant; add 1000 μL PBS to the cell pellet, resuspend the cells by pipetting, and use immediately for subsequent experiments or store at 4 °C until use;
2.2 urine sediment cell gDNA extraction
Urine sediment cell gDNA was extracted using a blood/cell/tissue genomic DNA extraction kit (cat. no. DP304); the reagents below are from this kit. The specific procedure is as follows:
2.2.1 centrifuging the 2.1.3 cell suspension at room temperature 1600rpm for 5min, and removing the supernatant by aspiration;
2.2.2 add 200 μL Buffer GA and 20 μL proteinase K to the cell pellet, mix well by shaking, and centrifuge briefly; then place the sample in a constant-temperature metal bath pre-equilibrated to 56 °C and incubate at 56 °C for at least 30 min until fully digested, vortexing for 2-3 s every 15 min;
2.2.3 after the incubation in 2.2.2, add 200 μL Buffer GB (lysis, to separate nucleic acid from protein) to the system, mix by inversion, and centrifuge briefly; then place in a constant-temperature metal bath pre-equilibrated to 70 °C and incubate for at least 10 min, vortexing for 10 s every 3 min, until the solution is clear;
2.2.4 adding 4 μ L RNase A into the 2.2.3 system, vortex and shake for 15s, standing for 2min at room temperature;
2.2.5 adding 200 μ L of anhydrous ethanol into the 2.2.4 system, vortex and oscillating for 15s, standing for 2-3min at room temperature;
2.2.6 transfer all samples to CB3/CR2 column, 12000rpm centrifugation for 1min, discard waste liquid;
2.2.7 adding 500 μ L Buffer GD to CB3/CR2 column, centrifuging at 12000rpm for 1min, and discarding the waste liquid;
2.2.8 adding 600 μ L of rinsing solution PW to CB3/CR2 column, centrifuging at 12000rpm for 1min, and discarding the waste liquid;
2.2.9 repeat step 2.2.8 once;
2.2.10 place the CB3/CR2 column in the centrifuge and spin it empty at 12000 rpm for 1 min, discard the waste liquid, transfer the CB3/CR2 column to a 1.5 mL tube, and air-dry at room temperature for 2 min;
2.2.11 mu.L of elution buffer preheated to 56 ℃ was added to the center of the air-dried CB3/CR2 column and left to stand at room temperature for 2 min. And then centrifuging at 12000rpm for 1min at room temperature, and collecting the extracted DNA for DNA quality inspection and library preparation.
3. WGS library construction
WGS libraries were constructed using the KAPA DNA HyperPrep library construction kit (Illumina, PCR-free, 96 reactions; cat. no. KK8505); the reagents below are from this kit. The specific procedure is as follows:
3.1 gDNA fragmentation
3.1.1 based on the Qubit fluorometer quantification, transfer 200 ng of the DNA extracted in 2.2.11 to a prepared 1.5 mL centrifuge tube, bring the final volume to 50 μL with IDTE buffer (IDT), mix by shaking, and centrifuge briefly;
3.1.2 transfer the DNA prepared in 3.1.1 to a 50 μL Covaris shearing tube (M220 Holder XTU microTUBE 50 μL, cat. no. 500488) and fragment it in a Covaris M220 to a main fragment size of 300-350 bp;
3.1.3 transfer the fragmented DNA to a prepared 200 μL PCR tube for end repair and A-tailing;
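The arithmetic behind step 3.1.1 can be sketched as below: given a Qubit concentration, compute the DNA volume that contains 200 ng and the buffer volume needed to reach 50 μL. The function name and the example concentration of 25 ng/μL are assumptions for illustration.

```python
def fragmentation_mix(conc_ng_per_ul, input_ng=200.0, final_ul=50.0):
    """Return (DNA volume, IDTE buffer volume) in microliters."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    dna_ul = input_ng / conc_ng_per_ul
    if dna_ul > final_ul:
        # The sample is too dilute to deliver the input mass in the final volume.
        raise ValueError("DNA volume exceeds the final reaction volume")
    return dna_ul, final_ul - dna_ul


# Hypothetical Qubit reading of 25 ng/uL.
dna_ul, buffer_ul = fragmentation_mix(25.0)
```

For a 25 ng/μL extract this gives 8 μL of DNA topped up with 42 μL of IDTE buffer.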
3.2 End repair and A-tailing
The end-repair and A-tailing reagents were added in the following amounts: End Repair & A-Tailing Buffer, 7 μL/reaction; End Repair & A-Tailing Enzyme Mix, 3 μL/reaction. After shaking, mixing, and brief centrifugation, the following program was run in a PCR instrument: 20 °C for 30 min; 65 °C for 30 min; hold at 4 °C; heated lid at 70 °C. This yields the end-repaired, A-tailed product.
3.3 Adapter ligation
After the reaction is finished, the adapter ligation components are added to the reaction system in the following amounts: end-repaired, A-tailed product, 60 μL/reaction; DNA adapter, 2.5 μL/reaction; PCR-grade water, 7.5 μL/reaction; Ligation Buffer, 30 μL/reaction; DNA Ligase, 10 μL/reaction. After shaking, mixing, and brief centrifugation, the following program is run in a PCR instrument: 20 °C for 15 min, heated lid off. This yields the ligation product.
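As a worked check of the ligation volumes, the sketch below sums the per-reaction components and scales the shared reagents for several reactions. The 10% pipetting overage, the choice to pool the adapter into the master mix, and the function name are assumptions, not part of the protocol.

```python
# Per-reaction ligation volumes in microliters, as listed in step 3.3.
LIGATION_UL = {
    "A-tailed product": 60.0,  # the end-repair/A-tailing product from step 3.2
    "DNA adapter": 2.5,
    "PCR-grade water": 7.5,
    "ligation buffer": 30.0,
    "DNA ligase": 10.0,
}


def master_mix(n_reactions, overage=0.1):
    """Volumes of the shared reagents for n reactions, with pipetting overage.

    The A-tailed product is sample-specific, so it is left out of the mix.
    """
    scale = n_reactions * (1 + overage)
    return {
        name: round(volume * scale, 2)
        for name, volume in LIGATION_UL.items()
        if name != "A-tailed product"
    }


total_ul = sum(LIGATION_UL.values())  # total ligation reaction volume
bead_ratio = 88.0 / total_ul          # 88 uL of beads are added in step 3.4.2
mix = master_mix(8)
```

The components add up to 110 μL per reaction, so the 88 μL of magnetic beads added in step 3.4.2 corresponds to a 0.8× bead-to-sample ratio.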
3.4 Purification after adapter ligation
3.4.1 take the Beckman AMPure XP magnetic beads out of the 2-8 °C refrigerator, mix by shaking, and equilibrate at room temperature for 30 min;
3.4.2 transfer the reaction system from 3.3 to a new 1.5 mL centrifuge tube, add 88 μL AMPure XP magnetic beads, mix by shaking, centrifuge briefly, and incubate at room temperature for 10 min;
3.4.3 after the incubation, place the centrifuge tube on a magnetic rack to capture the beads; once the solution in the tube is clear, aspirate and discard the supernatant;
3.4.4 add 400 μL of 80% ethanol to the bead tube, rotate the tube one full turn, and then aspirate and discard the supernatant;
3.4.5 repeat step 3.4.4 once, then centrifuge briefly and place the tube back on the magnetic rack to capture the beads; remove the remaining ethanol with a 20 μL pipette tip;
3.4.6 keep the beads on the magnetic rack and air-dry at room temperature for 5 min, until the bead surface is no longer glossy;
3.4.7 add 21 μL of nuclease-free water to the bead tube, mix by shaking, and incubate at room temperature for 5 min;
3.4.8 centrifuge the bead tube briefly and place it on the magnetic rack to capture the beads; once the solution in the tube is clear, transfer the supernatant to a prepared 0.2 mL PCR tube;
3.5 library amplification
Add the PCR components to the PCR tube from 3.4.8 containing the purified adapter ligation product as follows: purified adapter ligation product, 20 μL/reaction; KAPA HiFi ReadyMix, 25 μL/reaction; (P5+P7) Primer Mix, 5 μL/reaction. Vortex to mix, centrifuge briefly, and run the following program on a PCR instrument: 98 °C for 45 s; 6 cycles of (98 °C for 15 s; 60 °C for 30 s; 72 °C for 30 s); 72 °C for 1 min; hold at 4 °C. This yields the amplified library.
3.6 purification of libraries after amplification
3.6.1 take the Beckman AMPure XP magnetic beads out of the 2-8 °C refrigerator, vortex to mix, and equilibrate at room temperature for 30 min;
3.6.2 transfer the amplified reaction from 3.5 to a new 1.5 mL centrifuge tube, add 80 μL of AMPure XP magnetic beads, vortex to mix, centrifuge briefly, and incubate at room temperature for 10 min;
3.6.3 after the incubation, place the centrifuge tube on a magnetic rack to capture the beads; once the solution in the tube is clear, aspirate and discard the supernatant;
3.6.4 add 400 μL of 80% ethanol to the bead tube, rotate the tube one full turn on the rack, then aspirate and discard the supernatant;
3.6.5 repeat step 3.6.4 once, then centrifuge briefly and place the tube back on the magnetic rack to capture the beads; remove the residual ethanol with a 20 μL pipette tip;
3.6.6 keep the beads on the magnetic rack and air-dry at room temperature for 5 min, until the bead surface is no longer glossy;
3.6.7 add 21 μL of nuclease-free water to the bead tube, vortex to mix, and incubate at room temperature for 5 min;
3.6.8 centrifuge the bead tube briefly, place it on the magnetic rack to capture the beads, and once the solution is clear transfer the supernatant to a new prepared 1.5 mL centrifuge tube for quality control and sWGS sequencing.
3.6.9 library quality control and sequencing: the purified library was quantified using a Qubit 4 fluorometer with the Qubit dsDNA HS Assay Kit (Thermo Fisher). Library fragment quality control was performed using the Agilent 2100 Bioanalyzer with the Agilent 2100 DNA 1000 Kit. Before loading onto the sequencer, the molar concentration of the library was quantified on an ABI StepOne Plus, and sequencing was scheduled for a data volume of 21 Gb according to the quantification result.
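The reagent volumes in steps 3.2 to 3.6 can be cross-checked with a few lines of arithmetic. The sketch below only re-adds the per-reaction volumes stated above and derives the bead-to-sample ratio of each cleanup; the dictionary keys and the helper names `total_volume` and `bead_ratio` are illustrative, not part of the protocol.

```python
# Per-reaction volumes in microlitres, copied from steps 3.1.1 and 3.2-3.6.
end_repair = {
    "fragmented DNA": 50,
    "End Repair & A-Tailing Buffer": 7,
    "End Repair & A-Tailing Enzyme Mix": 3,
}
ligation = {
    "end-repaired, A-tailed product": 60,
    "DNA adapter": 2.5,
    "PCR-grade water": 7.5,
    "Ligation Buffer": 30,
    "DNA ligase": 10,
}
amplification = {
    "purified adapter ligation product": 20,
    "KAPA HiFi ReadyMix": 25,
    "(P5+P7) Primer Mix": 5,
}


def total_volume(components):
    """Sum the per-reaction volumes of one reaction mix."""
    return sum(components.values())


def bead_ratio(bead_volume, components):
    """Bead volume divided by the volume of the reaction being cleaned up."""
    return bead_volume / total_volume(components)


# The end-repair reaction volume is what enters the ligation as 60 uL.
assert total_volume(end_repair) == ligation["end-repaired, A-tailed product"]

post_ligation_ratio = bead_ratio(88, ligation)    # step 3.4.2
post_pcr_ratio = bead_ratio(80, amplification)    # step 3.6.2
print(total_volume(ligation), post_ligation_ratio)
print(total_volume(amplification), post_pcr_ratio)
```

With the stated volumes, the ligation cleanup uses 88 μL of beads on a 110 μL reaction and the post-amplification cleanup uses 80 μL of beads on a 50 μL reaction.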
4 Collection, extraction, library construction and sequencing of cfDNA from urine supernatant
4.1 urine pretreatment
4.1.1 place the urine received by the laboratory in step 1.2 in a room-temperature centrifuge, centrifuge at 1600 rpm for 10 min, and transfer the supernatant to a 5 mL centrifuge tube with a pipette;
4.1.2 place the 5 mL centrifuge tube containing the urine supernatant in a centrifuge, centrifuge at 16000 rpm for 10 min, and transfer the supernatant to a new 5 mL centrifuge tube with a pipette for the subsequent extraction.
4.2 urine supernatant cfDNA extraction
Urine supernatant cfDNA was extracted with an optimized protocol using the Apostle MiniMax™ High Efficiency cfDNA Isolation Kit, and the reagents used below in the urine supernatant cfDNA extraction all come from this kit. The specific procedure is as follows:
4.2.1 add 320 μL of proteinase K (20 mg/mL) to the urine supernatant from 4.1.2 and mix by inversion, then add 400 μL of sample lysis buffer (Sample Lysis Buffer), mix by inversion, and incubate in a metal bath at 60 °C for 30 min;
4.2.2 during the incubation, add 5 mL of cfDNA lysis/binding solution (cfDNA Lysis/Binding Solution) and 60 μL of magnetic nanoparticles (Magnetic Nanoparticles), in that order, to one 15 mL centrifuge tube;
4.2.3 transfer the incubated mixture from 4.2.1 to the 15 mL centrifuge tube from 4.2.2, place it on a four-dimensional rotating mixer, and incubate with inversion at room temperature for 10 min;
4.2.4 after the inversion incubation, centrifuge the 15 mL tube at 1600 rpm for 2 min at room temperature, then place it on a magnetic rack to capture the beads; once the solution is clear, aspirate and discard the supernatant;
4.2.5 add 1 mL of Apostle MiniMax cfDNA wash solution (Apostle MiniMax cfDNA Wash Solution) to the tube from 4.2.4, vortex to mix, and transfer to a 1.5 mL centrifuge tube already placed on a magnetic rack. Once the solution in the 1.5 mL tube is clear, transfer the supernatant back to the 15 mL tube to rinse out the remaining beads, then transfer this second bead suspension to the 1.5 mL tube to capture the beads;
4.2.6 once the solution in the 1.5 mL tube is clear, aspirate and discard the supernatant. Add 1 mL of Apostle MiniMax cfDNA wash solution to the 1.5 mL tube, vortex to rinse for 30 s, centrifuge briefly, and place the tube on the magnetic rack to capture the beads. Once the liquid is clear, aspirate and discard the supernatant;
4.2.7 prepare the working solution of the Apostle MiniMax cfDNA second wash solution (Apostle MiniMax cfDNA 2nd Wash Solution): in one new 5 mL centrifuge tube, combine 1600 μL of Apostle MiniMax cfDNA second wash solution and 400 μL of absolute ethanol, mix by inversion, and centrifuge briefly for later use;
4.2.8 add 1 mL of the second wash working solution to the tube from 4.2.6, vortex to rinse for 30 s, centrifuge briefly, and place on the magnetic rack to capture the beads; once the liquid is clear, aspirate and discard the supernatant;
4.2.9 repeat step 4.2.8 once. Then centrifuge the 1.5 mL tube briefly, place it on the magnetic rack to capture the beads, and remove the residual second wash working solution with a 20 μL pipette tip;
4.2.10 open the 1.5 mL tube on the magnetic rack and air-dry at room temperature for 5 min, until the bead surface is no longer glossy; add 50 μL of Apostle MiniMax cfDNA elution solution (Apostle MiniMax cfDNA Solution) to the beads, vortex, and elute at room temperature for 5 min;
4.2.11 centrifuge briefly to collect the beads at the bottom of the tube and place the 1.5 mL tube on the magnetic rack to capture the beads; once the liquid is clear, transfer the supernatant to a new prepared 1.5 mL centrifuge tube for the subsequent library construction.
5 SNV panel capture sequencing library construction
The library preparation kit used for SNV panel capture sequencing is the KAPA DNA HyperPrep library construction kit (Illumina, PCR-free, 96 reactions), purchased from Roche under catalog number KK8505.
5.1 end repair and A-tailing
Based on the DNA quantification result, transfer 10 ng or more of the cfDNA extracted in step 4.2 to a PCR tube and bring the volume to 50 μL with nuclease-free water (Nuclease-Free Water). Add the end-repair and A-tailing reagents as follows: End Repair & A-Tailing Buffer, 7 μL/reaction; End Repair & A-Tailing Enzyme Mix, 3 μL/reaction. Vortex to mix, centrifuge briefly, and run the following program on a PCR instrument: 20 °C for 30 min; 65 °C for 30 min; hold at 4 °C; heated lid at 70 °C. This yields the end-repaired, A-tailed product.
5.2 Adapter ligation
After the reaction in 5.1 is finished, add the adapter ligation components to the reaction as follows: end-repaired, A-tailed product, 60 μL/reaction; DNA adapter, 2.5 μL/reaction; PCR-grade water, 7.5 μL/reaction; Ligation Buffer, 30 μL/reaction; DNA ligase, 10 μL/reaction. Vortex to mix, centrifuge briefly, and run the following program on a PCR instrument: 20 °C for 15 min, heated lid off.
5.3 purification after adapter ligation
5.3.1 take the Beckman AMPure XP magnetic beads out of the 2-8 °C refrigerator, vortex to mix, and equilibrate at room temperature for 30 min;
5.3.2 transfer the reaction from 5.2 to a new 1.5 mL centrifuge tube, add 88 μL of Beckman AMPure XP magnetic beads, vortex to mix, centrifuge briefly, and incubate at room temperature for 10 min;
5.3.3 after the incubation, place the centrifuge tube on a magnetic rack to capture the beads, and once the solution in the tube is clear, aspirate and discard the supernatant;
5.3.4 add 400 μL of 80% ethanol to the bead tube, rotate the tube one full turn on the rack, then aspirate and discard the supernatant;
5.3.5 repeat step 5.3.4 once, then centrifuge briefly, place the tube back on the magnetic rack to capture the beads, and remove the residual ethanol with a 20 μL pipette tip;
5.3.6 keep the beads on the magnetic rack and air-dry at room temperature for 5 min, until the bead surface is no longer glossy;
5.3.7 add 21 μL of nuclease-free water to the bead tube, vortex to mix, and incubate at room temperature for 5 min;
5.3.8 centrifuge the bead tube briefly, place it on the magnetic rack to capture the beads, and once the solution is clear transfer the supernatant to a prepared 0.2 mL PCR tube;
5.4 library amplification
Add the PCR components to the PCR tube from step 5.3.8 containing the purified adapter ligation product as follows: purified adapter ligation product, 20 μL/reaction; KAPA HiFi ReadyMix, 20 μL/reaction; (P5+P7) Primer Mix, 5 μL/reaction. Vortex to mix, centrifuge briefly, and run the following program on a PCR instrument: 98 °C for 45 s; 6 cycles of (98 °C for 15 s; 60 °C for 30 s; 72 °C for 30 s); 72 °C for 1 min; hold at 4 °C.
5.5 purification of libraries after amplification
5.5.1 take the Beckman AMPure XP magnetic beads out of the 2-8 °C refrigerator, vortex to mix, and equilibrate at room temperature for 30 min;
5.5.2 transfer the amplified reaction from 5.4 to a new 1.5 mL centrifuge tube, add 80 μL of Beckman AMPure XP magnetic beads, vortex to mix, centrifuge briefly, and incubate at room temperature for 10 min;
5.5.3 after the incubation, place the centrifuge tube on a magnetic rack to capture the beads; once the solution in the tube is clear, aspirate and discard the supernatant;
5.5.4 add 400 μL of 80% ethanol to the bead tube, rotate the tube one full turn on the rack, then aspirate and discard the supernatant;
5.5.5 repeat step 5.5.4 once, then centrifuge briefly and place the tube back on the magnetic rack to capture the beads; remove the residual ethanol with a 20 μL pipette tip;
5.5.6 keep the beads on the magnetic rack and air-dry at room temperature for 5 min, until the bead surface is no longer glossy;
5.5.7 add 21 μL of nuclease-free water to the bead tube, vortex to mix, and incubate at room temperature for 5 min;
5.5.8 centrifuge the bead tube briefly, place it on the magnetic rack to capture the beads, and once the solution is clear transfer the supernatant to a new prepared 1.5 mL centrifuge tube for quality control and sequencing.
5.5.9 library quality control: the purified library is quantified using a Qubit 4 fluorometer with the Qubit dsDNA HS Assay Kit (Thermo Fisher); library fragment quality control is performed using the Agilent 2100 Bioanalyzer with the Agilent 2100 DNA 1000 Kit.
6 SNV panel capture sequencing hybridization elution
The kits used for SNV panel capture sequencing hybridization and elution are: SNV Panel (Twist Custom Panel); Twist Fast Hybridization and Wash Kit, 96 reactions (cat # 101175); Twist Binding and Purification Beads, 96 reactions (cat # 100984); Twist Universal Blockers, TruSeq Compatible, High Concentration, 4X 96R reaction (cat # 101786). These kits were purchased from Twist Bioscience.
6.1.1 switch on 2 metal bath heaters in advance and equilibrate the heating blocks to 60 °C and 65 °C for later use;
6.1.2 pool appropriate amounts of the intermediate libraries to be hybridized into a low-bind 0.2 mL PCR tube, with a total hybridization input of 500 ng to 3 μg;
6.1.3 add 2 μL of universal blockers (Universal Blockers) to the pooled library mix;
6.1.4 add 5 μL of blocker solution (Blockers solution) to the pooled library mix;
6.1.5 add 4 μL of SNV panel probe working solution to the pooled library mix;
6.1.6 close the tube cap tightly, vortex to mix, centrifuge briefly, and dry the sample down in a vacuum concentrator at 60 °C;
6.1.7 while the sample is drying, preheat the Fast Hybridization Mix in the 65 °C metal bath;
6.1.8 add 20 μL of preheated Fast Hybridization Mix to the dried library, pipette up and down 10-15 times until fully dissolved, and centrifuge briefly in the low-bind 0.2 mL tube;
6.1.9 add 30 μL of hybridization enhancer (Hybridization Enhancer) to the tube from 6.1.8, centrifuge briefly, place it on a PCR instrument set up in advance, denature at 95 °C for 5 min, and hybridize at 60 °C overnight.
6.1.10 the next day, set two metal bath heaters to 68 °C and 48 °C in advance; equilibrate the streptavidin magnetic beads, Fast Wash Buffer 1, Wash Buffer 2 and Fast Binding Buffer at room temperature for 30 min;
6.1.11 vortex the streptavidin magnetic beads for 15 s and dispense 50 μL of beads per capture reaction into low-bind 1.5 mL centrifuge tubes;
6.1.12 add 200 μL of Fast Binding Buffer, vortex for 10 s, return the tube to the magnetic rack, wait until the solution is clear, and discard the supernatant;
6.1.13 repeat step 6.1.12 twice, then resuspend the beads in 200 μL of Fast Binding Buffer;
6.1.14 immediately transfer the hybridized library to the washed beads and pipette up and down 3-5 times so that nothing remains in the tip. Fix the low-bind 1.5 mL tube containing the hybridized library and beads on a vertical mixer and mix by inversion at room temperature for 30 min;
6.1.15 during the incubation, prepare the wash reagents according to the number of hybridization reactions: Fast Wash Buffer 1, 420 μL/reaction; Wash Buffer 2, 650 μL/reaction; Fast Binding Buffer, 850 μL/reaction. After aliquoting, preheat Fast Wash Buffer 1 in the 68 °C metal bath, and preheat Wash Buffer 2 and a low-bind centrifuge tube in the 48 °C metal bath.
6.1.16 after the incubation, centrifuge the 1.5 mL tube briefly, place it on the magnetic rack until the solution is clear, then aspirate and discard the supernatant;
6.1.17 add 200 μL of preheated Fast Wash Buffer 1 to the beads in the 1.5 mL tube, vortex quickly to mix, centrifuge briefly, and incubate in the 68 °C metal bath for 5 min;
6.1.18 after the incubation, place the 1.5 mL tube on the magnetic rack to capture the beads, and once the solution is clear, aspirate and discard the supernatant;
6.1.19 add 200 μL of preheated Fast Wash Buffer 1 to the beads in the 1.5 mL tube, vortex quickly to mix, centrifuge briefly, and incubate in the 68 °C metal bath for 5 min;
6.1.20 after the incubation, transfer the mixture to the prepared low-bind 1.5 mL tube preheated at 48 °C, place it on the magnetic rack to capture the beads, and once the solution is clear, aspirate and discard the supernatant;
6.1.21 add 200 μL of preheated Wash Buffer 2 to the beads in the 1.5 mL tube, vortex quickly to mix, centrifuge briefly, and incubate in the 48 °C metal bath for 5 min;
6.1.22 repeat step 6.1.21 twice, for a total of three rinses. After the last supernatant has been removed, add 20 μL of nuclease-free water (Nuclease-Free Water) to the beads and resuspend them for PostPCR or storage at -20 °C;
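The per-reaction wash reagent volumes given in 6.1.15 scale linearly with the number of hybridization reactions. A minimal sketch of that scaling follows; the function name `wash_reagent_volumes` and the optional `overage` fraction for pipetting loss are illustrative assumptions and are not specified in the protocol.

```python
# Per-reaction volumes in microlitres, as listed in step 6.1.15.
PER_REACTION_UL = {
    "Fast Wash Buffer 1": 420,
    "Wash Buffer 2": 650,
    "Fast Binding Buffer": 850,
}


def wash_reagent_volumes(n_reactions, overage=0.0):
    """Return the volume of each wash reagent to aliquot, in microlitres.

    overage is a hypothetical extra fraction (e.g. 0.1 for 10 %) and
    defaults to zero so that the result matches the protocol exactly.
    """
    if n_reactions < 1:
        raise ValueError("n_reactions must be at least 1")
    factor = n_reactions * (1 + overage)
    return {name: volume * factor for name, volume in PER_REACTION_UL.items()}


four_captures = wash_reagent_volumes(4)
print(four_captures)
```

The stated amounts leave a small margin over what the steps consume per reaction: two 200 μL washes with Fast Wash Buffer 1 (6.1.17 and 6.1.19), three 200 μL washes with Wash Buffer 2 (6.1.21 and 6.1.22), and four 200 μL additions of Fast Binding Buffer (6.1.12 and 6.1.13).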
6.1.23 PostPCR
Prepare the PostPCR reaction as follows: bead and library mixture, 20 μL/reaction; 2X KAPA HiFi HotStart ReadyMix (high-fidelity hot-start polymerase premix), 25 μL/reaction; amplification primers (10 μM) (Amplification Primers), 2.5 μL/reaction; nuclease-free water (Nuclease-Free Water), 2.5 μL/reaction.
Place the prepared reaction in a PCR instrument and run the following program: 98 °C for 45 s; 8-10 cycles of (98 °C for 15 s; 60 °C for 30 s; 72 °C for 30 s); 72 °C for 1 min; 4 °C for 1 min.
6.1.24 PostPCR purification
(1) take the Beckman AMPure XP magnetic beads out of the 2-8 °C refrigerator, vortex to mix, and equilibrate at room temperature for 30 min;
(2) transfer the amplified reaction from 6.1.23 to a new 1.5 mL centrifuge tube, add 90 μL of AMPure XP magnetic beads, vortex to mix, centrifuge briefly, and incubate at room temperature for 10 min;
(3) after the incubation, place the centrifuge tube on a magnetic rack to capture the beads; once the solution in the tube is clear, aspirate and discard the supernatant;
(4) add 400 μL of 80% ethanol to the bead tube, rotate the tube one full turn on the rack, then aspirate and discard the supernatant;
(5) repeat step (4) once, then centrifuge briefly, place the tube back on the magnetic rack to capture the beads, and remove the residual ethanol with a 20 μL pipette tip;
(6) keep the beads on the magnetic rack and air-dry at room temperature for 5 min, until the bead surface is no longer glossy;
(7) add 21 μL of nuclease-free water (Nuclease-Free Water) to the bead tube, vortex to mix, and incubate at room temperature for 5 min;
(8) centrifuge the bead tube briefly, place it on the magnetic rack to capture the beads, and once the solution is clear transfer the supernatant to a new prepared 1.5 mL centrifuge tube for quality control and SNV panel capture sequencing.
Library quality control and sequencing: the purified library was quantified using a Qubit 4 fluorometer with the Qubit dsDNA HS Assay Kit (Thermo Fisher). Library fragment quality control was performed using the Agilent 2100 Bioanalyzer with the Agilent 2100 DNA 1000 Kit. Before loading onto the sequencer, the molar concentration of the library was quantified on an ABI StepOne Plus, and sequencing was scheduled for a data volume of 15 Gb according to the quantification result.
S2: from the cfDNA sequencing results, obtain the coverage of the variant markers in all healthy samples and cancer samples, and from it the SNP/INDEL features; build a baseline for CNV events from the gDNA sequencing results of a subset of the healthy samples, and use the information in the baseline to obtain the CNV features.
The method comprises the following specific steps:
Acquisition of the SNP/INDEL features:
(1) Screening of point mutation markers
Public SNP/INDEL data from 412 TCGA bladder cancer cases were filtered by population frequency, site coverage depth and functional region, and the variant information of hotspot genes was extracted. The results of all samples were then integrated by mutation site, and 155 variant markers on 15 genes (TERT, TP53, ERBB2, ERCC2, FGFR3, KDM6A, PIK3CA, ARID1A, ERBB3, GATA3, BRCA2, CREBBP, CTNNB1, ELF3 and FH) were selected on the principle of covering the largest number of samples with the fewest mutation sites.
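The principle of covering the most samples with the fewest mutation sites resembles the classic set-cover problem. The sketch below uses a greedy heuristic as one possible reading of that principle; the text does not state which selection procedure was actually used, and the function name, the `max_sites` parameter and the toy data are hypothetical.

```python
def greedy_marker_selection(site_to_samples, max_sites=None):
    """Greedy set cover over mutation sites.

    site_to_samples maps a site identifier to the set of sample IDs that
    carry a variant at that site. At each round the site that covers the
    most still-uncovered samples is selected.
    """
    uncovered = set()
    for samples in site_to_samples.values():
        uncovered |= samples
    selected = []
    while uncovered and (max_sites is None or len(selected) < max_sites):
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(site_to_samples),
                   key=lambda site: len(site_to_samples[site] & uncovered))
        gained = site_to_samples[best] & uncovered
        if not gained:
            break
        selected.append(best)
        uncovered -= gained
    return selected


# Toy input with made-up site and sample names.
toy = {
    "siteA": {"s1", "s2", "s3"},
    "siteB": {"s3", "s4"},
    "siteC": {"s4"},
    "siteD": {"s1"},
}
chosen = greedy_marker_selection(toy)
print(chosen)
```

In this toy input two sites are enough to cover all four samples, so the remaining sites are never selected.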
The specific 155 variant markers obtained by screening are respectively as follows:
the relevant markers on gene ARID1A were, chr 1: 27099915, chr 1: 27101380-27101398, chr 1: 27101427-27101449, chr 1: 27101551-27101551, chr 1: 27101586-: 27101645 and 27101645, chr 1: 27102132-: 27102137-27102137, chr 1: 27105648 and 27105649, chr 1: 27106081-27106087, chr 1: 27106240-27106240, chr 1: 27106279-27106279, chr 1: 27106354-27106354, chr 1: 27106539-: 27107081-: 27107134-27107135.
The relevant markers on gene ELF3 were, chr 1: 201981484. 201981494, chr 1: 201981500, chr 1: 201981537-201981537, chr 1: 201983034-201983034, chr 1: 201983035 and chr 1: 201984418-201984418.
The relevant marker on gene FH is chr 1: 241663871-241663871; the relevant marker on gene CTNNB1 is chr 3: 41266113-41266113.
The relevant markers on gene PIK3CA were, chr 3: 178916992 9-: 1789169936-1789169936, chr 3: 178936091-178936091, chr 3: 178936094-178936094, chr 3: 178936095-178936095, chr 3: 178936103-178936103, chr 3: 178938886-178938886, chr 3: 178951994-: 178952039-178952039, chr 3: 178952085-178952085, chr 3: 178952090-178952090, chr 3: 178919256 and chr 3: 178921553-178921553.
The relevant markers on gene FGFR3 were, chr 4: 1803564-1803564, chr 4: 1803568, chr 4: 1806089, chr 4: 1806092, chr 4: 1806099, chr 4: 1806119-1806119, chr 4: 1806153-1806153, chr 4: 1807859-1807859, chr 4: 1807889, chr 4: 1807890, chr 4: 1808916-1808916 and chr 4: 1808937-1808937.
The relevant markers on gene TERT are, chr 5: 1295228-1295228, chr 5: 1295250-1295250 and chr 5: 1295250-1295254. Marker on gene GATA3, chr 10: 8100619-8100619.
The relevant markers on gene ERBB3 were, chr 12: 56478817-: 56478851-: 56478854-: 56481649-56481649, chr 12: 56481660-: 56482341-: 56482537-: 56489571-: 56489582-: 56490408-: 56492623-: 56495696-: 56495711-56495711.
The relevant markers on gene BRCA2 were, chr 13: 32900694, chr 13: 32910686-32910686, chr 13: 32910940-32910940, chr 13: 32912514-32912514 and chr 13: 32972638-32972638.
The relevant marker on gene CREBBP is, chr 16: 3786710, chr 16: 3786726, 3786730, chr 16: 3786740-: 3786775-3786775.
The relative marker on gene TP53 was, chr 17: 7577127 and 7577127, chr 17: 7577505 and 7577505, chr 17: 7577527 and 7577527, chr 17: 7577535 and 7577535, chr 17: 7577538 and 7577538, chr 17: 7577539 and 7577539, chr 17: 7577545 and 7577545, chr 17: 7577548 and 7577548, chr 17: 7577559 and 7577559, chr 17: 7577568 and 7577568, chr 17: 7579313-: 7579328 and 7579328, chr 17: 7579365 and 7579365, chr 17: 7579391-: 7579406 and 7579406, chr 17: 7579415 and chr 17: 7579431 and 7579431, chr 17: 7573982 and chr 17: 7573983-7573983.
The relevant markers on gene ERBB2 were, chr 17: 37873691-37873691, chr 17: 37879658 and 37879658, chr 17: 37880220 and 37880220, chr 17: 37880257 and 37880257, chr 17: 37880261 and 37880261, chr 17: 37880265, chr 17: 37880981, chr 17: 37881329 and 37881329, chr 17: 37883131-37883131, chr 17: 37883158 and 37883158, chr 17: 37883660-37883660 and chr 17: 37884073 and 37884073, chr 17: 37863323 and 37863323, chr 17: 37864656-37864656 and chr 17: 37864665-37864665.
The relevant markers on gene ERCC2 were, chr 19: 45855817-45855817, chr 19: 45855824-45855824, chr 19: 45855835-45855835, chr 19: 45858086-: 45860556-45860556, chr 19: 45860733-: 45864859-45864881, chr 19: 45867571-45867571, chr 19: 45867584 and 45867584, chr 19: 45867687, chr 19: 45872189-45872189, chr 19: 45872213-45872213, chr 19: 45872219-45872219, chr 19: 45872362-45872362, chr 19: 45872380-45872380, chr 19: 45873425-45873425, chr 19: 45873455-45873455 and chr 19: 45873456-45873456.
The relevant marker on gene KDM6A, chrX: 44732952 and 44732952, chrX: 44732955 and 44732955, chrX: 44733198 and 44733198, chrX: 44733200, chrX: 44833924 and 44833924, chrX: 44870233 and 44870240, chrX: 44913136 and 44913136, chrX: 44918316-44918316, chrX: 44918532-: 44918582-: 44918668-44918669, chrX: 44920630-44920633, chrX: 44949991-: 44950033-44950034, chrX: 44950066-44950066, chrX: 44969323-44969323 and chrX: 44969369-44969369.
The relevant markers on gene TP53 were, chr 17: 7577574 and 7577574, chr 17: 7577596 and 7577596, chr 17: 7577599 and 7577599, chr 17: 7578382 and 7578382, chr 17: 7578419 and 7578419, chr 17: 7578437 and 7578437, chr 17: 7578442 and 7578442, chr 17: 7578513 and 7578513, chr 17: 7578524-7578524, chr 17: 7579340 and chr 17: 7577571-7577584.
(2) Detection of variant markers in cfDNA samples
The filtering conditions of step (1) (population frequency, site coverage depth and functional region) were applied, and the coverage of the 155 variant markers was counted in 83 bladder cancer samples and 67 healthy samples, giving the SNP/INDEL features.
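One way to turn marker coverage into a feature vector is a 0/1 indicator per marker. The sketch below assumes that a marker counts as covered when a filtered variant of the sample overlaps the marker interval; this encoding, the function names and the example variant are assumptions, and only the three marker intervals are taken from the TERT and GATA3 entries listed above.

```python
def parse_marker(text):
    """Parse 'chr5:1295250-1295254' (spaces tolerated) into (chrom, start, end)."""
    chrom, span = text.replace(" ", "").split(":")
    start, end = (int(value) for value in span.split("-"))
    return chrom, start, end


def marker_features(markers, variants):
    """Return one 0/1 value per marker.

    markers  : list of (chrom, start, end) intervals, inclusive.
    variants : list of (chrom, start, end) for the variants of one sample
               that passed the filters.
    """
    features = []
    for chrom, start, end in markers:
        hit = any(v_chrom == chrom and v_start <= end and v_end >= start
                  for v_chrom, v_start, v_end in variants)
        features.append(1 if hit else 0)
    return features


markers = [parse_marker(m) for m in (
    "chr5:1295228-1295228",
    "chr5:1295250-1295254",
    "chr10:8100619-8100619",
)]
sample_variants = [("chr5", 1295252, 1295252)]  # hypothetical variant call
vector = marker_features(markers, sample_variants)
print(vector)
```

Applied to all 155 markers, each sample would yield a 155-element vector that can be combined with the CNV features.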
Acquisition of the CNV features:
(1) CNV sliding window coverage calculation
For the 83 bladder cancer samples and 67 healthy samples, the genome is divided into bins of 200 kbp and the number of reads in each bin is counted; the counts are then GC-corrected to give the final read-count statistics of the CNV sliding windows.
The reads are counted as follows: for each bin, the number of reads whose start position falls within the bin interval is summed and recorded as RC.
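The read-counting rule can be sketched as follows, assigning each read to a bin by its start position. The input format (an iterable of chromosome and 0-based start pairs) and the function name are assumptions; in practice the start positions would come from an alignment file.

```python
from collections import Counter

BIN_SIZE = 200_000  # 200 kbp, as stated in the text


def count_reads_per_bin(read_starts, bin_size=BIN_SIZE):
    """Count reads per (chromosome, bin index).

    read_starts yields (chrom, start) pairs with 0-based start positions
    (an assumption). A read belongs to the bin that contains its start.
    """
    counts = Counter()
    for chrom, start in read_starts:
        counts[(chrom, start // bin_size)] += 1
    return counts


reads = [("chr1", 0), ("chr1", 199_999), ("chr1", 200_000), ("chr2", 5)]
rc = count_reads_per_bin(reads)
print(rc)
```

Only the start position decides the bin, so a read that spans a bin boundary is counted once, in the bin where it begins.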
The GC correction is performed as follows: a correction coefficient is calculated by loess regression, and the corrected read count is then obtained as $RC_{gc} = RC \times \text{correction coefficient}$.
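A minimal, dependency-free sketch of a loess-style GC correction follows. It assumes the correction coefficient of a bin is the median read count divided by the loess-predicted read count at that bin's GC content; the text does not say how the coefficient is derived from the fit, and the smoothing fraction `frac` and both function names are also assumptions.

```python
import statistics


def loess_fit(x, y, frac=0.5):
    """Locally weighted linear regression with tricube weights.

    Returns the fitted value at every x. frac is the share of points used
    in each local fit (a hypothetical default).
    """
    n = len(x)
    k = min(n, max(2, round(frac * n)))
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
        w = [max(0.0, 1.0 - (abs(xi - x0) / h) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def gc_correct(rc, gc, frac=0.5):
    """Return RC_gc = RC * coefficient for every bin."""
    expected = loess_fit(gc, rc, frac)
    target = statistics.median(rc)
    coeff = [target / e if e > 0 else 0.0 for e in expected]
    return [r * c for r, c in zip(rc, coeff)]


# Toy bins whose read count depends only on GC content.
gc = [0.30 + 0.02 * i for i in range(10)]
rc = [100 + 200 * g for g in gc]
rc_gc = gc_correct(rc, gc)
print(rc_gc)
```

In the toy data the read count is a pure function of GC content, so the corrected counts come out flat, which is the intended effect of the correction.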
(2) Construction of CNV baseline
Sixteen urine sediment samples from healthy individuals were randomly selected, and regions with no signal or high noise, such as centromeres, telomeres and repeat regions, were processed. The expected coverage and median segment variance of each bin were then calculated, and this information is used to normalize subsequent test samples.
The expected coverage and median segment variance are calculated as follows: the coverage of each bin in each sample is obtained as $\text{ratio} = RC_{gc\text{-}bin} / \operatorname{mean}(RC_{gc\text{-}all\text{-}bin})$, and the mean and standard deviation of the coverage of each bin are then calculated across all samples.
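The baseline statistics described above can be sketched as below. The sketch uses the sample standard deviation (n - 1 in the denominator), which is an assumption because the text does not say which estimator is used; the function names and the toy counts are illustrative.

```python
import statistics


def bin_ratios(rc_gc):
    """ratio = RC_gc of a bin / mean RC_gc over all bins of the same sample."""
    mean_rc = statistics.mean(rc_gc)
    return [value / mean_rc for value in rc_gc]


def build_baseline(samples_rc_gc):
    """Per-bin mean and standard deviation of the ratio across reference samples.

    samples_rc_gc is a list of per-sample lists of GC-corrected read counts,
    all in the same bin order.
    """
    ratios = [bin_ratios(sample) for sample in samples_rc_gc]
    per_bin = list(zip(*ratios))
    expected = [statistics.mean(values) for values in per_bin]
    spread = [statistics.stdev(values) for values in per_bin]
    return expected, spread


# Two toy reference samples with three bins each.
reference = [[100, 200, 300], [110, 190, 300]]
expected, spread = build_baseline(reference)
print(expected, spread)
```

With 16 reference samples the same function would return one expected ratio and one standard deviation per 200 kbp bin.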
(3) CNV feature acquisition
Each sample is normalized using the information in the baseline, and the log2ratio and Z-Score of each bin interval are calculated with the following formulas:
$$\text{ratio} = \frac{RC_{gc\text{-}bin}}{\operatorname{mean}(RC_{gc\text{-}all\text{-}bin})}$$
where $RC_{gc\text{-}bin}$ is the GC-corrected read count of each bin interval and $\operatorname{mean}(RC_{gc\text{-}all\text{-}bin})$ is the mean of the GC-corrected read counts over all bin intervals;
$$\log_2\text{ratio} = \log_2(\text{ratio})$$
that is, the base-2 logarithm of ratio;
$$Z\text{-Score} = \frac{\text{ratio} - E(\text{ref-ratio})}{\operatorname{std}(\text{ref-ratio})}$$
where $E(\text{ref-ratio})$ is the expected coverage of each bin in the CNV baseline and $\operatorname{std}(\text{ref-ratio})$ is the standard deviation of the coverage of each bin in the CNV baseline.
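The two per-bin statistics can be computed directly from the formulas above. In this sketch the baseline vectors `expected` and `spread` are toy values, and returning NaN for a bin whose baseline standard deviation is zero is an assumption about how such bins would be handled.

```python
import math
import statistics


def normalise_sample(rc_gc, expected, spread):
    """Return (ratio, log2ratio, z_score) for every bin of one test sample.

    rc_gc    : GC-corrected read counts of the test sample, one per bin.
    expected : E(ref-ratio) per bin from the CNV baseline.
    spread   : std(ref-ratio) per bin from the CNV baseline.
    """
    mean_rc = statistics.mean(rc_gc)
    result = []
    for rc, e, s in zip(rc_gc, expected, spread):
        ratio = rc / mean_rc
        log2ratio = math.log2(ratio) if ratio > 0 else float("-inf")
        z_score = (ratio - e) / s if s > 0 else float("nan")
        result.append((ratio, log2ratio, z_score))
    return result


# Toy test sample and toy baseline with three bins.
bins = normalise_sample([100, 200, 300],
                        expected=[0.5, 1.0, 1.0],
                        spread=[0.1, 0.1, 0.1])
for ratio, log2ratio, z_score in bins:
    print(ratio, log2ratio, z_score)
```

In the toy data only the third bin deviates from the baseline, so it is the only bin with a non-zero Z-Score.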
The log2ratio and Z-Score of each bin are then segmented with the circular binary segmentation (CBS) method to obtain the final log2ratio and Z-Score of each segment. The total genome length carrying CNV events in each sample's segment file is counted and used to classify healthy and tumor samples. To obtain the optimal cutoff_1, an iterative gradient is set over the interval from 0.05 to 0.1 with a step of 0.01, the optimal value is selected by plotting ROC curves, and the optimal cutoff_1 for segment filtering is finally determined to be 0.08.
To classify by the CNV features, the total genome length of the retained segments is counted and recorded as N; cutoff_2 is set to distinguish tumor samples from healthy samples, its optimal value is selected by plotting a ROC curve over the candidate values of cutoff_2, and the optimal cutoff_2 for classification is finally determined to be 8700000.
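The two-threshold rule can be sketched as follows. The text gives the values of cutoff_1 and cutoff_2 but not the exact comparison, so the sketch assumes that a segment is retained when the absolute value of its log2ratio exceeds cutoff_1 and that a sample is called tumor when N exceeds cutoff_2; the segment tuple layout and function names are also assumptions.

```python
CUTOFF_1 = 0.08        # segment filter, value from the text
CUTOFF_2 = 8_700_000   # threshold on N in base pairs, value from the text

# Candidate grid for cutoff_1: 0.05 to 0.1 in steps of 0.01.
CUTOFF_1_GRID = [round(0.05 + 0.01 * i, 2) for i in range(6)]


def cnv_total_length(segments, cutoff_1=CUTOFF_1):
    """Sum the lengths of segments whose |log2ratio| exceeds cutoff_1.

    segments is an iterable of (start, end, log2ratio) with an exclusive
    end coordinate (an assumption).
    """
    return sum(end - start
               for start, end, log2ratio in segments
               if abs(log2ratio) > cutoff_1)


def classify(segments, cutoff_1=CUTOFF_1, cutoff_2=CUTOFF_2):
    """Return the call ('tumor' or 'healthy') and the total length N."""
    n = cnv_total_length(segments, cutoff_1)
    return ("tumor" if n > cutoff_2 else "healthy"), n


# Toy segment files for two hypothetical samples.
sample_a = [(0, 10_000_000, 0.15), (10_000_000, 20_000_000, 0.02)]
sample_b = [(0, 5_000_000, -0.20)]
print(classify(sample_a), classify(sample_b))
```

Selecting the optimal thresholds would repeat this calculation for every candidate value and compare the resulting ROC curves, which is omitted here.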
ROC curves were plotted from the results of S2; the specific results are shown in fig. 1 and fig. 2.
S3: integrate the SNP/INDEL features and the CNV features, and build the early bladder cancer screening model with a support vector machine (SVM) machine learning algorithm.
For model training, the data were split 7:3 (training set : test set), the hyperparameters were tuned by grid search with 10-fold cross-validation, and the model was evaluated by AUC.
The bladder cancer prediction model is built on a systematic machine learning framework. The workflow of the framework is: 1) data preprocessing; 2) splitting the data into training and test data sets; 3) developing a model with the training data set for each algorithm; 4) evaluating the accuracy of each algorithm with the test data set.
In the data preprocessing step, the CNV features and point mutation features are scaled to zero mean and unit variance. After preprocessing, the data are separated into a training data set for developing the prediction model and a test data set for verifying its accuracy.
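Scaling to zero mean and unit variance can be sketched without any library as below. The sketch uses the population standard deviation and maps constant columns to zero; both choices are assumptions, as are the function name and the toy matrix.

```python
import statistics


def standardise_columns(rows):
    """Scale every feature column to zero mean and unit variance.

    rows is a list of samples, each a list of feature values. Columns with
    zero variance are mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    sds = [statistics.pstdev(col) for col in columns]
    return [
        [(value - m) / s if s > 0 else 0.0
         for value, m, s in zip(row, means, sds)]
        for row in rows
    ]


# Toy matrix: three samples, one varying feature and one constant feature.
scaled = standardise_columns([[1, 10], [3, 10], [5, 10]])
print(scaled)
```

After scaling, the first column has mean 0 and variance 1, while the constant second column carries no information and is set to 0.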
When building the machine learning model, the data set is randomly split into a training set and a test set: the larger share, 70% of the data (55/50), is included in the training set and used to develop the predictive model, and the smaller share, 30% of the samples (28/17), forms the test data set used to verify the accuracy of the algorithm.
A training set is defined, and an optimal model is developed for the bladder cancer prediction algorithm using a Support Vector Machine (SVM).
A grid search is carried out over the hyper-parameters to find the optimal hyper-parameter set and thereby ensure accuracy on the training data. Each grid search is performed using 10-fold cross validation, in which the training data set is divided into 10 folds of equal size. The model is then built using 90% of the data (9 of the 10 folds), and the remaining fold is used to test the model. This process is repeated 10 times, so that each fold serves once as the held-out fold for evaluating model accuracy on the training data. In the grid search, accuracy is used as the primary indicator of model performance.
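The split and grid search described above map onto standard scikit-learn calls. The following is a sketch under stated assumptions rather than the applicant's code: the data are synthetic (`make_classification`), the candidate values in `param_grid` are hypothetical, and the split and folds are stratified for stability even though the text only says the split is random. Scaling is placed inside the pipeline so that it is refitted on each training fold, which differs slightly from scaling once before the split.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the two feature types (SNP/INDEL and CNV); not study data.
X, y = make_classification(
    n_samples=150, n_features=2, n_informative=2, n_redundant=0, random_state=0
)

# 7:3 split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Scaling followed by an SVM with a radial basis function kernel.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Hypothetical candidate grid; the text does not list the values searched.
param_grid = {
    "svc__C": [1, 10, 100, 1000],
    "svc__gamma": [0.0001, 0.001, 0.01, 0.1],
}

# 10-fold cross validation on the training set, scored by accuracy.
folds = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
search = GridSearchCV(pipeline, param_grid, scoring="accuracy", cv=folds)
search.fit(X_train, y_train)

print(search.best_params_, round(search.best_score_, 3))
```

After fitting, `search.best_estimator_` is the pipeline refitted on the whole training set with the selected hyper-parameters, which is the model that would then be evaluated on the held-out test set.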
Finally, the constructed model is as follows: the optimal hyper-parameter set obtained from the training data is a radial basis function (rbf) kernel, a kernel parameter gamma of 0.001, and a penalty coefficient C of the objective function of 1000. The performance of the optimal SVM model is evaluated on the test data set using AUC as the main accuracy index. While many indicators can assess the performance of machine learning algorithms, AUC is the most commonly used in clinical settings and allows comparison with other studies, so it is used as the primary measure of accuracy. For completeness, the present application also reports accuracy, sensitivity and specificity. The SVM machine learning algorithm is implemented using the scikit-learn library of Python (version 1.0), and plots are drawn using the matplotlib (version 3.4.2) and seaborn (version 0.11.1) libraries.
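Evaluating a model with the reported hyper-parameters might look like the sketch below. Only the kernel, gamma and C come from the text; the synthetic data, the stratified split and the variable names are assumptions made for illustration, so the printed numbers say nothing about the study's results.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (1 = tumor, 0 = normal); not data from the study.
X, y = make_classification(
    n_samples=150, n_features=2, n_informative=2, n_redundant=0, random_state=0
)
X = StandardScaler().fit_transform(X)  # scale first, then split, as in the text
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Hyper-parameters reported in the text: rbf kernel, gamma = 0.001, C = 1000.
model = SVC(kernel="rbf", gamma=0.001, C=1000)
model.fit(X_train, y_train)

scores = model.decision_function(X_test)  # continuous scores for the ROC curve
predicted = model.predict(X_test)

# Rows are true labels, columns are predictions, in the order [0, 1].
tn, fp, fn, tp = confusion_matrix(y_test, predicted, labels=[0, 1]).ravel()

metrics = {
    "auc": roc_auc_score(y_test, scores),
    "accuracy": accuracy_score(y_test, predicted),
    "sensitivity": tp / (tp + fn),  # true positive rate
    "specificity": tn / (tn + fp),  # true negative rate
}
print({name: round(float(value), 3) for name, value in metrics.items()})
```

AUC is computed from the continuous decision scores, whereas accuracy, sensitivity and specificity depend on the hard predictions at the default decision boundary.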
ROC curves are drawn for the training set and test set results of the support vector machine model and are shown in FIG. 3 and FIG. 4 respectively; the classification performance indexes are shown in Table 1, and the classification statistics are shown in Table 2.
TABLE 1 Performance indicators for the classifications
[Table 1 is provided as an image in the original publication.]
TABLE 2 Classification of statistical results
[Table 2 is provided as images in the original publication.]
In Table 2, tumor refers to bladder cancer specimen and normal refers to healthy specimen.
Comparing FIG. 1 with FIG. 3 and FIG. 4: when only the SNP/INDEL characteristics of the present application are used for ROC curve plotting, the AUC value is 0.819, whereas the AUC values obtained from the ROC curves of the support vector machine training set (FIG. 3) and test set (FIG. 4) are 0.952 and 0.941 respectively. The accuracy of bladder cancer prediction is therefore markedly improved after the model is constructed.
Comparing FIG. 2 with FIG. 3 and FIG. 4: when only the CNV characteristics of the present application are used for ROC curve plotting, the AUC value is 0.920, whereas the AUC values obtained from the ROC curves of the support vector machine training set (FIG. 3) and test set (FIG. 4) are 0.952 and 0.941 respectively. The accuracy of bladder cancer prediction is therefore also improved relative to the CNV characteristics alone.
These results show that the model constructed by the method integrates the information of the SNP/INDEL characteristics and the CNV characteristics into a single model and can predict bladder cancer more accurately.
A kit for early screening of bladder cancer based on the model comprises a reagent composition for extracting cfDNA, a reagent composition for gene library construction and a reagent composition for extracting gDNA.
The reagent composition for extracting cfDNA is a reagent composition capable of extracting cfDNA from urine supernatant; in this embodiment a urine supernatant cfDNA extraction kit is used, which may specifically be the southern inspection Apostle MiniMax High Efficiency cfDNA Isolation Kit. In this example, the KAPA DNA HyperPrep library construction kit (Illumina, PCR-free, 96 reactions; cat # KK8505) is used to construct the pre-library, and the SNV Panel (cat # Custom) is used together with the Twist Fast Hybridization and Wash Kit, 96 Reactions (cat # 101175), Twist Binding and Purification Beads, 96 Reactions (cat # 100984) and Twist Universal Blockers, TruSeq Compatible, High Concentration, 4 × 96 Reactions (cat # 101786) to complete the SNV Panel capture.
The reagent composition for extracting gDNA is a reagent composition capable of extracting gDNA from urine sediment cells; in this embodiment a kit for extracting gDNA from urine sediment cells is used, which may specifically be the Kangji century Urine DNA Storage Tube kit (cat # CW2657). When constructing the gene library, the extracted gDNA is built into a WGS library; in this example the KAPA DNA HyperPrep library construction kit (Illumina, PCR-free, 96 reactions; cat # KK8505) is used.
The method of using the bladder cancer early screening kit comprises the following steps:
extracting cfDNA from the sample to be screened using the reagent composition for extracting cfDNA, constructing a cfDNA library of the sample using the reagent composition for gene library construction, and sequencing the library on the instrument to obtain the cfDNA sequencing result of the sample to be screened;
extracting gDNA from the sample to be screened using the reagent composition for extracting gDNA, constructing a gDNA library of the sample using the reagent composition for gene library construction, and sequencing the library on the instrument to obtain the gDNA sequencing result of the sample to be screened;
and inputting the obtained cfDNA sequencing result and gDNA sequencing result into the model to obtain the early bladder cancer screening result.
The kit was used for bladder cancer prediction on 50 random samples; the statistical results of the validation set consisting of these 50 samples are shown in Table 3, and the performance indexes of the validation set are shown in Table 4.
TABLE 3 validation set statistics
[Table 3 is provided as an image in the original publication.]
In Table 3, tumor refers to bladder cancer specimen and normal refers to healthy specimen.
TABLE 4 Performance indices of the validation set
[Table 4 is provided as an image in the original publication.]
The specific embodiments merely explain the present application and do not limit it. After reading this specification, those skilled in the art may modify the embodiments as required without inventive contribution, and all such modifications are protected by patent law within the scope of the claims of the present application.

Claims (8)

1. A method for constructing an early screening model for bladder cancer, characterized by comprising the following steps:
S1, obtaining gDNA sequencing results of the healthy samples and the cancer samples from their respective urine sediment samples; obtaining cfDNA sequencing results of the healthy samples and the cancer samples from their respective urine supernatant samples;
S2, obtaining the coverage of bladder cancer-related mutation markers in all healthy samples and cancer samples from the cfDNA sequencing results, and finally obtaining the SNP/INDEL characteristics; using the gDNA sequencing results of part of the healthy samples to construct a baseline of CNV events, and processing the information in the baseline to obtain the CNV characteristics; classifying according to the SNP/INDEL characteristics and the CNV characteristics, and drawing ROC curves;
S3, integrating the SNP/INDEL characteristics and the CNV characteristics, and establishing the early screening model for bladder cancer using the support vector machine (SVM) machine learning algorithm.
2. The construction method according to claim 1, wherein in step S2, the step of processing the information in baseline to obtain the CNV characteristics includes:
normalizing the information in the baseline and calculating the log2ratio and Z-Score of each bin interval, where the calculation formulas are respectively:
$$\mathrm{ratio} = \frac{RC_{\text{gc-bin}}}{\operatorname{mean}(RC_{\text{gc-all-bin}})}$$
where $RC_{\text{gc-bin}}$ is the number of reads in each bin interval after GC correction,
$\operatorname{mean}(RC_{\text{gc-all-bin}})$ is the mean of the GC-corrected read counts over all bin intervals,
and log2ratio is the base-2 logarithm of the ratio, $\log_2\mathrm{ratio} = \log_2(\mathrm{ratio})$;
$$Z\text{-Score} = \frac{\mathrm{ratio} - E(\text{ref-ratio})}{\operatorname{std}(\text{ref-ratio})}$$
where $E(\text{ref-ratio})$ is the expected coverage of each bin of the CNV baseline,
and $\operatorname{std}(\text{ref-ratio})$ is the standard deviation of the coverage of each bin of the CNV baseline;
the log2ratio and Z-Score obtained for each bin are segmented using the Circular Binary Segmentation (CBS) method to give the final log2ratio and Z-Score of each segment; wherein cutoff_1 is 0.08 for segment filtering.
3. The construction method according to claim 1, wherein in step S2, the value of cutoff_2 used for classification of the CNV characteristics is 8700000.
4. The construction method of claim 1, wherein the genes on which the bladder cancer-related mutation markers are distributed comprise any one or more of gene TERT, gene TP53, gene ERBB2, gene ERCC2, gene FGFR3, gene KDM6A, gene PIK3CA, gene ARID1A, gene ERBB3, gene GATA3, gene BRCA2, gene CREBBP, gene CTNNB1, gene ELF3 and gene FH.
5. The method of claim 4, wherein the bladder cancer-associated mutant markers distributed on gene TERT include chr 5: 1295228-1295228, chr 5: 1295250-1295250 and chr 5: 1295250-1295254;
bladder cancer-associated mutant markers distributed on gene TP53 include chr 17: 7577127 and 7577127, chr 17: 7577505 and 7577505, chr 17: 7577527 and 7577527, chr 17: 7577535 and 7577535, chr 17: 7577538 and 7577538, chr 17: 7577539 and 7577539, chr 17: 7577545 and 7577545, chr 17: 7577548 and 7577548, chr 17: 7577559 and 7577559, chr 17: 7577568 and 7577568, chr 17: 7579313 and 7579313, chr 17: 7579328 and 7579328, chr 17: 7579365-: 7579391-: 7579406 and 7579406, chr 17: 7579415 and 7579415, chr 17: 7579431 and 7579431, chr 17: 7573982 and 7573982, chr 17: 7573983 and 7573983chr 17: 7577574 and 7577574, chr 17: 7577596 and 7577596, chr 17: 7577599 and 7577599, chr 17: 7578382 and 7578382, chr 17: 7578419 and 7578419, chr 17: 7578437 and 7578437, chr 17: 7578442 and 7578442, chr 17: 7578513 and 7578513, chr 17: 7578524-7578524, chr 17: 7579340-: 7577571 and 7577584;
bladder cancer-associated mutant markers distributed on the gene ERBB2 include chr 17: 37873691-37873691, chr 17: 37879658 and 37879658, chr 17: 37880220 and 37880220, chr 17: 37880257 and 37880257, chr 17: 37880261 and 37880261, chr 17: 37880265, chr 17: 37880981 and 37880981, chr 17: 37881329, chr 17: 37883131-37883131, chr 17: 37883158 and 37883158, chr 17: 37883660-37883660, chr 17: 37884073 and 37884073, chr 17: 37863323 and 37863323, chr 17: 37864656-chr 17: 37864665-one or more;
bladder cancer-associated mutant markers distributed on gene ERCC2 include chr 19: 45855817-45855817, chr 19: 45855824-45855824, chr 19: 45855835-: 45858086-: 45860556-45860556, chr 19: 45860733-45860733, chr 19: 45864859-45864881, chr 19: 45867571-45867571, chr 19: 45867584 + 45867584, chr 19: 45867687, chr 19: 45872189-45872189, chr 19: 45872213-45872213, chr 19: 45872219-45872219, chr 19: 45872362-45872362, chr 19: 45872380-45872380, chr 19: 45873425-45873425, chr 19: 45873455-45873455 and chr 19: 45873456-45873456;
bladder cancer-associated mutant markers distributed on the gene FGFR3 include chr 4: 1803564, chr 4: 1803568, chr 4: 1806089, chr 4: 1806092, chr 4: 1806099-1806099, chr 4: 1806119-1806119, chr 4: 1806153-1806153, chr 4: 1807859-1807859, chr 4: 1807889, chr 4: 1807890, chr 4: 1808916-1808916 and chr 4: 1808937-1808937;
bladder cancer-associated mutant markers distributed on gene KDM6A include chrX: 44732952 and 44732952, chrX: 44732955 and 44732955, chrX: 44733198 and 44733198, chrX: 44733200, chrX: 44833924 and 44833924, chrX: 44870233-44870240, chrX: 44913136 and 44913136, chrX: 44918316-44918316, chrX: 44918532-: 44918582-: 44918668-44918669, chrX: 44920630-44920633, chrX: 44949991-: 44950033-44950034, chrX: 44950066-44950066, chrX: 44969323-44969323 and chrX: 44969369 and 44969369;
bladder cancer-associated mutant markers distributed on gene PIK3CA include chr 3: 178916992 9-: 1789169936-1789169936, chr 3: 178936091-178936091, chr 3: 178936094-178936094, chr 3: 178936095-178936095, chr 3: 178936103-178936103, chr 3: 178938886-178938886, chr 3: 178951994-: 178952039-178952039, chr 3: 178952085-178952085, chr 3: 178952090-178952090, chr 3: 178919256 and chr 3: 178921553 and 178921553;
bladder cancer-associated mutant markers distributed over gene ARID1A include chr 1: 27099915, chr 1: 27101380-27101398, chr 1: 27101427-27101449, chr 1: 27101551-27101551, chr 1: 27101586-: 27101645 and 27101645, chr 1: 27102132-: 27102137-27102137, chr 1: 27105648 and 27105649, chr 1: 27106081-27106087, chr 1: 27106240-27106240, chr 1: 27106279-27106279, chr 1: 27106354-27106354, chr 1: 27106539-: 27107081-: 27107134-27107135;
bladder cancer-associated mutant markers distributed on the gene ERBB3 include chr 12: 56478817-: 56478851-containing 56478851, chr 12: 56478854-: 56481649-: 56481660-: 56482341-: 56482537-: 56489571-56489571, chr 12: 56489582-: 56490408-: 56492623-: 56495696-: 56495711 and 56495711;
bladder cancer-associated mutant markers distributed over the gene GATA3 include chr 10: 8100619-8100619;
bladder cancer-associated mutant markers distributed on gene BRCA2 include chr 13: 32900694, chr 13: 32910686-32910686, chr 13: 32910940-32910940, chr 13: 32912514-32912514 and chr 13: 32972638-32972638;
bladder cancer-associated mutant markers distributed on the gene CREBBP include chr 16: 3786710, chr 16: 3786726, 3786730, chr 16: 3786740-3786740 and chr 16: 3786775-3786775;
bladder cancer-associated mutant markers distributed over the gene CTNNB1 included chr 3: 41266113-41266113;
bladder cancer-associated mutant markers distributed on gene ELF3 include chr 1: 201981484. 201981494, chr 1: 201981500, chr 1: 201981537-201981537, chr 1: 201983034-201983034, chr 1: 201983035 and chr 1: 201984418-201984418;
bladder cancer-associated mutant markers distributed on gene FH include chr 1: 241663871-241663871.
6. An early screening model for bladder cancer, which is obtained by the construction method according to any one of claims 1 to 5.
7. The model of claim 6, wherein 70% of the obtained data is assigned to the training set and 30% to the test set; the kernel function of the model is the radial basis function (rbf) kernel; the kernel parameter gamma is 0.001, and the penalty coefficient C of the objective function is 1000.
8. A kit for early screening of bladder cancer based on the model of any one of claims 6 to 7, comprising a reagent composition for extraction of cfDNA, a reagent composition for gene library construction, a reagent composition for extraction of gDNA;
extracting cfDNA from the sample to be screened using the reagent composition for extracting cfDNA, constructing a cfDNA library of the sample using the reagent composition for gene library construction, and sequencing the library on the instrument to obtain the cfDNA sequencing result of the sample to be screened;
extracting gDNA from the sample to be screened using the reagent composition for extracting gDNA, constructing a gDNA library of the sample using the reagent composition for gene library construction, and sequencing the library on the instrument to obtain the gDNA sequencing result of the sample to be screened;
and inputting the obtained cfDNA sequencing result and gDNA sequencing result into the model to obtain the early bladder cancer screening result.
CN202210447648.8A 2022-04-26 2022-04-26 Early screening model for bladder cancer, construction method of early screening model, kit and use method of early screening model Active CN114566285B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210447648.8A CN114566285B (en) 2022-04-26 2022-04-26 Early screening model for bladder cancer, construction method of early screening model, kit and use method of early screening model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210447648.8A CN114566285B (en) 2022-04-26 2022-04-26 Early screening model for bladder cancer, construction method of early screening model, kit and use method of early screening model

Publications (2)

Publication Number Publication Date
CN114566285A true CN114566285A (en) 2022-05-31
CN114566285B CN114566285B (en) 2022-07-19

Family

ID=81720766

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210447648.8A Active CN114566285B (en) 2022-04-26 2022-04-26 Early screening model for bladder cancer, construction method of early screening model, kit and use method of early screening model

Country Status (1)

Country Link
CN (1) CN114566285B (en)

Also Published As

Publication number Publication date
CN114566285B (en) 2022-07-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 100176 1601, 16 / F, building 5, yard 18, Kechuang 13th Street, Beijing Economic and Technological Development Zone, Daxing District, Beijing

Patentee after: Beijing Xiangxin Biotechnology Co.,Ltd.

Country or region after: China

Patentee after: Tianjin Xiangxin Medical Laboratory Co.,Ltd.

Patentee after: Tianjin Xiangxin Medical Instrument Co.,Ltd.

Address before: 100176 1601, 16 / F, building 5, yard 18, Kechuang 13th Street, Beijing Economic and Technological Development Zone, Daxing District, Beijing

Patentee before: Beijing Xiangxin Biotechnology Co.,Ltd.

Country or region before: China

Patentee before: Tianjin Xiangxin Biotechnology Co.,Ltd.

Patentee before: Tianjin Xiangxin Medical Instrument Co.,Ltd.
