NZ743311B2 - Genomic infrastructure for on-site or cloud-based DNA and RNA processing and analysis
- Publication number
- NZ743311B2
- Authority
- NZ
- New Zealand
- Prior art keywords
- dna
- rna
- processing
- data
- genomic
- Prior art date
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
- G16B30/10—Sequence alignment; Homology search
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/10—Ontologies; Annotations
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/30—Data warehousing; Computing architectures
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/50—Compression of genetic data
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
A system, method and apparatus for executing a sequence analysis pipeline on genetic sequence data includes an integrated circuit formed of a set of hardwired digital logic circuits that are interconnected by physical electrical interconnects. One of the physical electrical interconnects forms an input to the integrated circuit connected with an electronic data source for receiving reads of genomic data. The hardwired digital logic circuits are arranged as a set of processing engines, each processing engine being formed of a subset of the hardwired digital logic circuits to perform one or more steps in the sequence analysis pipeline on the reads of genomic data. Each subset of the hardwired digital logic circuits is formed in a wired configuration to perform the one or more steps in the sequence analysis pipeline.
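As a rough software analogy only, the chain of processing engines can be pictured as a sequence of functions, one per pipeline step. The sketch below uses the mapping, aligning and sorting steps named in the claims; the k-mer seed index, the ungapped mismatch count, the seed length `K` and all function names are assumptions made for illustration and are not taken from the patent, which describes hardwired digital logic rather than software.

```python
from collections import defaultdict

K = 4  # hypothetical seed length, kept small for the toy example


def build_index(reference: str, k: int = K) -> dict:
    """Index every k-mer of the reference by its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read: str, index: dict, k: int = K) -> list:
    """Mapping step: look up candidate positions for the read's first k-mer."""
    return list(index.get(read[:k], []))


def align_read(read: str, reference: str, candidates: list):
    """Alignment step: keep the candidate with the fewest mismatches (no gaps)."""
    best = None
    for pos in candidates:
        window = reference[pos:pos + len(read)]
        if len(window) < len(read):
            continue  # the read would run past the end of the reference
        mismatches = sum(a != b for a, b in zip(read, window))
        if best is None or mismatches < best[1]:
            best = (pos, mismatches)
    return best


def run_pipeline(reads: list, reference: str) -> list:
    """Chain the steps: map, align, then sort by reference position."""
    index = build_index(reference)
    aligned = []
    for read in reads:
        hit = align_read(read, reference, map_read(read, index))
        if hit is not None:  # reads with no candidate position are dropped
            aligned.append({"read": read, "pos": hit[0], "mismatches": hit[1]})
    return sorted(aligned, key=lambda rec: rec["pos"])
```

Calling `run_pipeline(["TTAGC", "ACGTT"], "ACGTACGTTAGC")` returns the two reads ordered by their reference positions. Each function stands in for one engine; in the arrangement the abstract describes, each step is instead fixed in a wired configuration of logic circuits.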
Claims (12)
1. A computer-implemented method for onsite or cloud-based DNA or RNA processing and analysis, the method comprising: providing a platform application programming interface (API) defining an input for receiving result data from a secondary processing of a plurality of reads of DNA, RNA or genomic sequence data from a subject, providing a bioinformatics processing platform having a memory that stores one or more DNA or RNA reference sequences, and having an integrated circuit formed of a set of pre-configured hardwired digital logic circuits that are interconnected by a plurality of physical electrical interconnects, the integrated circuit having an input for receiving a plurality of reads of DNA or RNA data, and having a memory interface to access the one or more DNA or RNA reference sequences, the hardwired digital logic circuits being arranged as a set of processing engines that are each formed of a subset of the hardwired digital logic circuits to perform one pre-configured step of secondary processing on the plurality of reads of DNA or RNA data, wherein secondary processing comprises receiving the plurality of reads of genomic sequence data and one or more DNA or RNA reference sequences and processing the plurality of reads of genomic sequence data to map and align at least some of the plurality of reads of genomic sequence data according to the one or more DNA or RNA reference sequences, the integrated circuit further having an output to output result data from the secondary processing according to the platform application programming interface (API), wherein the integrated circuit is physically integrated with an automated sequencer; and providing a plurality of user-selectable DNA, RNA or genomic processing pipelines, each having an input defined according to the platform API to receive the result data from the secondary processing, the plurality of DNA, RNA or genomic processing pipelines having a common pipeline API defining tertiary processing operations on the result data from the secondary processing received according to the platform API, wherein tertiary processing comprises performing one or more analyses on the subject’s genetic makeup determined by the secondary processing and each of the plurality of DNA, RNA or genomic processing pipelines is configured to perform a subset of the tertiary processing operations, and executing a user-selected set of the DNA, RNA or genomic processing pipelines, wherein the user-selected set of the DNA, RNA or genomic processing pipelines are configured to output result data of the tertiary processing according to the pipeline API to one or more user-selectable DNA, RNA or genomic analysis applications for additional processing for disease diagnostic, therapeutic treatment and/or prophylactic prevention.
2. The method of claim 1, further comprising: providing a plurality of user-selectable DNA, RNA or genomic analysis applications that are stored in one or more application repositories, each of a selected set of the plurality of DNA, RNA or genomic analysis applications being accessible from an onsite or cloud-based application repository by a computer via an electronic medium for execution by a computer processor to perform a targeted analysis of DNA, RNA or genomic data from the result data of the tertiary processing, each of the plurality of genomic analysis applications being defined by an application API for receiving the result data of the tertiary processing, performing the targeted analysis of the DNA, RNA or genomic data from the result data of the tertiary processing, and outputting the result data from the targeted analysis to one of one or more genomic databases according to the application API.
3. The method of claim 1 or claim 2, further comprising executing, using a computer processor, one or more user-selected DNA, RNA or genomic analysis applications.
4. The method of any one of claims 1 to 3, wherein the plurality of user-selectable genomic processing pipelines are selected from a set of DNA or RNA pipelines that consist of: a genome processing pipeline, an epigenome processing pipeline, a metagenome processing pipeline, a joint genotyping processing pipeline, and a genome analysis tool kit (GATK) processing pipeline.
5. The method of claim 4, wherein the plurality of user-selectable genomic analysis applications are selected from a set of genomic analysis applications that consist of: a non-invasive prenatal testing application, a neo-natal intensive care unit application, a cancer analysis application, a laboratory developed test (LDT) application, and an agricultural and biological analysis application.
6. The method of any one of claims 1 to 5, wherein the memory stores the plurality of reads of DNA or RNA data and the DNA or RNA reference sequence data; and the set of pre-configured hardwired digital logic circuits are comprised in a field programmable gate array (FPGA), one or more of the plurality of physical electrical interconnects comprising a memory interface to access the memory, the set of processing engines comprising a mapping module in a first hardwired configuration to access one or more of the plurality of reads of DNA or RNA data and the DNA or RNA reference sequence data, compare the sequence of nucleotides in at least one of the plurality of reads of DNA or RNA data to the sequence of nucleotides of the DNA or RNA reference sequence data to map the one or more of the plurality of reads of DNA or RNA data to the DNA or RNA reference sequence data so as to produce one or more mapped DNA or RNA reads.
7. The method of claim 6, wherein the FPGA further comprises a second hardwired configuration to access at least one of the mapped reads of DNA or RNA data and the DNA or RNA reference sequence data, compare the sequence of nucleotides in at least one of the mapped reads of DNA or RNA data to the sequence of nucleotides of the DNA or RNA reference sequence data to align the one or more mapped reads of DNA or RNA data to the DNA or RNA reference sequence data.
8. The method of claim 7, wherein the FPGA further comprises a sorting module in a third hardwired configuration to sort the mapped and aligned DNA or RNA reads.
9. The method of any one of claims 1 to 8, wherein the result data from the secondary processing includes reads of genomic data.
10. The method of claim 9, wherein the result data from the secondary processing includes mapped and aligned reads from the plurality of reads of genomic data.
11. The method of claim 10, wherein the result data from the secondary processing includes one or more variant call files generated from the mapped and aligned reads.
12. The method of any one of claims 1 to 11, wherein the memory further stores one or more index of the one or more DNA or RNA reference sequences, and the hardwired digital logic circuits are arranged as a set of processing engines that are each formed of a subset of the hardwired digital logic circuits to perform one pre-configured step of secondary processing on the plurality of reads of DNA or RNA data according to the DNA or RNA reference sequences and the index.
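The layering recited in claims 1 and 2 (secondary result data delivered under a platform API, user-selected pipelines sharing a common pipeline API for tertiary processing, and analysis applications consuming the tertiary output) can be illustrated with a minimal sketch. Everything concrete here is hypothetical: the `SecondaryResult` fields, the registry, the toy pipeline bodies and the report function are assumptions chosen only to show the interface shape, not the patent's actual data formats or analyses.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SecondaryResult:
    """Hypothetical shape of result data passed under the platform API."""
    sample_id: str
    aligned_reads: List[dict] = field(default_factory=list)
    variants: List[dict] = field(default_factory=list)


# Common pipeline API (assumed): a pipeline takes a SecondaryResult, returns a dict.
Pipeline = Callable[[SecondaryResult], dict]
PIPELINES: Dict[str, Pipeline] = {}


def register(name: str):
    """Add a pipeline to the registry under a user-selectable name."""
    def wrap(fn: Pipeline) -> Pipeline:
        PIPELINES[name] = fn
        return fn
    return wrap


@register("genome")
def genome_pipeline(result: SecondaryResult) -> dict:
    # Toy tertiary operation: count the called variants.
    return {"variant_count": len(result.variants)}


@register("joint_genotyping")
def joint_genotyping_pipeline(result: SecondaryResult) -> dict:
    # Toy tertiary operation: list the distinct genes carrying a variant.
    return {"genes": sorted({v["gene"] for v in result.variants})}


def run_selected(result: SecondaryResult, selected: List[str]) -> dict:
    """Execute only the user-selected pipelines on the secondary result."""
    return {name: PIPELINES[name](result) for name in selected}


def report_application(tertiary: dict) -> str:
    """Hypothetical analysis application consuming tertiary output."""
    return "; ".join(f"{name}: {out}" for name, out in sorted(tertiary.items()))
```

Because every pipeline accepts the same input type and returns the same kind of value, a new pipeline or application can be added without changing the secondary processing stage, which is the design point the shared APIs in the claims appear to serve.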
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| NZ784189A NZ784189B2 (en) | | 2017-01-11 | Genomic infrastructure for on-site or cloud-based dna and rna processing and analysis |
| NZ784186A NZ784186B2 (en) | | 2017-01-11 | Genomic infrastructure for on-site or cloud-based dna and rna processing and analysis |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201662277445P | 2016-01-11 | 2016-01-11 | |
| PCT/US2017/013057 WO2017123664A1 (en) | 2016-01-11 | 2017-01-11 | Genomic infrastructure for on-site or cloud-based dna and rna processing and analysis |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| NZ743311A NZ743311A (en) | 2024-05-31 |
| NZ743311B2 NZ743311B2 (en) | 2024-09-03 |