[go: up one dir, main page]

MX2019009680A - Metodo y aparato para la representacion compacta de datos de bioinformatica mediante el uso de multiples descriptores genomicos. - Google Patents

Metodo y aparato para la representacion compacta de datos de bioinformatica mediante el uso de multiples descriptores genomicos.

Info

Publication number
MX2019009680A
MX2019009680A MX2019009680A MX2019009680A MX2019009680A MX 2019009680 A MX2019009680 A MX 2019009680A MX 2019009680 A MX2019009680 A MX 2019009680A MX 2019009680 A MX2019009680 A MX 2019009680A MX 2019009680 A MX2019009680 A MX 2019009680A
Authority
MX
Mexico
Prior art keywords
data
compact representation
multiple genomic
bioinformatics data
class
Prior art date
Application number
MX2019009680A
Other languages
English (en)
Inventor
Zoia Giorgio
Renzi Daniele
Khoso Baluch Mohamed
Alberti Claudio
Original Assignee
Genomsys Sa
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from PCT/US2017/017842 external-priority patent/WO2018071055A1/en
Application filed by Genomsys Sa filed Critical Genomsys Sa
Publication of MX2019009680A publication Critical patent/MX2019009680A/es

Links

Classifications

    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/50Compression of genetic data
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B45/00ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/58Random or pseudo-random number generators
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10Sequence alignment; Homology search
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/10Ontologies; Annotations
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/30Data warehousing; Computing architectures
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B99/00Subject matter not provided for in other groups of this subclass
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0861Generation of secret information including derivation or calculation of cryptographic keys or passwords
    • H04L9/0866Generation of secret information including derivation or calculation of cryptographic keys or passwords involving user or device identifiers, e.g. serial number, physical or biometrical information, DNA, hand-signature or measurable physical characteristics
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/30Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy
    • H04L9/3066Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy involving algebraic varieties, e.g. elliptic or hyper-elliptic curves
    • H04L9/3073Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy involving algebraic varieties, e.g. elliptic or hyper-elliptic curves involving pairings, e.g. identity based encryption [IBE], bilinear mappings or bilinear pairings, e.g. Weil or Tate pairing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/30Compression, e.g. Merkle-Damgard construction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/34Encoding or coding, e.g. Huffman coding or error correction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/88Medical equipments

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Bioethics (AREA)
  • General Engineering & Computer Science (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Epidemiology (AREA)
  • Public Health (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer Security & Cryptography (AREA)
  • Mathematical Optimization (AREA)
  • Signal Processing (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Genetics & Genomics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Molecular Biology (AREA)
  • Computational Mathematics (AREA)
  • Human Computer Interaction (AREA)

Abstract

Método y aparato para la compresión de datos de secuencias de genomas producidos por máquinas de secuenciación del genoma. Las lecturas de secuencias se codifican mediante su alineación con respecto a secuencias de referencia preexistentes o construidas, en donde el proceso de codificación está compuesto por una clasificación de las lecturas en clases de datos, a lo cual sigue la codificación de cada clase en términos de una multiplicidad de bloques de descriptores. Se utilizan modelos fuente y codificadores de entropía específicos para cada clase de datos en los que se dividen los datos y para cada bloque de descriptor asociado.
MX2019009680A 2017-02-14 2018-02-14 Metodo y aparato para la representacion compacta de datos de bioinformatica mediante el uso de multiples descriptores genomicos. MX2019009680A (es)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
PCT/US2017/017842 WO2018071055A1 (en) 2016-10-11 2017-02-14 Method and apparatus for the compact representation of bioinformatics data
PCT/US2017/041591 WO2018071080A2 (en) 2016-10-11 2017-07-11 Method and systems for the representation and processing of bioinformatics data using reference sequences
PCT/US2018/018092 WO2018152143A1 (en) 2017-02-14 2018-02-14 Method and apparatus for the compact representation of bioinformatics data using multiple genomic descriptors

Publications (1)

Publication Number Publication Date
MX2019009680A true MX2019009680A (es) 2019-10-09

Family

ID=68609803

Family Applications (1)

Application Number Title Priority Date Filing Date
MX2019009680A MX2019009680A (es) 2017-02-14 2018-02-14 Metodo y aparato para la representacion compacta de datos de bioinformatica mediante el uso de multiples descriptores genomicos.

Country Status (10)

Country Link
EP (1) EP3583500A4 (es)
KR (1) KR102733786B1 (es)
AU (1) AU2018221458B2 (es)
CA (1) CA3052824A1 (es)
EA (1) EA201991908A1 (es)
IL (1) IL268651A (es)
MX (1) MX2019009680A (es)
SG (1) SG11201907418YA (es)
WO (1) WO2018152143A1 (es)
ZA (1) ZA201905921B (es)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110189830B (zh) * 2019-05-24 2021-06-08 杭州火树科技有限公司 基于机器学习的电子病历词库训练方法
EP3896698A1 (en) 2020-04-15 2021-10-20 Genomsys SA Method and system for the efficient data compression in mpeg-g
KR102497634B1 (ko) * 2020-12-21 2023-02-08 부산대학교 산학협력단 문자 빈도 기반 서열 재정렬을 통한 fastq 데이터 압축 방법 및 장치
CN116206687A (zh) * 2022-12-30 2023-06-02 深圳百人科技有限公司 一种模糊匹配的k-mer编码方式

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020159625A1 (en) * 2001-04-02 2002-10-31 Cytoprint, Inc. Method and apparatus for discovering, identifying and comparing biological activity mechanisms
US7698067B2 (en) * 2002-02-12 2010-04-13 International Business Machines Corporation Sequence pattern descriptors for transmembrane structural details
US7809765B2 (en) * 2007-08-24 2010-10-05 General Electric Company Sequence identification and analysis
KR101922129B1 (ko) * 2011-12-05 2018-11-26 삼성전자주식회사 차세대 시퀀싱을 이용하여 획득된 유전 정보를 압축 및 압축해제하는 방법 및 장치
US9679104B2 (en) * 2013-01-17 2017-06-13 Edico Genome, Corp. Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform
CN103336916B (zh) * 2013-07-05 2016-04-06 中国科学院数学与系统科学研究院 一种测序序列映射方法及系统
US10902937B2 (en) * 2014-02-12 2021-01-26 International Business Machines Corporation Lossless compression of DNA sequences

Also Published As

Publication number Publication date
NZ757185A (en) 2021-05-28
EP3583500A1 (en) 2019-12-25
AU2018221458B2 (en) 2022-12-08
WO2018152143A1 (en) 2018-08-23
KR102733786B1 (ko) 2024-11-26
EP3583500A4 (en) 2020-12-16
CA3052824A1 (en) 2018-08-23
SG11201907418YA (en) 2019-09-27
IL268651A (en) 2019-10-31
KR20190113971A (ko) 2019-10-08
ZA201905921B (en) 2021-05-26
EA201991908A1 (ru) 2020-01-21
AU2018221458A1 (en) 2019-10-03

Similar Documents

Publication Publication Date Title
PH12019501879A1 (en) Method and apparatus for the compact representation of bioinformatics data using multiple genomic descriptors
MX2019009680A (es) Metodo y aparato para la representacion compacta de datos de bioinformatica mediante el uso de multiples descriptores genomicos.
MX2024009279A (es) Metodo de codificacion de datos tridimensionales, metodo de decodificacion de datos tridimensionales, dispositivo codificador de datos tridimensionales y dispositivo decodificador de datos tridimensionales.
MX2024009494A (es) Metodo de codificacion de datos tridimensionales, metodo de decodificacion de datos tridimensionales, dispositivo de codificacion de datos tridimensionales y dispositivo de decodificacion de datos tridimensionales.
MX2024007272A (es) Metodo de codificacion de datos tridimensionales, metodo de decodificacion de datos tridimensionales, dispositivo de codificacion de datos tridimensionales y dispositivo de decodificacion de datos tridimensionales.
PH12019500791A1 (en) Efficient data structures for bioinformatics information presentation
EP4307592A3 (en) Bit allocation for encoding and decoding
CO2019009919A2 (es) Método y sistemas para la compresión eficiente de lecturas de secuencias genómicas
MY188593A (en) Deriving motion information for sub-blocks in video coding
BR112013031624A8 (pt) Método de decodificação de imagem, método de codificação de imagem, dispositivo de decodificação de imagem, dispositivo de codificação de imagem, e dispositivo de codificação/decodificação de imagem
SA519401514B1 (ar) طريقة وجهاز لضغط تمثيل بيانات المعلومات البيولوجية
EA201991906A1 (ru) Способ и системы для восстановления геномных референсных последовательностей из сжатых прочтений геномной последовательности
MY166213A (en) Picture coding device, picture coding method, and picture coding program
MX2019009681A (es) Metodo y sistemas para la compresion eficiente de lecturas de secuencias genomicas.
SG11202000106WA (en) Method and device for comparing media features
TW201612895A (en) Method and apparatus for coding or decoding subband configuration data for subband groups
MX392029B (es) Codificador, decodificador, método de codificación, método de decodificación, y programa de compresión de representaciones visuales.
PE20191228A1 (es) Metodo y aparato para representacion compacta de datos bioinformaticos
BR112016029539A2 (pt) sistemas e métodos para cópia intrabloco
AR105057A1 (es) Derivación de información de movimiento para subbloques en la codificación de video