WO2023034328A3 - Correlating parallelized data from disparate data sources to aggregate graph data portions to predictively identify entity data - Google Patents

Info

Publication number
WO2023034328A3
Authority
WO
WIPO (PCT)
Prior art keywords
data
parallelized
correlating
graph
portions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2022/042077
Other languages
French (fr)
Other versions
WO2023034328A2 (en)
Inventor
Shawn Andrew Pardue Smith
Bryon Kristen Jacob
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Data World Inc
Original Assignee
Data World Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/461,982 (US11755602B2)
Application filed by Data World Inc
Publication of WO2023034328A2
Publication of WO2023034328A3
Anticipated expiration
Current legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Debugging And Monitoring (AREA)

Abstract

Various embodiments relate generally to data science and data analysis, computer software and systems, and data-driven control systems and algorithms based on graph-based data arrangements, among other things, and, more specifically, to a computing platform configured to receive or analyze datasets in parallel by implementing, for example, parallel computing processor systems to correlate subsets of parallelized data from disparately-formatted data sources to identify entity data and to aggregate graph data portions. In some examples, a method may include classifying parallelized data to identify a class of observation data, constructing one or more content graphs in a graph data format, correlating parallelized data to other subsets of parallelized data associated with a class of observation data, and aggregating observation data to represent an individual entity.
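The four steps named in the abstract (classify, construct graphs, correlate, aggregate) can be illustrated with a small Python sketch. This is not the claimed implementation: the function names, the use of a thread pool for parallelism, the regex-based classifier, and the choice of an email address as the correlating observation are all assumptions made for illustration only.

```python
import re
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical classifier pattern: treats anything shaped like an email as one class.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def classify(value):
    """Assign an (assumed) observation class to a raw string value."""
    return "email" if EMAIL_RE.match(value) else "text"


def to_triples(source_name, rows):
    """Build a tiny content graph: (subject, class, normalized value) triples."""
    triples = []
    for i, row in enumerate(rows):
        subject = f"{source_name}/row{i}"
        for value in row.values():
            triples.append((subject, classify(value), value.strip().lower()))
    return triples


def resolve_entities(sources):
    """Correlate rows across sources and aggregate them per entity."""
    # Each source is converted to triples in its own worker.
    with ThreadPoolExecutor() as pool:
        graphs = list(pool.map(lambda kv: to_triples(*kv), sources.items()))
    triples = [t for graph in graphs for t in graph]

    # Correlate: rows that share the same email observation are linked.
    by_email = defaultdict(set)
    for subject, cls, value in triples:
        if cls == "email":
            by_email[value].add(subject)

    # Aggregate: gather every observation from the linked rows under one entity.
    return {
        email: sorted({v for s, _, v in triples if s in subjects})
        for email, subjects in by_email.items()
    }


if __name__ == "__main__":
    sources = {
        "crm": [{"name": "Ada Lovelace", "mail": "ADA@example.com"}],
        "billing": [{"contact": "ada@example.com", "plan": "pro"}],
    }
    print(resolve_entities(sources))
```

In this sketch the two differently formatted sources use different column names, yet both rows end up under one entity because their email observations match after normalization. A fuller system would presumably need transitive merging and probabilistic matching, which are omitted here.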

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US17/461,982 2021-08-30
US17/461,982 US11755602B2 (en) 2016-06-19 2021-08-30 Correlating parallelized data from disparate data sources to aggregate graph data portions to predictively identify entity data

Publications (2)

Publication Number Publication Date
WO2023034328A2 WO2023034328A2 (en) 2023-03-09
WO2023034328A3 WO2023034328A3 (en) 2023-04-13

Family

ID=85413050

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/042077 Ceased WO2023034328A2 (en) 2021-08-30 2022-08-30 Correlating parallelized data from disparate data sources to aggregate graph data portions to predictively identify entity data

Country Status (1)

Country Link
WO (1) WO2023034328A2 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10515085B2 (en) 2016-06-19 2019-12-24 Data.World, Inc. Consolidator platform to implement collaborative datasets via distributed computer networks
US10853376B2 (en) 2016-06-19 2020-12-01 Data.World, Inc. Collaborative dataset consolidation via distributed computer networks
US11068453B2 (en) 2017-03-09 2021-07-20 data.world, Inc Determining a degree of similarity of a subset of tabular data arrangements to subsets of graph data arrangements at ingestion into a data-driven collaborative dataset platform
US11888910B1 (en) * 2022-09-15 2024-01-30 Neptyne Inc System to provide a joint spreadsheet and electronic notebook interface

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030229652A1 (en) * 2000-02-28 2003-12-11 Reuven Bakalash Enterprise-wide data-warehouse with integrated data aggregation engine
US20080162409A1 (en) * 2006-12-27 2008-07-03 Microsoft Corporation Iterate-aggregate query parallelization
US20120011144A1 (en) * 2010-07-12 2012-01-12 Frederik Transier Aggregation in parallel computation environments with shared memory
US20140143760A1 (en) * 2012-11-16 2014-05-22 Ab Initio Technology Llc Dynamic graph performance monitoring
US20140297665A1 (en) * 2013-03-15 2014-10-02 Akuda Labs Llc Optimization for Real-Time, Parallel Execution of Models for Extracting High-Value Information from Data Streams

Also Published As

Publication number Publication date
WO2023034328A2 (en) 2023-03-09

Similar Documents

Publication Publication Date Title
WO2023034328A3 (en) Correlating parallelized data from disparate data sources to aggregate graph data portions to predictively identify entity data
Narayan et al. Assessing single-cell transcriptomic variability through density-preserving data visualization
KR102556497B1 (en) Unbiased data using machine learning models
US11734315B2 (en) Method and system for implementing efficient classification and exploration of data
de Andrade et al. ENMTML: An R package for a straightforward construction of complex ecological niche models
US10191968B2 (en) Automated data analysis
Pontes et al. Quality measures for gene expression biclusters
US11556852B2 (en) Efficient ground truth annotation
Pérez-Ortega et al. Balancing effort and benefit of K-means clustering algorithms in Big Data realms
Booeshaghi et al. Depth normalization for single-cell genomics count data
US20230259710A1 (en) Dynamic attribute extraction systems and methods for artificial intelligence platform
US12019902B2 (en) Data lineage in a data pipeline
CN116109121B (en) User demand mining method and system based on big data analysis
Lopez et al. Bayesian inference for a generative model of transcriptome profiles from single-cell RNA sequencing
Wagner et al. Accurate denoising of single-cell RNA-Seq data using unbiased principal component analysis
Alexander et al. Capturing discrete latent structures: choose LDs over PCs
Lim et al. Quantifying the clusterness and trajectoriness of single-cell RNA-seq data
Suh et al. Clutch: A clustering-driven runtime estimation scheme for scientific simulations
US11507447B1 (en) Supervised graph-based model for program failure cause prediction using program log files
Singh et al. Predicting the popularity of rumors in social media using machine learning
Huang et al. Efficient learning with projected histograms
Gorin et al. RNA velocity and protein acceleration from single-cell multiomics experiments
Schultheiß et al. Scalable N-Way Model Matching Using Multi-Dimensional Search Trees-Summary
Torregrosa-Cortes et al. Single-cell Bayesian deconvolution
Chothani et al. Benchmarking Mutual Information-based Loss Functions in Federated Learning

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22865449

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 22865449

Country of ref document: EP

Kind code of ref document: A2