Grafberger et al., 2024 - Google Patents
Towards Interactively Improving ML Data Preparation Code via" Shadow Pipelines"Grafberger et al., 2024
View PDF- Document ID
- 12167265437447726824
- Author
- Grafberger S
- Groth P
- Schelter S
- Publication year
- Publication venue
- Proceedings of the Eighth Workshop on Data Management for End-to-End Machine Learning
External Links
Snippet
Data scientists develop ML pipelines in an iterative manner: they repeatedly screen a pipeline for potential issues, debug it, and then revise and improve its code according to their findings. However, this manual process is tedious and error-prone. Therefore, we …
- 238000002360 preparation method 0 title description 6
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30289—Database design, administration or maintenance
- G06F17/30303—Improving data quality; Data cleansing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3604—Software analysis for verifying properties of programs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
- G06F8/75—Structural analysis for program understanding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US12141144B2 (en) | Column lineage and metadata propagation | |
Wang et al. | Amalgam+: Composing rich information sources for accurate bug localization | |
CN113656805B (en) | A method and system for automatically constructing event graphs for multi-source vulnerability information | |
CN114861194B (en) | Multi-type vulnerability detection method based on BGRU and CNN fusion model | |
US11243835B1 (en) | Message-based problem diagnosis and root cause analysis | |
Marmolejos et al. | On the use of textual feature extraction techniques to support the automated detection of refactoring documentation | |
Zhong et al. | Inferring specifications for resources from natural language API documentation | |
Grafberger et al. | Automating and optimizing data-centric what-if analyses on native machine learning pipelines | |
Tsuchiya et al. | Recovering traceability links between requirements and source code using the configuration management log | |
US20250004915A1 (en) | Generative artificial intelligence driven software fixing | |
Nam et al. | A bug finder refined by a large set of open-source projects | |
Saini et al. | Towards automating precision studies of clone detectors | |
Grafberger et al. | Towards Interactively Improving ML Data Preparation Code via" Shadow Pipelines" | |
CN119378494A (en) | An entity relationship extraction method and system for building knowledge graphs in the financial field | |
Liu et al. | Automatically recommend code updates: Are we there yet? | |
Hu et al. | Assessing and Advancing Benchmarks for Evaluating Large Language Models in Software Engineering Tasks | |
Vitale et al. | A catalog of data smells for coding tasks | |
Ge et al. | Hidden code vulnerability detection: A study of the Graph-BiLSTM algorithm | |
CN114546699A (en) | Newly-added software defect positioning method based on query expansion | |
Wang et al. | WheaCha: A method for explaining the predictions of models of code | |
Salman | Test case generation from specifications using natural language processing | |
Chopra et al. | Transductive Instance Transfer Learning for Cross-Language Defect Prediction | |
Haas | Protocol to discover machine-readable entities of the ecosystem management actions taxonomy | |
Lutellier | Machine Learning for Software Dependability | |
Zhou et al. | Software Vulnerability Detection via Multimodal Deep Learning |