Konda, 2022 - Google Patents
The impact of data preprocessing on data mining outcomesKonda, 2022
View PDF- Document ID
- 17355004608392306506
- Author
- Konda B
- Publication year
- Publication venue
- World Journal of Advanced Research and Reviews
External Links
Snippet
Data preprocessing is a vital initial step during knowledge discovery because it determines the success of data mining projects. A dataset's quality and representation stand as the primary element because any presence of redundant, irrelevant, too noisy, or unreliable …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30533—Other types of queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30289—Database design, administration or maintenance
- G06F17/30303—Improving data quality; Data cleansing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6232—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
- G06K9/6247—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6279—Classification techniques relating to the number of classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
- G06K9/6269—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches based on the distance between the decision surface and training patterns lying on the boundary of the class cluster, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management, e.g. organising, planning, scheduling or allocating time, human or machine resources; Enterprise planning; Organisational models
- G06Q10/063—Operations research or analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Braiek et al. | On testing machine learning programs | |
US11899800B2 (en) | Open source vulnerability prediction with machine learning ensemble | |
US20220253856A1 (en) | System and method for machine learning based detection of fraud | |
Nair et al. | Covariate shift: A review and analysis on classifiers | |
WO2019200480A1 (en) | Method and system for model auto-selection using an ensemble of machine learning models | |
US10127694B2 (en) | Enhanced triplet embedding and triplet creation for high-dimensional data visualizations | |
Kaur et al. | An empirical study of software entropy based bug prediction using machine learning | |
US11037073B1 (en) | Data analysis system using artificial intelligence | |
Konda | The impact of data preprocessing on data mining outcomes | |
Bosnić et al. | Enhancing data stream predictions with reliability estimators and explanation | |
GB2620586A (en) | Method and system of generating causal structure | |
Brugnano et al. | An entropy-based approach for a robust least squares spline approximation | |
Bashar et al. | Algan: Time series anomaly detection with adjusted-lstm gan | |
Mehta et al. | An ensemble voting classification approach for software defects prediction | |
Yılmaz et al. | Price prediction using web scraping and machine learning algorithms in the used car market | |
JP6629682B2 (en) | Learning device, classification device, classification probability calculation device, and program | |
Kim et al. | Cafo: Feature-centric explanation on time series classification | |
Kuang et al. | Performance Characterization of Clusterwise Linear Regression Algorithms | |
US20230267363A1 (en) | Machine learning with periodic data | |
Hwu et al. | Markov-switching models with unknown error distributions: identification and inference within the bayesian framework | |
WO2020167156A1 (en) | Method for debugging a trained recurrent neural network | |
Jin et al. | Self-supervised learning for anomaly detection in computational workflows | |
CA3108609A1 (en) | System and method for machine learning based detection of fraud | |
Zhang et al. | An improved sequential importance sampling method for structural reliability analysis of high dimensional problems | |
US20250117697A1 (en) | System and method for configuring an artificial intelligence pipeline |