Liu et al., 2025 - Google Patents
An Adaptive and Scalable Framework for Resource-Efficient Deployment of Mixture of Experts in LLM-Based Intelligent IoT Networks
- Document ID
- 9282675897430726629
- Author
- Liu C
- Li Y
- Chen C
- Zou X
- Kuang H
- Ma X
- Lu Z
- Zhang Z
- Liu J
- Liu X
- Publication year
- 2025
- Publication venue
- IEEE Internet of Things Journal
Snippet
The exponential growth of the Internet of Things (IoT) necessitates the deployment of large-scale models capable of processing the complex and diverse data generated by IoT devices. However, the substantial memory requirements of these models pose significant …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management, e.g. organising, planning, scheduling or allocating time, human or machine resources; Enterprise planning; Organisational models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power Management, i.e. event-based initiation of power-saving mode
- G06F1/3234—Action, measure or step performed to reduce power consumption
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Fu et al. | Client selection in federated learning: Principles, challenges, and opportunities | |
He et al. | Large language models (LLMs) inference offloading and resource allocation in cloud-edge computing: An active inference approach | |
Abdel‐Basset et al. | IEGA: an improved elitism‐based genetic algorithm for task scheduling problem in fog computing | |
Chen et al. | Resource allocation with workload-time windows for cloud-based software services: a deep reinforcement learning approach | |
Zhang et al. | DVFO: Learning-based DVFS for energy-efficient edge-cloud collaborative inference | |
Cheng et al. | GRU-ES: Resource usage prediction of cloud workloads using a novel hybrid method | |
Elsedimy et al. | MOTS‐ACO: An improved ant colony optimiser for multi‐objective task scheduling optimisation problem in cloud data centres | |
Li et al. | SERAC3: Smart and economical resource allocation for big data clusters in community clouds | |
Liu et al. | Energy‐aware task scheduling with time constraint for heterogeneous cloud datacenters | |
Yalla et al. | Enhancing customer relationship management through intelligent and scalable cloud-based data management architectures | |
Chen et al. | DRJOA: intelligent resource management optimization through deep reinforcement learning approach in edge computing | |
Behera et al. | Exploring the boundaries of on-device inference: When tiny falls short, go hierarchical | |
Liu et al. | An Adaptive and Scalable Framework for Resource-Efficient Deployment of Mixture of Experts in LLM-Based Intelligent IoT Networks | |
Xie et al. | Multi-Container Migration Strategy Optimization for Industrial Robotics Workflow Based on Hybrid Tabu-Evolutionary Algorithm | |
CN119960981A (en) | Heterogeneous hardware resource pool scheduling and matching method, device, electronic device, storage medium and program product for dynamic task flow | |
Xiong et al. | A modified sine cosine algorithm for numerical optimization | |
Zhang et al. | Dns-rec: Data-aware neural architecture search for recommender systems | |
Wang et al. | Enabling energy-efficient and reliable neural network via neuron-level voltage scaling | |
Wu et al. | [Retracted] FLOM: Toward Efficient Task Processing in Big Data with Federated Learning | |
Farooq et al. | FR-EAHTS: federated reinforcement learning for enhanced task scheduling with hierarchical load balancing and dynamic power adjustment in multi-core systems | |
Sun et al. | Ssa: A content-based sparse attention mechanism | |
Joshi et al. | The impact of cloud computing on data science and engineering: Opportunities and challenges | |
Jiao et al. | SRA-E-ABCO: terminal task offloading for cloud-edge-end environments | |
Li et al. | A computational offloading algorithm for cloud-edge collaboration in smart agriculture | |
Yin et al. | An reinforcement learning approach for allocating software resources |