Liu et al., 2025 - Google Patents

An Adaptive and Scalable Framework for Resource-Efficient Deployment of Mixture of Experts in LLM-Based Intelligent IoT Networks

Document ID: 9282675897430726629
Author: Liu C; Li Y; Chen C; Zou X; Kuang H; Ma X; Lu Z; Zhang Z; Liu J; Liu X
Publication year: 2025
Publication venue: IEEE Internet of Things Journal

Snippet

The exponential growth of the Internet of Things (IoT) necessitates the deployment of large-scale models capable of processing the complex and diverse data generated by IoT devices. However, the substantial memory requirements of these models pose significant …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management, e.g. organising, planning, scheduling or allocating time, human or machine resources; Enterprise planning; Organisational models
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power Management, i.e. event-based initiation of power-saving mode
- G06F1/3234—Action, measure or step performed to reduce power consumption

Similar Documents

Publication	Publication Date	Title
Fu et al.	2023	Client selection in federated learning: Principles, challenges, and opportunities
He et al.	2024	Large language models (LLMs) inference offloading and resource allocation in cloud-edge computing: An active inference approach
Abdel‐Basset et al.	2021	IEGA: an improved elitism‐based genetic algorithm for task scheduling problem in fog computing
Chen et al.	2022	Resource allocation with workload-time windows for cloud-based software services: a deep reinforcement learning approach
Zhang et al.	2024	DVFO: Learning-based DVFS for energy-efficient edge-cloud collaborative inference
Cheng et al.	2019	GRU-ES: Resource usage prediction of cloud workloads using a novel hybrid method
Elsedimy et al.	2022	MOTS‐ACO: An improved ant colony optimiser for multi‐objective task scheduling optimisation problem in cloud data centres
Li et al.	2018	SERAC3: Smart and economical resource allocation for big data clusters in community clouds
Liu et al.	2020	Energy‐aware task scheduling with time constraint for heterogeneous cloud datacenters
Yalla et al.	2018	Enhancing customer relationship management through intelligent and scalable cloud-based data management architectures
Chen et al.	2023	DRJOA: intelligent resource management optimization through deep reinforcement learning approach in edge computing
Behera et al.	2025	Exploring the boundaries of on-device inference: When tiny falls short, go hierarchical
Liu et al.	2025	An Adaptive and Scalable Framework for Resource-Efficient Deployment of Mixture of Experts in LLM-Based Intelligent IoT Networks
Xie et al.	2024	Multi-Container Migration Strategy Optimization for Industrial Robotics Workflow Based on Hybrid Tabu-Evolutionary Algorithm
CN119960981A (en)	2025-05-09	Heterogeneous hardware resource pool scheduling and matching method, device, electronic device, storage medium and program product for dynamic task flow
Xiong et al.	2024	A modified sine cosine algorithm for numerical optimization
Zhang et al.	2024	Dns-rec: Data-aware neural architecture search for recommender systems
Wang et al.	2020	Enabling energy-efficient and reliable neural network via neuron-level voltage scaling
Wu et al.	2022	[Retracted] FLOM: Toward Efficient Task Processing in Big Data with Federated Learning
Farooq et al.	2025	FR-EAHTS: federated reinforcement learning for enhanced task scheduling with hierarchical load balancing and dynamic power adjustment in multi-core systems
Sun et al.	2022	Ssa: A content-based sparse attention mechanism
Joshi et al.	2024	The impact of cloud computing on data science and engineering: Opportunities and challenges
Jiao et al.	2024	SRA-E-ABCO: terminal task offloading for cloud-edge-end environments
Li et al.	2024	A computational offloading algorithm for cloud-edge collaboration in smart agriculture
Yin et al.	2023	An reinforcement learning approach for allocating software resources