Liu et al., 2025 - Google Patents

An Adaptive and Scalable Framework for Resource-Efficient Deployment of Mixture of Experts in LLM-Based Intelligent IoT Networks

Document ID
9282675897430726629
Author
Liu C
Li Y
Chen C
Zou X
Kuang H
Ma X
Lu Z
Zhang Z
Liu J
Liu X
Publication year
2025
Publication venue
IEEE Internet of Things Journal

Snippet

The exponential growth of the Internet of Things (IoT) necessitates the deployment of large-scale models capable of processing the complex and diverse data generated by IoT devices. However, the substantial memory requirements of these models pose significant …
Continue reading at ieeexplore.ieee.org (other versions)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass
    • G06N99/005 Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/04 Inference methods or devices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/02 Knowledge representation
    • G06N5/022 Knowledge engineering, knowledge acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/02 Computer systems based on biological models using neural network models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06Q DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management, e.g. organising, planning, scheduling or allocating time, human or machine resources; Enterprise planning; Organisational models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F1/00 Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power Management, i.e. event-based initiation of power-saving mode
    • G06F1/3234 Action, measure or step performed to reduce power consumption

Similar Documents

Publication Publication Date Title
Fu et al. Client selection in federated learning: Principles, challenges, and opportunities
He et al. Large language models (LLMs) inference offloading and resource allocation in cloud-edge computing: An active inference approach
Abdel‐Basset et al. IEGA: an improved elitism‐based genetic algorithm for task scheduling problem in fog computing
Chen et al. Resource allocation with workload-time windows for cloud-based software services: a deep reinforcement learning approach
Zhang et al. DVFO: Learning-based DVFS for energy-efficient edge-cloud collaborative inference
Cheng et al. GRU-ES: Resource usage prediction of cloud workloads using a novel hybrid method
Elsedimy et al. MOTS‐ACO: An improved ant colony optimiser for multi‐objective task scheduling optimisation problem in cloud data centres
Li et al. SERAC3: Smart and economical resource allocation for big data clusters in community clouds
Liu et al. Energy‐aware task scheduling with time constraint for heterogeneous cloud datacenters
Yalla et al. Enhancing customer relationship management through intelligent and scalable cloud-based data management architectures
Chen et al. DRJOA: intelligent resource management optimization through deep reinforcement learning approach in edge computing
Behera et al. Exploring the boundaries of on-device inference: When tiny falls short, go hierarchical
Liu et al. An Adaptive and Scalable Framework for Resource-Efficient Deployment of Mixture of Experts in LLM-Based Intelligent IoT Networks
Xie et al. Multi-Container Migration Strategy Optimization for Industrial Robotics Workflow Based on Hybrid Tabu-Evolutionary Algorithm
CN119960981A (en) Heterogeneous hardware resource pool scheduling and matching method, device, electronic device, storage medium and program product for dynamic task flow
Xiong et al. A modified sine cosine algorithm for numerical optimization
Zhang et al. Dns-rec: Data-aware neural architecture search for recommender systems
Wang et al. Enabling energy-efficient and reliable neural network via neuron-level voltage scaling
Wu et al. [Retracted] FLOM: Toward Efficient Task Processing in Big Data with Federated Learning
Farooq et al. FR-EAHTS: federated reinforcement learning for enhanced task scheduling with hierarchical load balancing and dynamic power adjustment in multi-core systems
Sun et al. Ssa: A content-based sparse attention mechanism
Joshi et al. The impact of cloud computing on data science and engineering: Opportunities and challenges
Jiao et al. SRA-E-ABCO: terminal task offloading for cloud-edge-end environments
Li et al. A computational offloading algorithm for cloud-edge collaboration in smart agriculture
Yin et al. An reinforcement learning approach for allocating software resources