Bayesian Inference - Search News

LoRAX: Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

LoRAX (LoRA eXchange) is a framework that allows users to serve thousands of fine-tuned models on a single GPU, dramatically reducing the cost of serving without compromising on throughput or latency.

IEEE

Measuring and Improving the Energy Efficiency of Large Language Models Inference

Abstract: Recent improvements in the accuracy of machine learning (ML) models in the language domain have propelled their use in a multitude of products and services, touching millions of lives daily.

IEEE

Poisoning-Assisted Property Inference Attack Against Federated Learning

Abstract: Federated learning (FL) has emerged as an ideal privacy-preserving learning technique which can train a global model in a collaborative way while preserving the private data in the local.
