Distributed AI Computing

Scale your AI workloads across multiple devices with peer-to-peer distribution

How P2P Distribution Works

LocalAI leverages peer-to-peer technologies to distribute AI workloads intelligently across your network.

Instance Federation

Share complete LocalAI instances across your network for load balancing and redundancy. Perfect for scaling across multiple devices.
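As a sketch, joining a federation from the command line could look like the following (flag and subcommand names follow recent LocalAI releases; verify them against `local-ai --help` for your version):

```shell
# Sketch: start an instance that shares itself with the federation.
# TOKEN is the shared network token (see "Network Token" below).
export TOKEN="<your-network-token>"
local-ai run --p2p --federated

# On the machine acting as the entry point, start the federated
# load balancer that routes requests across shared instances:
local-ai federated
```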

Model Sharding

Split large model weights across multiple workers. Currently supported with llama.cpp backends for efficient memory usage.
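A minimal sketch of contributing a machine as a sharding worker, assuming the `local-ai worker p2p-llama-cpp-rpc` subcommand from recent LocalAI releases (check your version's CLI help):

```shell
# Join the worker network so an instance can offload part of a
# llama.cpp model's weights to this machine.
export TOKEN="<your-network-token>"
local-ai worker p2p-llama-cpp-rpc
```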

Resource Sharing

Pool computational resources from multiple devices, including your friends' machines, to handle larger workloads collaboratively.

Faster: parallel processing
Scalable: add more nodes
Resilient: fault tolerant
Efficient: resource optimization

Network Token

The network token can be used to share an instance, or to join a federation or a worker network. Below you will find examples of how to start a new instance or a worker with this token.
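For example, joining an existing network with a token might look like this (a sketch based on recent LocalAI docs; adjust flags and subcommands to your version):

```shell
# Reuse the token printed by an instance already running with --p2p
export TOKEN="<token-from-the-existing-instance>"

# Join as a full federated instance...
local-ai run --p2p --federated

# ...or as a llama.cpp worker that only contributes compute
local-ai worker p2p-llama-cpp-rpc
```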

Warning: P2P token was not specified

You have to enable P2P mode by starting LocalAI with the --p2p flag. Restart the server with --p2p to automatically generate a new token that can be used to discover other nodes. If you already have a token, specify it with export TOKEN="..". Check out the documentation for more information.
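Resolving this warning might look like the following (a sketch; the token value and exact invocation depend on your setup):

```shell
# Generate a new token automatically: restart with P2P enabled.
# The generated token is printed in the startup logs.
local-ai run --p2p

# Or reuse a token you already have:
export TOKEN=".."   # your existing network token
local-ai run --p2p
```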