LocalAI leverages cutting-edge peer-to-peer technologies to distribute AI workloads intelligently across your network.
Share complete LocalAI instances across your network for load balancing and redundancy. Perfect for scaling across multiple devices.
Split large model weights across multiple workers. Currently supported with the llama.cpp backend for efficient memory usage.
Pool computational resources from multiple devices, including your friends' machines, to handle larger workloads collaboratively.
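As a sketch of the model-sharding mode, a worker can be started on each machine that should hold part of the weights. The p2p-llama-cpp-rpc subcommand name below is an assumption based on common LocalAI usage and may differ by version; verify with your CLI's help output.

```shell
# On each worker machine (subcommand name is an assumption --
# check `local-ai worker --help` on your version):
export TOKEN="<token printed by the main instance>"
local-ai worker p2p-llama-cpp-rpc

# On the main instance, enable P2P so workers can be discovered:
local-ai run --p2p
```

With this layout, the main instance coordinates inference while each worker serves a shard of the model over llama.cpp's RPC mechanism.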
Parallel processing
Add more nodes
Fault tolerant
Resource optimization
The network token can be used either to share the instance or to join a federation or a worker network. Below you will find examples of how to start a new instance or a worker with this token.
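For instance, joining with an existing token might look like the following. This is a minimal sketch: the --federated flag and worker subcommand are assumptions based on common LocalAI usage, not confirmed by this page.

```shell
# Reuse an existing network token (keep your own value here):
export TOKEN=".."

# Share this instance as part of a federation
# (--federated is an assumed flag name):
local-ai run --p2p --federated

# Or, instead, join the network as a worker
# (subcommand name is an assumption):
local-ai worker p2p-llama-cpp-rpc
```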
You have to enable P2P mode by starting LocalAI with --p2p. Please restart the server with --p2p to generate a new token automatically that can be used to discover other nodes. If you already have a token, specify it with export TOKEN="..".
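The token flow might look like this in practice (a sketch, assuming the server prints the generated token to its log on first startup):

```shell
# First start: no token set, so --p2p generates one and logs it.
local-ai run --p2p

# Subsequent nodes: export the token that was printed, then start with --p2p
# so they can discover the existing network.
export TOKEN=".."
local-ai run --p2p
```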
Check out the documentation for more information.