b4rtaz/distributed-llama
Tensor parallelism is all you need. Run LLMs on an AI cluster at home using any device. Distribute the workload, divide RAM usage, and increase inference speed.
Language: C++ · License: MIT