(New) papers by Meta: Large Concept Models and BLT
(palaver.p3x.de)
Community for discussing LLaMA, the large language model created by Meta AI.
This is intended to be a replacement for r/LocalLLaMA on Reddit.
Uh, I'm not sure. I haven't had time to read those papers yet. I suppose the Byte Latent Transformer does; it's still some kind of transformer architecture. With the Large Concept Models I'm not so sure. They encode whole sentences, and the researchers explore something like 3 different (diffusion-based) architectures. The paper calls itself a "proof of feasibility", so it's more basic research into that approach than one single, specific model architecture.
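To make the "encoding whole sentences" point concrete, here's a toy sketch of what modeling at the sentence level means. The actual LCM paper uses the SONAR sentence encoder; `toy_sentence_embed` below is just a made-up stand-in (a deterministic pseudo-random unit vector per sentence), so this only illustrates the shape of the idea, not Meta's implementation.

```python
import hashlib
import numpy as np

def toy_sentence_embed(sentence: str, dim: int = 8) -> np.ndarray:
    # Stand-in for a real sentence encoder (the paper uses SONAR):
    # derive a deterministic pseudo-random unit vector from the sentence text.
    seed = int.from_bytes(hashlib.sha256(sentence.encode()).digest()[:8], "big")
    v = np.random.default_rng(seed).standard_normal(dim)
    return v / np.linalg.norm(v)

text = "Concepts are sentences. The model predicts the next one."
sentences = [s.strip() + "." for s in text.split(".") if s.strip()]
embeddings = np.stack([toy_sentence_embed(s) for s in sentences])

# A Large Concept Model autoregresses over this (num_sentences, dim)
# sequence of sentence vectors, instead of over subword tokens.
print(embeddings.shape)  # (2, 8)
```

The key contrast with a token-level transformer: the sequence the model predicts over has one vector per sentence ("concept"), not one per subword, which is why the choice of how to generate the next continuous vector (e.g. via diffusion) becomes a design question in the paper.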