Local Llama Quickstart
Midjourney-generated llama from Medium
In community slang, “llama” refers to a large language model (LLM), after Meta’s LLaMA models. A “local llama” is a locally hosted (typically open source) model, in contrast to commercially hosted ones.
I’ve recently become aware of the open source LLM (“local llama”) movement. Unlike traditional open source projects, this field moves at an unprecedented pace: breakthroughs land on a weekly basis, and information more than a month old is often functionally obsolete.
This collection was gathered in late December 2023, with the intent to help anyone looking to get caught up with the field.
Models
Model Sources
Practical use
Interfaces
- List of web UIs
- Jailbreaking GPT when prompting for characters
- Guidelines for prompting for characters
- ChatML from OpenAI is quickly becoming the standard for prompting
- Chainlit - open source chat interface builder
- Chasm - multiplayer text generation game
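ChatML, mentioned above, is just a plain-text convention: each message is wrapped in `<|im_start|>role` / `<|im_end|>` markers. A minimal sketch of rendering a conversation in that format (role names and tokens follow OpenAI’s ChatML description; individual local models may expect slight variations):

```python
def to_chatml(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string."""
    parts = []
    for m in messages:
        # Each turn: <|im_start|>role\ncontent<|im_end|>
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # End with an open assistant turn to cue the model to respond.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Name one open source LLM."},
])
print(prompt)
```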
Training
- Teaching llama a new language through tuning
- Mergekit - model merging toolkit (also used to assemble MoE models from existing checkpoints)
- Axolotl - Fine tuning framework
- Unsloth - Fine tuning accelerator
- Transformers and PyTorch are popular libraries for training; llama.cpp (inference) and LangChain (application building) round out the ecosystem
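Much of the fine-tuning above is done with LoRA-style adapters, which the listed frameworks commonly support: instead of updating the full weight matrix W, you learn a small low-rank pair (A, B) and use W + B·A at inference. A toy, dependency-free sketch of that idea (the matrices and scaling factor here are illustrative, not any framework’s actual API):

```python
def matmul(a, b):
    """Plain-Python matrix multiply for small illustrative matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def lora_apply(W, A, B, alpha=1.0):
    """Effective weights under a LoRA adapter: W + alpha * (B @ A)."""
    delta = matmul(B, A)
    return [[w + alpha * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# 2x2 frozen base weights; rank-1 adapter means only 4 trainable
# numbers (B: 2x1, A: 1x2) instead of the full 2x2 matrix.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[0.1], [0.2]]
A = [[0.5, 0.5]]
W_eff = lora_apply(W, A, B)
```

The payoff is memory: for large layers, the adapter has a tiny fraction of the base matrix’s parameters, which is why single-GPU fine-tuning of 7B models is feasible at all.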
Tutorials
- Karpathy builds and trains a llama
- Build a llama DIY, by freeCodeCamp
- Understanding different quantization methods
- LLM workshop
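To give a feel for what the quantization guide above covers: the simplest scheme, absmax (symmetric) quantization, maps each float weight to a small signed integer plus one shared scale. Real methods (GPTQ, GGUF k-quants, etc.) are far more sophisticated, but this toy version shows the core trade-off:

```python
def quantize_absmax(weights, bits=8):
    """Map float weights to signed ints of the given width, plus a scale."""
    qmax = 2 ** (bits - 1) - 1                     # e.g. 127 for int8
    scale = max(abs(w) for w in weights) / qmax    # one scale per group
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the ints and the shared scale."""
    return [x * scale for x in q]

w = [0.12, -0.5, 0.33, 0.9]
q, s = quantize_absmax(w)
w_hat = dequantize(q, s)
# w_hat approximates w; the error shrinks as the bit width grows,
# which is the storage-vs-quality dial quantized GGUF files expose.
```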
Servers
Hardware
GPU stuff
Cloud renting
- Kaggle - 30h/week free, enough VRAM for 7B models
- Lambda Labs - Huge instances, competitive pricing
- Runpod - Bad pricing, community cloud option
- Paperspace - Requires subscription, terrible pricing
- Genesis Cloud - Reddit says it’s affordable… I can’t verify
- Vast.ai - Very affordable, especially the “interruptible” ones
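When sizing an instance from the list above, a back-of-the-envelope weight-memory estimate is useful (this heuristic counts weights only and ignores KV cache and activation overhead, so treat it as a floor, not a budget):

```python
def weight_vram_gb(params_billion, bits_per_weight):
    """Rough VRAM (decimal GB) needed just to hold the model weights."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# A 7B model at fp16 needs ~14 GB for weights alone;
# 4-bit quantization drops that to ~3.5 GB, which is why
# free tiers like Kaggle's can run 7B models at all.
fp16 = weight_vram_gb(7, 16)   # ~14.0
int4 = weight_vram_gb(7, 4)    # ~3.5
```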