# Worka Models Repository
This repository contains **Candle-supported, redistributable, community-quantized models**
(primarily in **GGUF** and **safetensors** formats) ready for use in [Worka](https://github.com/your-org/worka).
## 📂 Repository Structure
```
models/
  llama-2-7b/
    README.md           # Original Hugging Face model card content
    model.yaml          # Machine-readable metadata (formats, quantizations, files)
    quantized/          # GGUF quantized weights
    safetensors/        # Float16 safetensors weights
    tokenizer/          # Tokenizer files
tools/
  download.sh           # Script to fetch missing models from Hugging Face
  verify-checksums.py   # Verify downloaded files against known hashes
  generate-registry.py  # Generate registry.json from all model.yaml files
.gitattributes          # Configure Git LFS
.github/workflows/download.yml  # GitHub Action to fetch models automatically
```
## 🧩 Metadata Format (`model.yaml`)
Each model has a `model.yaml` file describing:
- **Model name & description**
- **Publisher attribution**:
  - *Original unquantized model publisher* (e.g., Meta for LLaMA)
  - *Quantization publisher* (e.g., TheBloke)
- **Available formats** (`gguf`, `safetensors`)
- **Quantization variants** (with user-friendly labels, file lists, download size)
- **Tokenizer files**
- **VRAM requirements**
This YAML is used by Worka to present model options and to **sparse-checkout** only the required files.
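To make the fields above concrete, here is an illustrative sketch of what a `model.yaml` might look like. The field names and values are assumptions for illustration only, not the actual schema:

```yaml
# Hypothetical example -- field names are illustrative, not the real schema.
name: llama-2-7b
description: LLaMA 2 7B base model
publishers:
  original: Meta          # unquantized model publisher
  quantized: TheBloke     # quantization publisher
formats: [gguf, safetensors]
quantizations:
  - label: "Q4_K_M (balanced quality/size)"
    files:
      - quantized/llama-2-7b.Q4_K_M.gguf
    vram_gb: 6            # approximate VRAM requirement
tokenizer:
  - tokenizer/tokenizer.model
```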
## 🚀 Using Sparse Checkout
You can fetch only the files for the model/quantization you need:
```bash
git clone https://github.com/your-org/worka-models.git
cd worka-models

# Enable sparse checkout in non-cone mode, since we list individual
# file paths below (cone mode only accepts directory patterns)
git sparse-checkout init --no-cone

# Set which model files to fetch (example: LLaMA 2 7B Q4_K_M)
git sparse-checkout set --no-cone \
  models/llama-2-7b/model.yaml \
  models/llama-2-7b/tokenizer/tokenizer.model \
  models/llama-2-7b/quantized/llama-2-7b.Q4_K_M.gguf
```
## 🛠 Helper Scripts
- **`tools/download.sh`**: Fetches missing models from Hugging Face using metadata in `model.yaml`.
- **`tools/verify-checksums.py`**: Verifies downloaded files against stored hashes.
- **`tools/generate-registry.py`**: Generates a consolidated `registry.json` from all `model.yaml` files.
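The core of checksum verification can be sketched in a few lines. This is a minimal illustration, not the actual `verify-checksums.py`; it assumes hashes are SHA-256 and streams files in chunks so multi-gigabyte GGUF weights never need to fit in memory:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, streaming 1 MiB at a time."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected_sha256: str) -> bool:
    """Return True if the file's digest matches the expected (hex) hash."""
    return sha256_of(path) == expected_sha256.lower()
```

The real script additionally needs to read the expected hashes out of each `model.yaml` and report mismatches per file.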
## 🤖 GitHub Actions
A workflow in `.github/workflows/download.yml` runs `download.sh` to fetch any configured model missing from the repo.
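Such a workflow might look roughly like the following. This is a hedged sketch, not the repository's actual `download.yml`; the trigger, schedule, and step names are assumptions:

```yaml
# Illustrative sketch only -- see .github/workflows/download.yml for the real workflow.
name: download-models
on:
  workflow_dispatch:       # allow manual runs
jobs:
  download:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          lfs: true        # pull LFS pointers as real files
      - name: Fetch missing models
        run: ./tools/download.sh
```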
## 📜 License & Attribution
All models are redistributed under their respective licenses.
Each `model.yaml` file carries **attribution** for both:
- **Original unquantized publisher**
- **Quantized publisher**
## ⚠️ Notes
- Only **ungated, redistributable** models are included.
- **We do not include** gated models (e.g., unquantized LLaMA weights from Meta); users must fetch those directly from the upstream publisher.
---
For details about individual models, see their `README.md` inside each `models/<model-name>/` folder.