# Worka Models Repository
This repository contains **Candle-supported, redistributable, community-quantized models**
(primarily in **GGUF** and **safetensors** formats) ready for use in [Worka](https://github.com/your-org/worka).
## 📂 Repository Structure
```
models/
  llama-2-7b/
    README.md           # Original Hugging Face model card content
    model.yaml          # Machine-readable metadata (formats, quantizations, files)
    quantized/          # GGUF quantized weights
    safetensors/        # Float16 safetensors weights
    tokenizer/          # Tokenizer files
tools/
  download.sh           # Script to fetch missing models from Hugging Face
  verify-checksums.py   # Verify downloaded files against known hashes
  generate-registry.py  # Generate registry.json from all model.yaml files
.gitattributes          # Configure Git LFS
.github/workflows/download.yml  # GitHub Action to fetch models automatically
```
## 🧩 Metadata Format (`model.yaml`)
Each model has a `model.yaml` file describing:
- **Model name & description**
- **Publisher attribution**:
  - *Original unquantized model publisher* (e.g., Meta for LLaMA)
  - *Quantization publisher* (e.g., TheBloke)
- **Available formats** (`gguf`, `safetensors`)
- **Quantization variants** (with user-friendly labels, file lists, download size)
- **Tokenizer files**
- **VRAM requirements**
This YAML is used by Worka to present model options and to **sparse-checkout** only the required files.
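To make the fields above concrete, here is an illustrative sketch of what a `model.yaml` might look like. The field names and values are assumptions for illustration only, not the actual schema:

```yaml
# Hypothetical example -- field names are illustrative, not the real schema.
name: llama-2-7b
description: LLaMA 2 7B base model
publishers:
  original: Meta          # unquantized model publisher
  quantized: TheBloke     # quantization publisher
formats: [gguf, safetensors]
quantizations:
  - label: "Q4_K_M (balanced quality/size)"
    files:
      - quantized/llama-2-7b.Q4_K_M.gguf
    vram_gb: 6            # approximate VRAM requirement
tokenizer:
  - tokenizer/tokenizer.model
```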
## 🚀 Using Sparse Checkout
You can fetch only the files for the model/quantization you need:
```bash
git clone https://github.com/your-org/worka-models.git
cd worka-models

# Enable sparse checkout in non-cone mode, since we list individual
# file paths below (cone mode only accepts directory patterns)
git sparse-checkout init --no-cone

# Set which model files to fetch (example: LLaMA 2 7B Q4_K_M)
git sparse-checkout set --no-cone \
  models/llama-2-7b/model.yaml \
  models/llama-2-7b/tokenizer/tokenizer.model \
  models/llama-2-7b/quantized/llama-2-7b.Q4_K_M.gguf
```
## 🛠 Helper Scripts
- **`tools/download.sh`**: Fetches missing models from Hugging Face using metadata in `model.yaml`.
- **`tools/verify-checksums.py`**: Verifies downloaded files against stored hashes.
- **`tools/generate-registry.py`**: Generates a consolidated `registry.json` from all `model.yaml` files.
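The core of checksum verification can be sketched in a few lines. This is a minimal illustration, not the actual `verify-checksums.py`; it assumes hashes are SHA-256 and streams files in chunks so multi-gigabyte GGUF weights never need to fit in memory:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, streaming 1 MiB at a time."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected_sha256: str) -> bool:
    """Return True if the file's digest matches the expected (hex) hash."""
    return sha256_of(path) == expected_sha256.lower()
```

The real script additionally needs to read the expected hashes out of each `model.yaml` and report mismatches per file.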
## 🤖 GitHub Actions
A workflow in `.github/workflows/download.yml` runs `download.sh` to fetch any configured model missing from the repo.
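Such a workflow might look roughly like the following. This is a hedged sketch, not the repository's actual `download.yml`; the trigger, schedule, and step names are assumptions:

```yaml
# Illustrative sketch only -- see .github/workflows/download.yml for the real workflow.
name: download-models
on:
  workflow_dispatch:       # allow manual runs
jobs:
  download:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          lfs: true        # pull LFS pointers as real files
      - name: Fetch missing models
        run: ./tools/download.sh
```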
## 📜 License & Attribution
All models are redistributed under their respective licenses.
Each `model.yaml` file carries **attribution** for both:
- **Original unquantized publisher**
- **Quantized publisher**
## ⚠️ Notes
- Only **ungated, redistributable** models are included.
- **We do not include** gated models (e.g., unquantized LLaMA weights from Meta); users must fetch those directly from the upstream publisher.
---
For details about individual models, see their `README.md` inside each `models/<model-name>/` folder.