Worka Models Repository

This repository contains Candle-supported, redistributable, community-quantized models (primarily in GGUF and safetensors formats) ready for use in Worka.

📂 Repository Structure

models/
  llama-2-7b/
    README.md               # Original Hugging Face model card content
    model.yaml              # Machine-readable metadata (formats, quantizations, files)
    quantized/              # GGUF quantized weights
    safetensors/            # Float16 safetensor weights
    tokenizer/              # Tokenizer files
tools/
  download.sh               # Script to fetch missing models from Hugging Face
  verify-checksums.py       # Verify downloaded files against known hashes
  generate-registry.py      # Generate registry.json from all model.yaml files
.gitattributes              # Configure Git LFS
.github/workflows/download.yml  # GitHub Action to fetch models automatically

🧩 Metadata Format (model.yaml)

Each model has a model.yaml file describing:

  • Model name & description
  • Publisher attribution:
    • Original unquantized model publisher (e.g., Meta for LLaMA)
    • Quantization publisher (e.g., TheBloke)
  • Available formats (gguf, safetensors)
  • Quantization variants (with user-friendly labels, file lists, download size)
  • Tokenizer files
  • VRAM requirements

This YAML is used by Worka to present model options and to sparse-checkout only the required files.
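Put together, the fields above might look roughly like this in a model.yaml (a hypothetical sketch; the field names, sizes, and layout are illustrative, not the actual schema):

```yaml
name: llama-2-7b
description: LLaMA 2 7B base model, community-quantized for Candle.
publishers:
  original: Meta          # unquantized model publisher
  quantized: TheBloke     # quantization publisher
formats: [gguf, safetensors]
quantizations:
  - label: "Q4_K_M (balanced quality/size)"   # user-friendly label
    files:
      - quantized/llama-2-7b.Q4_K_M.gguf
    download_size_mb: 4080                    # illustrative value
    vram_required_mb: 6500                    # illustrative value
tokenizer:
  - tokenizer/tokenizer.model
```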

🚀 Using Sparse Checkout

You can fetch only the files for the model/quantization you need:

git clone https://github.com/your-org/worka-models.git
cd worka-models

# Enable sparse checkout
git sparse-checkout init --cone

# Set which model files to fetch (example: LLaMA 2 7B Q4_K_M)
git sparse-checkout set \
  models/llama-2-7b/quantized/llama-2-7b.Q4_K_M.gguf \
  models/llama-2-7b/tokenizer/tokenizer.model \
  models/llama-2-7b/model.yaml

🛠 Helper Scripts

  • tools/download.sh: Fetches missing models from Hugging Face using metadata in model.yaml.
  • tools/verify-checksums.py: Verifies downloaded files against stored hashes.
  • tools/generate-registry.py: Generates a consolidated registry.json from all YAMLs.
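The verification step amounts to a streaming hash comparison. A minimal sketch of the idea behind verify-checksums.py (assuming SHA-256 hashes; the actual script may use a different algorithm or hash source):

```python
import hashlib
from pathlib import Path


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large GGUF weights never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(expected: dict[str, str]) -> list[str]:
    """Return the paths that are missing or whose on-disk hash differs."""
    return [
        path
        for path, digest in expected.items()
        if not Path(path).is_file() or sha256_of(path) != digest
    ]
```

An empty return value means every file is present and intact; anything listed should be re-downloaded.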

🤖 GitHub Actions

A workflow in .github/workflows/download.yml runs download.sh to fetch any configured model missing from the repo.
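Such a workflow might look roughly like this (a hypothetical sketch; the actual triggers and steps live in .github/workflows/download.yml):

```yaml
name: download-models
on:
  workflow_dispatch:            # allow manual runs
  push:
    paths:
      - "models/**/model.yaml"  # re-run when model metadata changes

jobs:
  download:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          lfs: true             # weights are tracked via Git LFS
      - name: Fetch missing models
        run: ./tools/download.sh
```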

📜 License & Attribution

All models are redistributed under their respective licenses.
Each model.yaml file carries attribution for both:

  • Original unquantized publisher
  • Quantized publisher

⚠️ Notes

  • Only ungated, redistributable models are included.
  • We do not include gated models, such as Meta's unquantized LLaMA weights; users must fetch those directly from the original source.

For details about individual models, see their README.md inside each models/<model-name>/ folder.