# Worka Models Repository
This repository contains **Candle-supported, redistributable, community-quantized models** (primarily in **GGUF** and **safetensors** formats) ready for use in [Worka](https://github.com/your-org/worka).

## 📂 Repository Structure
```
models/
  llama-2-7b/
    README.md            # Original Hugging Face model card content
    model.yaml           # Machine-readable metadata (formats, quantizations, files)
    quantized/           # GGUF quantized weights
    safetensors/         # Float16 safetensor weights
    tokenizer/           # Tokenizer files
tools/
  download.sh            # Script to fetch missing models from Hugging Face
  verify-checksums.py    # Verify downloaded files against known hashes
  generate-registry.py   # Generate registry.json from all model.yaml files
.gitattributes           # Configure Git LFS
.github/workflows/download.yml   # GitHub Action to fetch models automatically
```
## 🧩 Metadata Format (`model.yaml`)

Each model has a `model.yaml` file describing:

- **Model name & description**
- **Publisher attribution**:
  - *Original unquantized model publisher* (e.g., Meta for LLaMA)
  - *Quantization publisher* (e.g., TheBloke)
- **Available formats** (`gguf`, `safetensors`)
- **Quantization variants** (with user-friendly labels, file lists, download size)
- **Tokenizer files**
- **VRAM requirements**

This YAML is used by Worka to present model options and to **sparse-checkout** only the required files.
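A minimal sketch of what such a file might look like. The exact field names below (`publishers`, `quantizations`, `vram_required_mb`, etc.) are illustrative assumptions, not a fixed schema:

```yaml
# models/llama-2-7b/model.yaml — illustrative sketch; field names are assumptions
name: llama-2-7b
description: LLaMA 2 7B, community-quantized for Candle
publishers:
  original: meta-llama        # original unquantized model publisher
  quantized: TheBloke         # quantization publisher
formats: [gguf, safetensors]
quantizations:
  - label: "Q4_K_M (balanced quality/size)"
    files:
      - quantized/llama-2-7b.Q4_K_M.gguf
    download_size_mb: 4080
    vram_required_mb: 6000
tokenizer:
  - tokenizer/tokenizer.model
```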
## 🚀 Using Sparse Checkout

You can fetch only the files for the model/quantization you need:

```bash
git clone https://github.com/your-org/worka-models.git
cd worka-models

# Enable sparse checkout in non-cone mode: cone mode only accepts
# directory patterns, and we want to select individual files
git sparse-checkout init --no-cone

# Set which model files to fetch (example: LLaMA 2 7B Q4_K_M)
git sparse-checkout set \
  models/llama-2-7b/quantized/llama-2-7b.Q4_K_M.gguf \
  models/llama-2-7b/tokenizer/tokenizer.model \
  models/llama-2-7b/model.yaml
```
## 🛠 Helper Scripts

- **`tools/download.sh`** – Fetches missing models from Hugging Face using metadata in `model.yaml`.
- **`tools/verify-checksums.py`** – Verifies downloaded files against stored hashes.
- **`tools/generate-registry.py`** – Generates a consolidated `registry.json` from all YAMLs.

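A minimal sketch of the kind of check `verify-checksums.py` might perform, assuming the expected values are hex SHA-256 digests (where the hashes live and the function names here are assumptions, not the script's actual interface):

```python
import hashlib
import sys
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so multi-gigabyte GGUF weights never load into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(expected: dict, root: Path = Path(".")) -> bool:
    """expected maps a relative file path to its hex SHA-256; returns True if all match."""
    ok = True
    for rel, want in expected.items():
        path = root / rel
        if not path.exists():
            print(f"MISSING  {rel}")
            ok = False
        elif sha256_of(path) != want:
            print(f"BAD HASH {rel}")
            ok = False
        else:
            print(f"OK       {rel}")
    return ok


if __name__ == "__main__":
    # In the real script, the expected hashes would come from each model.yaml.
    sys.exit(0 if verify({}) else 1)
```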
## 🤖 GitHub Actions
A workflow in `.github/workflows/download.yml` runs `download.sh` to fetch any configured model missing from the repo.
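One hedged sketch of what such a workflow could look like; the triggers, action versions, and step layout are assumptions rather than the repo's actual file:

```yaml
# .github/workflows/download.yml — illustrative sketch, not the actual workflow
name: Download missing models
on:
  workflow_dispatch:        # allow manual runs
  schedule:
    - cron: "0 3 * * 1"     # and a weekly check
jobs:
  download:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          lfs: true
      - name: Fetch missing models
        run: bash tools/download.sh
      - name: Verify checksums
        run: python tools/verify-checksums.py
```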
## 📜 License & Attribution

All models are redistributed under their respective licenses. Each `model.yaml` file carries **attribution** for both:

- **Original unquantized publisher**
- **Quantized publisher**

## ⚠️ Notes

- Only **ungated, redistributable** models are included.
- **We do not include** gated models such as unquantized LLaMA weights from Meta; these must be fetched by the user directly.

---
For details about individual models, see their `README.md` inside each `models/<model-name>/` folder.