Export and Deployment
Use this page to package a local checkpoint for Hugging Face and optional GGUF deployment.
Prerequisites: a valid checkpoint (checkpoints/ckpt_last.pt) and matching metadata (data/processed/meta.json or data/meta.json).
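The prerequisite check above can be sketched as a small helper that locates the checkpoint and whichever meta.json is present. This is a minimal sketch using only the paths named on this page; the function name `find_prerequisites` is ours, not part of the repo.

```python
from pathlib import Path

def find_prerequisites(root: str = ".") -> tuple[Path, Path]:
    """Locate the checkpoint and its metadata, preferring data/processed/meta.json."""
    base = Path(root)
    ckpt = base / "checkpoints" / "ckpt_last.pt"
    if not ckpt.is_file():
        raise FileNotFoundError(f"missing checkpoint: {ckpt}")
    # Prefer data/processed/meta.json, fall back to data/meta.json
    for meta in (base / "data" / "processed" / "meta.json",
                 base / "data" / "meta.json"):
        if meta.is_file():
            return ckpt, meta
    raise FileNotFoundError("no meta.json under data/processed/ or data/")
```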
HF Export vs GGUF Export
- HF export creates Transformers-compatible artifacts for Hugging Face workflows.
- GGUF conversion creates quantized files for llama.cpp-style runtimes.
Command(s)
Export local checkpoint to HF layout:
python scripts/export/export_hf.py \
--checkpoint checkpoints/ckpt_last.pt \
--meta data/processed/meta.json \
--output-dir outputs/hf_export
Push the exported folder to the Hugging Face Hub:
python scripts/export/export_hf.py \
--checkpoint checkpoints/ckpt_last.pt \
--meta data/processed/meta.json \
--output-dir outputs/hf_export \
--push \
--repo-id GhostPunishR/labcore-llm-50M
Convert HF export to GGUF and quantize:
python scripts/export/quantize_gguf.py \
--hf-dir outputs/hf_export \
--llama-cpp-dir third_party/llama.cpp \
--output-dir outputs/gguf \
--quant-type Q4_K_M
Output Files / Artifacts Produced
outputs/hf_export/:
- model.safetensors
- config.json
- labcore_tokenizer.json
- tokenizer_config.json
- special_tokens_map.json
- configuration_labcore.py
- modeling_labcore.py
- tokenization_labcore.py
- README.md
Load the exported folder with Transformers using trust_remote_code=True.
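Before loading, it can help to confirm that every artifact listed above actually landed in the export folder. A minimal sketch, using the filenames from this page; the helper name `missing_artifacts` is ours.

```python
from pathlib import Path

# Filenames taken from the artifact list on this page
EXPECTED_FILES = [
    "model.safetensors",
    "config.json",
    "labcore_tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "configuration_labcore.py",
    "modeling_labcore.py",
    "tokenization_labcore.py",
    "README.md",
]

def missing_artifacts(export_dir: str) -> list[str]:
    """Return the expected export files that are absent from export_dir."""
    d = Path(export_dir)
    return [name for name in EXPECTED_FILES if not (d / name).is_file()]
```

If the returned list is empty, the folder should load with `AutoModelForCausalLM.from_pretrained("outputs/hf_export", trust_remote_code=True)`.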
outputs/gguf/:
- labcore-50m-f16.gguf
- labcore-50m-q4_k_m.gguf (or q5_k_m / both when --quant-type all)
Warning
GGUF conversion requires a valid llama.cpp checkout with the conversion script and quantizer binary available.
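A pre-flight check for the llama.cpp tools can be sketched as below. The tool names are assumptions based on recent llama.cpp layouts (`convert_hf_to_gguf.py` and a built `llama-quantize` binary); older checkouts use `convert-hf-to-gguf.py` and `quantize`, so adjust for your version.

```python
from pathlib import Path

def check_llama_cpp(checkout: str) -> list[str]:
    """Report which llama.cpp tools appear to be missing from a checkout.

    Tool names are assumptions for recent llama.cpp versions; they have
    changed across releases.
    """
    root = Path(checkout)
    problems = []
    scripts = ("convert_hf_to_gguf.py", "convert-hf-to-gguf.py")
    if not any((root / s).is_file() for s in scripts):
        problems.append("conversion script not found")
    # The quantizer is a compiled binary, so it only exists after a build
    quantizers = list(root.rglob("llama-quantize*")) + list(root.rglob("quantize*"))
    if not any(p.is_file() for p in quantizers):
        problems.append("quantizer binary not found (build llama.cpp first)")
    return problems
```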
Common Errors
- Metadata mismatch (txt vs bin): see Meta path mismatch.
- Missing huggingface_hub/safetensors: see Torch not installed.
- Missing llama.cpp tools: see Windows path and policy issues.