Export and Deployment

Use this page to package a local training checkpoint into a Hugging Face-compatible export, and optionally convert that export to GGUF for llama.cpp-style deployment. Prerequisites: a valid checkpoint (checkpoints/ckpt_last.pt) and matching metadata (data/processed/meta.json or data/meta.json).

HF Export vs GGUF Export

  • HF export creates Transformers-compatible artifacts for Hugging Face workflows.
  • GGUF conversion creates quantized files for llama.cpp-style runtimes.

Command(s)

Export local checkpoint to HF layout:

python scripts/export/export_hf.py \
  --checkpoint checkpoints/ckpt_last.pt \
  --meta data/processed/meta.json \
  --output-dir outputs/hf_export

Push exported folder to HF Hub:

python scripts/export/export_hf.py \
  --checkpoint checkpoints/ckpt_last.pt \
  --meta data/processed/meta.json \
  --output-dir outputs/hf_export \
  --push \
  --repo-id GhostPunishR/labcore-llm-50M
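If you prefer to push from Python (for example, after post-processing the exported folder), the huggingface_hub client can upload the directory directly. A minimal sketch, equivalent in spirit to the --push flag above, assuming you are already authenticated (via `huggingface-cli login` or the HF_TOKEN environment variable):

```python
def push_hf_export(folder="outputs/hf_export",
                   repo_id="GhostPunishR/labcore-llm-50M"):
    """Upload an exported checkpoint folder to the HF Hub.

    Assumes prior authentication (huggingface-cli login or HF_TOKEN).
    """
    # Import deferred so the sketch reads without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi()
    # Create the repo if it does not exist yet; no-op otherwise.
    api.create_repo(repo_id, repo_type="model", exist_ok=True)
    # Upload every file in the export folder as one commit.
    return api.upload_folder(folder_path=folder, repo_id=repo_id,
                             repo_type="model")
```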

Convert HF export to GGUF and quantize:

python scripts/export/quantize_gguf.py \
  --hf-dir outputs/hf_export \
  --llama-cpp-dir third_party/llama.cpp \
  --output-dir outputs/gguf \
  --quant-type Q4_K_M

Output Files / Artifacts Produced

outputs/hf_export/:

  • model.safetensors
  • config.json
  • labcore_tokenizer.json
  • tokenizer_config.json
  • special_tokens_map.json
  • configuration_labcore.py
  • modeling_labcore.py
  • tokenization_labcore.py
  • README.md

Load the exported folder with Transformers using trust_remote_code=True; the flag is required because the export ships its own configuration_labcore.py, modeling_labcore.py, and tokenization_labcore.py.
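A minimal loading sketch, assuming the export directory above and a local transformers installation:

```python
def load_labcore(export_dir="outputs/hf_export"):
    """Load the exported model and tokenizer from a local HF export.

    trust_remote_code=True is needed because the export bundles its own
    configuration/modeling/tokenization modules.
    """
    # Imports deferred so the sketch reads without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(export_dir,
                                              trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(export_dir,
                                                 trust_remote_code=True)
    return tokenizer, model
```

From there, generation goes through the standard model.generate(...) API.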

outputs/gguf/:

  • labcore-50m-f16.gguf
  • labcore-50m-q4_k_m.gguf (or q5_k_m / both when --quant-type all)
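The quantized filenames follow a model-name-plus-lowercased-quant-type pattern. A small helper to predict the expected output paths (the naming convention is inferred from the artifact list above, not guaranteed by the script):

```python
def gguf_output_name(quant_type, model_name="labcore-50m"):
    """Predict the GGUF filename produced for a given --quant-type."""
    return f"{model_name}-{quant_type.lower()}.gguf"

# The f16 intermediate plus one file per requested quant type:
expected = [gguf_output_name("f16"), gguf_output_name("Q4_K_M")]
# → ["labcore-50m-f16.gguf", "labcore-50m-q4_k_m.gguf"]
```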

Warning

GGUF conversion requires a valid llama.cpp checkout with the conversion script and quantizer binary available.

Common Errors