Developer Guide
This page is for contributors working on the codebase itself.
Repository Map
src/labcore_llm/
  config/        # TOML loader and defaults
  data/          # dataset abstractions
  model/         # GPT model implementation
  tokenizer/     # char + BPE tokenizers
  trainer/       # training loop, scheduler, checkpointing
scripts/         # data prep, export, quantize, fine-tune helpers
  data/          # data preparation
  export/        # HF export + GGUF quantization
  finetune/      # LoRA instruction fine-tuning
  benchmark/     # inference benchmark
  demo/          # pointer to root demo entrypoint
configs/         # TOML presets and base profiles
  base/          # base reference presets
  presets/       # model-size/tokenizer preset families
  hardware/      # hardware-oriented variants
  experimental/  # work-in-progress presets
tests/
  unit/          # isolated component tests
  integration/   # end-to-end workflow tests
Local Dev Environment
python -m venv .venv
# PowerShell
.\.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
python -m pip install -e ".[torch,dev]"
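After the install finishes, a quick sanity check can confirm that the virtual environment is active and that the editable install is visible to the interpreter. The sketch below is illustrative only: it assumes the import name is labcore_llm (inferred from the src/labcore_llm/ directory), and the helper name environment_report is hypothetical, not part of the project.

```python
import importlib.util
import sys


def environment_report(package: str = "labcore_llm") -> dict:
    """Return basic facts about the interpreter and install state.

    The default package name is an assumption based on src/labcore_llm/.
    """
    return {
        # Inside a venv, sys.prefix points at the venv, while
        # sys.base_prefix points at the interpreter it was created from.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # find_spec returns None when a top-level package is not importable.
        "package_importable": importlib.util.find_spec(package) is not None,
        "python": sys.version.split()[0],
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

If package_importable comes back False, the editable install most likely went into a different interpreter than the one currently active.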
Validation Commands
Run tests:
Run lint rules aligned with CI:
CI Workflows
- .github/workflows/ci.yml: lint + tests
- .github/workflows/docs.yml: MkDocs build and deploy
Contribution Quality Bar
- Keep commits focused and atomic.
- Update docs for behavior/CLI changes.
- Add tests for bug fixes and new logic.
- Do not commit large data/model artifacts.
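The rule about large artifacts can be checked mechanically before committing. The following is a minimal sketch, not project tooling: the function name oversized_files and the 5 MiB threshold are hypothetical, since this guide does not state a size limit.

```python
from pathlib import Path

# Hypothetical threshold; the project does not define an official limit.
MAX_BYTES = 5 * 1024 * 1024


def oversized_files(paths, max_bytes=MAX_BYTES):
    """Return (path, size_in_bytes) pairs for existing files above max_bytes."""
    flagged = []
    for raw in paths:
        path = Path(raw)
        # Skip deleted or missing entries and directories.
        if path.is_file():
            size = path.stat().st_size
            if size > max_bytes:
                flagged.append((str(path), size))
    return flagged
```

One way to use it is to pass in the staged paths listed by git diff --cached --name-only and abort the commit if the returned list is non-empty.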
Packaging Notes
- Project uses src/ layout with setuptools.
- Optional dependency groups are defined in pyproject.toml.
- Entry scripts are plain Python files, not console-script wrappers.