# LLMBoost Hub Advanced Usage

Learn how to combine LLMBoost Hub (`lbh`) with advanced Docker workflows for maximum flexibility and control.

## Overview

While `lbh serve` provides a simple one-command deployment, you may need more control over container management, storage locations, or networking configuration. This guide shows how to use `lbh run` and `lbh attach` for advanced workflows while retaining the convenience of LLMBoost Hub.
## Why Use Advanced LBH Workflows?

Use these patterns when you need:

- **Custom storage locations**: Store models on shared filesystems like `/lustre1/$USER/llm_models`
- **Manual container control**: Start containers with `lbh run`, then attach and configure manually
- **Integration with existing Docker setups**: Combine LBH with your current containerization workflows
- **Advanced networking**: Custom network configurations or multi-container setups
- **Development workflows**: Iterative testing and debugging with direct container access
## Advanced Workflow

### Step 1: Start the Container with `lbh run`

Instead of `lbh serve`, use `lbh run` to start a container without immediately launching the inference server:

```shell
lbh run meta-llama/Llama-3.1-8B-Instruct
```

This starts a container but doesn't run the inference server yet. The container name is automatically derived from the model name.
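If you want to confirm the container is up before attaching, you can check with Docker directly. This is only a sketch: the name filter below is an assumption, since `lbh` derives the real container name from the model name, so match it against your own `docker ps` output.

```shell
# Sketch: confirm the lbh-managed container is running before attaching.
# The name filter is an assumption -- lbh derives the actual container name
# from the model name, so adjust it to match your `docker ps` output.
CONTAINER_FILTER="name=Llama-3.1-8B-Instruct"
if command -v docker >/dev/null 2>&1; then
  docker ps --filter "$CONTAINER_FILTER" --format '{{.Names}}\t{{.Status}}'
else
  echo "docker not found on PATH; skipping check"
fi
```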
### Step 2: Attach to the Container

Use `lbh attach` to get a shell inside the running container:

```shell
lbh attach meta-llama/Llama-3.1-8B-Instruct
```

Now you're inside the container and can run advanced configuration steps.
### Step 3: Run Custom Setup

Inside the container, you have full control:

```shell
# Set environment variables
export CUDA_VISIBLE_DEVICES=0,1

# Run the inference server with custom options
llmboost serve meta-llama/Llama-3.1-8B-Instruct \
  --tensor-parallel-size 2 \
  --max-model-len 8192 \
  --gpu-memory-utilization 0.95
```
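Since `--tensor-parallel-size` should match the number of GPUs exposed through `CUDA_VISIBLE_DEVICES`, one way to keep the two in sync is to derive the count in the shell. A minimal sketch:

```shell
# Derive the GPU count from CUDA_VISIBLE_DEVICES so --tensor-parallel-size
# always matches the devices actually exposed to the server.
export CUDA_VISIBLE_DEVICES=0,1
NUM_GPUS=$(( $(printf '%s' "$CUDA_VISIBLE_DEVICES" | tr -cd ',' | wc -c) + 1 ))
echo "Using $NUM_GPUS GPUs for tensor parallelism"
# llmboost serve meta-llama/Llama-3.1-8B-Instruct --tensor-parallel-size "$NUM_GPUS" ...
```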
## Configuration with Environment Variables

LLMBoost Hub supports several environment variables for customization.

### Model Storage Location

Store all models in a custom directory (useful for shared storage):

```shell
export LBH_MODELS=/lustre1/$USER/llm_models
lbh run meta-llama/Llama-3.1-8B-Instruct
```

This uses `/lustre1/$USER/llm_models` as the model cache directory and mounts it inside the container.
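Before pointing LBH at a shared filesystem, it can help to create the cache directory yourself with the permissions you want. A sketch, using `$HOME/llm_models` as a stand-in for the Lustre path:

```shell
# Prepare the model cache directory before first use. $HOME/llm_models is a
# stand-in here for a shared path like /lustre1/$USER/llm_models.
export LBH_MODELS="$HOME/llm_models"
mkdir -p "$LBH_MODELS"
chmod 700 "$LBH_MODELS"   # keep the per-user cache private; relax for shared caches
echo "Model cache: $LBH_MODELS"
```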
### License File Location

Specify a custom license file path:

```shell
export LBH_LICENSE_PATH=/shared/licenses/llmboost_license.skm
lbh run meta-llama/Llama-3.1-8B-Instruct
```
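A quick preflight check that the license file actually exists can save a failed container start. A sketch, reusing the example path above:

```shell
# Fail fast if the license file is missing before launching a container.
LICENSE="${LBH_LICENSE_PATH:-/shared/licenses/llmboost_license.skm}"
if [ -f "$LICENSE" ]; then
  echo "License found: $LICENSE"
else
  echo "Warning: no license file at $LICENSE" >&2
fi
```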
### Hugging Face Token

Set your Hugging Face token for downloading gated models:

```shell
export HF_TOKEN=hf_xxxxxxxxxxxxx
lbh run meta-llama/Llama-3.1-70B-Instruct
```
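Exporting the token inline leaves it in shell history. One alternative pattern is to keep it in a permission-restricted file and read it at run time; `~/.hf_token` below is an assumed location, not something `lbh` requires:

```shell
# Keep the Hugging Face token out of shell history by reading it from a
# permission-restricted file. ~/.hf_token is an assumed location.
touch "$HOME/.hf_token"        # in practice this file already holds your hf_... token
chmod 600 "$HOME/.hf_token"
export HF_TOKEN="$(cat "$HOME/.hf_token")"
```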
## Advanced Use Cases

### 1. Shared Model Storage Across Users

In multi-user environments, store models in a shared location to avoid duplication:

```shell
# Set shared model directory
export LBH_MODELS=/shared/llm_models

# User 1 downloads the model (first time)
lbh run meta-llama/Llama-3.1-8B-Instruct

# User 2 reuses the cached model
lbh run meta-llama/Llama-3.1-8B-Instruct
```
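To keep every user's environment consistent, a site can publish a small settings file that each user sources before running `lbh`. The filename `lbh-shared.sh` and the paths are illustrative:

```shell
# Illustrative site-wide settings file; each user sources it before `lbh run`.
cat > lbh-shared.sh <<'EOF'
export LBH_MODELS=/shared/llm_models
export LBH_LICENSE_PATH=/shared/licenses/llmboost_license.skm
EOF
. ./lbh-shared.sh
echo "Shared model cache: $LBH_MODELS"
```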