# Command Reference

Complete reference for all LLMBoost Hub (`lbh`) commands.
## Quick Reference

| Command | Purpose |
|---|---|
| `lbh login` | Authenticate with license |
| `lbh fetch [model]` | Find available models |
| `lbh list` | Show local models and status |
| `lbh prep <model>` | Download image and model |
| `lbh run <model>` | Start container |
| `lbh serve <model>` | Start inference server |
| `lbh test <model>` | Send test request |
| `lbh attach <model>` | Open shell in container |
| `lbh stop <model>` | Stop container |
| `lbh status` | Show model status |
| `lbh tune <model>` | Run autotuner |
## Getting Help

```bash
# Show all available commands
lbh -h

# Get help for a specific command
lbh [COMMAND] -h

# Enable verbose output for troubleshooting
lbh -v [COMMAND]
```
## Core Commands

### `lbh login`

Authenticate with your LLMBoost license.

```bash
lbh login
```

Behavior:

- Reads from `$LBH_LICENSE_PATH` if set, otherwise prompts for a token
- Validates the license online
- Saves the license file to `$LBH_HOME`
### `lbh fetch`

Search for available models supported by LLMBoost.

```bash
lbh fetch [model]
```

Arguments:

- `[model]`: Optional model name pattern (supports regex-style matching). If omitted, lists all available models.

Examples:

```bash
# Search for Llama models
lbh fetch llama

# Search for a specific model
lbh fetch Llama-3.1-8B

# List all available models
lbh fetch
```

Behavior:

- Fetches the latest supported models from the LLMBoost registry
- Filters results to match your available GPU hardware
### `lbh list`

List local images and their status.

```bash
lbh list [model]
```

Arguments:

- `[model]`: Optional model name to filter results

Status Indicators:

- `pending`: Model not prepared; Docker image or model assets missing
- `stopped`: Model prepared but container not running
- `running`: Container running but idle
- `initializing`: Container running and starting the LLMBoost server
- `serving`: LLMBoost server ready to accept requests
- `tuning`: Autotuner running
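The status values above describe a rough lifecycle from `pending` through `serving`. A script that drives `lbh` can key decisions off these strings; the ordering and the helper functions below are illustrative assumptions for such a wrapper script, not part of the `lbh` tool itself.

```python
# Status values reported by `lbh list`, in rough lifecycle order.
# (`tuning` sits outside this linear path: it means the autotuner is running.)
LIFECYCLE = [
    "pending",       # not prepared; Docker image or model assets missing
    "stopped",       # prepared, but container not running
    "running",       # container up, but idle
    "initializing",  # container starting the LLMBoost server
    "serving",       # server ready to accept requests
]

def is_ready(status: str) -> bool:
    """True only when the server can accept inference requests."""
    return status == "serving"

def needs_prep(status: str) -> bool:
    """True when `lbh prep` still has work to do."""
    return status == "pending"
```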
### `lbh prep`

Download the Docker image and model assets.

```bash
lbh prep <Repo/Model-Name> [OPTIONS]
```

Arguments:

- `<Repo/Model-Name>`: Full model name from HuggingFace (e.g., `meta-llama/Llama-3.1-8B-Instruct`)

Options:

- `--only-verify`: Only check digests and sizes without downloading
- `--fresh`: Remove the existing image and re-download model assets from HuggingFace

Examples:

```bash
# Prepare a model
lbh prep meta-llama/Llama-3.1-8B-Instruct

# Verify existing downloads
lbh prep meta-llama/Llama-3.1-8B-Instruct --only-verify

# Force a fresh download
lbh prep meta-llama/Llama-3.1-8B-Instruct --fresh
```
### `lbh run`

Start a container for the specified model.

```bash
lbh run <Repo/Model-Name> [OPTIONS] -- [DOCKER_FLAGS...]
```

Arguments:

- `<Repo/Model-Name>`: Full model name from HuggingFace

Options:

- `--image <image>`: Override the Docker image
- `--model_path <path>`: Override the model assets path
- `--restart`: Restart the container if it is already running

Docker Flags:

Pass additional Docker flags after `--`.

Examples:

```bash
# Basic run
lbh run meta-llama/Llama-3.1-8B-Instruct

# Run with a custom memory limit
lbh run meta-llama/Llama-3.1-8B-Instruct -- --memory=32g

# Run with a custom network
lbh run meta-llama/Llama-3.1-8B-Instruct -- --network=my-network

# Restart the existing container
lbh run meta-llama/Llama-3.1-8B-Instruct --restart
```

Behavior:

- Automatically mounts `$LBH_HOME` and `$LBH_WORKSPACE`
- Injects `HF_TOKEN` if available
- AMD GPUs: Maps `/dev/dri` and `/dev/kfd`
- NVIDIA GPUs: Uses `--gpus all`
### `lbh serve`

Start the LLMBoost inference server inside a container.

```bash
lbh serve <Repo/Model-Name> [OPTIONS]
```

Arguments:

- `<Repo/Model-Name>`: Full model name from HuggingFace

Options:

- `--host <host>`: Server host address (default: `0.0.0.0`)
- `--port <port>`: Server port (default: `8080`)
- `--detached`: Don't wait for the server to be ready
- `--force`: Skip GPU utilization checks

Examples:

```bash
# Start the server with defaults
lbh serve meta-llama/Llama-3.1-8B-Instruct

# Custom port
lbh serve meta-llama/Llama-3.1-8B-Instruct --port 8011

# Detached mode (don't wait)
lbh serve meta-llama/Llama-3.1-8B-Instruct --detached

# Force serve (skip GPU checks)
lbh serve meta-llama/Llama-3.1-8B-Instruct --force
```

Behavior:

- Waits until the server is ready (unless `--detached` is given)
- Automatically runs `lbh prep` and `lbh run` if needed
- The server will be accessible at `http://<host>:<port>`
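When using `--detached`, `lbh serve` returns before the server is ready, so a script needs its own readiness check. A generic sketch with the Python standard library (a plain TCP probe, not an `lbh` feature) might look like this; without `--detached`, `lbh serve` already blocks until ready, so this is only needed for detached workflows:

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 60.0) -> bool:
    """Poll until a TCP connection to host:port succeeds, or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Succeeds as soon as something is listening on the port.
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            time.sleep(0.5)  # server not up yet; retry
    return False

# Example: after `lbh serve ... --detached`, call
#   wait_for_port("127.0.0.1", 8080)
# before sending requests.
```

Note that a successful TCP connect only shows the port is open; the server may still be loading the model, so a failed first request should be retried.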
### `lbh test`

Send a test request to the inference server.

```bash
lbh test <Repo/Model-Name> [OPTIONS]
```

Arguments:

- `<Repo/Model-Name>`: Full model name from HuggingFace

Options:

- `--query <text>`: Custom test query (default: a predefined test question)
- `-t <n>`: Number of test iterations (default: 1)
- `--host <host>`: Server host (default: `127.0.0.1`)
- `--port <port>`: Server port (default: `8080`)

Examples:

```bash
# Basic test
lbh test meta-llama/Llama-3.1-8B-Instruct

# Custom query
lbh test meta-llama/Llama-3.1-8B-Instruct --query "What is AI?"

# Multiple iterations
lbh test meta-llama/Llama-3.1-8B-Instruct -t 5

# Custom port
lbh test meta-llama/Llama-3.1-8B-Instruct --port 8011
```
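To script requests yourself instead of going through `lbh test`, you can build them with the Python standard library. One caveat: this reference does not document the server's HTTP API, so the `/v1/chat/completions` path and the OpenAI-style payload shape below are assumptions to adapt to the actual endpoint, not a documented contract.

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str,
                       host: str = "127.0.0.1", port: int = 8080):
    """Build an HTTP request for a chat-style inference endpoint.

    The /v1/chat/completions path and the payload shape are assumptions
    (OpenAI-compatible style); check the server's actual API.
    """
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"http://{host}:{port}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Usage against a running server:
# req = build_chat_request("meta-llama/Llama-3.1-8B-Instruct", "What is AI?")
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```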
### `lbh attach`

Open a shell inside the running container.

```bash
lbh attach <Repo/Model-Name> [OPTIONS]
```

Arguments:

- `<Repo/Model-Name>`: Full model name from HuggingFace

Options:

- `-c <container>`: Specify the container name or ID

Example:

```bash
lbh attach meta-llama/Llama-3.1-8B-Instruct
```
### `lbh stop`

Stop the running container.

```bash
lbh stop <Repo/Model-Name> [OPTIONS]
```

Arguments:

- `<Repo/Model-Name>`: Full model name from HuggingFace

Options:

- `-c <container>`: Specify the container name or ID

Example:

```bash
lbh stop meta-llama/Llama-3.1-8B-Instruct
```
### `lbh status`

Show the status of models.

```bash
lbh status [model]
```

Arguments:

- `[model]`: Optional model name to filter results

Examples:

```bash
# Status of all models
lbh status

# Status of a specific model
lbh status meta-llama/Llama-3.1-8B-Instruct
```
## Advanced Commands

### `lbh tune`

Run the autotuner to optimize performance.

```bash
lbh tune <Repo/Model-Name> [OPTIONS]
```

Arguments:

- `<Repo/Model-Name>`: Full model name from HuggingFace

Options:

- `--metrics <metric>`: Optimization metric (default: `throughput`)
- `--detached`: Run the tuner in the background
- `--image <image>`: Override the Docker image

Examples:

```bash
# Run the autotuner
lbh tune meta-llama/Llama-3.1-8B-Instruct

# Run in the background
lbh tune meta-llama/Llama-3.1-8B-Instruct --detached
```

Behavior:

- Stores results in `$LBH_HOME/inference.db`
- Automatically loads optimized settings on the next `lbh serve`
### `lbh completions`

Set up shell completions for easier command usage.

```bash
# For the current shell session
eval "$(lbh completions)"

# Persist for a virtual environment
lbh completions --venv

# Persist in the shell profile
lbh completions --profile
```
## Cluster Commands (Multi-Node Deployments)

- `lbh cluster install [--kubeconfig PATH] [--docker-username USER] [--docker-pat TOKEN] [--docker-email EMAIL] [-- EXTRA_HELM_ARGS]`
  - Installs the LLMBoost Helm chart and Kubernetes infrastructure for multi-node deployments.
  - Displays access credentials for the management and monitoring UIs after installation.
  - Requires a running Kubernetes cluster with `helm` installed.
  - Docker authentication options:
    - `--docker-username`, `--docker-pat`, `--docker-email`: Provide credentials directly (all three are required together)
    - Alternatively, run `docker login` and credentials will be read from `~/.docker/config.json`
    - If neither is provided, the cluster is installed without a Docker registry secret
- `lbh cluster deploy [-f CONFIG_FILE] [--kubeconfig PATH]`
  - Deploys models across cluster nodes based on a configuration file.
  - Generates and applies Kubernetes CRD manifests.
  - Config template: `$LBH_HOME/utils/template_cluster_config.jsonc`
- `lbh cluster status [--kubeconfig PATH] [--show-secrets]`
  - Shows the status of all model deployments and management services.
  - Displays summary statistics: `Models: <ready>/<total>` and `Mgmt.: <ready>/<total>`
  - Shows a model deployment table with pod status, restarts, and error messages.
  - Lists service URLs for the management UI and monitoring (Grafana).
  - Use `--show-secrets` to display access credentials (masked).
  - Use `-v --show-secrets` for full, unmasked credentials.
- `lbh cluster logs [--models|--management] [--pod POD_NAME] [--tail TAIL_ARGS...] [--grep GREP_ARGS...] [--kubeconfig PATH]`
  - Views logs from model deployment or management pods.
  - `--models`: Show logs from model deployment pods.
  - `--management`: Show logs from management/monitoring pods (displayed as a table).
  - `--pod POD_NAME`: Filter to a specific pod by name.
  - `--tail TAIL_ARGS`: Show the last N lines from workspace logs (default: 10).
  - `--grep GREP_ARGS`: Filter logs by pattern (uses `awk` for pattern matching).
  - Defaults to showing both model and management logs if no filter is specified.
- `lbh cluster remove <MODEL_NAME> [--all] [--kubeconfig PATH]`
  - Removes specific model deployments from the cluster.
  - Deletes `LLMBoostDeployment` custom resources by name.
  - `--all`: Remove all model deployments (requires confirmation unless used with `--force`).
  - Example: `lbh cluster remove facebook/opt-125m` or `lbh cluster remove --all`
- `lbh cluster uninstall [--kubeconfig PATH] [--force]`
  - Uninstalls LLMBoost cluster resources.
  - Prompts for confirmation unless `--force` is used.
  - Does not automatically delete the namespace.
Next: Explore LLMBoost Features to see what you can build.