Quickstart
Deploy your first AI application on Cordatus in about 10 minutes. This page is a condensed walkthrough; for full steps, screenshots, and videos, follow the links in each section.
Prerequisites
- A Cordatus account with appropriate permissions
- At least one device connected to Cordatus (status: Online or Connected)
- Device with GPU support (recommended for LLM applications)
- Cordatus Client installed and running on your device
If you don't have a device connected yet, complete the Device Hub Quickstart first.
Getting Started in 10 Minutes
1. Browse Available Applications (1 minute)
- Go to Containers > Applications from the left menu
- Browse the application catalog:
  - vLLM (high-throughput LLM inference)
  - TensorRT-LLM (optimized NVIDIA inference)
  - Ollama (simple local models)
  - NVIDIA Dynamo (distributed multi-GPU)
  - NVIDIA VSS (video analysis)
- Click on any application to view its Detail Page
- Check supported platforms and available Docker image versions
See full details → Application Launch Guide
2. Launch Your First Application (3 minutes)
Example: Deploy vLLM with a Small Model
- Click Start Application on the vLLM detail page
- Select Device: Choose your connected device
- Select Version: Choose the latest Docker image version
  - Green checkmark = already downloaded
  - Download icon = will be downloaded
- Click Next to proceed to Advanced Settings
See full details → Application Launch Guide - Section 4
3. Configure Basic Settings (3 minutes)
- General Settings:
  - Environment Name: Leave blank for an auto-generated name (or enter a custom name such as vllm-llama2-7b)
  - Select GPU: Choose All GPU or select specific GPUs
  - Resource Limits: Keep the default values (or customize CPU/RAM limits)
  - Enable Open Web UI: Check this box (creates a chat interface automatically)
- Model Selection:
  - Switch to the Cordatus Models tab
  - Search for llama-2-7b or choose any small model (7B or 13B recommended for a first deployment)
  - Click the model to select it
- Skip Other Settings for Now:
  - Docker Options: Auto-configured
  - Environment Variables: Pre-filled with defaults
  - Engine Arguments: Optimized for the selected model
See full details → Application Launch Guide - Section 5
4. Launch the Container (1 minute)
- Click Start Environment button (bottom right)
- Enter your sudo password when prompted
- If Docker image needs download, confirm the download
- Wait for deployment to complete (1-5 minutes depending on image size)
What happens behind the scenes:
- Cordatus connects to your device
- Downloads Docker image (if needed)
- Creates and starts the container
- Configures networking and volumes
- Starts Open Web UI container (if enabled)
See full details → Application Launch Guide - Section 6
5. Verify Deployment (2 minutes)
- Go to Containers > Containers from the left menu
- Find your newly created container group
- Verify status shows Running
- Click See the Container Informations (three-dot menu)
- Check Logs tab - you should see model loading messages
- Go to Ports tab - copy the Local URL
Access Your Application:
- API Endpoint: Open the Local URL in browser (shows API documentation)
- Open Web UI: If enabled, find the second container in the group and open its URL
- Test with curl:
curl http://localhost:8000/v1/models
See full details → Container Management Guide - Section 3.6
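The curl check above only lists models. To verify end-to-end inference, you can also send a completion request: vLLM serves an OpenAI-compatible API at the container's Local URL. A minimal sketch using only the Python standard library (the base URL and model name are placeholders; substitute the Local URL from the Ports tab and a model ID returned by /v1/models):

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # placeholder: use the Local URL from the Ports tab

def completion_payload(model: str, prompt: str, max_tokens: int = 32) -> bytes:
    """Build the JSON body for an OpenAI-compatible /v1/completions call."""
    return json.dumps(
        {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    ).encode()

def complete(model: str, prompt: str, base_url: str = BASE_URL) -> str:
    """POST the prompt to the running vLLM container and return the generated text."""
    req = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=completion_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]
```

For example, `complete("llama-2-7b", "Hello")` should return generated text once the logs show the model has finished loading.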
Expected Result
After completing these steps, you should have:
- Container Running
  - Status: Running (green indicator)
  - Logs show successful model loading
  - API endpoint accessible via the Local URL
- Open Web UI Active (if enabled)
  - Second container in the group shows Running
  - Chat interface accessible via browser
  - Can send messages and receive AI responses
- Application in Containers List
  - Visible under Containers > Containers
  - Container group properly organized
  - All components healthy
What's Next?
Explore VRAM Calculator (5 minutes)
Before deploying larger models, calculate VRAM requirements:
- Go to VRAM Calculator from the main menu
- Select Model: Choose a larger model (e.g., Llama-2-70B)
- Select GPU: Choose your GPU model
- Review Results: Check if VRAM is sufficient
- Adjust Settings: Try different quantization levels (INT8, INT4)
Learn to:
- Calculate memory requirements before deployment
- Compare different quantization options
- Determine optimal batch size and sequence length
- Plan multi-GPU deployments
See full details → VRAM Calculator User Guide
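The calculator's results can be sanity-checked with a back-of-the-envelope estimate: weight memory is roughly parameter count times bytes per parameter, plus overhead for activations and KV cache. A rough sketch (the 20% overhead factor is an assumption for small batch sizes, not the calculator's actual formula):

```python
# Approximate bytes per parameter for common quantization levels.
BYTES_PER_PARAM = {"FP16": 2.0, "INT8": 1.0, "INT4": 0.5}

def estimate_vram_gb(params_billion: float, quant: str, overhead: float = 1.2) -> float:
    """Rough VRAM estimate in GB: weight memory plus ~20% overhead
    for activations and KV cache at small batch sizes."""
    weights_gb = params_billion * BYTES_PER_PARAM[quant]
    return round(weights_gb * overhead, 1)

print(estimate_vram_gb(7, "INT4"))   # a 7B model at INT4
print(estimate_vram_gb(70, "INT8"))  # a 70B model at INT8
```

These rough figures line up with the ranges in the Recommended GPU Memory table under Quick Reference; longer contexts and larger batch sizes push the overhead well past 20%, which is why the calculator's detailed settings matter.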
Add Your Own Models (10 minutes)
Use models you already downloaded on your device:
- Configure Model Paths:
  - Connect to your device
  - Go to Metrics > Model Info
  - Define paths for Huggingface, Ollama, or NVIDIA NIM
- Scan for Models:
  - Go to the LLM Models page
  - Click Explore Models on Your Device
  - Click Start Scanning
  - Select models to add to Cordatus
- Deploy User Model:
  - Go to the LLM Models > User Models tab
  - Click Deploy next to any model
  - Select an inference engine
  - Configure and launch
See full details → User Models and Model Transfer Guide
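The paths defined under Configure Model Paths are typically the engines' default cache directories. An illustrative example assuming standard Linux locations (`<user>` is a placeholder; verify the actual paths on your device, since they differ if you changed cache settings):

```
Huggingface: /home/<user>/.cache/huggingface/hub
Ollama:      /home/<user>/.ollama/models
```

When Ollama runs as a Linux system service, its models usually live under /usr/share/ollama/.ollama/models instead.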
Deploy Advanced Applications (15-30 minutes)
Try more complex deployments:
NVIDIA AI Dynamo (Multi-GPU Distributed Inference):
- Configure processing mode (Aggregated/Disaggregated)
- Set up router strategy (KV-Aware recommended)
- Create multiple workers with GPU assignments
- Launch the distributed inference pipeline
See full details → NVIDIA AI Dynamo Creation Guide
NVIDIA VSS (Video Analysis):
- Configure main VSS container
- Set up VLM, LLM, Embed, and Rerank components
- Choose to create new or use existing components
- Deploy the complete video analysis pipeline
See full details → NVIDIA VSS Creation Guide
Manage Your Containers (5 minutes)
Learn container management operations:
- Start/Stop Containers:
  - Click Start/Stop buttons for any container
  - Select multiple containers for batch operations
- View Container Information:
  - Monitor real-time logs
  - Review configuration parameters
  - Check port mappings
- Generate Public URLs:
  - Make your applications accessible externally
  - Share access with team members
- Create Open Web UI:
  - Add a chat interface to existing LLM containers
  - Generate public URLs for sharing
- Duplicate Containers:
  - Copy existing configurations
  - Modify settings and redeploy
See full details → Container Management Guide
Troubleshooting Quick Tips
Container Won't Start:
- Check device status is Connected
- Verify GPU is available and not in use
- Review container logs for error messages
- Ensure sufficient disk space for Docker image
Out of VRAM Error:
- Use VRAM Calculator to verify requirements
- Try lower quantization (FP16 → INT8 → INT4)
- Reduce batch size or sequence length
- Add more GPUs or use smaller model
Model Not Found:
- For Custom Models: Verify model name is correct
- For User Models: Ensure model paths are configured
- Check model is accessible from container
- Review volume mappings in Docker Options
Open Web UI Not Working:
- Verify container is running
- Check port is not already in use
- Review Open Web UI container logs
- Ensure network connectivity between containers
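The "port already in use" check above can be scripted from any machine that can reach the device; a minimal sketch using only the Python standard library (host and port are placeholders for your deployment):

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(2)  # don't hang on unreachable hosts
        return s.connect_ex((host, port)) == 0
```

If `port_in_use(8000)` returns True before you launch, choose a different host port in the container's Docker Options.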
Quick Reference
Application Types
| Type | Use Case | Complexity | Setup Time |
|---|---|---|---|
| Standard Apps | Basic containers | Low | 5 min |
| LLM Engines | Model inference | Medium | 5 min |
| NVIDIA Dynamo | Multi-GPU distributed | High | 5 min |
| NVIDIA VSS | Video analysis | High | 10 min |
Recommended GPU Memory
| Model Size | Quantization | Minimum VRAM | Recommended GPU |
|---|---|---|---|
| 7B | INT4 | 4-6 GB | RTX 3090, RTX 4090 |
| 7B | INT8 | 8-10 GB | RTX 4090, A10 |
| 13B | INT4 | 8-10 GB | RTX 4090, A10 |
| 13B | INT8 | 14-16 GB | A10, A100 40GB |
| 70B | INT4 | 40-50 GB | A100 80GB, 2x A100 40GB |
| 70B | INT8 | 80-90 GB | A100 80GB, 2x A100 80GB |
Key Shortcuts
- Containers > Applications: Browse application catalog
- Containers > Containers: Manage running containers
- LLM Models: Access Cordatus Models and User Models
- VRAM Calculator: Calculate memory requirements
- Device Metrics: View GPU/CPU/RAM usage
Get Help
For detailed documentation with screenshots and videos:
- Application Launch Guide
- NVIDIA AI Dynamo Guide
- NVIDIA VSS Guide
- User Models Guide
- Container Management Guide
- VRAM Calculator Guide