Optimal Topology
VSS Deployment Topologies
VSS supports several deployment topologies, each optimized for particular GPU types and performance requirements. Which topology to choose depends on your hardware configuration.
Default Topology
The default topology dedicates four GPUs to the LLM NIM, two GPUs to the VSS ingestion and retrieval pipeline, and one GPU each to the NeMo embedding and reranking NIMs, for eight GPUs in total. This topology is designed for systems where a single GPU cannot host multiple NIMs, for example, systems with L40S GPUs.
For details on the default topology configuration, refer to Default Deployment Topology and Models in Use.
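As a rough illustration of the default split described above, the following Python sketch maps each component to a contiguous block of GPU indices and checks the total. The component names (`llm-nim`, `vss-pipeline`, `embedding-nim`, `reranking-nim`) and the contiguous index assignment are assumptions made for this example. They are not taken from the VSS configuration files, and the actual deployment may assign devices differently.

```python
# Illustrative sketch only: the component names are hypothetical labels,
# not VSS configuration keys. GPU counts follow the default topology text.
DEFAULT_TOPOLOGY = {
    "llm-nim": 4,        # LLM NIM
    "vss-pipeline": 2,   # VSS ingestion and retrieval pipeline
    "embedding-nim": 1,  # NeMo embedding NIM
    "reranking-nim": 1,  # NeMo reranking NIM
}


def assign_gpus(topology):
    """Give each component a contiguous, non-overlapping block of GPU indices.

    Contiguous assignment is an assumption for this sketch; a real
    deployment may map devices in another order.
    """
    assignment = {}
    next_gpu = 0
    for component, count in topology.items():
        assignment[component] = list(range(next_gpu, next_gpu + count))
        next_gpu += count
    return assignment


def total_gpus(topology):
    """Return the number of GPUs the topology needs in total."""
    return sum(topology.values())


if __name__ == "__main__":
    for component, gpus in assign_gpus(DEFAULT_TOPOLOGY).items():
        print(f"{component}: GPUs {','.join(map(str, gpus))}")
    print(f"total GPUs required: {total_gpus(DEFAULT_TOPOLOGY)}")
```

A check like this can help confirm that a machine has enough GPUs before you choose the default topology over a more compact one.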