Debuggability Guide
Overview
NVIDIA Cloud Functions provides comprehensive debuggability features through two main approaches:
Real-Time Logs
Access near real-time logs for faster debugging
Available through both NGC UI and CLI
No long-term storage; logs are ephemeral and exist only during the workload lifecycle
Significantly reduced latency compared to traditional logging solutions
Remote Command Execution
Execute commands on function or task containers for debugging purposes
Support for common Linux commands in NGC CLI
Secure, controlled access to container environments
Real-Time Logs
Real-time logs allow you to view function or task logs with minimal latency, providing immediate feedback during development and troubleshooting.
Key Benefits
Immediate Feedback: View logs in near real-time, reducing debugging cycles
Reduced Latency: Significantly faster than historical log solutions
Multiple Access Methods: Available through both NGC UI and CLI interfaces
Getting Started
Real-time logs are accessible for deployed NVIDIA Cloud Functions.
Access Logs via NGC UI
Navigate to your function in the NGC UI
Click the 3-dots button next to an active function
Select “View Logs” or “View Version Logs”
In the Logs page, you’ll see two tabs:
History Logs: For historical log analysis across different function version instances
Live Tail Logs: For near real-time log streaming
Using Live Tail Logs in NGC UI
Select the “Live Tail Logs” tab
Choose the Cluster name and Instance ID
Click “Start Session” to begin viewing live logs
Use “Pause Session” to temporarily halt the log stream
Use “Resume Session” to continue viewing logs
Click “Stop Session” to end the streaming session
Filter logs using the search box for quick identification of specific events
Note
Real-time logs are available after a function instance is actively running and a real-time logging session has begun. Once an instance terminates or restarts, these logs are no longer accessible. For historical log analysis, use the History Logs tab.
Live Tail logs are not stored and cannot be replayed after a session ends or after the 50,000-line console buffer is exceeded.
Currently, live tail logs are supported only for functions deployed to GFN and DGXC cloud environments, and some GFN and DGXC environments may not be supported.
Live tail logs are not yet available for tasks; support is planned.
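The buffer limit in the note can be pictured with a small local simulation. This is only an illustration built from standard tools, not the service's implementation: `seq` stands in for a log stream and the 60,000-line stream length is a made-up value, while `tail -n` keeps the newest 50,000 lines the way a bounded buffer would.

```shell
# Local illustration only: simulate a bounded live-tail buffer.
BUFFER_LINES=50000   # buffer size stated in this guide
STREAM_LINES=60000   # hypothetical number of lines emitted by a workload

# Generate a fake log stream and keep only the newest BUFFER_LINES lines.
buffer_file="$(mktemp)"
seq 1 "$STREAM_LINES" | sed 's/^/log line /' | tail -n "$BUFFER_LINES" > "$buffer_file"

# Inspect what survived: everything older than the buffer size has been dropped.
oldest="$(head -n 1 "$buffer_file")"
newest="$(tail -n 1 "$buffer_file")"
kept="$(wc -l < "$buffer_file" | tr -d ' ')"
echo "kept=$kept oldest='$oldest' newest='$newest'"
rm -f "$buffer_file"
```

In this simulation the first 10,000 lines are no longer visible once the stream ends, which mirrors why lines pushed out of the live tail buffer cannot be replayed.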
Remote Command Execution
Remote command execution allows you to run commands directly in your function’s or task’s container environment for advanced debugging. What you can do depends on your own container image: for example, if the container is distroless, you may not be able to access the file system of the target function or task container. When a command runs, the default working directory is the root directory of the target container.
Key Benefits
Interactive Debugging: Execute commands for troubleshooting without redeploying
Container Inspection: Examine file systems, processes, and environment variables
Secure Access: Commands are executed in a controlled, secure environment
Distroless Support: Debug containers with minimal operating system components
Getting Started
View Available Instances
Navigate to your function or task in the NGC UI or use the CLI
Use the CLI to list instances:
    # <org-id>: NGC organization ID; <team-name>: team name in the org

    # Function
    ngc cf fn instance ls <function-id>:<version-id> \
      --org <org-id> \
      --team <team-name>

    # Task
    ngc cf task instance ls <task-id> \
      --org <org-id> \
      --team <team-name>
Execute Commands via NGC CLI
    # Placeholders:
    #   <org-id>          NGC organization ID
    #   <team-name>       Team name in the org
    #   <instance-id>     Instance ID
    #   <pod-name>        Pod name
    #   <container-name>  Container name
    #   <linux-command>   Linux command to execute

    # Function
    ngc cf fn instance exec <function-id>:<version-id> \
      --org <org-id> \
      --team <team-name> \
      --instance-id <instance-id> \
      --pod-name <pod-name> \
      --container-name <container-name> \
      --command "<linux-command>"

    # Task
    ngc cf task instance exec <task-id> \
      --org <org-id> \
      --team <team-name> \
      --instance-id <instance-id> \
      --pod-name <pod-name> \
      --container-name <container-name> \
      --command "<linux-command>"
Example:

    # Function
    ngc cf fn instance exec my-function:v1 \
      --org my-organization \
      --team my-team \
      --instance-id instance-1 \
      --pod-name pod-1234 \
      --container-name main \
      --command "ls -la"

    # Task
    ngc cf task instance exec my-task \
      --org my-organization \
      --team my-team \
      --instance-id instance-1 \
      --pod-name pod-1234 \
      --container-name main \
      --command "ls -la"
NGC CLI Requirements and Examples
CLI Version Requirements
The debuggability features are only available in NGC CLI versions 3.131.5 and newer.
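A script that depends on these features could check the installed CLI version before calling them. The sketch below shows one way to compare dot-separated version numbers in a portable shell function. The name `version_ge` is hypothetical, and the commented usage assumes that `ngc --version` prints a line ending in the version number, which you should confirm on your installation.

```shell
# Hypothetical helper: succeed (exit 0) when version $1 is >= version $2.
# Fields are compared numerically, left to right; missing fields count as 0.
version_ge() {
  awk -v have="$1" -v need="$2" 'BEGIN {
    n = split(have, h, ".")
    m = split(need, r, ".")
    max = (n > m) ? n : m
    for (i = 1; i <= max; i++) {
      a = (i <= n) ? h[i] + 0 : 0
      b = (i <= m) ? r[i] + 0 : 0
      if (a > b) exit 0
      if (a < b) exit 1
    }
    exit 0
  }'
}

# Usage (assumption: the last word printed by `ngc --version` is the version):
#   installed="$(ngc --version | awk '{print $NF}')"
#   version_ge "$installed" "3.131.5" || echo "Upgrade the NGC CLI to use debuggability features"
```

Comparing field by field avoids the mistake of a plain string comparison, which would sort 3.64.4 after 3.131.5.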
Detailed CLI Examples
List Function Instances, Containers, and Pods
    # <org-id>: NGC organization ID; <team-name>: team name in the org

    # Function
    ngc cf fn instance ls <function-id>:<version-id> \
      --org <org-id> \
      --team <team-name>

    # Task
    ngc cf task instance ls <task-id> \
      --org <org-id> \
      --team <team-name>
Example:

    # Function
    ngc cf fn instance ls my-function:v1 \
      --org my-organization \
      --team my-team

    # Task
    ngc cf task instance ls my-task \
      --org my-organization \
      --team my-team
Execute Commands on Target Containers
    # Placeholders:
    #   <org-id>          NGC organization ID
    #   <team-name>       Team name in the org
    #   <instance-id>     Instance ID
    #   <pod-name>        Pod name
    #   <container-name>  Container name
    #   <linux-command>   Linux command to execute

    # Function
    ngc cf fn instance exec <function-id>:<version-id> \
      --org <org-id> \
      --team <team-name> \
      --instance-id <instance-id> \
      --pod-name <pod-name> \
      --container-name <container-name> \
      --command "<linux-command>"

    # Task
    ngc cf task instance exec <task-id> \
      --org <org-id> \
      --team <team-name> \
      --instance-id <instance-id> \
      --pod-name <pod-name> \
      --container-name <container-name> \
      --command "<linux-command>"
Example:

    # Function
    ngc cf fn instance exec my-function:v1 \
      --org my-organization \
      --team my-team \
      --instance-id instance-1 \
      --pod-name pod-1234 \
      --container-name main \
      --command "ls -la"

    # Task
    ngc cf task instance exec my-task \
      --org my-organization \
      --team my-team \
      --instance-id instance-1 \
      --pod-name pod-1234 \
      --container-name main \
      --command "ls -la"
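When a debugging session needs several checks in a row, the exec command above can be wrapped in a loop. The following is a minimal sketch under stated assumptions: `run_diagnostics` and the `NGC` variable are hypothetical names, the four diagnostic commands are simply picked from the supported list, and only flags already shown in this guide are used. Setting `NGC=echo` gives a dry run that prints the arguments instead of contacting the service.

```shell
# NGC defaults to the real CLI; set NGC=echo for a dry run that only prints.
NGC="${NGC:-ngc}"

# Hypothetical helper: run a few read-only diagnostics against one container.
# Arguments: <function-id>:<version-id> <org> <team> <instance> <pod> <container>
run_diagnostics() {
  target="$1"; org="$2"; team="$3"; instance="$4"; pod="$5"; container="$6"
  for cmd in "pwd" "ls -la" "df" "ps"; do
    # Stop at the first failing call so errors are not buried in later output.
    "$NGC" cf fn instance exec "$target" \
      --org "$org" \
      --team "$team" \
      --instance-id "$instance" \
      --pod-name "$pod" \
      --container-name "$container" \
      --command "$cmd" || return 1
  done
}

# Usage (dry run):
#   NGC=echo
#   run_diagnostics my-function:v1 my-organization my-team instance-1 pod-1234 main
```

Each iteration is a separate exec call, so a `cd` in one command does not carry over to the next; chain commands with `&&` inside a single `--command` string when they must share a working directory.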
Attach Log Output from a Specific Pod Container
    # Placeholders:
    #   <org-id>          NGC organization ID
    #   <team-name>       Team name in the org
    #   <instance-id>     Instance ID
    #   <pod-name>        Pod name
    #   <container-name>  Container name

    # Function
    ngc cf fn instance logs <function-id>:<version-id> \
      --org <org-id> \
      --team <team-name> \
      --instance-id <instance-id> \
      --pod-name <pod-name> \
      --container-name <container-name>

    # Task
    ngc cf task instance logs <task-id> \
      --org <org-id> \
      --team <team-name> \
      --instance-id <instance-id> \
      --pod-name <pod-name> \
      --container-name <container-name>
Example:

    # Function
    ngc cf fn instance logs my-function:v1 \
      --org my-organization \
      --team my-team \
      --instance-id instance-1 \
      --pod-name pod-1234 \
      --container-name main

    # Task
    ngc cf task instance logs my-task \
      --org my-organization \
      --team my-team \
      --instance-id instance-1 \
      --pod-name pod-1234 \
      --container-name main
Attach Log Output from an Entire Instance
    # <org-id>: NGC organization ID; <team-name>: team name in the org
    # <instance-id>: instance ID

    # Function
    ngc cf fn instance logs <function-id>:<version-id> \
      --org <org-id> \
      --team <team-name> \
      --instance-id <instance-id>

    # Task
    ngc cf task instance logs <task-id> \
      --org <org-id> \
      --team <team-name> \
      --instance-id <instance-id>
Example:

    # Function
    ngc cf fn instance logs my-function:v1 \
      --org my-organization \
      --team my-team \
      --instance-id instance-1

    # Task
    ngc cf task instance logs my-task \
      --org my-organization \
      --team my-team \
      --instance-id instance-1
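Because live logs are not stored by the service, it can be useful to keep a local copy while watching for specific events. The sketch below is one possible approach, assuming the `logs` command writes log lines to standard output (not confirmed by this guide). The function name `capture_logs` and the file name in the usage comment are hypothetical.

```shell
# Hypothetical helper: save everything read from stdin to a file and
# print only the lines that match a case-insensitive pattern.
#   $1: file that receives the full stream
#   $2: pattern to display
capture_logs() {
  tee "$1" | grep -i -- "$2"
}

# Usage (assumes the logs command streams to stdout):
#   ngc cf fn instance logs my-function:v1 \
#     --org my-organization \
#     --team my-team \
#     --instance-id instance-1 | capture_logs instance-1.log error
```

The full stream lands in the file even when the pattern filters most lines out of the terminal, so the saved copy can still be searched after the session ends.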
Supported Commands
The following commands are supported for remote execution:
Command/Method | Description |
---|---|
cat | Display file contents |
ls | List directory contents |
cd | Change directory |
pwd | Print working directory |
man | Display manual pages |
sort | Sort lines of text files |
df | Report file system disk space usage |
du | Estimate file space usage |
grep | Search for patterns in files |
find | Search for files |
head | Display beginning of files |
more | Page through text |
less | Page through text with more features |
tail | Display end of files |
wc | Print newline, word, and byte counts |
cut | Remove sections from lines |
echo | Display a line of text |
printf | Format and print data |
Print data | |
ps | Report process status |
base64 | Base64 encode/decode |
Pipe (\|) | Pipe output |
Input redirect (<) | Redirect input |
Command separator (;) | Separate commands |
Command chaining (&&) | Chain commands |
Note
The command execution environment is isolated and has no impact on the function’s running state. Command execution is logged for security and audit purposes.
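Before sending a command, a script could check locally that every command name in it appears in the supported commands table. The following is a rough sketch of such a pre-check, not the service's own validation: `is_supported` and `SUPPORTED` are hypothetical names, the list is copied from the named commands in the table, and the check looks only at the first word of each segment separated by `|`, `;`, or `&&`.

```shell
# Command names copied from the supported commands table in this guide.
SUPPORTED="cat ls cd pwd man sort df du grep find head more less tail wc cut echo printf ps base64"

# Hypothetical client-side pre-check; the service enforces its own rules.
# Succeeds only if the first word of every segment is in SUPPORTED.
# Limitation: operators inside quoted strings are also treated as separators,
# and arguments or other operators are not inspected.
is_supported() (
  printf '%s\n' "$1" | sed 's/&&/;/g' | tr '|;' '\n\n' |
  while read -r first _rest; do
    if [ -z "$first" ]; then
      continue                 # skip empty segments
    fi
    case " $SUPPORTED " in
      *" $first "*) : ;;       # supported, keep checking
      *) exit 1 ;;             # unsupported command: fail the check
    esac
  done
)

# Usage:
#   is_supported "ls -la /var/log | grep error && wc -l" && echo "looks OK"
```

The function body runs in a subshell, so the early `exit 1` ends only the check and never the calling script.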
Security
NVCF ensures secure debugging capabilities:
Authentication and authorization for all debugging actions
Container isolation prevents unauthorized access
Limited command set to prevent system modifications
Access control based on NGC permissions
All debugging actions are logged and auditable
Troubleshooting
Common Error Codes
Error Code | Description | Possible Resolution |
---|---|---|
400 (BadRequestException) | Function/Task is inactive or invalid parameters provided | Ensure function/task is active and parameters are correct |
401 (NotAuthorizedException) | Invalid authentication token | Check that your NGC API key or SSA token is valid |
403 (ForbiddenException) | Insufficient permissions or function/task does not exist | Verify that your token has the appropriate scopes and the function/task exists |
404 (NotFoundException) | Selected pod/container/instance does not exist | Verify that the specified resources exist and are correctly named |
429 (TooManyRequestsException) | Rate limit exceeded | Reduce the frequency of requests and try again later |
500 (UpstreamException) | Internal service error | Contact support if the issue persists |
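Automation around the CLI can turn the error code table into a lookup so that failures print an actionable hint. This is a minimal sketch: the function name `explain_status` is hypothetical, the hints are taken from the table above, and how you obtain the status code from a failed call is left to your own tooling.

```shell
# Hypothetical helper: map an HTTP status code from the table to its
# suggested resolution.
explain_status() {
  case "$1" in
    400) echo "Ensure function/task is active and parameters are correct" ;;
    401) echo "Check that your NGC API key or SSA token is valid" ;;
    403) echo "Verify that your token has the appropriate scopes and the function/task exists" ;;
    404) echo "Verify that the specified resources exist and are correctly named" ;;
    429) echo "Reduce the frequency of requests and try again later" ;;
    500) echo "Contact support if the issue persists" ;;
    *)   echo "Unrecognized status code: $1" ;;
  esac
}

# Usage:
#   explain_status 404
```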
Required Permissions
To use the debuggability features, ensure your NGC API key has the correct permissions:
When generating an NGC API key from the NGC console, select the “Cloud Function” permission
This permission grants the necessary access to use both Live Tail Logs and Command Execution features
Limitations
Real-time logs are ephemeral with no long-term storage
Historical logs are still available through the standard logging system
Command execution is limited to a predefined set of commands
Debugging sessions have a maximum duration of 2 hours
Output size is limited to 2MB per command
Live tail logs are only supported for functions deployed to GFN and DGXC cloud environments
Live tail logs view maintains a maximum of 50,000 lines in the console buffer
Real-time logs cannot be searched in aggregate across functions (e.g., searching for a string across all functions in an organization)
Appendix A: Terminology
Term | Definition |
---|---|
NGC | NVIDIA GPU Cloud, which provides a way for users to set up and manage access to NVIDIA cloud services |
NVCF | NVIDIA Cloud Functions |
NVCT | NVIDIA Cloud Tasks |
Ephemeral Container | A temporary container created within a pod for debugging purposes |
Real-time Logs | Logs streamed with minimal latency during function execution |
DGXC | DGX Cloud service |
History Logs | Logs stored for longer-term analysis with search capabilities |
Live Tail Logs | Near real-time streaming logs with minimal latency |
Distroless Container | A container image with minimal operating system components |