Most web frameworks default to localhost in development. Always explicitly bind to 0.0.0.0 in containerized environments.
Not Enough Resources
Symptom: Deployment fails with “insufficient resources” or pods remain in “Pending” state.
Cause: Your cluster doesn’t have enough CPU or memory capacity to run your service.
Solutions:
Reduce Service Resources
Upgrade Instance Type
Increase Node Count
Lower the resource requests for your service if they’re set too high:
Go to your service Settings → Resources
Reduce CPU or Memory requests
Redeploy the service
Start with minimum resources and scale up as needed. Most applications don’t need as much as you think!
If your services legitimately need more resources:
Go to Cluster Settings → Node Pools
Select larger instance types
Update the cluster
Example: Upgrade from t3.medium (2 vCPU, 4 GB RAM) to t3.large (2 vCPU, 8 GB RAM)
Allow your cluster to scale to more nodes:
Go to Cluster Settings → Node Pools
Increase Maximum nodes count
Update the cluster
With Karpenter, this allows automatic scaling to meet demand.
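If you also have kubectl access to the cluster, the scheduler's events usually name the exact shortage. A minimal check (pod and namespace names are placeholders):

```shell
# The Events section of a Pending pod states which resource is missing
kubectl describe pod <pod-name> -n <namespace> | tail -n 20

# Compare what is requested vs. what is allocatable on each node
kubectl describe nodes | grep -A 5 "Allocated resources"
```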
Application is Crashing
Symptom: Your service deploys but immediately crashes or restarts repeatedly.
Solution: Debug using the Qovery Shell
1. Access the Container
Use the Qovery CLI to access your container:
```shell
qovery shell
```
This opens an interactive shell inside your running container.
2. Investigate the Issue
Once inside:
Check environment variables: env
Test your startup command manually
Review application configuration files
Check for missing dependencies
3. For Rapidly Crashing Apps
If your app crashes too fast to shell into:
Remove the port temporarily from service settings (this prevents Kubernetes from restarting it)
Modify your Dockerfile to use a sleep command:
```dockerfile
# Comment out your entrypoint
# ENTRYPOINT ["npm", "start"]

# Add sleep to keep container running
ENTRYPOINT ["sleep", "infinity"]
```
Deploy with this change
Use qovery shell to debug
Fix the issue and restore the original entrypoint
Remember to restore your port configuration and entrypoint after debugging!
SSL/TLS Certificate Issues
Symptom: SSL certificates aren’t being generated for your custom domain.
Cause: DNS records are not properly configured for your custom domain.
Solution:
1. Identify the Problem
Check the Qovery Console for which domain is failing certificate generation. You’ll see an error indicator next to the domain.
2. Verify DNS Configuration
Your domain should have a CNAME record pointing to your Qovery cluster URL.
Verify DNS resolution:
```shell
dig your-domain.com CNAME
```
You should see a CNAME pointing to your Qovery cluster domain.
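As a quicker check, `dig +short` prints only the answer (the domain below is a placeholder):

```shell
dig +short your-domain.com CNAME
# A single line ending in your Qovery cluster domain indicates a correct record
```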
3. Fix and Redeploy
Update your DNS CNAME record with your domain provider
Wait for DNS propagation (can take up to 48 hours, usually minutes)
Redeploy your application in Qovery
Certificate generation should succeed
DNS changes can take time to propagate. Use DNS Checker to verify propagation globally.
Docker Build Timeout
Symptom: Your build fails with a timeout error after 30 minutes.
Cause: The default Docker build timeout is 1800 seconds (30 minutes). Complex builds (like compiling large codebases) may exceed this limit.
Solution:
1. Increase Build Timeout
Go to your service Settings → Advanced Settings
Find the build.timeout_max_sec parameter
Increase the value (e.g., 3600 for 1 hour)
Save and redeploy
2. Optimize Your Build (Recommended)
Consider optimizing your Dockerfile:
Use multi-stage builds
Leverage build caching effectively
Only copy necessary files
Install dependencies before copying source code
Example Multi-stage Dockerfile:
```dockerfile
# Build stage
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Production stage
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY package*.json ./
RUN npm ci --production
CMD ["node", "dist/index.js"]
```
Git Submodule Errors
Symptom: Build fails when trying to clone private Git submodules.
Cause: Private submodules require authentication, which isn’t available during the build.
Solutions:
Make Submodule Public (Recommended)
Use Git Credential Helper
Use SSH Keys
If possible, make your submodule repository public. This is the simplest solution.
Embed basic authentication in your .gitmodules file:
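The example appears to have been cut off here. A minimal sketch of what such a `.gitmodules` entry could look like — the submodule name, path, and token are placeholders, and committing a raw token is risky, so prefer a short-lived or read-only access token:

```ini
[submodule "libs/private-lib"]
	path = libs/private-lib
	url = https://<username>:<personal-access-token>@github.com/<org>/private-lib.git
```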
Job Failures
Symptom: Your Lifecycle Job or Cronjob fails to complete successfully.
Common Causes:
Code exceptions - Errors in your application code
Out of memory - Job exceeds memory limits
Execution timeout - Job takes longer than configured maximum duration
Solutions:
1. Check Job Logs
Go to your Job service
Click Logs tab
Look for error messages or stack traces
Identify the root cause (exception, OOM, timeout)
2. Fix Based on Cause
For Code Exceptions:
Fix the bug in your code
Redeploy the job
For Out of Memory:
Increase memory allocation in Settings → Resources
Optimize your code to use less memory
For Timeouts:
Go to Settings → Max Duration
Increase the timeout value
Or optimize your job to run faster
For long-running jobs, consider breaking them into smaller tasks or using a queue system.
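Splitting a job into chunks keeps each run well under the configured Max Duration. A minimal sketch (names and chunk size are illustrative, not part of any Qovery API):

```python
# Sketch: split one long-running job into bounded chunks so each run
# stays well under the job's configured timeout.
def process_in_chunks(items, chunk_size=100):
    """Yield successive slices so each can be handled as a separate task."""
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]

chunks = list(process_in_chunks(list(range(250)), chunk_size=100))
print([len(c) for c in chunks])  # → [100, 100, 50]
```

Each chunk could then be enqueued as its own job run, so a failure only re-processes one slice instead of the whole dataset.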
SnapshotQuotaExceeded Error (Database)
Symptom: Database deletion fails with a SnapshotQuotaExceeded error.
Cause: Qovery automatically creates a snapshot before deleting a database. If you’ve reached your cloud provider’s snapshot quota, this fails.
Solutions:
Delete Old Snapshots
Request Quota Increase
Remove obsolete database snapshots from your cloud provider.
AWS RDS:
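Assuming the AWS CLI is configured, manual snapshots can be listed and pruned like this (the snapshot identifier is a placeholder — double-check before deleting):

```shell
# List manual DB snapshots, oldest first
aws rds describe-db-snapshots \
  --snapshot-type manual \
  --query 'sort_by(DBSnapshots,&SnapshotCreateTime)[].[DBSnapshotIdentifier,SnapshotCreateTime]' \
  --output table

# Delete an obsolete snapshot
aws rds delete-db-snapshot --db-snapshot-identifier <snapshot-id>
```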
Find solutions for common runtime errors and issues you may encounter when operating services on Qovery after successful deployment.
SIGKILL Signal 137 - Memory Exhaustion
Symptom: Your container terminates unexpectedly with exit code 137 or a SIGKILL signal.
Cause: Your application has exceeded its memory limit. When system resources become constrained, Kubernetes forcibly terminates the container to reclaim memory (Out of Memory Kill, or OOMKill).
How to Identify: Check your logs for messages like:

```text
Container killed with exit code 137
```

or

```text
OOMKilled: true
```
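If you have kubectl access, the last container state also confirms an OOM kill (pod name and namespace are placeholders):

```shell
kubectl get pod <pod-name> -n <namespace> \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
# "OOMKilled" confirms the memory limit was hit
```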
Solutions:
1. Increase Memory Allocation
Go to your service Settings → Resources
Increase the Memory limit
Start with a 50% increase (e.g., 512MB → 768MB)
Redeploy and monitor
Watch your memory usage metrics to find the right allocation. Don’t over-allocate unnecessarily!
2. Investigate Memory Leaks
Before simply increasing memory, check whether your application has a memory leak.
Signs of a Memory Leak:
Memory usage steadily increases over time
Container was fine, then started crashing after recent code changes
Memory never levels off or decreases
Recent Changes to Review:
New dependencies or library updates
Code changes in recent deployments
New features that load data into memory
Caching implementations without expiration
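A cache without expiration is a classic leak source. A minimal sketch of the fix, bounding the cache with an LRU eviction policy (class and sizes are illustrative):

```python
# Sketch of fixing a classic leak: a module-level cache that only ever grows.
# Bounding it with an LRU policy keeps memory flat no matter how many inserts.
from collections import OrderedDict

class BoundedCache:
    """Tiny LRU cache; evicts the least-recently-used entry past max_entries."""
    def __init__(self, max_entries=1000):
        self.max_entries = max_entries
        self._data = OrderedDict()

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)  # refresh recency
        self._data[key] = value
        if len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict the oldest entry

    def get(self, key):
        if key in self._data:
            self._data.move_to_end(key)
            return self._data[key]
        return None

cache = BoundedCache(max_entries=3)
for i in range(10):
    cache.put(i, i * i)
print(len(cache._data))  # → 3, regardless of how many inserts happened
```

An unbounded `dict` used the same way would grow to 10 entries here and keep growing in production.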
3. Optimize Memory Usage
Common optimization strategies:
Clear unused variables and objects
Implement pagination for large datasets
Use streaming for file processing
Add proper cache eviction policies
Profile your application to find memory-intensive code
```python
import tracemalloc

tracemalloc.start()

# Your code here

snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')
for stat in top_stats[:10]:
    print(stat)
```
Continuously increasing memory without investigating the root cause will lead to higher costs and may just delay the problem!
Debugging Rapidly Crashing Applications
Symptom: Your application crashes within seconds of starting, making it impossible to connect and debug.
Challenge: The container restarts so quickly that you can’t use qovery shell to investigate.
Solution:
1. Temporarily Remove Application Port
Go to your service Settings → Ports
Remove or disable the application port
Deploy the changes
Removing the port prevents Kubernetes from performing health checks and auto-restarting the container.
2. Modify Dockerfile to Keep Container Running
Update your Dockerfile to override the entrypoint with a sleep command:
```dockerfile
# Comment out your normal entrypoint/CMD
# ENTRYPOINT ["npm", "start"]
# CMD ["python", "app.py"]

# Add sleep to keep container alive
ENTRYPOINT ["sleep", "infinity"]
```
Or for debugging purposes:
```dockerfile
# Run a shell instead
ENTRYPOINT ["/bin/sh"]
CMD ["-c", "while true; do sleep 30; done"]
```
Commit and deploy these changes.
3. Access the Container
Once deployed, use the Qovery CLI to shell into the container:
```shell
qovery shell
```
Now your container stays running and you can debug interactively!
4. Debug Manually
Inside the container, you can now:

Check environment variables:

```shell
env
```

Check installed dependencies:

```shell
# Node.js
npm list

# Python
pip list

# Check system packages
which <command>
```
Review configuration files:
```shell
cat config/app.json
cat .env
```
5. Fix and Restore
Identify and fix the issue in your code
Restore the original Dockerfile entrypoint
Re-add the application port
Deploy the fixed version
Don’t forget to restore your port configuration and original entrypoint! The sleep command is only for debugging.
Helm Service Logging Limitations
Symptom: When deploying Helm charts, you can’t see logs or pod status in the Qovery Console.
Cause: Qovery requires specific labels and annotations on your Kubernetes resources to enable log access and pod status visibility.
Solution: Add Qovery-specific macros to your Helm chart templates:
1. Add Labels and Annotations
Update your Helm chart’s deployment.yaml, service.yaml, or job.yaml to include Qovery macros:
Find solutions for common errors you might encounter while deploying or updating Qovery clusters.
DependencyViolation Errors During Cluster Deletion
Symptom: When attempting to delete a Qovery cluster, you receive a DependencyViolation error.
Cause: Resources managed outside of Qovery remain attached to cluster infrastructure elements, preventing deletion.
Example Error:
```text
DeleteError - Unknown error while performing Terraform command
(terraform destroy -lock=false -no-color -auto-approve), here is the error:

Error: deleting EC2 Subnet (subnet-xxx): operation error EC2: DeleteSubnet,
https response error StatusCode: 400, RequestID: xxx, api error DependencyViolation:
The subnet 'subnet-xxx' has dependencies and cannot be deleted.
```
Solution:
1. Access Cloud Provider Console
Log into your cloud provider console (AWS, GCP, Azure, or Scaleway).