Get help with SM-AI-MODELS integration, troubleshooting, and service issues.
## Contact
| Channel | Details | Response Time |
|---|---|---|
| Technical Support | support@unicode-solutions.com | Within 4 business hours |
| Emergency (P1) | escalation@unicode-solutions.com | Within 1 hour |
| Account / Sales | sales@unicode-solutions.com | Within 1 business day |
| Documentation Issues | docs@unicode-solutions.com | Within 2 business days |
## Service Level Agreement (SLA)
| Tier | Uptime | Support Hours | Response Time |
|---|---|---|---|
| Standard | 99.5% | Sun–Thu, 8AM–6PM AST | 4 hours |
| Premium | 99.9% | 24/7 | 1 hour |
| Enterprise | 99.95% | 24/7 + Dedicated TAM | 30 minutes |
The SLA applies to Unicode Solutions managed deployments. Self-hosted deployments are supported on a best-effort basis.
## When Reporting Issues
Include the following information in your support request:
### For API Errors
- Request ID — from the `X-Request-ID` header or `request_id` in the error response
- Timestamp — when the error occurred (with timezone)
- Endpoint — which URL you were calling
- Request body — parameters sent (redact sensitive data)
- Full error response — complete JSON error body
- Service version — from the `/health` endpoint response (see the example below)
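
A quick way to pull the service version and surface the request ID, assuming a locally reachable instance (the base URL and port are placeholders for your deployment):

```bash
# Placeholder base URL; substitute your deployment's address and port.
BASE_URL="http://localhost:8080"

# The /health response includes the running service version.
curl -fsS "$BASE_URL/health"

# Tip: re-run the failing request with `curl -i` so response headers,
# including X-Request-ID, are printed alongside the JSON error body.
```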
### For Performance Issues
- Latency measurements — TTFC and total response time
- Request volume — approximate requests per minute
- GPU info — output of `nvidia-smi`
- Docker logs — `docker compose logs --since 30m sm-tts-v2`
- Metrics snapshot — from the `/metrics` endpoint (see the diagnostics sketch below)
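
The following sketch collects those artifacts into one folder for attaching to the ticket; it assumes a locally reachable instance on a placeholder port and the Compose service name used above:

```bash
# Placeholder base URL; adjust to your deployment.
BASE_URL="http://localhost:8080"
OUT_DIR="sm-diagnostics-$(date +%Y%m%dT%H%M%S)"
mkdir -p "$OUT_DIR"

nvidia-smi > "$OUT_DIR/nvidia-smi.txt"                               # GPU utilization and VRAM
docker compose logs --since 30m sm-tts-v2 > "$OUT_DIR/tts-logs.txt"  # recent service logs
curl -fsS "$BASE_URL/metrics" > "$OUT_DIR/metrics.txt"               # metrics snapshot
```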
### Template
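A minimal request template built from the checklist above (an illustrative sketch, not a required format):

```text
Subject: <one-line summary of the issue>

Service & version:   <service name and version from /health>
Deployment:          <managed or self-hosted, GPU model>
Timestamp (with TZ): <when the error occurred>
Endpoint:            <URL called>
Request ID:          <X-Request-ID header or request_id from the error body>
Request body:        <parameters sent, sensitive data redacted>
Error response:      <complete JSON error body>
Additional context:  <logs, metrics, or latency measurements if relevant>
```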
## Service Status
Monitor the health of your SM-AI-MODELS deployment:
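A basic check, sketched against a placeholder base URL and the Docker Compose setup referenced elsewhere on this page:

```bash
# Placeholder base URL; substitute your deployment's address and port.
BASE_URL="http://localhost:8080"

# Liveness and version: /health should return HTTP 200.
curl -fsS "$BASE_URL/health"

# Operational metrics snapshot from /metrics.
curl -fsS "$BASE_URL/metrics"

# Container-level view of the running services.
docker compose ps
```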
## Frequently Asked Questions
### General
Q: Does SM-AI-MODELS require internet access?
A: No. SM-AI-MODELS runs entirely on-premise. All models are bundled in the Docker image. No data leaves your infrastructure.
Q: What GPU do I need?
A: Minimum NVIDIA T4 (16GB VRAM) for a single service. NVIDIA A100 (40GB) recommended for production. See Models for hardware requirements.
Q: Can I run TTS and ASR on the same GPU?
A: Yes, if the GPU has sufficient VRAM (24GB+). For production, dedicated GPUs per service are recommended.
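
To check whether a GPU clears that bar, `nvidia-smi` reports total and used VRAM directly:

```bash
# Total and currently used VRAM per GPU, in MiB.
nvidia-smi --query-gpu=name,memory.total,memory.used --format=csv
```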
### TTS
Q: What is the maximum text length?
A: 5,000 characters per request (configurable via `TTS_MAX_TEXT_LENGTH`). For longer content, split into sentences and use streaming.
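
To confirm the limit your deployment is running with, you can inspect the container environment (assuming the Docker Compose service name used elsewhere on this page):

```bash
# Prints the effective limit inside the TTS container; no output means the
# variable is unset and the default of 5,000 characters applies.
docker compose exec sm-tts-v2 env | grep TTS_MAX_TEXT_LENGTH
```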
Q: How do I get the lowest latency?
A: Use gRPC streaming with PCM format at 16kHz. See Performance for the full optimization guide.
Q: Can I add custom voices?
A: Custom voice training is available as an Enterprise service. Contact sales@unicode-solutions.com.
### ASR
Q: What is the maximum audio file size?
A: 25 MB per request (configurable via `ASR_MAX_FILE_SIZE`). Maximum duration is 300 seconds. For longer audio, use gRPC streaming or async processing.
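
A quick pre-flight check before uploading, using `stat` and `ffprobe` (FFmpeg), both assumed to be installed; the limits are the defaults listed above:

```bash
FILE="recording.wav"   # placeholder input file

# File size in bytes; the default request limit is 25 MB.
stat -c %s "$FILE"     # GNU stat; on macOS use: stat -f %z "$FILE"

# Duration in seconds; the default maximum is 300.
ffprobe -v error -show_entries format=duration -of csv=p=0 "$FILE"
```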
Q: What sample rate should I use?
A: 16kHz mono is optimal. Higher sample rates work but don't improve accuracy. Lower than 8kHz is not supported.
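
Converting with FFmpeg before upload matches the recommended format and keeps payloads small; 16-bit PCM WAV is a safe, widely supported choice here:

```bash
# Resample to 16 kHz mono, 16-bit PCM WAV (FFmpeg assumed to be installed).
ffmpeg -i input.m4a -ar 16000 -ac 1 -c:a pcm_s16le output.wav
```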
Q: Does ASR support speaker diarization?
A: Not yet. Speaker diarization is planned for sm-ASR-v3. See Changelog for the roadmap.
## Documentation
| Page | Description |
|---|---|
| Overview | Service overview and quick examples |
| Quick Start | Get running in minutes |
| Authentication | API key setup and security |
| Text-to-Speech | TTS endpoint reference |
| Speech Recognition | ASR endpoint reference |
| Streaming | Real-time audio streaming |
| gRPC API | gRPC endpoint reference |
| Models | Engine specs and voice details |
| Languages | Language and dialect support |
| SDKs | Python and Node.js libraries |
| Error Handling | Error codes and retry logic |
| Rate Limits | Quotas and throttling |
| Performance | Latency optimization |
| Changelog | Release history |
| API Reference | Interactive OpenAPI playground |
