Get help with SM-AI-MODELS integration, troubleshooting, and service issues.
## Contact
| Channel | Details | Response Time |
|---|---|---|
| Technical Support | support@unicode-solutions.com | Within 4 business hours |
| Emergency (P1) | escalation@unicode-solutions.com | Within 1 hour |
| Account / Sales | sales@unicode-solutions.com | Within 1 business day |
| Documentation Issues | docs@unicode-solutions.com | Within 2 business days |
## Service Level Agreement (SLA)
| Tier | Uptime | Support Hours | Response Time |
|---|---|---|---|
| Standard | 99.5% | Sun–Thu, 8AM–6PM AST | 4 hours |
| Premium | 99.9% | 24/7 | 1 hour |
| Enterprise | 99.95% | 24/7 + Dedicated TAM | 30 minutes |
The SLA applies to Unicode Solutions managed deployments. Self-hosted deployments are supported on a best-effort basis.
## When Reporting Issues
Include the following information in your support request:
### For API Errors
- Request ID — from the `X-Request-ID` header or `request_id` in the error response
- Timestamp — when the error occurred (with timezone)
- Endpoint — which URL you were calling
- Request body — parameters sent (redact sensitive data)
- Full error response — complete JSON error body
- Service version — from the `/health` endpoint response (see the example below)
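
A quick way to pull the service version and surface the request ID, assuming a locally reachable instance (the base URL and port are placeholders for your deployment):

```bash
# Placeholder base URL; substitute your deployment's address and port.
BASE_URL="http://localhost:8080"

# The /health response includes the running service version.
curl -fsS "$BASE_URL/health"

# Tip: re-run the failing request with `curl -i` so response headers,
# including X-Request-ID, are printed alongside the JSON error body.
```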
### For Performance Issues
- Latency measurements — TTFC and total response time
- Request volume — approximate requests per minute
- GPU info — output of `nvidia-smi`
- Docker logs — `docker compose logs --since 30m sm-tts-v2`
- Metrics snapshot — from the `/metrics` endpoint (see the diagnostics sketch below)
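
The following sketch collects those artifacts into one folder for attaching to the ticket; it assumes a locally reachable instance on a placeholder port and the Compose service name used above:

```bash
# Placeholder base URL; adjust to your deployment.
BASE_URL="http://localhost:8080"
OUT_DIR="sm-diagnostics-$(date +%Y%m%dT%H%M%S)"
mkdir -p "$OUT_DIR"

nvidia-smi > "$OUT_DIR/nvidia-smi.txt"                               # GPU utilization and VRAM
docker compose logs --since 30m sm-tts-v2 > "$OUT_DIR/tts-logs.txt"  # recent service logs
curl -fsS "$BASE_URL/metrics" > "$OUT_DIR/metrics.txt"               # metrics snapshot
```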
### Template
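A minimal request template built from the checklist above (an illustrative sketch, not a required format):

```text
Subject: <one-line summary of the issue>

Service & version:   <service name and version from /health>
Deployment:          <managed or self-hosted, GPU model>
Timestamp (with TZ): <when the error occurred>
Endpoint:            <URL called>
Request ID:          <X-Request-ID header or request_id from the error body>
Request body:        <parameters sent, sensitive data redacted>
Error response:      <complete JSON error body>
Additional context:  <logs, metrics, or latency measurements if relevant>
```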
## Service Status
Monitor the health of your SM-AI-MODELS deployment:
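A basic check, sketched against a placeholder base URL and the Docker Compose setup referenced elsewhere on this page:

```bash
# Placeholder base URL; substitute your deployment's address and port.
BASE_URL="http://localhost:8080"

# Liveness and version: /health should return HTTP 200.
curl -fsS "$BASE_URL/health"

# Operational metrics snapshot from /metrics.
curl -fsS "$BASE_URL/metrics"

# Container-level view of the running services.
docker compose ps
```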
## Frequently Asked Questions
### General
Q: Does SM-AI-MODELS require internet access?
A: No. SM-AI-MODELS runs entirely on-premise. All models are bundled in the Docker image. No data leaves your infrastructure.
Q: What GPU do I need?
A: Minimum NVIDIA T4 (16GB VRAM) for a single service. NVIDIA A100 (40GB) recommended for production. See Models for hardware requirements.
Q: Can I run TTS and ASR on the same GPU?
A: Yes, if the GPU has sufficient VRAM (24GB+). For production, dedicated GPUs per service are recommended.
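
To check whether a GPU clears that bar, `nvidia-smi` reports total and used VRAM directly:

```bash
# Total and currently used VRAM per GPU, in MiB.
nvidia-smi --query-gpu=name,memory.total,memory.used --format=csv
```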
### TTS
Q: What is the maximum text length?
A: 5,000 characters per request (configurable via `TTS_MAX_TEXT_LENGTH`). For longer content, split into sentences and use streaming.
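
To confirm the limit your deployment is running with, you can inspect the container environment (assuming the Docker Compose service name used elsewhere on this page):

```bash
# Prints the effective limit inside the TTS container; no output means the
# variable is unset and the default of 5,000 characters applies.
docker compose exec sm-tts-v2 env | grep TTS_MAX_TEXT_LENGTH
```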
Q: How do I get the lowest latency?
A: Use gRPC streaming with PCM format at 16kHz. See Performance for the full optimization guide.
Q: Can I add custom voices?
A: Custom voice training is available as an Enterprise service. Contact sales@unicode-solutions.com.
### ASR
Q: What is the maximum audio file size?
A: 25 MB per request (configurable via `ASR_MAX_FILE_SIZE`). Maximum duration is 300 seconds. For longer audio, use gRPC streaming or async processing.
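
A quick pre-flight check before uploading, using `stat` and `ffprobe` (FFmpeg), both assumed to be installed; the limits are the defaults listed above:

```bash
FILE="recording.wav"   # placeholder input file

# File size in bytes; the default request limit is 25 MB.
stat -c %s "$FILE"     # GNU stat; on macOS use: stat -f %z "$FILE"

# Duration in seconds; the default maximum is 300.
ffprobe -v error -show_entries format=duration -of csv=p=0 "$FILE"
```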
Q: What sample rate should I use?
A: 16kHz mono is optimal. Higher sample rates work but don't improve accuracy. Lower than 8kHz is not supported.
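
Converting with FFmpeg before upload matches the recommended format and keeps payloads small; 16-bit PCM WAV is a safe, widely supported choice here:

```bash
# Resample to 16 kHz mono, 16-bit PCM WAV (FFmpeg assumed to be installed).
ffmpeg -i input.m4a -ar 16000 -ac 1 -c:a pcm_s16le output.wav
```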
Q: Does ASR support speaker diarization?
A: Not yet. Speaker diarization is planned for sm-ASR-v3. See Changelog for the roadmap.
## Documentation
| Page | Description |
|---|---|
| Overview | Service overview and quick examples |
| Quick Start | Get running in minutes |
| Authentication | API key setup and security |
| Text-to-Speech | TTS endpoint reference |
| Speech Recognition | ASR endpoint reference |
| Streaming | Real-time audio streaming |
| gRPC API | gRPC endpoint reference |
| Models | Engine specs and voice details |
| Languages | Language and dialect support |
| SDKs | Python and Node.js libraries |
| Error Handling | Error codes and retry logic |
| Rate Limits | Quotas and throttling |
| Performance | Latency optimization |
| Changelog | Release history |
| API Reference | Interactive OpenAPI playground |
