Changelog

Track all changes, improvements, and fixes to SM-AI-MODELS.

Versioning: SM-AI-MODELS follows Semantic Versioning. API changes are documented with migration guides when needed.

v2.1.0 — Arabic Dialects · Current Release

Released: 2026-07

New Features

🗣️ Arabic dialect selection (TTS) — new dialect parameter on REST, gRPC, and WebSocket, also accepted as the X-TTS-Dialect header. Supported dialects: ar-najdi, ar-hijazi, ar-qassimi (Yara), ar-egyptian (Sherine), ar-levantine (Myriam). Unknown or omitted values default to ar-najdi.
🎙️ New voices — Sherine (Egyptian) and Myriam (Levantine) Arabic female voices, alongside Yara (Saudi: Najdi / Hejazi / Qassimi).
🌍 ASR multi-dialect — Arabic transcription now spans Najdi, Hejazi, Qassimi, Egyptian, and Levantine, auto-detected from speech (no dialect parameter — pass language=ar or omit). Output preserves dialectal spelling (no MSA normalization).
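
As a rough illustration of the new dialect parameter, the sketch below builds (but does not send) a TTS request. The base URL, the bearer-token auth scheme, the JSON body layout, and the exact casing of the voice name are assumptions for illustration, not confirmed details of the API; only the /v1/audio/speech path, the input, voice, and dialect field names, and the dialect identifiers come from these notes.

```python
import json
import urllib.request

# Hypothetical base URL; replace with your deployment's host.
BASE_URL = "https://api.example.com"

# Dialect identifiers listed in the v2.1.0 notes.
SUPPORTED_DIALECTS = {
    "ar-najdi", "ar-hijazi", "ar-qassimi", "ar-egyptian", "ar-levantine",
}
DEFAULT_DIALECT = "ar-najdi"


def build_tts_request(text, voice="Yara", dialect=None, api_key="YOUR_API_KEY"):
    """Build, without sending, a POST /v1/audio/speech request."""
    # Mirror the documented default locally so the effective dialect is explicit.
    if dialect not in SUPPORTED_DIALECTS:
        dialect = DEFAULT_DIALECT
    body = {"input": text, "voice": voice, "dialect": dialect}
    return urllib.request.Request(
        BASE_URL + "/v1/audio/speech",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,  # assumed auth scheme
        },
        method="POST",
    )
```

Calling build_tts_request("مرحبا", voice="Sherine", dialect="ar-egyptian") yields a request object that can be passed to urllib.request.urlopen once the placeholders are replaced. The same dialect value could presumably be sent as the X-TTS-Dialect header instead of a body field.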

Changes

Yara is now the Saudi reference voice (ar-najdi / ar-hijazi / ar-qassimi) and remains the default.
⚠️ Removed voices — Nouf and Atheer are no longer available; use Yara, Sherine, or Myriam. Requests for a removed/unknown voice fall back to the default.
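
Because requests for a removed voice fall back to the default rather than failing, callers may want to detect the substitution on the client side. The helper below is a minimal sketch of that idea. The voice sets are taken from the v2.1.0 notes and may be incomplete (the notes do not state the status of Yara_en, for example), and the exact casing of voice identifiers is an assumption.

```python
# Voices named in the v2.1.0 notes; treat these sets as assumptions
# and adjust them to match your deployment.
AVAILABLE_VOICES = {"Yara", "Sherine", "Myriam"}
REMOVED_VOICES = {"Nouf", "Atheer"}
DEFAULT_VOICE = "Yara"


def resolve_voice(requested):
    """Return (voice, warning); warning is None when the voice is available."""
    if requested in AVAILABLE_VOICES:
        return requested, None
    if requested in REMOVED_VOICES:
        reason = f"voice {requested!r} was removed in v2.1.0"
    else:
        reason = f"voice {requested!r} is unknown"
    # Mirror the documented fallback to the default voice.
    return DEFAULT_VOICE, f"{reason}; using {DEFAULT_VOICE!r}"
```

Logging the returned warning makes silent fallbacks visible during migration.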

Availability note: Rollout is staged — ar-najdi (Yara) and ar-egyptian (Sherine) are live; ar-hijazi, ar-qassimi, ar-levantine, and the Myriam voice may fall back to the default until fully deployed.

v2.0.0

Released: 2026

New Features

🚀 SM-TTS-V1 — Next-generation neural TTS engine with improved Arabic pronunciation
🚀 SM-STT-V1 — Upgraded ASR model with lower word error rate
🎙️ Nouf voice — New Arabic female voice with warm, conversational tone
🇬🇧 Yara_en voice — English support via dedicated English voice
⚡ gRPC streaming — Real-time audio streaming via gRPC
🔊 OPUS format — Added OPUS audio output for low-latency streaming
🔊 FLAC format — Added lossless FLAC output
🎚️ Speed control — Adjustable speech speed from 0.5x to 2.0x

Improvements

Reduced TTS latency by ~40% compared to v1
Improved Arabic diacritics handling
Better mixed Arabic/English text processing
Enhanced number normalization (Arabic and English)
Improved audio quality at 16 kHz sample rate

API Changes

New endpoint path: /v1/audio/speech (was /synthesize in v1)
New endpoint path: /v1/audio/transcriptions (was /transcribe in v1)
New response format for errors (structured JSON with error codes)
Added /health and /ready endpoints
Added gRPC streaming endpoints

Breaking Changes

⚠️ v1 endpoints deprecated — /synthesize and /transcribe still work but will be removed in a future release
⚠️ Response format changed — Error responses now use {"error": {"code": ..., "message": ...}}
⚠️ Default audio format — Changed from WAV to MP3
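
Clients that parsed v1 error responses will need to read the new structured format. The following is a minimal sketch of one way to do that. The ApiError class and parse_error function are hypothetical names, and the types of the code and message fields are not specified in these notes, so they are passed through unchanged.

```python
import json


class ApiError(Exception):
    """Hypothetical wrapper for a structured v2 error response."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def parse_error(body):
    """Return an ApiError if body matches {"error": {...}}, else None."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        # Not JSON at all (for example, a plain-text error).
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None
    return ApiError(error.get("code"), error.get("message"))
```

A caller would typically invoke parse_error on the body of any non-2xx response and raise the result when it is not None.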

Migration from v1

v1	v2	Notes
`POST /synthesize`	`POST /v1/audio/speech`	New path
`POST /transcribe`	`POST /v1/audio/transcriptions`	New path, multipart form
`text` parameter	`input` parameter	Renamed in TTS
`speaker` parameter	`voice` parameter	Renamed, new voice names
WAV default output	MP3 default output	Set `response_format: "wav"` for old behavior
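
The parameter renames in the table can be applied mechanically. The sketch below converts a v1-style TTS payload into the v2 field names; the function name and the keep_wav_default flag are hypothetical. It does not translate voice names, which the table notes have changed, so voice values still need to be reviewed by hand.

```python
def migrate_v1_tts_payload(v1_payload, keep_wav_default=True):
    """Rename v1 TTS fields to their v2 equivalents."""
    renames = {"text": "input", "speaker": "voice"}
    # Rename known fields; pass any other field through untouched.
    v2_payload = {renames.get(key, key): value for key, value in v1_payload.items()}
    if keep_wav_default:
        # v2 defaults to MP3; request WAV explicitly to keep v1 behavior,
        # unless the caller already chose a format.
        v2_payload.setdefault("response_format", "wav")
    return v2_payload
```

The resulting payload is intended for POST /v1/audio/speech in place of the old POST /synthesize call.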

v1.0.0 — Legacy

Released: 2025 — ⚠️ Deprecated

Initial release with basic Arabic TTS
Single voice (Yara)
WAV output only
REST API only

End of Life: v1 endpoints will be removed in a future release. Please migrate to v2.

Deprecation Policy

Stage	Timeline	Action Required
Deprecated	Feature announced as deprecated	Begin migration planning
Sunset	6 months after deprecation	Complete migration
Removed	After sunset date	Feature no longer available

Deprecated features are marked with ⚠️ in the documentation. Breaking changes are announced at least 30 days in advance via:

This changelog
API response headers (X-Deprecation-Warning)
Email notification to registered API key owners
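
To surface the X-Deprecation-Warning header early, a client can check each response for it. This is a minimal sketch: the check_deprecation name is hypothetical, and because the format of the header value is not specified here, the value is treated as an opaque string.

```python
import warnings


def check_deprecation(headers):
    """Warn if a response carries X-Deprecation-Warning; return its value or None."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "x-deprecation-warning":
            warnings.warn(
                f"API deprecation notice: {value}",
                DeprecationWarning,
                stacklevel=2,
            )
            return value
    return None
```

The function accepts any mapping of header names to values, such as a plain dict or the headers object returned by most HTTP clients.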

Last modified on July 6, 2026
