Learn how to use the gRPC APIs for high-performance, type-safe integration with SM-AI-MODELS services.
Why gRPC?
gRPC offers several advantages over REST for certain use cases:
| Feature | gRPC | REST |
|---|---|---|
| Performance | Binary protocol (Protocol Buffers) | Text-based (JSON) |
| Streaming | Built-in bidirectional streaming | Not natively supported |
| Type Safety | Strongly typed with .proto files | Dynamic JSON schemas |
| Code Generation | Auto-generate clients in any language | Manual client implementation |
| HTTP/2 | Required (multiplexing, header compression) | HTTP/1.1 or HTTP/2 |
Use gRPC when:
- You need high performance and low latency
- You want type safety and auto-generated clients
- You're building service-to-service communication
- You need bidirectional streaming
Use REST when:
- You need browser compatibility
- You prefer simple debugging with curl
- You want human-readable JSON
- You're building public APIs
Service Endpoints
| Service | REST Endpoint | gRPC Endpoint |
|---|---|---|
| SM-TTS-V1 | https://api-tts.withsm.ai | api-tts.withsm.ai:443 |
| SM-STT-V1 | https://api-asr.withsm.ai | api-asr.withsm.ai:443 |
Proto Files
The proto files define the gRPC service interface. Contact your account manager to obtain the proto files. You'll need these to:
- Generate client code for your language
- Understand available RPC methods
- Know the request/response message structures
TTS Service Definition
syntax = "proto3";

package tts;

service TextToSpeech {
  // Streaming synthesis (bidirectional)
  rpc Synthesize(stream SynthesizeRequest) returns (stream SynthesizeResponse);

  // Simple unary call for short text
  rpc SynthesizeSimple(SynthesizeSimpleRequest) returns (SynthesizeSimpleResponse);
}

message SynthesizeRequest {
  oneof request {
    SynthesisConfig config = 1;  // Send first
    string text = 2;             // Then text chunks
  }
}

message SynthesizeResponse {
  bytes audio_content = 1;
  bool is_final = 2;
  SynthesisMetadata metadata = 3;
}

message SynthesizeSimpleRequest {
  SynthesisConfig config = 1;  // Configuration
  string text = 2;             // Text to synthesize
}

message SynthesizeSimpleResponse {
  bytes audio_content = 1;
  SynthesisMetadata metadata = 2;
}

message SynthesisConfig {
  string voice_name = 1;       // "Yara", "Nouf", "Yara_en"
  string language_code = 2;    // "ar" or "en"
  AudioEncoding encoding = 3;  // Audio format
  int32 sample_rate_hz = 4;    // e.g., 22050, 44100
  float speaking_rate = 5;     // 0.5 - 2.0
  float pitch = 6;
  float volume_gain_db = 7;
  bool enable_ssml = 8;
}

message SynthesisMetadata {
  int32 audio_duration_ms = 1;
  int32 character_count = 2;
  string voice_name = 3;
}

enum AudioEncoding {
  ENCODING_UNSPECIFIED = 0;
  LINEAR16 = 1;  // 16-bit PCM
  MULAW = 2;     // 8-bit mu-law
  ALAW = 3;      // 8-bit a-law
  MP3 = 4;       // MP3 format
  OGG_OPUS = 5;  // Ogg Opus
}
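The comments on `SynthesizeRequest` imply an ordering for the bidirectional `Synthesize` RPC: one message carrying `config`, then one message per text chunk. A minimal Python sketch of that ordering follows. It assumes `tts_pb2` and the stub were generated from the definition above; the helper names `synthesize_requests` and `stream_synthesize` are illustrative and not part of the service.

```python
def synthesize_requests(tts_pb2, text_chunks, voice_name="Yara", language_code="ar"):
    """Yield the config message first, then one request per text chunk."""
    config = tts_pb2.SynthesisConfig(
        voice_name=voice_name,
        language_code=language_code,
        encoding=tts_pb2.MP3,  # top-level enum values are module attributes
    )
    yield tts_pb2.SynthesizeRequest(config=config)
    for chunk in text_chunks:
        yield tts_pb2.SynthesizeRequest(text=chunk)


def stream_synthesize(stub, tts_pb2, text_chunks, timeout=60):
    """Send text chunks over the streaming RPC and concatenate the returned audio."""
    responses = stub.Synthesize(
        synthesize_requests(tts_pb2, text_chunks),
        timeout=timeout,
    )
    return b"".join(response.audio_content for response in responses)
```

Passing a generator as the request iterator lets the client start receiving audio while later text chunks are still being sent.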
ASR Service Definition
syntax = "proto3";

package asr;

service SpeechRecognition {
  // Streaming recognition (bidirectional)
  rpc Recognize(stream RecognizeRequest) returns (stream RecognizeResponse);

  // Simple unary call for short audio clips
  rpc RecognizeSimple(RecognizeSimpleRequest) returns (RecognizeSimpleResponse);
}

message RecognizeRequest {
  oneof request {
    RecognitionConfig config = 1;  // Send first
    bytes audio_content = 2;       // Then audio chunks
  }
}

message RecognizeResponse {
  repeated SpeechRecognitionResult results = 1;
  SpeechEvent event = 2;
  RecognitionMetadata metadata = 3;
}

message RecognizeSimpleRequest {
  RecognitionConfig config = 1;
  bytes audio_content = 2;
}

message RecognizeSimpleResponse {
  repeated SpeechRecognitionResult results = 1;
  RecognitionMetadata metadata = 2;
}

message RecognitionConfig {
  AudioEncoding encoding = 1;       // Required
  int32 sample_rate_hz = 2;         // Required (e.g., 16000)
  int32 channels = 3;               // Default: 1
  string language_code = 4;         // "ar" or "en"
  string model = 5;
  bool enable_interim_results = 6;  // Streaming only
  bool single_utterance = 7;        // Streaming only
  int32 max_alternatives = 8;
  repeated SpeechContext speech_contexts = 9;
  string grammar_uri = 10;
  VadConfig vad_config = 11;
  bool enable_punctuation = 12;
  bool enable_word_timestamps = 13;
}

message SpeechRecognitionResult {
  repeated SpeechRecognitionAlternative alternatives = 1;
  bool is_final = 2;
  ResultEndReason end_reason = 3;
  float stability = 4;
}

message SpeechRecognitionAlternative {
  string transcript = 1;
  float confidence = 2;
  repeated WordInfo words = 3;
}

message WordInfo {
  string word = 1;
  int64 start_time_ms = 2;
  int64 end_time_ms = 3;
  float confidence = 4;
}

message SpeechEvent {
  SpeechEventType type = 1;
  int64 timestamp_ms = 2;
}

enum SpeechEventType {
  EVENT_UNSPECIFIED = 0;
  START_OF_SPEECH = 1;
  END_OF_SPEECH = 2;
  END_OF_UTTERANCE = 3;
}

enum AudioEncoding {
  ENCODING_UNSPECIFIED = 0;
  LINEAR16 = 1;  // 16-bit PCM
  MULAW = 2;     // 8-bit mu-law
  ALAW = 3;      // 8-bit a-law
  FLAC = 4;      // FLAC codec
  OGG_OPUS = 5;  // Ogg Opus
}
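For the streaming `Recognize` RPC, the definition above implies sending `config` first and audio chunks afterwards, then reading results whose `is_final` flag is set. The sketch below shows one way to do both in Python. It assumes `asr_pb2` was generated from this definition and that the first entry in `alternatives` is the preferred one (the proto does not state an ordering). The helper names and the 3200-byte chunk size (100 ms of 16 kHz, 16-bit mono audio) are illustrative choices.

```python
def recognize_requests(asr_pb2, config, audio, chunk_size=3200):
    """Yield the config message first, then the audio in fixed-size chunks."""
    yield asr_pb2.RecognizeRequest(config=config)
    for start in range(0, len(audio), chunk_size):
        yield asr_pb2.RecognizeRequest(audio_content=audio[start:start + chunk_size])


def final_transcripts(responses):
    """Yield the first alternative's transcript for every result marked is_final."""
    for response in responses:
        for result in response.results:
            if result.is_final and result.alternatives:
                yield result.alternatives[0].transcript
```

A call would then look like `final_transcripts(stub.Recognize(recognize_requests(asr_pb2, config, audio)))`, where `stub` is a `SpeechRecognitionStub`.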
Health Check Protocol
Both services implement the standard gRPC Health Checking Protocol (grpc.health.v1.Health).
Health Check RPC
service Health {
  rpc Check(HealthCheckRequest) returns (HealthCheckResponse);
  rpc Watch(HealthCheckRequest) returns (stream HealthCheckResponse);
}
Using grpcurl
# grpcurl uses TLS by default; do not pass -plaintext to the TLS endpoints on port 443
# Check TTS health
grpcurl -d '{"service": ""}' \
  api-tts.withsm.ai:443 grpc.health.v1.Health/Check
# Check ASR health
grpcurl -d '{"service": ""}' \
  api-asr.withsm.ai:443 grpc.health.v1.Health/Check
Expected response when the service is healthy:
{
  "status": "SERVING"
}
Health Status Values
| Status | Meaning |
|---|---|
| SERVING | Service is healthy and ready |
| NOT_SERVING | Service is not available |
| UNKNOWN | Health status unknown |
| SERVICE_UNKNOWN | The requested service name is not registered (returned by Watch only) |
Watch Health (Streaming)
Monitor health status changes in real-time:
# Stream health updates for TTS (TLS is grpcurl's default, so -plaintext is omitted)
grpcurl -d '{"service": ""}' \
  api-tts.withsm.ai:443 grpc.health.v1.Health/Watch
This will keep the connection open and stream status updates as they occur.
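The same stream can be consumed from Python. The sketch below reacts only when the status changes. It is written against any iterable of responses that have a `status` field, so it makes no assumptions about the services beyond the standard health protocol; the function name `watch_status_changes` is illustrative.

```python
def watch_status_changes(responses, on_change):
    """Call on_change(status) each time the streamed status differs from the last one."""
    last_status = None
    for response in responses:
        if response.status != last_status:
            last_status = response.status
            on_change(last_status)
    return last_status
```

With the `grpcio-health-checking` package installed, `responses` would be `stub.Watch(health_pb2.HealthCheckRequest(service=""))` on a `health_pb2_grpc.HealthStub`. The loop blocks until the server closes the stream, so it usually belongs on a background thread.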
Service Discovery
Use grpcurl to discover available services and methods. Without local proto files, this relies on the server exposing gRPC server reflection:
List All Services
# TTS services
grpcurl api-tts.withsm.ai:443 list
# ASR services
grpcurl api-asr.withsm.ai:443 list
Client Examples
Python with grpcio
Installation
pip install grpcio grpcio-tools
Generate Client Code
# Generate Python code from proto files
python -m grpc_tools.protoc \
  -I./protos \
  --python_out=. \
  --grpc_python_out=. \
  tts.proto
TTS Client Example
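import grpc

# Modules generated by the protoc command above (written to the current directory)
import tts_pb2
import tts_pb2_grpc


def generate_speech(text, voice="Yara", language="ar"):
    """Generate speech with the unary SynthesizeSimple RPC."""
    # Create a TLS channel; the context manager closes it on exit
    with grpc.secure_channel('api-tts.withsm.ai:443', grpc.ssl_channel_credentials()) as channel:
        stub = tts_pb2_grpc.TextToSpeechStub(channel)

        # Build the request from the message types defined in tts.proto
        request = tts_pb2.SynthesizeSimpleRequest(
            config=tts_pb2.SynthesisConfig(
                voice_name=voice,
                language_code=language,
                encoding=tts_pb2.MP3,
                speaking_rate=1.0,
            ),
            text=text,
        )

        # Make the RPC call with a deadline
        response = stub.SynthesizeSimple(request, timeout=30)
        return response.audio_content


# Usage
audio = generate_speech("مرحباً بكم", voice="Yara")
with open("output.mp3", "wb") as f:
    f.write(audio)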
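If `LINEAR16` is requested instead of `MP3`, the returned bytes may need a container before most players will open them. The sketch below wraps PCM samples in a WAV file using Python's standard `wave` module. It assumes the service returns headerless, mono, 16-bit PCM for `LINEAR16`, which the proto comments suggest but do not confirm; `save_pcm_as_wav` is an illustrative helper name.

```python
import wave


def save_pcm_as_wav(path, pcm_bytes, sample_rate_hz=22050, channels=1):
    """Write raw 16-bit PCM samples to a WAV file."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)  # LINEAR16 means 2 bytes per sample
        wav_file.setframerate(sample_rate_hz)
        wav_file.writeframes(pcm_bytes)
```

The `sample_rate_hz` passed here should match the value sent in `SynthesisConfig`, otherwise playback speed and pitch will be off.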
ASR Client Example
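import grpc

# Modules generated from asr.proto
import asr_pb2
import asr_pb2_grpc


def transcribe_audio(audio_path, sample_rate_hz=16000, language="ar"):
    """Transcribe a short clip with the unary RecognizeSimple RPC."""
    # Read audio file
    with open(audio_path, "rb") as f:
        audio_content = f.read()

    with grpc.secure_channel('api-asr.withsm.ai:443', grpc.ssl_channel_credentials()) as channel:
        stub = asr_pb2_grpc.SpeechRecognitionStub(channel)

        # encoding and sample_rate_hz are required and must match the audio
        request = asr_pb2.RecognizeSimpleRequest(
            config=asr_pb2.RecognitionConfig(
                encoding=asr_pb2.LINEAR16,
                sample_rate_hz=sample_rate_hz,
                language_code=language,
            ),
            audio_content=audio_content,
        )

        # Make the RPC call with a deadline
        response = stub.RecognizeSimple(request, timeout=30)

    # Join the first alternative of each result
    return " ".join(
        result.alternatives[0].transcript
        for result in response.results
        if result.alternatives
    )


# Usage
text = transcribe_audio("recording.wav")
print(f"Transcribed: {text}")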
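Because `RecognitionConfig` requires an encoding and sample rate that match the audio, it can help to read those values from the file rather than hard-code them. The sketch below uses Python's standard `wave` module to pull the PCM samples and format out of a WAV file. Whether the service accepts a WAV header inside `LINEAR16` audio is not stated in the proto, so sending only the samples is a cautious assumption; `read_wav_pcm` is an illustrative helper name.

```python
import wave


def read_wav_pcm(path):
    """Return (pcm_bytes, sample_rate_hz, channels) for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        if wav_file.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples for LINEAR16")
        pcm_bytes = wav_file.readframes(wav_file.getnframes())
        return pcm_bytes, wav_file.getframerate(), wav_file.getnchannels()
```

The returned values map onto `audio_content`, `sample_rate_hz`, and `channels` in the request.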
Health Check Example
import grpc
# Requires the grpcio-health-checking package (pip install grpcio-health-checking)
from grpc_health.v1 import health_pb2, health_pb2_grpc


def check_health(host):
    """Check service health."""
    with grpc.secure_channel(host, grpc.ssl_channel_credentials()) as channel:
        stub = health_pb2_grpc.HealthStub(channel)
        request = health_pb2.HealthCheckRequest(service="")
        response = stub.Check(request, timeout=10)
        return response.status == health_pb2.HealthCheckResponse.SERVING
# Usage
tts_healthy = check_health('api-tts.withsm.ai:443')
asr_healthy = check_health('api-asr.withsm.ai:443')
print(f"TTS: {'✓' if tts_healthy else '✗'}")
print(f"ASR: {'✓' if asr_healthy else '✗'}")
Node.js with @grpc/grpc-js
Installation
npm install @grpc/grpc-js @grpc/proto-loader
Load Proto File
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');

// Load proto file
const packageDefinition = protoLoader.loadSync(
  'tts.proto',
  {
    keepCase: true,   // keep snake_case field names such as audio_content
    longs: String,
    enums: String,
    defaults: true,
    oneofs: true
  }
);
// The namespace matches the proto's "package tts;" declaration
const ttsProto = grpc.loadPackageDefinition(packageDefinition).tts;
TTS Client
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');
const fs = require('fs');

// Load proto (keepCase preserves snake_case field names such as audio_content)
const packageDefinition = protoLoader.loadSync('tts.proto', {
  keepCase: true,
  enums: String,
  defaults: true,
  oneofs: true
});
const ttsProto = grpc.loadPackageDefinition(packageDefinition).tts;

// Create client
const client = new ttsProto.TextToSpeech(
  'api-tts.withsm.ai:443',
  grpc.credentials.createSsl()
);

// Generate speech with the unary SynthesizeSimple RPC
client.SynthesizeSimple(
  {
    config: {
      voice_name: 'Yara',
      language_code: 'ar',
      encoding: 'MP3',
      speaking_rate: 1.0
    },
    text: 'مرحباً بكم'
  },
  (error, response) => {
    if (error) {
      console.error('Error:', error);
      return;
    }
    fs.writeFileSync('output.mp3', response.audio_content);
    console.log('Audio saved to output.mp3');
  }
);
Health Check
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');

// Load health check proto (enums: String makes status print as a name, not a number)
const packageDefinition = protoLoader.loadSync(
  'grpc/health/v1/health.proto',
  { enums: String }
);
const healthProto = grpc.loadPackageDefinition(packageDefinition).grpc.health.v1;

// Create client
const client = new healthProto.Health(
  'api-tts.withsm.ai:443',
  grpc.credentials.createSsl()
);

// Check health
client.Check({ service: '' }, (error, response) => {
  if (error) {
    console.error('Error:', error);
    return;
  }
  console.log('Status:', response.status); // SERVING
});
Go Example
Installation
go get google.golang.org/grpc
go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest
Generate Code
protoc --go_out=. --go-grpc_out=. tts.proto
TTS Client
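package main

import (
	"context"
	"crypto/tls"
	"log"
	"os"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"

	pb "your-module/generated/tts"
)

func main() {
	// Connect to the server over TLS
	creds := credentials.NewTLS(&tls.Config{})
	conn, err := grpc.Dial("api-tts.withsm.ai:443", grpc.WithTransportCredentials(creds))
	if err != nil {
		log.Fatalf("Failed to connect: %v", err)
	}
	defer conn.Close()

	// Create client
	client := pb.NewTextToSpeechClient(conn)

	// Bound the call with a deadline
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	// Generate speech with the unary SynthesizeSimple RPC
	resp, err := client.SynthesizeSimple(ctx, &pb.SynthesizeSimpleRequest{
		Config: &pb.SynthesisConfig{
			VoiceName:    "Yara",
			LanguageCode: "ar",
			Encoding:     pb.AudioEncoding_MP3,
			SpeakingRate: 1.0,
		},
		Text: "مرحباً بكم",
	})
	if err != nil {
		log.Fatalf("RPC failed: %v", err)
	}

	// Save audio
	if err := os.WriteFile("output.mp3", resp.AudioContent, 0o644); err != nil {
		log.Fatalf("Failed to write file: %v", err)
	}
	log.Println("Audio saved to output.mp3")
}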
Error Handling
gRPC Status Codes
| Code | Name | Description |
|---|---|---|
| 0 | OK | Success |
| 3 | INVALID_ARGUMENT | Invalid request parameters |
| 13 | INTERNAL | Internal server error |
| 14 | UNAVAILABLE | Service is unavailable |
Python Error Handling
import grpc

try:
    response = stub.SynthesizeSimple(request)
except grpc.RpcError as e:
    if e.code() == grpc.StatusCode.INVALID_ARGUMENT:
        print(f"Invalid request: {e.details()}")
    elif e.code() == grpc.StatusCode.UNAVAILABLE:
        print("Service is unavailable")
    else:
        print(f"RPC failed: {e.code()}, {e.details()}")
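`UNAVAILABLE` is often transient, so retrying with a growing delay is a common response, while errors such as `INVALID_ARGUMENT` should fail immediately. The sketch below is one possible retry wrapper, not a documented client feature. It inspects `exc.code().name`, which `grpc.RpcError` instances raised by a call provide, so the helper itself does not import `grpc`. The name `call_with_retry`, the attempt count, and the delays are illustrative assumptions.

```python
import time

RETRYABLE_CODES = {"UNAVAILABLE"}


def call_with_retry(rpc, request, attempts=3, base_delay=0.5, sleep=time.sleep, **call_kwargs):
    """Call rpc(request, **call_kwargs), retrying retryable status codes with exponential backoff."""
    for attempt in range(attempts):
        try:
            return rpc(request, **call_kwargs)
        except Exception as exc:
            code = getattr(exc, "code", None)
            name = code().name if callable(code) else None
            last_attempt = attempt == attempts - 1
            if name not in RETRYABLE_CODES or last_attempt:
                raise
            sleep(base_delay * 2 ** attempt)  # 0.5 s, 1 s, 2 s, ...
```

Usage would look like `call_with_retry(stub.SynthesizeSimple, request, timeout=30)`. Retrying is only safe for calls that can be repeated without side effects.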
Node.js Error Handling
client.SynthesizeSimple(request, (error, response) => {
  if (error) {
    console.error('Status code:', error.code);
    console.error('Message:', error.message);
    console.error('Details:', error.details);
    return;
  }
  // Handle response
});
Advanced Topics
Timeouts
# Python: Set timeout (in seconds)
response = stub.SynthesizeSimple(request, timeout=30)
// Node.js: Set deadline
const deadline = new Date();
deadline.setSeconds(deadline.getSeconds() + 30);
client.SynthesizeSimple(request, { deadline }, callback);
Metadata
# Python: Add metadata
metadata = [
    ('authorization', 'Bearer YOUR_TOKEN'),
    ('x-custom-header', 'value')
]
response = stub.SynthesizeSimple(request, metadata=metadata)
// Node.js: Add metadata
const metadata = new grpc.Metadata();
metadata.add('authorization', 'Bearer YOUR_TOKEN');
client.SynthesizeSimple(request, metadata, callback);
Connection Pooling
# Python: Reuse channel
channel = grpc.secure_channel('api-tts.withsm.ai:443', grpc.ssl_channel_credentials())
stub = tts_pb2_grpc.TextToSpeechStub(channel)
# Make multiple calls over the same channel
audio1 = stub.SynthesizeSimple(request1).audio_content
audio2 = stub.SynthesizeSimple(request2).audio_content
# Close when done
channel.close()
Comparison: gRPC vs REST
Request Examples
gRPC:
response = stub.SynthesizeSimple(
    tts_pb2.SynthesizeSimpleRequest(
        config=tts_pb2.SynthesisConfig(voice_name="Yara"),
        text="مرحباً"
    )
)
REST:
response = requests.post(
    "https://api-tts.withsm.ai/v1/audio/speech",
    json={"input": "مرحباً", "voice": "Yara"}
)
Performance Comparison
| Metric | gRPC | REST |
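|---|---|---|
| Payload size | Smaller (binary; ~50% is a rough figure that varies with the payload) | Larger (JSON text) |
| Latency | Typically lower (HTTP/2 multiplexing) | Typically higher, especially over HTTP/1.1 |
| Type safety | Compile-time | Runtime |
| Debugging | Requires tools such as grpcurl | Easy with curl |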
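One concrete source of the payload-size gap is that JSON cannot carry raw bytes, so binary data embedded in a JSON body is usually base64-encoded, which adds about one third. The sketch below measures that overhead for a hypothetical JSON body with an `audio_content` field. This body shape is an assumption made for illustration; the REST API may return raw audio instead, in which case the overhead does not apply.

```python
import base64
import json


def json_audio_size(audio: bytes) -> int:
    """Size in bytes of a JSON body that carries the audio as base64 text."""
    body = {"audio_content": base64.b64encode(audio).decode("ascii")}
    return len(json.dumps(body).encode("utf-8"))


def overhead_ratio(audio: bytes) -> float:
    """Ratio of the JSON-encoded size to the raw size (raw protobuf bytes are close to 1.0)."""
    return json_audio_size(audio) / len(audio)
```

For a 30,000-byte clip the ratio comes out near 1.33, since base64 emits 4 characters for every 3 input bytes.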
Troubleshooting
Connection Refused
# Check if service is listening
grpcurl api-tts.withsm.ai:443 list
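Solution: Confirm that the hostname and port (443) are correct and that your network or firewall allows outbound connections to them. Use the health check to confirm that the service is running.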
Proto File Not Found
Solution: Ensure proto files are in the correct path and use -I flag:
protoc -I./protos --python_out=. tts.proto
SSL/TLS Errors
All connections use TLS by default:
credentials = grpc.ssl_channel_credentials()
channel = grpc.secure_channel('api-tts.withsm.ai:443', credentials)
For custom certificates (e.g., self-signed):
with open('ca.crt', 'rb') as f:
    root_cert = f.read()
credentials = grpc.ssl_channel_credentials(root_certificates=root_cert)
channel = grpc.secure_channel('api-tts.withsm.ai:443', credentials)
Best Practices
- Reuse Channels — Don't create a new channel for every request
- Set Timeouts — Always specify a deadline to prevent hanging
- Handle Errors — Check status codes and handle failures gracefully
- Use Health Checks — Monitor service health before making requests
- Connection Pooling — Reuse connections for better performance
- Compression — Enable gzip compression for large payloads
- Keep-Alive — Configure keep-alive for long-lived connections
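The compression and keep-alive items can be set when the channel is created. The sketch below shows one way to do that with `grpcio`, using its standard channel arguments and the `compression` parameter. The specific intervals are illustrative assumptions, and whether these services accept gzip or frequent keep-alive pings is not documented here, so treat the values as starting points to verify.

```python
def keepalive_options(time_ms=30_000, timeout_ms=10_000):
    """Build gRPC channel arguments that enable HTTP/2 keep-alive pings."""
    return [
        ("grpc.keepalive_time_ms", time_ms),        # send a ping after this much idle time
        ("grpc.keepalive_timeout_ms", timeout_ms),  # wait this long for the ping ack
        ("grpc.keepalive_permit_without_calls", 1), # ping even with no active RPCs
    ]


def open_channel(target, options=None):
    """Open a TLS channel with keep-alive and gzip compression enabled."""
    import grpc  # imported here so keepalive_options works without grpcio installed

    return grpc.secure_channel(
        target,
        grpc.ssl_channel_credentials(),
        options=options or keepalive_options(),
        compression=grpc.Compression.Gzip,
    )
```

A single channel created this way, for example `open_channel('api-tts.withsm.ai:443')`, can be shared by all stubs in the process, which also covers the channel-reuse item.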