1. Core Components
- Task Manager
- Creates a “Translation Task” whenever a chunk needs translating.
- Tracks status: open → in-review → accepted → paid.
- Translator Registry & Reputation
- Every translator has a profile: `id`, `wallet_address`, `reputation_score`, `staked_tokens`.
- Reputation updates on each accepted translation or community vote.
- Review & Voting
- Completed translations go into a review queue.
- Other translators (or end-users) up/down-vote.
- Votes are weighted by the voter’s reputation (see the sketch after this list).
- Rewards Pool
- Powered by OTA/WOTA tokens.
- Each task has a bounty in WOTA.
- When a translation is accepted, the bounty is disbursed pro-rata to the contributor(s).
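A minimal sketch of reputation-weighted scoring for a submission (the `Vote` shape and the linear weighting are illustrative assumptions, not the platform’s actual formula):

```python
# reputation_voting.py -- illustrative sketch, not the platform's actual scoring code
from dataclasses import dataclass

@dataclass
class Vote:
    voter_reputation: float  # current reputation of the voter
    up: bool                 # True for an up-vote, False for a down-vote

def submission_score(votes: list[Vote]) -> float:
    """Weight each vote by the voter's reputation and return a net score."""
    return sum(v.voter_reputation if v.up else -v.voter_reputation for v in votes)

# Example: two high-reputation up-votes outweigh one low-reputation down-vote.
votes = [Vote(3.7, True), Vote(2.1, True), Vote(0.9, False)]
print(submission_score(votes))  # 4.9
```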
2. PoW-Gated Transcription API
- `GET /v1/pow/challenge` → `{ challenge: "a1b2…", difficulty: 3 }`
- Client runs `pow_gate.mine(challenge)` → gets `nonce` (see the sketch after this list).
- `POST /v1/transcribe` with form fields `challenge`, `nonce`, `engine`, plus an `audio` file and optional `translate_to`.
- Server verifies PoW, runs STT, then (if asked) translation, and returns JSON: `{ "lang_detected": "es", "transcript": "¿Cómo estás?" }`
- (Optional) Open a WebSocket at `/v1/stream/transcribe` for low-latency chunked operation.
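A minimal client-side miner sketch, assuming the gate accepts any nonce whose SHA-256 of `challenge + nonce` starts with `difficulty` zero hex digits (the real `pow_gate.mine` may use a different rule):

```python
# pow_client.py -- illustrative miner; the real pow_gate module may use a different rule
import hashlib
import itertools

def mine(challenge: str, difficulty: int) -> str:
    """Brute-force a hex nonce until sha256(challenge + nonce) has the required prefix."""
    target = "0" * difficulty
    for i in itertools.count():
        nonce = format(i, "x")
        digest = hashlib.sha256((challenge + nonce).encode()).hexdigest()
        if digest.startswith(target):
            return nonce

nonce = mine("a1b2", 3)  # difficulty 3 ≈ 16^3 tries on average
```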
Example cURL calls against the Thetis SDR server:
```bash
# 1) get PoW challenge
curl https://thetis.local/v1/pow/challenge \
  -H "X-API-KEY: your_key"

# 2) transcribe
curl -X POST https://thetis.local/v1/transcribe \
  -H "X-API-KEY: your_key" \
  -F challenge=abcd1234 \
  -F nonce=1f2e3d \
  -F engine=whisper-small \
  -F translate_to=en \
  -F audio=@recording.wav
```
Define a simple engine interface in your Python plugin layer, then register all three:
```python
# stt_engines.py
import json

import numpy as np
from langdetect import detect  # fallback language detection

class STTEngine:
    def transcribe(self, audio_np: np.ndarray, sr: int) -> tuple[str, str]:
        """Return (lang_code, transcript)."""
        raise NotImplementedError

class WhisperEngine(STTEngine):
    def __init__(self, model_size="small"):
        import whisper
        self.model = whisper.load_model(model_size)

    def transcribe(self, audio_np, sr=16000):
        # Whisper auto-detects the language when language=None.
        r = self.model.transcribe(audio_np, language=None, fp16=False)
        lang = r.get("language") or detect(r["text"])
        return lang, r["text"].strip()

class VoskEngine(STTEngine):
    def __init__(self, model_path="model"):
        from vosk import Model, KaldiRecognizer
        self.model = Model(model_path)
        self._KaldiRecognizer = KaldiRecognizer  # keep class handy for transcribe()

    def transcribe(self, audio_np, sr=16000):
        rec = self._KaldiRecognizer(self.model, sr)
        # Vosk expects 16-bit PCM bytes.
        rec.AcceptWaveform((audio_np * 32768).astype(np.int16).tobytes())
        text = rec.FinalResult()  # flush and return the full result
        return "und", json.loads(text).get("text", "")

class GoogleCloudEngine(STTEngine):
    def __init__(self, credentials_json):
        from google.cloud import speech
        self.speech = speech  # keep module handy for transcribe()
        self.client = speech.SpeechClient.from_service_account_json(credentials_json)

    def transcribe(self, audio_np, sr=16000):
        speech = self.speech
        # LINEAR16 wants 16-bit PCM bytes, not raw floats.
        audio = speech.RecognitionAudio(
            content=(audio_np * 32768).astype(np.int16).tobytes())
        config = speech.RecognitionConfig(
            encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
            sample_rate_hertz=sr,
            language_code="en-US",  # Google requires a concrete BCP-47 tag, not "und"
            enable_automatic_punctuation=True)
        resp = self.client.recognize(config=config, audio=audio)
        text = " ".join(r.alternatives[0].transcript for r in resp.results)
        lang = resp.results[0].language_code if resp.results else "und"
        return lang, text
```
In your relay callback, select the engine by name: `"whisper-small"`, `"whisper-large"`, `"vosk"`, or `"google"`.
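A minimal registry sketch mapping those names onto engine instances; `ENGINES` is the dict that `process_chunk` (below) looks up, and the model paths/credentials are placeholders:

```python
# engine_registry.py -- illustrative wiring; adjust paths and credentials to your setup
from stt_engines import WhisperEngine, VoskEngine, GoogleCloudEngine

# Eager instantiation loads every model up front; lazy construction may be
# preferable if memory is tight.
ENGINES = {
    "whisper-small": WhisperEngine(model_size="small"),
    "whisper-large": WhisperEngine(model_size="large"),
    "vosk": VoskEngine(model_path="model"),
    "google": GoogleCloudEngine(credentials_json="gcp_credentials.json"),
}
```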
3. Adding Translation
Treat translation as a separate pipeline stage and wrap any backend behind a translator interface:
```python
# translator.py
class Translator:
    def translate(self, text: str, target: str) -> str:
        raise NotImplementedError

class MarianMTTranslator(Translator):
    # "xx-to-en" is a placeholder; substitute a concrete pair,
    # e.g. Helsinki-NLP/opus-mt-es-en
    def __init__(self, model_name="Helsinki-NLP/opus-mt-xx-to-en"):
        from transformers import MarianMTModel, MarianTokenizer
        self.tokenizer = MarianTokenizer.from_pretrained(model_name)
        self.model = MarianMTModel.from_pretrained(model_name)

    def translate(self, text, target):
        # Marian models are trained per language pair, so `target` is
        # effectively fixed by the model you loaded.
        tokens = self.tokenizer(text, return_tensors="pt", padding=True)
        out = self.model.generate(**tokens)
        return self.tokenizer.batch_decode(out, skip_special_tokens=True)[0]
```
In the relay callback you’d expose:
```python
def process_chunk(buffer_bytes, engine_name, translate_to=None):
    audio = bytes_to_np(buffer_bytes)  # your helper: raw bytes -> float numpy array
    lang, raw = ENGINES[engine_name].transcribe(audio)
    if translate_to:
        raw = TRANSLATOR.translate(raw, translate_to)
    return lang, raw
```
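Wired together, a single chunk flows through like this (the Marian model name here is one concrete choice for the placeholder noted above):

```python
TRANSLATOR = MarianMTTranslator("Helsinki-NLP/opus-mt-es-en")  # Spanish -> English

with open("chunk456.raw", "rb") as f:
    lang, text = process_chunk(f.read(), "whisper-small", translate_to="en")
print(lang, text)  # e.g. "es", "How are you?"
```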
When a task closes:
- The primary translator gets 70% of the bounty.
- Reviewers who voted with the winning consensus split 20%.
- A platform fee retains 10% for operations or burning.
All payouts are on-chain transfers to each user’s wallet via your token-transfer module.
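A minimal sketch of that split (dividing the reviewers’ 20% by reputation weight is an assumption; your token-transfer module may apportion it differently):

```python
# payout.py -- illustrative split; actual transfers go through your token-transfer module
def split_bounty(bounty: float, primary: str, consensus_voters: dict[str, float]):
    """Return {wallet: WOTA amount}. consensus_voters maps wallet -> voter reputation."""
    payouts = {primary: 0.70 * bounty}
    reviewer_pool = 0.20 * bounty
    total_rep = sum(consensus_voters.values()) or 1.0
    for wallet, rep in consensus_voters.items():
        payouts[wallet] = payouts.get(wallet, 0.0) + reviewer_pool * rep / total_rep
    payouts["platform"] = 0.10 * bounty
    return payouts

print(split_bounty(10.0, "osmo1primary", {"osmo1a": 3.0, "osmo1b": 1.0}))
# {'osmo1primary': 7.0, 'osmo1a': 1.5, 'osmo1b': 0.5, 'platform': 1.0}
```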
4. REST API Endpoints
```yaml
POST /v1/tasks                    # create a translation task
  body: { audio_id, source_lang, target_lang, bounty: number }

GET /v1/tasks?status=open         # list open tasks

POST /v1/tasks/{task_id}/submit   # submit a translation
  body: { translator_id, text }

POST /v1/tasks/{task_id}/vote     # vote on a submission
  body: { voter_id, submission_id, vote: up|down }

POST /v1/tasks/{task_id}/accept   # mark the final translation
  body: { submission_id }

GET /v1/translators/{translator_id}/reputation   # fetch reputation score and stats

POST /v1/rewards/distribute       # trigger payout (protected/admin)
```
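A minimal FastAPI sketch of the first two endpoints, assuming an in-memory store (field names follow the schema above; persistence and auth are omitted):

```python
# tasks_api.py -- illustrative sketch of POST /v1/tasks and GET /v1/tasks
import uuid

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
TASKS: dict[str, dict] = {}  # in-memory store for the sketch

class TaskCreate(BaseModel):
    audio_id: str
    source_lang: str
    target_lang: str
    bounty: float

@app.post("/v1/tasks")
def create_task(body: TaskCreate):
    task_id = f"task{uuid.uuid4().hex[:6]}"
    TASKS[task_id] = {"id": task_id, **body.dict(), "status": "open"}
    return TASKS[task_id]

@app.get("/v1/tasks")
def list_tasks(status: str = "open"):
    return [t for t in TASKS.values() if t["status"] == status]
```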
5. Data Models (simplified)
```jsonc
// Translator
{
  "id": "trike123",
  "wallet": "osmo1…",
  "reputation": 3.7,
  "staked": 5.0 // WOTA
}

// Task
{
  "id": "task789",
  "audio_id": "chunk456",
  "source": "es", "target": "en",
  "bounty": 10.0,
  "status": "in-review"
}

// Submission
{
  "id": "sub001",
  "task_id": "task789",
  "translator_id": "trike123",
  "text": "Hello world",
  "votes": { "up": 5, "down": 1 },
  "score": 4.4
}
```
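If you serve these from FastAPI, the same shapes map directly onto Pydantic models (a sketch; field types are inferred from the examples above):

```python
# models.py -- Pydantic mirrors of the simplified data models
from pydantic import BaseModel

class Translator(BaseModel):
    id: str
    wallet: str
    reputation: float
    staked: float  # WOTA

class Votes(BaseModel):
    up: int = 0
    down: int = 0

class Task(BaseModel):
    id: str
    audio_id: str
    source: str
    target: str
    bounty: float
    status: str  # open | in-review | accepted | paid

class Submission(BaseModel):
    id: str
    task_id: str
    translator_id: str
    text: str
    votes: Votes
    score: float
```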
Install dependencies:
```bash
# note: the `whisper` import is provided by the openai-whisper package;
# python-multipart is needed by FastAPI for form fields, langdetect by stt_engines.py
pip install fastapi uvicorn python-multipart numpy soundfile openai-whisper vosk transformers google-cloud-speech langdetect
```
Run the server:
```bash
uvicorn process_api:app --host 0.0.0.0 --port 8000
```
Client flow
- `GET /v1/pow/challenge` → `{ challenge, difficulty }`
- Mine a nonce: `nonce, _ = PoWGate(difficulty).mine(challenge)`
- `POST /v1/process` as `multipart/form-data` with fields `challenge`, `nonce`, `engine`, and either `audio` (WAV file) or `text`
- Optional: `translate_to` (e.g. `"en"`, `"es"`)
- Receive `{ lang, text }` (a scripted version of this flow follows)
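A scripted sketch of that flow using `requests`, reusing the illustrative `mine()` helper from the PoW section (the host and API key are placeholders):

```python
# client_flow.py -- illustrative end-to-end client
import requests

from pow_client import mine  # the illustrative miner sketched earlier

BASE = "https://thetis.local"
HEADERS = {"X-API-KEY": "your_key"}

# 1) fetch a PoW challenge
ch = requests.get(f"{BASE}/v1/pow/challenge", headers=HEADERS).json()
nonce = mine(ch["challenge"], ch["difficulty"])

# 2) submit audio for transcription + translation
with open("recording.wav", "rb") as f:
    resp = requests.post(
        f"{BASE}/v1/process",
        headers=HEADERS,
        data={"challenge": ch["challenge"], "nonce": nonce,
              "engine": "whisper-small", "translate_to": "en"},
        files={"audio": f},
    )
print(resp.json())  # {"lang": "...", "text": "..."}
```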
That single endpoint now handles both raw audio transcription and pure-text translation, all behind the same engine interface. Adjust model paths, credentials, etc. as needed.