优谷雅言 Developer Documentation
Chinese-English bilingual speech evaluation · compatible with the 声通 API protocol · v1
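Overview
优谷雅言 scores read-aloud audio along multiple dimensions and at multiple levels, producing structured scores, a natural-language report, and a reference demonstration audio. Base URL: https://open.shengzhiai.com. All endpoints return JSON.
An evaluation returns four artifacts: ① structured scores result (声通-compatible field naming) ② the AI assessment report report ③ the recognized text asrText (with omitted / misread / inserted-reading annotations) ④ the reference audio standardAudio. Field-by-field meanings are listed under Return values.
Authentication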
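Three methods; choose by scenario:
| Method | Usage | Use case |
|---|---|---|
| API Key (recommended) | Request header X-App-Key: {appKey} (or query parameter ?api_key={appKey}) | Direct server-side calls to the evaluation / TTS / report endpoints; create the appKey on the console "API Key" page |
| Bearer JWT | Authorization: Bearer {token}; the token is issued by POST /api/v1/auth/login | Calls made under a console / browser login session (the online evaluation page uses this method) |
| 声通 sig signature | Request headers X-App-Key / X-Timestamp (seconds, within ±300s) / X-Nonce (optional, replay protection) / X-Signature. Signature: take the business parameters (query/form, dropping empty values), sort them by key in ascending order, and join them as k1=v1&k2=v2; X-Signature = Base64( HMAC-SHA256( payload, secretKey ) ). For the WS handshake the credentials move to the query string (?appKey=&timestamp=&signature=&nonce=). | The 声通 protocol compatibility layer (see coreType reference) and the 声通-compatible WS; existing 声通 integrations migrate with zero changes |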
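The signature rule in the table can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the values are joined as raw strings without URL encoding, only the business parameters enter the payload, and `demo_secret_key` is a placeholder rather than a real secret. None of these details are confirmed beyond what the table states.

```python
import base64
import hashlib
import hmac


def build_signature(params: dict, secret_key: str) -> tuple[str, str]:
    """Return (payload, signature) following the sig rule in the table above."""
    # Drop empty values, then sort the remaining business parameters by key.
    kept = {k: v for k, v in params.items() if v not in (None, "")}
    payload = "&".join(f"{k}={kept[k]}" for k in sorted(kept))
    # Base64( HMAC-SHA256( payload, secretKey ) )
    digest = hmac.new(
        secret_key.encode("utf-8"), payload.encode("utf-8"), hashlib.sha256
    ).digest()
    return payload, base64.b64encode(digest).decode("ascii")


payload, signature = build_signature(
    {"language": "zh-CN", "coreType": "sentence", "nonce": ""},  # empty value is dropped
    "demo_secret_key",  # placeholder secret (assumption)
)
print(payload)
print(signature)
```

The resulting string would go into X-Signature, next to X-App-Key and a second-level X-Timestamp (for example `int(time.time())`).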
The Authorization: Bearer header accepts only a JWT issued at login. Putting an appKey in Bearer returns {"code":2002,"message":"token无效"}; pass the appKey in the X-App-Key header instead.
Quick start
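Four steps to your first evaluation:
- Get credentials: create an API Key in the open platform console to obtain an appKey (sent in the X-App-Key request header) and a secretKey.
- Prepare audio: wav / mp3 / ogg; 16kHz, 16-bit, mono wav is recommended (≤10MB, ≤5 minutes). The engine resamples other sample rates automatically, but very low sample rates reduce scoring accuracy.
- Call the evaluation: multipart form POST /api/v1/evaluate with the fields audio (file) + config (JSON string; the Content-Type of this part must be application/json).
- Parse the result: read the total score result.overall and the per-dimension scores; save recordId so the report can be fetched later with GET /api/v1/report/{recordId}.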
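test_app_key is a trial key (shared and liable to be rotated at any time; for production, create your own appKey in the console), so the examples can be copied and run as they are. For a no-code trial, open the /eval online evaluation page. Submit a Chinese reading clip for sentence evaluation: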
curl -X POST https://open.shengzhiai.com/api/v1/evaluate \
-H "X-App-Key: test_app_key" \
-F "audio=@reading.wav;type=audio/wav" \
-F 'config={"coreType":"sentence","referenceText":"鹅,鹅,鹅,曲项向天歌。","language":"zh-CN"};type=application/json'
import requests
cfg = '{"coreType":"sentence","referenceText":"鹅,鹅,鹅,曲项向天歌。","language":"zh-CN"}'
r = requests.post("https://open.shengzhiai.com/api/v1/evaluate",
headers={"X-App-Key": "test_app_key"},
files={"audio": ("reading.wav", open("reading.wav", "rb"), "audio/wav"),
"config": (None, cfg, "application/json")}) # config 分片须为 application/json
r.raise_for_status()
print(r.json()["result"]["overall"])
// OkHttp(com.squareup.okhttp3:okhttp)
OkHttpClient client = new OkHttpClient();
String cfg = "{\"coreType\":\"sentence\",\"referenceText\":\"鹅,鹅,鹅,曲项向天歌。\",\"language\":\"zh-CN\"}";
RequestBody body = new MultipartBody.Builder().setType(MultipartBody.FORM)
.addFormDataPart("audio", "reading.wav",
RequestBody.create(new File("reading.wav"), MediaType.parse("audio/wav")))
.addFormDataPart("config", null,
RequestBody.create(cfg, MediaType.parse("application/json")))
.build();
Request req = new Request.Builder()
.url("https://open.shengzhiai.com/api/v1/evaluate")
.header("X-App-Key", "test_app_key")
.post(body).build();
try (Response resp = client.newCall(req).execute()) {
System.out.println(resp.body().string());
}
const fd = new FormData();
fd.append("audio", fileBlob, "reading.wav");
fd.append("config", new Blob([JSON.stringify(
{coreType:"sentence", referenceText:"鹅,鹅,鹅,曲项向天歌。", language:"zh-CN"}
)], {type:"application/json"})); // config 分片须为 application/json
const r = await fetch("https://open.shengzhiai.com/api/v1/evaluate",
{method:"POST", headers:{"X-App-Key":"test_app_key"}, body:fd});
console.log((await r.json()).result.overall);
package main
import (
"bytes"
"fmt"
"io"
"mime/multipart"
"net/http"
"net/textproto"
"os"
)
func main() {
buf := &bytes.Buffer{}
w := multipart.NewWriter(buf)
fw, _ := w.CreateFormFile("audio", "reading.wav")
f, _ := os.Open("reading.wav")
defer f.Close()
io.Copy(fw, f)
h := textproto.MIMEHeader{} // config 分片须为 application/json
h.Set("Content-Disposition", `form-data; name="config"`)
h.Set("Content-Type", "application/json")
cw, _ := w.CreatePart(h)
cw.Write([]byte(`{"coreType":"sentence","referenceText":"鹅,鹅,鹅,曲项向天歌。","language":"zh-CN"}`))
w.Close()
req, _ := http.NewRequest("POST",
"https://open.shengzhiai.com/api/v1/evaluate", buf)
req.Header.Set("X-App-Key", "test_app_key")
req.Header.Set("Content-Type", w.FormDataContentType())
resp, _ := http.DefaultClient.Do(req)
defer resp.Body.Close()
body, _ := io.ReadAll(resp.Body)
fmt.Println(string(body))
}
/* gcc evaluate.c -o evaluate -lcurl (libcurl ≥ 7.56,curl_mime API) */
#include <stdio.h>
#include <curl/curl.h>
int main(void) {
curl_global_init(CURL_GLOBAL_ALL);
CURL *h = curl_easy_init();
if (!h) return 1;
curl_mime *form = curl_mime_init(h);
curl_mimepart *p = curl_mime_addpart(form); /* 音频文件分片 */
curl_mime_name(p, "audio");
curl_mime_filedata(p, "reading.wav");
curl_mime_type(p, "audio/wav");
p = curl_mime_addpart(form); /* config JSON 分片 */
curl_mime_name(p, "config");
curl_mime_data(p,
"{\"coreType\":\"sentence\",\"referenceText\":\"鹅,鹅,鹅,曲项向天歌。\",\"language\":\"zh-CN\"}",
CURL_ZERO_TERMINATED);
curl_mime_type(p, "application/json"); /* 必须:否则报 50000 */
struct curl_slist *hdr =
curl_slist_append(NULL, "X-App-Key: test_app_key");
curl_easy_setopt(h, CURLOPT_URL,
"https://open.shengzhiai.com/api/v1/evaluate");
curl_easy_setopt(h, CURLOPT_HTTPHEADER, hdr);
curl_easy_setopt(h, CURLOPT_MIMEPOST, form);
CURLcode rc = curl_easy_perform(h); /* 响应 JSON 默认写到 stdout */
if (rc != CURLE_OK)
fprintf(stderr, "error: %s\n", curl_easy_strerror(rc));
curl_slist_free_all(hdr);
curl_mime_free(form);
curl_easy_cleanup(h);
curl_global_cleanup();
return (int)rc;
}
<?php
// PHP ≥ 8.1(CURLStringFile 用于带 Content-Type 的字符串分片)
$cfg = '{"coreType":"sentence","referenceText":"鹅,鹅,鹅,曲项向天歌。","language":"zh-CN"}';
$ch = curl_init("https://open.shengzhiai.com/api/v1/evaluate");
curl_setopt_array($ch, [
CURLOPT_POST => true,
CURLOPT_RETURNTRANSFER => true,
CURLOPT_HTTPHEADER => ["X-App-Key: test_app_key"],
CURLOPT_POSTFIELDS => [ // 数组形式即 multipart/form-data
"audio" => new CURLFile("reading.wav", "audio/wav", "reading.wav"),
"config" => new CURLStringFile($cfg, "config", "application/json"),
],
]);
$raw = curl_exec($ch);
if ($raw === false) { die("curl error: " . curl_error($ch)); }
curl_close($ch);
$res = json_decode($raw, true);
echo $res["result"]["overall"], PHP_EOL;
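Reading evaluation POST /api/v1/evaluate
Multi-dimensional scoring of Chinese and English reading (integrity / accuracy / tone / fluency / reading skill / emotion), returned at passage / sentence / character / phoneme levels. Multipart form: audio (file) + config (JSON; Content-Type must be application/json).
config core fields
| Field | Description |
|---|---|
| coreType | Required: word / sentence / passage / alpha / connected / open; see coreType reference |
| referenceText | Required: reference text (the prompt for open-ended questions), ≤1000 characters |
| language | zh-CN / en-US / en-GB (default en-US) |
| slack / scale / precision | Strictness / score range / precision |
| toneWeight | Share of the tone dimension in the total score (Chinese, default 0.2) |
All optional parameters (refPinyin / agegroup / phonemeOutput / includeReport / includeStandardAudio / includeAsrText / taskType, etc.) are listed under Evaluation parameters.
Multi-language examples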
curl -X POST https://open.shengzhiai.com/api/v1/evaluate \
-H "X-App-Key: test_app_key" \
-F "audio=@reading.wav;type=audio/wav" \
-F 'config={"coreType":"sentence","referenceText":"鹅,鹅,鹅,曲项向天歌。","language":"zh-CN","includeReport":true,"includeStandardAudio":true,"includeAsrText":true};type=application/json'
import json
import requests
cfg = {"coreType": "sentence",
       "referenceText": "鹅,鹅,鹅,曲项向天歌。",
       "language": "zh-CN", "includeReport": True}
r = requests.post("https://open.shengzhiai.com/api/v1/evaluate",
    headers={"X-App-Key": "test_app_key"},
    files={"audio": ("reading.wav", open("reading.wav", "rb"), "audio/wav"),
           # the config part must be sent as application/json
           "config": (None, json.dumps(cfg, ensure_ascii=False), "application/json")})
d = r.json()
print(d["recordId"], d["result"]["overall"], d["result"]["tone"])
print(d["report"]["summary"])
// OkHttp:multipart audio + config(application/json)
String cfg = "{\"coreType\":\"sentence\",\"referenceText\":\"鹅,鹅,鹅,曲项向天歌。\",\"language\":\"zh-CN\"}";
RequestBody body = new MultipartBody.Builder().setType(MultipartBody.FORM)
.addFormDataPart("audio", "reading.wav",
RequestBody.create(new File("reading.wav"), MediaType.parse("audio/wav")))
.addFormDataPart("config", null,
RequestBody.create(cfg, MediaType.parse("application/json")))
.build();
Request req = new Request.Builder()
.url("https://open.shengzhiai.com/api/v1/evaluate")
.header("X-App-Key", "test_app_key").post(body).build();
try (Response resp = new OkHttpClient().newCall(req).execute()) {
System.out.println(resp.body().string());
}
// 见「快速开始」Go 完整示例;要点:config 分片用 CreatePart 显式
// 设 Content-Type: application/json,音频用 CreateFormFile。
h := textproto.MIMEHeader{}
h.Set("Content-Disposition", `form-data; name="config"`)
h.Set("Content-Type", "application/json")
cw, _ := w.CreatePart(h)
cw.Write([]byte(`{"coreType":"sentence","referenceText":"鹅,鹅,鹅,曲项向天歌。","language":"zh-CN"}`))
// Node 20+:内置 fetch / FormData / Blob
import { readFile } from "node:fs/promises";
const fd = new FormData();
fd.append("audio", new Blob([await readFile("reading.wav")], {type:"audio/wav"}), "reading.wav");
fd.append("config", new Blob([JSON.stringify({
coreType: "sentence",
referenceText: "鹅,鹅,鹅,曲项向天歌。",
language: "zh-CN"
})], {type:"application/json"}));
const r = await fetch("https://open.shengzhiai.com/api/v1/evaluate",
{method:"POST", headers:{"X-App-Key":"test_app_key"}, body:fd});
const d = await r.json();
console.log(d.recordId, d.result.overall);
Response (excerpt, from a real call, recordId=eval_9aa80c616edb)
{
"recordId": "eval_9aa80c616edb", "eof": 1,
"result": {
"overall": 67, "pronunciation": 69, "tone": 100, "fluency": 71,
"rhythm": 67, "integrity": 68, "speed": 77, "rear_tone": "rise",
"duration": "7.320", "warning": [],
"words": [{"word":"鹅","pinyin":"e","rawpinyin":"e2","symbolpinyin":"é",
"charType":0,"readType":0,"tone":"tone2",
"scores":{"overall":100,"pronunciation":100,"tone":100},
"span":{"start":28,"end":32},
"phonemes":[{"phoneme":"E","phone":"e","pronunciation":100,"tone_index":"2"}], ...}],
"sentences": [{"sentence":"鹅,鹅,鹅,曲项向天歌。","index":0,
"scores":{"overall":67,"pronunciation":69,"fluency":71,"integrity":68},"details":[...]}],
"compositeReport": {"compositeScore":67,"emotionScore":66,"stopConnScore":70,
"intonationScore":72,"readSpeedScore":57, ...},
"yuguScores": {...}
},
"report": {"source":"llm","summary":"本次朗读总体表现尚可…","dimensions":{...},"suggestions":[...]},
"standardAudio": {"url":"https://…/tts/audio/ab3393f2021111ef.wav","format":"wav","duration":"4.280"},
"asrText": {"text":"鹅 鹅 鹅 曲 项 向 天 歌 …","alignment":[...]},
"warnings": []
}
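Return values
The field descriptions below are based on one real /api/v1/evaluate call (a Chinese reading of 《咏鹅》, recordId eval_9aa80c616edb); the Example column shows the actual values from that call.
Top-level fields
| Field | Type | Example (real value) | Description |
|---|---|---|---|
| recordId | string | "eval_9aa80c616edb" | Evaluation record ID, used to fetch the report via GET /api/v1/report/{recordId}; in connected / open modes the prefix is conn_ / open_ |
| eof | int | 1 | 1 = final result |
| result | object | - | Structured scores (声通-compatible field naming); see the table below |
| report | object | - | AI assessment report: source ("llm" or template fallback), summary, dimensions (per-dimension comments for integrity/accuracy/fluency/reading_skill/emotion), suggestions[] |
| standardAudio | object | - | Reference audio: url (full URL, directly playable, GET needs no authentication), format ("wav"), duration ("4.280") |
| asrText | object | - | Recognized text: text (space-separated), alignment[] (per-character alignment, see below) |
| warnings | array | [] | Top-level warning summary; merge it with result.warning[] and filter out empty values before use |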
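The last row asks clients to merge the two warning lists and filter out empty values. A minimal sketch of that step follows; dropping exact duplicates is an extra assumption on top of what the table says, and the sample response is made up for illustration.

```python
def merge_warnings(response: dict) -> list:
    """Combine result.warning[] and top-level warnings[], skipping empty entries."""
    inner = (response.get("result") or {}).get("warning") or []
    top = response.get("warnings") or []
    merged = []
    for item in list(inner) + list(top):
        # Skip None / "" / {} placeholders and exact duplicates (assumption).
        if item and item not in merged:
            merged.append(item)
    return merged


sample = {
    "result": {"warning": [{"code": 1002, "message": "Audio volume too low!"}]},
    "warnings": [None, "", {"code": 1002, "message": "Audio volume too low!"}],
}
print(merge_warnings(sample))
```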
result fields
| Field | Type | Example (real value) | Description |
|---|---|---|---|
| overall | int | 67 | Total score (0 to scale, default 0-100) |
| integrity | int | 68 | Integrity (dominated by reading coverage: omissions and insertions lower it directly) |
| pronunciation | int | 69 | Pronunciation accuracy (modulated by character-level acoustic GOP) |
| tone | int | 100 | Tone (Chinese; decided by an acoustic tone classifier) |
| fluency | int | 71 | Fluency (speech rate, pauses, naturalness) |
| rhythm | int | 67 | Rhythm / reading skill (pausing and linking, stress, intonation) |
| speed | int | 77 | Raw speech rate (characters per minute), not a score; the normalized rate score is compositeReport.readSpeedScore (57 in this example) |
| rear_tone | string | "rise" | Sentence-final intonation: rise / fall |
| duration | string | "7.320" | Audio duration (seconds, as a string); numeric_duration holds the numeric form 7.32 |
| warning | array | [] | Audio quality warnings [{code,message}]; the code table is under Error code reference |
| words | array | 18 items | Character / word-level details; see the table below |
| sentences | array | 1 item | Sentence-level details: sentence / index / scores{overall,pronunciation,fluency,integrity} / details[] (same structure as words) |
| compositeReport | object | - | Composite report scores: compositeScore 67, integrityScore 68, accuracyScore 69, toneScore 100, nasalsScore 100, phonemeScore 100, fluencyScore 71, readSpeedScore 57, readSpeed 77, skillScore 67, stopConnScore 70, stressScore 67, intonationScore 72, emotionScore 66, timbreScore 68, etc., plus the five suggestions integritySuggest / accuracySuggest / fluencySuggest / skillSuggest / emotionSuggest |
| yuguScores | object | - | Engine-native hierarchical scores: accuracy{tone,nasal,phoneme}, fluency{speed_score,naturalness}, reading_skill{pause,stress,intonation}, prosody, emotion, timbre, etc. (numeric) |
| kernel_version / resource_version | string | "1.0.0" | Evaluation kernel / resource package version (basis for DOC-005 version sync) |
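words[] fields (character / word level)
| Field | Example (real value) | Description |
|---|---|---|
| word | "鹅" | Character / word text |
| pinyin / rawpinyin / symbolpinyin | "e" / "e2" / "é" | Pinyin (no tone / numeric tone / tone mark) |
| charType | 0 | 0 = normal character; punctuation and other unread characters take other values |
| readType | 0 | 0 normal / 1 misread / 2 omitted / 3 repeated; inserted characters appear as inserted items |
| tone | "tone2" | Reference tone |
| scores | {overall:100, pronunciation:100, tone:100, overall_pron:100, prominence:0} | Character-level scores; English also has stress |
| span | {start:28, end:32} | Time interval in 10ms frames (×10 = milliseconds), used for cutting audio for playback |
| pause | {type:0, duration:0} | Type and duration of the pause after this character |
| phonemes[] | {phoneme:"E", phone:"e", category:0, pronunciation:100, tone_index:"2", span:{…}} | Phoneme-level scores (controlled by phonemeOutput); see Phoneme-level scoring |
| normalized_syllables[] | {syllables:"鹅", pinyin:"e", tone:"tone2", tone_sandhi:"tone2"} | Normalized syllables (including tone sandhi, tone_sandhi) |
| word_parts[] | {part:"鹅", charType:0, beginIndex:0, endIndex:0} | Character segmentation positions |
| phonics[] / linkable | [] / false | English phonics chunks / whether this is a linking position (English mode) |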
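To show how span and readType might be consumed, here is a small sketch that converts frame spans to milliseconds (×10, as stated in the table) and labels each item. The English labels and the function name are illustrative choices, not part of the API.

```python
FRAME_MS = 10  # one span unit is a 10ms frame, per the table above
READ_TYPES = {0: "normal", 1: "misread", 2: "omitted", 3: "repeated"}


def word_segments(words: list) -> list:
    """Turn words[] items into playback segments with millisecond offsets."""
    segments = []
    for w in words:
        span = w.get("span") or {}
        segments.append({
            "word": w.get("word"),
            "status": READ_TYPES.get(w.get("readType"), "unknown"),
            "start_ms": span["start"] * FRAME_MS if "start" in span else None,
            "end_ms": span["end"] * FRAME_MS if "end" in span else None,
        })
    return segments


first = {"word": "鹅", "readType": 0, "span": {"start": 28, "end": 32}}
print(word_segments([first]))
```

With the sample value from the table, the first character spans 280ms to 320ms of the recording.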
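asrText.alignment[] fields (per-character alignment)
| Field | Example (real value) | Description |
|---|---|---|
| char / index | "鹅" / 0 | Recognized character and its index in the reference text |
| read_status | "correct" | Reading verdict: correct / wrong / missed, etc. |
| start_time / end_time | 28 / 32 | Time (10ms frames) |
| asr_pinyin / asr_tone / asr_final | "e2" / "2" / "e" | Pinyin / tone / final actually recognized |
| gop_score | 100.0 | Per-character acoustic GOP value (0-100; null when there is no acoustic evidence, never fabricated) |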
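Because gop_score can be null, any aggregate has to skip missing values rather than treat them as zero. The sketch below averages the available GOP values and collects characters whose read_status is not "correct"; the sample entries other than the first are invented for illustration.

```python
def summarize_alignment(alignment: list):
    """Return (mean GOP over non-null entries, characters not read correctly)."""
    scored = [a["gop_score"] for a in alignment if a.get("gop_score") is not None]
    flagged = [a["char"] for a in alignment if a.get("read_status") != "correct"]
    mean_gop = sum(scored) / len(scored) if scored else None
    return mean_gop, flagged


alignment = [
    {"char": "鹅", "index": 0, "read_status": "correct", "gop_score": 100.0},
    {"char": "曲", "index": 3, "read_status": "wrong", "gop_score": None},   # invented sample
    {"char": "项", "index": 4, "read_status": "correct", "gop_score": 80.0},  # invented sample
]
mean_gop, flagged = summarize_alignment(alignment)
print(mean_gop, flagged)
```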
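Open-ended questions POST /api/v1/evaluate · config.coreType = "open"
Picture description / situational Q&A / free expression. This uses the same endpoint as reading evaluation: set coreType:"open" + taskType (picture / situational / free) in config, with referenceText as the prompt or task hint (the test taker answers freely). It returns three groups of subjective dimensions (content / language / delivery) plus an objective audio-quality dimension (Distill-MOS).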
curl -X POST https://open.shengzhiai.com/api/v1/evaluate \
-H "X-App-Key: test_app_key" \
-F "audio=@answer.wav;type=audio/wav" \
-F 'config={"coreType":"open","taskType":"free","referenceText":"请谈谈你最喜欢的季节。","language":"zh-CN"};type=application/json'
import json
import requests
cfg = {"coreType": "open", "taskType": "free",
       "referenceText": "请谈谈你最喜欢的季节。", "language": "zh-CN"}
r = requests.post("https://open.shengzhiai.com/api/v1/evaluate",
headers={"X-App-Key": "test_app_key"},
files={"audio": ("answer.wav", open("answer.wav", "rb"), "audio/wav"),
"config": (None, json.dumps(cfg, ensure_ascii=False), "application/json")})
res = r.json()["result"]
print(res["overall"], res["content"], res["feedback"]["suggestions"])
String cfg = "{\"coreType\":\"open\",\"taskType\":\"free\","
+ "\"referenceText\":\"请谈谈你最喜欢的季节。\",\"language\":\"zh-CN\"}";
// multipart 组装与朗读评测完全一致(audio 文件 + config application/json 分片)
// 取分:result.overall / result.content / result.delivery / result.feedback
cw.Write([]byte(`{"coreType":"open","taskType":"free",` +
`"referenceText":"请谈谈你最喜欢的季节。","language":"zh-CN"}`))
// multipart 组装与朗读评测一致;解析 result.overall / content / delivery / feedback
fd.append("config", new Blob([JSON.stringify({
coreType: "open", taskType: "free",
referenceText: "请谈谈你最喜欢的季节。", language: "zh-CN"
})], {type:"application/json"}));
const d = await (await fetch("https://open.shengzhiai.com/api/v1/evaluate",
{method:"POST", headers:{"X-App-Key":"test_app_key"}, body:fd})).json();
console.log(d.result.overall, d.result.feedback);
Response (excerpt, from a real call, recordId=open_3538888d398c)
{
"recordId": "open_3538888d398c", "eof": 1,
"result": {
"taskType": "free", "language": "zh", "duration_s": 4.36,
"transcript": "今天天气很好我们一起去公园小鸟在唱歌", "hasSpeech": true,
"overall": 57,
"content": {"overall":50, "relevance":60, "coherence":50, "task_achievement":40},
"languageUse": {"overall":35, "grammar":30, "vocabulary":40},
"delivery": {"overall":91, "fluency":100, "pronunciation":78,
"speech_rate_label":"266.0 字/分", "n_pauses":0},
"feedback": {"strengths":"…", "weaknesses":"回答偏离了题目要求…", "suggestions":["…"]},
"audioQuality": {"mos":4.51, "quality":88},
"scoringSource": "asr+llm+vad+sfgop+distillmos"
}
}
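English linking POST /api/v1/evaluate · config.coreType = "connected"
Detection of English linking / loss of plosion / weak forms, plus rhythm analysis. This uses the same endpoint as reading evaluation: set coreType:"connected" and language:"en-US" in config, with referenceText as the target English sentence. Each word boundary is annotated with whether linking was realized.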
curl -X POST https://open.shengzhiai.com/api/v1/evaluate \
-H "X-App-Key: test_app_key" \
-F "audio=@english.wav;type=audio/wav" \
-F 'config={"coreType":"connected","referenceText":"The quick brown fox jumps over the lazy dog.","language":"en-US"};type=application/json'
import json
import requests
cfg = {"coreType": "connected", "language": "en-US",
       "referenceText": "The quick brown fox jumps over the lazy dog."}
r = requests.post("https://open.shengzhiai.com/api/v1/evaluate",
headers={"X-App-Key": "test_app_key"},
files={"audio": ("english.wav", open("english.wav", "rb"), "audio/wav"),
"config": (None, json.dumps(cfg), "application/json")})
res = r.json()["result"]
print(res["connected_overall"], res["linking"], res["rhythm"], res["boundaries"])
String cfg = "{\"coreType\":\"connected\",\"language\":\"en-US\","
+ "\"referenceText\":\"The quick brown fox jumps over the lazy dog.\"}";
// multipart 组装与朗读评测完全一致
// 取分:result.connected_overall / linking / rhythm / boundaries[]
cw.Write([]byte(`{"coreType":"connected","language":"en-US",` +
`"referenceText":"The quick brown fox jumps over the lazy dog."}`))
// 解析 result.connected_overall / linking / rhythm / boundaries
fd.append("config", new Blob([JSON.stringify({
coreType: "connected", language: "en-US",
referenceText: "The quick brown fox jumps over the lazy dog."
})], {type:"application/json"}));
const d = await (await fetch("https://open.shengzhiai.com/api/v1/evaluate",
{method:"POST", headers:{"X-App-Key":"test_app_key"}, body:fd})).json();
console.log(d.result.connected_overall, d.result.boundaries);
Response (excerpt, from a real call, recordId=conn_7a9b15dc84f0)
{
"recordId": "conn_7a9b15dc84f0", "eof": 1,
"result": {
"connected_overall": 69, "linking": 8, "rhythm": 100,
"elision": 0, "reduction": 100,
"raw": {"linking_rate":0.548, "nPVI_V":31.5, "pctV":33.3, ...},
"n_boundaries": 3,
"boundaries": [
{"between":["quick","brown"], "tags":["elision"], "gap_ms":90.0,
"continuity":0.25, "realized":0.37, "start_ms":740.0, "end_ms":970.0},
{"between":["jumps","over"], "tags":["linking_CV"], "gap_ms":80.0, ...}
],
"calibrated": true, "classifier": true, "posterior_used": true
}
}
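One possible way to consume boundaries[] is to list the word boundaries whose realized value falls below a cut-off. The 0.5 threshold here is purely an assumption for illustration; the documentation does not define what counts as a realized link.

```python
def weak_boundaries(result: dict, threshold: float = 0.5) -> list:
    """List (word pair, tags, realized) for boundaries below the threshold."""
    weak = []
    for b in result.get("boundaries", []):
        realized = b.get("realized")
        # Entries without a realized value are skipped rather than guessed.
        if realized is not None and realized < threshold:
            weak.append((" + ".join(b["between"]), b.get("tags", []), realized))
    return weak


result = {
    "boundaries": [
        {"between": ["quick", "brown"], "tags": ["elision"], "realized": 0.37},
        {"between": ["jumps", "over"], "tags": ["linking_CV"]},
    ]
}
print(weak_boundaries(result))
```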
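Phoneme-level scoring (GOP)
No separate endpoint is needed: phonemeOutput in the reading-evaluation config (on by default) outputs phoneme-level details. result.words[].phonemes[] gives, for each initial / final, the pronunciation score (0-100), the phoneme symbols (phoneme/phone), and the time span; asrText.alignment[].gop_score gives the per-character acoustic GOP value (CTC posterior with zero pronunciation annotation + discriminative-head scoring; null when there is no acoustic evidence).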
// 真实 words[0].phonemes(《咏鹅》首字"鹅")
"phonemes": [{"phoneme":"E", "phone":"e", "category":0,
"pronunciation":100, "tone_index":"2",
"span":{"start":28,"end":32}}]
// 真实 asrText.alignment[0]
{"char":"鹅","index":0,"read_status":"correct","gop_score":100.0,
"asr_pinyin":"e2","asr_tone":"2","asr_final":"e"}
Realtime WebSocket WS /api/v1/ws/evaluate
Stream while recording; the evaluation is returned when recording ends. Protocol (JSON text frames + binary audio frames):
wss://open.shengzhiai.com/api/v1/ws/evaluate
← {"event":"connected","message":"stream channel ready"}
→ {"cmd":"start","coreType":"sentence","referenceText":"今天天气很好","language":"zh-CN"}
← {"event":"started"}
→ 二进制音频分片 ×N(或文本帧 {"cmd":"audio","data":"<base64>"})
→ {"cmd":"end"}
← {"event":"result","recordId":"eval_…","eof":1,"result":{…},"report":{…},"asrText":{…},"warnings":[]}
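After end, the final result is returned in a single message with the same structure as the HTTP response; for streaming interim captions see the /eval realtime mode.
import asyncio, json, websockets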
async def main():
async with websockets.connect("wss://open.shengzhiai.com/api/v1/ws/evaluate") as ws:
print(await ws.recv()) # {"event":"connected",...}
await ws.send(json.dumps({"cmd": "start", "coreType": "sentence",
"referenceText": "今天天气很好", "language": "zh-CN"}))
print(await ws.recv()) # {"event":"started"}
data = open("reading.wav", "rb").read()
for i in range(0, len(data), 3200): # ~100ms/帧
await ws.send(data[i:i+3200])
await ws.send(json.dumps({"cmd": "end"}))
final = json.loads(await ws.recv()) # {"event":"result",...}
print(final["result"]["overall"])
asyncio.run(main())
// JDK 11+ java.net.http.WebSocket,无第三方依赖
HttpClient client = HttpClient.newHttpClient();
WebSocket ws = client.newWebSocketBuilder()
.buildAsync(URI.create("wss://open.shengzhiai.com/api/v1/ws/evaluate"),
new WebSocket.Listener() {
@Override public CompletionStage<?> onText(WebSocket w, CharSequence data, boolean last) {
System.out.println(data); // connected / started / result
w.request(1);
return null;
}
}).join();
ws.sendText("{\"cmd\":\"start\",\"coreType\":\"sentence\","
+ "\"referenceText\":\"今天天气很好\",\"language\":\"zh-CN\"}", true);
byte[] audio = Files.readAllBytes(Path.of("reading.wav"));
for (int i = 0; i < audio.length; i += 3200)
ws.sendBinary(ByteBuffer.wrap(audio, i, Math.min(3200, audio.length - i)), true).join();
ws.sendText("{\"cmd\":\"end\"}", true);
// go get github.com/gorilla/websocket
c, _, err := websocket.DefaultDialer.Dial(
"wss://open.shengzhiai.com/api/v1/ws/evaluate", nil)
if err != nil { panic(err) }
defer c.Close()
c.ReadMessage() // connected
c.WriteJSON(map[string]string{"cmd": "start", "coreType": "sentence",
"referenceText": "今天天气很好", "language": "zh-CN"})
c.ReadMessage() // started
audio, _ := os.ReadFile("reading.wav")
for i := 0; i < len(audio); i += 3200 {
end := i + 3200
if end > len(audio) { end = len(audio) }
c.WriteMessage(websocket.BinaryMessage, audio[i:end])
}
c.WriteJSON(map[string]string{"cmd": "end"})
_, msg, _ := c.ReadMessage() // result
fmt.Println(string(msg))
// npm i ws
import WebSocket from "ws";
import { readFileSync } from "node:fs";
const ws = new WebSocket("wss://open.shengzhiai.com/api/v1/ws/evaluate");
ws.on("message", raw => {
const d = JSON.parse(raw);
if (d.event === "connected")
ws.send(JSON.stringify({cmd:"start", coreType:"sentence",
referenceText:"今天天气很好", language:"zh-CN"}));
else if (d.event === "started") {
const buf = readFileSync("reading.wav");
for (let i = 0; i < buf.length; i += 3200) ws.send(buf.subarray(i, i + 3200));
ws.send(JSON.stringify({cmd:"end"}));
} else if (d.event === "result") {
console.log(d.result.overall);
ws.close();
}
});
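声通-compatible realtime WebSocket WS wss://host/{coreType}
Streaming evaluation for existing 声通 integrations. The URL path is the 声通 coreType itself (e.g. /sent.eval.cn, /word.eval, /para.eval.cn; the full set is in the appendix). It shares its address with the compatible REST endpoint POST /{coreType} of the same name (upgrade requests go to WS, ordinary POSTs go to REST).
Flow: {"event":"connected","coreType":"…"} → the client first sends a parameter frame {"refText":"…","language":"zh-CN","realtime_feedback":true} (answered with {"event":"started"}) → push audio (binary frames, 640B/20ms@16k recommended; or {"cmd":"audio","data":"<base64>"}) → {"cmd":"end"} triggers the final result {"recordId":…,"eof":1,"result":{…}}. With realtime_feedback=true, intermediate progress frames {"eof":0,"result":{"bytes":n}} are sent while audio is being received.
Handshake credentials go in the query string: ?token={jwt}, or sig ?appKey=&timestamp=&signature=&nonce= (signature rules under Authentication).
// Browser: connect to the 声通-compatible WS for streaming Chinese sentence evaluation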
const ws = new WebSocket("wss://ygyx.dragonai.tech/sent.eval.cn");
ws.binaryType = "arraybuffer";
ws.onmessage = (e) => {
const d = JSON.parse(e.data);
if (d.event === "connected") ws.send(JSON.stringify({refText:"北京你好", language:"zh-CN", realtime_feedback:true}));
else if (d.event === "started") sendAudioFramesThen(() => ws.send(JSON.stringify({cmd:"end"})));
else if (d.eof === 0) console.log("进度", d.result.bytes);
else if (d.eof === 1) { console.log("终评", d.result.overall); ws.close(); }
};
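The recommended frame sizes (640B per 20ms here, 3200B per roughly 100ms in the realtime examples) follow from the PCM byte rate at 16kHz, 16-bit, mono. A small sketch of that arithmetic and of slicing a buffer into frames; the function names are illustrative only.

```python
def chunk_bytes(sample_rate_hz: int, bits_per_sample: int, channels: int, frame_ms: int) -> int:
    """Bytes of PCM audio covering frame_ms milliseconds."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * frame_ms // 1000


def iter_chunks(data: bytes, size: int):
    """Yield consecutive slices of at most `size` bytes."""
    for i in range(0, len(data), size):
        yield data[i:i + size]


print(chunk_bytes(16000, 16, 1, 20))    # 640
print(chunk_bytes(16000, 16, 1, 100))   # 3200
print([len(c) for c in iter_chunks(b"\x00" * 1500, 640)])  # [640, 640, 220]
```

When a wav file is streamed as raw file bytes, the header is part of the first frame, so the time covered by that frame is only approximate.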
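Reference audio TTS POST /api/v1/tts/generate
JSON body: text (≤1000 characters), language (zh-CN / en-US / en-GB), voice (female / male / xiaoyan / xiaofeng), format (wav / mp3 / ogg), speed / pitch / volume (0-100, default 50), sampleRate (sample rate in Hz; 8000 / 16000 / 24000, default 16000), style (a natural-language style instruction of ≤200 characters, such as 「用新闻播报的语气」 "in a news-broadcast tone" or 「温柔地朗读」 "read gently"). Metered by synthesized character count (coreType tts.standard); rate limit 60 requests per minute per key.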
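The limits listed above can be checked on the client before sending the request. The following is a sketch of such a check, not the service's own validation: `build_tts_body` is a hypothetical helper, and how the server treats out-of-range values is not documented here.

```python
import json

ALLOWED = {
    "language": {"zh-CN", "en-US", "en-GB"},
    "voice": {"female", "male", "xiaoyan", "xiaofeng"},
    "format": {"wav", "mp3", "ogg"},
    "sampleRate": {8000, 16000, 24000},
}


def build_tts_body(text: str, language: str, **options) -> dict:
    """Assemble the JSON body and enforce the documented limits client-side."""
    if not text or len(text) > 1000:
        raise ValueError("text must be 1-1000 characters")
    body = {"text": text, "language": language, **options}
    for name, allowed in ALLOWED.items():
        if name in body and body[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed, key=str)}")
    for name in ("speed", "pitch", "volume"):
        if name in body and not 0 <= body[name] <= 100:
            raise ValueError(f"{name} must be within 0-100")
    if len(body.get("style", "")) > 200:
        raise ValueError("style must be at most 200 characters")
    return body


body = build_tts_body("今天天气真好", "zh-CN", voice="female", format="wav", sampleRate=16000)
print(json.dumps(body, ensure_ascii=False))
```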
data.warnings carries any hints (normally an empty array []).
curl -X POST https://open.shengzhiai.com/api/v1/tts/generate \
-H "X-App-Key: test_app_key" -H "Content-Type: application/json" \
-d '{"text":"今天天气真好","language":"zh-CN","voice":"female",
"format":"wav","sampleRate":16000,"style":"用新闻播报的语气"}'
// 真实响应
{"code":0,"message":"success",
"data":{"audioUrl":"https://…/tts/audio/ab3393f2021111ef.wav",
"duration":"4.280","format":"wav","warnings":[]},
"timestamp":1781274776737}
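Speech recognition ASR POST /api/v1/asr/recognize
Standalone speech-to-text (transcription only, no evaluation). multipart/form-data: audio (audio file, wav/mp3, ≥16kHz) + language (zh Chinese via Paraformer / en English via WhisperX, default zh). Returns data.text (transcript), data.words (per-character timestamps start_ms/end_ms for Chinese), data.duration (seconds), and data.confidence (confidence for English). Metered by audio duration in seconds (coreType asr.stream); rate limit 60 requests per minute per key.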
curl -X POST https://open.shengzhiai.com/api/v1/asr/recognize \
-H "X-App-Key: test_app_key" \
-F "audio=@speech.wav" -F "language=zh"
// 真实响应
{"code":0,"message":"success",
"data":{"text":"今天天气很好我们一起去公园",
"language":"zh","duration":4.36,
"words":[{"word":"今","start_ms":290,"end_ms":450}, …],
"confidence":null},
"timestamp":1782632…}
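Report query GET /api/v1/report/{recordId}
Fetches a past evaluation report by recordId. The response uses the unified envelope: data.recordId / data.score (the complete result from evaluation time) / data.report (the AI report). If the recordId does not exist, it returns {"code":40001,"message":"评测记录不存在: …"}.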
curl https://open.shengzhiai.com/api/v1/report/eval_9aa80c616edb \
-H "X-App-Key: test_app_key"
→ {"code":0,"message":"success","data":{"recordId":"eval_9aa80c616edb","score":{…},"report":{…}}}
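coreType reference
Native API coreType (required in config)
| coreType | Description |
|---|---|
| word | Word / single character (pinyin) evaluation |
| sentence | Sentence evaluation |
| passage | Paragraph / passage evaluation (includes sentence-level sentences[] in the hierarchy) |
| alpha | English alphabet questions (referenceText is space-separated letters, e.g. "A B C") |
| connected | English linking evaluation (see English linking) |
| open | Open-ended questions / spontaneous speech (see Open-ended questions; used with taskType) |
language must match the language of the question type (Chinese zh-CN / English en-US, en-GB).
声通 protocol compatibility layer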
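A protocol adapter endpoint for existing 声通 customers: POST /{coreType} (coreType uses the 声通 names: word.eval / sent.eval / para.eval, plus the .cn Chinese and .pro adaptive variants). Multipart form: audio (audio file) + request (a JSON string containing refText and the credential triple appKey / timestamp (seconds) / sig). The response uses 声通 field names (top level contains applicationId / dtLastResponse / refText / result), so existing integrations migrate with zero changes.
New integrations should use the native endpoint /api/v1/evaluate.
Evaluation parameters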
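All of the following parameters go inside the config JSON (native API field naming, semantically aligned with the 声通 parameters of the same name):
| Parameter | Type | Default | Range / values | Description |
|---|---|---|---|---|
| coreType | string | required | word / sentence / passage / alpha / connected / open | Evaluation kernel; see coreType reference |
| referenceText | string | required | ≤1000 characters | Reference text; for open-ended questions (open) it is the prompt / task hint |
| language | string | en-US | zh-CN / en-US / en-GB | Evaluation language; must match the question type |
| slack | float | 0 | [-1, 1] | Scoring strictness: >0 is more lenient, <0 is stricter |
| scale | int | 100 | (0, 100] | Upper bound of the score range; with scale=10, overall ∈ [0,10] |
| precision | float | 1 | (0, 1] | Score step size; 0.1 = one decimal place |
| agegroup | int | 3 | 1 preschool / 2 primary school / 3 over 12 years | Age-group scoring baseline (affects the speech-rate scoring range) |
| toneWeight | float | 0.2 | [0, 1] | Weight of the Chinese tone dimension in overall |
| refPinyin | string | null | Space-separated pinyin string | Pronunciation override for polyphonic characters, e.g. "chong2 qing4"; takes precedence over automatic G2P |
| phonemeOutput | bool | true | true / false | Whether to output the words[].phonemes phoneme-level details |
| includeReport | bool | - | true / false | AI report switch (returned by default in the current version) |
| includeStandardAudio | bool | - | true / false | Reference audio switch (returned by default in the current version) |
| includeAsrText | bool | - | true / false | Recognized text switch (returned by default in the current version) |
| taskType | string | free | picture / situational / free | coreType=open only: picture description / situational Q&A / free expression |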
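The ranges in this table lend themselves to a client-side check before the config part is serialized. Below is a minimal sketch; `build_config` is a hypothetical helper, and it only mirrors the ranges listed above rather than the server's actual validation.

```python
import json

CORE_TYPES = {"word", "sentence", "passage", "alpha", "connected", "open"}
LANGUAGES = {"zh-CN", "en-US", "en-GB"}
RANGE_CHECKS = {
    "slack": lambda v: -1 <= v <= 1,
    "scale": lambda v: 0 < v <= 100,
    "precision": lambda v: 0 < v <= 1,
    "toneWeight": lambda v: 0 <= v <= 1,
    "agegroup": lambda v: v in (1, 2, 3),
    "taskType": lambda v: v in ("picture", "situational", "free"),
}


def build_config(core_type: str, reference_text: str, language: str = "en-US", **options) -> str:
    """Validate against the table above and return the config JSON string."""
    if core_type not in CORE_TYPES:
        raise ValueError(f"unknown coreType: {core_type!r}")
    if not reference_text or len(reference_text) > 1000:
        raise ValueError("referenceText must be 1-1000 characters")
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    for name, value in options.items():
        check = RANGE_CHECKS.get(name)
        if check is not None and not check(value):
            raise ValueError(f"{name} out of range: {value!r}")
    config = {"coreType": core_type, "referenceText": reference_text,
              "language": language, **options}
    # ensure_ascii=False keeps Chinese reference text readable in the payload.
    return json.dumps(config, ensure_ascii=False)


cfg = build_config("sentence", "鹅,鹅,鹅,曲项向天歌。", "zh-CN", scale=10, precision=0.1)
print(cfg)
```

The returned string is what would be sent as the config part with Content-Type application/json.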
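Error code reference
① Platform unified envelope code (evaluation / TTS / report / console endpoints)
Error responses always have the shape {"code", "message", "timestamp"} (success is {"code":0,"message":"success","data":…}; a successful evaluation returns the result body directly, without the envelope).
| code | HTTP | Meaning (observed message examples) |
|---|---|---|
| 0 | 200 | Success |
| 40001 | 400 | Request parameter validation failed, e.g. "referenceText 不能为空", "评测记录不存在: eval_xxx" |
| 40100 | 401 | Not authenticated: "认证失败,请提供有效的认证信息" (missing X-App-Key / JWT) |
| 40300 | 403 | Forbidden, e.g. "该 API Key 未授权调用此 coreType: sentence" (the key is bound to a coreType allowlist) |
| 40400 | 404 | Resource not found |
| 40900 | 409 | Resource conflict (duplicate creation) |
| 42900 | 429 | Request rate limit exceeded (gateway throttling per IP / user / API Key) |
| 42901 | 429 | Concurrent evaluations exceed the plan tier limit (trial 2 / standard 5 / enterprise 10); retry with exponential backoff or upgrade the plan |
| 42902 | 429 | Trial tier reached the daily limit for AI report generation |
| 50000 | 500 | Internal server error: "system busy, please try again later" (also returned when the config part lacks the application/json type) |
| 50010 | 500 | Feature not implemented |
Authentication also has finer-grained business codes (e.g. 2001 token expired, 2002 token invalid); rely on the response message.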
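Since a successful evaluation comes back without the envelope while other endpoints and all errors use it, a client needs one place that normalizes both shapes. A minimal sketch, with `ApiError` and `unwrap` as illustrative names:

```python
class ApiError(Exception):
    """Raised when the envelope carries a non-zero code."""

    def __init__(self, code, message):
        super().__init__(f"[{code}] {message}")
        self.code = code
        self.message = message


def unwrap(body: dict) -> dict:
    """Return the payload of a response, whether or not it is enveloped."""
    code = body.get("code")
    if code is None:
        return body  # bare evaluation result: recordId / result / report ...
    if code != 0:
        raise ApiError(code, body.get("message", ""))
    return body.get("data", {})


print(unwrap({"recordId": "eval_x", "result": {"overall": 67}})["result"]["overall"])
print(unwrap({"code": 0, "message": "success", "data": {"duration": "4.280"}}))
```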
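② 声通 compatibility layer errors (observed detail text)
| HTTP | detail | Meaning |
|---|---|---|
| 401 | [2001] missing appKey/timestamp/sig | The request JSON lacks the credential triple |
| 401 | [2001] sig mismatch | Signature mismatch (check the secretKey and the concatenation order appKey+timestamp+secretKey) |
| 401 | [2002] timestamp out of range ±300s | Timestamp outside the window; note that timestamp is in seconds, so sending milliseconds always triggers this error |
| 401 | [2003] unknown appKey | The appKey does not exist |
| 404 | Unknown coreType: xxx | coreType is not in the supported list (word/sent/para × .eval/.eval.cn/.pro) |
| 422 | [{"type":"missing","loc":["body","audio"],…}] | Form field missing or of the wrong type (missing audio file or request field) |
③ Audio quality warning codes (result.warning[]; scores are still returned)
| code | message | Description |
|---|---|---|
| 1001 | No valid audio detected! | No valid audio detected (nothing recorded, or completely inconsistent with the text); scores are unreliable, prompt the user to re-record |
| 1002 | Audio volume too low! | Volume too low (too far from the microphone) |
| 1003 | Audio volume too high! | Volume too high (clipping) |
| 1004 | Audio noisy! | Noticeable background noise |
| 1005 | Audio not complete! | Audio incomplete (suspected truncation, judged by the omission ratio); scores are for reference only, re-recording is advised |
| 1009 | scorer degraded | Some scoring components are temporarily degraded; scores are for reference only, retrying is advised |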
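The warning codes map fairly directly onto a client-side action. The sketch below groups them into "re-record", "adjust" and "retry", following the guidance in the table and in FAQ 3; the action labels themselves are illustrative, and unknown codes are simply ignored here.

```python
ACTION_BY_CODE = {
    1001: "re-record",  # no valid audio: scores are unreliable
    1005: "re-record",  # suspected truncation
    1002: "adjust",     # volume too low
    1003: "adjust",     # volume too high (clipping)
    1004: "adjust",     # noisy environment
    1009: "retry",      # scorer temporarily degraded
}


def suggested_actions(warnings: list) -> list:
    """Map result.warning[] items to distinct actions, in first-seen order."""
    actions = []
    for w in warnings:
        action = ACTION_BY_CODE.get(w.get("code"))
        if action and action not in actions:
            actions.append(action)
    return actions


warnings = [
    {"code": 1003, "message": "Audio volume too high!"},
    {"code": 1001, "message": "No valid audio detected!"},
    {"code": 1002, "message": "Audio volume too low!"},
]
print(suggested_actions(warnings))
```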
SDK 使用指南
当前形态:REST + WebSocket 直连,无需安装 SDK。所有能力经标准 HTTP multipart 与 WebSocket 暴露,本页已提供可直接复制运行的多语言调用示例:
| 语言 / 端 | 示例位置 |
|---|---|
| curl / Python / Java / JavaScript / Go / C / PHP | 快速开始(HTTP 评测全流程) |
| Python / Java / Go / Node | 朗读评测、开放题、英文连读、实时 WebSocket 各接口 Tab |
| 浏览器(录音→评测) | Demo 示例 /eval 核心代码 |
五端 SDK(已发布,对应 PRD §5.3 SDK-001~005)
各 SDK 均封装:整段评测(REST)、TTS、报告查询、原生实时 WS、声通兼容 REST/WS,内置 HMAC-SHA256 签名器(命中统一测试向量)与录音采集(16k/16bit/单声道),含 Demo/示例与 README 集成文档。点击下载(tar.gz):
| 端 | 形态 / 关键技术 | 下载 |
|---|---|---|
| PC Web(JavaScript) | ES Module + UMD;Web Crypto 签名;AudioWorklet 录音 | yugu-web-sdk.tgz |
| 微信小程序 | 纯 JS(自带 sha256/HMAC,无 subtle 依赖);wx.* API;附示例小程序 | yugu-miniprogram-sdk.tgz |
| Android | Kotlin + OkHttp(REST/WS)+ AudioRecord;Gradle 模块 + Demo Activity | yugu-android-sdk.tgz |
| iOS | Swift Package;URLSession + CryptoKit + AVAudioEngine;SwiftUI Demo | yugu-ios-sdk.tgz |
| 服务端 Java | JDK 17;java.net.http(REST/WS,零三方网络依赖)+ Jackson;含 JUnit 签名测试 | yugu-java-sdk.tgz |
对接契约(字段/签名/协议单一基准):CONTRACT.md · 校验:SHA256SUMS。仍可不装 SDK,直接按本页 REST/WS 示例接入。
Demo 示例
两个在线工具与平台同域部署,打开即用,均为单文件页面,可直接查看源码作为接入参考。以下代码片段中 BASE = "https://open.shengzhiai.com"。
/eval — 在线评测入口(中英完全分流)
中文与英文评测完全分开,先选语言再选能力:中文 /eval/zh/{read,open,realtime}(朗读字/句/篇 · 开放题口语 · 实时),English /eval/en/{read,linking,open,realtime}(reading word/sentence/passage/alphabet · 连读 linking · open · realtime)。每页语言锁定、无语言下拉。支持浏览器录音与本地上传:六维雷达、逐字四色标注、AI 综合报告、标准示范音对比,实时模式边录边出流式字幕。核心调用:
// 浏览器录音 → multipart 评测;能力仅切换 config.coreType(word/sentence/passage/connected/open/alpha),
// config.language 固定 zh-CN 或 en-US(中英分流);实时模式走 WS /api/v1/ws/evaluate。关闭浏览器音频处理三件套,交给引擎链路降噪。
const stream = await navigator.mediaDevices.getUserMedia({audio:
{noiseSuppression:false, autoGainControl:false, echoCancellation:false}});
const mr = new MediaRecorder(stream), chunks = [];
mr.ondataavailable = e => chunks.push(e.data);
mr.onstop = async () => {
const fd = new FormData();
fd.append("audio", new Blob(chunks, {type:"audio/webm"}), "rec.webm");
fd.append("config", new Blob([JSON.stringify({coreType:"sentence",
referenceText:"鹅,鹅,鹅,曲项向天歌。", language:"zh-CN"})],
{type:"application/json"}));
const r = await fetch(BASE + "/api/v1/evaluate",
{method:"POST", headers:{"X-App-Key":"test_app_key"}, body:fd});
render((await r.json()).result); // 六维分 + words[] 逐字四色着色
};
mr.start(); setTimeout(() => mr.stop(), 5000);
/annotate.html — 多评委标注工具(共识金标)
评委加载样本清单(JSON:sample_id / audio / refText),听音后对完整度、准确度、流利度、朗读技巧、情感、声调六维做 0-100 打分,纯客户端导出 CSV(sample_id,rater_id,dimension,score),不调用评测接口、与机器分相互独立。多评委各自导出后用 annotation_aggregate.py 聚合:两两 Pearson、ICC(2,k) 一致性、共识金标 CSV 与低一致样本主动学习标记,用于引擎校准与验收对比。核心流程:
// 纯前端:六维滑杆打分(0-100)→ Blob 导出 CSV,无任何网络请求
let csv = "sample_id,rater_id,dimension,score\n";
for (const sid in store)
for (const dim in store[sid]) // integrity/accuracy/fluency/reading_skill/emotion/tone
csv += `${sid},${rater},${dim},${store[sid][dim]}\n`;
download(new Blob([csv], {type:"text/csv"}), `ratings_${rater}.csv`);
// 多评委线下聚合(脚本随引擎工程交付):
// python annotation_aggregate.py ratings_A.csv ratings_B.csv ratings_C.csv
// → 两两 Pearson / ICC(2,k) / 共识金标 CSV / 低一致样本标记
FAQ 常见问题
1. 支持哪些音频格式与采样率?
支持 wav / mp3 / ogg 上传(≤10MB、≤5 分钟)。推荐 16kHz、16bit、单声道 wav(引擎内部统一以 16kHz 处理);其他采样率会自动重采样,但 8kHz 等过低采样率会损失高频信息、影响声母/音素判定精度。WebSocket 按音频文件原始字节流分片发送(推荐 wav,首帧含文件头)。
2. coreType 怎么选?中英文有什么差异?
原生接口 coreType 为必填枚举:word / sentence / passage / alpha / connected / open,不会按文本长度自动选择;语种由 language(zh-CN / en-US / en-GB)指定,需与题型一致。中文评测多产出声调(tone)维、儿化/平翘舌/前后鼻音诊断;英文多产出重音(stress)维,并可用 connected 模式做连读/失爆/弱读分析。声通命名的 coreType(sent.eval.cn 等)仅用于声通兼容层(见参考)。
3. 警告码 1001-1005、1009 分别是什么意思?要重测吗?
它们是音频质量警告(不是错误,评分仍正常返回):1001 未检测到有效音频、1002 音量过低、1003 音量过高(截幅)、1004 环境噪声明显、1005 音频不完整(按漏读比例判定疑似截断)、1009 部分评分组件临时降级。出现 1001 / 1005 时分数不可信,建议引导用户重录;1002-1004 可提示用户调整距离/环境后重试;1009 建议重试一次。警告位于 result.warning[] 数组,每项含 code 与 message,如 {"code":1001,"message":"No valid audio detected!"};顶层 warnings[] 为冗余汇总且可能含空占位,请两处合并并过滤空值后使用。
4. 鉴权方式怎么选?sig 怎么算?
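① X-App-Key: request header X-App-Key: {appKey} (or query parameter ?api_key=); recommended for new server-side integrations and the simplest option. ② Bearer JWT: only for the console login session (issued by /api/v1/auth/login); putting an appKey in the Bearer header returns code 2002 "token无效". ③ 声通 sig compatibility (compatibility layer + 声通-compatible WS): request headers X-App-Key / X-Timestamp (seconds) / X-Nonce (optional) / X-Signature; signature = business parameters with empty values dropped, sorted by key ascending and joined as k1=v1&k2=v2, then Base64( HMAC-SHA256( payload, secretKey ) ); a timestamp that differs from the server time by more than ±300s is rejected as a replay. For the WS handshake the credentials move to the query string (?appKey=&timestamp=&signature=&nonce=). Customers already integrated with 声通 can migrate with zero changes using method ③.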
5. 计费按什么维度?
评测接口每次成功调用计量三项:调用次数 + 参考文本字符数 + 音频时长(秒),按 appKey 逐日累计,具体计价以所购套餐为准(实时用量在控制台「用量统计」页查看);TTS 按合成字符数计。WebSocket 流式评测与 HTTP 同口径(一次会话计一次调用)。失败调用(参数校验失败、评测异常)不计费。
6. 英文连读评测怎么调?
POST /api/v1/evaluate,multipart 与朗读评测完全一致,config 设 coreType:"connected" + language:"en-US" + referenceText。响应返回 connected_overall(连读总分)、linking(连读)、rhythm(节奏)、elision(失爆)、reduction(弱读),并在 boundaries[] 对每个词边界标注连读类型(linking_CV / elision 等)与实现度 realized。完整真实响应见 英文连读。
7. 开放题(无参考朗读文本)怎么调?
POST /api/v1/evaluate,config 设 coreType:"open" + taskType(picture 看图说话 / situational 情景问答 / free 自由表达),referenceText 填题干(仅作相关性参考,考生自由作答)。返回 content(相关/连贯/任务达成)、languageUse(语法/词汇)、delivery(流利/发音/语速)三组维度 + feedback(优点/不足/建议)+ 客观音质 audioQuality(Distill-MOS)。完整真实响应见 开放题。
8. 评测报告能查询多久?
GET /api/v1/report/{recordId} 返回 data.score(与评测时完全相同的 result)与 data.report。报告落库持久保存,当前版本不主动清理,建议业务侧仍在 90 天内回查并自行归档。recordId 不存在或已清理时返回 {"code":40001,"message":"评测记录不存在: …"}。
9. 标准示范音的 URL 怎么用?
评测响应里 standardAudio.url(及 TTS 接口的 data.audioUrl)是完整 URL,GET 不需要鉴权头,可直接喂给 <audio> 标签播放或下载。其路径形如 /tts/audio/{id}.wav,同路径在平台域同样可达(https://open.shengzhiai.com/tts/audio/{id}.wav),浏览器端建议改写为平台域相对路径以避免跨域。
10. 返回 400(code 40001)/ 500(code 50000)常见原因?
50000 最常见原因是 config 分片没有声明 Content-Type:config 必须作为 multipart 的一个分片传入且该分片 Content-Type 为 application/json(curl 写 -F 'config={…};type=application/json',JS 用 new Blob([json],{type:"application/json"})),否则被当作 octet-stream 解析失败。40001 常见原因:缺 coreType / referenceText(message 直接给出缺哪个);coreType 不在枚举(word/sentence/passage/connected/open/alpha);language 取值非 zh-CN/en-US/en-GB;referenceText 超 1000 字符。另:API Key 绑定了 coreType 白名单时调未授权类型返回 403 code 40300。
11. 并发限制是多少?超限怎么办?
评测并发按客户层级限额:试用(trial)2 路、标准(standard)5 路、企业(enterprise)10 路;超限立即返回 429 {"code":42901},不排队、不计费,客户端应做指数退避重试(如 1s/2s/4s)。试用层 AI 报告生成另有每日上限(超出返回 42902)。网关层对 IP/用户/Key 还有 QPS 限流(42900)。需要更高并发请升级套餐或联系商务。
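The exponential backoff suggested above (1s / 2s / 4s) could look like the sketch below. `call` stands for any function that performs one request and returns `(http_status, json_body)`; that signature is an assumption made for the example, as is giving up after the last delay.

```python
import time


def call_with_backoff(call, delays=(1, 2, 4), sleep=time.sleep):
    """Retry `call` while it reports 429 / code 42901, waiting between attempts."""
    attempts = len(delays) + 1
    for attempt in range(attempts):
        status, body = call()
        over_limit = status == 429 and body.get("code") == 42901
        if not over_limit or attempt == attempts - 1:
            return status, body
        sleep(delays[attempt])


# Demo with a fake request that is over the limit twice, then succeeds.
responses = [(429, {"code": 42901}), (429, {"code": 42901}), (200, {"recordId": "eval_x"})]
waits = []
status, body = call_with_backoff(lambda: responses.pop(0), sleep=waits.append)
print(status, body, waits)
```

Other 429 codes (42900 gateway throttling, 42902 daily report limit) are returned to the caller unchanged, since retrying them on the same schedule may not help.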
12. WebSocket 实时评测的消息怎么发?
连接 wss://open.shengzhiai.com/api/v1/ws/evaluate 后服务端先推 {"event":"connected"};客户端发 {"cmd":"start","coreType":"sentence","referenceText":"…","language":"zh-CN"}(收到 {"event":"started"})→ 持续发二进制音频分片(或 {"cmd":"audio","data":"<base64>"} 文本帧)→ 发 {"cmd":"end"},服务端返回 {"event":"result",…} 终评(与 HTTP 响应同构)。当前开放平台 WS 为"边录边传、结束即评";录音过程中的流式字幕/跟读反馈见 /eval 实时模式。四语言完整示例见 实时 WebSocket。
实时接口清单(OpenAPI 自动同步)
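This list is pulled from /openapi.json (a reverse proxy of the engine's OpenAPI document) and stays in sync with the live engine API version automatically, with no manual maintenance. Note: the list covers all engine endpoints; the entry points actually open on the public platform domain are those described in the sections above.
| Method | Path | Description |
|---|---|---|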