iOS
Overview
This document guides developers through using LiveData's real-time voice translation SDK (RTVT) to implement translation services in real-time voice scenarios.
Integration
Service Activation
To integrate the LiveData RTVT SDK, register a personal or enterprise account on the LiveData official website (https://www.ilivedata.com/), create a real-time voice translation service project, and update the corresponding project parameters in the SDK.
Version Support
iOS 12.0 and above.
Requirements
- Audio format: only PCM is supported
- Sampling rate: 16 kHz
- Encoding: 16-bit depth
- Channels: mono
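Given these constraints (16 kHz, 16-bit, mono PCM), the raw audio rate is 16,000 samples × 2 bytes = 32,000 bytes per second, so the 640-byte default voice clip described under "Send voice clip" carries 20 ms of audio. A quick sanity check of that arithmetic (illustrative Python only; the SDK itself is Objective-C):

```python
# Audio parameters required by RTVT (from the Requirements section).
SAMPLE_RATE_HZ = 16_000   # 16 kHz sampling rate
BYTES_PER_SAMPLE = 2      # 16-bit depth
CHANNELS = 1              # mono

BYTES_PER_SECOND = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS  # 32,000

def frame_duration_ms(frame_bytes: int) -> float:
    """Duration in milliseconds of a PCM frame of the given size."""
    return frame_bytes * 1000 / BYTES_PER_SECOND

# The default 640-byte voice clip corresponds to 20 ms of audio.
print(frame_duration_ms(640))  # -> 20.0
```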
Configuring build settings
- Setting Linker Flags
- Go to the “Build Settings” tab under your project’s “TARGETS”.
- Add the “-ObjC” flag to the “Other Linker Flags” section.
- Make sure to do this in the “ALL” configurations view.
- Ensure that the “O” and “C” in “-ObjC” are capitalized and the preceding hyphen “-” is included.
- Ensuring Support for Objective-C++
- Your project needs to have at least one source file with a “.mm” extension to support Objective-C++.
- If not present, you can rename an existing “.m” file to “.mm”.
- Adding Libraries
- Add the “libresolv.9.tbd” library to your project.
- This is typically done in the “Link Binary With Libraries” section.
Initialization
+ (nullable instancetype)clientWithEndpoint:(nonnull NSString * )endpoint
projectId:(int64_t)projectId
delegate:(id <RTVTProtocol>)delegate;
Parameter | Type | M/O | Description |
---|---|---|---|
endpoint | string | M | endpoint (check the LiveData Console service configuration) |
projectId | int64 | M | project ID |
delegate | - | - | delegate object; refer to the following RTVTProtocol content |
RTVTProtocol delegate
/// translatedResult
/// - Parameters:
/// - startTs: ms timestamp of voice starts
/// - endTs: ms timestamp of voice ends
/// - result: translated text
/// - language: target language
/// - recTs: ms timestamp of voice recognition
-(void)translatedResultWithStreamId:(int64_t)streamId
startTs:(int64_t)startTs
endTs:(int64_t)endTs
result:(NSString * _Nullable)result
language:(NSString * _Nullable)language
recTs:(int64_t)recTs;
/// recognizedResult
/// - Parameters:
/// - startTs: ms timestamp of voice starts
/// - endTs: ms timestamp of voice ends
/// - result: recognized text
/// - language: source language
/// - recTs: ms timestamp of voice recognition
-(void)recognizedResultWithStreamId:(int64_t)streamId
startTs:(int64_t)startTs
endTs:(int64_t)endTs
result:(NSString * _Nullable)result
language:(NSString * _Nullable)language
recTs:(int64_t)recTs;
/// translatedTempResult
/// - Parameters:
/// - startTs: ms timestamp of voice starts
/// - endTs: ms timestamp of voice ends
/// - result: translated temporary text
/// - language: target language
/// - recTs: ms timestamp of voice recognition
-(void)translatedTmpResultWithStreamId:(int64_t)streamId
startTs:(int64_t)startTs
endTs:(int64_t)endTs
result:(NSString * _Nullable)result
language:(NSString * _Nullable)language
recTs:(int64_t)recTs;
/// recognizedTempResult
/// - Parameters:
/// - startTs: ms timestamp of voice starts
/// - endTs: ms timestamp of voice ends
/// - result: recognized temporary text
/// - language: source language
/// - recTs: ms timestamp of voice recognition
-(void)recognizedTmpResultWithStreamId:(int64_t)streamId
startTs:(int64_t)startTs
endTs:(int64_t)endTs
result:(NSString * _Nullable)result
language:(NSString * _Nullable)language
recTs:(int64_t)recTs;
Notice: RTVTProtocol provides an automatic reconnection method, so there is no need to worry about connection loss.
Login
- (void)loginWithToken:(nonnull NSString *)token
ts:(int64_t)ts
success:(RTVTLoginSuccessCallBack)loginSuccess
connectFail:(RTVTLoginFailCallBack)loginFail;
Parameter | Type | M/O | Description |
---|---|---|---|
token | string | M | token generated using the key from the LiveData Console service configuration |
ts | int64 | M | token reference timestamp |
Start translate
-(void)starStreamTranslateWithAsrResult:(BOOL)asrResult
transResult:(BOOL)transResult
tempResult:(BOOL)tempResult
userId:(NSString * _Nullable)userId
srcLanguage:(nonnull NSString *)srcLanguage
destLanguage:(nonnull NSString *)destLanguage
srcAltLanguage:(NSArray <NSString*> * _Nullable) srcAltLanguage
success:(void(^)(int64_t streamId))successCallback
fail:(RTVTAnswerFailCallBack)failCallback;
Parameter | Type | M/O | Description |
---|---|---|---|
srcLanguage | string | M | source language |
destLanguage | string | M | target language; may be an empty string if only transcription is needed |
srcAltLanguage | array | O | alternative source languages; a maximum of three languages is supported |
asrResult | bool | M | whether the final voice recognition result is needed |
transResult | bool | M | whether the translation result is needed |
tempResult | bool | M | whether temporary results are needed |
userId | string | O | user ID, supplied as needed |
ttsResult | bool | O | voice output; true means enabled |
callback | - | - | on success, the RTVT server generates a streamId and returns it to the SDK via the success callback |
Start translate (Multiple Language)
-(void)multi_starTranslateWithAsrResult:(BOOL)asrResult
tempResult:(BOOL)tempResult
userId:(NSString * _Nullable)userId
srcLanguage:(nonnull NSString *)srcLanguage
srcAltLanguage:(NSArray <NSString*> * _Nullable) srcAltLanguage
success:(void(^)(int64_t streamId))successCallback
fail:(RTVTAnswerFailCallBack)failCallback;
Parameter | Type | M/O | Description |
---|---|---|---|
asrResult | bool | M | whether the final voice recognition result is needed |
tempResult | bool | M | whether temporary results are needed |
userId | string | O | user ID, supplied as needed |
srcLanguage | string | M | source language |
srcAltLanguage | array | O | alternative source languages; a maximum of three languages is supported |
callback | - | - | on success, the RTVT server generates a streamId and returns it to the SDK via the success callback |
Notice:
1. If callbacks with recognition results are needed, asrResult should be set to true; srcLanguage is mandatory and srcAltLanguage is optional.
2. If callbacks with translation results are needed, transResult should be set to true; destLanguage is mandatory and cannot be an empty string.
3. If temporary recognition results and temporary translation results are needed, tempResult should be set to true.
4. If any language is passed in srcAltLanguage, RTVT first runs a language recognition step by default: the beginning of the voice (about 3 seconds) is used to detect the language, and subsequent recognition/translation results are returned normally.
5. If the language passed is outside the range of supported languages, an error message indicating "language not supported" is returned; if the language passed is not enabled in the project, a message indicating "project does not support" is returned.
6. After an automatic reconnection, this method must be called again to obtain a new streamId.
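The parameter rules above can be expressed as a small pre-flight check before calling the start method. The sketch below is a hypothetical client-side helper, not part of the SDK; it simply encodes notices 1 and 2 plus the three-language limit on srcAltLanguage:

```python
def validate_start_params(asr_result, trans_result,
                          src_language, dest_language,
                          src_alt_language=None):
    """Return a list of problems with the start-translate parameters,
    following the parameter rules in the Notice section."""
    problems = []
    # Notice 1: recognition callbacks require a source language.
    if asr_result and not src_language:
        problems.append("srcLanguage is mandatory when asrResult is true")
    # Notice 2: translation callbacks require a non-empty target language.
    if trans_result and not dest_language:
        problems.append("destLanguage cannot be empty when transResult is true")
    # srcAltLanguage supports a maximum of three languages.
    if src_alt_language and len(src_alt_language) > 3:
        problems.append("srcAltLanguage supports at most three languages")
    return problems
```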
Send voice clip
-(void)sendVoiceWithStreamId:(int64_t)streamId
voiceData:(nonnull NSData*)voiceData
seq:(int64_t)seq
ts:(int64_t)ts
success:(RTVTAnswerSuccessCallBack)successCallback
fail:(RTVTAnswerFailCallBack)failCallback;
Parameter | Type | M/O | Description |
---|---|---|---|
streamId | int64_t | M | stream ID |
seq | int64_t | M | audio segment sequence number (preferably in order) |
voiceData | byte | M | audio data, 640 bytes by default |
ts | int64_t | M | audio frame reference timestamp |
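A typical send loop splits captured PCM audio into the default 640-byte frames (20 ms at 16 kHz, 16-bit, mono), incrementing seq for each frame and advancing ts by the frame duration. The sketch below models only that framing logic in Python for illustration; the actual send is the Objective-C sendVoiceWithStreamId call above:

```python
FRAME_BYTES = 640  # default voice clip size from the table above
FRAME_MS = 20      # 640 bytes / 32,000 bytes-per-second = 20 ms

def frames(pcm, start_ts):
    """Yield (seq, ts, chunk) triples for each 640-byte frame of PCM audio.
    A final partial frame, if any, is yielded as-is."""
    for seq, offset in enumerate(range(0, len(pcm), FRAME_BYTES)):
        yield seq, start_ts + seq * FRAME_MS, pcm[offset:offset + FRAME_BYTES]

# One second of 16 kHz / 16-bit / mono silence (32,000 bytes) -> 50 frames.
chunks = list(frames(b"\x00" * 32_000, start_ts=0))
```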
Send voice clip (Multiple Language)
-(void)multi_sendVoiceWithStreamId:(int64_t)streamId
voiceData:(nonnull NSData*)voiceData
destLanguages:(NSArray<NSString*>*)destLanguages
seq:(int64_t)seq
ts:(int64_t)ts
success:(RTVTAnswerSuccessCallBack)successCallback
fail:(RTVTAnswerFailCallBack)failCallback;
Parameter | Type | M/O | Description |
---|---|---|---|
streamId | int64_t | M | stream ID |
seq | int64_t | M | audio segment sequence number (preferably in order) |
destLanguages | array | M | target languages |
voiceData | byte | M | audio data, 640 bytes by default |
ts | int64_t | M | audio frame reference timestamp |
Notice: If no voice data is sent for a certain period of time, RTVT times the stream out. After that, the starStreamTranslateWithAsrResult method or multi_starTranslateWithAsrResult method must be called again to obtain a new streamId.
Stop translate
-(void)endTranslateWithStreamId:(int)streamId
lastSeq:(int)lastSeq
success:(RTVTAnswerSuccessCallBack)successCallback
fail:(RTVTAnswerFailCallBack)failCallback;
Parameter | Type | M/O | Description |
---|---|---|---|
streamId | int | M | ID of the stream to stop |
lastSeq | int | M | sequence number of the final audio frame |
Close RTVT
- (BOOL)closeConnect;
Error code
Error code | Description |
---|---|
800000 | unknown error |
800001 | unverified link |
800002 | login failed |
800003 | expired token |
800004 | invalid verification time |
800005 | invalid token |
800006 | streamId does not exist |
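For logging and debugging, client code often maps these numeric codes back to readable descriptions. A minimal lookup helper (illustrative only; the codes and descriptions are taken verbatim from the table above):

```python
# Error codes from the RTVT "Error code" table.
RTVT_ERRORS = {
    800000: "unknown error",
    800001: "unverified link",
    800002: "login failed",
    800003: "expired token",
    800004: "invalid verification time",
    800005: "invalid token",
    800006: "streamId does not exist",
}

def describe_error(code):
    """Human-readable description for an RTVT error code."""
    return RTVT_ERRORS.get(code, f"unrecognized error code {code}")
```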
More information
For the SDK download and more information, please go to GitHub.