Android

Overview

This document guides developers through using LiveData's real-time voice translation (RTVT) SDK to implement translation services in real-time voice scenarios.

Integration

Service Activation

To integrate the LiveData RTVT SDK, register a personal or enterprise account on the LiveData official website (https://www.ilivedata.com/), create a real-time voice translation service project, and set the corresponding project parameters in the SDK.

Version Support

Android 5.0 (API level 21) and above.

Requirements

  • Audio format: PCM or OPUS
  • Sampling rate: 16 kHz
  • Segment size: 640 bytes per audio segment (default)
  • Bit depth: 16-bit
  • Channels: Mono
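
The figures above are related by simple arithmetic: at 16 kHz, 16-bit, mono, one second of PCM is 32000 bytes, so a 640-byte segment holds 320 samples. The sketch below works this out. The 20 ms duration is derived from the listed values and is not stated by the SDK; FrameMath is a hypothetical helper, not part of the SDK.

```java
/** Hypothetical helper (not part of the RTVT SDK) for PCM size arithmetic. */
final class FrameMath {
    /** Bytes needed for durationMs of PCM audio with the given format. */
    static int frameBytes(int sampleRateHz, int bitsPerSample, int channels, int durationMs) {
        return sampleRateHz * (bitsPerSample / 8) * channels * durationMs / 1000;
    }

    /** Milliseconds of audio contained in a PCM buffer of the given size. */
    static int durationMs(int bytes, int sampleRateHz, int bitsPerSample, int channels) {
        int bytesPerSecond = sampleRateHz * (bitsPerSample / 8) * channels;
        return bytes * 1000 / bytesPerSecond;
    }

    public static void main(String[] args) {
        // 640 bytes at 16 kHz, 16-bit, mono
        System.out.println(durationMs(640, 16000, 16, 1) + " ms");
        // Size of a 20 ms segment in the same format
        System.out.println(frameBytes(16000, 16, 1, 20) + " bytes");
    }
}
```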

Initialization

public static RTVTClient CreateClient(String endpoint, long pid, String uid, RTVTPushProcessor pushProcessor, Context applicationContext)
Parameter Type M/O Description
endpoint String M Service endpoint (see the service configuration in the LiveData Console)
pid long M Project ID
uid String - User ID
pushProcessor RTVTPushProcessor M Callback handler that you override; see the pushProcessor section
applicationContext Context M Application context

Notice: pushProcessor exposes the automatic reconnection callbacks. Reconnection happens automatically by default, so you do not need to handle connection loss yourself.

pushProcessor

RTVTPushProcessor callback class method:

    // RTVT connection closed (automatic reconnection is enabled by default)
    public void rtvtConnectClose(){}

    // RTVT is about to start a reconnection attempt.
    // The return value of reloginWillStart is evaluated before every attempt:
    // if it returns false, the reconnection process is interrupted
    public boolean reloginWillStart(int reloginCount){return true;}

    // RTVT reconnection completed.
    // If `successful` is false, the final reconnection attempt has failed, and `answer` contains the detailed error code and error message;
    // if `successful` is true, the reconnection succeeded
    public void reloginCompleted(boolean successful, RTVTStruct.RTVTAnswer answer, int reloginCount){}


    // Final recognition result
    /*
     * @param streamId: stream ID
     * @param startTs: timestamp at which the speech starts, in ms
     * @param endTs: timestamp at which the speech ends, in ms
     * @param recTs: timestamp of the recognition, in ms
     * @param language: source language
     * @param srcVoiceText: recognized text
     */
    public void recognizedResult(long streamId, long startTs, long endTs, long recTs, String language, String srcVoiceText){}


    // Temporary recognition result
    /*
     * @param streamId: stream ID
     * @param startTs: timestamp at which the speech starts, in ms
     * @param endTs: timestamp at which the speech ends, in ms
     * @param recTs: timestamp of the recognition, in ms
     * @param language: source language
     * @param srcVoiceText: temporary recognized text
     */
    public void recognizedTempResult(long streamId, long startTs, long endTs, long recTs, String language, String srcVoiceText){}


    // Final translation result
    /*
     * @param streamId: stream ID
     * @param startTs: timestamp at which the speech starts, in ms
     * @param endTs: timestamp at which the speech ends, in ms
     * @param recTs: timestamp of the recognition, in ms
     * @param language: target language
     * @param destVoiceText: translated text
     */
    public void translatedResult(long streamId, long startTs, long endTs, long recTs, String language, String destVoiceText){}

    // Temporary translation result
    /*
     * @param streamId: stream ID
     * @param startTs: timestamp at which the speech starts, in ms
     * @param endTs: timestamp at which the speech ends, in ms
     * @param recTs: timestamp of the recognition, in ms
     * @param language: target language
     * @param destVoiceText: temporary translated text
     */
    public void translatedTempResult(long streamId, long startTs, long endTs, long recTs, String language, String destVoiceText){}
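
The reconnection callbacks can be used to cap the number of retries and to remember that a new stream is needed once a reconnection succeeds. The following is a minimal sketch under stated assumptions: BoundedReloginPolicy is a hypothetical stand-in that mirrors the callback names so it runs without the SDK, it omits the `answer` argument, and it assumes reloginCount starts at 1, which the SDK documentation does not state.

```java
/**
 * Hypothetical stand-in for the reconnection hooks of RTVTPushProcessor.
 * It only mirrors the method names; it does not extend the SDK class.
 */
final class BoundedReloginPolicy {
    private final int maxAttempts;
    private boolean needsNewStream = false;

    BoundedReloginPolicy(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    /** Allow an attempt only while the counter is within the limit (assumes counting from 1). */
    boolean reloginWillStart(int reloginCount) {
        return reloginCount <= maxAttempts;
    }

    /** After a successful reconnection, startTranslate must be called again for a new streamId. */
    void reloginCompleted(boolean successful, int reloginCount) {
        if (successful) {
            needsNewStream = true;
        }
    }

    /** True until the application reports that a fresh stream has been started. */
    boolean needsNewStream() {
        return needsNewStream;
    }

    /** Call once startTranslate has delivered a fresh streamId. */
    void streamRestarted() {
        needsNewStream = false;
    }

    public static void main(String[] args) {
        BoundedReloginPolicy policy = new BoundedReloginPolicy(3);
        System.out.println("attempt 1 allowed: " + policy.reloginWillStart(1));
        System.out.println("attempt 4 allowed: " + policy.reloginWillStart(4));
        policy.reloginCompleted(true, 2);
        System.out.println("restart stream: " + policy.needsNewStream());
    }
}
```

In a real integration the same logic would live in the overridden methods of your RTVTPushProcessor subclass.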

Login

public void login(String token, long ts, UserInterface.IRTVTEmptyCallback callback)
Parameter Type M/O Description
token String M Token generated with the key from the LiveData Console service configuration
ts long M Reference timestamp of the token (unit: seconds)
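
Because ts is expressed in seconds while Java clocks usually return milliseconds, a conversion step is easy to forget. The sketch below shows only that conversion; LoginTimestamp is a hypothetical helper, and token generation itself is not covered here because this document does not describe the algorithm.

```java
/** Hypothetical helper (not part of the RTVT SDK) for producing a ts value in seconds. */
final class LoginTimestamp {
    /** Converts an epoch timestamp in milliseconds to whole seconds. */
    static long toSeconds(long epochMillis) {
        return Math.floorDiv(epochMillis, 1000L);
    }

    /** Current wall-clock time in seconds. */
    static long nowSeconds() {
        return toSeconds(System.currentTimeMillis());
    }

    public static void main(String[] args) {
        System.out.println(toSeconds(1700000000123L)); // prints 1700000000
        System.out.println(nowSeconds());
    }
}
```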

Start translate

public void startTranslate(String srcLanguage, String destLanguage, String[] srcAltLanguage, boolean asrResult, boolean tempResult, boolean transResult, String userId, final RTVTUserInterface.IRTVTCallback<VoiceStream> callback)
Parameter Type M/O Description
srcLanguage String M Source language.
destLanguage String M Target language. Pass an empty string if only transcription is needed.
srcAltLanguage List<String> O Alternative source languages; at most three are supported.
asrResult boolean M Whether the final voice recognition result is needed. It is delivered through the recognizedResult callback.
transResult boolean M Whether the translation result is needed. It is delivered through the translatedResult callback.
tempResult boolean M Whether temporary results are needed.
userId String O User ID; pass as needed.
ttsResult boolean O Voice output: true enables it, false disables it.
ttsSpeaker String O Tone (voice) type for voice output.
codec RTVTStruct.Codec M Audio format (PCM or OPUS).
attrs String O User-defined information.
callback - - On success, the RTVT server generates a streamId and returns it to the SDK through this callback.

Start translate (Multiple Language)

public void startTranslate(String srcLanguage, String destLanguage, List<String> srcAltLanguage, boolean asrResult, boolean tempResult, boolean transResult, boolean ttsResult, String ttsSpeaker, String userId, RTVTStruct.Codec codec, String attrs, final RTVTUserInterface.IRTVTCallback<VoiceStream> callback)
Parameter Type M/O Description
srcLanguage String M Source language.
srcAltLanguage List<String> O Alternative source languages; at most three are supported.
asrResult boolean M Whether the final voice recognition result is needed. It is delivered through the recognizedResult callback.
tempResult boolean M Whether temporary results are needed.
userId String O User ID; pass as needed.
ttsResult boolean O Voice output: true enables it.
callback - - On success, the RTVT server generates a streamId and returns it to the SDK through this callback.

Notice:
1. When recognition results are needed, set asrResult to true. srcLanguage is mandatory and srcAltLanguage is optional.
2. When translation results are needed, set transResult to true. destLanguage is mandatory and cannot be an empty string.
3. When temporary recognition and translation results are needed, set tempResult to true.
4. If any language is passed in srcAltLanguage, RTVT runs language identification first. The beginning of the audio (about 3 seconds) is used to identify the language, and subsequent recognition/translation results are returned normally.
5. If a language passed in is outside the range of supported languages, an error message indicating “language not supported” is returned; if the language is not enabled in the project, a message indicating “project does not support” is returned.
6. After an automatic reconnection, this method must be called again to obtain a new streamId.
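
Rules 1, 2 and the three-language limit can be checked on the client before calling startTranslate. The following is a sketch only: TranslateRequestCheck is a hypothetical helper, the language codes in main are placeholders, and the server remains the authority on which languages are supported or enabled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical pre-flight check (not part of the RTVT SDK) for startTranslate arguments. */
final class TranslateRequestCheck {
    static final int MAX_ALT_LANGUAGES = 3;

    /** Returns a list of problems; an empty list means the arguments pass these local checks. */
    static List<String> check(String srcLanguage, String destLanguage,
                              List<String> srcAltLanguage, boolean transResult) {
        List<String> problems = new ArrayList<>();
        if (isBlank(srcLanguage)) {
            problems.add("srcLanguage is mandatory");
        }
        if (transResult && isBlank(destLanguage)) {
            problems.add("destLanguage must not be empty when transResult is true");
        }
        if (srcAltLanguage != null && srcAltLanguage.size() > MAX_ALT_LANGUAGES) {
            problems.add("srcAltLanguage supports at most " + MAX_ALT_LANGUAGES + " languages");
        }
        return problems;
    }

    private static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    public static void main(String[] args) {
        // Placeholder language codes, used for illustration only.
        System.out.println(check("zh", "", null, true));
        System.out.println(check("zh", "en", Arrays.asList("ja", "ko"), true));
    }
}
```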

Send voice clip

public void sendVoice(long streamId, long seq, byte[] voicedata, long voiceDataTs, UserInterface.IRTVTEmptyCallback callback)
Parameter Type M/O Description
streamId long M Stream ID
seq long M Audio segment sequence number (preferably in order)
voicedata byte[] M Audio data, 640 bytes by default
voiceDataTs long M Audio frame reference timestamp

Notice: If no voice data is sent for a certain period of time, RTVT times the stream out. When that happens, call the startTranslate method again to obtain a new streamId.
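
Captured PCM usually arrives in buffers larger than one segment, so it has to be cut into 640-byte pieces, each with its own sequence number and timestamp, before being passed to sendVoice. The sketch below shows one way to do that. VoiceFramer and Frame are hypothetical names; the 20 ms step follows from 16 kHz, 16-bit, mono audio, and treating voiceDataTs as milliseconds is an assumption, since this document does not state its unit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of the RTVT SDK) that cuts PCM into 640-byte segments. */
final class VoiceFramer {
    static final int FRAME_BYTES = 640;
    static final int FRAME_MS = 20; // 640 bytes / (16000 Hz * 2 bytes * 1 channel)

    /** One segment with the values that would be passed as seq, voicedata and voiceDataTs. */
    static final class Frame {
        final long seq;
        final byte[] data;
        final long ts; // assumed to be milliseconds

        Frame(long seq, byte[] data, long ts) {
            this.seq = seq;
            this.data = data;
            this.ts = ts;
        }
    }

    /**
     * Splits pcm into full segments with consecutive sequence numbers.
     * Trailing bytes that do not fill a segment are left out; the caller should
     * keep them and prepend them to the next buffer.
     */
    static List<Frame> split(byte[] pcm, long firstSeq, long startTsMs) {
        List<Frame> frames = new ArrayList<>();
        int fullFrames = pcm.length / FRAME_BYTES;
        for (int i = 0; i < fullFrames; i++) {
            byte[] chunk = Arrays.copyOfRange(pcm, i * FRAME_BYTES, (i + 1) * FRAME_BYTES);
            frames.add(new Frame(firstSeq + i, chunk, startTsMs + (long) i * FRAME_MS));
        }
        return frames;
    }

    public static void main(String[] args) {
        for (Frame f : split(new byte[1500], 1, 1000)) {
            // In a real app: client.sendVoice(streamId, f.seq, f.data, f.ts, callback);
            System.out.println("seq=" + f.seq + " bytes=" + f.data.length + " ts=" + f.ts);
        }
    }
}
```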

Send voice clip (Multiple Language)

public void sendVoice(long streamId, long seq, byte[] voicedata, long voiceDataTs, List<String> dstLanguageList, UserInterface.IRTVTEmptyCallback callback)
Parameter Type M/O Description
streamId long M Stream ID
seq long M Audio segment sequence number (preferably in order)
voicedata byte[] M Audio data, 640 bytes by default
voiceDataTs long M Audio frame reference timestamp
dstLanguageList List<String> M List of target languages to translate into

Notice: If no voice data is sent for a certain period of time, RTVT times the stream out. When that happens, call the startTranslate method again to obtain a new streamId.

Stop translate

public void stopTranslate(long streamId)
Parameter Type M/O Description
streamId long M ID of the translation stream to stop

Close RTVT

public void closeRTVT()

Notice: The network broadcast listener holds a reference to the RTVTClient object. If closeRTVT is not called, the object stays referenced and is never released. Once it has been released, call RTVTCenter.CreateClient again before using RTVT again.

Error code

Error code Description
800000 Unknown error
800002 Unverified link
800003 Invalid parameter
800101 Invalid system time
800102 Illegal token, invalid encoding
800103 Invalid pid
800105 Unsupported language
800106 Too many alternative languages
800107 Translation stream reaches upper limit
800200 Stream id does not exist
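
For logging or user-facing messages, the table above can be mirrored in a small lookup. This is a sketch only: RtvtErrors is a hypothetical helper, the descriptions are copied from the table, and codes not listed there fall through to a generic message.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical lookup (not part of the RTVT SDK) for the error codes listed above. */
final class RtvtErrors {
    private static final Map<Integer, String> DESCRIPTIONS = new LinkedHashMap<>();

    static {
        DESCRIPTIONS.put(800000, "Unknown error");
        DESCRIPTIONS.put(800002, "Unverified link");
        DESCRIPTIONS.put(800003, "Invalid parameter");
        DESCRIPTIONS.put(800101, "Invalid system time");
        DESCRIPTIONS.put(800102, "Illegal token, invalid encoding");
        DESCRIPTIONS.put(800103, "Invalid pid");
        DESCRIPTIONS.put(800105, "Unsupported language");
        DESCRIPTIONS.put(800106, "Too many alternative languages");
        DESCRIPTIONS.put(800107, "Translation stream reaches upper limit");
        DESCRIPTIONS.put(800200, "Stream id does not exist");
    }

    /** Returns the documented description, or a generic message for codes not in the table. */
    static String describe(int code) {
        return DESCRIPTIONS.getOrDefault(code, "Unrecognized error code " + code);
    }

    public static void main(String[] args) {
        System.out.println(describe(800106));
        System.out.println(describe(123));
    }
}
```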

More information

To download the SDK and find more information, please visit GitHub.