Transcribe Audio using AI in Compose Multiplatform
Use Groq AI to transcribe audio in Jetpack Compose
We will be using Groq AI to transcribe audio to text in Compose Multiplatform.
Getting the API Key
We need to get the API key from Groq. Create a free account from this link.
Then, go to the API Keys
section and create a new API key.
One thing to note here is that Groq is not related to Twitter/X Grok.
Dependency setup
Groq provides REST APIs for audio transcription, among other things. Instead of calling them directly, we will use groq-kt, a Kotlin library that wraps the Groq API. It describes itself as an idiomatic Kotlin Multiplatform library for the Groq API.
You can check out the repo for all the benefits it provides.
Another thing to note here is that the groq-kt repository is not an official Groq wrapper.
Let us now add the required dependencies. Add the groq and ktor dependencies to the libs.versions.toml
file.
[versions]
groq = "0.1.1"
ktor = "3.0.2"
[libraries]
ktor-client-core = { group = "io.ktor", name = "ktor-client-core", version.ref = "ktor" }
ktor-client-android = { group = "io.ktor", name = "ktor-client-android", version.ref = "ktor" }
ktor-client-darwin = { group = "io.ktor", name = "ktor-client-darwin", version.ref = "ktor" }
groq = { module = "io.github.vyfor:groq-kt", version.ref = "groq"}
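Add these dependencies to the build.gradle.kts file. The libs.kotlinx.serialization entry is not part of the catalog snippet above, so it is assumed to already be defined in your version catalog.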
kotlin {
    sourceSets {
        androidMain.dependencies {
            implementation(libs.ktor.client.android)
        }
        commonMain.dependencies {
            implementation(libs.kotlinx.serialization)
            implementation(libs.ktor.client.core)
            implementation(libs.groq)
        }
        iosMain.dependencies {
            implementation(libs.ktor.client.darwin)
        }
    }
}
Storing API Key
There are various ways of storing and fetching the API keys securely. We will use the expect/actual pattern and fetch it from the local.properties
file in Android and from Config.plist
in iOS. You can use other more secure ways as well.
We will create a file called Environment containing the expect function, and then implement it for Android and iOS.
expect fun getApiKey(): String
Android
We will get the API key from the local.properties
file in Android. Here is the setup for it.
Update the Environment.android.kt
file.
actual fun getApiKey(): String = BuildConfig.GROQ
Update the build.gradle.kts
file.
// At the top of the file
import java.util.Properties

android {
    buildFeatures {
        buildConfig = true
    }
    defaultConfig {
        buildConfigField("String", "GROQ", "\"${getLocalProperty("groq")}\"")
    }
}

// Outside all the blocks
fun getLocalProperty(key: String): String {
    val localProperties = Properties().apply {
        val localPropertiesFile = rootProject.file("local.properties")
        if (localPropertiesFile.exists()) {
            // use {} closes the stream once the properties are loaded
            localPropertiesFile.inputStream().use { load(it) }
        }
    }
    // Fall back to an empty string when the key is missing
    return localProperties.getProperty(key) ?: ""
}
Add the API key to the local.properties file under the groq key, which is the name passed to getLocalProperty in the build script. Rebuild the project, and BuildConfig.GROQ should be generated.
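The lookup-with-fallback logic of getLocalProperty can be tried outside Gradle. The following is a minimal sketch, assuming plain Kotlin on the JVM. The names readProperty and asBuildConfigString are hypothetical helpers introduced only for illustration, and the properties are parsed from a string instead of the real local.properties file.

```kotlin
import java.io.StringReader
import java.util.Properties

// Hypothetical helper: parses properties text and returns the value for `key`,
// or an empty string when the key is missing (the same fallback as getLocalProperty).
fun readProperty(text: String, key: String): String {
    val properties = Properties().apply { load(StringReader(text)) }
    return properties.getProperty(key) ?: ""
}

// Hypothetical helper: wraps a value in escaped quotes, which is the form
// buildConfigField expects for a String constant.
fun asBuildConfigString(value: String): String = "\"$value\""

fun main() {
    val text = "groq=your_api_key_here\n"
    println(readProperty(text, "groq"))
    println(readProperty(text, "missing").isEmpty())
    println(asBuildConfigString(readProperty(text, "groq")))
}
```

If the key is missing, the fallback produces an empty BuildConfig.GROQ, so local.properties is worth checking first when the key appears to be blank at runtime.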
iOS
There may be better ways than what I am about to do here, as I am less experienced with iOS than with Android. If you know a more secure way to fetch the key on iOS, please use that instead of the following.
We will create a Config.plist file to hold the API key and exclude this plist file from version control.
To create the Config.plist
file, in Xcode,
- Right-click on your project navigator (left sidebar)
- Select “New File…”
- Choose “Property List” under Resource types
- Name it “Config.plist”
- Make sure it is added to your app target
Add the API key to the Config.plist
file.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>GROQ</key>
<string>your_api_key_here</string>
</dict>
</plist>
Update the Environment.ios.kt
file.
import platform.Foundation.NSBundle
import platform.Foundation.NSDictionary
import platform.Foundation.dictionaryWithContentsOfFile

actual fun getApiKey(): String {
    val bundle = NSBundle.mainBundle
    val configPath = bundle.pathForResource("Config", "plist") ?: error("Config.plist not found")
    val config = NSDictionary.dictionaryWithContentsOfFile(configPath) ?: error("Could not read Config.plist")
    return config["GROQ"] as? String ?: error("GROQ not found in Config.plist")
}
Audio Files
We can either store the audio file locally or provide a URL. We will work with a remote audio file for simplicity.
If you are using a local file, then fetch it from the internal or external folder of the app and provide the full path.
Groq Client Setup
Let us create a TranscribeAudioViewModel
class where we will initialise the Groq client.
We can initialise the client as follows.
private val client: GroqClient by lazy {
    GroqClient(apiKey = getApiKey())
}
To transcribe audio through the Groq client, we need the audio file, a filename and the model that will actually transcribe the audio.
val result: Result<GroqResponse<AudioTranscription>> = client.transcribeAudio {
    model = GroqModel.DISTIL_WHISPER_LARGE_V3_EN
    filename = "Audio"
    // Replace this with your audio of choice.
    url = "https://cdn.pixabay.com/download/audio/2024/08/04/audio_be9247e137.mp3?filename=girl-ix27ve-never-been-out-of-the-village-before-229855.mp3"
    // file("path/to/audio.mp3") // For local files
}
We will also create a TranscribeAudioState
class.
data class TranscribeAudioState(
    val isLoading: Boolean = false,
    val transcribeAudio: String? = null,
    val error: String? = null
)
Putting it all together, we have the following code.
data class TranscribeAudioState(
    val isLoading: Boolean = false,
    val transcribeAudio: String? = null,
    val error: String? = null
)

class TranscribeAudioViewModel : ViewModel() {
    private val _state = MutableStateFlow(TranscribeAudioState())
    val state = _state.asStateFlow()

    private val client: GroqClient by lazy {
        GroqClient(apiKey = getApiKey())
    }

    fun transcribe() {
        _state.update { it.copy(isLoading = true) }
        viewModelScope.launch {
            val result: Result<GroqResponse<AudioTranscription>> = client.transcribeAudio {
                model = GroqModel.DISTIL_WHISPER_LARGE_V3_EN
                filename = "Audio"
                url =
                    "https://cdn.pixabay.com/download/audio/2024/08/04/audio_be9247e137.mp3?filename=girl-ix27ve-never-been-out-of-the-village-before-229855.mp3"
            }
            if (result.isFailure) {
                _state.update {
                    it.copy(
                        isLoading = false,
                        error = result.exceptionOrNull()?.message
                    )
                }
                return@launch
            }
            if (result.isSuccess) {
                _state.update {
                    it.copy(
                        isLoading = false,
                        transcribeAudio = result.getOrNull()?.data?.text
                    )
                }
            }
        }
    }
}
I’ve skipped the internet connectivity check here for simplicity. Also, add the android.permission.INTERNET permission to the Android manifest file.
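The two if branches in transcribe() can also be expressed as a single fold over the Result. The following is a minimal sketch using only the Kotlin standard library. The name reduce is a hypothetical helper, a plain Result<String> stands in for the Result<GroqResponse<AudioTranscription>> returned by the library, and clearing the previous error on success is an assumption that goes slightly beyond the ViewModel code.

```kotlin
data class TranscribeAudioState(
    val isLoading: Boolean = false,
    val transcribeAudio: String? = null,
    val error: String? = null
)

// Hypothetical helper: maps the outcome of a transcription call to the next UI state.
fun reduce(current: TranscribeAudioState, result: Result<String>): TranscribeAudioState =
    result.fold(
        // Assumption: a successful call also clears any earlier error message.
        onSuccess = { text -> current.copy(isLoading = false, transcribeAudio = text, error = null) },
        onFailure = { e -> current.copy(isLoading = false, error = e.message ?: "Unknown error") }
    )

fun main() {
    val loading = TranscribeAudioState(isLoading = true)
    println(reduce(loading, Result.success("Hello")))
    println(reduce(loading, Result.failure(IllegalStateException("Network error"))))
}
```

Keeping the mapping in a pure function like this makes it possible to test the state transitions without a ViewModel or a network call.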
UI Setup
Let’s create a basic UI to display the transcribed audio.
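@Composable
fun TranscribeAudio(
    modifier: Modifier = Modifier,
    // viewModel { } (import androidx.lifecycle.viewmodel.compose.viewModel) keeps one
    // instance across recompositions instead of constructing a new ViewModel each time.
    viewModel: TranscribeAudioViewModel = viewModel { TranscribeAudioViewModel() }
) {
    val state = viewModel.state.collectAsStateWithLifecycle()
    TranscribeAudioContent(
        modifier = modifier,
        state = state,
        onTranscribeClick = viewModel::transcribe
    )
}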
@Composable
private fun TranscribeAudioContent(
    modifier: Modifier = Modifier,
    state: State<TranscribeAudioState>,
    onTranscribeClick: () -> Unit,
) {
    Box(
        modifier = modifier.fillMaxSize()
    ) {
        Column(
            modifier = Modifier.fillMaxSize()
        ) {
            Button(onClick = onTranscribeClick) {
                Text(text = "Transcribe Audio")
            }
            if (state.value.transcribeAudio != null) {
                Text(
                    text = state.value.transcribeAudio ?: "",
                    color = Color.Blue,
                    fontSize = 20.sp,
                )
            }
            if (state.value.error != null) {
                Text(
                    text = state.value.error ?: "",
                    color = Color.Red,
                    fontSize = 20.sp,
                )
            }
        }
        if (state.value.isLoading) {
            Box(modifier = Modifier.fillMaxSize()) {
                CircularProgressIndicator(
                    modifier = Modifier.width(64.dp).align(Alignment.Center),
                    color = Color.Red,
                )
            }
        }
    }
}
There you have it: audio transcription using AI in Compose Multiplatform. The groq-kt
library also supports chat completion and streaming.
You can find the complete code in this repository. The repository contains other code as well, which you can check out.
Reference
Sound Effect by Lucy_voice_character from Pixabay