How to create your own document scanner using ML Kit and Jetpack Compose

Jordan Mungujakisa
2 min read · Mar 18, 2024

Google recently released the ML Kit document scanner API, which makes it easy to add a powerful AI-powered document scanner to your app. In this blog post, I will walk through a step-by-step guide to implementing it in your own app. So, let’s dive right in.

Firstly, we need to add the dependency to our app-level build.gradle file.

implementation 'com.google.android.gms:play-services-mlkit-document-scanner:16.0.0-beta1'
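If your project uses the Kotlin DSL (a build.gradle.kts file) instead of Groovy, the same dependency would be declared roughly like this. The version is the one used in this post; a newer release may be available by the time you read this.

```kotlin
// app/build.gradle.kts (assumes the Kotlin DSL; same artifact and version as above)
dependencies {
    implementation("com.google.android.gms:play-services-mlkit-document-scanner:16.0.0-beta1")
}
```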

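Next, we must set the document scanner options by defining the following configurations:

  • Whether to allow imports from the gallery
  • The maximum number of pages that can be scanned
  • The result formats to return (JPEG, PDF, or both)
  • The scanner mode, which controls the feature set available in the workflow

import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions
import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions.RESULT_FORMAT_JPEG
import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions.RESULT_FORMAT_PDF
import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions.SCANNER_MODE_FULL

val options = GmsDocumentScannerOptions.Builder()
    .setGalleryImportAllowed(true)
    .setPageLimit(100)
    .setResultFormats(RESULT_FORMAT_JPEG, RESULT_FORMAT_PDF)
    .setScannerMode(SCANNER_MODE_FULL)
    .build()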

Next, we need to create an instance of our document scanner. We do this by obtaining the client and passing in the scanner options that we previously defined.

val scanner = GmsDocumentScanning.getClient(options)

After creating the document scanner instance, you’ll need to use the Activity Result APIs. These allow you to launch the document scanner activity and receive its result.

First, let’s register for our scanner activity result:

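// Call this from a ComponentActivity (or Fragment), before the activity is started.
val scannerLauncher =
    registerForActivityResult(ActivityResultContracts.StartIntentSenderForResult()) { result ->
        // Only handle the result if the user completed the scan.
        if (result.resultCode == RESULT_OK) {
            handleResult(result)
        }
    }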

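We obtain an IntentSender from the document scanner instance defined earlier by calling getStartScanIntent, which expects an Activity (so the context passed below must be your activity). Then, we use this IntentSender to launch the scanner activity as follows: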

scanner.getStartScanIntent(context)
    .addOnSuccessListener { intentSender ->
        scannerLauncher.launch(
            IntentSenderRequest
                .Builder(intentSender)
                .build()
        )
    }.addOnFailureListener {
        Toast.makeText(context, "${it.message}", Toast.LENGTH_SHORT).show()
    }
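Since this post is about Jetpack Compose, here is a minimal sketch of how the same launch flow might look inside a composable, using rememberLauncherForActivityResult instead of registerForActivityResult. The composable name ScanDocumentButton and its onResult callback are hypothetical, and the sketch assumes Material 3 is on the classpath and that the composable is hosted directly in an Activity, so that LocalContext can be cast to one.

```kotlin
import android.app.Activity
import android.widget.Toast
import androidx.activity.compose.rememberLauncherForActivityResult
import androidx.activity.result.IntentSenderRequest
import androidx.activity.result.contract.ActivityResultContracts
import androidx.compose.material3.Button
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable
import androidx.compose.runtime.remember
import androidx.compose.ui.platform.LocalContext
import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions
import com.google.mlkit.vision.documentscanner.GmsDocumentScanning
import com.google.mlkit.vision.documentscanner.GmsDocumentScanningResult

// Hypothetical composable: a button that starts the scanner and reports the result.
@Composable
fun ScanDocumentButton(onResult: (GmsDocumentScanningResult) -> Unit) {
    // Assumption: the composable is hosted directly in an Activity.
    val activity = LocalContext.current as? Activity

    // Build the scanner once and keep it across recompositions.
    val scanner = remember {
        val options = GmsDocumentScannerOptions.Builder()
            .setGalleryImportAllowed(true)
            .setPageLimit(100)
            .setResultFormats(
                GmsDocumentScannerOptions.RESULT_FORMAT_JPEG,
                GmsDocumentScannerOptions.RESULT_FORMAT_PDF
            )
            .setScannerMode(GmsDocumentScannerOptions.SCANNER_MODE_FULL)
            .build()
        GmsDocumentScanning.getClient(options)
    }

    // Compose counterpart of registerForActivityResult.
    val launcher = rememberLauncherForActivityResult(
        contract = ActivityResultContracts.StartIntentSenderForResult()
    ) { result ->
        if (result.resultCode == Activity.RESULT_OK) {
            GmsDocumentScanningResult.fromActivityResultIntent(result.data)?.let(onResult)
        }
    }

    Button(
        enabled = activity != null,
        onClick = {
            activity?.let { host ->
                scanner.getStartScanIntent(host)
                    .addOnSuccessListener { intentSender ->
                        launcher.launch(IntentSenderRequest.Builder(intentSender).build())
                    }
                    .addOnFailureListener { e ->
                        Toast.makeText(
                            host,
                            e.message ?: "Could not start the scanner",
                            Toast.LENGTH_SHORT
                        ).show()
                    }
            }
        }
    ) {
        Text("Scan document")
    }
}
```

You could then call ScanDocumentButton { scanResult -> /* read scanResult.pdf or scanResult.pages */ } from any screen. If the cast to Activity fails, the button is simply disabled rather than crashing.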

Here’s how I handled the results in my app. I wanted to store the PDF metadata in a Room database for later retrieval.

fun handleResult(result: ActivityResult) {
    val currentTime = System.currentTimeMillis()
    val scanResult = GmsDocumentScanningResult.fromActivityResultIntent(result.data)
    scanResult?.pdf?.let { pdf ->
        val pdfUri = pdf.uri
        val pageCount = pdf.pageCount
        // do something with the result
    }
}
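To give an idea of what "do something with the result" could mean, here is a rough sketch of a metadata record built from the values gathered in handleResult. The ScannedPdf class and the toScannedPdf helper are hypothetical names, not taken from my app's source, and the class is plain Kotlin; to persist it with Room you would still need to add the Room annotations and a DAO.

```kotlin
// Hypothetical metadata record; field names are illustrative only.
data class ScannedPdf(
    val uri: String,          // pdf.uri.toString() from the scan result
    val pageCount: Int,       // pdf.pageCount from the scan result
    val createdAtMillis: Long // time at which the scan was handled
)

// Builds a record from the values available in handleResult.
fun toScannedPdf(
    uri: String,
    pageCount: Int,
    nowMillis: Long = System.currentTimeMillis()
): ScannedPdf {
    require(pageCount >= 0) { "pageCount must not be negative" }
    return ScannedPdf(uri = uri, pageCount = pageCount, createdAtMillis = nowMillis)
}
```

Inside handleResult this would be called as toScannedPdf(pdfUri.toString(), pageCount, currentTime), and the returned object handed to whatever storage layer you use.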

Similarly, we can retrieve the image of the first scanned page from the result’s list of pages.
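A minimal sketch of reading the first page image, assuming RESULT_FORMAT_JPEG was included in the result formats (as it is in the options above). The helper name firstPageImageUri is hypothetical.

```kotlin
import android.net.Uri
import androidx.activity.result.ActivityResult
import com.google.mlkit.vision.documentscanner.GmsDocumentScanningResult

// Hypothetical helper: returns the image URI of the first scanned page,
// or null if the result is missing or contains no pages.
fun firstPageImageUri(result: ActivityResult): Uri? {
    val scanResult = GmsDocumentScanningResult.fromActivityResultIntent(result.data)
    return scanResult?.pages?.firstOrNull()?.imageUri
}
```

The returned Uri can be passed to an image loader or used as a thumbnail for the scanned document.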

Here’s a demonstration of how the PDF Document Scanner I created works.

PDF Document Scanner Demo

You can download the app on the Google Play Store here: https://play.google.com/store/apps/details?id=com.keru.pdfcreator

Source code is here: https://github.com/jordan-jakisa/pdf_creator

Please star the repo if you found this article useful 🙏

Ciao 🤓

Jordan Mungujakisa

Mobile app alchemist who is trying to transmute elegant designs, into elegant code, into beautiful mobile app experiences.