We are excited to announce the release of version 1.0 of the Deepgram Go SDK, which provides official support for the Go community. This release marks a significant step forward, making it easier for developers to use Deepgram's suite of APIs and enabling new use cases within the developer ecosystem.


Go SDK v1.0.0 Release

This blog post will guide you through the features you can expect to find, walk through pre-recorded and live transcription coding examples, and show a demo application that leverages live transcription. Then, we will discuss what is next for the project and what the roadmap holds in the weeks and months ahead.

Why the Go SDK?

Deepgram initially supported the Go community through a community-developed Go SDK. While it attracted many users, the unique use cases and value proposition that Go provides called for official support of the language.

Those use cases are:

  • IoT and Edge Devices: Go is ideal here because you aren't lugging around an interpreter; you can compile natively to ARM.

  • Resource-Constrained Environments: You save significant disk and memory by not running a JVM or its language equivalent.

  • Containerized Workloads: Smaller binaries mean smaller images and faster restart times in environments like Kubernetes.

  • Enterprise-Scale Applications: Go is a modern, scalable language that shines for backend applications (i.e., microservices).

While the other official SDKs (JavaScript, Python, and .NET) each bring their own unique value to the table, the Go SDK enables building applications that address the aforementioned use cases.

Quick Tour of the Go SDK

This new version of the Go SDK continues to provide capabilities everyone in the ecosystem has come to love: transcribing pre-recorded and live-streaming audio. Additionally, you will find all of the Management APIs to oversee billing, usage, member access, and more.


Go Community

One new approach we have taken in the project to help increase community adoption and onboard developers quickly is to "lead by example." That is to say, we have provided examples for every single API contained within the SDK. Each example serves as a quick reference for how a particular API is used and how it functions end-to-end, via a short and simple "hello world"-style application.

Without further ado, let's look at our first use case. If reading isn't your thing, check out the video demo at the end of this section.

Pre-recorded Transcription

Transcribing pre-recorded audio has been a staple in the Deepgram offering. It's perfect for reading the meeting minutes without watching hours of recorded video, obtaining closed captioning in VTT and SRT formats, and much more.

There are several examples in the repo, including getting a transcript from a local file, a stream, and the example we will look at from a URL.

// Go context
ctx := context.Background()

// set the Transcription options
options := interfaces.PreRecordedTranscriptionOptions{
  Punctuate:  true,
  Diarize:    true,
  Language:   "en-US",
  Utterances: true,
}

// create a Deepgram client
c := client.NewWithDefaults()
dg := prerecorded.New(c)

// submit the URL for transcription
res, err := dg.FromURL(ctx, URL, options)
if err != nil {
  log.Fatalf("FromURL failed. Err: %v\n", err)
}
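Once `FromURL` returns, the transcript lives inside a nested response structure. As a rough sketch of how you might pull out the first channel's top alternative, the snippet below uses simplified stand-in types (the SDK's actual response structs carry many more fields, so treat the shapes here as illustrative):

```go
package main

import "fmt"

// Simplified stand-ins for the SDK's response types (hypothetical;
// the real types in the Go SDK are richer than this).
type Alternative struct {
	Transcript string
	Confidence float64
}

type Channel struct {
	Alternatives []Alternative
}

type Results struct {
	Channels []Channel
}

type PreRecordedResponse struct {
	Results Results
}

// bestTranscript returns the top alternative of the first channel,
// or an empty string if the response has no usable content.
func bestTranscript(res PreRecordedResponse) string {
	if len(res.Results.Channels) == 0 {
		return ""
	}
	alts := res.Results.Channels[0].Alternatives
	if len(alts) == 0 {
		return ""
	}
	return alts[0].Transcript
}

func main() {
	res := PreRecordedResponse{Results: Results{Channels: []Channel{
		{Alternatives: []Alternative{{Transcript: "hello world", Confidence: 0.98}}},
	}}}
	fmt.Println(bestTranscript(res))
}
```

The guard clauses matter in practice: silent audio or an error upstream can leave you with empty channels, and indexing blindly would panic.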

Live Streaming Transcription

The most significant improvement from the previous release is the updated interface of the Live Streaming Client. The Live Client implements the standard Go io.ReadCloser and io.WriteCloser interfaces, embracing a core Go philosophy. Files, HTTP response bodies, streams, and so on all implement io.ReadCloser or io.WriteCloser (depending on whether they are a source or destination of data). That's why the Live Client can accept any object without knowing whether it's a local file, an HTTP stream, a microphone stream, or even your own custom object.

This helps the user by abstracting away all implementation details of the underlying websocket interface. In the previous release of the SDK, the user had to partly manage the websocket themselves, which was highly problematic for beginners, pushed maintenance and feature development onto the user, and offered no blueprint for implementing their applications.

The best way to illustrate this change is by looking at some code. The example from the repo we are going to take a look at uses your local microphone and streams it to the Deepgram platform.

// Go context
ctx := context.Background()

// set the Live Transcription options
options := interfaces.LiveTranscriptionOptions{
  Language:   "en-US",
  Punctuate:  true,
  Encoding:   "linear16",
  Channels:   1,
  SampleRate: 16000,
}

// create a Live Client
dgClient, err := client.NewForDemo(ctx, options)
if err != nil {
  log.Fatalf("ERROR creating LiveTranscription connection: %v\n", err)
}

// connect the websocket to the Deepgram platform
wsconn := dgClient.Connect()
if wsconn == nil {
  log.Fatalf("Client.Connect failed")
}

// create a Microphone device
mic, err := microphone.New(microphone.AudioConfig{
  InputChannels: 1,
  SamplingRate:  16000,
})
if err != nil {
  log.Fatalf("Initialize failed. Err: %v\n", err)
}

// Start recording on the microphone
err = mic.Start()
if err != nil {
  log.Fatalf("mic.Start failed. Err: %v\n", err)
}

go func() {
  // Stream the microphone data to the Live client (this is a blocking call)
  mic.Stream(dgClient)
}()

If you are interested in running this example, please look at the README contained within the example folder. You will need to install PortAudio to use your local microphone.

Management API

Last but not least, you need an API to manage your Deepgram projects and accounts, which is what this last section covers. The goal of the Management API examples is to exercise all of the CRUD operations within a given example. That is to say, we want to Create an object, Read the object, Update the object, and Delete the object, all in a single "hello world" example.

You can find the examples for the Management API under examples/manage at the root of the repository. Still, let's walk through at least one of them, the Invitations example, to better illustrate what one of these "hello world" examples looks like.

// send an invite
respMessage, err := mgClient.SendInvitation(ctx, projectId, &interfaces.InvitationRequest{
  Email: "spam@spam.com",
  Scope: "member",
})

if err != nil {
  log.Fatalf("SendInvitation failed. Err: %v\n", err)
}
log.Printf("SendInvitation() - Name: %s\n", respMessage.Message)

// list all invitations and make sure the invitation was added
respGet, err := mgClient.ListInvitations(ctx, projectId)
if err != nil {
  log.Fatalf("ListInvitations failed. Err: %v\n", err)
}

for _, item := range respGet.Invites {
  id := item.Email
  scope := item.Scope
  log.Printf("ListInvitations() - ID: %s, Scope: %s\n", id, scope)
}

// delete the invitation we created
respMessage, err = mgClient.DeleteInvitation(ctx, projectId, "spam@spam.com")
if err != nil {
  log.Fatalf("DeleteInvitation failed. Err: %v\n", err)
}
log.Printf("DeleteInvitation() - Name: %s\n", respMessage.Message)

// list invitations to make sure it has been deleted
respGet, err = mgClient.ListInvitations(ctx, projectId)
if err != nil {
  log.Fatalf("ListInvitations failed. Err: %v\n", err)
}

for _, item := range respGet.Invites {
  id := item.Email
  scope := item.Scope
  log.Printf("ListInvitations() - ID: %s, Scope: %s\n", id, scope)
}

The goal of each example will be to exercise CRUD and bring your environment or configuration back to where it was before running the example.

Demo: Tour of the Examples

If you are a visual learner, you can view a tour of the examples in the YouTube video below.

Example Application Leveraging the Go SDK

We have covered the APIs and where best to find resources using them, but it might be interesting to look at a more complex example. There is a new Deepgram Virtual Assistant repository that captures an example of using the Live Transcription Client in the Go SDK.

This example implements a Virtual Scribe or Transcriber utility and leverages an open source project called Open Virtual Assistant, a simple framework for creating Alexa, Siri, and Google Home devices that can live on your laptop or IoT/Edge device, like a Raspberry Pi.

Once launched, the Virtual Scribe records everything that's said via your microphone. When you are finished with your note, you can say "send email," which will email your note using the preconfigured email address. If you need a moment to collect your thoughts or handle an interruption, you can say "pause" to tell the Virtual Scribe to stop recording. When you are ready to continue, say "resume" to pick up where you left off.
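Under the hood, that kind of voice command handling can be as simple as scanning each final transcript for trigger phrases. The sketch below is our own minimal illustration of the idea, not the Virtual Assistant repo's actual implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// command represents an action triggered by a spoken phrase.
type command int

const (
	cmdNone command = iota
	cmdSendEmail
	cmdPause
	cmdResume
)

// matchCommand scans a transcript for the scribe's trigger phrases.
// Matching is case-insensitive; anything else is treated as dictation.
func matchCommand(transcript string) command {
	t := strings.ToLower(transcript)
	switch {
	case strings.Contains(t, "send email"):
		return cmdSendEmail
	case strings.Contains(t, "pause"):
		return cmdPause
	case strings.Contains(t, "resume"):
		return cmdResume
	default:
		return cmdNone
	}
}

func main() {
	for _, line := range []string{
		"okay let's pause for a second",
		"please send email",
		"just regular dictation",
	} {
		fmt.Println(line, "->", matchCommand(line))
	}
}
```

A production implementation would likely act only on final (not interim) transcripts to avoid triggering twice on the same utterance.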

Let's take a look at this quick demo.

If you are interested in some of the key implementation details, I would encourage you to take a look at the Virtual Assistant repository.

What's on the Horizon?

Since Go is now an officially supported SDK, more updates, enhancements, and improvements are on the way. Some immediate enhancements arriving soon are of the intelligence variety, while medium- to longer-term initiatives include OSS project management, GitHub Actions, linting, developer tooling, and more.

I encourage those interested in using the Go SDK to give it a try, file any bugs you see, and provide any feedback to help make this project even better. If you build any applications using the SDK, drop us a line in Discord and let us know about it. Happy coding!

Sign up for Deepgram

Sign up for a Deepgram account and get $200 in Free Credit (up to 45,000 minutes), absolutely free. No credit card needed!

Learn more about Deepgram

We encourage you to explore Deepgram by checking out the following resources:

  1. Deepgram API Playground 

  2. Deepgram Documentation

  3. Deepgram Starter Apps
