This is a follow-up to yesterday's post, in which we tried to use spaCy and TextBlob for Natural Language Processing (NLP). Our hope was to automate a live blog of this morning's AWS keynote. Truth be told, it was an aggressive goal, but we learned a lot in the process.

To follow up, we thought we'd process some of last year's keynote. We ran it through the non-streaming transcription process with Amazon Transcribe.

The results were surprisingly better than what we got from streaming.

Check it out:

See the new H one instances lunch last night, which is perfect for those Get the hardware. You still want the elasticity of the reliability and steal ability of eight of you asked. We now have their metal instances that we now slash Immunity for you. We have the most powerful GPU is out there and three instances. You want a PGA instance? We have that. Not a ray of instances and then what we do is we make all of our excess capacity. Anyone point available to you a spot market. View of applications that can afford to be in a rush. The intermittent where you use the capacity ones available and you don’t win. It’s not spot allows you to save about ninety percent on the price of on demand instances, which is really you…

Andy Jassy, re:Invent Keynote - AWS Transcribe via streaming

…constrained. Or, if you have big data at work loads, you could see the new h one instances who launch last night, which is perfect for those if you need to get at the hardware. But you still want the elasticity and the reliability and scale ability of eight of you asked. We now have bare metal instances that we now it's last night. If you need gpu, we have the most powerful gpu instances out there and pete three instances. And if you want an f p g. A instance, we have that too. So it's that broad array of instances, and then what we do is we make all of our excess capacity. Anyone point available to you a spot market. And if you have applications that can afford to be interrupted and be intermittent, where you use the capacity ones available and you don't when it's not, that spot allows you say about ninety percent on the price of on demand instances, which is really useful. We have a huge spot market.

Andy Jassy, re:Invent Keynote - AWS Transcribe file-based upload

NLP Needs a Little Time...

It's pretty clear that the non-live process did a far superior job. Here's the metadata output from the transcription job:

{
    "TranscriptionJob": {
        "TranscriptionJobName": "2017-keynote1-1",
        "TranscriptionJobStatus": "COMPLETED",
        "LanguageCode": "en-US",
        "MediaSampleRateHertz": 44100,
        "MediaFormat": "mp3",
        "CreationTime": "2018-11-28T18:57:23.815Z",
        "CompletionTime": "2018-11-28T19:31:12.379Z",
        "Settings": {
            "VocabularyName": "vocabModerate",
            "ChannelIdentification": false
        }
    }
}

In a little under 34 minutes, Transcribe processed nearly 83 minutes of audio and gave us a 'pretty good' result. We used a custom vocabulary, and when we fed it an MP3 we got (what we think is) a better transcript than the live-stream tests we ran on the same audio.
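For anyone who wants to try the file-based route, here is a minimal sketch of how a job like the one above might be submitted with boto3. The job name, vocabulary name, format, and sample rate come from the metadata shown earlier; the S3 URI and the helper names `build_job_request` and `start_job` are hypothetical, and this is not necessarily how we ran our own job.

```python
def build_job_request(job_name, media_uri, vocabulary_name):
    """Assemble keyword arguments for Transcribe's StartTranscriptionJob call."""
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": "en-US",
        "MediaFormat": "mp3",
        "MediaSampleRateHertz": 44100,
        "Media": {"MediaFileUri": media_uri},
        "Settings": {
            # The custom vocabulary must already exist in the account.
            "VocabularyName": vocabulary_name,
            "ChannelIdentification": False,
        },
    }


def start_job(request):
    """Submit the request. Needs boto3 installed and AWS credentials configured."""
    import boto3  # imported lazily so the request builder works without it

    client = boto3.client("transcribe")
    return client.start_transcription_job(**request)


request = build_job_request(
    "2017-keynote1-1",
    "s3://example-bucket/keynote.mp3",  # hypothetical location, not our real bucket
    "vocabModerate",
)
# start_job(request)  # uncomment to actually submit the job
```

Keeping the request as a plain dictionary makes it easy to inspect or log before anything is sent to AWS.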

In our test, 82:50 of audio was processed in 33:49, or 40.82% of the runtime of the file itself. Can you listen to and transcribe an audio file that long at almost 2.5x speed? Neither can we.
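If you want to check that arithmetic yourself, a short sketch like this derives the numbers from the job's `CreationTime` and `CompletionTime`. The 82:50 audio length is taken from our figures above, since the job metadata does not include it, and `parse_ts` is just an illustrative helper name.

```python
from datetime import datetime


def parse_ts(ts):
    """Parse a Transcribe-style UTC timestamp such as 2018-11-28T18:57:23.815Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


created = parse_ts("2018-11-28T18:57:23.815Z")
completed = parse_ts("2018-11-28T19:31:12.379Z")

processing_seconds = (completed - created).total_seconds()
audio_seconds = 82 * 60 + 50  # 82:50 of audio, as stated in the post

ratio = processing_seconds / audio_seconds    # share of the audio's runtime
speedup = audio_seconds / processing_seconds  # how much faster than real time

print(f"processing time: {processing_seconds:.0f} s")
print(f"share of runtime: {ratio:.2%}")
print(f"speed: {speedup:.2f}x")
```

Run against the timestamps above, this works out to roughly 2,029 seconds of processing, about 40.82% of the runtime, or about 2.45x real time.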

The moral of the story: unless you really need live transcription, the file-based process is actually pretty darn good, and it's relatively fast. For longer events like this, submitting the audio in chunks while the event is still going on could get you to a final transcript in 'near real-time', likely finished shortly after the event ends.
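As a rough illustration of the batching idea, the sketch below plans chunk boundaries for a long recording. We have not built this pipeline; the ten-minute chunk length, the optional overlap, and the name `plan_chunks` are all assumptions, and cutting the audio and stitching the transcripts back together are left out.

```python
def plan_chunks(total_seconds, chunk_seconds, overlap_seconds=0):
    """Return (start, end) offsets in seconds that cover the whole recording.

    A small overlap can help avoid cutting a word in half at a boundary.
    """
    if chunk_seconds <= overlap_seconds:
        raise ValueError("chunk_seconds must be larger than overlap_seconds")

    chunks = []
    start = 0
    while start < total_seconds:
        end = min(start + chunk_seconds, total_seconds)
        chunks.append((start, end))
        if end == total_seconds:
            break
        # Step back by the overlap so adjacent chunks share a little audio.
        start = end - overlap_seconds
    return chunks


# 82:50 of audio in ten-minute chunks (chunk length is an assumption).
keynote_chunks = plan_chunks(82 * 60 + 50, 600)
for start, end in keynote_chunks:
    print(f"{start:>5} s to {end:>5} s")
```

Each chunk could then be submitted as its own transcription job as soon as it has been recorded, so most of the transcript is ready before the event finishes.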

Next Steps

So we're at re:Invent 2018, and time is a little tight, but check back soon for a third follow-up, in which we process the transcribed audio with our NLP and Named Entity Recognition (NER) workflow trained on 'cloudy concepts.'

Let us know if you want to connect about how we can help you apply Natural Language Processing, Machine Learning and the breadth of awesome tools the AWS team have been presenting over the last few days!
