Automatically transcribe your Evernote voice notes

I am a big believer in taking notes and working on my personal infrastructure. I find joy in polishing the “Artur OS” and removing little pockets of friction in my setup. Have a look at my automation philosophy:

Recently, we adopted a dog, and I now spend quite a bit of time walking. What if I could use that time for some deep thinking?

So I wrote a bot.

How does it work?

A: I record an audio note on my phone.

B: I run my magical code

C: A transcription shows underneath

D: Profit!

There are other ways to solve this problem. In particular, Otter.ai is a great service for transcribing your notes. However, it requires extra manual steps: opening the app, exporting recordings, and so on.

If you don’t have an elaborate setup behind your Evernote account, I recommend you check out Otter.

What you need:

  1. You need to set up the PHP (Yes) SDK for Evernote
  2. You need an API key for the Google Cloud Speech API. You don’t need a client library! Just follow these steps:
    1. Enable the API for your project here
    2. Create a new “API Interface Key” here

This code assumes you have already searched for a note and are passing it in. I tag such notes with `Tools`. A cron job periodically checks for that tag and does certain magical things to the notes that carry it.
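One way to picture that tag-driven setup is a small dispatcher that maps a tag to a handler. This is only a sketch under loud assumptions: the notes are plain arrays that something else has already fetched, and `run_tagged_handlers`, the array shape, and the handler are all hypothetical stand-ins, not part of the Evernote SDK or of my actual bot.

```php
<?php
// Hypothetical dispatcher: none of these names come from the Evernote SDK.
// $notes    - notes already fetched elsewhere, as arrays with 'title' and 'tags'.
// $handlers - map of tag name => callable that receives one note.
function run_tagged_handlers( array $notes, array $handlers ): array {
	$processed = array();
	foreach ( $notes as $note ) {
		foreach ( $handlers as $tag => $handler ) {
			if ( in_array( $tag, $note['tags'] ?? array(), true ) ) {
				$handler( $note );
				$processed[] = $note['title'];
			}
		}
	}
	// Titles of the notes that at least one handler ran on.
	return $processed;
}

// Made-up notes, for illustration only.
$notes = array(
	array( 'title' => 'Walk ideas', 'tags' => array( 'Tools' ) ),
	array( 'title' => 'Groceries', 'tags' => array() ),
);
$handlers = array(
	'Tools' => function ( array $note ) {
		// A real handler would call the transcription function here.
		echo "Would transcribe: {$note['title']}\n";
	},
);
$processed = run_tagged_handlers( $notes, $handlers );
```

Keeping the handlers in a map means a new automation is one more entry, and the cron script itself never has to change.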

Code

It finds the audio recording in your Evernote note and transcribes it:

<?php
// https://piszek.com/2020/05/27/evernote-transcriber/
function transcribe_audio_file_in_a_note( $evernoteClient, $note ) {
	// This will find the place where file is embedded, so we can display the transcription underneath.
	if ( preg_match( '#<en-media hash="([a-z0-9]+)"[^>]+>#is', $note->content, $res ) ) {
		// Resource body hashes are stored as raw bytes, the note content holds them as hex.
		$id = hex2bin( $res[1] );
		$resources = array_filter( $note->resources, function( $resource ) use ( $id ) {
			return $resource->data->bodyHash === $id;
		} );
		if ( $resources ) {
			$resource = array_shift( $resources );
		}
	}
	if ( isset( $resource->mime ) && $resource->mime === 'audio/x-m4a' ) {
		// Only audio files
		if ( isset( $resource->attributes->applicationData->keysOnly['transcribed'] ) ) {
			error_log( 'This resource is already transcribed.' );
			return;
		}
		// Set for the future
		$evernoteClient->client->getNoteStore()->setResourceApplicationDataEntry( $resource->guid, 'transcribed', 'true' );
		// this is your Google Speech API token
		$token = "tsrtrastr8astars8tras8t";
		$in = tempnam( sys_get_temp_dir(), 'evernote_transcript' ) . '.mp4';
		$out = tempnam( sys_get_temp_dir(), 'evernote_transcript' ) . '.wav';
		$data = $evernoteClient->client->getNoteStore()->getResourceData( $resource->guid );
		file_put_contents( $in, $data );
		// Because Google Speech API is crap and cannot deal with other formats, we have to recode it.
		// Force mono at 44.1 kHz so the file matches the config sent below.
		system( 'ffmpeg -y -i ' . escapeshellarg( $in ) . ' -ac 1 -ar 44100 ' . escapeshellarg( $out ) );
		$data = file_get_contents( $out );
		$payload = array(
			"audio" => array( "content" => base64_encode( $data ) ),
			"config" => array(
				"languageCode" => "en-US",
				"alternativeLanguageCodes" => [ "pl-PL" ], // I only use English or Polish. Your mileage may vary.
				"encoding" => "LINEAR16",
				"sampleRateHertz" => 44100,
				"maxAlternatives" => 1,
				"enableAutomaticPunctuation" => true
			)
		);
		$payload = json_encode( $payload );
		$context = stream_context_create( array(
			'http' => array(
				'ignore_errors' => true,
				'header' => "Content-Type: application/json\r\n",
				'method' => 'POST',
				'content' => $payload
			)
		) );
		// Wondering about the v1p1beta1 here? You have to use this version to have alternativeLanguageCodes. This of course is not in documentation.
		$result = file_get_contents( "https://speech.googleapis.com/v1p1beta1/speech:recognize?fields=results&key=$token", false, $context );
		$result = json_decode( $result, true );
		if ( ! isset( $result['results'][0]['alternatives'][0]['transcript'] ) ) {
			error_log( 'Empty transcript. ' . print_r( $result, true ) );
			return;
		}
		$text = $result['results'][0]['alternatives'][0]['transcript'];
		error_log( 'Transcript OK: ' . $text );
		// Note content is XML, so escape the transcript before embedding it.
		$safe_text = htmlspecialchars( $text, ENT_QUOTES );
		$new_body = str_replace( $res[0], $res[0] . "<div style='font-style: italic'>$safe_text</div>", $note->content );
		$note->content = $new_body;
		$evernoteClient->client->getNoteStore()->updateNote( $note );
	}
	return $note->content;
}
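The function above reads only the first entry of `results`. For longer recordings, the Speech API can return several consecutive entries, each covering a portion of the audio, so a possible refinement is to join the top alternative of each one. Here is a minimal sketch; `join_transcript` is a hypothetical helper name of my own, not part of any SDK, and the sample response is made up to mirror the shape of the decoded JSON.

```php
<?php
// Hypothetical helper (not part of the original function or any SDK):
// joins the top alternative of every entry in the decoded response.
function join_transcript( array $response ): ?string {
	$parts = array();
	foreach ( $response['results'] ?? array() as $result ) {
		$text = trim( $result['alternatives'][0]['transcript'] ?? '' );
		if ( $text !== '' ) {
			$parts[] = $text;
		}
	}
	// null signals "nothing recognized", like the empty-transcript check above.
	return $parts ? implode( ' ', $parts ) : null;
}

// Made-up response shaped like the API's JSON, for illustration only.
$sample = array(
	'results' => array(
		array( 'alternatives' => array( array( 'transcript' => 'Buy dog food.' ) ) ),
		array( 'alternatives' => array( array( 'transcript' => ' Call the vet.' ) ) ),
	),
);
echo join_transcript( $sample ), "\n"; // prints "Buy dog food. Call the vet."
```

In the function above, this would stand in for the direct read of `$result['results'][0]`, with a `null` return treated as the empty-transcript case.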

More of my adventures in automation:

Why do you have so many bots?

If I died today, I don’t think my friends would notice for a while.

My digital ghost would keep responding to some emails, pay my bills, and send birthday cards. He would read my text messages, forward important ones to my virtual assistant, or respond.

This spooky afterlife is not the goal. I have been automating bits and pieces of my daily responsibilities for the opposite purpose – to save more time for the things that truly matter in life. My ghost is here to help me now.

Ikiryō (生霊, lit. “living ghost”), in Japanese popular belief and fiction, refers to a spirit that leaves the body of a living person and subsequently haunts other people or places, sometimes across great distances.

Examples include:

  • Reading the invoices I get over email to pay specific ones and file them for my accountant,
  • Answering, recording and transcribing the calls I get from all unknown numbers,
  • Putting all the newsletters that I choose to receive in my Pocket app, where I consume all articles,
  • Monitoring my communication and reminding me to contact friends I haven’t reached out to for a while,
  • Many, many more, including sending birthday cards to my friends.

And this list does not even include automations that run this blog!

You see, I don’t automate to save a minute here or there. Writing, testing, and ensuring nothing goes awry is labor-intensive. I automate to forget about things. My digital ghost worries about A LOT, so I don’t have to.

My goals of automation

Photo by Nghia Le on Unsplash

Do you know that glut of “stuff” sitting in your stomach? The nagging notion that you have SO MUCH to do? Or maybe you are familiar with the guilt that you are so far behind in errands?

Transfers have to go out, invitations to whatever event sent, expenses reported. How can you find time this weekend to do something fun when you have amassed all this?

It certainly was a feeling for me!

That stressful notion of overwhelm is called the “Cognitive Load.” Think of it as a tax for remembering to do stuff. It does not even include doing the actual task – it’s just overhead, and my first goal of automation is to cut it as much as possible.

My second goal is to make sure things are done. Since both my wife and I work remotely, we travel a fair bit. After work, we have been exploring the cenotes of Yucatán, the safaris of South Africa, and the depths of underwater Thailand. But when you have to put in 8 hours of solid work and then rush to catch a diving boat, doing bank transfers, taxes, and calls is a real inconvenience. It’s really hard to do taxes underwater (although the feeling is the same).

We would postpone those things, and then, after the trip was finished, we would be hit by a freight train of obligations. Taxes on jet lag are not much fun either.

As Stephen Wolfram summarized in his fantastic post “Notes on my personal infrastructure”, my bots consist of “the technology and other things that help me live and work better, feel less busy, and be more productive every day.”

Building a personal infrastructure has freed my time, mental energy, and capacity to focus on more inspiring tasks. Instead of treading water, copying cells from one Excel spreadsheet to another, I can spend time with my wife or promote Remote Work in an effort to help curb climate change.

And I want you to free your potential too, so you can focus on a higher calling.

Everyone can automate

This Barista has fully embraced automation

Automation is no longer only for programmers like me and theoretical physicists like Stephen Wolfram. It often does not require a single line of code.

Erik Dietrich has a fantastic post, “Don’t learn to program, learn to automate,” where he describes his process of automation. Here are the steps required to write your first bot:

1 – You have to get very clear on what you are trying to achieve.

2 – You have to think about your process of achieving that.

If you are familiar with the GTD methodology, you might have noticed that these steps are the surefire way to Get Stuff Done. In the majority of cases, I would stop here. Focusing on goals and optimizing the “algorithm” of a manual task pays off before automating, and it’s sometimes enough.

But if you want to get the thing totally out of your mind:

3 – Implement the process. I know this sounds daunting, but have a look at a service called Zapier (or the free alternative – IFTTT). With a few clicks, I have created automations that will:

  • Save messages I starred in Slack to my TODO list,
  • Tweet 3 and 10 days after I have published a blog post,
  • Remind me what my mom needs help with whenever I’m close to her place,
  • Keep the tweets and Pocket articles I starred in a spreadsheet,
  • Many, many more.

I would often need to change my manual process because Zapier would not let me implement a specific flow, so don’t be surprised if you’ll have to go back to the previous point.

4 – Bonus points: maintenance

Stuff breaks, services change their offerings, and your automations will work unreliably. Just like a manager ensuring his team is getting the intended results, you’ll have to budget an hour per month to make sure everything works as expected.

With your own bots (or Ikiryō, if you prefer) handling the overhead, you could have more time for what you want from life.

But I have to warn you: busywork is sometimes enjoyable. It gives you a quick dopamine boost and satisfaction from a well-accomplished task. Having stuff to do has a way of making us feel essential and special.

The more you automate, the more deliberate you have to be with your life.

For me, that’s the third goal of automation.