Automatically transcribe your Evernote voice notes

I am a big believer in taking notes and working on my personal infrastructure. I find joy in polishing the “Artur OS” and removing little pockets of friction in my setup. Have a look at my automation philosophy:

Recently, we adopted a dog, and I now spend quite a bit of time walking. What if I could use that time for some deep thinking?

So I wrote a bot.

How does it work?

A: I record an audio note on my phone:

B: I run my magical code

C: A transcription shows underneath

D: Profit!

There are other ways to solve this problem. In particular, Otter.AI is a great service for transcribing your notes. However, it requires extra manual steps: opening the app, exporting recordings, and so on.

If you don’t have an elaborate setup behind your Evernote account, I recommend you check out Otter.

What you need:

  1. Set up the PHP (yes, PHP) SDK for Evernote
  2. Get an API key for the Google Cloud Speech API. You don’t need a client library! Just follow these steps:
    1. Enable the API for your project here
    2. Create a new “API Interface Key” here

This code assumes you have already searched for the note and pass it in. I tag a note with `Tools`. A cron job periodically checks that tag and does certain magical things to the notes tagged with it.


It finds the audio recording in your Evernote note and transcribes it:

function transcribe_audio_file_in_a_note( $evernoteClient, $note ) {
    // Find the place where the file is embedded, so we can display the transcription underneath.
    if ( ! preg_match( '#<en-media hash="([a-z0-9]+)"[^>]+>#is', $note->content, $res ) ) {
        return $note->content;
    }

    // The hash attribute is the hex-encoded MD5 of the resource body.
    $id        = hex2bin( $res[1] );
    $resources = array_filter( $note->resources, function( $resource ) use ( $id ) {
        return $resource->data->bodyHash === $id;
    } );
    if ( ! $resources ) {
        return $note->content;
    }

    $resource = array_shift( $resources );
    // Only audio files
    if ( ! isset( $resource->mime ) || $resource->mime !== 'audio/x-m4a' ) {
        return $note->content;
    }

    if ( isset( $resource->attributes->applicationData->keysOnly['transcribed'] ) ) {
        $this->log( LOG_INFO, 'This resource is already transcribed.' );
        return $note->content;
    }

    $noteStore = $evernoteClient->client->getNoteStore();
    // Set for the future
    $noteStore->setResourceApplicationDataEntry( $resource->guid, 'transcribed', 'true' );

    // This is your Google Speech API token
    $token = "tsrtrastr8astars8tras8t";
    $in    = tempnam( sys_get_temp_dir(), 'evernote_transcript' ) . '.mp4';
    $out   = tempnam( sys_get_temp_dir(), 'evernote_transcript' ) . '.wav';
    $data  = $noteStore->getResourceData( $resource->guid );
    file_put_contents( $in, $data );

    // Google Speech API cannot deal with this format, so we have to re-encode the audio to WAV.
    // -ac 1 -ar 44100 forces mono 44.1 kHz so the file matches the config below.
    system( 'ffmpeg -i ' . escapeshellarg( $in ) . ' -ac 1 -ar 44100 ' . escapeshellarg( $out ) );
    $data = file_get_contents( $out );

    $payload = array(
        "audio"  => array( "content" => base64_encode( $data ) ),
        "config" => array(
            "languageCode"               => "en-US",
            "alternativeLanguageCodes"   => [ "pl-PL" ], // I only use English or Polish. Your mileage may vary.
            "encoding"                   => "LINEAR16",
            "sampleRateHertz"            => 44100,
            "maxAlternatives"            => 1,
            "enableAutomaticPunctuation" => true,
        ),
    );
    $payload = json_encode( $payload );
    $context = stream_context_create( array(
        'http' => array(
            'ignore_errors' => true,
            'header'        => "Content-Type: application/json\r\n",
            'method'        => 'POST',
            'content'       => $payload,
        ),
    ) );

    // Wondering about the v1p1beta1 here? You have to use this version to have alternativeLanguageCodes. This, of course, is not in the documentation.
    $result = file_get_contents( "https://speech.googleapis.com/v1p1beta1/speech:recognize?key=$token", false, $context );
    $result = json_decode( $result, true );

    if ( ! isset( $result['results'][0]['alternatives'][0]['transcript'] ) ) {
        $this->log( LOG_WARNING, 'Empty transcript. ' . print_r( $result, true ) );
        return $note->content;
    }

    $text = $result['results'][0]['alternatives'][0]['transcript'];
    $this->log( LOG_INFO, 'Transcript OK: ' . $text );

    // Escape the text: note content is XML and must stay well-formed.
    $safe_text     = htmlspecialchars( $text, ENT_QUOTES | ENT_XML1 );
    $new_body      = str_replace( $res[0], $res[0] . "<div style='font-style: italic'>$safe_text</div>", $note->content );
    $note->content = $new_body;
    $noteStore->updateNote( $note );

    return $note->content;
}
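
The two string-handling steps in that function can be tried out without Evernote or Google credentials. Below is a minimal sketch, not the code I actually run: `join_transcripts()` and `insert_transcript()` are hypothetical helper names, and the looser regex (it accepts `hash` in any attribute position) is my assumption about how notes may be serialized. The first helper joins every segment of the response, because the Speech API can split longer audio into several consecutive `results` entries while the function above only reads the first one.

```php
<?php
/**
 * Join every segment of a speech:recognize response into one string.
 * Returns null when the response holds no transcript at all.
 */
function join_transcripts( array $response ): ?string {
    $parts = array();
    foreach ( $response['results'] ?? array() as $result ) {
        // Each result covers a consecutive portion of the audio; take its top alternative.
        if ( isset( $result['alternatives'][0]['transcript'] ) ) {
            $parts[] = trim( $result['alternatives'][0]['transcript'] );
        }
    }
    return $parts ? implode( ' ', $parts ) : null;
}

/**
 * Insert the transcript right after the first <en-media> tag of a note body.
 * The text is XML-escaped so the note content stays well-formed.
 */
function insert_transcript( string $enml, string $text ): string {
    if ( ! preg_match( '#<en-media\b[^>]*\bhash="([a-f0-9]+)"[^>]*>#i', $enml, $match ) ) {
        return $enml; // No embedded media: leave the note untouched.
    }
    $safe = htmlspecialchars( $text, ENT_QUOTES | ENT_XML1, 'UTF-8' );
    $pos  = strpos( $enml, $match[0] ) + strlen( $match[0] );

    return substr( $enml, 0, $pos ) . "<div style='font-style: italic'>$safe</div>" . substr( $enml, $pos );
}
```

Because both helpers are pure functions, they can be fed a saved JSON response and a copy of a note body before any real note is touched.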

More of my adventures in automation:

A WordPress plugin


The plugin creates a new widget that lets you place the system's links on your WordPress site.

Why? It lets you earn money from your website. You publish links that are not only good advertising but also improve the search ranking of the target page. Installing Prolink on your site requires uploading files, inserting code, and a few other operations.

When you have x WordPress sites, the installation can be tedious. Editing the theme template is a problem too.

How does it work?


Archiving phone calls

My job involves a great many phone calls. Clients often dictate information to me, or say things they later don't remember. It can get unpleasant at times.

Following the example of our former minister of justice, I decided to "Ziobro-ify" my phone. Since I own an HTC Wizard (SPV M3000) with the WM6 Pathfinder 3.2 NxS ROM (in short, my phone is a PDA running Windows Mobile), I could run PMRecorder on it, a program that records all calls. The program is free, but it records calls in a strange format.

So, to the point. I wrote a program that converts a directory of PMRecorder files and BSCallTimes.xml (the WM6 call history file) into files you can open in Media Player, and loads the data into a MySQL database.

It is a PHP script. I know it's crazy, stupid and pointless, but I like PHP, I have access to a server, and I run it from the shell without any trouble.

It is worth raising the execution time and memory limits in php.ini.
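
If editing php.ini is not convenient, the same limits can usually be raised from inside the script itself. A minimal sketch; the `512M` value is an arbitrary assumption for illustration, not a figure from the original script, and a host may forbid changing these settings at runtime.

```php
<?php
// Lift the execution time limit for this run (0 means no limit).
set_time_limit( 0 );

// Raise the memory ceiling; 512M is an arbitrary illustrative value.
ini_set( 'memory_limit', '512M' );

echo 'memory_limit is now ', ini_get( 'memory_limit' ), "\n";
```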
Database table schema:

CREATE TABLE `owczarek_rozmowy` (
  `id` int(11) NOT NULL auto_increment,
  `wav_file` varchar(16) NOT NULL default '',
  `type` text NOT NULL,
  `time` int(12) NOT NULL default '0',
  `length` int(6) NOT NULL default '0',
  `number` varchar(12) NOT NULL default '',
  `caller` text NOT NULL,
  `number_type` char(1) NOT NULL default '',
  `note` text NOT NULL,
  PRIMARY KEY (`id`)
);

And here is the actual script:



//A great function i found at

function xml2array($contents, $get_attributes=1) {
    if(!$contents) return array();

    if(!function_exists('xml_parser_create')) {
        //print "'xml_parser_create()' function not found!";
        return array();
    }
    //Get the XML parser of PHP - PHP must have this module for the parser to work
    $parser = xml_parser_create();
    xml_parser_set_option( $parser, XML_OPTION_CASE_FOLDING, 0 );
    xml_parser_set_option( $parser, XML_OPTION_SKIP_WHITE, 1 );
    xml_parse_into_struct( $parser, $contents, $xml_values );
    xml_parser_free( $parser );

    if(!$xml_values) return array();//Nothing was parsed

    $xml_array = array();
    $parents = array();
    $opened_tags = array();
    $arr = array();

    $current = &$xml_array;

    //Go through the tags.
    foreach($xml_values as $data) {
        unset($attributes,$value);//Remove existing values, or there will be trouble
        extract($data);//We could use the array by itself, but this is cooler.

        $result = '';
        if($get_attributes) {//The second argument of the function decides this.
            $result = array();
            if(isset($value)) $result['value'] = $value;

            //Set the attributes too.
            if(isset($attributes)) {
                foreach($attributes as $attr => $val) {
                    if($get_attributes == 1) $result['attr'][$attr] = $val; //Set all the attributes in an array called 'attr'
                    /**  :TODO: should we change the key name to '_attr'? Someone may use the tagname 'attr'. Same goes for 'value' too */
                }
            }
        } elseif(isset($value)) {
            $result = $value;
        }

        //See tag status and do the needed.
        if($type == "open") {//The starting of the tag '<tag>'
            $parent[$level-1] = &$current;

            if(!is_array($current) or (!in_array($tag, array_keys($current)))) { //Insert New tag
                $current[$tag] = $result;
                $current = &$current[$tag];

            } else { //There was another element with the same tag name
                if(isset($current[$tag][0])) {
                    array_push($current[$tag], $result);
                } else {
                    $current[$tag] = array($current[$tag],$result);
                }
                $last = count($current[$tag]) - 1;
                $current = &$current[$tag][$last];
            }

        } elseif($type == "complete") { //Tags that end in 1 line '<tag />'
            //See if the key is already taken.
            if(!isset($current[$tag])) { //New Key
                $current[$tag] = $result;

            } else { //If taken, put all things inside a list(array)
                if((is_array($current[$tag]) and $get_attributes == 0)//If it is already an array...
                        or (isset($current[$tag][0]) and is_array($current[$tag][0]) and $get_attributes == 1)) {
                    array_push($current[$tag],$result); // ...push the new element into that array.
                } else { //If it is not an array...
                    $current[$tag] = array($current[$tag],$result); //...make it an array using the existing value and the new value
                }
            }

        } elseif($type == 'close') { //End of tag '</tag>'
            $current = &$parent[$level-1];
        }
    }

    return $xml_array;
}



	//If there's a call log file in input directory.


		echo "Processing ".$input."/BSCallTimes.xml file\n";



			// If it's an sms, let's put size instead of length.

			//Time to parse date & time


			//What the hell do we need ? for?

			$wynik=mysql_query("SELECT * FROM $sql_table WHERE time='".$call['time']."';");

				//Is it already in the database? Maybe we inserted it during call record parsing.


					//Yup, we can update the data.
					mysql_query("UPDATE $sql_table SET type='".$call['type']."',length='".$call['length']."',number='".$call['number']."' WHERE time='".$call['time']."';");
					echo "At:".time()." updated call at [".$call['time']."] - Type:[".$call['type']."], Length:[".$call['length']."], Number: [".$call['number']."]\n";

					//Let's insert some data.
					mysql_query("INSERT INTO $sql_table SET type='".$call['type']."',length='".$call['length']."',number='".$call['number']."',time='".$call['time']."',caller='".$call['caller']."';");
					echo "At: ".time()." inserted call at [".$call['time']."] - Type:[".$call['type']."], Length:[".$call['length']."], Number: [".$call['number']."], Caller: [".$call['caller']."]\n";


		echo "BSCallTimes processed.\n";

	//So, now that BSCallTimes file is processed, let's process other files.

		$directory = opendir($input);
 		while($file = readdir($directory)) {

			//If it's PMRecorder file:


				//So we don't do this again:


					echo "Processing file ".$pliczek."\n";




					echo "Wav file saved at ".$wav_archive."/".$naz[0].".wav"."\n";



					//PMRecorder file name
					//Unix time
					//Caller name
					//Caller number
					preg_match ("#([a-z])#is",$data[3],$tmp);
					//Number type (h/m/...)
					//Filtered number

					//Once again some sql:

					$wynik=mysql_query("SELECT * FROM $sql_table WHERE time >'".($data[1]-4)."' AND time<'".($data[1]+1)."';");

					//Why such a strange WHERE clause? PMRecorder's timestamps run a few seconds late compared with BSCallTimes 🙂

						//Is it already in the database? Maybe we inserted it during BSCallTimes parsing.

							//Yup, we can update the data.
							mysql_query("UPDATE $sql_table SET wav_file='".$data[0]."',number_type='".$data[4]."',caller='".$data[2]."' WHERE time>'".($data[1]-4)."' AND time<'".($data[1]+1)."';");
							echo "At:".time()." updated call at [".$data[1]."] - Wav file:[".$data[0]."], Caller:[".$data[2]."],Number type:[".$data[4]."]\n";

							//The call wasn't inserted during BSCallTimes file parsing. Let's do this!
							//Let's insert some data.
							mysql_query("INSERT INTO $sql_table SET wav_file='".$data[0]."',number_type='".$data[4]."',caller='".$data[2]."',time='".$data[1]."',number='".$data[3]."';");
							echo "At:".time()." inserted call at [".$data[1]."] - Wav file:[".$data[0]."], Caller:[".$data[2]."],Number type:[".$data[4]."],Number:[".$data[3]."]\n";


					echo "File ".$pliczek." processed.\n";
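
The queries above build SQL by gluing strings together, and the `mysql_*` functions they rely on were removed in PHP 7. Here is a minimal sketch of the same update-or-insert step with PDO prepared statements. Everything specific in it is an assumption for illustration: the helper name `upsert_recording()`, the table name `calls`, and the in-memory SQLite database (which needs the `pdo_sqlite` extension) standing in for MySQL. Unlike the original UPDATE, it changes only the first row found inside the time window.

```php
<?php
/**
 * Attach a recording to an existing call row, or insert a new row.
 * A call matches when its timestamp lies strictly between time - 4 and time + 1,
 * the same tolerance the original queries use.
 */
function upsert_recording( PDO $db, array $rec ): string {
    $find = $db->prepare( 'SELECT id FROM calls WHERE time > ? AND time < ?' );
    $find->execute( array( $rec['time'] - 4, $rec['time'] + 1 ) );
    $id = $find->fetchColumn();

    if ( $id !== false ) {
        $update = $db->prepare( 'UPDATE calls SET wav_file = ?, number_type = ?, caller = ? WHERE id = ?' );
        $update->execute( array( $rec['wav_file'], $rec['number_type'], $rec['caller'], $id ) );
        return 'updated';
    }

    $insert = $db->prepare( 'INSERT INTO calls (wav_file, number_type, caller, time, number) VALUES (?, ?, ?, ?, ?)' );
    $insert->execute( array( $rec['wav_file'], $rec['number_type'], $rec['caller'], $rec['time'], $rec['number'] ) );
    return 'inserted';
}

// Demo against an in-memory SQLite database (assumption: pdo_sqlite is installed).
$db = new PDO( 'sqlite::memory:' );
$db->setAttribute( PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION );
$db->exec( 'CREATE TABLE calls (id INTEGER PRIMARY KEY, wav_file TEXT, number_type TEXT, caller TEXT, time INTEGER, number TEXT)' );

// A call already known from the call log, without a recording yet.
$db->exec( "INSERT INTO calls (time, number) VALUES (1000, '123456789')" );

// Recording stamped two seconds later: falls inside the window, so the row is updated.
$first = upsert_recording( $db, array(
    'wav_file' => 'a.wav', 'number_type' => 'm', 'caller' => 'Alice', 'time' => 1002, 'number' => '123456789',
) );

// Recording with no matching call: a new row is inserted.
$second = upsert_recording( $db, array(
    'wav_file' => 'b.wav', 'number_type' => 'h', 'caller' => 'Bob', 'time' => 5000, 'number' => '987654321',
) );

echo $first, ' / ', $second, "\n";
```

With MySQL the same statements should work once the PDO connection string is changed, since placeholders keep caller names and numbers out of the SQL text.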