Voice interactive ... again.

"Once more unto the breach, dear friends"...

It seems like I was only here not long ago.  This time I was at it with OS X, taking the opportunity to learn a little Objective-C and play with one of the NoSQL databases that seem to be popular these days.  I decided to use Elasticsearch (es) as my data store.  The plan is (was) to use the built-in enhanced dictation as the recognizer, and Apple's AppKit NSSpeechSynthesizer as the voice, to create an interactive dialog system.  

The basic loop is kicked off by starting the recognizer.  When it finishes recognizing a phrase (uninterrupted speech followed by a pause), it gives me a controlTextDidChange notification.  I then interrupt the recognizer and pass the message via another notification to a central controller.  The recognized phrase goes through four phases:  direct recognition, query engine, personalized recognition, and then an AIML-based chat bot.  In other words... ya, I'm just re-inventing Siri... and doing it poorly, mind you.  

Direct recognition is an exact match against some canned queries:  "When is sunset tomorrow?" or "What time is it?".  The phrases get interesting when you consider location, so I dumped zip code + city name + lat & long data into es, and I make queries to earthtools.org to determine, for example, sunrise and sunset in any city (in the US).  It does seem as if their data doesn't jibe with weather.com (I didn't like their API), but in the end I'm just doing this for fun, so accuracy doesn't matter as much as "oh shiny".  
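
For my own notes, here is a rough Python sketch of that four-phase chain (the real thing is Objective-C, so this is only an illustration).  Every function name and canned reply below is made up for the example; the only idea taken from the above is that each phase either answers or passes the phrase along, with the chat bot as the catch-all.

```python
import string
from typing import Callable, List, Optional

# A phase takes the recognized phrase and returns a reply, or None to pass.
Handler = Callable[[str], Optional[str]]


def normalize(phrase: str) -> str:
    """Lowercase and drop punctuation so "What time is it?" matches a canned key."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(phrase.lower().translate(table).split())


def direct_recognition(phrase: str) -> Optional[str]:
    # Exact match against canned queries (hypothetical table and reply).
    canned = {"what time is it": "(canned time reply)"}
    return canned.get(normalize(phrase))


def query_engine(phrase: str) -> Optional[str]:
    return None  # stub: a fact lookup would go here


def personalized_recognition(phrase: str) -> Optional[str]:
    return None  # stub: a lookup of remembered user info would go here


def chat_bot(phrase: str) -> Optional[str]:
    return "Tell me more."  # stand-in for the AIML fallback, always answers


PHASES: List[Handler] = [
    direct_recognition,
    query_engine,
    personalized_recognition,
    chat_bot,
]


def respond(phrase: str, phases: List[Handler] = PHASES) -> str:
    """Return the reply from the first phase that produces one."""
    for phase in phases:
        reply = phase(phrase)
        if reply is not None:
            return reply
    return ""
```

Ordering is the whole design here: the cheap exact matches go first, and the chat bot only gets the phrase when nothing more specific claimed it.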

There's a pretty cool NSLinguisticTagger class that can be handy.  This page was a pretty useful jumping-off point into the world of NSLinguisticTagger, and I've really only scratched the surface.  For example, given the query "WHO WROTE THE DECLARATION OF INDEPENDENCE" I get back:

2015-01-20 23:47:22.596 Chatter[52212:595641] WHO: Pronoun

2015-01-20 23:47:22.596 Chatter[52212:595641] WROTE: Verb

2015-01-20 23:47:22.596 Chatter[52212:595641] THE: Determiner

2015-01-20 23:47:22.596 Chatter[52212:595641] DECLARATION: Noun

2015-01-20 23:47:22.597 Chatter[52212:595641] OF: Preposition

2015-01-20 23:47:22.597 Chatter[52212:595641] INDEPENDENCE: Noun


Granted, that's not how I dealt with the above query.  I figured a knowledge-system approach would be best for a query engine.  So I have a collection of facts dumped in one path of es: 

Thomas Jefferson wrote the Declaration of Independence

George Washington was the first President of the United States

Adolf Hitler was the leader of the Nazi Party

Robert Oppenheimer was the father of the atomic bomb

Albert Einstein developed the theory of relativity

Bill Clinton was the 42nd President of the United States


... and so on.


Given any "who" question, we can then chop off the "who" and search on the rest:

"who developed the theory of relativity?"  --> becomes an es query on "developed the theory of relativity" --> becomes the response "Albert Einstein developed the theory of relativity."  There are some similar tricks for the "where" and "when".  The "how" is a bit tougher.  And these are of course simple tricks, not conclusive or exhaustive.  The good news is that knowledge representation is a fairly well-studied area, and given the internet and Google, Wikipedia, and Wolfram Alpha, we have access to a very nice data source.  

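The chop-and-search trick can be sketched in a few lines of Python.  To keep it self-contained, an in-memory list and a plain phrase-containment check stand in for the es query (that substitution is my assumption for the sketch, not what es does internally), and answer_who is a made-up name.

```python
import re
from typing import List, Optional

# A few of the facts from above, held in memory instead of es.
FACTS = [
    "Thomas Jefferson wrote the Declaration of Independence",
    "George Washington was the first President of the United States",
    "Albert Einstein developed the theory of relativity",
    "Bill Clinton was the 42nd President of the United States",
]


def tokens(text: str) -> List[str]:
    """Lowercased words with punctuation stripped."""
    return re.findall(r"[a-z0-9]+", text.lower())


def answer_who(question: str, facts: List[str] = FACTS) -> Optional[str]:
    """Drop the leading "who" and return the first fact containing the rest."""
    words = tokens(question)
    if len(words) < 2 or words[0] != "who":
        return None  # not a "who" question, or nothing left to search on
    remainder = " " + " ".join(words[1:]) + " "
    for fact in facts:
        # Pad with spaces so the match lands on whole words only.
        if remainder in " " + " ".join(tokens(fact)) + " ":
            return fact + "."
    return None
```

So answer_who("who developed the theory of relativity?") comes back with "Albert Einstein developed the theory of relativity.", and anything that doesn't start with "who" returns None so the next phase can have a go.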

The personalized recognition was more for things like location, interests, calendar, birthdays, contact information, identifying relationships, etc.  The idea being that as you interact with the system, it should "remember" information about you, your relationships, and more.  At some point the system should know that I have a child (if I've mentioned it) and even one day spontaneously inquire as to the child's well-being.  Still working on this... ;-)  


The last part is the "fall back".  AIML chat bots have come a pretty long way in simulating basic conversation skills.  If the other systems didn't respond with anything, it would be up to an AIML-based chat bot to reply.


The response is fed to NSSpeechSynthesizer, for which we act as the delegate.  When we receive the didFinishSpeaking callback, we know that we can re-enable the recognizer for the next round.
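
The turn-taking described above boils down to a tiny state machine.  Here's a Python sketch of it; the class and method names are hypothetical stand-ins for the recognizer notification and the speech synthesizer delegate callback, and "speaking" is just appending to a list.

```python
from typing import Callable, List


class DialogLoop:
    """Listen, reply, and stay deaf until the reply has finished playing."""

    def __init__(self, respond: Callable[[str], str]) -> None:
        self.respond = respond
        self.listening = False
        self.spoken: List[str] = []  # stand-in for handing text to the synthesizer

    def start(self) -> None:
        self.listening = True

    def on_phrase_recognized(self, phrase: str) -> None:
        if not self.listening:
            return  # ignore anything heard while we are still talking
        self.listening = False  # interrupt the recognizer
        self.spoken.append(self.respond(phrase))

    def on_finished_speaking(self) -> None:
        self.listening = True  # re-enable the recognizer for the next round
```

The point of the flag is that the recognizer never gets to hear the synthesizer's own output.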


Alas other projects and interests have caught me and this is really just me taking notes while this gets pushed to the back burner.  So until next time... adieu.


chatter.png




Ubuntu upgrade problems

So I finally stopped ignoring the fact that my Ubuntu version was no longer supported and decided to upgrade.  It was painful.  It was stupid.  And so here I am typing about it for (hopefully) someone else's benefit.  

Started with Ubuntu 11.04 (Natty Narwhal).  Tried using the Update Manager to go to 11.10.  It kept failing with lots of Hash Sum Mismatch errors.  I tried everything I could find via Google, which most often was:

sudo rm -fR /var/lib/apt/lists/*

and then run apt-get update / upgrade and try it again.  No dice.  Hash Sum Mismatch.  I tried removing all of the sources except the main Canonical.  Still no.

Ok fine, more googling turns up apt-get clean, apt-get autoclean, etc... still no dice.
 
Finally I wanted to see what was failing.  First run through, I copy-pasted the failed list.  Emacs tells me there are 253 lines.  On a whim I tried it again (sign of insanity is what?) and lo and behold 240 lines.  There is some timeout issue (JUST A GUESS!) causing files to not finish downloading.  It's quite the pain in the ass.  But subsequent requests pull more and more of it correctly.  After the 4th or 5th run through of that (150ish lines remaining) I decided there must be a better way.  

And here it is.  Use the command line.  This is what I did, you might not need to do it.  But it won't hurt.

sudo apt-get update && sudo apt-get upgrade
sudo apt-get install update-manager-core
sudo do-release-upgrade
y (enter)
Repeat.  Over and over.  At least it's much faster to iterate via the CLI.  Finally, the light at the end of the tunnel.  Answer a few questions on upgrading, and then the only thing left was postgres. 

sudo su postgres
pg_dropcluster --stop 9.1 main
pg_upgradecluster 8.4 main

Take a minute to run pgAdmin and verify the upgrade was golden (it was) and then 

pg_dropcluster 8.4 main

Yay 11.10.  And then I made the mistake of re-enabling other sources, poked around a bit... and then decided to move up to 12.04.  I assume the intermediary step was required for the change to Unity.  So... fire up the Update Manager.  Problems.  Ok, disable other sources, run again.  Hash Sum Mismatch.  You've gotta be fraking kidding me.  At least this time it only took 3 iterations.  

But the point of all this IS... if you keep doing sudo rm -fR /var/lib/apt/lists/* and cleaning apt-get... you'll never get anywhere.  
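
If you'd rather not babysit the retries, the "just run it again" approach could be wrapped in a small Python helper like the sketch below.  I did it by hand, so this is untested against a real upgrade; run_until_success is a made-up name, and it assumes the command exits non-zero when it fails, which I haven't verified for do-release-upgrade.

```python
import subprocess
from typing import Callable, Optional, Sequence


def run_until_success(
    command: Sequence[str],
    max_attempts: int = 10,
    runner: Optional[Callable[[Sequence[str]], int]] = None,
) -> int:
    """Re-run command until it exits 0.

    Returns the 1-based attempt that succeeded, or 0 if every attempt failed.
    The runner argument exists so the loop can be tried without touching apt.
    """
    if runner is None:
        # Default: really run the command and report its exit status.
        runner = lambda cmd: subprocess.run(list(cmd)).returncode
    for attempt in range(1, max_attempts + 1):
        if runner(command) == 0:
            return attempt
    return 0
```

Something like run_until_success(["sudo", "apt-get", "update"]) would be the obvious first use.  The release upgrade itself asks questions along the way, so that one still needs a human at the keyboard.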

If this keeps up (it's been going on for a while, it appears from the Google results), I fear Ubuntu will lose quite a few users.  

Voice interactive system

So I managed to accomplish something in like 2 days that I couldn't do after a year in Japan doing an undergrad research project.  I threw together a voice interactive application.  Granted, the machines were much slower back then, and I had only an old version of the HTK toolkit and some very bad training data to work with.  Alcohol and karaoke may have also been a factor.  And considering I was a good bit lower down in the recognition process (tagging my own speech recordings for training, generating the HMM bigram / trigram models for comparison testing, etc...), today was a bit anticlimactic.  Also, I haven't really made much here... just mashed two awesome projects together.

Solo Guitar songs

A few original songs.  No fancy mics or effects, sorry =)  

I'm Going Home - A deathbed song.  You might say I'm channeling my wife on this one (I'm not a religious person but she is).  

Out Of Time - Inspired by the life of a poor character in an ever-present dungeon crawler game with a countdown.  We spend all of our time imagining ourselves inside the game.  They'd probably just want to get out!

Refined - A song about someone who can't escape a past love.  About how memories boil down to powerful moments.




Advanced Open Water!

After some delay due to illness, and then a few scheduling mishaps, I finally made it into the PADI Advanced Open Water course.  Started in March, which often competes with February for the coldest diving month in San Diego.  For the AOW course, there is some flexibility.  You need to complete 5 adventure dives.  Two of them are required: Underwater Navigation, and Deeper Diving.  

After that you can choose 3 others.  At the orientation, the instructor, Patrick, suggested which 3 would be best for us but gave some leeway in case we wanted to do something different.  Thankfully everyone was fairly laid back and agreed to go with the offered suggestion:  Underwater Naturalist, Wreck Diving, and Night Diving.  

Reboot: Total Recall.

Gotta love it.  Apple just released a trailer hyping up the upcoming release of a trailer, for the Total Recall remake.  That marketing department is workin' overtime I tell ya.  Snark aside, it looks pretty damn cool.  I thought the original was great.  But like many sci-fi films getting the makeover, I think Total Recall could really benefit from the higher production value possibilities.  Given the enclosed nature of most of the shots in the original, I thought it was interesting that the trailer (of the trailer) opened with a nice broad expanse shot of the (assumed) hero looking out over some jacked up city of the future.  
