Lucida For Personal Artificial Intelligence
Written by Nikos Vaggalis |
Wednesday, 07 September 2016
The Clarity Labs team of researchers at the University of Michigan made headlines last year with the release of its own IPA (Intelligent Personal Assistant), called Sirius. Sirius was mistakenly regarded by many as the open source version of Apple's Siri, but that wasn't the case since the two projects are totally unrelated. Maybe that's one reason for rebranding Sirius as Lucida. As Jeremy Russell, a member of the core team, puts it:
A little bit of history is necessary in order to fully comprehend that statement. The Sirius project was originally founded to facilitate benchmarking and research into future server architectures capable of handling the astronomical workloads placed on the Cloud platforms that offer Machine Learning as a service, the platforms on which all the major IPAs (Apple's Siri, Google's Google Now, Microsoft's Cortana and Amazon's Echo) rely. As current datacenter architectures approach their computational limits, Sirius came forward with a brand new proposition: rather than scaling up by stacking on more hardware, datacenters can keep doing their job by leveraging highly optimized, dedicated algorithms.
Lucida, however, aims to be more than that and goes beyond what Sirius achieved. Built on Sirius' foundations, it has evolved into the next, more intelligent generation, designed with modularity and extensibility in mind. It still provides speech recognition, image matching, natural language processing and question-answering services, but thanks to its newfound modularity any of its main components, Automatic Speech Recognition (ASR), Image Matching (IMM) or the Question-Answering system (QA), can be modified or completely replaced by custom-made components. Say, for example, that a researcher has come up with his own speech recognition engine; he can now simply replace Lucida's ASR component with his own and still take advantage of the rest of Lucida's backend components, as the sketch below illustrates. Or, if he isn't interested in the Image Matching component, he can remove it and work with a bare-bones version of Lucida instead. Lucida consumes queries in the form of speech or images and answers in natural language, just like a human assistant being handed a task. A prime demonstration of this can be experienced in the promotional video, in which a human operator talks to a Lucida-powered tablet, asking it a series of questions in natural language: "Who's the author of James Bond?" gets the reply "Ian Fleming". The next two questions, "When was Google's IPO?" followed by "Who invented peanut butter?", highlight the engine's agility in interpreting domain-agnostic questions.
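To make the idea of swappable components concrete, here is a minimal Python sketch of that kind of plug-in arrangement. The class and method names are purely illustrative and do not come from Lucida's codebase; they simply show how an ASR engine could be exchanged without touching the rest of the pipeline.

```python
# Hypothetical sketch of the plug-in style modularity described above.
# None of these names come from Lucida itself; they only illustrate the idea.
from abc import ABC, abstractmethod


class SpeechRecognizer(ABC):
    """Minimal interface every ASR component is assumed to implement."""

    @abstractmethod
    def transcribe(self, audio: bytes) -> str:
        """Turn raw audio into a text query."""


class StockRecognizer(SpeechRecognizer):
    """Stand-in for the default DNN-HMM based recognizer (e.g. Kaldi)."""

    def transcribe(self, audio: bytes) -> str:
        return "who is the author of james bond"  # placeholder result


class MyCustomRecognizer(SpeechRecognizer):
    """A researcher's own engine, dropped in as a replacement."""

    def transcribe(self, audio: bytes) -> str:
        return "when was google's ipo"  # placeholder result


def answer_query(asr: SpeechRecognizer, audio: bytes) -> str:
    """The rest of the pipeline only sees the interface, not the engine."""
    question = asr.transcribe(audio)
    # ...hand the question to the QA component and return its answer...
    return f"answering: {question}"


# Swapping engines is a one-line change:
print(answer_query(StockRecognizer(), b"...audio..."))
print(answer_query(MyCustomRecognizer(), b"...audio..."))
```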
But there's more: the video's mind-blowing moment was still to come, when the researcher presented Lucida with a picture of the Leaning Tower of Pisa and asked for its height. That's a mammoth task for any computer to undertake, because it first has to identify the building, analyze and understand the spoken request, translate it into a format the backend database can understand, then retrieve the answer and restructure it in natural language for the user. We've already explored such an approach, where human pilots communicate and coordinate with an AI Wingman in humanly understood language, a vital tool in the middle of an air battle.
Technically speaking, Lucida is formed by the fusion of three separate and self-contained components:

The Automatic Speech Recognition (ASR) component, which uses Gaussian Mixture Model and/or Deep Neural Network scoring, is backed by the Signal Processing DNN backend and supports several speech recognition toolkits: Kaldi (Deep Neural Network-Hidden Markov Model based), plus Pocketsphinx and Sphinx4 (Gaussian Mixture Model-Hidden Markov Model based).

The Image Matching (IMM) component, which relies on Feature Extraction (FE) and Feature Description (FD) techniques, is backed by the Image Processing DNN backend and uses SURF, a class of the OpenCV computer vision and machine learning library, to extract Speeded-Up Robust Features from an image and use them as queries against a database.

The Question-Answering system (QA), which employs Regular Expression (Regex) matching, Porter word stemming (Stemmer) and Conditional Random Fields (CRF) tagging, is backed by the Natural Language Processing DNN backend and utilizes OpenEphyra, a platform-independent Java framework for question answering, plus a Wikipedia database stored in Lemur's Indri format. This is how Lucida could answer the "How tall is the Tower of Pisa?" question; it looked it up in the embedded Wikipedia database.

These DNN backends, together with seven dependent applications, were united under the Deep-Learning-as-a-Service umbrella, taking shape in the DjiNN and Tonic suite. The Tonic suite, therefore, is a collection of applications that accept a series of tasks derived from the user-supplied queries, among them:

Image Processing related tasks:
• Image classification (IMC)

Speech Processing related tasks:
• Automatic speech recognition (ASR)

Natural Language Processing related tasks:
• Part-of-speech tagging (POS)

The applications then call into the DNN web service, forwarding it the request; the service takes it from there, processes the request and replies in natural language. The flexibility of the system lies in being able to mix and match those services in order to develop pipelined applications. For example, you could combine the ASR and QA services, or all three of ASR+IMM+QA, to pull off something like taking a picture of a restaurant and asking Lucida "What time does this restaurant close?", to which Lucida promptly replies "at 8 o'clock" (see the Python sketch further below).

You can easily see where this leads: wearable or mobile devices having more intimate relationships with their owners, knowing their secrets, habits and belongings, so that they're capable of answering not just general questions of the "Where is the nearest tube station?" kind, but personal ones too, like "How many pounds does my roof rack hold?", per the promotional video, or the quintessential and potentially life-saving "When is my wife's birthday?" question (pun intended). Of course, privacy and security remain grand issues pertaining to all IoT devices, issues that to this day are not fully resolved (although Bitcoin's Blockchain infrastructure looks like it holds the key, but that's a subject for another time).

Starting today, Lucida is offered to the world as a Cloud platform, just like the IBM Watson Developer Cloud and Hewlett Packard's Haven OnDemand, courtesy of the University of Michigan and Clinc, a company specifically set up for this purpose.
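Returning to the mix-and-match pipeline mentioned above, here is a minimal Python sketch of how an ASR+IMM+QA combination could be composed. The helper names and return values are hypothetical stand-ins, not Lucida's real service interfaces; they only illustrate how the image-matching result grounds the spoken question before the QA backend is consulted.

```python
# Hypothetical sketch of an ASR + IMM + QA pipeline; the function names and
# canned results are illustrative only, not Lucida's actual APIs.

def asr_service(audio: bytes) -> str:
    """Pretend speech recognizer: audio in, text question out."""
    return "what time does this restaurant close"


def imm_service(image: bytes) -> str:
    """Pretend image matcher: photo in, name of the matched entity out."""
    return "Luigi's Trattoria"  # the restaurant the photo was matched to


def qa_service(question: str) -> str:
    """Pretend question-answering backend (knowledge-base lookup)."""
    return "at 8 o'clock"


def pipelined_query(audio: bytes, image: bytes) -> str:
    """Combine the three services: the IMM result resolves the pronoun
    in the spoken question before it reaches the QA backend."""
    question = asr_service(audio)
    entity = imm_service(image)
    grounded = question.replace("this restaurant", entity)
    return qa_service(grounded)


print(pipelined_query(b"...speech...", b"...photo..."))
# -> "at 8 o'clock"
```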
The idea here is to expose APIs to the DNN backend that will empower anyone to create intelligent Personal Assistant applications, a move that underlines an emerging trend of our times: Machine Learning and AI have reached the status of a tradeable commodity. Competition amongst the major stakeholders looks nothing but fierce...
Related Articles

Achieving Autonomous AI Is Closer Than We Think

Artificial Intelligence in Pokémons' Service

Haven OnDemand Offers Machine Learning As A Service

OpenFace - Face Recognition For All