Posted by Billy Rutledge, Director, AIY Projects
Makers are hands-on when it comes to making change. We're explorers, hackers and problem solvers who build devices, ecosystems, and art (sometimes a combination of the three) on the basis of our own (often unconventional) ideas. So when my team first set out to empower makers of all types and ages with the AI technology we've honed at Google, we knew whatever we built had to be open and accessible. We steered clear of limitations that come from platform and software stack requirements, high cost, and complex setup, and fixed our focus on the curiosity and inventiveness that inspire makers around the world.
When we launched our Voice Kit with help from our partner Raspberry Pi in May and sold out globally in just a few hours, we got the message loud and clear. There is a genuine demand among do-it-yourselfers for artificial intelligence that makes human-to-machine interaction more like natural human interaction.
Last week we announced the Speech Commands Dataset, a collaboration between the TensorFlow and AIY teams. The dataset has 65,000 one-second-long utterances of 30 short words, contributed by thousands of different people through the AIY website, and it lets you build simple voice interfaces for applications. We're currently in the process of integrating the dataset with the next release of the Voice Kit, so that makers can build devices that respond to simple voice commands without the press of a button or an internet connection.
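If you'd like to poke at the dataset before that release, here's a minimal Python sketch of inspecting a locally extracted copy. The DATASET_DIR path is a hypothetical location; the published archive lays out one folder per spoken word, each holding one-second WAV clips.

import os

# A minimal sketch of exploring the Speech Commands Dataset after downloading and
# extracting it locally. DATASET_DIR is a hypothetical path.
DATASET_DIR = "speech_commands"

for word in sorted(os.listdir(DATASET_DIR)):
    word_dir = os.path.join(DATASET_DIR, word)
    if not os.path.isdir(word_dir):
        continue
    # Count the one-second clips recorded for this word.
    clips = [f for f in os.listdir(word_dir) if f.endswith(".wav")]
    print("%s: %d clips" % (word, len(clips)))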
Today, you can pre-order your Voice Kit, which will be available for purchase in stores and online through Micro Center.
Or you may have to resort to the hack that maker Shivasiddarth created when the Voice Kit (bundled with MagPi #57) sold out in May, and then again (within 17 minutes) earlier this month.
Martin Mander created a retro-inspired intercom that he calls 1986 Google Pi Intercom. He describes it as "a wall-mounted Google voice assistant using a Raspberry Pi 3 and the Google AIY (Artificial Intelligence Yourself) [voice] kit." He used a mid-80s intercom that he bought on sale for £4. It cleaned up well!
Get the full story from Martin and see what Slashgear had to say about the project.
(This one's for Doctor Who fans) Tom Minnich created a Dalek-voiced assistant.
He offers a tutorial on how you can modify the Voice Kit to do something similar — perhaps create a Drogon-voiced assistant?
Victor Van Hee used the Voice Kit to create a voice-activated internet streaming radio that can play other types of audio files as well. He provides instructions, so you can do the same.
The Voice Kit is currently available in the U.S. We'll be expanding globally by the end of this year. Stay tuned here, where we'll share the latest updates. The strong demand for the Voice Kit drives us to keep the momentum going on AIY Projects.
What we build next will include vision and motion detection and will go hand in hand with our existing Voice Kit. AIY Project kits will soon offer makers the "eyes," "ears," "voice" and sense of "balance" to allow simple yet powerful device interfaces.
We'd love to bake your input into our next releases. Go to hackster.io or leave a comment to start up a conversation with us. Show us and the maker community what you're working on by using hashtag #AIYprojects on social media.
Automatic speech recognition (ASR) has seen widespread adoption due to the recent proliferation of virtual personal assistants and advances in word recognition accuracy from the application of deep learning algorithms. Many speech recognition teams rely on Kaldi, a popular open-source speech recognition toolkit. We're announcing today that Kaldi now offers TensorFlow integration.
With this integration, speech recognition researchers and developers using Kaldi will be able to use TensorFlow to explore and deploy deep learning models in their Kaldi speech recognition pipelines. This will allow the Kaldi community to build even better and more powerful ASR systems, as well as provide TensorFlow users with a path to explore ASR while drawing upon the experience of the large community of Kaldi developers.
Building an ASR system that can understand human speech in every language, accent, environment, and type of conversation is an extremely complex undertaking. A traditional ASR system can be seen as a processing pipeline with many separate modules, where each module operates on the output from the previous one. Raw audio data enters the pipeline at one end and a transcription of recognized speech emerges from the other. In the case of Kaldi, these ASR transcriptions are post-processed in a variety of ways to support an increasing array of end-user applications.
Yishay Carmiel and Hainan Xu of Seattle-based IntelligentWire, who led the development of the integration between Kaldi and TensorFlow with support from the two teams, know this complexity first-hand. Their company has developed cloud software to bridge the gap between live phone conversations and business applications. Their goal is to let businesses analyze and act on the contents of the thousands of conversations their representatives have with customers in real-time and automatically handle tasks like data entry or responding to requests. IntelligentWire is currently focused on the contact center market, in which more than 22 million agents throughout the world spend 50 billion hours a year on the phone and about 25 billion hours interfacing with and operating various business applications.
For an ASR system to be useful in this context, it must not only deliver an accurate transcription but do so with very low latency in a way that can be scaled to support many thousands of concurrent conversations efficiently. In situations like this, recent advances in deep learning can help push technical limits, and TensorFlow can be very useful.
In the last few years, deep neural networks have been used to replace many existing ASR modules, resulting in significant gains in word recognition accuracy. These deep learning models typically require processing vast amounts of data at scale, which TensorFlow simplifies. However, several major challenges must still be overcome when developing production-grade ASR systems.
One of the ASR system modules that exemplifies these challenges is the language model. Language models are a key part of most state-of-the-art ASR systems; they provide linguistic context that helps predict the proper sequence of words and distinguish between words that sound similar. With recent machine learning breakthroughs, speech recognition developers are now using language models based on deep learning, known as neural language models. In particular, recurrent neural language models have shown superior results over classic statistical approaches.
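To make this concrete, here is a minimal sketch of a recurrent neural language model written against TensorFlow's 1.x-era API; the vocabulary, embedding, and hidden-state sizes are placeholder values, not settings from Kaldi or IntelligentWire.

import tensorflow as tf

# A minimal sketch of a recurrent neural language model (TensorFlow 1.x-style API).
# Sizes below are illustrative, not tuned values.
VOCAB_SIZE, EMBED_SIZE, HIDDEN_SIZE = 10000, 128, 256

word_ids = tf.placeholder(tf.int32, [None, None])   # [batch, time] input word indices
targets = tf.placeholder(tf.int32, [None, None])    # next-word targets, shifted by one step

embeddings = tf.get_variable("embeddings", [VOCAB_SIZE, EMBED_SIZE])
inputs = tf.nn.embedding_lookup(embeddings, word_ids)

cell = tf.nn.rnn_cell.LSTMCell(HIDDEN_SIZE)
outputs, _ = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)

logits = tf.layers.dense(outputs, VOCAB_SIZE)        # per-step scores over the vocabulary
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=targets, logits=logits))
train_op = tf.train.AdamOptimizer().minimize(loss)

In a speech recognition pipeline, a model along these lines is typically used to score or rescore candidate word sequences produced by the earlier stages of the pipeline.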
However, the training and deployment of neural language models is complicated and highly time-consuming. For IntelligentWire, the integration of TensorFlow into Kaldi has reduced the ASR development cycle by an order of magnitude. If a language model already exists in TensorFlow, then going from model to proof of concept can take days rather than weeks; for new models, the development time can be reduced from months to weeks. Deploying new TensorFlow models into production Kaldi pipelines is straightforward as well, providing big gains for anyone working directly with Kaldi as well as the promise of more intelligent ASR systems for everyone in the future.
Similarly, this integration provides TensorFlow developers with easy access to a robust ASR platform and the ability to incorporate existing speech processing pipelines, such as Kaldi's powerful acoustic model, into their machine learning applications. Kaldi modules that feed the training of a TensorFlow deep learning model can be swapped cleanly, facilitating exploration, and the same pipeline that is used in production can be reused to evaluate the quality of the model.
We hope this Kaldi-TensorFlow integration will bring these two vibrant open-source communities closer together and support a wide variety of new speech-based products and related research breakthroughs. To get started using Kaldi with TensorFlow, please check out the Kaldi repo and take a look at an example of a Kaldi setup running with TensorFlow.
Whether it's opening night for a Broadway musical or launch day for your app, both are thrilling times for everyone involved. Our agency, Posse, collaborated with Hamilton to design, build, and launch the official Hamilton app... in only three short months.
We decided to use Firebase, Google's mobile development platform, for our backend and infrastructure, while we used Flutter, a new UI toolkit for iOS and Android, for our front-end. In this post, we share how we did it.
We love to spend time designing beautiful UIs, testing new interactions, and iterating with clients, and we don't want to be distracted by setting up and maintaining servers. To stay focused on the app and our users, we implemented a full serverless architecture and made heavy use of Firebase.
A key feature of the app is the ticket lottery, which offers fans a chance to get tickets to the constantly sold-out Hamilton show. We used Cloud Functions for Firebase, and a data flow architecture we learned about at Google I/O, to coordinate the lottery workflow between the mobile app, custom business logic, and partner services.
For example, when someone enters the lottery, the app first writes data to specific nodes in Realtime Database and the database's security rules help to ensure that the data is valid. The write triggers a Cloud Function, which runs business logic and stores its result to a new node in the Realtime Database. The newly written result data is then pushed automatically to the app.
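Here's a rough sketch of that request/response pattern using the Firebase Admin SDK for Python as a stand-in for both the mobile client and the Cloud Function; the node names, payload fields, and credentials path are hypothetical and not taken from the Hamilton app.

import firebase_admin
from firebase_admin import credentials, db

# Hypothetical service account and database URL; substitute your own project's values.
cred = credentials.Certificate("service-account.json")
firebase_admin.initialize_app(cred, {"databaseURL": "https://your-project.firebaseio.com"})

# 1) The client writes a lottery entry; security rules on this node validate the payload.
entry = db.reference("lottery/entries").push({"userId": "abc123", "show": "2017-09-01"})

# 2) A Cloud Function triggered by that write runs the business logic and stores its
#    result under a separate node, keyed by the same entry id.
db.reference("lottery/results/%s" % entry.key).set({"status": "pending"})

# 3) The app listens on lottery/results/<entry id>, so the result is pushed to it
#    automatically as soon as it is written.
print(db.reference("lottery/results/%s" % entry.key).get())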
Because of Hamilton's intense fan following, we wanted to make sure that app users could get news the instant it was published. So we built a custom, web-based Content Management System (CMS) for the Hamilton team that used Firebase Realtime Database to store and retrieve data. The Realtime Database eliminated the need for a "pull to refresh" feature in the app. When new content is published via the CMS, the update is stored in Firebase Realtime Database and every app user automatically sees the update. No refresh, reload, or pull required!
Besides powering our lottery integration, Cloud Functions was also extremely valuable for creating user profiles, sending push notifications, and powering our #HamCam — a custom Hamilton selfie and photo-taking experience. Cloud Functions resized the images, saved them in Cloud Storage, and then updated the database. By taking care of the infrastructure work of storing and managing the photos, Firebase freed us up to focus on making the camera fun and full of Hamilton style.
With only three months to design and deliver the app, we knew we needed to iterate quickly on the UX and UI. Flutter's hot reload development cycle meant we could make a change in our UI code and, in about a second, see the change reflected on our simulators and phones. No rebuilding, recompiling, or multi-second pauses required! Even the state of the app was preserved between hot reloads, making it very fast for us to iterate on the UI with our designers.
We used Flutter's reactive UI framework to implement Hamilton's iconic brand with custom UI elements. Flutter's "everything is a widget" approach made it easy for us to compose custom UIs from a rich set of building blocks provided by the framework. And, because Flutter runs on both iOS and Android, we were able to spend our time creating beautiful designs instead of porting the UI.
The FlutterFire project helped us access Firebase Analytics, Firebase Authentication, and Realtime Database from the app code. And because Flutter is open source, and easy to extend, we even built a custom router library that helped us organize the app's UI code.
We enjoyed building the Hamilton app (find it on the Play Store or the App Store) in a way that allowed us to focus on our users and experiment with new app ideas and experiences. And based on our experience, we'd happily recommend serverless architectures with Firebase and customized UI designs with Flutter as powerful ways for you to save time building your app.
We already have plans to continue developing the Hamilton app in new ways, and we can't wait to release them soon!
If you want to learn more about Firebase or Flutter, we recommend the Firebase docs, the Firebase channel on YouTube, and the Flutter website.
We've come a long way since our initial open source release in February 2016 of TensorFlow Serving, a high-performance serving system for machine learned models, designed for production environments. Today, we are happy to announce the release of TensorFlow Serving 1.0. Version 1.0 is built from TensorFlow head, and our future versions will be minor-version aligned with TensorFlow releases.
For a good overview of the system, watch Noah Fiedel's talk given at Google I/O 2017.
When we first announced the project, it was a set of libraries providing the core functionality to manage a model's lifecycle and serve inference requests. We later introduced a gRPC Model Server binary with a Predict API and an example of how to deploy it on Kubernetes. Since then, we've worked hard to expand its functionality to fit different use cases and to stabilize the API to meet the needs of users. Today there are over 800 projects within Google using TensorFlow Serving in production. We've battle tested the server and the API and have converged on a stable, robust, high-performance implementation.
We've listened to the open source community and are excited to offer a prebuilt binary available through apt-get install. Now, to get started using TensorFlow Serving, you can simply install and run without needing to spend time compiling. As always, a Docker container can still be used to install the server binary on non-Linux systems.
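Once the server binary is installed and serving a model, a Predict request from Python looks roughly like the sketch below; the host, port, model name "my_model", and input tensor name "x" are assumptions to replace with the values your exported model actually uses.

import tensorflow as tf
from grpc.beta import implementations
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2

# A rough sketch of querying a running TensorFlow Model Server over its gRPC Predict API.
channel = implementations.insecure_channel("localhost", 9000)
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = "my_model"          # hypothetical model name
request.inputs["x"].CopyFrom(                 # "x" is a hypothetical input tensor name
    tf.contrib.util.make_tensor_proto([[1.0, 2.0, 3.0]], dtype=tf.float32))

result = stub.Predict(request, 10.0)          # 10-second timeout
print(result)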
With this release, TensorFlow Serving is also officially deprecating and stopping support for the legacy SessionBundle model format. SavedModel, the model format introduced as part of TensorFlow 1.0, is now the officially supported format.
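For reference, here is a minimal sketch of exporting a graph in the SavedModel format that TensorFlow Serving expects; the export path, tensor names, and tiny linear model are purely illustrative.

import tensorflow as tf

# Serving loads models from a base path containing numeric version subdirectories.
export_dir = "/tmp/my_model/1"

x = tf.placeholder(tf.float32, [None, 3], name="x")
w = tf.get_variable("w", [3, 1])
y = tf.matmul(x, w, name="y")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
    signature = tf.saved_model.signature_def_utils.predict_signature_def(
        inputs={"x": x}, outputs={"y": y})
    builder.add_meta_graph_and_variables(
        sess,
        [tf.saved_model.tag_constants.SERVING],
        signature_def_map={
            tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature
        })
    builder.save()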
To get started, please check out the documentation for the project and our tutorial. Enjoy TensorFlow Serving 1.0!
Starting today, we're making all your apps built for the Google Assistant available to our en-GB users across Google Home (recently launched in the UK), select Android phones and the iPhone.
While your apps will appear in the local directory automatically this week, there are a couple of things you should do to make them truly local.
Apps like Akinator, Blinkist Minute and SongPop have already optimized their experience for en-GB Assistant users—and we can't wait to see who dives in next!
And for those of you who are excited about the ability to reach en-GB Google Assistant users, now is the perfect time to start building. Our developer tools, documentation and simulator have all been updated to make it easy for you to create, test and deploy your first app.
We'll continue to make the Actions on Google platform available in more languages over the coming year. If you have questions about internationalization, please reach out to us on Stack Overflow and Google+.
Cheerio!
The mission of Google Developers Launchpad is to enable startups from around the world to build great companies. In the last four years, we've learned a lot while supporting early- and late-stage founders. From working with dynamic startups (such as teams applying Artificial Intelligence technology to solving transportation problems in Israel, improving tele-medicine in Brazil, and optimizing online retail in India), we've learned that these startups require specialized services to help them scale.
So today, we're launching a new initiative, Google Developers Launchpad Studio: a full-service studio that provides tailored technical and product support to Artificial Intelligence & Machine Learning startups, all in one place.
Whether you're a 3-person team or an established post-Series B startup applying AI/ML to your product offering, we want to start connecting with you.
The global headquarters of Launchpad Studio will be based in San Francisco at Launchpad Space, with events and activities taking place in Tel Aviv and New York. We plan to expand our activities and events to Toronto, London, Bangalore, and Singapore soon.
As a member of the Studio program, you'll find services tailored to your startup's unique needs and challenges.
We're looking forward to working closely with you in the AI & Machine Learning space, soon!
"Innovation is open to everyone, worldwide. With this global program we now have an important opportunity to support entrepreneurs everywhere in the world who are aiming to use AI to solve the biggest challenges." Yossi Matias, VP of Engineering, Google
We're constantly working to secure our users and their data. Earlier this year, we detailed some of our latest anti-phishing tools and rolled out developer-focused updates to our app publishing processes, risk assessment systems, and user-facing consent pages. Most recently, we introduced OAuth apps whitelisting in G Suite to enable admins to choose exactly which third-party apps can access user data.
Over the past few months, we've required that some new web applications go through a verification process prior to launch based upon a dynamic risk assessment.
Today, we're expanding upon that foundation, and introducing additional protections: bolder warnings to inform users about newly created web apps and Apps Scripts that are pending verification. Additionally, the changes we're making will improve the developer experience. In the coming months, we will begin expanding the verification process and the new warnings to existing apps as well.
Beginning today, we're rolling out an "unverified app" screen for newly created web applications and Apps Scripts that require verification. This new screen replaces the "error" page that developers and users of unverified web apps receive today.
The "unverified app" screen precedes the permissions consent screen for the app and lets potential users know that the app has yet to be verified. This will help reduce the risk of user data being phished by bad actors.
This new notice will also help developers test their apps more easily. Since users can choose to acknowledge the 'unverified app' alert, developers can now test their applications without having to go through the OAuth client verification process first (see our earlier post for details).
Developers can follow the steps laid out in this help center article to begin the verification process, remove the interstitial, and prepare their app for launch.
We're also extending these same protections to Apps Script. Beginning this week, new Apps Scripts requesting OAuth access to data from consumers or from users in other domains may also see the "unverified app" screen. For more information about how these changes affect Apps Script developers and users, see the verification documentation page.
Apps Script is proactively protecting users from abusive apps in other ways as well. Users will see new cautionary language reminding them to "consider whether you trust" an application before granting OAuth access, as well as a banner identifying web pages and forms created by other users.
In the coming months, we will continue to enhance user protections by extending the verification process beyond newly created apps, to existing apps as well. As a part of this expansion, developers of some current apps may be required to go through the verification flow.
To help ensure a smooth transition, we recommend developers verify that their contact information is up-to-date. In the Google Cloud Console, developers should ensure that the appropriate and monitored accounts are granted either the project owner or billing account admin IAM role. For help with granting IAM roles, see this help center article.
In the API manager, developers should ensure that their OAuth consent screen configuration is accurate and up-to-date. For help with configuring the consent screen, see this help center article.
We're committed to fostering a healthy ecosystem for both users and developers. These new notices will automatically inform users if they may be at risk, enabling them to make informed decisions to keep their information safe, and will make it easier for developers to test and develop apps.
I'm happy to share that we've opened registration for the European installment of our global event series — Google Developer Days (GDD). Google Developer Days showcase our latest developer product and platform updates to help you develop high-quality apps, grow & retain an active user base, and tap into tools to earn more.
Google Developer Days — Europe (GDD Europe) will take place on September 5-6, 2017, in Krakow, Poland. We'll feature technical talks on a range of products including Android, the Mobile Web, Firebase, Cloud, Machine Learning, and IoT. In addition, we'll offer opportunities for you to join hands-on training sessions, and 1:1 time with Googlers and members of our Google Developers Experts community. We're looking forward to meeting you face-to-face so we can better understand your needs and improve our offerings for you.
If you're interested in joining us at GDD Europe, registration is now open.
Can't make it to Krakow? We've got you covered. All talks will be livestreamed on the Google Developers YouTube channel, and session recordings will be available there after the event. Looking to tune into the action with developers in your own neighborhood? Consider joining a GDD Extended event or organizing one for your local developer community.
Whether you're planning to join us in-person or remotely, stay up-to-date on the latest announcements using #GDDEurope on Twitter, Facebook, and Google+.
We're looking forward to seeing you in Europe soon!
You might be using the Google Calendar API, or alternatively email markup, to insert events into your users' calendars. Thankfully, these tools allow your apps to do this seamlessly and automatically, which saves your users a lot of time. But what happens if plans change? You need your apps to also be able to modify an event.
While email markup does support this update, it's limited in what it can do, so in today's video, we'll show you how to modify events with the Calendar API. We'll also show you how to create repeating events. Check it out:
Imagine a potential customer being interested in your product, so you set up one or two meetings with them. As their interest grows, they request regularly-scheduled syncs as your product makes their short list—your CRM should be able to make these adjustments in your calendar without much work on your part. Similarly, a "dinner with friends" event can go from a "rain check" to a bi-monthly dining experience with friends you've grown closer to. Both of these events can be updated with a JSON request payload like what you see below to adjust the date and make it repeating:
var TIMEZONE = "America/Los_Angeles";
var EVENT = {
  "start": {"dateTime": "2017-07-01T19:00:00", "timeZone": TIMEZONE},
  "end": {"dateTime": "2017-07-01T22:00:00", "timeZone": TIMEZONE},
  "recurrence": ["RRULE:FREQ=MONTHLY;INTERVAL=2;UNTIL=20171231"]
};
This event can then be updated with a single call to the Calendar API's events().patch() method, which in Python would look like the following given the request data above, GCAL as the API service endpoint, and a valid EVENT_ID to update:
GCAL.events().patch(calendarId='primary', eventId=EVENT_ID, sendNotifications=True, body=EVENT).execute()
If you want to dive deeper into the code sample, check out this blog post. Also, if you missed it, check out this video that shows how you can insert events into Google Calendar as well as the official API documentation. Finally, if you have a Google Apps Script app, you can access Google Calendar programmatically with its Calendar service.
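And if you need to create the original event in the first place, a minimal sketch using the same GCAL service object and EVENT body shown above would look something like this:

# Create a new event; keep the returned event ID if you plan to patch the event later.
created = GCAL.events().insert(calendarId='primary', body=EVENT,
                               sendNotifications=True).execute()
print(created['id'])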
We hope you can use this information to enhance your apps to give your users an even better and timely experience.
At Area 120, Google's internal workshop for experimental ideas, we work on early-stage projects and iterate quickly to test concepts. We heard from developers that they're looking for ways to make money to fund their VR applications, so we started experimenting with what a native, mobile VR ad format might look like.
Developers and users have told us they want to avoid disruptive, hard-to-implement ad experiences in VR. So our first idea for a potential format presents a cube to users, with the option to engage with it and then see a video ad. By tapping on the cube or gazing at it for a few seconds, the cube opens a video player where the user can watch, and then easily close, the video. Here's how it works:
Our work focuses on a few key principles: VR ad formats should be easy for developers to implement, native to VR, flexible enough to customize, and useful and non-intrusive for users. Our Area 120 team has seen some encouraging results with a few test partners, and we would love to work with the developer community as this work evolves, across Cardboard (on Android and iOS), Daydream and Samsung Gear VR.
If you're a VR developer (or want to be one) and are interested in testing this format with us, please fill out this form to apply for our early access program. We have an early-stage SDK available and you can get up and running easily. We're excited to continue experimenting with this format and hope you'll join us for the ride!