Amazon Echo is the most popular Internet of Things (IoT) device. Whether you call it a smart microphone, virtual digital assistant, home robot, voice controller, or R2-D2, voice-based products of this kind are rapidly emerging.

Alongside the Echo, the latest second-generation Echo Dot has had its price cut from $89.99 to $49.99 and will be available in the US this month.

Amazon Echo (and its successor, the Dot) has opened up a new market in which device vendors compete on audio quality: better voice capture, higher microphone resolution, more advanced background noise filtering, better sound field detection, more stable connectivity, and more.

Amazon Echo (left) and its second generation product, Dot

Companies like XMOS, although their own chips are not used in Echo, are also targeting this new voice interface market. Paul Neil, vice president of marketing and business development at XMOS, said, "The Internet of Things is now a fast-moving feast. To control IoT devices, voice is the most natural user interface."

Neil said that because it combines "conventional microcontroller (MCU) performance, embedded DSP, and flexible I/O, our technology is an ideal choice for the voice interface."

However, the hardware battle is only part of the overall smart microphone/speaker market. Paul Erickson, senior analyst for connected home at IHS Markit, emphasized that "the real competitive variables come from the cloud."

In pursuit of a "smarter" smart microphone (one capable of handling complex queries and arbitrary questions), competition in cloud services is becoming increasingly intense. Google is expected to enter the market by the end of this year with Google Home and Google Assistant (a new version of Google Now). Erickson said, "And rumor has it that Apple is likely to enter the field in 2017."

Another reason the Amazon Echo is so popular as an IoT device is that it delivers on one of the key promises of IoT: it is future-proof.

Skip Ashton, vice president of software at Silicon Labs, explained that future-proofing means "ensuring that the device can continue to add more features over time." For example, Alexa offered 70 voice services for Echo at launch; that number has now grown to more than 1,700.

Echo can answer questions, read the news, report sports scores, control lighting, order products from Amazon.com, and set alarms. Users can also use the device to call an Uber or order a pizza.

“Echo is currently updated once every two weeks via the cloud,” Ashton said. “Amazon sends an email to Echo users on Fridays announcing new features,” and Echo users “have an expectation of continued product enhancements.”

Local intelligence

Tom Hackenberg, principal analyst for embedded processors at IHS Markit, explained why Amazon Echo has had a major impact on the electronics industry: smart microphone/smart speaker applications "are of great value to processor vendors."

The key to this device is not just that it provides "local intelligence." Voice interfaces are being adopted across a wide range of market segments. The digital assistant market is not only emerging; it is the consumer electronics counterpart of the smartphone app, and the speaker is not its only form factor.

For example, he explained, "Home automation centers and digital assistants can be built into TVs, set-top boxes (STBs), HVAC/environmental control hubs, etc. In addition, there are a large number of applications in in-vehicle infotainment, where the hands-free advantage is especially important."

Tearing down Echo and Echo Dot

After comparing teardowns of Echo and Echo Dot, Hackenberg said, "Apart from the memory supplier, I found that the processing elements of Echo and Echo Dot are not significantly different."

Teardown of Echo and Echo Dot (Source: iFixit)

According to the iFixit teardown, Amazon Echo uses:

Samsung K4X2G323PD-8GD8 256MB LPDDR1 RAM (volatile memory)

SanDisk SDIN7DP2-4G 4GB iNAND Ultra Flash (non-volatile storage)

The new version of the Dot uses:

Micron MT46H64M32LFBQ 256MB (16Meg x 32 x 4Banks) LPDDR SDRAM (volatile memory)

Samsung KLM4G1FEPD 4GB high performance eMMC NAND Flash (non-volatile storage)
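
The organization string in the Micron part description can be checked against the stated 256MB capacity with a little arithmetic. The following is only a sketch: the function name is made up for illustration, and it assumes the usual datasheet convention that "Meg" means 2^20 addresses per bank and that "x 32" is the data width in bits.

```python
# Sanity check of the "16Meg x 32 x 4Banks" organization (assumed convention:
# Meg = 2**20 addresses per bank, 32 = data width in bits, 4 = number of banks).
MEG = 2 ** 20

def dram_capacity_bytes(meg_addresses_per_bank, width_bits, banks):
    """Return the total capacity in bytes for a DRAM organization."""
    total_bits = meg_addresses_per_bank * MEG * width_bits * banks
    return total_bits // 8

capacity = dram_capacity_bytes(16, 32, 4)
print(capacity // MEG, "MB")  # 2 Gbit in total, i.e. 256 MB
```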

The processors used in both products are identical. At the core of both Echo and Dot is the Texas Instruments (TI) DM3725 media processor; in addition, a Qualcomm Atheros QCA6234 application-specific standard product (ASSP) provides connectivity.

Hackenberg explained that although memory may have a slight impact on performance, memory pricing is volatile. It is therefore not unusual for products across the Echo line to change such components over their life cycle.

In contrast, "connection modules, and especially media processors, are more complex and generally won't change unless there is a major product update," he said.

Hackenberg pointed out that the Atheros chip is an ASSP dedicated to connectivity. Its design is based on Tensilica's customizable Xtensa core, and "it only does one thing - coordinate communication with the network."

Erickson added, "Connectivity is critical because it governs the speed and reliability with which data can be captured, transmitted, and received. Because of that speed and responsiveness, it directly affects how 'immediate' interaction with the speaker feels. Therefore, improvements in Wi-Fi throughput, quality of service (QoS), and range will help."

All "local" smart functions are handled by the TI DM3725. Hackenberg pointed out, "This is a system-on-a-chip (SoC) designed for a variety of multimedia applications such as STBs, TVs, monitors, video game systems, etc."

The DM3725 is based on the ARM Cortex A8 and integrates TI's C64x+ DSP and a 3D graphics acceleration engine. "The Cortex A8 is a mature and cost-effective application processor, and it is sufficient for performing simple tasks locally," Hackenberg said.

However, "If the application becomes more complex than just a speaker, it may change."


Amazon Echo Dot Motherboard (Source: iFixit)

Integrated DSP

According to Hackenberg, the key to this SoC is the integration of the DSP, and possibly even the GPU.

"In a typical design, there are multiple input sensors (mainly microphones). The entire audio input is first heavily filtered by the DSP, allowing the system to quickly tell the difference between the user's speech and environmental noise," he said.

"It can even determine the talker's location relative to the device, or even who the talker is; it also produces a pattern that can then be matched (usually after being sent to the cloud)," he added.

But what does the GPU do?

Hackenberg believes that "for local intelligence, GPUs can be used for simpler, faster, and more efficient local pattern matching."

This allows the device to still respond to stored control modes such as "lower volume", "switch channels" or other simple controls without the need for a network connection, he explained. "Next, the application core executes the application based on the response required, the input or control required to start/stop, and the content that must be displayed."
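
The split between locally handled commands and cloud queries can be illustrated with a very small sketch. It is not how Echo works internally: it assumes the phrase has already been recognized as text, and the command table, action names, and `handle_phrase` function are all hypothetical.

```python
# Hypothetical table of commands the device can handle without a network.
LOCAL_COMMANDS = {
    "lower volume": "VOLUME_DOWN",
    "switch channels": "CHANNEL_NEXT",
}

def handle_phrase(phrase, network_available):
    """Route a recognized phrase to local handling or to the cloud."""
    key = " ".join(phrase.lower().split())  # normalize case and spacing
    if key in LOCAL_COMMANDS:
        return ("local", LOCAL_COMMANDS[key])
    if network_available:
        return ("cloud", key)  # anything else needs the cloud service
    return ("unavailable", key)

print(handle_phrase("Lower volume", network_available=False))
print(handle_phrase("What is the weather?", network_available=False))
```

The point of the design is that the stored commands keep working when the network is down, while open-ended queries do not.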

Microphone array

Part of the appeal of Amazon Echo and Dot is the 7-microphone array. Amazon claims that Echo and Dot use "multiple microphones and beamforming technology," so they "can hear your voice throughout the room, even while music is playing." The company also says that Echo is a professionally tuned speaker that fills the entire room with 360° immersive sound.

According to Marwan Boustany, senior analyst for MEMS and sensors at IHS Markit, Echo uses MEMS microphones from Knowles.

Dot uses a 7-microphone array
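
Amazon does not describe its beamforming algorithm, so the following is only a sketch of the simplest textbook variant, delay-and-sum: each microphone channel is shifted by the delay expected for a chosen direction and the channels are averaged, so sound from that direction adds up while other sound partly cancels. The 3-microphone setup, the 8 kHz sample rate, and the integer sample delays are all assumptions made for illustration.

```python
import math

def delay_and_sum(channels, delays):
    """Shift each channel by its integer sample delay, then average them."""
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]

def energy(signal):
    """Mean squared amplitude of a signal."""
    return sum(x * x for x in signal) / len(signal)

# Assumed test scene: a 1 kHz tone reaches three microphones
# 0, 2, and 4 samples late (sample rate 8 kHz).
RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * n / RATE) for n in range(400)]
true_delays = [0, 2, 4]
channels = [[0.0] * d + tone for d in true_delays]

steered = delay_and_sum(channels, true_delays)  # aimed at the talker
unsteered = delay_and_sum(channels, [0, 0, 0])  # no alignment at all

print(energy(steered) > energy(unsteered))  # aligned channels add coherently
```

Real far-field products use fractional delays, adaptive weights, and echo cancellation on top of this basic idea.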

Boustany pointed out that improving microphone signal-to-noise ratio (SNR), matching between microphones, and frequency response will help far-field audio capture and, in turn, speech recognition.

But in the end, "the algorithm is the real key to achieving better speech recognition," he said. As for the "intelligence," the cloud will likely remain critical, while local processing can improve for simple or predefined phrases (such as "Hey Siri").
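
For readers unfamiliar with the metric, SNR is usually quoted in decibels as ten times the base-10 logarithm of the signal-to-noise power ratio. The sketch below uses made-up sample values, and the array-gain helper assumes an idealized case (perfectly aligned speech, uncorrelated noise of equal power on every microphone) that real arrays only approximate.

```python
import math

def snr_db(signal, noise):
    """SNR in dB from sample lists of the signal and of the noise alone."""
    p_signal = sum(x * x for x in signal) / len(signal)
    p_noise = sum(x * x for x in noise) / len(noise)
    return 10 * math.log10(p_signal / p_noise)

def ideal_array_gain_db(num_mics):
    """Best-case SNR gain from averaging num_mics aligned microphones."""
    return 10 * math.log10(num_mics)

# Illustrative values: noise amplitude is one tenth of the signal amplitude.
signal = [1.0, -1.0] * 100
noise = [0.1, -0.1] * 100

print(round(snr_db(signal, noise), 2))      # about 20 dB
print(round(ideal_array_gain_db(7), 2))     # about 8.45 dB for 7 microphones
```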

He cited Cypheras as an example. "This type of software vendor will benefit voice recognition in smart home systems such as Alexa."

Amazon Echo Dot (Source: iFixit)

Increasing competition

On the supplier side, several vendors currently offer microcontrollers (MCUs) and ASSPs for connectivity that could compete in this area, including Apple, Broadcom, Cypress, Microchip, NXP, Renesas, STMicroelectronics, and Silicon Labs. Boustany said, "The combination of 802.11n and BT 4.0 is not common, but some designs may use Bluetooth only for lower cost solutions."

Media processors are trickier. Although many mobile application processor vendors have suitable parts, those parts are costly for simple applications. According to Hackenberg, vendors may also choose not to provide comparable DSP or pattern-matching capabilities.

"I might consider Apple Ax, Broadcom BCM7xxxx, Hisilicon Hi3xxx, NXP i.MX, MediaTek MT8xxx, STMicroelectronics STIHxxx, Qualcomm Snapdragon, etc. Of course, TI may have the best DSP support for the cost (which is important for voice recognition), but other suppliers are constantly closing the gap."

XMOS believes the company will gain momentum in this market. For voice assistant products such as Echo, the key to improving performance is far-field voice capture, beamforming and processing speed. Neil believes that "with a large amount of processing power and embedded DSP, our XMOS single-chip components provide a scalable and differentiated solution."

XMOS xCore voice interface case (Source: XMOS)

Compiled by: Susan Hong

(Reference: Amazon Echo & How It Resonates, by Junko Yoshida)
