Frontend News Digest 26-29.11

Pros of Strapi CMS, creating a RESTful API using Mongoose and Joi, a node-oracledb update, and more
29 November 2019

Greetings! I hope your week went great! Here's a new Frontend news digest.

Learn how to publish npm packages without meta files, check out a tutorial on Helm 3 with an Express.js microservice, see how to secure Twilio webhook URLs, and more

Guides

  • Publishing npm Packages Without Meta Files 

Learn which ‘meta’ files (config files, .npmignore, IDE files, etc.) should not make it into your npm packages

  • Guide to Helm 3 with an Express.js microservice

Helm is a package management tool for the Kubernetes ecosystem; this tutorial covers creating a chart (the package format of pre-configured apps Helm works with) for an Express.js service

  • How to secure Twilio webhook URLs in Node.js

Three ways to secure your Twilio webhook URLs, webhooks being the HTTP requests that Twilio (a popular communication API provider for SMS, voice, and video) makes to find out how your app should react to an incoming event; a minimal validation sketch follows this list
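
For the Twilio item above, here is a minimal sketch (not taken from the article) of one common approach: checking the X-Twilio-Signature header with the twilio helper library's validateRequest. The /sms route, public URL, port, and environment variable name are assumptions for illustration.

```js
// Minimal Express sketch: reject webhook requests whose Twilio signature doesn't match.
// Route, public URL, and TWILIO_AUTH_TOKEN env var are made up for the example.
const express = require('express');
const twilio = require('twilio');

const app = express();
app.use(express.urlencoded({ extended: false })); // Twilio posts form-encoded bodies

app.post('/sms', (req, res) => {
  const signature = req.header('X-Twilio-Signature');
  const url = 'https://example.com/sms'; // must match the exact public URL Twilio calls
  const valid = twilio.validateRequest(process.env.TWILIO_AUTH_TOKEN, signature, url, req.body);

  if (!valid) {
    return res.status(403).send('Invalid Twilio signature');
  }
  res.type('text/xml').send('<Response><Message>Thanks!</Message></Response>');
});

app.listen(3000);
```

The twilio package also ships an Express middleware (twilio.webhook()) that performs the same check, which may be more convenient than validating by hand.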

Articles

  • 5 Things I love about Strapi, a Node.js headless CMS

Overview of Strapi CMS, covering its pros

Updates

  • fix-es-imports

Fixes your ES import paths

  • PostGraphile

This solution allows you to get an instant GraphQL API for your PostgreSQL database with a single command

  • node-oracledb

The Oracle Corp-supported Oracle Database driver for Node.js

  • public-ip 

Allows you to get your public IP address "very fast", as the developers say; maybe worth checking out? A short usage sketch follows this list.
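
For the public-ip entry above, here is a rough usage sketch. The v4()/v6() helpers follow the package's documented API at the time of this digest, but treat the exact method names as an assumption and check the README for the version you install.

```js
// Rough sketch of querying your public IP with the `public-ip` package.
// The v4()/v6() promise-returning helpers match the 2019-era API; verify against your version.
const publicIp = require('public-ip');

(async () => {
  console.log('Public IPv4:', await publicIp.v4());

  try {
    console.log('Public IPv6:', await publicIp.v6()); // rejects if no public IPv6 is available
  } catch (err) {
    console.log('No public IPv6 address found');
  }
})();
```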

Videos

  • Promises From Scratch In A Post-Apocalyptic Future

  • Node.js | Hapi.js & MongoDB | Create a Restful API Using Mongoose and Joi

Mozilla to Release DeepSpeech 0.6

DeepSpeech is much simpler than traditional systems and provides higher recognition quality when noise is present
09 December 2019

Mozilla has introduced the DeepSpeech 0.6 speech recognition engine, which implements the eponymous speech recognition architecture proposed by Baidu researchers. The implementation is written in Python using the TensorFlow machine learning platform and is distributed under the free MPL 2.0 license. It runs on Linux, Android, macOS, and Windows, and performance is sufficient to use the engine on LePotato, Raspberry Pi 3, and Raspberry Pi 4 boards.

The kit also includes trained models, sample audio files, and command-line recognition tools. To embed speech recognition in their programs, developers can use ready-made modules for Python, Node.js, C++, and .NET (third-party developers have separately prepared modules for Rust and Go). The finished model is provided only for English, but for other languages you can, following the attached instructions, train the system yourself using voice data collected by the Common Voice project.
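
As a rough illustration of what the Node.js bindings look like in practice, here is a sketch of transcribing a short WAV file with the deepspeech npm package. The constructor and stt() signatures reflect the 0.6-era API, and the beam width and file paths are assumptions; check the bindings' own examples for exact usage.

```js
// Rough sketch: transcribe a 16 kHz mono WAV file with the `deepspeech` npm bindings.
// Signatures follow the 0.6-era API and may differ in other releases; paths are placeholders.
const DeepSpeech = require('deepspeech');
const fs = require('fs');

const BEAM_WIDTH = 500; // decoder beam width (illustrative value)
const model = new DeepSpeech.Model('output_graph.pbmm', BEAM_WIDTH);

// Optionally attach the language model files shipped with the release:
// model.enableDecoderWithLM('lm.binary', 'trie', 0.75, 1.85);

// Naively skip the 44-byte WAV header; a real program would parse/resample the audio properly.
const audio = fs.readFileSync('audio/sample.wav').slice(44);

console.log(model.stt(audio));
```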

DeepSpeech is much simpler than traditional systems and at the same time provides higher recognition quality in the presence of extraneous noise. It does not use traditional acoustic models or the concept of phonemes; instead, it relies on a well-optimized machine learning system based on a neural network, which eliminates the need to develop separate components to model various deviations such as noise, echo, and speech peculiarities.

The flip side of this approach is that, to obtain high-quality recognition and train the neural network, the DeepSpeech engine requires a large amount of heterogeneous data dictated in real-world conditions by different voices and in the presence of natural noise. The Common Voice project created by Mozilla collects such data. It provides a verified data set with 780 hours in English, 325 in German, 173 in French, and 27 hours in Russian.

The ultimate goal of the Common Voice project is to accumulate 10 thousand hours of recordings of typical phrases of human speech in various pronunciations, which should bring recognition errors down to an acceptable level. So far, project participants have dictated a total of 4.3 thousand hours, of which 3.5 thousand have passed verification. Training the final English model for DeepSpeech used 3816 hours of speech, which, in addition to Common Voice, included data from the LibriSpeech, Fisher, and Switchboard projects, as well as about 1700 hours of transcribed radio show recordings.

With the ready-made downloadable English model, DeepSpeech's recognition error rate is 7.5% when evaluated on the LibriSpeech test set. For comparison, the human recognition error rate is estimated at 5.83%.

DeepSpeech consists of two subsystems: an acoustic model and a decoder. The acoustic model uses deep machine learning methods to calculate the probability that particular characters are present in the input sound. The decoder uses a beam search algorithm to convert the character probability data into a text representation.
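
To make the decoder idea concrete, below is a toy sketch of beam search over per-timestep character probabilities. It is a simplification for illustration only (real DeepSpeech decoding uses CTC with a blank token and a language model), and every name in it is made up.

```js
// Toy beam search: keep the `beamWidth` most probable character sequences at each timestep.
// `probs` is an array of timesteps; each timestep maps characters to their probabilities.
function beamSearch(probs, alphabet, beamWidth) {
  let beams = [{ text: '', logProb: 0 }];

  for (const step of probs) {
    const candidates = [];
    for (const beam of beams) {
      for (const ch of alphabet) {
        candidates.push({
          text: beam.text + ch,
          logProb: beam.logProb + Math.log(step[ch] ?? 1e-12),
        });
      }
    }
    // Keep only the top `beamWidth` hypotheses for the next timestep.
    candidates.sort((a, b) => b.logProb - a.logProb);
    beams = candidates.slice(0, beamWidth);
  }
  return beams[0].text; // most probable transcription under this toy model
}

// Tiny usage example: a two-character alphabet and three timesteps.
const alphabet = ['h', 'i'];
const probs = [
  { h: 0.9, i: 0.1 },
  { h: 0.2, i: 0.8 },
  { h: 0.4, i: 0.6 },
];
console.log(beamSearch(probs, alphabet, 2)); // -> "hii"
```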