Python News Digest 23-29.11

Interviews with the authors of Python Testing with pytest and Python Crash Course, a guide to working with the neurotech data format fNIRS, and more
29 November 2019

Greetings! I hope your week went great! Here's a new Python news digest.

Read the interview with Guido, see what's new in Python 3.9a1, and check out the latest update to munggoggo.

Guides

  • Learn to Work with Next Gen. Neurotech Data ‘fNIRS’

Get started with fNIRS sensing data, specifically oxygenated hemoglobin ("HbO2/HbO") data, and learn to analyze a data stream from a sensor.

Articles

  • Guido van Rossum on How Python Makes Thinking in Code Easier

A conversation with the creator of the world’s most popular programming language on removing brain friction to work better.

  • What’s New in Python 3.9a1

The first draft changelog for Python 3.9 alpha 1 is out, so it's time to check it out.

  • Python Community Interview With Brian Okken

Brian Okken is the author of Python Testing with pytest and the host of two Python-related podcasts.

  • Variable Explorer improvements in Spyder 4

Spyder, a popular Python IDE, will receive a major update in the near future, so it's time to get familiar with some of the upcoming features.

Updates

  • munggoggo

An asyncio-based agent platform written in Python, built on RabbitMQ.

  • pytest-quarantine

A plugin for pytest to manage expected test failures

  • tsaug

A package for time series augmentation.

Podcast

  • Interview with Eric Matthes

Kelly Paredes (curriculum guru, Google educator, and co-host) and Sean Tibor (computer science teacher and co-host) interview Eric Matthes, the author of Python Crash Course.

Mozilla to Release DeepSpeech 0.6

DeepSpeech is much simpler than traditional systems and provides higher recognition quality in the presence of noise
09 December 2019

Mozilla has introduced DeepSpeech 0.6, a speech recognition engine that implements the eponymous speech recognition architecture proposed by Baidu researchers. The implementation is written in Python using the TensorFlow machine learning platform and is distributed under the free MPL 2.0 license. It runs on Linux, Android, macOS, and Windows, and it is fast enough to use on LePotato, Raspberry Pi 3, and Raspberry Pi 4 boards.

The kit also includes trained models, sample audio files, and command-line recognition tools. To embed speech recognition in their programs, developers can use ready-made modules for Python, NodeJS, C++, and .NET (third-party developers have separately prepared modules for Rust and Go). A trained model is shipped only for English, but for other languages you can follow the included instructions to train the system yourself using voice data collected by the Common Voice project.
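For instance, embedding the engine from Python might look roughly like the sketch below. This is a minimal illustration that assumes the 0.6-style API, where the Model constructor takes a model path and a beam width and stt() accepts 16-bit PCM samples; the exact signatures have changed between releases, and the file paths are placeholders.

import wave

import numpy as np
from deepspeech import Model

# Placeholder paths; substitute the files shipped with the 0.6 release.
MODEL_PATH = "deepspeech-0.6.0-models/output_graph.pbmm"
BEAM_WIDTH = 500  # assumed value, tune as needed

model = Model(MODEL_PATH, BEAM_WIDTH)

# Read a 16 kHz, 16-bit mono WAV file into a NumPy array of samples.
with wave.open("audio/sample.wav", "rb") as wav:
    frames = wav.readframes(wav.getnframes())
audio = np.frombuffer(frames, dtype=np.int16)

# Run recognition and print the transcribed text.
print(model.stt(audio))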

DeepSpeech is much simpler than traditional systems and at the same time provides higher recognition quality in the presence of extraneous noise. It does not use traditional acoustic models or the concept of phonemes; instead, it relies on a well-optimized neural-network-based machine learning system, which eliminates the need to develop separate components for modeling deviations such as noise, echo, and speech peculiarities.

The flip side of this approach is that to achieve high-quality recognition, the neural network must be trained on a large amount of heterogeneous data, dictated in real conditions by different voices and in the presence of natural noise. The Common Voice project, created by Mozilla, collects such data. It provides a verified data set with 780 hours of English, 325 of German, 173 of French, and 27 hours of Russian.

The ultimate goal of the Common Voice project is to accumulate 10 thousand hours of recordings of typical human speech phrases in various pronunciations, which should bring recognition errors down to an acceptable level. So far, project participants have dictated a total of 4.3 thousand hours, of which 3.5 thousand have passed validation. The final English model for DeepSpeech was trained on 3816 hours of speech, covering, in addition to Common Voice, data from the LibriSpeech, Fisher, and Switchboard projects, as well as about 1700 hours of transcribed radio show recordings.

With the downloadable, ready-made English model, DeepSpeech's word error rate is 7.5% when evaluated on the LibriSpeech test set. For comparison, the human error rate is estimated at 5.83%.
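The figures above are word error rates: the word-level edit distance between the recognized text and the reference transcript, divided by the length of the reference. A minimal illustration of the core formula (the actual LibriSpeech evaluation pipeline also normalizes the text) could look like this:

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming table of edit distances between word prefixes.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sit on mat"))  # 0.333...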

DeepSpeech consists of two subsystems: an acoustic model and a decoder. The acoustic model uses deep machine learning methods to calculate the probability that certain characters are present in the input sound. The decoder uses a beam search algorithm to convert the character probability data into a text representation.
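To make the decoder's role concrete, here is a toy beam search over per-timestep character probabilities. It is only a sketch: the real DeepSpeech decoder performs a CTC beam search with blank symbols and a language model, both of which are omitted here.

import math

def beam_search(probs, alphabet, beam_width=3):
    """probs is a list of per-timestep dicts mapping characters to probabilities."""
    beams = [("", 0.0)]  # candidate prefixes with their log probabilities
    for step in probs:
        candidates = []
        for prefix, logp in beams:
            for char in alphabet:
                p = step.get(char, 1e-12)  # tiny floor for unseen characters
                candidates.append((prefix + char, logp + math.log(p)))
        # Keep only the beam_width most probable prefixes.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0][0]

# Two timesteps of made-up character probabilities.
probs = [{"h": 0.6, "c": 0.4}, {"i": 0.7, "a": 0.3}]
print(beam_search(probs, alphabet="hica"))  # prints "hi"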