Ziyue Zhao

Contributions to Neural Network-Based Speech Processing: Nonlinear Speech Prediction, Decoder Postprocessing, and Perceptual Loss Functions

ISBN: 978-3-8440-8779-6
Series: Mitteilungen aus dem Institut für Nachrichtentechnik der Technischen Universität Braunschweig
Editors: Prof. Dr.-Ing. U. Reimers, Prof. Dr.-Ing. T. Kürner and Prof. Dr.-Ing. T. Fingscheidt
Braunschweig
Volume: 70
Keywords: Speech Coding; Speech Enhancement; Neural Networks
Type of publication: Thesis
Language: English
Pages: 154 pages
Figures: 28 figures
Weight: 228 g
Format: 21 x 14,8 cm
Binding: Paperback
Price: 48,80 € / 61,10 SFr
Published: October 2022
Abstract: Speech processing technologies are omnipresent in our daily communication products and services. Neural networks, as powerful data-driven models, have shown promising performance in various research fields, including speech processing. This thesis focuses on neural network-based speech processing and is divided into three parts.

First, in the field of speech prediction, a nonlinear speech predictor using the echo state network (ESN) is proposed as a novel adaptive prediction approach. In the simulations, the proposed nonlinear predictor shows better prediction performance than all baseline prediction methods, including a predictor based on a long short-term memory (LSTM) structure. Second, in the field of neural network-based speech enhancement, the focus is on loss functions. A novel perceptual weighting filter (PWF) loss function, motivated by the weighting filter from code-excited linear prediction (CELP) speech coding, is proposed. A fully connected neural network (FCNN) and a convolutional neural network (CNN) are both used to evaluate the proposed loss functions, and the simulation results show their superior performance compared to baselines. Finally, neural network-based postprocessing for the enhancement of coded speech is studied. CNN-based postprocessors are proposed either to directly enhance the raw waveform in an end-to-end fashion, or to enhance the cepstral domain features using analysis-synthesis. Furthermore, an advanced network structure, the fully convolutional recurrent network (FCRN), is utilized to enhance coded speech in the frequency domain, with the PWF loss function advantageously applied. The experimental results confirm the effectiveness of the proposed postprocessors, which improve speech quality.
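
To give a rough idea of what a loss based on a CELP-style weighting filter can look like, the following is a minimal time-domain sketch. It is an illustration only and not necessarily the formulation used in the thesis. It assumes the common CELP weighting filter form W(z) = A(z/γ1) / A(z/γ2), where A(z) is the linear prediction polynomial estimated from the clean frame. The function names, the LPC order of 10, and the values γ1 = 0.92 and γ2 = 0.6 are assumptions chosen for the example.

```python
import numpy as np

def lpc_coefficients(frame, order=10):
    """Autocorrelation method with Levinson-Durbin; returns [1, a_1, ..., a_p]."""
    n = len(frame)
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0] + 1e-9  # small floor avoids division by zero on silent frames
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1:0:-1]
        a[i] = k
        err *= (1.0 - k * k)
    return a

def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma): the k-th coefficient is scaled by gamma**k."""
    return a * gamma ** np.arange(len(a))

def iir_filter(b, a, x):
    """Direct-form filter y[n] = sum_k b[k] x[n-k] - sum_{k>=1} a[k] y[n-k], with a[0] == 1."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = 0.0
        for k in range(len(b)):
            if n - k >= 0:
                acc += b[k] * x[n - k]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y[n] = acc
    return y

def pwf_loss(clean, enhanced, order=10, gamma1=0.92, gamma2=0.6):
    """Mean squared error of the enhancement error after perceptual weighting.

    gamma1 and gamma2 are illustrative values, not taken from the thesis.
    """
    a = lpc_coefficients(clean, order)
    b_w = bandwidth_expand(a, gamma1)  # numerator A(z/gamma1)
    a_w = bandwidth_expand(a, gamma2)  # denominator A(z/gamma2)
    error = enhanced - clean
    weighted = iir_filter(b_w, a_w, error)
    return float(np.mean(weighted ** 2))

# Usage on one synthetic 160-sample frame (stand-in for real speech)
rng = np.random.default_rng(0)
clean = rng.standard_normal(160)
noisy = clean + 0.1 * rng.standard_normal(160)
loss_value = pwf_loss(clean, noisy)
```

In this sketch the filter is derived from the clean frame only, so the loss is zero when the enhanced frame equals the clean frame and grows quadratically with the error amplitude. For training a network, the same weighting would have to be expressed in a differentiable framework, which is outside the scope of this illustration.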