Petr Pollak: On-line publications

Petr Pollak: List of publications available on-line:
The Noise Suppression System for a Car. Eurospeech'93, Berlin.
Cepstral Speech/Pause Detectors. IEEE NSIP'95, Chalkidiki.
The Study of Speech/Pause Detectors for Speech Enhancement Methods. Eurospeech'95, Madrid.
Extended Spectral Subtraction. Eusipco'96, Trieste.
Study of Speech Recognition in Noisy Environment. ECSAP'97, Prague.


Problems or comments: mail to pollak@feld.cvut.cz. I would appreciate any information about your experience with the presented algorithms or problem solutions.
Last change 3 Oct 1998

THE NOISE SUPPRESSION SYSTEM FOR A CAR.
Petr Pollak & Pavel Sovka & Jan Uhlir.
In proc. of the 3rd European Conference on Speech Communication and Technology - EUROSPEECH'93, pp.1073-1076, Berlin, Germany, Sep 1993.

ABSTRACT

A complete system was designed for suppressing noise in speech recorded in a moving car. A one-channel spectral subtraction method with full-wave rectification was chosen for its robustness, simplicity, and output free of musical tones. Noise suppression was further improved by applying the method repeatedly. Directional microphones were chosen for signal pickup to improve the input signal-to-noise ratio (SNR) of the corrupted speech signal. A segment-based speech/pause detector using energy tracking was employed, with prefiltering of the corrupted speech to improve detector performance.
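As a rough illustration of the rectification step named in the abstract, the sketch below subtracts a noise magnitude estimate from a noisy magnitude spectrum, bin by bin. It is a minimal example and not the authors' implementation: the function name `spectral_subtract`, its parameters, and the sample values are hypothetical, and the framing, FFT, noise estimation during pauses, and the repetition scheme of the paper are not shown.

```python
def spectral_subtract(noisy_mag, noise_mag, full_wave=True):
    """Subtract a noise magnitude estimate from a noisy magnitude spectrum.

    noisy_mag, noise_mag: sequences of per-bin magnitudes for one frame
    (hypothetical inputs; how they are obtained is outside this sketch).
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    out = []
    for x, n in zip(noisy_mag, noise_mag):
        diff = x - n
        if full_wave:
            # Full-wave rectification: keep the absolute value of the difference.
            out.append(abs(diff))
        else:
            # Half-wave rectification: clip negative differences to zero.
            out.append(max(diff, 0.0))
    return out


noisy = [1.0, 0.5, 2.0]
noise = [0.75, 0.75, 0.75]
full = spectral_subtract(noisy, noise, full_wave=True)
half = spectral_subtract(noisy, noise, full_wave=False)
```

Half-wave rectification zeroes every bin where the noise estimate exceeds the noisy magnitude, while full-wave rectification keeps a non-zero value there. That difference is commonly associated with the isolated spectral peaks heard as musical tones.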

Scanned version of the proceedings paper in PDF format

CEPSTRAL SPEECH/PAUSE DETECTORS.
Petr Pollak & Pavel Sovka
In proc. of 1995 IEEE Workshop on Nonlinear Signal and Image Processing, pp.388-391, Neos Marmaras, Halkidiki, Greece, June 20-22, 1995.

ABSTRACT

Many systems for noisy speech processing require a reliable speech/pause detector. This paper describes two cepstral speech/pause detection algorithms: an integral cepstral algorithm and a differential algorithm based on the differenced cepstrum. Both algorithms use a smoothing procedure based on median filtering. New criteria were used to compare the detectors. Many experiments confirmed the detectors' reliability and their ability to detect speech in real car noise with high probability. The computational cost of the presented algorithms is low, so they are suitable for real-time implementation.
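The median-filter smoothing mentioned in the abstract can be sketched as follows. This is an assumption-laden example: the abstract does not say whether the smoothing is applied to the cepstral measure or to the frame-wise decisions, so the sketch simply smooths a hypothetical sequence of binary speech/pause decisions. The function name `median_smooth`, the window width, and the edge handling (repeating the first and last values) are illustrative choices, not taken from the paper.

```python
from statistics import median


def median_smooth(values, width=3):
    """Smooth a sequence with a sliding median of odd window width.

    Edges are handled by repeating the first and last value (an assumption
    made for this sketch).
    """
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    values = list(values)
    if not values:
        return []
    half = width // 2
    padded = [values[0]] * half + values + [values[-1]] * half
    # Each output sample is the median of the window centred on it.
    return [median(padded[i:i + width]) for i in range(len(values))]


# Hypothetical frame-wise decisions: 1 = speech, 0 = pause.
raw = [0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0]
smoothed = median_smooth(raw, width=3)
```

With a window of three frames, the isolated single-frame "speech" decision and the single-frame dropout inside the speech segment are both removed, which is the usual motivation for median smoothing of detector outputs.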

Scanned version of the paper in PDF format

THE STUDY OF SPEECH/PAUSE DETECTORS FOR SPEECH ENHANCEMENT METHODS.
Sovka,P. & Pollak,P.
Proceedings of the 4th European Conference on Speech Communication and Technology, pp.1575-1578, Madrid, Spain, September 1995.

ABSTRACT

Speech/pause detectors are the limiting parts of systems for the suppression of additive noise in speech, because the quality of the detector determines the performance of the whole noise suppression system. If the speech/pause decision is not correct, speech echoes and residual noise are present in the enhanced speech. Information about speech activity is needed not only for estimating background noise characteristics but also for time-delay compensation of signals picked up by a microphone array. The basic principles of various adaptive algorithms for speech detection in noise, and their behaviour under real car noise conditions, are described. Energy, spectral, cepstral, and coherence detectors are compared. All these algorithms are suitable for real-time implementation with one or two microphones. A high probability of correct speech/pause detection can be obtained even if the signal-to-noise ratio is low and the noise is highly nonstationary.

Scanned paper in PDF format

EXTENDED SPECTRAL SUBTRACTION
Sovka,P. & Pollak,P. & Kybic,J.
In proc. of the European Signal Processing Conference (EUSIPCO'96), Trieste, September 1996.

ABSTRACT

Spectral subtraction offers a simple and computationally efficient tool for the suppression of additive noise in a speech signal. This method has been extensively studied for almost twenty years. Research has focused on a higher degree of noise suppression, lower speech distortion, and less audible musical noise. The last requirement is especially important in hands-free telephony applications. The main shortcoming of the method, however, has long remained unsolved: updating the estimate of the background noise characteristics, especially during speech sequences. Extended spectral subtraction overcomes this typical disadvantage of the standard spectral subtraction technique, namely the impossibility of estimating noise during a speech sequence. Our method is a combination of Wiener filtering and spectral subtraction. The noise estimate can be successfully updated even during speech sequences, which is why no voice activity detector is needed.

Postscript version of the paper

Paper in PDF-format

STUDY OF SPEECH RECOGNITION IN NOISY ENVIRONMENT
Kreisinger,T. & Pollak,P. & Sovka,P. & Uhlir,J.
In proc. of the 1st European Conference on Signal Analysis and Prediction (ECSAP'97), Prague, June 1997.

ABSTRACT

This paper addresses the effects of mismatched conditions, and their minimization, on the performance of speaker-independent isolated-word recognition in a car-noise environment, without considering the Lombard effect. The study primarily examines how the recognition rate depends on the SNR of the input signal with and without speech enhancement preprocessing, and in particular seeks the conditions under which modified spectral subtraction can be used effectively for speech recognition in a real, non-stationary car-noise environment. If the lowest acceptable recognition rate is taken to be, e.g., 80%, then spectral subtraction methods allow a wider interval of input SNRs to be used: for training on clean speech this interval is (40, 6) dB; for training on noisy speech it is (40, -2) dB; for training on enhanced speech it is (40, -8) dB. The third case gives the widest interval of SNRs in which a recogniser (with a final recognition rate in the interval (100, 80)%) can be used.
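Since the results above are stated as intervals of input SNR, it may help to recall how an SNR in dB is computed from sample sequences and how a noise recording can be scaled to reach a chosen SNR. The sketch below uses the standard power-ratio definition. The function names, the toy signals, and the 6 dB target are hypothetical, and the paper's actual measurement procedure is not described in this abstract.

```python
import math


def mean_power(samples):
    """Average power of a sequence of samples."""
    return sum(v * v for v in samples) / len(samples)


def snr_db(speech, noise):
    """SNR in dB as 10*log10 of the speech-to-noise power ratio."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))


def scale_noise_to_snr(speech, noise, target_db):
    """Return the noise scaled so that snr_db(speech, result) equals target_db."""
    wanted_noise_power = mean_power(speech) / (10.0 ** (target_db / 10.0))
    gain = math.sqrt(wanted_noise_power / mean_power(noise))
    return [gain * v for v in noise]


# Toy signals, for illustration only.
speech = [1.0, -1.0, 1.0, -1.0]
noise = [0.1, -0.1, 0.1, -0.1]
original_snr = snr_db(speech, noise)              # about 20 dB
scaled = scale_noise_to_snr(speech, noise, 6.0)   # noise rescaled for 6 dB SNR
noisy = [s + n for s, n in zip(speech, scaled)]   # mixture at the target SNR
```

Sweeping `target_db` over a range of values is one simple way to prepare test material at controlled SNRs for a recognition-rate study of this kind.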

Postscript version of the paper - it will be here soon!