<!doctype html>
<html lang="en">
<head>
<!-- Required meta tags -->
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<!-- BLOCK INDEX -->
<meta name="robots" content="noindex">
<!-- BLOCK INDEX -->
<link rel="icon" href="img/favicon.png" type="image/png">
<title>SpeechBrain</title>
<!-- Bootstrap CSS -->
<link rel="stylesheet" href="css/bootstrap.css">
<link rel="stylesheet" href="vendors/linericon/style.css">
<link rel="stylesheet" href="css/font-awesome.min.css">
<link rel="stylesheet" href="vendors/owl-carousel/owl.carousel.min.css">
<link rel="stylesheet" href="vendors/lightbox/simpleLightbox.css">
<link rel="stylesheet" href="vendors/nice-select/css/nice-select.css">
<link rel="stylesheet" href="vendors/animate-css/animate.css">
<!-- main css -->
<link rel="stylesheet" href="css/style.css">
<link rel="stylesheet" href="css/responsive.css">
</head>
<body>
<!--================Header Menu Area =================-->
<header class="header_area">
<div class="main_menu">
<nav class="navbar navbar-expand-lg navbar-light">
<div class="container box_1620">
<!-- Brand and toggle get grouped for better mobile display -->
<a class="navbar-brand logo_h" href="index.html"><img src="img/logo_line_big.png" width="175px" alt=""></a>
<button class="navbar-toggler" type="button" data-toggle="collapse" data-target="#navbarSupportedContent" aria-controls="navbarSupportedContent" aria-expanded="false" aria-label="Toggle navigation">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<!-- Collect the nav links, forms, and other content for toggling -->
<div class="collapse navbar-collapse offset" id="navbarSupportedContent">
<ul class="nav navbar-nav menu_nav justify-content-center">
<li class="nav-item active"><a class="nav-link" href="index.html">Home</a></li>
<li class="nav-item"><a class="nav-link" href="about.html">About SpeechBrain</a></li>
<li class="nav-item"><a class="nav-link" href="contributing.html">Contributing</a></li>
<li class="nav-item"><a class="nav-link" href="https://speechbrain.readthedocs.io/en/latest/index.html">Documentation</a></li>
<li class="nav-item submenu dropdown">
<a href="#" class="nav-link dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Tutorials</a>
<ul class="dropdown-menu">
<li class="nav-item"><a class="nav-link" href="tutorial_basics.html">SpeechBrain Basics</a></li>
<li class="nav-item"><a class="nav-link" href="tutorial_advanced.html">SpeechBrain Advanced</a></li>
<li class="nav-item"><a class="nav-link" href="tutorial_asr.html">Speech Recognition</a></li>
<li class="nav-item"><a class="nav-link" href="tutorial_separation.html">Source Separation</a></li>
<li class="nav-item"><a class="nav-link" href="tutorial_enhancement.html">Speech Enhancement</a></li>
<li class="nav-item"><a class="nav-link" href="tutorial_nn.html">Neural Architectures</a></li>
<li class="nav-item"><a class="nav-link" href="tutorial_processing.html">Speech Processing</a></li>
</ul>
</li>
</ul>
</div>
</div>
</nav>
</div>
</header>
<!--================Header Menu Area =================-->
<!--================Home Banner Area =================-->
<section class="home_banner_area blog_banner">
<div class="banner_inner d-flex align-items-center">
<div class="overlay bg-parallax" data-stellar-ratio="0.9" data-stellar-vertical-offset="0" data-background=""></div>
<div class="container">
<div class="blog_b_text text-center">
<h2>SpeechBrain Tutorials</h2>
<h3>Speech Processing</h3>
</div>
</div>
</div>
</section>
<!--================End Home Banner Area =================-->
<!--================Blog Categorie Area =================-->
<section class="blog_categorie_area">
<!-- We don't have category for now -->
</section>
<!--================Blog Categorie Area =================-->
<!--================Blog Area =================-->
<section class="blog_area">
<div class="container">
<div class="main_title discourse">
<p> <img src="img/logo_discourse.png"/> Join our official <a href="https://speechbrain.discourse.group">Discourse</a> to chat with SpeechBrain users from all around the world! <img src="img/logo_discourse.png"/></p>
</div>
<div class="row">
<div class="col-md-9">
<div class="blog_left_sidebar">
<article class="row blog_item">
<div class="col-md-3">
<div class="blog_info text-right">
<div class="post_tag">
<a class="active" href="#">Speech Processing</a>
</div>
<ul class="blog_meta list">
<li><a href="about.html">Ravanelli M.<i class="lnr lnr-user"></i></a></li>
<li><a href="#">Jan. 2021<i class="lnr lnr-calendar-full"></i></a></li>
<li><a href="#">Difficulty: easy<i class="lnr lnr-cog"></i></a></li>
<li><a href="#">Time: 20min<i class="lnr lnr-hourglass"></i></a></li>
</ul>
</div>
</div>
<div class="col-md-9">
<div class="blog_post">
<div class="blog_details">
<h2>Speech Augmentation</h2>
<p>A popular saying in machine learning is "there is no better data than more data". However, collecting new data can be expensive,
so we must make clever use of the data we already have. One popular technique is speech augmentation. The idea is to artificially
corrupt the original speech signals to give the network the "illusion" that it is processing a new signal. This acts as a powerful regularizer
that normally helps neural networks generalize better and thus achieve better performance on test data.</p>
<a href="https://colab.research.google.com/drive/1JJc4tBhHNXRSDM2xbQ3Z0jdDQUw4S5lr?usp=sharing" class="blog_btn">Open in Google Colab</a>
</div>
</div>
</div>
</article>
<article class="row blog_item">
<div class="col-md-3">
<div class="blog_info text-right">
<div class="post_tag">
<a class="active" href="#">Speech Processing</a>
</div>
<ul class="blog_meta list">
<li><a href="about.html">Ravanelli M.<i class="lnr lnr-user"></i></a></li>
<li><a href="#">Jan. 2021<i class="lnr lnr-calendar-full"></i></a></li>
<li><a href="#">Difficulty: easy<i class="lnr lnr-cog"></i></a></li>
<li><a href="#">Time: 20min<i class="lnr lnr-hourglass"></i></a></li>
</ul>
</div>
</div>
<div class="col-md-9">
<div class="blog_post">
<div class="blog_details">
<h2>Fourier Transform and Spectrograms</h2>
<p>In speech and audio processing, the time-domain signal is often transformed into another domain.
But why do we need to transform an audio signal? Some speech characteristics of the signal (e.g., pitch, formants)
might not be evident when looking at the audio in the time domain. With properly designed transformations,
it can be easier to extract the needed information from the signal. The most popular transformation is the
Fourier Transform, which turns the time-domain signal into an equivalent representation in the frequency domain.
In the following sections, we will describe the Fourier Transform along with related transformations such as
the Short-Time Fourier Transform (STFT) and spectrograms.</p>
<a href="https://colab.research.google.com/drive/18IgBv3Ip0rWXjYoZywttSmW7Y2AIK1vJ?usp=sharing" class="blog_btn">Open in Google Colab</a>
</div>
</div>
</div>
</article>
<article class="row blog_item">
<div class="col-md-3">
<div class="blog_info text-right">
<div class="post_tag">
<a class="active" href="#">Speech Processing</a>
</div>
<ul class="blog_meta list">
<li><a href="about.html">Ravanelli M.<i class="lnr lnr-user"></i></a></li>
<li><a href="#">Jan. 2021<i class="lnr lnr-calendar-full"></i></a></li>
<li><a href="#">Difficulty: easy<i class="lnr lnr-cog"></i></a></li>
<li><a href="#">Time: 20min<i class="lnr lnr-hourglass"></i></a></li>
</ul>
</div>
</div>
<div class="col-md-9">
<div class="blog_post">
<div class="blog_details">
<h2>Speech Features (MFCC, FBANK)</h2>
<p>Speech is a very high-dimensional signal. For instance, when the sampling frequency is 16 kHz,
we have 16,000 samples for each second of audio. Working with such high-dimensional data can be challenging from a machine learning perspective.
The goal of feature extraction is to find more compact ways to represent speech.</p>
<a href="https://colab.research.google.com/drive/1CI72Xyay80mmmagfLaIIeRoDgswWHT_g?usp=sharing" class="blog_btn">Open in Google Colab</a>
</div>
</div>
</div>
</article>
<article class="row blog_item">
<div class="col-md-3">
<div class="blog_info text-right">
<div class="post_tag">
<a class="active" href="#">Speech Processing</a>
</div>
<ul class="blog_meta list">
<li><a href="about.html">Ravanelli M.<i class="lnr lnr-user"></i></a></li>
<li><a href="#">Feb. 2021<i class="lnr lnr-calendar-full"></i></a></li>
<li><a href="#">Difficulty: medium<i class="lnr lnr-cog"></i></a></li>
<li><a href="#">Time: 20min<i class="lnr lnr-hourglass"></i></a></li>
</ul>
</div>
</div>
<div class="col-md-9">
<div class="blog_post">
<div class="blog_details">
<h2>Environmental Corruption</h2>
<p>In realistic speech processing applications, the signal recorded by the microphone is corrupted by noise and reverberation.
This is particularly harmful in distant-talking (far-field) scenarios, where the speaker is far from the reference microphone
(think of popular devices such as Google Home, Amazon Echo, and Kinect).</p>
<a href="https://colab.research.google.com/drive/1mAimqZndq0BwQj63VcDTr6_uCMC6i6Un?usp=sharing" class="blog_btn">Open in Google Colab</a>
</div>
</div>
</div>
</article>
<article class="row blog_item">
<div class="col-md-3">
<div class="blog_info text-right">
<div class="post_tag">
<a class="active" href="#">Speech Processing</a>
</div>
<ul class="blog_meta list">
<li><a href="about.html">Grondin F. & Aris W.<i class="lnr lnr-user"></i></a></li>
<li><a href="#">Jan. 2021<i class="lnr lnr-calendar-full"></i></a></li>
<li><a href="#">Difficulty: medium<i class="lnr lnr-cog"></i></a></li>
<li><a href="#">Time: 20min<i class="lnr lnr-hourglass"></i></a></li>
</ul>
</div>
</div>
<div class="col-md-9">
<div class="blog_post">
<div class="blog_details">
<h2>Multi-microphone Beamforming</h2>
<p>A microphone array can be very handy for improving the signal quality
(e.g., by reducing reverberation and noise) before performing speech recognition.
Microphone arrays can also estimate the direction of arrival of a sound source, and this information can later
be used to "listen" in the direction of the source of interest.</p>
<a href="https://colab.research.google.com/drive/1UVoYDUiIrwMpBTghQPbA6rC1mc9IBzi6?usp=sharing" class="blog_btn">Open in Google Colab</a>
</div>
</div>
</div>
</article>
<nav class="blog-pagination justify-content-center d-flex">
<ul class="pagination">
<li class="page-item">
<a href="#" class="page-link" aria-label="Previous">
<span aria-hidden="true">
<span class="lnr lnr-chevron-left"></span>
</span>
</a>
</li>
<li class="page-item active"><a href="#" class="page-link">1</a></li>
<li class="page-item">
<a href="#" class="page-link" aria-label="Next">
<span aria-hidden="true">
<span class="lnr lnr-chevron-right"></span>
</span>
</a>
</li>
</ul>
</nav>
</div>
</div>
</div>
</div>
</section>
<!--================Blog Area =================-->
<!--================Footer Area =================-->
<footer class="footer_area p_120">
<div class="container">
<div class="row footer_inner">
<div class="col-lg-5 col-sm-6">
<aside class="f_widget ab_widget">
<div class="f_title">
<h3>About Us</h3>
</div>
<p style="text-align:justify">SpeechBrain isn't a company or an association.
It is an open-source toolkit and a community created by Dr. Mirco Ravanelli and co-created by Dr. Titouan Parcollet.
We aim to make speech technologies more accessible to the community.</p>
<p><!-- Link back to Colorlib can't be removed. Template is licensed under CC BY 3.0. -->
Copyright ©<script>document.write(new Date().getFullYear());</script> All rights reserved</p>
</aside>
</div>
<div class="col-lg-5 col-sm-6">
<aside class="f_widget news_widget">
<div class="f_title">
<h3>Opportunities</h3>
</div>
<p>Thanks to our sponsors, we often recruit talented candidates to continue expanding the functionality of SpeechBrain.
Feel free to contact us at: [email protected]</p>
</aside>
</div>
<div class="col-lg-2">
<aside class="f_widget social_widget">
<div class="f_title">
<h3>Follow Us</h3>
</div>
<p>Let us be social</p>
<ul class="list">
<li><a href="https://twitter.com/SpeechBrain1"><i class="fa fa-twitter"></i></a></li>
</ul>
</aside>
</div>
</div>
</div>
</footer>
<!--================End Footer Area =================-->
<!-- Optional JavaScript -->
<!-- jQuery first, then Popper.js, then Bootstrap JS -->
<script src="js/jquery-3.2.1.min.js"></script>
<script src="js/popper.js"></script>
<script src="js/bootstrap.min.js"></script>
<script src="js/stellar.js"></script>
<script src="vendors/lightbox/simpleLightbox.min.js"></script>
<script src="vendors/nice-select/js/jquery.nice-select.min.js"></script>
<script src="vendors/isotope/imagesloaded.pkgd.min.js"></script>
<script src="vendors/isotope/isotope-min.js"></script>
<script src="vendors/owl-carousel/owl.carousel.min.js"></script>
<script src="js/jquery.ajaxchimp.min.js"></script>
<script src="js/mail-script.js"></script>
<script src="vendors/counter-up/jquery.waypoints.min.js"></script>
<script src="vendors/counter-up/jquery.counterup.min.js"></script>
<script src="js/theme.js"></script>
</body>
</html>