GitHub - krazyjoy/AI_CUP

Workflow

Meaning of Melspectrogram

Studies have shown that humans do not perceive frequencies on a linear scale. We are better at detecting differences in lower frequencies than in higher frequencies. For example, we can easily tell the difference between 500 and 1000 Hz, but we can hardly tell the difference between 10,000 and 10,500 Hz, even though the distance within each pair is the same. In 1937, Stevens, Volkmann, and Newman proposed a unit of pitch such that equal distances in pitch sound equally distant to the listener. This is called the mel scale. A mathematical operation converts frequencies in Hz to the mel scale.

reference: understanding the mel spectrogram
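
To make the conversion concrete, here is a minimal sketch of one widely used Hz-to-mel formula, `2595 * log10(1 + f / 700)`. This particular variant is an assumption for illustration; audio libraries may use slightly different definitions, and the function names `hz_to_mel` and `mel_to_hz` are hypothetical.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (common 2595 * log10 variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


# Both pairs are 500 Hz apart, but the gap in mels is very different.
low_gap = hz_to_mel(1000) - hz_to_mel(500)
high_gap = hz_to_mel(10500) - hz_to_mel(10000)
print(f"500 -> 1000 Hz: {low_gap:.1f} mel")
print(f"10000 -> 10500 Hz: {high_gap:.1f} mel")
```

The low-frequency pair spans a much larger mel distance than the high-frequency pair, which matches the perceptual claim above.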

Melspectrogram Feature Process

load audio
$\quad \quad \downarrow$
convert to mel spectrogram
(n_mels=256, fmin=0, fmax=14000)
$\quad \quad \downarrow$
map the frequency magnitudes to decibels
(np.abs(stft))
$\quad \quad \downarrow$
resize to (256, 512)
(from skimage.transform import resize)
$\quad \quad \downarrow$
stack to 3 channels (for the CNN)
(np.stack((stft_db, stft_db, stft_db)))
$\quad \quad \downarrow$
append each sample to a list
$\quad \quad \downarrow$
final array shape: (nsamples, 3, 256, 512)
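
The steps above can be sketched as follows. This is not the notebook's code: the helper names are hypothetical, and the use of `librosa.load`, `librosa.feature.melspectrogram`, and `librosa.power_to_db` for the loading and decibel steps is an assumption, since the README does not show the exact calls.

```python
import numpy as np

N_MELS = 256
TARGET_SHAPE = (256, 512)


def audio_to_db_mel(path):
    """Load one recording and return a (256, 512) mel spectrogram in dB."""
    # Imported here so the stacking helpers below work without audio libraries.
    import librosa
    from skimage.transform import resize

    y, sr = librosa.load(path, sr=None)  # keep the file's native sample rate
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_mels=N_MELS, fmin=0, fmax=14000
    )
    mel_db = librosa.power_to_db(mel, ref=np.max)  # assumed dB conversion
    return resize(mel_db, TARGET_SHAPE)


def to_three_channels(spec_db):
    """Repeat a 2-D spectrogram along a new leading axis: (H, W) -> (3, H, W)."""
    return np.stack((spec_db, spec_db, spec_db))


def build_feature_array(spectrograms):
    """Stack per-sample 3-channel images into (nsamples, 3, H, W)."""
    samples = [to_three_channels(s) for s in spectrograms]
    return np.asarray(samples, dtype=np.float32)
```

Note that `fmax=14000` only makes sense when the sample rate is at least 28 kHz, so that 14 kHz stays at or below the Nyquist frequency. Repeating the single-channel spectrogram three times adds no information; it only matches the input shape that image-style CNNs expect.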

Medical Record Processing

all columns $\rightarrow$ subset of relevant columns


'ID'	'Sex'	'Age'	'Narrow pitch range'
'Decreased volume'	'Fatigue'	'Dryness'	'Lumping'
'heartburn'	'Choking'	'Eye dryness'	'PND'
'Smoking'	'PPD'	'Drinking'	'frequency'
'Diurnal pattern'	'Onset of dysphonia'	'Noise at work'	'Occupational vocal demand'
'Head injury'	'CVA'	'Voice handicap index - 10'	'Disease category'

'Disease category' is the classification column
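
A minimal pandas sketch of the column selection, under stated assumptions: the function name is hypothetical, the toy rows are made up for illustration, and dropping 'ID' from the model inputs is an assumption the README does not confirm.

```python
import pandas as pd

LABEL_COLUMN = "Disease category"


def split_features_and_label(records: pd.DataFrame, columns: list[str]):
    """Keep only the relevant columns, then separate inputs from the label."""
    subset = records[columns]
    # Assumption: 'ID' identifies the patient and is not used as a feature.
    features = subset.drop(columns=["ID", LABEL_COLUMN])
    labels = subset[LABEL_COLUMN]
    return features, labels


# Toy data with a few of the listed columns plus one irrelevant column.
toy = pd.DataFrame(
    {
        "ID": ["a", "b"],
        "Sex": [1, 2],
        "Age": [30, 40],
        "Unused": [0, 0],
        "Disease category": [1, 3],
    }
)
X, y = split_features_and_label(toy, ["ID", "Sex", "Age", "Disease category"])
print(list(X.columns))
```

In practice the `columns` argument would be the full list of 24 names above.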

CNN Model

Model summary for the custom CNN model.

optimizer: nadam
minimum lr: 1e-8
loss: categorical cross entropy
validation ratio: 5%
callbacks: reduce learning rate on validation loss; early stopping after 10 epochs
save checkpoint: True
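
The early-stopping callback can be sketched in plain Python. This assumes "after 10 epochs" means a patience of 10 epochs without improvement in validation loss, which the README does not state explicitly; the class name and `min_delta` parameter are hypothetical. In PyTorch, the learning-rate reduction is commonly handled by `torch.optim.lr_scheduler.ReduceLROnPlateau`, whose `min_lr` argument would correspond to the 1e-8 floor listed above.

```python
class EarlyStopping:
    """Stop training once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 10, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            # Improvement: remember it and reset the counter.
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

A training loop would call `step` once per epoch with the validation loss and break out of the loop when it returns `True`, saving a checkpoint whenever `best_loss` improves.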

DNN Model

Model summary for the DNN model.

hidden layers: 3
activation function: sigmoid (hidden nodes), softmax (categorical prediction)
loss: categorical cross entropy
optimizer: adam
metrics: accuracy
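
The listed architecture can be illustrated with a NumPy forward pass: three sigmoid hidden layers followed by a softmax output, scored with categorical cross entropy. The layer widths, the input size of 22 features, and the class count of 5 are hypothetical placeholders, not values from the notebook, and all function names are illustrative.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(logits):
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def init_params(layer_sizes, seed=0):
    """Create (weights, bias) pairs for consecutive layer sizes."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def forward(x, params):
    """Sigmoid on every hidden layer, softmax on the output layer."""
    *hidden, (w_out, b_out) = params
    for w, b in hidden:
        x = sigmoid(x @ w + b)
    return softmax(x @ w_out + b_out)


def categorical_cross_entropy(probs, one_hot):
    """Mean negative log-likelihood of the true class."""
    return float(-np.mean(np.sum(one_hot * np.log(probs + 1e-12), axis=1)))


# Hypothetical sizes: 22 inputs, three hidden layers, 5 output classes.
params = init_params([22, 64, 32, 16, 5])
probs = forward(np.zeros((4, 22)), params)
```

Each row of `probs` sums to 1, so it can be read as a predicted distribution over disease categories.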

Results UAR and Confusion Matrix

testing data	UAR
public	0.687
private	0.543
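
UAR (unweighted average recall) is the mean of the per-class recalls, so every class counts equally regardless of how many samples it has. A minimal sketch, with a hypothetical function name and made-up toy labels:

```python
def unweighted_average_recall(y_true, y_pred):
    """Average the recall of each class that appears in y_true."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Toy example: class 0 recall is 4/4, class 1 recall is 1/2.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 1, 0]
uar = unweighted_average_recall(y_true, y_pred)
print(uar)  # 0.75
```

In this toy example accuracy is 5/6, while UAR is 0.75, because the error on the smaller class weighs more heavily. This is why UAR suits imbalanced disease categories.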
