CHAPTER V CLOSING
5.2 Suggestions
Some suggestions for other researchers interested in extending work on the use of audio data are as follows:
1. For feature extraction with the MFCC method, a suitable parameter-optimization technique should be sought for choosing the optimal parameter values, so that the extracted features are likewise improved.
2. Future studies could add voice recordings with background noise, such as speech captured in a crowd, or could consider the distance between the microphone and the participant as well as other factors such as the respondents' gender and age.
3. In the voice-classification stage, other or more recent classification methods could be applied, so that it can be determined which method best classifies the voice data extracted with MFCC.
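Point 2 above can be prototyped before any new recording takes place: the sketch below mixes Gaussian white noise into a clean signal at a chosen signal-to-noise ratio. The helper name `add_white_noise` and the 10 dB target are illustrative assumptions, not part of this study.

```python
import numpy as np

def add_white_noise(signal, snr_db):
    # Illustrative helper (not from this study): scale Gaussian noise so that
    # 10*log10(P_signal / P_noise) equals snr_db, then add it to the signal.
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = np.random.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

# Example: degrade a 1-second 440 Hz tone (44.1 kHz) to a 10 dB SNR.
np.random.seed(0)
sr = 44100
t = np.arange(sr) / sr
clean = np.sin(2 * np.pi * 440 * t)
noisy = add_white_noise(clean, snr_db=10)
```

Signals augmented this way can be fed through the same MFCC pipeline as the clean recordings, so classifier robustness to noise can be studied without collecting new data.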
APPENDIX
A. Results of Converting Analog Data to Digital Form
Conversion Result for the Voice File “Assalamualaikum”
        0         1         2         3        …     44099
0    -0.00655  -0.01246  -0.01002  -0.00646   …   0.04037
1    -0.00263  -0.00438  -0.00132  -0.00725   …   0.00189
2    -0.01220  -0.01102  -0.00158  -0.00510   …   0.00138
3    -0.01391  -0.02107  -0.02323  -0.02793   …  -0.00769
4     0.00400   0.00679   0.00347  -0.00159   …  -0.00685
…        …         …         …         …      …      …
29    0.00197  -0.00469   0.00387  -0.00663   …  -0.00092
Conversion Result for the Voice File “Astaghfirullah”
        0         1         2         3        …     44099
0    -0.00172  -0.00370   0.00180   0.00129   …  -0.12748
1    -0.00578  -0.00378  -0.00193  -0.00200   …  -0.00264
2     0.00000   0.00000   0.00000   0.00000   …   0.00990
3     0.00000   0.00000   0.00000   0.00000   …  -0.00354
4     0.00411   0.00061   0.00210  -0.00100   …   0.00408
…        …         …         …         …      …      …
29    0.00744   0.00710   0.00380   0.00472   …   0.00479
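The fractional values in the tables above are consistent with 16-bit PCM samples scaled into [-1.0, 1.0). As a hedged illustration (this normalization step is an assumption, not shown in the study's own listing), a recording could be read and scaled like this:

```python
import numpy as np
from scipy.io import wavfile

def wav_to_normalized(path):
    # wavfile.read returns the sample rate and, for 16-bit PCM, an int16 array;
    # dividing by 32768 maps the samples into [-1.0, 1.0).
    sample_rate, data = wavfile.read(path)
    return sample_rate, data.astype(np.float64) / 32768.0

# Demonstration on synthetic int16 samples instead of a real file:
raw = np.array([-215, -408, -328, -212], dtype=np.int16)
normalized = raw.astype(np.float64) / 32768.0
```

A 1-second clip at 44.1 kHz yields 44100 samples, which matches the column indices 0 through 44099 in the tables above.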
B. Feature Extraction with Mel Frequency Cepstral Coefficients (MFCC):
Importing modules
import numpy
import scipy.io.wavfile
from scipy.fftpack import dct
import glob
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import cross_val_score  # used later for cross-validated scoring
import matplotlib.pyplot as plt  # used later for plotting
import seaborn as sns  # used later for the accuracy boxplot
Pre emphasis
def pre_emphasis(signal):
    pre_emphasis = 0.97
    signal_emphasis = numpy.append(signal[0], signal[1:] - pre_emphasis * signal[:-1])
    return signal_emphasis
Frame Blocking
def framming(signal_emphasis, sample_rate):
    frame_size = 0.025
    frame_stride = 0.01  # assumed 10 ms stride; this line was missing from the listing
    frame_length = frame_size * sample_rate
    frame_step = frame_stride * sample_rate
    signal_length = len(signal_emphasis)
    frames_overlap = frame_length - frame_step
    num_frames = numpy.abs(signal_length - frames_overlap) // numpy.abs(frame_length - frames_overlap)
    rest_samples = numpy.abs(signal_length - frames_overlap) % numpy.abs(frame_length - frames_overlap)
    pad_signal_length = int(frame_length - rest_samples)
    z = numpy.zeros(pad_signal_length)
    pad_signal = numpy.append(signal_emphasis, z)
    frame_length = int(frame_length)
    frame_step = int(frame_step)
    num_frames = int(num_frames)
    indices = numpy.tile(numpy.arange(0, frame_length), (num_frames, 1)) + numpy.tile(numpy.arange(0, num_frames * frame_step, frame_step), (frame_length, 1)).T
    frames = pad_signal[indices.astype(numpy.int32, copy=False)]  # gather samples per frame; this line was missing from the listing
    return frames, frame_length
Windowing
def windowing(frames, frame_length):
    frames = frames * numpy.hamming(frame_length)
    return frames
Fast Fourier Transform (FFT)
def fft(frames):
    NFFT = 512
    mag_frames = numpy.absolute(numpy.fft.rfft(frames, NFFT))
    pow_frames = ((1.0 / NFFT) * ((mag_frames) ** 2))
    return pow_frames, NFFT

Filter Bank
def filter_bank(pow_frames, sample_rate, NFFT):
    nfilt = 40
    low_freq_mel = 0
    high_freq_mel = (2595 * numpy.log10(1 + (sample_rate / 2) / 700))
    mel_points = numpy.linspace(low_freq_mel, high_freq_mel, nfilt + 2)
    hz_points = (700 * (10 ** (mel_points / 2595) - 1))  # mel points back to Hz; restored as the inverse of the mel formula above
    bin = numpy.floor((NFFT + 1) * hz_points / sample_rate)
    fbank = numpy.zeros((nfilt, int(numpy.floor(NFFT / 2 + 1))))
    for m in range(1, nfilt + 1):
        f_m_minus = int(bin[m - 1])  # left
        f_m = int(bin[m])            # center
        f_m_plus = int(bin[m + 1])   # right
        for k in range(f_m_minus, f_m):
            fbank[m - 1, k] = (k - bin[m - 1]) / (bin[m] - bin[m - 1])
        for k in range(f_m, f_m_plus):
            fbank[m - 1, k] = (bin[m + 1] - k) / (bin[m + 1] - bin[m])
    filter_banks = numpy.dot(pow_frames, fbank.T)
    filter_banks = numpy.where(filter_banks == 0, numpy.finfo(float).eps, filter_banks)  # avoid log(0)
    filter_banks = 20 * numpy.log10(filter_banks)  # dB
    return filter_banks
Discrete Cosine Transform & Cepstral Liftering
def cepstral_liftering(filter_banks):
    num_ceps = 12
    cep_lifter = 11
    mfcc = dct(filter_banks, type=2, axis=1, norm='ortho')[:, 1:(num_ceps + 1)]  # DCT step restored; keeps coefficients 2-13
    (nframes, ncoeff) = mfcc.shape
    n = numpy.arange(ncoeff)
    lift = 1 + (cep_lifter / 2) * numpy.sin(numpy.pi * n / cep_lifter)
    mfcc *= lift
    mfcc -= (numpy.mean(mfcc, axis=0) + 1e-8)
    return mfcc

MFCC Feature Extraction
def generate_features():
    all_features = []
    all_labels = []
    audios = ['alhamdulillah', 'assalamualaikum', 'astaghfirullah', 'subhanallah', 'waalaikumsalam']
    for audio in audios:
        sound_files = glob.glob('audio/' + audio + '/*.wav')
        print('processing %d audios in %s file... ' % (len(sound_files), audio))
        for f in sound_files:
            sample_rate, signal = scipy.io.wavfile.read(f)
            signal = signal[0:int(1 * sample_rate), 0]
            pre = pre_emphasis(signal)  # pre-emphasis step restored; its result is used below
            frames = framming(pre, sample_rate)
            window = windowing(frames[0], frames[1])
            ff = fft(window)
            filter = filter_bank(ff[0], sample_rate, ff[1])
            mfcc = cepstral_liftering(filter)
            mfcc = numpy.ndarray.flatten(mfcc)
            all_features.append(mfcc)
            all_labels.append(audio)
    return all_features, all_labels
C. MFCC Feature Extraction Results
MFCC Feature Extraction Result for the Voice Data “Subhanallah”
        0          1          2          3        …     1175
0     -38.337   -245.812   -138.206    169.163   …    -4.088
1    -105.784   -324.813   -218.501    -98.019   …   -11.022
2     -67.438   -108.481   -188.076    -21.469   …    -4.994
3       3.164   -108.481   -188.796    286.928   …     1.661
4     -28.172    -53.643     -8.423     31.634   …     9.607
…         …          …          …          …     …        …
29    -13.319    -92.784   -174.765     89.633   …    -6.039
MFCC Feature Extraction Result for the Voice Data “Waalaikumsalam”
        0          1          2          3        …     1175
1    -136.464   -284.894    -12.566     54.861   …    -7.651
2       1.558    140.450    292.509    144.821   …   -11.666
3     -19.404     69.041    238.048    242.339   …   -10.454
4      71.928    154.744     54.342    110.388   …   -10.610
…         …          …          …          …     …        …
29     -8.349     39.595    154.606    133.131   …     4.333

D. Data Split
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.2,random_state=0)
E. Parameter Optimization
parameters = [ {"n_estimators":[10,20,30,40], "max_depth":[4, 5, 10, 20, 100, 300, 500],"random_state":[0]}]
Rf = GridSearchCV(RandomForestClassifier(), parameters, cv=10, scoring = "accuracy")
Rf.fit(X_train, y_train)
print(Rf.best_estimator_)
print(Rf.best_score_)
k_range = list(range(1, 51))
weight_options = ['uniform', 'distance']
param_grid = dict(n_neighbors=k_range, weights=weight_options)  # grid definition restored; it is referenced just below
knn = GridSearchCV(KNeighborsClassifier(), param_grid, cv=10, scoring='accuracy')
knn.fit(X_train, y_train)
print('Best parameters:{}\nbest score:{}'.format(knn.best_params_, knn.best_score_))
svm= GridSearchCV(SVC(kernel='linear'),
param_grid = {'C': [0.1, 1, 10, 100, 1000]},cv=10, scoring='accuracy')
svm.fit(X_train, y_train)
print('Best parameters:{}\nbest score:{}'.format(svm.best_params_, svm.best_score_))
kNN_= KNeighborsClassifier(n_neighbors= 1, weights= 'uniform')
svm_=SVC(kernel= 'linear', C= 0.1)
Rf_=RandomForestClassifier(max_depth=4, n_estimators=40, random_state=0)
Plotting accuracy with the best parameters
model_ = ['kNN', 'SVM', 'Rf']
knn_score = cross_val_score(kNN_, X_train, y_train, cv=10, scoring='accuracy', n_jobs=-2, verbose=1)
svm_score = cross_val_score(svm_, X_train, y_train, cv=10, scoring='accuracy', n_jobs=-2, verbose=1)
Rf_score= cross_val_score(Rf_, X_train, y_train, cv=10, scoring='accuracy', n_jobs=-2, verbose=1)
score_ = [knn_score, svm_score, Rf_score]
data = {m:s for m,s in zip(model_, score_)}
for name in data.keys():
    print("Accuracy %s: %0.2f (+/- %0.2f)" % (name, data[name].mean(), data[name].std() * 2))
sns.boxplot(data=pd.DataFrame(data), orient='h')
plt.savefig("plot.png")

F. Model Evaluation
svm_.fit(X_train, y_train)
y_pred = svm_.predict(X_test)
import sklearn
sklearn.metrics.accuracy_score(y_test, y_pred)
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
print('Accuracy : ', accuracy_score(y_test, y_pred))
kNN_.fit(X_train, y_train)
y_pred = kNN_.predict(X_test)  # prediction step restored so the metrics below evaluate the kNN model
import sklearn
sklearn.metrics.accuracy_score(y_test, y_pred)
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
print('Accuracy : ', accuracy_score(y_test, y_pred))
G. Confusion Matrix
from sklearn.metrics import plot_confusion_matrix
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(10, 10))
plot_confusion_matrix(svm_, X_test, y_test, cmap=plt.cm.Blues, ax=ax)
plt.savefig('confusion matrix.png')
plt.show()
from sklearn.metrics import plot_confusion_matrix
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(10, 10))
plot_confusion_matrix(kNN_, X_test, y_test, cmap=plt.cm.Blues, ax=ax)
plt.savefig('confusion matrix-knn.png')