
Vu Van Tam et al., Tạp chí KHOA HỌC & CÔNG NGHỆ (Journal of Science & Technology) 122(08): 53-58

EMBEDDING VIETNAMESE TEXT IN AUDIO DATA BASED ON THE CHARACTERISTICS OF VIETNAMESE WRITING

Vu Van Tam (1,*), Phan Trong Hanh (2)

(1) University of Engineering - Logistics of the People's Public Security (Ministry of Public Security)

(2) Military Technical Academy (Ministry of National Defence)

ABSTRACT

Any scheme for embedding Vietnamese text in audio data has to address two related issues: increasing embedding efficiency and keeping the embedded content confidential. We address both by analyzing the characteristics of Vietnamese writing and, from that analysis, constructing code keys that are used to encode and compress Vietnamese text before it is embedded. Experiments with common Vietnamese texts show that the number of bits to embed is significantly reduced compared with the conventional embedding method, while the content of the embedded text is kept confidential.

Keywords: text embedding; data embedding; text encoding; text compression; audio embedding

INTRODUCTION

Embedding Vietnamese text in audio data is one of the most basic problems in hiding information in digital data. Many approaches already exist, such as LSB (Least Significant Bit) embedding [2], [4], [5]; parity coding [2], [4]; phase coding [2], [5]; spread-spectrum techniques [1], [2]; and echo coding [2], [6]. These methods concentrate mainly on designing embedding algorithms that keep the embedded information stable.

Combining embedding with compression and encoding of the message can increase embedding efficiency and keep the message confidential. Based on a study of the characteristics of Vietnamese writing, we construct code keys that compress and encode Vietnamese text before it is embedded in audio data.

With this approach, the paper is organized as follows: characteristics of Vietnamese writing; model construction; algorithm construction; experiments and evaluation; and finally the conclusion.

CHARACTERISTICS OF VIETNAMESE WRITING

The Vietnamese character set is divided into two types:

Digits (0 to 9), which carry no tone mark; they can be combined to form larger numeric values;

Letters, which comprise three main components: the consonant component (b c t v d s w x đ g r f z y j h q p n m l k tr qu ch th kh nh gi ng ngh gh ph); the vowel component, which in this scheme covers vowels and rhymes together with punctuation and special symbols (for example a, ă, â, an, ăn, ân, am, ăm, âm, au, âu, ai, ao, ac, at, ach, anh, ang, ap, ay, o, ô, ơ, oa, oai, oanh, u, ư, uyêt, uyên, e, ê, i, iêng, iên, iêu, yêt, yêu, yên, and symbols such as ; / . = ? % ~ ! @ # $ ^ & * ( ) + " \ [ ] { }); and the tone component (no tone, nặng, huyền, sắc, hỏi and ngã). If digits are treated as consonants, the total number of consonants is 10 + 33 = 43, the total number of vowels is 185 and the total number of tones is 6. The sets of consonants, vowels and tones are therefore finite and fully known in advance.

With conventional embedding of Vietnamese text, each character is encoded as 8 bits; for example, the word "Nguyễn" has 6 characters and is encoded as 6 × 8 = 48 [bit]. If we instead split the word into consonant, vowel and tone and then encode each part, the bit counts are: consonant "Ng" = a [bit], vowel "uyên" = b [bit] and tone "~" = c [bit], for a total of d = a + b + c [bit]. If d < 48, the number of bits saved relative to conventional embedding is 48 − d = e [bit]. In addition, this encoding also protects the text to be embedded: the receiver needs the parameters of the code in order to decode the embedded content.
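The saving for "Nguyễn" can be checked with a few lines of Python. This is only a sketch: it assumes a = 6 and b + c = 11, which are the 6-bit and 11-bit code sizes that the algorithm sections give for the consonant code and the vowel-and-tone code, and the function names are hypothetical.

```python
import unicodedata

BITS_PER_CHAR = 8      # conventional embedding: one byte per character
CONSONANT_BITS = 6     # a, assumed from the 6-bit consonant code
VOWEL_TONE_BITS = 11   # b + c, assumed from the 11-bit vowel-and-tone code


def conventional_bits(word: str) -> int:
    """Bits needed when every character is stored as 8 bits."""
    # NFC keeps a letter such as "ễ" as a single character.
    return len(unicodedata.normalize("NFC", word)) * BITS_PER_CHAR


def proposed_bits(n_consonants: int, n_vowels: int) -> int:
    """Bits needed when consonant and vowel-plus-tone parts are coded separately."""
    return n_consonants * CONSONANT_BITS + n_vowels * VOWEL_TONE_BITS


d = proposed_bits(1, 1)                   # "Ng" + "uyên" with its tone
e = conventional_bits("Nguy\u1ec5n") - d  # bits saved for "Nguyễn"
```

Under these assumptions d is 17 bits and the saving e is 31 bits for this one word.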

MODEL CONSTRUCTION

Model diagram

[Figure 1. Model for embedding Vietnamese text in audio data. Blocks: text, letter splitter, consonant encoder, vowel encoder, code key, embedding, consonant decoder, vowel decoder.]

The letters of the Vietnamese text are fed into the letter splitter. The consonant component is encoded into a [bit] by the consonant encoder, and the vowel and tone components are encoded into (b + c) [bit] by the vowel encoder.

After the bit combiner we have d [bit] (with d = a + b + c). The LSB embedding method is used to embed the message bits into the original audio data and to extract them again. The receiver performs the sender's steps in reverse to recover the embedded text.
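The paper does not spell out how the letter splitter works. The following Python sketch shows one possible way to separate a syllable into consonant, vowel and tone, assuming Unicode decomposition is used to strip the tone mark and a longest-prefix match is used for the leading consonant. The names CONSONANTS, TONE_MARKS and split_syllable are hypothetical, and the tone names are written without diacritics.

```python
import unicodedata

# Consonant list as given in the text.
CONSONANTS = ["ngh", "tr", "qu", "ch", "th", "kh", "nh", "gi", "ng", "gh", "ph",
              "b", "c", "t", "v", "d", "s", "w", "x", "\u0111", "g", "r", "f",
              "z", "y", "j", "h", "q", "p", "n", "m", "l", "k"]

# Combining marks that carry the five marked tones.
TONE_MARKS = {"\u0323": "nang", "\u0300": "huyen", "\u0301": "sac",
              "\u0309": "hoi", "\u0303": "nga"}


def split_syllable(syllable: str):
    """Return (consonant, vowel, tone) for one Vietnamese syllable."""
    decomposed = unicodedata.normalize("NFD", syllable.lower())
    tone = "none"
    kept = []
    for ch in decomposed:
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]   # remember the tone, drop the mark
        else:
            kept.append(ch)
    base = unicodedata.normalize("NFC", "".join(kept))
    # Try the longest consonants first so that "ngh" wins over "ng" and "n".
    for cons in sorted(CONSONANTS, key=len, reverse=True):
        if base.startswith(cons):
            return cons, base[len(cons):], tone
    return "", base, tone


parts = split_syllable("Nguy\u1ec5n")   # the example word "Nguyễn"
```

For "Nguyễn" this yields the consonant "ng", the vowel part "uyên" and the tone "nga". The text lists y among the consonants and also in rhymes such as "yên", so a real splitter would need an explicit rule for that case; this sketch does not handle it.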

Algorithm construction

- Consonant component:

+ Code key: A matrix method is used. The cells in row 1 and the cells in column 1 of the matrix hold the bit combinations of the code key (Figure 2a). The cells left empty in the matrix are used to encode spaces or are kept in reserve. To encode a consonant, we take the bit combinations of the row and the column of the cell that contains it; for example, the consonant "gh" is encoded as 110000.

+ Computer implementation: To simplify the implementation, the code key is stored as a text file named phu_am.txt (Figure 2b).

[Figure 2. Consonant code key. (a) Code-key matrix for the consonant part, with column labels 00, 01, 10, 11 and row labels 0000 to 1011. (b) Structure of the file phu_am.txt, for example: 000000 b; 000001 c; 000010 t; ...; 101001 ngh.]

+ Encoding algorithm: P is the consonant to encode, ℓ[i] are the lines of the file phu_am.txt, and ρ is the 6-bit result of encoding.

For i ∈ 1,...,length(P) do
    P[i] ← P[i] + 32
end for
Open file phu_am.txt
For i ∈ 1,2,...,48 do
    ℓ[i] ← line[i] of phu_am.txt
    α ← copy(ℓ[i],7,4)
    For j ∈ 1,...,length(α) do
        α[j] ← α[j] + 32
    end for
    if α == P then
        ρ ← copy(ℓ[i],1,6)
        exit for
    end if
end for
close phu_am.txt
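The lookup above, and its inverse used for decoding, can be sketched in Python as follows. This is a minimal illustration that assumes the key file has already been read into a dictionary. The five entries are the ones quoted in the text and in the Figure 2b excerpt; the names CONSONANT_KEY, encode_consonant and decode_consonant are hypothetical, and using lower() for the case step is an assumption about what the +32 operation is meant to do.

```python
# Hypothetical in-memory form of phu_am.txt, limited to the entries
# that the text quotes explicitly.
CONSONANT_KEY = {
    "b": "000000",
    "c": "000001",
    "t": "000010",
    "ngh": "101001",
    "gh": "110000",
}

# Reverse table for decoding: 6-bit code -> consonant.
CODE_TO_CONSONANT = {code: cons for cons, code in CONSONANT_KEY.items()}


def encode_consonant(consonant: str) -> str:
    """Return the 6-bit code of a consonant (case-insensitive)."""
    return CONSONANT_KEY[consonant.lower()]


def decode_consonant(bits: str) -> str:
    """Return the consonant that a 6-bit code stands for."""
    return CODE_TO_CONSONANT[bits]
```

A dictionary replaces the line-by-line scan of the 48-entry file; the result is the same, and an unknown consonant or code raises KeyError instead of silently returning nothing.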

+ Decoding algorithm: Performs the encoding process in reverse; ρ is the 6-bit code to decode and β is the recovered consonant.

Open file phu_am.txt
For i ∈ 1,2,...,48 do
    ℓ[i] ← line[i] of phu_am.txt
    α ← copy(ℓ[i],1,6)
    if α == ρ then
        β ← copy(ℓ[i],7,4)
        exit for
    end if
end for
close phu_am.txt

- Vowel and tone component:

+ Code key: As for the consonant part, the code key for the vowel and tone part is constructed as follows (Figure 3a).

+ Computer implementation: Vowels are tied to tones, so each vowel is stored in the file as 6 lines corresponding to the 6 cases: no tone, nặng, huyền, sắc, hỏi and ngã (Figure 3b).

+ Encoding algorithm: ξ is the vowel to encode, ψ[i] are the lines of the file nguyen_am.txt, and ω is the 11-bit result of encoding.

For i ∈ 1,...,length(ξ) do
    ξ[i] ← ξ[i] + 32
end for
Open file nguyen_am.txt
For i ∈ 1,2,...,992 do
    ψ[i] ← line[i] of nguyen_am.txt
    α ← copy(ψ[i],12,4)
    For j ∈ 1,...,length(α) do
        α[j] ← α[j] + 32
    end for
    if α == ξ then
        ω ← copy(ψ[i],1,11)
        exit for
    end if
end for
close nguyen_am.txt
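The six-lines-per-vowel layout described above can be sketched in Python. The code assignment here is purely hypothetical: it assumes an 8-bit vowel index followed by a 3-bit tone index, and does not reproduce the paper's actual key matrix. The names TONES, build_vowel_key and encode_vowel, and the line format with the tone as a separate field, are also assumptions made for illustration.

```python
# Tone order as listed in the text, written without diacritics.
TONES = ["none", "nang", "huyen", "sac", "hoi", "nga"]


def build_vowel_key(vowels):
    """Return key lines, six per vowel, each '<11-bit code> <vowel> <tone>'."""
    lines = []
    for v_index, vowel in enumerate(vowels):
        for t_index, tone in enumerate(TONES):
            # Hypothetical layout: 8 bits for the vowel, 3 bits for the tone.
            code = format(v_index, "08b") + format(t_index, "03b")
            lines.append(f"{code} {vowel} {tone}")
    return lines


def encode_vowel(lines, vowel, tone):
    """Scan the key lines and return the 11-bit code of (vowel, tone)."""
    for line in lines:
        code, v, t = line.split(" ")
        if v == vowel and t == tone:
            return code
    raise KeyError((vowel, tone))


key_lines = build_vowel_key(["a", "uyen"])
```

With 185 vowel entries, 8 bits are enough for the vowel index and 3 bits are enough for the 6 tones, which is consistent with the 11-bit code size stated in the text.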

[Figure 3. Code key for the vowel and tone part. (a) Code-key matrix for the vowel and tone part. (b) Structure of the file nguyen_am.txt, for example: 00000000001 a; ...; 11111001000 yên.]

+ Decoding algorithm: ω is the 11-bit code to decode and ξ is the recovered vowel.

Open file nguyen_am.txt
For i ∈ 1,2,...,992 do
    ψ[i] ← line[i] of nguyen_am.txt
    α ← copy(ψ[i],1,11)
    if α == ω then
        ξ ← copy(ψ[i],12,4)
        exit for
    end if
end for
close nguyen_am.txt

- Embedding and extraction:

+ Embedding algorithm: δ is the message to embed, as a binary string; δ is divided into 4-bit segments that replace the 4 low-order bits of successive audio samples.

δ ← "bit start" + δ + "bit end"
Open file audio1
Open file audio2
For i ∈ 1,2,...,44 do
    B ← data[i] for file audio1
    data[i] to file audio2 ← B
end for
C ← data[41..44] for file audio1
C ← C / 2
For j ∈ 1,2,...,C do
    i ← j + 44
    B ← data[i] for file audio1
    Dau ← 1
    If B < 0 Then
        Dau ← −1
        B ← |B|
    End If
    ST ← ""
    For K ∈ 1,2,...,20 do
        ST ← Str(B mod 2) + ST
        B ← B \ 2
        If B = 0 Then Exit For
    end for
    If Length(ST) < 16 Then
        For K ∈ 1,2,...,(16 − Length(ST)) do
            ST ← "0" + ST
        end for
    End If
    ST ← copy(ST, 1, 12)
    Tin ← copy(δ, ((j − 1) * 4) + 1, 4)
    ST ← ST + Tin
    B ← 0
    H ← 1
    For M ∈ 16,15,...,1 do
        Tin ← copy(ST, M, 1)
        G ← Val(Tin)
        B ← B + (H * G)
        H ← H + H
    end for
    B ← B * Dau
    data[i] to file audio2 ← B
end for
Close file audio1
Close file audio2
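The embedding step can be condensed into a short Python sketch that works on a list of signed 16-bit samples instead of a WAV file. It uses "00110011" and "11001100" as the start and end markers, the values that appear in the extraction listing. The function name embed, the zero padding of the payload to a multiple of 4 bits, and the clamping of the magnitude to 32767 are assumptions that are not in the source.

```python
START, END = "00110011", "11001100"   # start and end markers


def embed(samples, payload_bits):
    """Return a copy of samples with the framed payload in the low 4 bits."""
    # Assumption: pad the payload so the end marker stays nibble-aligned.
    padded = payload_bits + "0" * (-len(payload_bits) % 4)
    framed = START + padded + END
    chunks = [framed[k:k + 4] for k in range(0, len(framed), 4)]
    if len(chunks) > len(samples):
        raise ValueError("message does not fit into the audio data")
    out = list(samples)
    for k, chunk in enumerate(chunks):
        sign = -1 if out[k] < 0 else 1       # "Dau" in the listing
        magnitude = min(abs(out[k]), 32767)  # keep the result within 16 bits
        # Keep the 12 high bits of the magnitude, replace the 4 low bits.
        magnitude = (magnitude & ~0xF) | int(chunk, 2)
        out[k] = sign * magnitude
    return out


stego = embed([1000, -1000, 5, -7, 300, 0, 12345, -20000], "1010")
```

Bit masking replaces the string-based binary conversion of the listing but has the same effect: the sign and the 12 high-order bits of each sample are kept, and only the 4 low-order bits change.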

+ Extraction algorithm: The message bit string (δ) is formed by taking the 4 low-order bits of consecutive audio samples until the bit sequence that marks the end of the embedded message is found.

Open file audio2
C ← data[41..44] for file audio2
C ← C / 2
Giai ← 0
For j ∈ 1,2,...,C do
    i ← j + 44
    B ← data[i] for file audio2
    B ← |B|
    ST ← ""
    For K ∈ 1,2,...,20 do
        ST ← Str(B Mod 2) + ST
        B ← B \ 2
        If B == 0 Then Exit For
    end for
    If Length(ST) < 16 Then
        For K ∈ 1,2,...,(16 − Length(ST)) do
            ST ← "0" + ST
        end for
    End If
    ST ← copy(ST, 13, 4)
    If Giai == 0 Then
        Chuoi ← Chuoi + ST
    Else
        δ ← δ + ST
    End If
    If Length(Chuoi) >= 8 Then
        ST ← copy(Chuoi, Length(Chuoi) − 7, 8)
        If ST == "00110011" Then Giai ← 1
    End If
    If Length(δ) >= 8 Then
        ST ← copy(δ, Length(δ) − 7, 8)
        If ST == "11001100" Then
            Giai ← 2
            δ ← copy(δ, 1, Length(δ) − 8)
        End If
    End If
    If Giai == 2 Then Exit For
End for
Close file audio2
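The extraction loop can be sketched in Python in the same spirit, again on a list of signed 16-bit samples rather than a WAV file. The marker values come from the listing above; the function name extract and the choice to return None when no complete message is found are assumptions.

```python
START, END = "00110011", "11001100"   # start and end markers


def extract(samples):
    """Return the payload bits between the markers, or None if incomplete."""
    seen = ""        # low nibbles read before the start marker ("Chuoi")
    payload = ""     # low nibbles read after the start marker
    started = False
    for sample in samples:
        nibble = format(abs(sample) & 0xF, "04b")   # 4 low bits of |sample|
        if not started:
            seen += nibble
            if seen.endswith(START):
                started = True
        else:
            payload += nibble
            if payload.endswith(END):
                return payload[:-len(END)]   # strip the end marker
    return None


message = extract([995, -995, 10, -12, 300, 0, 12345])
```

Note that a payload which happens to contain the end-marker pattern on a 4-bit boundary would stop extraction early; the listing in the text has the same property.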

EXPERIMENTS AND EVALUATION

The audio data used in the experiments is the sound file chimes.wav shipped with Windows. Its data section has size S_a = 35380 [byte], and its other parameters are shown in Figure 4. The messages to embed are Vietnamese text passages of different lengths (numbers of characters). LSB embedding without compression [2], [4], [5] is used as the baseline for comparing embedding efficiency and the security of the embedded message. The embedding ratio (B) reaches 100% when the number of embedded bits equals S_a / 4; with S_a counted in bits (35380 bytes = 283040 bits) this capacity is 70760 bits, which is consistent with the ratios reported in the tables. The experiment interface is programmed in Visual Basic (Figure 4). In run 1, the message is a passage of common Vietnamese text containing letters (consonants and vowels) and digits; in run 2, the text consists entirely of consonants; in run 3, the text consists entirely of 160 vowels and the spaces between them. The experimental results and the comparison are presented in Tables 1a and 1b.

Table 1a. Results of conventional embedding (no encoding, no compression)

Run       Text length (T)       Information size (M2)   Embedding ratio (B2)   Security
1         8,000 characters      64,000 bit              90.45 %                No
2         7,700 characters      61,600 bit              87.055 %               No
3         554 characters        18,240 bit              25.78 %                No
Average   5,418 characters      47,947 bit              67.76 %                No

Table 1b. Results of embedding with the proposed model (with encoding, with compression)

Run       Text length (T)       Information size (M1)   Embedding ratio (B1)   Security
1         8,000 characters      38,400 bit              54.27 %                Yes
2         7,700 characters      56,462 bit              79.79 %                Yes
3         554 characters        2,720 bit               3.84 %                 Yes
Average   5,418 characters      32,527 bit              45.97 %                Yes


[Figure 4. Experiment interface for the model.]

[Figure 5. Waveforms of the experimental audio data: (a) audio data before embedding; (b) audio data after embedding with the proposed model; (c) audio data after conventional embedding.]

The size of the information to embed (M) and the embedding ratio (B) are computed as follows:

- Conventional embedding:

$M_2 = T \times 8$ [bit];

$B_2 = (M_2 \times 100) / (S_a / 4)$ [%];

- With encoding and compression:

$M_1 = (P_a \times 6) + (Ng_a \times 11)$ [bit];

$B_1 = (M_1 \times 100) / (S_a / 4)$ [%];

where $P_a$ and $Ng_a$ are the numbers of consonants and vowels in the text to embed.
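The formulas can be checked against the reported results with a short Python sketch. It rests on two assumptions: S_a is taken in bits (35380 bytes × 8), which is the reading under which the published ratios come out right, and each of the 160 spaces in run 3 is coded with a 6-bit code from the consonant matrix. The function and constant names are hypothetical.

```python
AUDIO_DATA_BYTES = 35380
# Capacity at 4 bits per 16-bit sample, i.e. S_a / 4 with S_a in bits.
CAPACITY_BITS = AUDIO_DATA_BYTES * 8 // 4


def conventional_size(n_chars: int) -> int:
    """M2 = T x 8 [bit]."""
    return n_chars * 8


def proposed_size(n_consonants: int, n_vowels: int) -> int:
    """M1 = P x 6 + Ng x 11 [bit]."""
    return n_consonants * 6 + n_vowels * 11


def embedding_ratio(m_bits: int) -> float:
    """B = M x 100 / (S_a / 4) [%]."""
    return m_bits * 100 / CAPACITY_BITS


m2_run1 = conventional_size(8000)   # run 1, conventional embedding
m1_run3 = proposed_size(160, 160)   # run 3: 160 spaces and 160 vowels
b2_run1 = round(embedding_ratio(m2_run1), 2)
b1_run3 = round(embedding_ratio(m1_run3), 2)
```

Under these assumptions the sketch gives 64,000 bits at 90.45 % for run 1 with conventional embedding, and 2,720 bits at 3.84 % for run 3 with the proposed model.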

The experiments with three kinds of text (Table 1) all give better results than conventional embedding. Reducing the amount of information to embed lowers the embedding ratio and therefore the impact on the quality of the original audio data (Figure 5). The algorithm also protects the content of the embedded text: the receiver needs the same code key as the sender in order to decode the message. In particular, when ordinary texts (with many vowel components) are embedded, the amount of information to embed falls considerably and security is higher.

From the structure of the two code-key matrices, security can be assessed as follows:

- Security of the consonant code key:

$K_{PA} = 16! \times 4!$

- Security of the vowel code key:

$K_{Ng} = 16! \times 16!$

- Security of the system:

$K_{HT} = K_{PA} \times K_{Ng} = (16! \times 4!) \times (16! \times 16!)$

The value of $K_{HT}$ is therefore very large. Moreover, rearranging the bit combinations in row 1 and column 1 of the code-key matrices produces a new code key.
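To give a sense of scale, the three key-space values can be evaluated directly. This Python sketch only computes the formulas as stated; the variable names and the conversion to an equivalent number of bits are additions for illustration.

```python
import math

k_pa = math.factorial(16) * math.factorial(4)    # consonant code key
k_ng = math.factorial(16) * math.factorial(16)   # vowel code key
k_ht = k_pa * k_ng                               # whole system

# Size of the key space expressed as an equivalent number of bits.
key_bits = math.log2(k_ht)
```

The consonant key space alone is 502,146,957,312,000, and the combined key space corresponds to roughly 137 bits.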

CONCLUSION

Combining encoding and compression of Vietnamese text with embedding in audio data is a new approach in digital signal processing.

By analyzing the characteristics of Vietnamese writing, we proposed a model and constructed the algorithms needed to carry it out. Experiments with different kinds of text show that the amount of information to embed, M_1, and the embedding ratio, B_1, are significantly reduced compared with conventional embedding, while the content of the Vietnamese text is kept confidential. These results are meaningful for further research on embedding Vietnamese text in other kinds of digital data.

REFERENCES

1. Vu Dinh Ba, "Giấu thông tin trong cơ sở dữ liệu không gian," Tạp chí Nghiên cứu khoa học kỹ thuật và công nghệ Quân sự, no. 4, 30-37.

2. Nguyen Xuan Huy, Huynh Ba Dieu, "Nghiên cứu kỹ thuật giấu tin trong audio hỗ trợ xác thực," Tạp chí Khoa học ĐHQGHN, Khoa học Tự nhiên và Công nghệ, no. 1 (25), 69-74.

3. F. Siebenhaar, C. Neubauer, R. Bäuml, and J. Herre, "New High Data Rate Audio Watermarking based on SCS (Scalar Costa Scheme)," in 113th Convention of the AES, Los Angeles, USA, October 5-8, 2002, preprint 5645.

4. R. Z. Wang, C. F. Lin, and J. C. Lin, "Image Hiding by LSB Substitution and Genetic Algorithm," Proceedings of International Symposium on Multimedia Information Processing, Chung-Li, Taiwan, R.O.C., December 1998, 671-683.

5. I. J. Cox et al., "Secure Spread Spectrum Watermarking of Images, Audio and Video," Proc. IEEE International Conf. on Image Processing, ICIP-96, Vol. 3, pp. 243-246.

6. B. Dickinson, B. Tao, "Adaptive Watermarking in DCT Domain," Proc. of IEEE International Conf. on Acoustics, Speech and Signal Processing, ICASSP-97, Vol. 4, pp. 2985-2988, 1997.

SUMMARY

EMBEDDING VIETNAMESE TEXT IN AUDIO DATA BASED ON THE CHARACTERISTICS OF THE VIETNAMESE WRITING

Vu Van Tam (1,*), Phan Trong Hanh (2)

(1) Institute of Engineering - Logistics People's Public Security (Ministry of Public Security),

(2) Le Quy Don University of Science and Technology (Ministry of National Defence)

Embedding Vietnamese text in audio data must solve two related problems: increasing embedding efficiency and securing the embedded content. We solve them by analyzing the characteristics of Vietnamese letters and, from that, building a code key used to encrypt and compress Vietnamese text before embedding. Test results with common Vietnamese texts show that the number of information bits to embed is significantly reduced compared with conventional embedding methods, and that the embedded text content is kept confidential.

Keywords: Embedded text; embedded data; text encoding; text compression; embedded audio

Received: 02/6/2014; Reviewed: 16/6/2014; Accepted for publication: 25/8/2014. Scientific reviewer: Dr. Luu Duc Kham, University of Engineering - Logistics of the People's Public Security

(*) Tel: 0168975888, Email: tamt36bca@gmail.com
