Tạp chí Khoa học Giáo dục Kỹ thuật (32/2015), Trường Đại học Sư phạm Kỹ thuật TP. Hồ Chí Minh
PREDICTION IN TIME SERIES USING SIMILARITY SEARCH PROBLEM
Nguyen Thanh Son
Trường Đại học Sư phạm Kỹ thuật TP.HCM
Received by the editorial office 17/3/2015; reviewed 03/4/2015; accepted for publication 15/4/2015
ABSTRACT
Time series forecasting is an important problem in many domains and has received a lot of interest from researchers in recent years. In this paper, we investigate the use of similarity search (pattern matching) for predicting seasonal or trend time series. The method works as follows: (1) extract the sequence immediately preceding the interval to be forecast; (2) use this sequence as a pattern to search the historical data for its k nearest neighbors, or for the neighbors within a given similarity threshold T; (3) retrieve the sequences that immediately follow each neighbor found, each with length equal to the prediction interval; and (4) compute the forecast sequence by averaging the sequences retrieved in step 3. The experimental results show that, on seasonal or trend time series, this approach is competitive with an artificial neural network (ANN) in terms of both prediction accuracy and execution time. In our experiments, we also examine the impact of the parameters k and T on prediction accuracy.
Keywords: time series, prediction, similarity search.
I. INTRODUCTION
A time series is a sequence of real numbers, each representing a value measured at equally spaced points in time. Time series data arise in many applications in fields such as science, engineering, economics, finance, medicine, and public administration.
Forecasting is one of the most challenging and complex tasks in time series data mining. A time series forecasting system predicts future values of a series by examining data collected in the past. Forecast accuracy underpins many decision-making processes, so research on improving the effectiveness of forecasting methods will never end. Forecasting methods are usually divided into three categories: short-term, medium-term, and long-term.
Short-term forecasting predicts what will happen over a short future horizon, such as days, weeks, or months.
Medium-term forecasting predicts what will happen over a longer future horizon, such as one or two years.
Long-term forecasting predicts what will happen many years into the future.
Our work studies how to use similarity search for forecasting seasonal or trend time series. First, the method extracts the sequence immediately preceding the interval to be forecast. This sequence is then used as a pattern to search for its k nearest neighbors, or for the neighbors within a given threshold T. Next, the sequences immediately following each neighbor found are extracted; their length equals the length of the interval to be forecast. Finally, the forecast sequence is obtained by averaging the sequences extracted in the previous step.
In the experiments, we compare the proposed forecasting method with the ANN method. ANN was chosen for comparison because it has been widely used for time series forecasting in recent years and forecasts nonlinear, complex time series better than traditional methods ([22]). We also examine the effect of the parameters k and T on forecast accuracy.
The experimental results show that this approach is competitive, in both accuracy and execution time, with ANN-based forecasting on trend or seasonal time series.
The rest of the paper is organized as follows. Section 2 summarizes the background and the related work of other authors. Section 3 describes the time series forecasting method we propose. Section 4 evaluates the proposed method experimentally on real datasets. Section 5 gives conclusions and directions for future work.
II. BACKGROUND AND RELATED WORK
1. Background
Euclidean distance
The Euclidean distance is a simple way to measure the similarity of time series. Given two time series Q = {q_1, ..., q_n} and C = {c_1, ..., c_n}, the Euclidean distance between Q and C is defined as:
$$D(Q, C) = \sqrt{\sum_{i=1}^{n} (q_i - c_i)^2}$$
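The definition above can be sketched in a few lines of Python. The function name `euclidean_distance` and the sample values are illustrative choices, not part of the original work.

```python
import math


def euclidean_distance(q, c):
    """Euclidean distance D(Q, C) between two equal-length sequences."""
    if len(q) != len(c):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((qi - ci) ** 2 for qi, ci in zip(q, c)))


# Illustrative values: sqrt(4^2 + 3^2) = 5.0
print(euclidean_distance([0.0, 3.0], [4.0, 0.0]))  # 5.0
```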
MP_C (Middle Points and Clipping) representation
This is a dimensionality reduction method for time series that we proposed in earlier work [17]. It can be summarized as follows.
Given a time series C of length n, C is divided into m equal segments (m is chosen by the user). The middle point of each segment is extracted and converted to one bit of a binary string: the bit is 1 if the middle point lies above the mean line, and 0 otherwise. The mean value and the resulting binary string are stored as the features of the series.
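A minimal sketch of this reduction follows, under two assumptions that the summary above does not spell out: the mean line is the mean of the whole series, and the middle point of a segment is the element at index `seg_len // 2` within it. The function name `mp_c` is hypothetical; see [17] for the authoritative definition.

```python
def mp_c(series, m):
    """Return (mean, bits): the series mean and one clipped bit per segment.

    Assumes m divides len(series) so that all segments have equal length.
    """
    n = len(series)
    if m <= 0 or n % m != 0:
        raise ValueError("this sketch assumes m divides the series length")
    seg_len = n // m
    mean = sum(series) / n
    bits = []
    for s in range(m):
        # Assumed convention: middle point = element seg_len // 2 of segment s.
        middle = series[s * seg_len + seg_len // 2]
        bits.append(1 if middle > mean else 0)
    return mean, bits


# Illustrative values: mean is 6.5, middle points are 2 and 11.
print(mp_c([1, 2, 3, 10, 11, 12], 2))  # (6.5, [0, 1])
```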
Multidimensional index structures for time series
Commonly used multidimensional index structures are the R-tree and its variants ([6], [1]). An R-tree is a balanced tree similar to a B-tree. In a multidimensional index structure such as the R-tree or the R*-tree, each node is associated with a minimum bounding rectangle (MBR). The MBR of a node is the smallest region that encloses its child nodes. Each entry in a leaf node holds the MBR of a time series and a pointer to the original data object enclosed by that MBR. A weakness of the R-tree is that the MBRs of nodes at the same level can overlap. Such overlap can degrade the performance of index-based search.
The Skyline index was proposed by Li et al. in 2004 [16] to overcome the overlap between the bounding rectangles inside the MBRs of the series. It does so by defining a new bounding region, called the skyline bounding region (SBR), to replace the MBR. An SBR approximates and represents a group of time series by their common shape. It is defined in the same time-value space as the time series themselves. The SBR lets us define a distance function that lower-bounds the distance between a query and a group of time series.
An SBR consists of a single region, so no overlap occurs. The authors showed experimentally that the skyline index can improve the performance of similarity search by up to 3 times [16].
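To illustrate what a lower-bounding distance to a group of series looks like, here is a simplified sketch. It assumes the region is stored as a per-position lower and upper envelope of the raw series; the actual SBR construction in [16] works on reduced representations and may differ. The names `bounding_region` and `lower_bound_distance` are hypothetical.

```python
import math


def bounding_region(group):
    """Per-position (lower, upper) envelope of equal-length sequences."""
    lower = [min(column) for column in zip(*group)]
    upper = [max(column) for column in zip(*group)]
    return lower, upper


def lower_bound_distance(query, lower, upper):
    """Distance from the query to the envelope.

    Never exceeds the Euclidean distance from the query to any sequence
    inside the envelope, so a group can be skipped when this value is
    already too large.
    """
    total = 0.0
    for q, lo, hi in zip(query, lower, upper):
        if q > hi:
            total += (q - hi) ** 2
        elif q < lo:
            total += (lo - q) ** 2
    return math.sqrt(total)


group = [[1, 2, 3], [2, 3, 4]]          # illustrative values
lower, upper = bounding_region(group)
print(lower_bound_distance([0, 2.5, 6], lower, upper))
```

The bound is useful for pruning: if it exceeds the current search radius, no series in the group can be a match.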
2. Previous research
Many time series forecasting methods have been introduced and applied in practice. Commonly used ones include exponential smoothing ([7]), the ARIMA (autoregressive integrated moving average) model ([3], [13], [14]), artificial neural networks (ANN) ([2], [4], [8], [9], [21], [22]), and support vector machines ([15], [19]). Of these, exponential smoothing and ARIMA are linear models, since they capture only the linear characteristics of a time series, whereas ANN is a nonlinear model that has been used for time series forecasting. Whether an ANN can handle trend and seasonal data effectively, however, remains controversial, with conflicting findings in the time series forecasting community [22].
In 2007, Nayak and te Braak proposed a forecasting method for stock market data that uses a clustering algorithm [18]. It is based on the idea that a cluster formed around an event can be used to estimate a future event. The cluster should be determined with the smallest possible radius.
Also in 2007, Troncoso et al. proposed a method called pattern sequence-based forecasting (PSF) [20]. It uses the k-Means algorithm to cluster the data and produce a sequence of cluster labels, and then forecasts from these labels. This approach introduced a new methodology that provides forecasting rules based on data labels obtained automatically from the clustering algorithm. In 2011, the method was applied to forecasting electricity market prices and electricity demand [5]. In our experiments, however, we found that the forecast depends on the number of clusters, and that finding the best number by clustering repeatedly is time-consuming. In addition, in some unusual cases where the searched pattern does not occur in the training set, the method cannot forecast future events even when the pattern length is 1.
In 2009, Jiang et al. proposed a method for forecasting stock time series based on motif information [12]. After the most significant motif in a time series is discovered, it is split into two parts: a prefix and a postfix. If the current pattern of the time series matches the prefix of the motif, the value at the next time step can be predicted from the postfix. Because the motif discovery algorithm used in that work is not efficient, the forecast accuracy and the time efficiency of motif-based forecasting are still limited.
In 2010 and 2012, Huang et al. proposed a strategy that combines k-nearest neighbors with the least squares support vector machine (LS-SVM) model for long-term time series forecasting [10], [11].
III. PROPOSED METHOD
We use a k-nearest-neighbor search, or a search for the neighbors within a given threshold, built on a multidimensional index structure such as the skyline index.
The k-nearest-neighbor approach is a non-parametric forecasting technique, in the sense that the user does not need to know in advance any theoretical relationship between the outputs and the inputs of the forecasting problem. It is therefore natural and intuitive.
The main idea is to identify patterns in the past that match the current pattern, and to use knowledge of how the series evolved in those similar situations to forecast how it will evolve in the future. Moreover, with the k-nearest-neighbor approach, forecast patterns can be fed back into the dataset for use in later forecasts, so the forecast horizon can be extended as required (this technique is called iterated prediction). Figure 1 shows the basic idea of this approach.
[Figure 1: flowchart with the boxes "Normalized data", "Find the nearest neighbors", "Similar sequences", "Predicted pattern", and "End".]
Figure 1. Basic idea of the approach based on pattern matching.
The time series forecasting algorithm based on the k-nearest-neighbor technique works as follows. Given a current state (pattern) of length w in a time series of length n (w ≪ n), we must predict the sequence of length m (m ≤ w) that occurs next in time, that is, forecast m steps into the future. First, the algorithm searches for the k nearest neighbors of the pattern, or for the neighbors within a given threshold T. It then takes the sequences of length m that lie immediately to the right of the neighbors found. Finally, the forecast sequence is estimated as the arithmetic mean of these sequences. If further sequences need to be forecast, the estimated sequence can be appended to the dataset and used in forecasting the next patterns.
Figure 2 illustrates the proposed algorithm with an example, and Figure 3 lists its main steps.
[Figure 2: plot labelled "Pattern", "Sequence to be predicted", and "Estimated sequence".]
Figure 2. Illustration of the proposed algorithm.
We apply the MP_C method [17] together with the skyline index [16] to pattern-matching-based forecasting of time series that have a trend or seasonal variation. The skyline index was chosen because it has several advantages over the R*-tree.
Input: A time series D of length n1, a test set TS of length n2, the window length w, the number of nearest neighbors k (or a threshold T), and the length m of the sequence to be forecast (m ≤ w ≤ n2 and w ≪ n1).
Output: The estimated sequence S of length m.
1. Reduce the dimensionality of the subsequences of length w in D and insert them into a multidimensional index structure (if needed).
2. Take the sequence S (the pattern) of length w that precedes the position to be forecast in TS.
3. Find the k nearest neighbors of S (or the neighbors within the threshold T).
4. For each neighbor found in step 3, retrieve the sequence of length m that immediately follows it in D.
5. Compute the arithmetic mean of the sequences found in step 4.
6. Return the estimate from step 5.
7. Insert the estimated sequence from step 5 into D to forecast subsequent patterns, and go back to step 1 (if needed).
Figure 3. Main steps of the proposed forecasting algorithm.
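Steps 2 to 6 of Figure 3 can be sketched as below. This is a simplified illustration, not the authors' C# implementation: it replaces MP_C and the skyline index with a linear scan over the raw series, and the function names, parameter names, and toy data are all assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def forecast(history, pattern, m, k=None, threshold=None):
    """Estimate the m values that follow `pattern` (steps 2-6 of Figure 3).

    history   -- past values (the series D)
    pattern   -- the w most recent values (the pattern S)
    m         -- number of future values to estimate
    k         -- use the k nearest neighbors, or
    threshold -- use every neighbor within this distance (the threshold T)
    """
    if (k is None) == (threshold is None):
        raise ValueError("give exactly one of k or threshold")
    w = len(pattern)
    # Step 3: a linear scan stands in for the index lookup. Only windows
    # that still have m values after them are candidates.
    scored = [
        (euclidean(history[s:s + w], pattern), s)
        for s in range(len(history) - w - m + 1)
    ]
    if k is not None:
        neighbors = sorted(scored)[:k]
    else:
        neighbors = [item for item in scored if item[0] <= threshold]
    if not neighbors:
        raise ValueError("no neighbor found; cannot forecast")
    # Step 4: the m values immediately after each neighbor.
    followers = [history[s + w:s + w + m] for _, s in neighbors]
    # Step 5: position-wise arithmetic mean.
    return [sum(column) / len(followers) for column in zip(*followers)]


# Toy seasonal series with period 4 (made-up data, not from the paper).
history = [1.0, 2.0, 3.0, 4.0] * 5
print(forecast(history, pattern=[1.0, 2.0, 3.0], m=1, k=2))  # [4.0]
```

For step 7, the returned estimate would be appended to `history` before the next call, which is the iterated prediction described earlier.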
Note that when m is smaller than w, a variable can be used to accumulate the estimated sequences until their total length equals w. The accumulated sequence can then be inserted into the index structure, so the index does not have to be rebuilt when returning to step 1.
IV. EXPERIMENTAL EVALUATION
1. Experimental environment and data
We compare the performance of the proposed forecasting method with that of the ANN method. The experiments were run on four real datasets: Temperatures at Savannah International Airport, Fraser River (FR), Milk Production (MP), and Carbon Dioxide (CD). The proposed method was implemented in Microsoft Visual C# on a laptop with a Core i3 processor and 2 GB of RAM. The ANN (using Spice-Neuro) had the following structure: 12 input nodes, 3 output nodes for the MP and CD datasets, and 12 output nodes for the other datasets. The two methods were run on all segments of the test set, and the mean error over the forecast interval was then computed.
Each dataset was divided into two subsets in a ratio of approximately 9:1, with about 90% used as the training set and about 10% as the test set. The datasets used in the experiments are described below.
Temperatures at Savannah International Airport dataset, from 1/1910 to 12/2010. The training set covers 1/1910 to 12/2000 and the test set covers 1/2001 to 12/2010.
Fraser River dataset, from 1/1913 to 12/1990. The training set covers 1/1913 to 12/1982 and the test set covers 1/1983 to 12/1990.
Milk Production dataset, from 1/1962 to 12/1975. The training set covers 1/1962 to 12/1971 and the test set covers 1/1972 to 12/1975.
Carbon Dioxide dataset, from 1/1959 to 12/2008. The training set covers 1/1959 to 12/1998 and the test set covers 1/1999 to 12/2008.
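The splits above all fall on year boundaries, with the most recent years held out. A minimal sketch of such a chronological split for monthly data follows; the helper name `split_by_years` and the placeholder values are assumptions for illustration only.

```python
def split_by_years(series, test_years, period=12):
    """Hold out the last `test_years` whole years as the test set."""
    test_len = test_years * period
    if test_len <= 0 or test_len >= len(series):
        raise ValueError("test set must be non-empty and shorter than the series")
    return series[:-test_len], series[-test_len:]


# Temperatures dataset: monthly values from 1/1910 to 12/2010, i.e. 101 years.
placeholder = [0.0] * (101 * 12)   # stand-in values, not the real data
train, test = split_by_years(placeholder, test_years=10)
print(len(train), len(test))  # 1092 120
```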
All of the datasets above were obtained from the website http://www.datamarket.com. Figure 4 shows the shapes of the experimental datasets graphically.
2. Evaluation criteria
In this paper we use two common evaluation criteria, the mean error relative to the mean of the observed values (MER, Mean Error Relative) and the mean absolute error (MAE), defined as follows [5]:
$$\mathrm{MER} = 100 \times \frac{1}{N} \sum_{i=1}^{N} \frac{|x_{model,i} - x_{obs,i}|}{\bar{x}_{obs}}$$
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |x_{model,i} - x_{obs,i}|$$
where x_{obs,i} is the observed value and x_{model,i} is the value computed by the model at time i, \bar{x}_{obs} is the mean of the observed values over the period under consideration, and N is the length of the forecast sequence.
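The two criteria can be computed as in the following sketch, which follows the definitions in [5] as understood here (MER is 100 times MAE divided by the mean of the observed values). The function names and sample values are illustrative.

```python
def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    n = len(observed)
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / n


def mer(observed, predicted):
    """Mean error relative to the mean observed value, in percent."""
    mean_obs = sum(observed) / len(observed)
    return 100.0 * mae(observed, predicted) / mean_obs


observed = [1.0, 3.0]     # illustrative values, mean 2.0
predicted = [2.0, 2.0]
print(mae(observed, predicted))  # 1.0
print(mer(observed, predicted))  # 50.0
```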
[Figure 4: four panels, (a) Temperatures, (b) Fraser River, (c) Carbon Dioxide, (d) Milk Production.]
Figure 4. Shapes of the four experimental datasets.
3. Experimental results
To examine the effect of k and the threshold T on forecast accuracy, we ran experiments with different values of k and T and then averaged the forecast errors. Table 1 lists the forecast errors on the Fraser River dataset with k varying from 1 to 10.
Table 1. Forecast errors on the Fraser River dataset for different values of k.
k      MER (%)    MAE
1      26.62      0.055
2      29.20      0.060
3      23.74      0.049
4      22.46      0.046
5      24.39      0.050
6      24.31      0.050
7      23.29      0.048
8      22.70      0.047
9      23.00      0.047
10     22.66      0.047
The results show that the forecast error varies with k. In this experiment, the error is smallest for k = 4.
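Picking the best k from such a sweep amounts to taking the minimum over the measured errors. The sketch below uses the MER values reported in Table 1; the variable names are illustrative.

```python
# MER (%) for each k, copied from Table 1 (Fraser River).
mer_by_k = {
    1: 26.62, 2: 29.20, 3: 23.74, 4: 22.46, 5: 24.39,
    6: 24.31, 7: 23.29, 8: 22.70, 9: 23.00, 10: 22.66,
}
best_k = min(mer_by_k, key=mer_by_k.get)
print(best_k, mer_by_k[best_k])  # 4 22.46
```

The same selection applies to the threshold T, using the errors measured for each candidate value.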
Table 2 gives the forecast errors on the Fraser River dataset for different values of T. The results show that the forecast error varies with T. In this experiment, the error is smallest for T = 0.21.
Table 2. Forecast errors on the Fraser River dataset for different values of T.
T       MER (%)    MAE
0.15    27.94      0.056
0.17    27.05      0.055
0.19    25.64      0.052
0.21    23.11      0.047
0.23    25.29      0.051
0.25    25.91      0.052
Table 3 gives the forecast errors on the Fraser River dataset with the best value of k when forecasting with k-nearest-neighbor search (k-NN) and the best value of T when forecasting with threshold-based neighbor search (range search). The errors are computed for each year. The last row of the table is the mean forecast error over the eight years. The results show that the forecast errors in the two cases are approximately equal.
Table 3. Forecast errors on the Fraser River dataset with the best values of k and T.
Year    MER (%), k-NN    MER (%), range search    MAE, k-NN    MAE, range search
1       24.27            21.87                    0.06         0.06
2       18.94            16.75                    0.04         0.03
3       28.48            22.39                    0.06         0.05
4       15.15            26.86                    0.03         0.05
5       25.77            22.66                    0.05         0.05
6       32.20            28.52                    0.06         0.05
7       18.57            20.86                    0.04         0.04
8       21.12            25.02                    0.04         0.05
Mean    23.06            24.16                    0.05         0.05
Table 4 gives the forecast errors on the Temperatures at Savannah International Airport dataset. The errors are computed for each year. The last row of the table is the mean forecast error over the ten years.
Because of the page limit of the paper, Table 5 presents only the aggregated results of the experiments on the other datasets. The values in the table are the mean forecast errors over the forecast years.
The results show that, although in some years the forecast error of the proposed method is larger than that of ANN, its mean forecast error over the forecast years is always smaller than that of ANN. The only exception is the Carbon Dioxide dataset, where the mean MAE with k-nearest neighbors is slightly larger than the mean MAE of ANN. Even there, the mean MER with k-nearest neighbors is smaller than the mean MER of ANN.
Table 4. Forecast errors on the Temperatures at Savannah International Airport dataset.
Year    MER (%), k-NN    MER (%), ANN    MAE, k-NN    MAE, ANN
1       7.555            17.814          0.043        0.065
2       6.779            11.666          0.039        0.059
3       8.316            11.523          0.047        0.039
4       6.288            10.239          0.035        0.036
5       7.652            8.921           0.042        0.039
6       8.329            10.053          0.047        0.040
7       7.570            9.590           0.044        0.044
8       7.767            11.335          0.045        0.053
9       5.004            8.298           0.029        0.035
10      14.542           14.394          0.081        0.049
Mean    7.980            11.383          0.045        0.046
Table 5. Mean forecast errors on the other datasets.
Dataset            MER (%), k-NN    MER (%), ANN    MAE, k-NN    MAE, ANN
Fraser River       23.06            24.16           0.05         0.06
Milk Production    8.06             14.73           0.09         0.10
Carbon Dioxide     3.38             3.61            0.037        0.032
Besides accuracy, we also compare the two forecasting methods on execution time. Table 6 gives the execution time (in seconds) of the two methods on the four datasets. The results show that forecasting with k-nearest neighbors always runs faster than the ANN method.
Table 6. Execution time (in seconds) of the two forecasting methods on the four datasets.
Dataset            ANN    k-NN
Temperatures       50     0.262
Milk Production    4      0.464
Carbon Dioxide     37     1.261
Fraser River       58     0.199
V. CONCLUSIONS AND FUTURE WORK
In this paper, we have proposed a method for forecasting seasonal or trend time series using similarity search. In this approach, we use the MP_C dimensionality reduction method together with the Skyline index to speed up the similarity search. We also examined the effect of k and T on forecast accuracy. The experiments show that, with suitable values of k and T, forecasting with similarity search gives better results than ANN in both accuracy and execution time on seasonal or trend time series.
In the future, we plan to study how to determine the best values of k and T automatically for forecasting with k-nearest-neighbor search or with a search for the neighbors within a threshold T.
REFERENCES
[1] N. Beckmann, H.-P. Kriegel, R. Schneider and B. Seeger, "The R*-tree: An efficient and robust access method for points and rectangles", Proc. of 1990 ACM SIGMOD Conf., Atlantic City, NJ, May 1990, pp. 322-331.
[2] S. D. Balkin and J. K. Ord, "Automatic neural network modeling for univariate time series", International Journal of Forecasting, vol. 16, 2000, pp. 509-515.
[3] C. Chatfield, Time-series forecasting. New York, NY, Chapman and Hall, Inc., 2000.
[4] E. Cadenas and W. Rivera, "Short term wind speed forecasting in La Venta, Oaxaca, Mexico, using artificial neural networks", Renewable Energy, vol. 34, no. 1, 2009, pp. 274-278.
[5] F. M. Alvarez, A. Troncoso, J. C. Riquelme and J. S. A. Ruiz, "Energy Time Series Forecasting Based on Pattern Sequence Similarity", IEEE Trans. on Knowledge and Data Engineering, vol. 23, no. 8, Aug. 2011, pp. 1230-1243.
[6] A. Guttman, "R-trees: a Dynamic Index Structure for Spatial Searching", Proc. of the ACM SIGMOD Int. Conf. on Management of Data, June 18-21, 1984, pp. 47-57.
[7] S. Gelper, R. Fried, and C. Croux, "Robust forecasting with exponential and Holt-Winters smoothing", Journal of Forecasting, vol. 29, 2010, pp. 285-300.
[8] M. Ghiassi, H. Saidane, and D. K. Zimbra, "A dynamic artificial neural network model for forecasting series events", International Journal of Forecasting, vol. 21, 2005, pp. 341-362.
[9] S. Heravi, D. R. Osborn and C. R. Birchenhall, "Linear versus neural network forecasting for European industrial production series", International Journal of Forecasting, vol. 20, 2004, pp. 435-446.
[10] Z. Huang and M. L. Shyu, "k-NN Based LS-SVM Framework for Long-Term Time Series Prediction," in The 11th IEEE International Conference on Information Reuse and Integration (IRI 2010), Tuscany Suites & Casino, Las Vegas, Nevada, USA, 2010, pp. 69-74.
[11] Z. Huang and M.-L. Shyu, "Long-Term Time Series Prediction using k-NN Based LS-SVM Framework with Multi-Value Integration," in Recent Trends in Information Reuse and Integration, K. K. a. M. T. Tansel Ozyer, Ed. Springer Vienna, 2012, ch. 9, pp. 191-209.
[12] Y. Jiang, C. Li, J. Han, "Stock temporal prediction based on time series motifs," in Proc. of 8th Int. Conf. on Machine Learning and Cybernetics, 2009.
[13] I.-B. Kang, "Multi-period forecasting using different models for different horizons: An application to U.S. economic time series data", International Journal of Forecasting, vol. 19, 2003, pp. 387-400.
[14] J. H. Kim, "Forecasting autoregressive time series with bias-corrected parameter estimators", International Journal of Forecasting, vol. 19, 2003, pp. 493-502.
[15] K. J. Kim, "Financial time series forecasting using support vector machines", Neurocomputing, vol. 55, 2003, pp. 307-319.
[16] Q. Li, I. F. V. Lopez, and B. Moon, "Skyline Index for Time Series Data", IEEE Trans. on Knowledge and Data Engineering, vol. 16, no. 6, 2004.
[17] N. T. Son, D. T. Anh, "Time Series Similarity Search based on Middle Points and Clipping", Proceedings of the 3rd Conference on Data Mining and Optimization (DMO 2011), Putrajaya, Malaysia, June 28-29, 2011, pp. 13-19.
[18] R. Nayak and te Braak, "Temporal Pattern Matching for the Prediction of Stock Prices", in (Ong, K.-L. and Li, W. and Gao, J., Eds.) Proceedings 2nd International Workshop on Integrating Artificial Intelligence and Data Mining (AIDM 2007), pp. 99-107.
[19] Y. Radhika and M. Shashi, "Atmospheric Temperature Prediction using Support Vector Machines," International Journal of Computer Theory and Engineering, vol. 1, no. 1, 2009, pp. 55-58.
[20] A. Troncoso, J. M. Riquelme, J. C. Riquelme, A. Gomez, and J. L. Martinez, "Time series prediction: Application to the short term electric energy demand", LNAI 3040, Springer, 2004, pp. 577-586.
[21] G. Tkacz, "Neural network forecasting of Canadian GDP growth", International Journal of Forecasting, vol. 17, 2001, pp. 57-69.
[22] G. P. Zhang, M. Qi, "Neural Network Forecasting for Seasonal and Trend Time Series", European Journal of Operational Research, vol. 160, 2005, pp. 501-514.