
Hyperparameter Transfer Learning with Adaptive Complexity

Item Type: Conference Paper
Authors: Horvath, Samuel; Klein, Aaron; Richtarik, Peter; Archambeau, Cedric
Eprint version: Publisher's Version/PDF
Publisher: MLResearchPress
Rights: Copyright 2021 by the author(s).
Download date: 2023-12-09 23:12:54
Link to Item: http://hdl.handle.net/10754/667827


Samuel Horváth, Aaron Klein, Peter Richtárik, Cédric Archambeau

(a) ABRAC (b) ABLR (c) ABLR SGD fixed (d) ABLR L-BFGS fixed

Figure 9: Comparison of the surrogate objective along with its basis functions after 5 total evaluations, obtained by 4 different methods. In the first and third panels the target function is shown in blue, the evaluations are the red points (which include the same 5 random points for all methods), and the surrogate predictive mean is shown in orange with two standard deviations in transparent light blue. The second and fourth panels show the basis functions used to fit the surrogate model for the given algorithm, with colour representing relative weight.

A Bayesian Linear Regression: extra material for Section 4.2

For the details omitted in Section 4.2, we continue with the classical process of fitting Bayesian linear regression (BLR) using empirical Bayes, where one fits this probabilistic model by first integrating out the weights $w_{\text{new}}$. This leads to the Gaussian model $\mathcal{N}(\mu(x_{\text{new}}), \sigma^2(x_{\text{new}}))$ for the next input $x_{\text{new}}$ of the new task, whose mean and variance can be computed analytically (Bishop, 2006):

$$\mu(x_{\text{new}}) = \beta_{\text{new}}\, f_{\text{new}}^\top K_{\text{new}}^{-1} (\Phi_{\text{new},b\downarrow})^\top y_{\text{new}}, \qquad \sigma^2(x_{\text{new}}) = f_{\text{new}}^\top K_{\text{new}}^{-1} f_{\text{new}} + \frac{1}{\beta_{\text{new}}},$$

where $f_{\text{new}} = \phi_{z,b\downarrow}(x_{\text{new}})$ and $K_{\text{new}} = \beta_{\text{new}} (\Phi_{\text{new},b\downarrow})^\top \Phi_{\text{new},b\downarrow} + \mathrm{Diag}(\alpha_{\text{new}})$. We use this as the input to the acquisition function, which decides the next point to sample. Before this step, we first need to obtain the parameters. This model defines a Gaussian process, for which we can compute the covariance matrix in closed form.
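As a concrete illustration, a minimal NumPy sketch (not the paper's code) of the predictive equations above; the features $\Phi_{\text{new},b\downarrow}$, targets $y_{\text{new}}$, candidate input features $f_{\text{new}}$, and hyperparameters are randomly generated stand-ins:

```python
import numpy as np

# Hypothetical stand-ins for the quantities in the predictive equations.
rng = np.random.default_rng(0)
N, r = 20, 5
Phi = rng.standard_normal((N, r))     # Phi_new,b-down: basis features of evaluated points
y = rng.standard_normal(N)            # y_new: observed targets
alpha = np.full(r, 2.0)               # alpha_new: weight-prior precisions
beta = 4.0                            # beta_new: observation-noise precision

f = rng.standard_normal(r)            # f_new = phi_{z,b-down}(x_new) for a candidate input
K = beta * Phi.T @ Phi + np.diag(alpha)           # K_new

mu = beta * f @ np.linalg.solve(K, Phi.T @ y)     # predictive mean mu(x_new)
sigma2 = f @ np.linalg.solve(K, f) + 1.0 / beta   # predictive variance sigma^2(x_new)
```

By the push-through (Woodbury-type) identity, these weight-space formulas agree with the function-space Gaussian-process predictive equations built from the covariance matrix derived next.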

$$\begin{aligned}
\mathrm{cov}(y_i, y_j) &= \mathbb{E}\left[\left(\phi_{z,b\downarrow}(x_i)^\top w + \epsilon_i\right)\left(\phi_{z,b\downarrow}(x_j)^\top w + \epsilon_j\right)\right] \\
&= \phi_{z,b\downarrow}(x_i)^\top\, \mathbb{E}\!\left[w w^\top\right] \phi_{z,b\downarrow}(x_j) + \mathbb{E}[\epsilon_i \epsilon_j] \\
&= \phi_{z,b\downarrow}(x_i)^\top \mathrm{Diag}(\alpha)^{-1} \phi_{z,b\downarrow}(x_j) + \mathbb{E}[\epsilon_i \epsilon_j],
\end{aligned}$$

thus the covariance matrix has the following form: $\Sigma_{\text{new}} = \Phi_{\text{new},b\downarrow}\, \mathrm{Diag}(\alpha_{\text{new}})^{-1} (\Phi_{\text{new},b\downarrow})^\top + \beta_{\text{new}}^{-1} I_{N_{\text{new}}}$. We obtain $\{\alpha_{\text{new},i}\}_{i=1}^r$ and $\beta_{\text{new}}$ by minimizing the negative log-likelihood of our model, which has, up to constant factors, the following form:
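The matrix form of $\Sigma_{\text{new}}$ can be sanity-checked against the elementwise covariance formula; a small sketch with hypothetical sizes and values:

```python
import numpy as np

# Hypothetical feature matrix and hyperparameters, purely for illustration.
rng = np.random.default_rng(1)
N, r = 8, 3
Phi = rng.standard_normal((N, r))    # rows are phi_{z,b-down}(x_i)
alpha = np.array([1.0, 2.0, 0.5])    # weight-prior precisions
beta = 3.0                            # noise precision
D_inv = np.diag(1.0 / alpha)

# Matrix form: Sigma = Phi Diag(alpha)^{-1} Phi^T + beta^{-1} I
Sigma = Phi @ D_inv @ Phi.T + np.eye(N) / beta

# Elementwise form: cov(y_i, y_j) = phi(x_i)^T Diag(alpha)^{-1} phi(x_j) + E[eps_i eps_j],
# where E[eps_i eps_j] = delta_ij / beta for independent Gaussian noise.
C = np.array([[Phi[i] @ D_inv @ Phi[j] + (1.0 / beta if i == j else 0.0)
               for j in range(N)] for i in range(N)])
```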

$$\min_{\{\alpha_{\text{new},i}\}_{i=1}^r,\, \beta_{\text{new}}} \; \frac{1}{2}\log\left|\Sigma_{\text{new}}\right| + \frac{1}{2}\, y_{\text{new}}^\top \Sigma_{\text{new}}^{-1} y_{\text{new}},$$

for which we use the L-BFGS algorithm, as this method is parameter-free and does not require tuning a step size, a desired property since this procedure is run in every step of BO. In order to speed up the optimization and obtain linear scaling in the number of evaluations, one needs to include extra modifications, starting with the Cholesky decomposition of the matrix $(\Phi_{\text{new},b\downarrow})^\top \Phi_{\text{new},b\downarrow} + \frac{1}{\beta_{\text{new}}}\mathrm{Diag}(\alpha_{\text{new}})$ as $L_{\text{new}} L_{\text{new}}^\top$. This decomposition exists since $(\Phi_{\text{new},b\downarrow})^\top \Phi_{\text{new},b\downarrow} + \frac{1}{\beta_{\text{new}}}\mathrm{Diag}(\alpha_{\text{new}})$ is always positive definite, being the sum of the positive semi-definite matrix $(\Phi_{\text{new},b\downarrow})^\top \Phi_{\text{new},b\downarrow}$ and the positive diagonal matrix $\frac{1}{\beta_{\text{new}}}\mathrm{Diag}(\alpha_{\text{new}})$. The final step is to use the Weinstein–Aronszajn identity and the Woodbury matrix inversion identity, which imply that our objective can be rewritten in the equivalent form

$$\min_{\{\alpha_{\text{new},i}\}_{i=1}^r,\, \beta_{\text{new}}} \; -\frac{N_{\text{new}} - r}{2}\log(\beta_{\text{new}}) - \frac{1}{2}\sum_{i=1}^r \log(\alpha_{\text{new},i}) + \sum_{i=1}^r \log\left([L_{\text{new}}]_{ii}\right) + \frac{\beta_{\text{new}}}{2}\left(\|y_{\text{new}}\|^2 - \left\|L_{\text{new}}^{-1}(\Phi_{\text{new},b\downarrow})^\top y_{\text{new}}\right\|^2\right),$$

which leads to an overall complexity of $O(d^2 \max\{N_{\text{new}}, d\})$, linear in the number of evaluations of the function $f_{\text{new}}$.
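The equivalence of the two objectives can be illustrated in NumPy; the following sketch (not the paper's code, with hypothetical data) evaluates the naive negative log-likelihood, which requires an $N_{\text{new}} \times N_{\text{new}}$ determinant and solve, and the rewritten Cholesky-based form, which only factorizes an $r \times r$ matrix:

```python
import numpy as np

# Hypothetical problem data, purely for illustration.
rng = np.random.default_rng(2)
N, r = 50, 4
Phi = rng.standard_normal((N, r))       # Phi_new,b-down
y = rng.standard_normal(N)              # y_new
alpha = rng.uniform(0.5, 2.0, size=r)   # {alpha_new,i}
beta = 3.0                              # beta_new

def nll_naive(alpha, beta):
    # (1/2) log|Sigma| + (1/2) y^T Sigma^{-1} y,
    # with Sigma = Phi Diag(alpha)^{-1} Phi^T + beta^{-1} I  -- O(N^3)
    Sigma = Phi @ np.diag(1.0 / alpha) @ Phi.T + np.eye(N) / beta
    _, logdet = np.linalg.slogdet(Sigma)
    return 0.5 * logdet + 0.5 * y @ np.linalg.solve(Sigma, y)

def nll_fast(alpha, beta):
    # Rewritten form via L L^T = Phi^T Phi + Diag(alpha)/beta  -- O(N r^2)
    L = np.linalg.cholesky(Phi.T @ Phi + np.diag(alpha) / beta)
    v = np.linalg.solve(L, Phi.T @ y)   # L^{-1} Phi^T y
    return (-(N - r) / 2 * np.log(beta)
            - 0.5 * np.sum(np.log(alpha))
            + np.sum(np.log(np.diag(L)))
            + beta / 2 * (y @ y - v @ v))
```

In practice one would minimize `nll_fast` over, e.g., $\log \alpha_{\text{new}}$ and $\log \beta_{\text{new}}$ with an off-the-shelf L-BFGS implementation such as `scipy.optimize.minimize(..., method="L-BFGS-B")`.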
