Chapter II: Data-driven Spectroscopy of Cool Stars at High Spectral Resolution 16
2.3 The Cannon
Preparing HIRES spectra for The Cannon
To prepare the spectral library for The Cannon, we must ensure that the spectra satisfy four conditions: they must share a common wavelength grid, be shifted onto the rest wavelength frame, share a common line-spread function, and be continuum-normalized via a method independent of S/N (Ness et al., 2015). The first two conditions are already satisfied for the library spectra, and we can assume that they effectively share a line-spread function, though there may be negligible variation due to variable seeing conditions. To carry out normalization, we applied error-weighted, broad Gaussian smoothing with
$$\bar{f}(\lambda_0) = \frac{\sum_n \left( f_n \sigma_n^{-2} w_n(\lambda_0) \right)}{\sum_n \left( \sigma_n^{-2} w_n(\lambda_0) \right)}, \tag{2.1}$$
where $f_n$ is the flux at pixel $n$ of the wavelength range, $\sigma_n$ is the uncertainty at pixel $n$, and the weight $w_n(\lambda_0)$ is drawn from a Gaussian:
$$w_n(\lambda_0) = e^{-\frac{(\lambda_0 - \lambda_n)^2}{L^2}}, \tag{2.2}$$
where $L$ was chosen to be 3 Å. If larger $L$ values are chosen for HIRES spectra, continuum normalization begins to remove high-resolution features. For reference, Ho et al. (2017) used a width of $L$ = 50 Å to normalize low-resolution Large Sky Area Multi-Object Fibre Spectroscopic Telescope (LAMOST) spectra ($R \approx 1800$).
The Gaussian smoothing procedure is illustrated in Figure 2.2.
[Figure 2.2: two panels showing flux (0.0 to 1.0) versus wavelength (Å) from 5400 to 5600 Å.]
Figure 2.2: HIRES spectrum of a reference sample star (HD100623) before and after normalization. The top panel shows the pre-normalized spectrum overlaid with the Gaussian-smoothed version of itself in red, while the bottom panel shows the normalized spectrum after the Gaussian-smoothed signal was divided out. The displayed wavelength region ($\lambda$ = 5400-5600 Å) is a subset of the full wavelength range and was chosen for better visualization of the spectrum and accompanying Gaussian-smoothed curve.
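The smoothing in Equations 2.1 and 2.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used for the library: the function names `gaussian_smoothed_continuum` and `normalize` are hypothetical, and the dense pixel-by-pixel weight matrix is practical only for short wavelength segments (a full HIRES order would need chunking or a truncated kernel).

```python
import numpy as np

def gaussian_smoothed_continuum(wavelength, flux, sigma, L=3.0):
    """Error-weighted Gaussian smoothing (Eqs. 2.1-2.2) evaluated at every pixel."""
    # Rows index the evaluation wavelength lambda_0, columns index the pixels n.
    delta = wavelength[:, None] - wavelength[None, :]
    weights = np.exp(-(delta ** 2) / L ** 2)           # w_n(lambda_0), Eq. 2.2
    inverse_variance = sigma ** -2.0                    # sigma_n^-2
    numerator = (weights * (flux * inverse_variance)[None, :]).sum(axis=1)
    denominator = (weights * inverse_variance[None, :]).sum(axis=1)
    return numerator / denominator                      # f_bar(lambda_0), Eq. 2.1

def normalize(wavelength, flux, sigma, L=3.0):
    """Divide the flux and its uncertainty by the smoothed pseudo-continuum."""
    continuum = gaussian_smoothed_continuum(wavelength, flux, sigma, L)
    return flux / continuum, sigma / continuum
```

With `L=3.0` the kernel follows the width quoted above for HIRES; passing `L=50.0` would correspond to the width quoted for the LAMOST spectra.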
Training Step
We used The Cannon 2, the second implementation of The Cannon, developed by Casey et al. (2016). Hereafter, we will refer to The Cannon 2 simply as The Cannon. This version builds upon the original with additional features, such as regularization, that are designed to aid prediction of a larger label set, including elemental abundances that go beyond bulk metallicity ([Fe/H]).
As outlined in Section 2.1, in the training step, The Cannon uses a set of reference objects with well-determined labels to construct a predictive model of the flux at every pixel of the wavelength range that is a function of the stellar labels. Model construction is based on two assumptions: that continuum-normalized spectra with identical labels look identical at every pixel, and that the flux at every pixel in a spectrum changes continuously as a function of the stellar labels. While The Cannon can be trained on any set of empirical spectra and their labels, the resultant model will only be capable of predicting labels for spectra with properties that are represented in the training set. In other words, The Cannon is not able to accurately extrapolate outside the training set parameter space, so the training set spectra must be representative of the test set spectra in order to predict accurate label values. It is also important to note that the Cannon-predicted labels will only be as accurate as those of the training set.
The flux model $f_{jn}$ for a spectrum $n$ at pixel $j$ can be written as
$$f_{jn} = \boldsymbol{v}(\ell_n) \cdot \boldsymbol{\theta}_j + e_{jn}, \tag{2.3}$$
where $\boldsymbol{\theta}_j$ is the set of spectral model coefficients at each pixel $j$ and $\boldsymbol{v}(\ell_n)$ is a function of the label list $\ell_n$ that is unique for each spectrum $n$. The function $\boldsymbol{v}(\ell_n)$ is referred to as the "vectorizer", which can accommodate functions that are linear in the coefficients $\boldsymbol{\theta}_j$, but not necessarily simple polynomial expansions of the label list $\ell_n$. The noise term is described by $e_{jn}$ and can be taken as sampled from a Gaussian with zero mean and variance $\sigma_{jn}^2 + s_j^2$, where $\sigma_{jn}^2$ is the uncertainty reported on the input HIRES spectra (flux variance) and $s_j^2$ is the intrinsic scatter of the model at each pixel $j$. This intrinsic scatter can be likened to the expected deviation of the model from the spectrum at $j$.
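To make the vectorizer concrete, the sketch below builds one possible $\boldsymbol{v}(\ell_n)$ and evaluates the noise-free part of Equation 2.3. It assumes a quadratic polynomial expansion of the labels (a constant, the labels, and all pairwise products); the text does not state which vectorizer was used, so this choice and the names `quadratic_vectorizer` and `predict_flux` are illustrative only.

```python
import numpy as np
from itertools import combinations_with_replacement

def quadratic_vectorizer(labels):
    """Return v(l) = [1, l_i, ..., l_i * l_k, ...] for one star's label list."""
    labels = np.asarray(labels, dtype=float)
    linear = list(labels)
    # All unique pairwise products, including squares.
    quadratic = [labels[i] * labels[k]
                 for i, k in combinations_with_replacement(range(labels.size), 2)]
    return np.array([1.0] + linear + quadratic)

def predict_flux(theta, labels):
    """Model flux v(l_n) . theta_j at every pixel; theta has shape (n_pixels, n_terms)."""
    return theta @ quadratic_vectorizer(labels)
```

For three labels such as ($T_{\rm eff}$, $\log g$, [Fe/H]) this vectorizer has 1 + 3 + 6 = 10 terms, so each pixel carries ten coefficients in $\boldsymbol{\theta}_j$.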
To determine the optimal model parameters ($\boldsymbol{\theta}_j$, $s_j^2$), we can relate the flux model to a single-pixel log-likelihood function:
$$\ln p(f_{jn} \mid \boldsymbol{\theta}_j, \boldsymbol{v}(\ell_n), s_j^2) = -\frac{[f_{jn} - \boldsymbol{v}(\ell_n) \cdot \boldsymbol{\theta}_j]^2}{\sigma_{jn}^2 + s_j^2} - \ln(\sigma_{jn}^2 + s_j^2) - \Lambda Q(\boldsymbol{\theta}), \tag{2.4}$$
where $\Lambda$ is a regularization parameter and $Q(\boldsymbol{\theta})$ is a regularizing function that encourages the model coefficients $\boldsymbol{\theta}_j$ to take on zero values, resulting in a simpler model that is less prone to overfitting. In the case of L1 regularization implemented within The Cannon, the regularizing function takes the form
$$Q(\boldsymbol{\theta}) = \sum_{j=0}^{J-1} |\theta_j|. \tag{2.5}$$
L1 regularization was chosen because The Cannon is designed for predicting large sets of elemental abundances, and it is reasonable to assume that only one or a few elemental abundances will affect the flux at a single pixel of the wavelength range.
For more details on regularization or the model itself, see Casey et al. (2016).
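In the training step, the log-likelihood is maximized via the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm to derive the best-fit model coefficients $\boldsymbol{\theta}_j$ and intrinsic scatter $s_j^2$:
$$\boldsymbol{\theta}_j, s_j^2 \leftarrow \underset{\boldsymbol{\theta}_j,\, s_j}{\mathrm{argmax}} \left[ \sum_{n=0}^{N-1} \ln p(f_{jn} \mid \boldsymbol{\theta}_j, \boldsymbol{v}(\ell_n), s_j^2) \right]. \tag{2.6}$$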
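Plugging in the explicit form of the log-likelihood (Equation 2.4) leads to
$$\boldsymbol{\theta}_j, s_j^2 \leftarrow \underset{\boldsymbol{\theta}_j,\, s_j}{\mathrm{argmax}} \left[ \sum_{n=0}^{N-1} -\frac{[f_{jn} - \boldsymbol{v}(\ell_n) \cdot \boldsymbol{\theta}_j]^2}{\sigma_{jn}^2 + s_j^2} - \sum_{n=0}^{N-1} \ln(\sigma_{jn}^2 + s_j^2) - \Lambda Q(\boldsymbol{\theta}) \right]. \tag{2.7}$$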
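The per-pixel optimization in Equation 2.7 can be sketched with SciPy's general-purpose BFGS minimizer. This is an illustration under several assumptions and not The Cannon's own code: the names `negative_objective` and `train_pixel` are hypothetical, the L1 penalty is taken over the coefficients of the pixel being fitted, $s_j^2$ is parameterized through its logarithm to keep it positive, gradients are left to SciPy's finite differences, and the optimizer minimizes the negative of the bracketed quantity, which is equivalent to the argmax.

```python
import numpy as np
from scipy.optimize import minimize

def negative_objective(params, design, flux, variance, regularization):
    """Negative of the bracketed quantity in Eq. 2.7 for a single pixel j."""
    theta, log_s2 = params[:-1], params[-1]
    total_variance = variance + np.exp(log_s2)        # sigma_jn^2 + s_j^2
    residual = flux - design @ theta                   # f_jn - v(l_n) . theta_j
    chi2 = np.sum(residual ** 2 / total_variance)
    log_term = np.sum(np.log(total_variance))
    l1_penalty = regularization * np.sum(np.abs(theta))
    return chi2 + log_term + l1_penalty

def train_pixel(design, flux, variance, regularization=0.0):
    """Fit (theta_j, s_j^2) for one pixel from N training spectra.

    design: (N, n_terms) array whose rows are v(l_n); flux, variance: (N,) arrays.
    """
    # Assumed starting point: ordinary least squares and a small scatter.
    theta0, *_ = np.linalg.lstsq(design, flux, rcond=None)
    start = np.append(theta0, np.log(1e-4))
    result = minimize(negative_objective, start,
                      args=(design, flux, variance, regularization),
                      method="BFGS")
    return result.x[:-1], np.exp(result.x[-1])
```

Calling `train_pixel` once per pixel yields the full set of coefficients and scatters. Because the L1 term is not differentiable at zero, a quasi-Newton method with numerical gradients is only approximate when `regularization` is nonzero.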
Test Step
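In the test step, we fix the model parameters ($\boldsymbol{\theta}_j$, $s_j^2$) at every pixel $j$ to the optimized values determined during the training step, and fit for the label list $\ell_n$ of each star $n$ that maximizes the log-likelihood:
$$\ell_n \leftarrow \underset{\ell_n}{\mathrm{argmax}} \left[ \sum_{j=0}^{J-1} -\frac{[f_{jn} - \boldsymbol{v}(\ell_n) \cdot \boldsymbol{\theta}_j]^2}{\sigma_{jn}^2 + s_j^2} \right]. \tag{2.8}$$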
Optimization of the log-likelihood in the test step is carried out via weighted least squares.
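A weighted least-squares fit of the labels, as in Equation 2.8, might look like the sketch below. It assumes a quadratic vectorizer and uses `scipy.optimize.least_squares` as the solver; both are illustrative choices, and the names `vectorizer` and `infer_labels` are hypothetical. Minimizing the sum of squared weighted residuals is equivalent to maximizing the bracketed quantity in Equation 2.8.

```python
import numpy as np
from itertools import combinations_with_replacement
from scipy.optimize import least_squares

def vectorizer(labels):
    """Assumed quadratic vectorizer: constant, labels, and pairwise products."""
    labels = np.asarray(labels, dtype=float)
    cross = [labels[i] * labels[k]
             for i, k in combinations_with_replacement(range(labels.size), 2)]
    return np.concatenate(([1.0], labels, cross))

def infer_labels(flux, variance, theta, s2, initial_labels):
    """Fit the label list of one star with theta_j and s_j^2 held fixed.

    flux, variance, s2: (n_pixels,) arrays; theta: (n_pixels, n_terms) array.
    """
    weights = 1.0 / np.sqrt(variance + s2)             # 1 / sqrt(sigma_jn^2 + s_j^2)

    def weighted_residuals(labels):
        return (flux - theta @ vectorizer(labels)) * weights

    result = least_squares(weighted_residuals, initial_labels)
    return result.x
```

The fit is nonlinear in the labels, so the result can depend on `initial_labels`; starting from the mean of the training-set labels is one reasonable option.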
[Figure 2.3: flux versus wavelength (Å) from 5100 to 5125 Å, with one curve for each $v \sin i$ value of 0, 2, 4, 6, 8, and 10 km s$^{-1}$.]
Figure 2.3: Synthetic spectra generated via SpecMatch-Syn under the same $T_{\rm eff}$, $\log g$, and [Fe/H] conditions ($T_{\rm eff}$ = 4000 K, $\log g$ = 4.5 cm s$^{-2}$, [Fe/H] = 0.2 dex) but with varying amounts of additional broadening. From top to bottom, the spectra have $v \sin i$ = 0–10 km s$^{-1}$ in increments of +2 km s$^{-1}$.