rewrite of the smooth continuum algorithm for speed and stability #186
Conversation
Replace the interpolation spline fitting with a smoothed version that need not go through every point, and weight the importance of the points according to the local variability of the residual after removing the main fitted continuum model. The goal is to permit robust fitting while eliminating the need for very expensive median filtering afterwards; the new version is about 20x faster.
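To make the weighting idea above concrete, here is a minimal sketch (not the PR's actual code; the function name, window size, and exact weighting form are assumptions) of how one might down-weight pixels in regions where the residual is locally noisy:

```python
import numpy as np

def local_variability_weights(resid, window=75):
    """Spline weights inversely proportional to the local scatter of the
    residual, computed in a sliding window of `window` pixels (illustrative)."""
    n = len(resid)
    half = window // 2
    weights = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        sigma = np.std(resid[lo:hi])          # local variability of the residual
        weights[i] = 1.0 / sigma if sigma > 0.0 else 0.0
    return weights
```

These weights can then be handed to a smoothing spline (see the sketch at the end of this page), so that pixels in quiet regions of the residual pull the fit more strongly than pixels in noisy regions.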
…into smooth_cont
* tag a couple of np.arange() calls as producing floats, since they are immediately used for FP arithmetic
* potentially quadratic string concatenation
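The two housekeeping items above correspond to generic Python/NumPy patterns; a minimal illustration (not taken from the actual diff):

```python
import numpy as np

# Tag the arange as floating point up front, since the result feeds FP arithmetic:
x = np.arange(100, dtype=np.float64)   # rather than np.arange(100), which yields integers

# Avoid potentially quadratic string concatenation inside a loop:
pieces = [f"row {i}" for i in range(1000)]
report = "\n".join(pieces)             # rather than report += f"row {i}\n" in the loop
```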
For completeness, here are two other diagnostic plots related to the smooth-continuum modeling (for the example object shown at the top of this PR).

The first figure shows the initial line-fitting, which is done in order to get a good estimate of the line-widths and, in particular, to determine whether broad Balmer lines are present. The line-widths, in turn, determine how aggressively to mask the data before continuum-fitting:

The second figure shows the line-masking, which includes an effort to determine the signal-to-noise ratio of each line (poorly detected lines are not masked because they won't impact the continuum-fitting):
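As a rough sketch of that masking logic (the thresholds, window size, and signal-to-noise estimator below are illustrative assumptions, not the PR's implementation):

```python
import numpy as np

C_KMS = 2.99792458e5  # speed of light [km/s]

def build_linemask(wave, flux, ivar, line_waves, line_sigmas_kms,
                   snr_min=3.0, nsigma=4.0):
    """Mask pixels within +/- nsigma line-widths of each line center, but only
    for lines detected above snr_min; poorly detected lines are left unmasked
    so they do not needlessly remove continuum pixels."""
    mask = np.zeros(len(wave), dtype=bool)
    for linewave, sigma_kms in zip(line_waves, line_sigmas_kms):
        dwave = nsigma * sigma_kms / C_KMS * linewave
        inline = np.abs(wave - linewave) < dwave
        if not np.any(inline):
            continue
        # crude line S/N: summed flux over the quadrature-summed noise
        var = np.zeros(inline.sum())
        good = ivar[inline] > 0.0
        var[good] = 1.0 / ivar[inline][good]
        noise = np.sqrt(np.sum(var))
        snr = np.sum(flux[inline]) / noise if noise > 0.0 else 0.0
        if snr > snr_min:
            mask[inline] = True
    return mask
```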
In the example shown above, the data reduction is good; however, there are cases like the ones shown in desihub/desispec#2193 where there are large inter-camera breaks. For example, fitting all three cameras simultaneously for the following object leads to unsatisfactory results:
Instead, with the updates to the code, each camera can be fitted independently, which yields nice results. (This all supposes, of course, that the redshift is correct, which in this case it is not; nevertheless, I think this is the behavior we want.) And here's the final result / fit:
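Structurally, the per-camera fitting amounts to something like the following (the function and argument names are hypothetical; only the shape of the loop is the point):

```python
def fit_continuum_per_camera(cameras, fit_one_camera):
    """`cameras` maps a camera name ('b', 'r', 'z') to its (wave, flux, ivar)
    arrays; `fit_one_camera` is any single-spectrum continuum fitter. Fitting
    each arm separately keeps a large inter-camera break in one arm from
    biasing the continuum in the others."""
    return {camera: fit_one_camera(wave, flux, ivar)
            for camera, (wave, flux, ivar) in cameras.items()}
```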
…nally relative to the data, not the continuum model
I've also taken the opportunity in this PR to update how we estimate the "continuum fluxes", namely the fluxes at 1450, 1500, 1700, 2800, 3000, and 5100 Angstrom (rest) and at 1215.67, 3728.48, 4862.71, 5008.24, and 6564.6 Angstrom (i.e., the continuum level under Lyman-alpha, [OII], H-beta, [OIII] 5007, and H-alpha).

Here's an example where the algorithm works quite nicely:

And here's an example where Lyman-alpha and [OII] aren't perfect because of the age of the stellar population; however, the error is <10% and probably adequate for this PR:
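For reference, a simple way to measure these continuum levels is to take a robust average of the (emission-line-free) continuum model in a narrow window around each fiducial wavelength; the window width and the use of a median below are assumptions for illustration, not necessarily what the PR does:

```python
import numpy as np

# fiducial rest wavelengths [Angstrom] quoted above
CONT_WAVES = [1450.0, 1500.0, 1700.0, 2800.0, 3000.0, 5100.0,
              1215.67, 3728.48, 4862.71, 5008.24, 6564.6]

def continuum_fluxes(rest_wave, continuum_model, halfwidth=50.0):
    """Median of the continuum model within +/- halfwidth Angstrom of each
    fiducial wavelength; NaN if the window falls outside the spectrum."""
    fluxes = {}
    for w0 in CONT_WAVES:
        inwin = np.abs(rest_wave - w0) < halfwidth
        fluxes[w0] = np.median(continuum_model[inwin]) if np.any(inwin) else np.nan
    return fluxes
```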
Partly motivated by #171, this PR contains a rewrite of the smooth continuum algorithm. Instead of relying on a median-smoothing of the residual spectrum in sliding bins, we now use scipy.interpolate.UnivariateSpline, which has the advantage of using the inverse variance as a statistical weight.

Here's one example that was previously quite problematic: main-dark-9196-39633010890377403.

New algorithm:
Old algorithm:
Note especially how the wings of the broad H-alpha line are compromised in the current algorithm compared with the new one.
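For readers who want the gist of the new algorithm in code, here is a minimal sketch (the function name and arguments are illustrative, not the actual fastspecfit API): fit an inverse-variance-weighted cubic smoothing spline to the residual spectrum and evaluate it on the full wavelength grid.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

def smooth_continuum(wave, flux, ivar, continuum_model, s=None):
    """Fit a cubic smoothing spline to the residual (data minus continuum
    model). Using w = sqrt(ivar) makes the spline's smoothing condition,
    sum((w * (resid - spline))**2) <= s, the usual ivar-weighted chi^2, so the
    spline need not pass through every point and noisy pixels carry less weight."""
    resid = flux - continuum_model
    good = ivar > 0.0                      # drop masked / zero-weight pixels
    spl = UnivariateSpline(wave[good], resid[good],
                           w=np.sqrt(ivar[good]), k=3, s=s)
    return spl(wave)
```

The smoothing parameter s sets how stiff the spline is (scipy defaults to the number of data points when s is None), roughly playing the role that the sliding median-filter window played in the old approach.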