
Fix PosteriorMeanModel and add qPosteriorStandardDeviation acquisition function #2634

Open

wants to merge 10 commits into main
Conversation

slishak-PX
Contributor

@slishak-PX slishak-PX commented Nov 20, 2024

Motivation

This is a small collection of changes to improve support for optimisation with deterministic (posterior mean) and pure-exploration (posterior standard deviation) acquisition functions:

  1. Using PosteriorMeanModel with optimize_acqf is currently not supported as PosteriorMeanModel does not implement num_outputs or batch_shape.
  2. The PosteriorStandardDeviation acquisition function has no MC equivalent.

This PR addresses the points above, and consequently adds support for the constrained PSTD acquisition function.

Have you read the Contributing Guidelines on pull requests?

Yes

Test Plan

TODO: just submitting a draft for now, for discussion.

Related PRs

(If this PR adds or changes functionality, please take some time to update the docs at https://github.com/pytorch/botorch, and link to your PR here.)

@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Nov 20, 2024
@Balandat
Contributor

The PosteriorMean and PosteriorStandardDeviation acquisition functions have no MC equivalent.

So qSimpleRegret for q=1 is the MC-equivalent for PosteriorMean - can you just use that instead?

Using PosteriorMeanModel with optimize_acqf is currently not supported as PosteriorMeanModel does not implement num_outputs or batch_shape.

Adding these properties to DeterministicModel makes sense to me

@slishak-PX
Contributor Author

slishak-PX commented Nov 20, 2024

So qSimpleRegret for q=1 is the MC-equivalent for PosteriorMean - can you just use that instead?

Sorry, I missed that; you're right. I will remove qPosteriorMean, but qPosteriorStandardDeviation could still be helpful (e.g. for constrained active learning).

Adding these properties to DeterministicModel makes sense to me

DeterministicModel is still an abstract class, so I don't know how we'd set num_outputs there. GenericDeterministicModel handles it by requiring it to be passed to __init__, and AffineDeterministicModel computes it from the weights, but PosteriorMeanModel (and FixedSingleSampleModel) can simply take it from the base model.
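To illustrate the delegation described above, a minimal sketch might look like the following (the class names here are stand-ins for illustration, not the actual BoTorch implementation):

```python
import torch


class StubBaseModel:
    """Stand-in for a fitted BoTorch model with two outputs (illustrative only)."""

    num_outputs = 2
    batch_shape = torch.Size([])


class PosteriorMeanWrapperSketch:
    """Sketch of a deterministic wrapper that forwards shape metadata
    to the wrapped model instead of requiring it at construction time."""

    def __init__(self, model):
        self.model = model

    @property
    def num_outputs(self):
        # Delegate to the base model, as PosteriorMeanModel can
        return self.model.num_outputs

    @property
    def batch_shape(self):
        return self.model.batch_shape


wrapper = PosteriorMeanWrapperSketch(StubBaseModel())
```

With these properties in place, helpers such as optimize_acqf that query num_outputs and batch_shape can treat the wrapper like any other model.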

@slishak-PX slishak-PX changed the title Fix PosteriorMeanModel and add qPosteriorMean/qPosteriorStandardDeviation acquisition functions Fix PosteriorMeanModel and add qPosteriorStandardDeviation acquisition function Nov 20, 2024
@slishak-PX
Contributor Author

At this point the PR can support constrained active learning (I want to minimise uncertainty in the feasible region).

For a simple example, see the figure below: the task is to minimise uncertainty in output 1, but only where output 2 is negative. The blue dashed line (PSTD) is analytical, the orange solid line is the MC equivalent (no Bessel correction yet; will add this, thanks @Balandat), and the green solid line is the constrained qPSTD.

[Figure: posterior mean and ±1 std for outputs 1 and 2 with training points, plus the PSTD, qPSTD, and constrained qPSTD acquisition values]

I haven't closed the loop and run an actual optimisation yet; interested to see how it will perform.

Code used for testing is below (sorry the plotting code is messy!):

import torch
from botorch import acquisition
from botorch.acquisition.objective import LinearMCObjective, ScalarizedPosteriorTransform
from botorch.fit import fit_gpytorch_mll
from botorch.models import SingleTaskGP
from botorch.sampling import SobolQMCNormalSampler
from gpytorch.mlls import ExactMarginalLogLikelihood
from plotly.subplots import make_subplots

n_train = 10
device = torch.device("cuda:1")

# Random 1D training data with two outputs
torch.manual_seed(3)
train_x = torch.rand(n_train, 1, dtype=torch.float64, device=device)
train_y = torch.randn(n_train, 2, dtype=torch.float64, device=device)

model = SingleTaskGP(train_x, train_y)
mll = ExactMarginalLogLikelihood(model.likelihood, model)
_ = fit_gpytorch_mll(mll)

sampler = SobolQMCNormalSampler(torch.Size([64]))

# Objective weights: only output 1 contributes to the objective
w = torch.tensor([1, 0], dtype=torch.float64, device=device)

# Analytical posterior standard deviation
pstd = acquisition.PosteriorStandardDeviation(
    model,
    posterior_transform=ScalarizedPosteriorTransform(w),
)
# MC equivalent
qpstd = acquisition.qPosteriorStandardDeviation(
    model,
    sampler=sampler,
    objective=LinearMCObjective(w),
)
# MC equivalent, constrained to the region where output 2 is negative
qpstd_constr = acquisition.qPosteriorStandardDeviation(
    model,
    sampler=sampler,
    objective=LinearMCObjective(w),
    constraints=[lambda samples: samples[..., 1]],
)

# Evaluate on a dense grid of q=1 candidates
x = torch.linspace(0, 1, 100, device=device)
std = pstd(x[:, None, None])
qstd = qpstd(x[:, None, None])
qstd_c = qpstd_constr(x[:, None, None])

post = model.posterior(x[:, None, None])


def to_np(t):
    """Move a tensor to the CPU and convert to a NumPy array for plotting."""
    return t.detach().cpu().numpy()


fig = make_subplots(rows=3, cols=1, shared_xaxes=True)

# Posterior mean +/- 1 std and training data for each output
for i in [0, 1]:
    mean = post.mean[:, 0, i]
    sd = post.variance[:, 0, i] ** 0.5
    fig.add_scatter(x=to_np(x), y=to_np(mean), mode="lines", line_color="black", row=i + 1, col=1, showlegend=False)
    fig.add_scatter(x=to_np(x), y=to_np(mean + sd), line_color="grey", mode="lines", showlegend=False, row=i + 1, col=1)
    fig.add_scatter(x=to_np(x), y=to_np(mean - sd), line_color="grey", mode="lines", showlegend=False, fill="tonexty", row=i + 1, col=1)
    fig.add_scatter(x=to_np(train_x[:, 0]), y=to_np(train_y[:, i]), mode="markers", marker_color="red", row=i + 1, col=1, showlegend=False)

# Constraint boundary for output 2
fig.add_hline(y=0, row=2, line_dash="dash")

# Acquisition values
fig.add_scatter(x=to_np(x), y=to_np(qstd), row=3, col=1, name="qPSTD", line_color="orange")
fig.add_scatter(x=to_np(x), y=to_np(qstd_c), row=3, col=1, name="qPSTD (constrained y2<0)", line_color="green")
fig.add_scatter(x=to_np(x), y=to_np(std), row=3, col=1, name="PSTD", line_dash="dash", line_color="blue")

fig.update_yaxes(row=1, title_text="Output 1")
fig.update_yaxes(row=2, title_text="Output 2")
fig.update_yaxes(row=3, title_text="Acquisition value")
fig.update_xaxes(row=3, title_text="x")
fig.show()

@slishak-PX slishak-PX marked this pull request as ready for review December 3, 2024 17:09
@slishak-PX
Contributor Author

Added unit tests, copied from the rest of the MC acquisition function tests.

This one seems incomplete, as it relies on the posterior samples having some variance in order to return nonzero acquisition values, but MockPosterior seems to be set up only to repeatedly return the same sample.

@Balandat
Contributor

Balandat commented Dec 8, 2024

This one seems incomplete as it relies on the posterior samples having some variance in order to return nonzero acquisition values, but MockPosterior seems like it's only set up to repeatedly return the same sample.

Yeah, it's quite possible that MockPosterior does not support everything we'd like it to at this point; I don't think we've touched that in a while. I think it should be pretty straightforward to modify this line here to only apply the expansion if self._samples is not already of the appropriate size; that would allow you to return different sample values.
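The suggested change can be sketched with a toy stand-in (class and attribute names are illustrative; this is not the actual MockPosterior code): expand the stored samples only when they do not already carry the requested sample shape.

```python
import torch


class ToyMockPosterior:
    """Toy posterior that returns stored samples, expanding them only
    when they lack the requested sample dimension (illustrative sketch)."""

    def __init__(self, samples):
        # Either batch x q x m, or sample_shape x batch x q x m
        self._samples = samples

    def rsample(self, sample_shape):
        sample_shape = torch.Size(sample_shape)
        if self._samples.shape[: len(sample_shape)] == sample_shape:
            # Samples already include the MC dimension: return distinct draws
            return self._samples
        # Old behaviour: repeat the same sample along the MC dimension
        return self._samples.expand(sample_shape + self._samples.shape)


# Repeated sample: zero variance across the MC dimension
const = ToyMockPosterior(torch.ones(5, 1, 1)).rsample((64,))
# Pre-sized samples: variance across the MC dimension survives
varied = ToyMockPosterior(torch.randn(64, 5, 1, 1)).rsample((64,))
```

With pre-sized samples, an acquisition function that measures spread across the MC dimension can return nonzero values in a test.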


codecov bot commented Dec 8, 2024

Codecov Report

Attention: Patch coverage is 84.61538% with 2 lines in your changes missing coverage. Please review.

Project coverage is 99.97%. Comparing base (4190f74) to head (a690236).
Report is 4 commits behind head on main.

Files with missing lines Patch % Lines
botorch/models/deterministic.py 66.66% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2634      +/-   ##
==========================================
- Coverage   99.98%   99.97%   -0.02%     
==========================================
  Files         196      196              
  Lines       17410    17423      +13     
==========================================
+ Hits        17408    17419      +11     
- Misses          2        4       +2     


@slishak-PX
Contributor Author

slishak-PX commented Jan 7, 2025

Thanks for the pointer! Unfortunately it wasn't quite as simple as modifying that one line: the base_shape also needs to be provided so that MockPosterior.base_sample_shape returns the correct shape.

The other issue is that the reparameterisation trick for qUCB (which I'm using for qPSTD) only really makes sense for Gaussian posteriors; if a set of samples is manually constructed, then the standard deviation is inaccurate. This might merit further thought, because although the UCB family of acquisition functions is most naturally suited to Gaussian posteriors (to my knowledge), it might be valuable for PSTD to also work for non-Gaussian posteriors.

I've sampled from torch.randn in test_q_pstd, and am using self.assertAllClose to check that the estimated std is close to (within 2% of) the empirical uncorrected value from torch.std.
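On the Bessel-correction point: the uncorrected (population) standard deviation used as the reference above differs from the corrected one by a deterministic factor, which is easy to sanity-check (a standalone sketch, not the PR's actual test code):

```python
import torch

torch.manual_seed(0)
n = 64  # number of MC samples, matching the sampler size above
samples = torch.randn(n, 1000)

std_corrected = samples.std(dim=0)                  # Bessel-corrected (torch default)
std_uncorrected = samples.std(dim=0, correction=0)  # uncorrected population estimate

# Corrected variance divides by n - 1, uncorrected by n, so the
# two std estimates differ by exactly sqrt((n - 1) / n)
ratio = ((n - 1) / n) ** 0.5
```

For n = 64 this ratio is about 0.992, i.e. under a 1% difference, comfortably inside the 2% tolerance used in the test.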

I haven't edited test_q_pstd_batch yet because for $q>1$ it's not straightforward to give a reference value for qPSTD, so I've left it with the repeated samples (and zero acquisition value).
