
Opacus with torch_geometric.nn and GCNs #588

Open
sagerkudrick opened this issue May 8, 2023 · 5 comments

sagerkudrick commented May 8, 2023

Does Opacus work with GCNConv?

I'm attempting to use Opacus with a GCN, with the model defined as follows:

import torch
import torch.nn.functional as F
from torch.nn import Linear
from torch_geometric.nn import GCNConv, global_mean_pool


class GCN(torch.nn.Module):
    def __init__(self, hidden_channels):
        super(GCN, self).__init__()
        torch.manual_seed(12345)
        self.conv1 = GCNConv(dataset.num_node_features, hidden_channels)
        self.conv2 = GCNConv(hidden_channels, hidden_channels)
        self.conv3 = GCNConv(hidden_channels, hidden_channels)
        self.lin = Linear(hidden_channels, dataset.num_classes)

    def forward(self, x, edge_index, batch):
        x = self.conv1(x, edge_index)
        x = x.relu()
        x = self.conv2(x, edge_index)
        x = x.relu()
        x = self.conv3(x, edge_index)

        x = global_mean_pool(x, batch)

        x = F.dropout(x, p=0.5, training=self.training)
        x = self.lin(x)

        return x

When training, however, I'm running into

Traceback (most recent call last):
  File "c:\Users\me\Desktop\github\opacus_graph\prt_3.py", line 121, in <module>
    train()
  File "c:\Users\me\Desktop\github\opacus_graph\prt_3.py", line 89, in train
    loss.backward()  # Derive gradients.
  File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_tensor.py", line 487, in backward
    torch.autograd.backward(
  File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\autograd\__init__.py", line 200, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 69, in __call__
    return self.hook(module, *args, **kwargs)
  File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\opacus\grad_sample\grad_sample_module.py", line 337, in capture_backprops_hook
    grad_samples = grad_sampler_fn(module, activations, backprops)
  File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\opacus\grad_sample\functorch.py", line 58, in ft_compute_per_sample_gradient
    per_sample_grads = layer.ft_compute_sample_grad(
  File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_functorch\vmap.py", line 434, in wrapped
    return _flat_vmap(
  File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_functorch\vmap.py", line 39, in fn
    return f(*args, **kwargs)
  File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_functorch\vmap.py", line 619, in _flat_vmap
    batched_outputs = func(*batched_inputs, **kwargs)
  File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_functorch\eager_transforms.py", line 1380, in wrapper
    results = grad_and_value(func, argnums, has_aux=has_aux)(*args, **kwargs)
  File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_functorch\vmap.py", line 39, in fn
    return f(*args, **kwargs)
  File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_functorch\eager_transforms.py", line 1245, in wrapper
    output = func(*args, **kwargs)
  File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\opacus\grad_sample\functorch.py", line 34, in compute_loss_stateless_model
    output = flayer(params, batched_activations)
  File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_functorch\make_functional.py", line 342, in forward
    return self.stateless_model(*args, **kwargs)
  File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
TypeError: GCNConv.forward() missing 1 required positional argument: 'edge_index'

Occurring within

def train():
    model.train().to(device)

    for data in train_loader:  # Iterate in batches over the training dataset.
        data = data.to(device)
        out = model(data.x, data.edge_index, data.batch)  # Perform a single forward pass.
        loss = criterion(out, data.y)  # Compute the loss.
        loss = loss.to(device)
        loss.backward()  # Derive gradients.
        optimizer.step()  # Update parameters based on gradients.
        optimizer.zero_grad()  # Clear gradients.

The error is raised on loss.backward(). It's worth noting that training and evaluation work normally, but after calling

model, optimizer, train_loader = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    epochs=401,
    target_epsilon=5,
    target_delta=0.001,
    max_grad_norm=1,
)

and training again, the wrapped model starts throwing the error above.
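
For completeness, a minimal sketch of pre-checking the model with Opacus's ModuleValidator (illustrative only, reusing the model object from the snippet above; an empty error list just means no known-incompatible layers were found, not that GCNConv's multi-argument forward is supported):

from opacus.validators import ModuleValidator

# Returns a list of incompatibility errors; an empty list means Opacus did not
# find any modules it knows it cannot compute per-sample gradients for.
errors = ModuleValidator.validate(model, strict=False)
print(errors)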

Thank you!

@marlowe518

Same issue here. Did you address this error? : ) @sagerkudrick

@sagerkudrick

Same issue here. Did you address this error? : ) @sagerkudrick

Hey @marlowe518, I did. The problem was with this:

model, optimizer, data_loader = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=data_loader,
    epochs=10,
    target_epsilon=5,
    target_delta=0.0001,
    max_grad_norm=255,
    batch_first=True,
)

With batch_first=True, Opacus expects the module's input tensors to be shaped [batch_size, ...]; with batch_first=False it expects [K, batch_size, ...]. This flag changes how the per-sample gradient machinery slices the input tensors to the model, which is what threw off the positional argument. I was able to solve this by setting batch_first=False.
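
To make the shape convention concrete, here is a small illustrative sketch (the sizes are arbitrary and not taken from the snippets above):

import torch

batch_size, num_features, K = 32, 16, 4

# batch_first=True: Opacus treats dim 0 as the batch dimension,
# so module inputs are expected to be [batch_size, ...].
x_batch_first = torch.randn(batch_size, num_features)

# batch_first=False: dim 1 is the batch dimension,
# so module inputs are expected to be [K, batch_size, ...]
# (e.g. sequence-first layouts).
x_not_batch_first = torch.randn(K, batch_size, num_features)

In a PyG mini-batch the first dimension of x is nodes rather than graphs, which is presumably why the batch_first=True convention interacts badly with GCNConv's (x, edge_index) inputs.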


sagerkudrick commented May 11, 2023

I'm not entirely sure whether Opacus supports graphs, though. Validating the model through the PrivacyEngine says that our GCN model is valid, but we're running into a new error here:

File "c:\Users\me\Desktop\github\opacus_graph\tds.py", line 88, in <module>
  loss.backward()
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_tensor.py", line 487, in backward
  torch.autograd.backward(
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\autograd\__init__.py", line 200, in backward
  Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 69, in __call__
  return self.hook(module, *args, **kwargs)
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\opacus\grad_sample\grad_sample_module.py", line 337, in capture_backprops_hook
  grad_samples = grad_sampler_fn(module, activations, backprops)
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\opacus\grad_sample\functorch.py", line 58, in ft_compute_per_sample_gradient
  per_sample_grads = layer.ft_compute_sample_grad(
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_functorch\vmap.py", line 426, in wrapped
  batch_size, flat_in_dims, flat_args, args_spec = _process_batched_inputs(in_dims, args, func)
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_functorch\vmap.py", line 119, in _process_batched_inputs
  return _validate_and_get_batch_size(flat_in_dims, flat_args), flat_in_dims, flat_args, args_spec
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_functorch\vmap.py", line 52, in _validate_and_get_batch_size
  raise ValueError(
ValueError: vmap: Expected all tensors to have the same size in the mapped dimension, got sizes [16, 7] for the mapped dimension

We're using the default DataLoader from torch_geometric.loader (from torch_geometric.loader import DataLoader), and our loader looks like this: data_loader = DataLoader(dataset, batch_size=32, shuffle=False).
(Using either the torch.utils.data or the torch_geometric.loader DataLoader results in the same error.)
Our dataset is

dataset = Planetoid(root='/tmp/Cora', name='Cora')
data = dataset[0].to(device)

And our trainer:

for epoch in range(10):
    for batch in data_loader:
        print("batch ", batch)
        optimizer.zero_grad()
        out = model(batch)
        out.to(device)
        loss = F.nll_loss(out, batch.y)
        loss.backward()
        optimizer.step()
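
For illustration only, a quick way to see which leading dimensions disagree is to print the shapes of what the loader yields (this sketch assumes the data_loader defined above and is not part of the original script; Opacus's functorch path vmaps over dim 0 of the tensors each layer sees, so they all need the same leading size):

for batch in data_loader:
    print("x:", tuple(batch.x.shape))                    # [num_nodes, num_features]
    print("edge_index:", tuple(batch.edge_index.shape))  # [2, num_edges]
    print("y:", tuple(batch.y.shape))                    # node-level labels for Planetoid/Cora
    break

With a single Planetoid graph there is no per-graph batch dimension at all, which would explain why vmap cannot find a consistent size to map over.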

sagerkudrick reopened this May 11, 2023

nhianK commented Mar 21, 2024

I get similar behavior when I wrap one of my models with GradSampleModule(). Were you able to solve this issue? It doesn't work with batch_first=False either, when I use GradSampleModule(model, batch_first=False).
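
For reference, a minimal sketch of that wrapping with a placeholder model (GradSampleModule is the low-level wrapper that attaches the per-sample gradient hooks, and batch_first is the same flag discussed above):

from opacus.grad_sample import GradSampleModule

# Wrap an existing nn.Module so per-sample gradients are captured on backward();
# batch_first=False tells the hooks the batch dimension is dim 1, not dim 0.
wrapped_model = GradSampleModule(model, batch_first=False)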

@Zening-Li

@sagerkudrick I have the same problem. Have you solved this issue?
