Error using custom mapper #5350

Open
zacklew opened this issue Aug 15, 2024 · 0 comments
zacklew commented Aug 15, 2024

If you do not know the root cause of the problem, please post according to this template:

Instructions To Reproduce the Issue:

I tried to create a custom mapping function to increase the number of augmentations performed on my input images. Code for the custom mapper and trainer is below:

```python
import copy
import os

import torch
from detectron2.data import build_detection_train_loader
from detectron2.data import detection_utils as utils
from detectron2.data import transforms as T
from detectron2.engine import DefaultTrainer


def custom_mapper(dataset_dict):
    dataset_dict = copy.deepcopy(dataset_dict)  # it will be modified by code below
    image = utils.read_image(dataset_dict["file_name"], format="BGR")
    transform_list = [
        T.Resize((800, 600)),
        T.RandomBrightness(0.8, 1.8),
        T.RandomContrast(0.6, 1.3),
        T.RandomSaturation(0.8, 1.4),
        T.RandomRotation(angle=[90, 90]),
        T.RandomLighting(0.7),
        T.RandomFlip(prob=0.4, horizontal=False, vertical=True),
    ]
    image, transforms = T.apply_transform_gens(transform_list, image)
    dataset_dict["image"] = torch.as_tensor(image.transpose(2, 0, 1).astype("float32"))

    annos = [
        utils.transform_instance_annotations(obj, transforms, image.shape[:2])
        for obj in dataset_dict.pop("annotations")
        if obj.get("iscrowd", 0) == 0
    ]
    instances = utils.annotations_to_instances(annos, image.shape[:2])
    dataset_dict["instances"] = utils.filter_empty_instances(instances)
    return dataset_dict


class CustomTrainer(DefaultTrainer):
    @classmethod
    def build_train_loader(cls, cfg):
        return build_detection_train_loader(cfg, mapper=custom_mapper)


# cfg and MemoryTrackingHook are defined earlier in the notebook (not shown)
torch.cuda.empty_cache()
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)

os.environ['CUDA_LAUNCH_BLOCKING'] = '1'

trainer = CustomTrainer(cfg)
trainer.register_hooks([MemoryTrackingHook(period=1)])
trainer.resume_or_load(resume=False)
trainer.train()
```
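For context on what the mapper is doing structurally, here is a self-contained toy sketch of the same pattern (NumPy only; `VFlip`, `apply_transform_gens`, and `toy_mapper` are illustrative stand-ins, not detectron2 APIs): deep-copy the cached dataset record, apply each transform to the image in order, and keep the applied transforms so annotations can later be transformed with the same geometry.

```python
import copy
import numpy as np


class VFlip:
    """Toy stand-in for one detectron2-style transform (vertical flip)."""
    def apply_image(self, img):
        return np.flipud(img).copy()


def apply_transform_gens(gens, image):
    """Apply transforms in order; return the image and the applied transforms."""
    applied = []
    for g in gens:
        image = g.apply_image(image)
        applied.append(g)
    return image, applied


def toy_mapper(dataset_dict, image):
    d = copy.deepcopy(dataset_dict)  # never mutate the cached dataset record
    image, tfms = apply_transform_gens([VFlip()], image)
    d["image"] = image.transpose(2, 0, 1).astype("float32")  # HWC -> CHW
    return d, tfms
```

The key design point the real mapper also relies on: the list of *applied* transforms is returned alongside the image, so boxes, masks, and any other per-record labels can be pushed through the exact same geometry afterwards.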

  1. What exact command you run:
  2. Full logs or other relevant observations:
```
AssertionError Traceback (most recent call last)
Cell In[15], line 7
5 trainer.register_hooks([MemoryTrackingHook(period=1)])
6 trainer.resume_or_load(resume=False)
----> 7 trainer.train()

File ~/detectron_uw_repo/detectron2/engine/defaults.py:488, in DefaultTrainer.train(self)
481 def train(self):
482 """
483 Run training.
484
485 Returns:
486 OrderedDict of results, if evaluation is enabled. Otherwise None.
487 """
--> 488 super().train(self.start_iter, self.max_iter)
489 if len(self.cfg.TEST.EXPECTED_RESULTS) and comm.is_main_process():
490 assert hasattr(
491 self, "_last_eval_results"
492 ), "No evaluation results obtained during training!"

File ~/detectron_uw_repo/detectron2/engine/train_loop.py:155, in TrainerBase.train(self, start_iter, max_iter)
153 for self.iter in range(start_iter, max_iter):
154 self.before_step()
--> 155 self.run_step()
156 self.after_step()
157 # self.iter == max_iter can be used by after_train to
158 # tell whether the training successfully finished or failed
159 # due to exceptions.

File ~/detectron_uw_repo/detectron2/engine/defaults.py:498, in DefaultTrainer.run_step(self)
496 def run_step(self):
497 self._trainer.iter = self.iter
--> 498 self._trainer.run_step()

File ~/detectron_uw_repo/detectron2/engine/train_loop.py:494, in AMPTrainer.run_step(self)
492 self.optimizer.zero_grad()
493 with autocast(dtype=self.precision):
--> 494 loss_dict = self.model(data)
495 if isinstance(loss_dict, torch.Tensor):
496 losses = loss_dict

File ~/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py:1532, in Module._wrapped_call_impl(self, *args, **kwargs)
1530 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1531 else:
-> 1532 return self._call_impl(*args, **kwargs)

File ~/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py:1541, in Module._call_impl(self, *args, **kwargs)
1536 # If we don't have any hooks, we want to skip the rest of the logic in
1537 # this function, and just call forward.
1538 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1539 or _global_backward_pre_hooks or _global_backward_hooks
1540 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1541 return forward_call(*args, **kwargs)
1543 try:
1544 result = None

File ~/detectron_uw_repo/detectron2/modeling/meta_arch/panoptic_fpn.py:119, in PanopticFPN.forward(self, batched_inputs)
116 images = self.preprocess_image(batched_inputs)
117 features = self.backbone(images.tensor)
--> 119 assert "sem_seg" in batched_inputs[0]
120 gt_sem_seg = [x["sem_seg"].to(self.device) for x in batched_inputs]
121 gt_sem_seg = ImageList.from_tensors(
122 gt_sem_seg,
123 self.backbone.size_divisibility,
124 self.sem_seg_head.ignore_value,
125 self.backbone.padding_constraints,
126 ).tensor

AssertionError:
```

Expected behavior:

I expected the model to train as normal and am not sure what's causing this error. I followed other examples posted here of how people have implemented custom mappers, and haven't found anyone else hitting this error.
Any and all help or advice would be greatly appreciated.
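Judging by the traceback, the assertion fires because `PanopticFPN.forward` requires every batched input to carry a `"sem_seg"` ground-truth tensor, and the custom mapper only produces `"image"` and `"instances"`; detectron2's default `DatasetMapper` builds `"sem_seg"` from the record and runs it through the same transforms as the image. A self-contained toy sketch of that idea (NumPy only; `Rot90` and `attach_sem_seg` are illustrative names, not detectron2 APIs):

```python
import numpy as np


class Rot90:
    """Toy geometric transform applied identically to image and label mask."""
    def apply_image(self, img):
        return np.rot90(img).copy()

    def apply_segmentation(self, seg):
        # labels are categorical: reuse the geometry, never interpolate values
        return np.rot90(seg).copy()


def attach_sem_seg(dataset_dict, sem_seg, transforms):
    """Run the already-applied transforms over the semantic mask and store
    it under the key the panoptic model asserts on."""
    for t in transforms:
        sem_seg = t.apply_segmentation(sem_seg)
    dataset_dict["sem_seg"] = sem_seg.astype("int64")
    return dataset_dict
```

The point of the sketch is that whatever geometric transforms were applied to the image must also be applied to the semantic mask before it is stored, so that the two stay pixel-aligned; in a real detectron2 mapper this would happen alongside the existing `transform_instance_annotations` call.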

Environment:

```
sys.platform              linux
Python                    3.12.4 | packaged by Anaconda, Inc. | (main, Jun 18 2024, 15:12:24) [GCC 11.2.0]
numpy                     1.26.4
detectron2                0.6 @/home/computational/anaconda3/lib/python3.12/site-packages/detectron2
Compiler                  GCC 11.2
CUDA compiler             CUDA 12.5
detectron2 arch flags     8.6
DETECTRON2_ENV_MODULE
PyTorch                   2.3.1+cu121 @/home/computational/anaconda3/lib/python3.12/site-packages/torch
PyTorch debug build       False
torch._C._GLIBCXX_USE_CXX11_ABI  False
GPU available             Yes
GPU 0                     NVIDIA RTX A2000 12GB (arch=8.6)
Driver version            555.58.02
CUDA_HOME                 /home/computational/anaconda3
Pillow                    10.3.0
torchvision               0.18.1+cu121 @/home/computational/anaconda3/lib/python3.12/site-packages/torchvision
torchvision arch flags    5.0, 6.0, 7.0, 7.5, 8.0, 8.6, 9.0
fvcore                    0.1.5.post20221221
iopath                    0.1.9
cv2                       4.10.0
```

PyTorch built with:

  • GCC 9.3
  • C++ Version: 201703
  • Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v3.3.6 (Git Hash 86e6af5974177e513fd3fee58425e1063e7f1361)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: AVX512
  • CUDA Runtime 12.1
  • NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  • CuDNN 8.9.2
  • Magma 2.6.1
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.3.1, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=1, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF,