Theory: custom exceptions are not being caught properly in the program itself? If that's the case, is it a dmi-scheduler issue? The real problem is that logging of these failures is not handled at all, and the job is not properly closed out by the dmi-scheduler.
Possible solution: at the very least we could ensure that our base logger also handles the root log, so that logs are not just sent to stderr/stdout but are also stored in the log file. This would not resolve the error handling/job closure, though.
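A minimal sketch of that idea (hypothetical function name; dmi-scheduler's actual logger setup may differ): attach both a stream handler and a file handler to the *root* logger, so records emitted by any module's logger propagate into the same log file instead of only reaching stderr/stdout.

```python
import logging
import sys

def configure_root_logging(log_path):
    """Attach stream + file handlers to the ROOT logger so that records
    from every module's logger (not just our own base logger) end up
    both on stderr and in the log file."""
    root = logging.getLogger()  # no name = the root logger
    root.setLevel(logging.INFO)
    fmt = logging.Formatter("%(asctime)s [%(levelname)s] %(message)s")

    stream = logging.StreamHandler(sys.stderr)
    stream.setFormatter(fmt)
    root.addHandler(stream)

    file_handler = logging.FileHandler(log_path)
    file_handler.setFormatter(fmt)
    root.addHandler(file_handler)
```

Because module loggers propagate to the root by default, this captures third-party and stdlib log records too, but as noted above it does nothing for uncaught exceptions or job closure.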
Two examples where exceptions are being raised to _bootstrap_inner in threading.py.
```
Exception in thread crawler:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/opt/middleman-python/dmi_scheduler/worker.py", line 57, in run
    self.work()
  File "/opt/middleman-python/workers/crawl.py", line 46, in work
    network_pending.iterate()
  File "/opt/middleman-python/lib/network.py", line 222, in iterate
    new_corpus.add_LRUs(starting_points)
  File "/opt/middleman-python/lib/corpus.py", line 283, in add_LRUs
    self.add_URLs([hyphe.LRU_to_URL(lru) for lru in lrus])
  File "/opt/middleman-python/lib/corpus.py", line 241, in add_URLs
    self.add_URL(url)
  File "/opt/middleman-python/lib/corpus.py", line 255, in add_URL
    response = hyphe.call("store.declare_webentity_by_lrus_as_urls", [url], None, None, [], False, self.corpus_id)
  File "/opt/middleman-python/lib/hyphe.py", line 90, in call
    raise HypheException("Unexpected response format\n\n%s" % repr(response))
lib.exceptions.HypheException: Unexpected response format

{'jsonrpc': '2.0', 'id': '0dbec3f9-ed8e-4afc-a262-90057e3a6cb3', 'error': {'message': '', 'code': 8002, 'data': ''}}
```
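For context, the raise at hyphe.py:90 suggests a JSON-RPC response check along these lines (a sketch with hypothetical internals; the actual hyphe.py logic is not shown in this issue):

```python
class HypheException(Exception):
    """Raised when the Hyphe JSON-RPC API returns something unexpected."""

def check_response(response):
    """Return the result payload, or raise HypheException if the reply
    carries an error object or lacks a 'result' key entirely."""
    if not isinstance(response, dict) or "result" not in response:
        # e.g. {'jsonrpc': '2.0', 'id': ..., 'error': {'code': 8002, ...}}
        raise HypheException("Unexpected response format\n\n%s" % repr(response))
    return response["result"]
```

The point for this issue is not the raise itself, which is deliberate, but that the resulting HypheException escapes all the way up to threading's `_bootstrap_inner` instead of being caught by the worker.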
or
```
Exception in thread crawler:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/opt/middleman-python/dmi_scheduler/worker.py", line 57, in run
    self.work()
  File "/opt/middleman-python/workers/crawl.py", line 46, in work
    network_pending.iterate()
  File "/opt/middleman-python/lib/network.py", line 216, in iterate
    new_corpus = lib.corpus.Corpus.create(self.get_corpus_ID(next_iteration), self.local_db, self.remote_db, detail_level, self.depth)
  File "/opt/middleman-python/lib/corpus.py", line 656, in create
    raise CorpusException("Could not create corpus %s: %s" % (name, repr(created)))
lib.exceptions.CorpusException: Could not create corpus ic-2-356267: {'message': 'There already is a corpus named "ic-2-356267".', 'code': 'fail'}
```
When the worker does catch the exception, the log looks like this:

```
2021-11-03 12:00:55,960 [ERROR] Worker crawler raised exception CorpusException and will abort: Could not create corpus ic-2-356269: {'message': 'There already is a corpus named "ic-2-356269".', 'code': 'fail'} at worker.py:57->crawl.py:46->network.py:216->corpus.py:656
```
But it has always been BaseException, so there must be another reason. At some point I switched self.log.error() to self.log.exception() in the except block around self.run() in the dmi-scheduler worker, and then back again; perhaps that was the culprit. I switched back to BaseException to test that, but it failed to catch the error again (while using self.log.error() as normal).
This seems fixed, but I do not know why it started to begin with.