This repository has been archived by the owner on Jan 4, 2023. It is now read-only.

Bad detected encoding in ops-rose (BLOB attribute) #536

Closed
Peltier10 opened this issue Jul 28, 2021 · 3 comments
Labels
type/bug Something isn't working

Comments

@Peltier10
Contributor

Description

An error occurred in the ETL while decoding a BLOB attribute.

Error in Kibana:
Traceback (most recent call last):
  File "/srv/django/extractor/service.py", line 41, in broadcast_events
    producer.produce_event(topic=f"{conf.PRODUCED_TOPIC_PREFIX}{batch_id}", event=event)
  File "/srv/django/common/kafka/producer.py", line 69, in produce_event
    value=CustomJSONEncoder().encode(event),
  File "/usr/local/lib/python3.8/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/local/lib/python3.8/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/srv/django/common/kafka/producer.py", line 27, in default
    return obj.decode(detected_encoding)
  File "/usr/local/lib/python3.8/encodings/cp1254.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 2: character maps to <undefined>

Another error was also reported: 'charmap' codec can't decode byte 0x8d in position 6: character maps to <undefined>

River detected the encoding as cp1254, whereas the actual encoding is latin-1. Bytes 0x90 and 0x8d are undefined in cp1254, which is why the decode raises.
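The mismatch can be reproduced in isolation. This is only a minimal sketch: the sample bytes are made up for illustration and are not taken from the actual BLOB. It shows that cp1254 rejects 0x90 while latin-1 accepts any byte.

```python
# Hypothetical sample (not real data): includes 0x90, which cp1254 leaves undefined.
sample = b"ab\x90cd"

try:
    sample.decode("cp1254")
    cp1254_ok = True
    cp1254_error = ""
except UnicodeDecodeError as exc:
    cp1254_ok = False
    cp1254_error = str(exc)

# latin-1 maps every byte 0x00-0xFF to a code point, so decoding never raises.
latin1_text = sample.decode("latin-1")

print(cp1254_ok, cp1254_error)
print(repr(latin1_text))
```

Note that latin-1 never failing also means it cannot tell you when it is the wrong guess; it only guarantees that the decode does not raise.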

Environment

Rose

Resolution

Two possible solutions:

  • In common/kafka/producer.py: wrap the decode in a try block; if decoding with the detected encoding fails, fall back to decoding as latin-1.
  • Handle it in a cleaning script: decode the value from base64, then encode it as UTF-8 (a latin-1 decode may be needed in between).
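Both options could look roughly like the sketch below. It is not the actual River code: `decode_with_fallback`, `FallbackJSONEncoder`, and `clean_blob` are hypothetical names, the encoder is only a stand-in for `CustomJSONEncoder`, and the cleaning function assumes the BLOB reaches the script as a base64 string.

```python
import base64
import json


def decode_with_fallback(raw: bytes, detected_encoding: str) -> str:
    """Option 1: try the detected encoding, fall back to latin-1 if it fails."""
    try:
        return raw.decode(detected_encoding)
    except (UnicodeDecodeError, LookupError):
        # latin-1 accepts every byte value, so this branch cannot raise.
        return raw.decode("latin-1")


class FallbackJSONEncoder(json.JSONEncoder):
    """Hypothetical stand-in for CustomJSONEncoder in common/kafka/producer.py."""

    # Assumption: the detected encoding is available to the encoder somehow.
    detected_encoding = "cp1254"

    def default(self, obj):
        if isinstance(obj, (bytes, bytearray)):
            return decode_with_fallback(bytes(obj), self.detected_encoding)
        return super().default(obj)


def clean_blob(b64_value: str) -> bytes:
    """Option 2: base64 decode, interpret as latin-1, re-encode as UTF-8."""
    raw = base64.b64decode(b64_value)
    return raw.decode("latin-1").encode("utf-8")


encoded_event = FallbackJSONEncoder().encode({"value": b"ab\x90"})
cleaned = clean_blob(base64.b64encode(b"caf\xe9").decode("ascii"))
```

The fallback keeps the producer from crashing, but if the real encoding is neither the detected one nor latin-1 the resulting text will be silently wrong, so logging the fallback may be worthwhile.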
@Peltier10 Peltier10 added the type/bug Something isn't working label Jul 28, 2021
@elsiehoffet-94
Contributor

@simonvadee didn't we come across a similar issue where River had to try to decode something?
@Peltier10 is it clear to you how to do the decoding/encoding steps in the cleaning script?

@elsiehoffet-94
Contributor

Is this linked to #376 and #422?

@simonvadee
Contributor

duplicate


3 participants