Open
Changes from 1 commit
29 changes: 29 additions & 0 deletions entities.c
@@ -390,3 +390,32 @@ size_t decode_html_entities_utf8(char *dest, const char *src)
    return (size_t)(to - dest);
}

int encode_html_entities(char *dest, const char *src) {
Owner

Please return size_t instead of int.

Contributor Author

Completely agree. I'll commit the correction.

    char *to = dest;
    for( const char *from = src ; *from ; from++ ) {
        int i = 9999;
Owner

Please declare i as late as possible and do not assign a magic unused value to it

Contributor Author

Completely agree. I'll commit the correction.
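
A minimal sketch of what the requested change could look like, not the committed fix. The index has to outlive the search loop because it is checked afterwards, so "as late as possible" here means directly before that loop, with no placeholder value. The table `SAMPLE_ENTITIES` and the function `find_entity_index` are hypothetical stand-ins; the two-column layout (entity text, then its value) is only inferred from how the diff indexes `NAMED_ENTITIES`. Using `size_t` for the index also avoids comparing a signed `int` against the unsigned result of `sizeof`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real NAMED_ENTITIES table:
   column 0 is the entity text, column 1 is the character it represents. */
static const char *const SAMPLE_ENTITIES[][2] = {
    { "&amp;", "&" },
    { "&lt;",  "<" },
    { "&gt;",  ">" },
};

#define SAMPLE_COUNT (sizeof SAMPLE_ENTITIES / sizeof *SAMPLE_ENTITIES)

/* Returns the index of the first entry whose value starts with *from,
   or SAMPLE_COUNT when no entry matches. */
size_t find_entity_index(const char *from)
{
    assert(from != NULL);

    size_t i; /* declared right before its first use, no magic initial value */
    for (i = 0; i < SAMPLE_COUNT; i++)
        if (*from == SAMPLE_ENTITIES[i][1][0])
            break;
    return i;
}
```

A caller would treat a return value equal to `SAMPLE_COUNT` as "no entity found" and copy the input byte unchanged.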

        if ( *from <= '+' ) {
            sprintf(to,"%%%02x",*from);
            to += 3;
            continue;
        }
        if ( *from<='z' ) {
            *to = *from;
            to++;
            continue;
        }
        //if ( *from=='\r' || *from=='\n' ) continue;
        for( i=0 ; i<sizeof NAMED_ENTITIES / sizeof *NAMED_ENTITIES ; i++ )
            if ( *from == NAMED_ENTITIES[i][1][0] ) break;
Owner

I don't really get this logic :-) Are you comparing just the first character, or am I missing something? Or is this sufficient because of the available entities? (If so, please make sure this remains the case by adding an appropriate test case.)

Contributor Author

Yes, I'm comparing only the first character. I found this sufficient for the final application I was developing. Maybe this encoder should be classified as an ASCII encoder instead of a UTF-8 encoder. I'll think about this over the next week. The comparison could be extended to cover more characters, but this solution would still be inefficient, since it performs a top-down search on an unsorted table.
If I don't improve the comparison, I'll try to make the restriction clear to future users and developers.
Anyway, I'll provide a test case.

        if ( i<sizeof NAMED_ENTITIES / sizeof *NAMED_ENTITIES ) {
            strcpy(to,NAMED_ENTITIES[i][0]);
            to += strlen(NAMED_ENTITIES[i][0]);
        }
        else {
            *to = *from;
            to++;
        }
    }
    *to = 0;
    return strlen(dest);
Owner

Since you know dest and to, no call to strlen is necessary

Contributor Author

Completely agree. I'll commit the correction.
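
The two return-value suggestions from this review (return `size_t`, skip the `strlen` call) can be sketched together, mirroring the `(size_t)(to - dest)` return that `decode_html_entities_utf8` already uses. The function `encode_ampersands` is a deliberately tiny, hypothetical encoder used only for illustration; it is not the PR's `encode_html_entities`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical minimal encoder: replaces '&' with "&amp;" and copies
   every other byte unchanged. The caller must provide a `dest` buffer
   large enough for the result (at most 5 * strlen(src) + 1 bytes). */
size_t encode_ampersands(char *dest, const char *src)
{
    assert(dest != NULL && src != NULL);

    char *to = dest;
    for (const char *from = src; *from; from++) {
        if (*from == '&') {
            memcpy(to, "&amp;", 5);
            to += 5;
        } else {
            *to++ = *from;
        }
    }
    *to = '\0';

    /* `to` points at the terminator, so the pointer difference is the
       output length; no second pass with strlen is needed. */
    return (size_t)(to - dest);
}
```

For the input "a&b" this writes "a&amp;b" and returns 7.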

}