Tools for Misfigured URLs #18

There is an unresolved issue when parsing URLs that bleed into the surrounding text (often because of rich-text features such as tables).

For example,

`https://www.example.com/index.html.Beginning_of_following_paragraph` could be resolved by accepting only one period after the URL, but

`https://www.example.com/index.htmlBeginning_of_following_paragraph` would still not be resolved, because no period separates the URL from the following text.

I think an easier solution might be to offer some optional cleaning functions for the dataframes that archivr produces, but there could be other ideas.
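As a rough illustration of what such an optional cleaning step could look like, here is a minimal sketch of the logic in Python (archivr itself is an R package, so this is not its API and would need to be ported). The function name `trim_bleeding_url` and the `KNOWN_EXTENSIONS` list are hypothetical. The heuristic assumes that a known file extension in the path marks the end of the URL unless it is followed by `/`, `?`, or `#`. It only inspects the first such extension and will not help with URLs that have no file extension.

```python
import re

# Hypothetical list of extensions that are assumed to end a URL path.
# Extend or shorten it to match the URLs in your own data.
KNOWN_EXTENSIONS = ("html", "htm", "php", "aspx", "pdf")

# Longer extensions come first so that ".html" is preferred over ".htm".
_EXT = re.compile(
    r"\.(?:" + "|".join(sorted(KNOWN_EXTENSIONS, key=len, reverse=True)) + r")",
    re.IGNORECASE,
)


def trim_bleeding_url(url):
    """Cut off text that has bled onto the end of a URL (heuristic)."""
    scheme_end = url.find("://")
    if scheme_end == -1:
        return url  # not an absolute URL; leave it alone
    path_start = url.find("/", scheme_end + 3)
    if path_start == -1:
        return url  # no path, so no file extension to anchor on
    # Only search the part after the host, so "www.php.net" is not a hit.
    match = _EXT.search(url, path_start)
    if match is None:
        return url
    rest = url[match.end():]
    # A further path segment, query string, or fragment is legitimate.
    if rest == "" or rest[0] in "/?#":
        return url
    return url[:match.end()]


# Applying the cleaner to a column of extracted URLs (here a plain list).
urls = [
    "https://www.example.com/index.html.Beginning_of_following_paragraph",
    "https://www.example.com/index.htmlBeginning_of_following_paragraph",
    "https://www.example.com/index.html?page=2",
]
cleaned = [trim_bleeding_url(u) for u in urls]
```

Keeping this as an opt-in function rather than part of the parser means a false positive (for example, a real path that continues after `.html` with other characters) only affects users who chose to run the cleaner.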


