Auto generate link for URL on news and comments #214
Hello again, I have improved this suggestion by getting the first URL from the text and extracting its meta information, like Facebook or Twitter do.
This was the commit that implemented it:
The main responsible file is the
If you are good with this idea, I can merge it with the previous suggestion and open the pull request. My only doubt was about adding this new requirement in this file; I am not sure if this is the right way:
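A minimal sketch of that flow with only the standard library (the regex and the `OpenGraphParser` name here are illustrative stand-ins, not the code from the commit):

```python
import re
from html.parser import HTMLParser

# Hypothetical helpers illustrating the idea; the actual commit may differ.
URL_RE = re.compile(r"https?://[^\s]+")

def first_url(text):
    """Return the first URL found in a block of text, or None."""
    match = URL_RE.search(text)
    return match.group(0) if match else None

class OpenGraphParser(HTMLParser):
    """Collect Open Graph meta tags (og:title, og:image, ...)."""
    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attrs = dict(attrs)
            prop = attrs.get("property", "")
            if prop.startswith("og:"):
                self.meta[prop] = attrs.get("content", "")

# Example run on a tiny HTML snippet
html = '<head><meta property="og:title" content="Hello"></head>'
parser = OpenGraphParser()
parser.feed(html)
url = first_url("check https://example.com please")
```

In the real feature the HTML would of course come from fetching `url`, which is where the timeout discussion below comes in.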
This is actually awesome @Vamoss. I have been thinking about implementing this for some time now, but I haven't had the time to do it myself. I appreciate that you took the time to show it to me. Yes, please, I will gladly receive your PR for this.
Great to hear you like it. I started learning Django because of Bootcamp, so I don't know much yet.
Yes @Vamoss, it is in the right place. Can you replace the URL you use to describe the package with the URL of the repository?
Hi @sebastian-code, |
The only feature not implemented yet would be caching the metatag image.
Hi @sebastian-code, amazing work improving the code! Just one small fix: the news items throw an error when there are no URLs in them. This should fix it:
Before opening a pull request I would like to discuss two topics:
Considering the complexity of those two subjects, I would recommend rolling back to the original suggestion, or finding a better way to solve those issues.
Now I see the value of using a really complex regex like the one you gave, @Vamoss. I'll recover it then. About the
I know
During development I found some cases where a website returned the first byte (avoiding the request timeout) and then took 30 seconds to fully return the content. Another approach to use with
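That trickle scenario can be reproduced locally without any external site. The sketch below (plain sockets, hypothetical timings) shows how a 1-second per-read timeout never fires even though the full body takes about a second to arrive:

```python
import socket
import threading
import time

def slow_server(srv):
    """Accept one connection, send headers fast, then trickle the body."""
    conn, _ = srv.accept()
    conn.recv(1024)  # read the request
    conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\n")
    for ch in b"hello":
        time.sleep(0.2)            # one body byte every 200 ms
        conn.sendall(bytes([ch]))
    conn.close()

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
threading.Thread(target=slow_server, args=(srv,), daemon=True).start()

cli = socket.create_connection(srv.getsockname())
cli.settimeout(1.0)                # per-read timeout, like requests' timeout=
cli.sendall(b"GET / HTTP/1.1\r\nHost: x\r\n\r\n")

start = time.monotonic()
data = b""
while b"hello" not in data:        # each recv returns well within 1 s...
    data += cli.recv(64)
elapsed = time.monotonic() - start  # ...yet the full read takes ~1 s in total
cli.close()
```

Stretch the per-byte delay out and the total time grows without bound while the per-read timeout stays happy, which is exactly why a wall-clock limit is needed.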
@Vamoss I see your point. I think this can be solved by implementing the proper exception workflow from requests when declaring the timeout. I was thinking of something along these lines:

try:
    r = requests.get(url, timeout=3)
    r.raise_for_status()
except requests.exceptions.HTTPError as errh:
    print("HTTP Error:", errh)
except requests.exceptions.ConnectionError as errc:
    print("Error Connecting:", errc)
except requests.exceptions.Timeout as errt:
    print("Timeout Error:", errt)
except requests.exceptions.RequestException as err:
    print("Oops, something else:", err)

But thinking about what the documentation says, perhaps your solution with eventlet is the way to go:

with eventlet.Timeout(10):
    requests.get("someverytrickyURL.com")
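If pulling in eventlet only for this feels heavy, the same total wall-clock bound can be sketched with the standard library (`slow_fetch` here is a hypothetical stand-in for the `requests.get` call):

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

def slow_fetch():
    """Hypothetical stand-in for requests.get on a very slow site."""
    time.sleep(0.5)
    return "page content"

executor = ThreadPoolExecutor(max_workers=1)
future = executor.submit(slow_fetch)
try:
    result = future.result(timeout=0.1)  # bound the *total* time spent
except FutureTimeout:
    result = None                        # give up on slow pages gracefully
executor.shutdown(wait=False)
```

Unlike requests' own `timeout=` parameter (which applies per read), this bounds the whole call, matching what `eventlet.Timeout` does.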
By the way @Vamoss, I could not reproduce a timeout situation; do you have a URL I can use for testing?
Great! There is hope :) Yes, that is hard to simulate. I don't remember exactly which site I was testing for that case, but I suspect it was https://reddit.com, or my personal website https://vamoss.com.br (my server can be very slow sometimes). This site may help (I tested some pages and not all of them were actually slow):
Nonetheless, I think you have a good strategy; I would be happy to see it in action.
@Vamoss I'm trying to revert the regex, but when I tested it in the specific case you mention (I tried using vamoss.com.br) it throws a MissingSchema error; the log actually mentions detecting the right pattern. Do you know something? My regex is really bad.
It happens because requests requires the URL to include a scheme (http:// or https://).
This code works for me:

def fetch_metadata(text):
    """Method to consolidate the workflow to recover the metadata of the
    page of the first URL found in a given text block.

    :param text: Block of text of any length
    """
    urls = get_urls(text)
    if len(urls) > 0:
        url = urls[0]
        if not url.lower().startswith(("http://", "https://", "ftp://")):
            url = "http://" + url
        return get_metadata(url)
    return None
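With hypothetical stand-ins for `get_urls` and `get_metadata` (the real ones live elsewhere in the project), the scheme-prefix behaviour of `fetch_metadata` can be exercised in isolation:

```python
import re

def get_urls(text):
    """Hypothetical stand-in: match bare domains as well as full URLs."""
    return re.findall(r"(?:https?://)?(?:\w[\w-]*\.)+[a-z]{2,}", text)

def get_metadata(url):
    """Hypothetical stand-in: just echo the URL it would fetch."""
    return {"url": url}

def fetch_metadata(text):
    """Recover metadata for the first URL found in a block of text."""
    urls = get_urls(text)
    if len(urls) > 0:
        url = urls[0]
        if not url.lower().startswith(("http://", "https://", "ftp://")):
            url = "http://" + url  # vamoss.com.br -> http://vamoss.com.br
        return get_metadata(url)
    return None

meta = fetch_metadata("my site is vamoss.com.br, check it")
```

The prefix step is what avoids the MissingSchema error for scheme-less URLs like vamoss.com.br.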
No problem @Vamoss, I fixed it using a different approach. Thanks for the help, though.
I think we can close this issue @sebastian-code |
Hi,
I have recently implemented auto-generating hyperlinks for URLs on news.
This was done by these two commits:
News list:
EncontrosDigitais@6681653
News Comments:
EncontrosDigitais@dc9e319
If it is of interest to all, I could create a pull request.
Cheers,
Carlos
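For the auto-linking itself, a minimal regex sketch of the idea (the actual commits may do this differently, e.g. via Django's built-in urlize filter):

```python
import re

# Hypothetical: match bare http(s) URLs; stop at whitespace or a tag opener.
URL_RE = re.compile(r"(https?://[^\s<]+)")

def auto_link(text):
    """Wrap bare URLs in a block of text with anchor tags."""
    return URL_RE.sub(r'<a href="\1" target="_blank">\1</a>', text)

html = auto_link("see https://example.com for details")
```

In a real Django template the input would need escaping before linking, which is one reason the built-in urlize filter is usually preferred over a hand-rolled regex.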