Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a digest authentication helper #2213

Open
wants to merge 19 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CONTRIBUTORS.txt
Original file line number Diff line number Diff line change
Expand Up @@ -99,6 +99,7 @@ Jeroen van der Heijden
Jesus Cea
Jinkyu Yi
Joel Watts
John Feusi
Jon Nabozny
Joongi Kim
Josep Cugat
Expand Down
196 changes: 193 additions & 3 deletions aiohttp/helpers.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
import cgi
import datetime
import functools
import hashlib
import inspect
import os
import re
Expand All @@ -17,18 +18,19 @@
from contextlib import suppress
from math import ceil
from pathlib import Path
from urllib.parse import quote
from time import gmtime
from urllib.parse import quote, urlparse
from urllib.request import getproxies

import async_timeout
from yarl import URL

from . import hdrs
from . import hdrs, client_exceptions
from .abc import AbstractAccessLogger
from .log import client_logger


__all__ = ('BasicAuth',)
__all__ = ('BasicAuth', 'DigestAuth')


sentinel = object()
Expand Down Expand Up @@ -182,6 +184,194 @@ def encode(self):
return 'Basic %s' % base64.b64encode(creds).decode(self.encoding)


def parse_pair(pair):
key, value = pair.split('=', 1)

# If it has a trailing comma, remove it.
if value[-1] == ',':
value = value[:-1]

# If it is quoted, then remove them.
if value[0] == value[-1] == '"':
value = value[1:-1]

return key, value


def parse_key_value_list(header):
return {
key: value for key, value in
map(parse_pair, header.split(' '))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please replace map with a nested comprehension.
It's a part of aiohtp preferred codestyle.

}


class DigestAuth():
"""HTTP digest authentication helper.
The work here is based off of
https://github.com/requests/requests/blob/v2.18.4/requests/auth.py.

:param str username: Username or login
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't use autodocs markup in headers -- we don't use autodoc facility anyway.

:param str password: Password
:param ClientSession session: Session to use digest auth
"""

def __init__(self, username, password, session, previous=None):
if previous is None:
previous = {}

self.username = username
self.password = password
self.last_nonce = previous.get('last_nonce', '')
self.nonce_count = previous.get('nonce_count', 0)
self.challenge = previous.get('challenge')
self.args = {}
self.session = session

@asyncio.coroutine
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nobody uses this syntax here anymore.

def request(self, method, url, *, headers=None, **kwargs):
if headers is None:
headers = {}

# Save the args so we can re-run the request
self.args = {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you store last request as a class attribute? Won't it cause everything to break when DigestAuth instance is used by multiple coroutines simultaneously?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a good point. We can just store it as a local variable and then pass it to _handle_401.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this applies to self.challenge as well.

'method': method,
'url': url,
'headers': headers,
'kwargs': kwargs
}

if self.challenge:
headers[hdrs.AUTHORIZATION] = self._build_digest_header(
method.upper(), url
)

response = yield from self.session.request(
method, url, headers=headers, **kwargs
)

# If the response is not in the 400 range, do not try digest
# authentication.
if not 400 <= response.status < 500:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So why do we try to auth in case of...503?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wouldn't try to do auth in the in the case of a 503. It will only try to perform digest authentication if the response status is from 400 up to but not including 500. Perhaps this would be clearer?

        # Only try performing digest authentication if the response status is
        # from 400 to 500.
        if 400 <= response.status < 500:
            return await self._handle_401(response)
        
        return response

As for why we don't just check for 401, you can see the discussion here: psf/requests#3772.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll notice the behavior should be equivalent but perhaps this is more readable.

return response

return (yield from self._handle_401(response))

def _build_digest_header(self, method, url):
"""
:rtype: str
"""

realm = self.challenge['realm']
nonce = self.challenge['nonce']
qop = self.challenge.get('qop')
algorithm = self.challenge.get('algorithm', 'MD5').upper()
opaque = self.challenge.get('opaque')

# lambdas assume digest modules are imported at the top level
if algorithm == 'MD5' or algorithm == 'MD5-SESS':
hash_fn = hashlib.md5
elif algorithm == 'SHA':
hash_fn = hashlib.sha1
else:
return ''

def H(x):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question about meaningless names.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of the nomenclature is pulled directly from the RFC (e.g. H, KD, HA1, HA2, cnonce, ncvalue, etc.)

return hash_fn(x.encode()).hexdigest()

def KD(s, d):
return H('%s:%s' % (s, d))

parsed = urlparse(url)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have yarl. Why url parse?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was unfamiliar with yarl; I will switch it over.

#: path is request-uri defined in RFC 2616 which should not be empty
path = parsed.path or "/"
if parsed.query:
path += '?' + parsed.query

A1 = '%s:%s:%s' % (self.username, realm, self.password)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question about meaningless names.

A2 = '%s:%s' % (method, path)

HA1 = H(A1)
HA2 = H(A2)

if nonce == self.last_nonce:
self.nonce_count += 1
else:
self.nonce_count = 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch is redundant. 0 +1 equals 1.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you re-use the instance, the nonce_count count can grow without bound. The nonce count is important in preventing replay attacks. There probably aren't too many cases where the nonce will need to reset but perhaps in the event of some failed messages.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should clarify, if you re-use the instance of DigestAuth for performing multiple requests.


self.last_nonce = nonce

ncvalue = '%08x' % self.nonce_count

k = str(self.nonce_count).encode()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's avoid one-two chars names except common dummy loop ones (i,j,k,etc.). What k here stands for?

k += nonce.encode()
k += time.ctime().encode()
k += os.urandom(8)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about join?

cnonce = (hashlib.sha1(k).hexdigest()[:16])

if algorithm == 'MD5-SESS':
HA1 = H('%s:%s:%s' % (HA1, nonce, cnonce))

if not qop:
respdig = KD(HA1, '%s:%s' % (nonce, HA2))
elif qop == 'auth' or 'auth' in qop.split(','):
noncebit = '%s:%s:%s:%s:%s' % (
nonce, ncvalue, cnonce, 'auth', HA2
)
respdig = KD(HA1, noncebit)
else:
raise client_exceptions.ClientError(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

qop is known quite early above. How about to fail fast?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There would have to be some duplication of logic around qop and I didn't think that the performance was that big of an issue. I can move it though.

'Unsupported qop value: %s' % qop
)

base = ', '.join([
'username="%s"' % self.username,
'realm="%s"' % realm,
'nonce="%s"' % nonce,
'uri="%s"' % path,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if path would contain " char? What about the rest params?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I believe the header is supposed to follow the rules outlined in RFC 2616 section 2. I couldn't figure out how that specification intended for quotes to be escaped. I did a little more searching and this SO post was the only other thing I found and it wasn't any help.

I have been using httpbin to do some testing. I opened up the developer tools in chrome and tried going to the httpbin with a username and password that contained a space and a double quote. Looking at the request headers, Chrome seems to perform a URL encoding (i.e. using %20 and %22). httpbin didn't seem to handle it correctly. I also tried doing the same thing with the python requests library and it didn't work either:

import requests
from requests.auth import HTTPDigestAuth
auth = HTTPDigestAuth('us er', 'pass"word')
url = 'http://httpbin.org/digest-auth/auth/us er/pass"word/MD5/never'
requests.get(url, auth=auth)

My guess is that the URL encoding is probably the best way to go. Also, I added parse_key_value_list in helpers.py to parse the key-value pairs that are coming in the response header but it probably isn't very complete either. Do you guys already have a tool to parse these or should I try to improve my implementation?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse_key_value_list should at the very least, split on the comma and not on the space... and then trim the results.

'response="%s"' % respdig,
'algorithm="%s"' % algorithm,
])
if opaque:
base += ', opaque="%s"' % opaque
if qop:
base += ', qop="auth", nc=%s, cnonce="%s"' % (ncvalue, cnonce)

return 'Digest %s' % base

@asyncio.coroutine
def _handle_401(self, response):
"""
Takes the given response and tries digest-auth, if needed.
:rtype: ClientResponse
"""
auth_header = response.headers.get('www-authenticate', '')

if 'digest' in auth_header.lower():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to enforce first word check on this header value.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps 'digest ' would be a better check (with a space at the end)

pattern = re.compile(r'digest ', flags=re.IGNORECASE)
self.challenge = parse_key_value_list(
pattern.sub('', auth_header, count=1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like you want to strip first word from string if it's match digest in lower case. No reason for regexp here.

)

return (yield from self.request(
self.args['method'],
self.args['url'],
headers=self.args['headers'],
**self.args['kwargs']
))

return response


if PY_352:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer actual. We require 3.5.3+ on master.

def create_future(loop):
return loop.create_future()
else:
def create_future(loop): # pragma: no cover
"""Compatibility wrapper for the loop.create_future() call introduced in
3.5.2."""
return asyncio.Future(loop=loop)


def strip_auth_from_url(url):
auth = BasicAuth.from_url(url)
if auth is None:
Expand Down
1 change: 1 addition & 0 deletions changes/2213.feature
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Added a digest authentication helper class.
112 changes: 112 additions & 0 deletions docs/client_reference.rst
Original file line number Diff line number Diff line change
Expand Up @@ -1395,6 +1395,118 @@ BasicAuth
:return: encoded authentication data, :class:`str`.


DigestAuth
^^^^^^^^^^

.. class:: DigestAuth(login, password', session)

HTTP digest authentication helper. Unlike :class:`DigestAuth`, this helper
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike BasicAuth ?

CANNOT be passed to the *auth* parameter of a :meth:`ClientSession.request`.

:param str login: login
:param str password: password
:param `ClientSession` session: underlying session that will use digest auth
:param dict previous: dict containing previous auth data. ``None`` by
default (optional).

.. comethod:: request(method, url, *, params=None, data=None, \
json=None,\
headers=None, cookies=None, auth=None, \
allow_redirects=True, max_redirects=10, \
encoding='utf-8', \
version=HttpVersion(major=1, minor=1), \
compress=None, chunked=None, expect100=False, \
connector=None, loop=None,\
read_until_eof=True)
:coroutine:

Perform an asynchronous HTTP request. Return a response object
(:class:`ClientResponse` or derived from).

:param str method: HTTP method

:param url: Requested URL, :class:`str` or :class:`~yarl.URL`

:param dict params: Parameters to be sent in the query
string of the new request (optional)

:param data: Dictionary, bytes, or file-like object to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

param dict|bytes|file data btw.

send in the body of the request (optional)

:param json: Any json compatible python object (optional). *json* and *data*
parameters could not be used at the same time.

:param dict headers: HTTP Headers to send with the request (optional)

:param dict cookies: Cookies to send with the request (optional)

:param aiohttp.BasicAuth auth: an object that represents HTTP Basic
Authorization (optional)

:param bool allow_redirects: If set to ``False``, do not follow redirects.
``True`` by default (optional).

:param aiohttp.protocol.HttpVersion version: Request HTTP version (optional)

:param bool compress: Set to ``True`` if request has to be compressed
with deflate encoding.
``False`` instructs aiohttp to not compress data.
``None`` by default (optional).

:param int chunked: Enables chunked transfer encoding.
``None`` by default (optional).

:param bool expect100: Expect 100-continue response from server.
``False`` by default (optional).

:param aiohttp.connector.BaseConnector connector: BaseConnector sub-class
instance to support connection pooling.

:param bool read_until_eof: Read response until EOF if response
does not have Content-Length header.
``True`` by default (optional).

:param loop: :ref:`event loop<asyncio-event-loop>`
used for processing HTTP requests.
If param is ``None``, :func:`asyncio.get_event_loop`
is used for getting default event loop.

.. deprecated:: 2.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indention


:return ClientResponse: a :class:`client response <ClientResponse>` object.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rtype?


Usage::

import aiohttp
import asyncio

async def fetch(client):
auth = aiohttp.DigestAuth('usr', 'psswd', client)
resp = await auth.request('GET', 'http://httpbin.org/digest-auth/auth/usr/psswd/MD5/never')
assert resp.status == 200
# If you don't re-use the DigestAuth object you can store this data
# and pass it as the last argument the next time you instantiate a
# DigestAuth object. For example,
# aiohttp.DigestAuth('usr', 'psswd', client, previous). This will
# save a second request being launched to re-authenticate.
previous = {
'nonce_count': auth.nonce_count,
'last_nonce': auth.last_nonce,
'challenge': auth.challenge,
}

return await resp.text()

async def main():
async with aiohttp.ClientSession() as client:
text = await fetch(client)
print(text)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())



CookieJar
^^^^^^^^^

Expand Down
Loading