-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New PSR for standardizing CAPTCHA (CaptchaInterface) #1330
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
I do not really see why we need a psr for this. An interface can always be used by any developer in any application. The problem of recaptcha is a very specific implementation to validate humans interacting with your website. While there are other solutions. Can you explain why you think this should be a standard? |
So, for example, I have a Google ReCaptcha on my website, lets say I installed it via composer from corresponding repository. Let's omit frontend integration and look at the backend, we have smth like this: class FormSubmitService
{
private $recaptcha;
public function __construct(\ReCaptcha\ReCaptcha $recaptcha)
{
$this->recaptcha = $recaptcha;
}
public function submitForm(CustomRequest $request): CustomResponse
{
$token = $request->get('g-recaptcha-response');
$response = $this->recaptcha->verify($token);
if (!$response->isSuccess()) {
return new CustomResponse('Recaptcha failed!');
}
// ... continue to submit fom
}
}
//...
$recaptcha = new \ReCaptcha\ReCaptcha('my_secret_token');
$formSubmitService = new FormSubmitService($recaptcha);
$formSubmitService->submitForm($incomeCustomRequest); So, what happens if we at some moment switch to another Captcha vendor (another SDK)? We immediately lose the public function __construct(CaptchaInterface $captcha)
{
$this->captcha= $captcha;
} because in this scenario we can pass: $recaptcha = new \ReCaptcha\ReCaptcha('my_secret_token');
$hcaptcha = new \HCaptcha\HCaptcha('my_secret_token');
$mcaptcha= new \MCaptcha\MCaptcha('my_secret_token');
$cloudflareTurnstile= new \Cloudflare\Turnstile('my_secret_token');
$smartCaptcha= new \Yandex\SmartCaptcha('my_secret_token');
$formSubmitService = new FormSubmitService($recaptcha);
$formSubmitService2 = new FormSubmitService($hcaptcha);
$formSubmitService3 = new FormSubmitService($mcaptcha);
$formSubmitService4 = new FormSubmitService($cloudflareTurnstile);
$formSubmitService5 = new FormSubmitService($smartCaptcha); Yes, we will need an additional configuration when constructing the Captcha object itself, but inside the form handler there will be no changes, 'cause all of the above will be able to say true/false |
I'm still on the fence about whether this deserves a PSR or not, I think that's going to depend on what kind of buy-in we can get from the projects that we'd expect to benefit from this. There are some additional things that come with the request that captcha services typically either require or optionally allow like the visitors IP address. It'd be better in my opinion to depend on PSR-7 and pass the entire RequestInterface implementation to the verify method so that the implementation can decide what to gather. |
Whoops, fat fingered the close button. |
@KorvinSzanto since all of the most popular Captchas (and, I guess, all of the external Captchas) requires secret token to connect to their validation routes, I guess it would benefit to add |
Have you already gathered some consensus between the developers of the major captcha libraries in PHP? |
The exceptions would be used for exceptional cases that represent neither pass nor fail, for example:
Without defined exceptions, all of these cases would throw implementation-specific exceptions like a guzzle exception or a recaptcha sdk exception and would require the consumer to know the implementation to properly catch exceptions. So an implementation would do something like: try {
$response = $guzzle->send($validationRequest);
$isValid = $this->validateResponse($response);
} catch (GuzzleException $e) {
throw new ImplementationSpecificValidationRequestErrorException('Failed to send validation request', $e);
} catch (ValidationResponseException $e) {
throw new ImplementationSpecificValidationRequestErrorException('Invalid validation response', $e);
} and a consumer can avoid knowing that guzzle is in use at all and do: try {
$isValid = $captcha->validate($whatever);
} catch (\Psr\Captcha\ValidationRequestErrorException) {
// Show user "We're having trouble validating your request, please try again in a few minutes"
} catch (\Psr\Captcha\InvalidArgumentException) {
// Show user "Invalid request, please try again"
}
if (!$isValid) {
// Show user "Invalid captcha, please try again"
} |
@mbeccati @KorvinSzanto |
Sadly, the Google recaptcha PHP repo is dead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I like the idea. The problem domain, indeed, is pretty clear so could be standardized.
captcha.md
Outdated
* SHOULD contain actual user's SCORING | ||
* MAY contain additional information (e.g., gathered from it's captcha-vendor service's verification endpoint) (i.e. message, errors, etc.) | ||
*/ | ||
interface CaptchaResponseInterface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it an interface and not a bool?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean, why CaptchaInterface::verify()
returns that instead of a bool? If yes, then answer is - 'cause you might want to implement additional methods to specific CaptchaResponse
classes, like getScore()
, getHost()
(for instance, those fields implemented by Google ReCaptcha, hCaptcha, SmartCaptcha) or similar - to extend response data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be mentioned in meta-document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mentioned it in captcha-meta.md
(added meta-document)
captcha.md
Outdated
* SHOULD contain method to configure SCORING threshold (if applicable by PROVIDER) | ||
* SHOULD throw a CaptchaException as soon as possible if appears any non-user related error that prevents correct Captcha solving (e.g. network problems, incorrect secret token, e.g.) | ||
*/ | ||
interface CaptchaInterface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there might be a need for challenge(string $token): string
method which, given a token, generates the challenge for the user to solve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idk what did you mean by this, probably CaptchaInterface
needs to be renamed to CaptchaVerifierInterface
, 'cause that is its purpose - to verify that passed token is valid. But idk what else can be retrieved from $token, especially in string
. Do you mean challenge
is frontend rendered I am not a robot
checkbox with traffic lights? If so, i'm not sure that service should interact with it - it's just a verifier
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, for example, given a token that is 24
, challenge might be 20+4
or something like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, that interface is not quite for that purpose, but rather to verify already solved captcha task by its token using some external (or even internal, it doesnt actually matter) captcha service. Roughly, it's an interface to build SDK on
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Makes sense.
Here's Yii2 implementation: https://github.com/yiisoft/yii2/tree/master/framework/captcha |
Here is Shopware 6 implementation https://github.com/shopware/shopware/tree/e4c3f565857f8e6c53d3c8e35f6b00dfd0eb3f73/src/Storefront/Framework/Captcha and Shopware 5 implementation https://github.com/shopware5/shopware/tree/8779bb0fc2cff04bb92dbba8ea5522263a71ab48/engine/Shopware/Components/Captcha |
Do you believe that Shopware could use and achieve profit from implementation of this interfaces by recieving more information from responses? |
Added demo-repository |
I am the maintainer of a PHP 8.4 compatible version of Google's Recaptcha with about 115K downloads so far and growing. I think this is an excellent idea for standardization. I would be happy implement any PSR that is agreed to in my fork. But be aware that Google has announced that they will be turning off this version of the service at the end of 2025. That said, my goal is to allow an easy upgrade path to the newer Google versions of their API with the current package, hopefully with no or minor changes needed by the developer. |
captcha.md
Outdated
|
||
## 2. Interfaces | ||
|
||
### 2.1 CaptchaInterface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line should now read:
### 2.1 CaptchaVerifierInterface
Due to the code change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
I feel like Captcha feature is something too specific to be worth a whole personal PSR. When looking at:
I feel like this is not the right signature cause you might need more informations to verify something Google accepts the IP Shopware lib use the request
I believe we'll end with something like
when the responseInterface could have
But it seems really close to the PSR 7 |
@VincentLanglet // Imagine that we're on REST API route controller
$captcha = (new CaptchaVerifier('secret_token', $maybe_http_client))
->setIp($maybe_ip)
->setTime($maybe_time)
->setHost($maybe_host);
$formHandler = new AbstractFormSubmitHandler($captcha);
$formHandler->submit(['token' => $userToken, ...]); And inside of it we will need to verify this token via specified Captcha, but we do not know what this Captcha is and we do not need to know that, 'cause class AbstractFormSubmitHandler
{
function __construct(CaptchaVerifierInterface $captcha_verifier){...}
} And all we do - is just using already pumped object as a Validator, you can say. Token was passed to On the other hand, I do understand you concerns about mutable
|
@VincentLanglet on the second part of your message While I believe that having just
For now, the base interface solves the simple validation problem well, but for advanced response handling, it's worth discussing which extensibility pattern (if any) would be preferable — whether as a universal Thanks for surfacing this; now we're talking and I believe this is what WG should discuss and eventually solve |
I am going to go with what @LeTraceurSnork said. The whole point is to provide a minimal interface that all Capthca systems have to provide. It is minimal, but with some proper code around it, you can easily and quickly swap in another provider. With PHP, that could be configured with a ini file or database entry. The code will work the same. You can then fine tune the new provider if needed. And as for minimal interfaces, I find the current Container interface also to minimal to be usefull, but that was agreed upon and in use. So I don't think this interface should be rejected simply because it does not account for all use cases. It accounts for enough to be useful and will help standardize Captcha libraries to make them more generic and easy to swap out if needed. |
The de facto pattern of several past PSRs has been that the interfaces provide the "read" but not "configure" API. Configure is left up to the particular implementation, and you configure it in advance. The API is just for plugging it in and using it, assuming most of the configuration is already handled before the service gets injected. (PSR-3 Logging, PSR-11 Containers, PSR-14 Events, PSR-20 Clocks, etc. all take that approach.) |
The last question I have in mind is "Which feature should have a PSR" and which shouldn't. If we have a CaptchaInterface should we have some
which are also often used feature ? Where will be the limit ? (Maybe the answer is none) |
It's important to keep in mind that not all captchas are built the same. They won't always have a single value to validate, though I concede most do today.
To me, captcha input is closer to a cache item than a simple log message. Like the event PSR, we should at the very least accept an In my opinion it'd be better to accept a full PSR-7 request in validate, or in a factory that builds the value to validate. That way I can pass a captcha interface to my controllers and validate the request without needing to cater each controller to properly extract the required information from the request. |
The future of software is interoperable standards that allow things to be mixed and matched and not have vendor lock in. Vendors want lock in, standards prevent that. So yes, we need more standard interfaces, mail and translation would be two good ones to tackle. We have standards in most industries. Metric bolts for example. JIS (Japanese Industrial Standard) comes to mind. The more we insist on standard interfaces the better off as a PHP community we are in the long run. |
@VincentLanglet as I confessed to @samdark recently, I'm an inclusionist, as wikipedians would say 🙂 In my believe, all of your mentioned interfaces may actually be accepted if written properly, reviewed, agreed upon, etc. So, to
@KorvinSzanto yes, absolute most of the Captchas operates token as a string. I did some researches, I've got two different sources that representeed different information about CAPTCHA providers usage percentage, so I've checked following: Google ReCaptcha, Cloudflare Turnstile, hCaptcha, Friendly Captcha, AWS WAF, Yandex SmartCaptcha, GeeTest, Tencent Captcha, Yidun Captcha, MTCaptcha, Altcha, VAPTCHA and KeyCAPTCHA. From all of those (that's more than 99,9% of market) only Altcha verifies something that is not a string - their SDK method accepts array of some data, but even that case is totally fits in presented interface - we with @VincentLanglet discussed it earlier - it can be achieved via mutable objects with
Oh, I see your point. Yes, while we could not have an opportunity to pump the verifier with additional data it needs beforehand (e.g. Verifier needs some data from HttpRequest that being validated right now and the one that actually contains the token), I guess, it is wise to give it an opportunity to do so. But what do you think of an idea to keep the
With this being said, I guess, the question "what do we do with Response then" is much more acute for now @phpfui totally agree with you! More standarts -> less exotic packages that doesn't fits them -> less odd unsupportable legacy |
How to join? |
Pardon, link changed to https://discord.gg/uAfJDfY2 |
In my opinion, making the captcha validator strict to use PSR-7 request will make usage with simple captcha (which won't need the request) providers too complex. What about having two arguments:
And to provide additional interface - This will allow:
|
I'm quite sceptical about PSR-7 interface usage at all - because even if its optional - how do we parse it without knowing the implementation? E.g.: class CaptchaVerifier implements CaptchaVerifierInterface
function withData(RequestInterface $psr7_request)
{
$this->psr7_request = $psr7_request;
return $this;
}
function validate(string $token)
{
$http_client = new HttpClient(); // in this example PSR-18 compatible
$request_contents = $this->psr7_request->getBody()->getContents();
$request_data = json_decode($request_contents);
// maybe some other manipulations
// @var array<string, mixed> $request_data
// <=== START
$http_request = new HttpRequest([ // PSR-7 compatible
'token' => $token,
'ip' => $request_data['ip'],
'host' => $request_data['host'],
'some_value' => $request_data['some_value'],
]);
// <=== END
$http_client->sendRequest();
}
}
(new CaptchaVerifier())->withData($psr7_request)->verify($token); It's not perfect example, but it shows the problem - between START and END how do we know for sure that fields 'ip', 'host', 'some_value' exists AND have those names? Why not 'client_ip', 'site_host', 'some_other_value'? We cannot rely on them because we don't know the implementation, only the interface. That's why more proper (but, I guess, not ideal as well) appoach would be: class CaptchaVerifier implements CaptchaVerifierInterface
function withData(array $data)
{
$this->ip = $data['ip'] ?? null;
$this->host = $data['host'] ?? null;
$this->some_value = $data['some_value'] ?? null;
return $this;
}
function validate(string $token)
{
$http_client = new HttpClient(); // in this example PSR-18 compatible
$request_data = [
'token' => $token,
'ip' => $ip,
'host' => $host,
'some_value' => $some_value,
];
// maybe some other manipulations
// @var array<string, mixed> $request_data
$http_request = new HttpRequest($request_data); // PSR-7 compatible
$http_client->sendRequest($http_request);
}
}
(new CaptchaVerifier())->withData($psr7_request)->verify($token); That alternative also would omit |
All data should be directly passed to the method
<?php
// verifier that requires token, IP, host and some other data (optional)
class AcmeCaptchaVerifier implements CaptchaVerifierInterface
{
public function __construct(private readonly string $requestIpAttributeName)
{
}
public function validate(string $token, ContextInterface $context)
{
$someValue = null;
if ($context instanceof AcmeContextInterface) {
$ip = $context->getIpForAcmeProvider();
$host = $context->getHostForAcmeProvider();
$someValue = $context->getSomeValueForAcmeProvider();
} elseif ($context instanceof RequestAwareContextInterface && $context->getRequest() instanceof \Psr\Http\Message\ServerRequestInterface) {
/** @var \Psr\Http\Message\ServerRequestInterface $request */
$request = $context->getRequest();
$ip = $request->getAttribute($this->requestIpAttributeName, $request->getServerParams()['REMOTE_ADDR'] ?? null);
$host = $request->getUri()->getHost();
$someValue = null; // unable to determine
} else {
// OR fallback to some default values for $ip, $host, $someValue
throw new \InvalidArgumentException('Unsupported context - should be AcmeContextInterface or RequestAwareContextInterface to determine IP and host');
}
$http_client = new HttpClient(); // in this example PSR-18 compatible
// <=== START
$http_request = new HttpRequest(
[ // PSR-7 compatible
'token' => $token,
'ip' => $ip,
'host' => $host,
'some_value' => $someValue, // some other data
],
);
// <=== END
$http_client->sendRequest();
}
}
// verifier that uses ONLY token
class FooCaptchaVerifier implements CaptchaVerifierInterface
{
public function __construct(private readonly string $requestIpAttributeName)
{
}
public function validate(string $token, ContextInterface $context)
{
$http_client = new HttpClient(); // in this example PSR-18 compatible
// <=== START
$http_request = new HttpRequest(
[ // PSR-7 compatible
'token' => $token,
],
);
// <=== END
$http_client->sendRequest();
}
}
(new CaptchaVerifier())->verify($token, new AcmeContext('127.0.0.1', 'example.com', 'some_value'));
(new CaptchaVerifier())->verify($token, new Psr7Context(Psr7Request::createFromGlobals())); |
... and why don't "withData" or "array $context" - it won't be typed. Very hard to decide unified keys what can be used by different implementations. My proposal based on "empty interface" at a core and (at least) one additional extending interface that knows about PSR-7 request |
I think, I got your point public function verify(string $token, CaptchaContextInterface = new NoContext()): CaptchaResponseInterface; And then add smth like this: interface CaptchaContextInterface
{
// need to think about its content and methods
}
class NoContext implements CaptchaContextInterface
{
// all of its methods will be stubs that do nothing/returns empty data
} Then we can pass some context to verify() if we need it (creating it beforehand), but if we don't - then just call |
This is okay if it's immutable, but I think it's unnecessary when the data is required to be passed every time for proper interop.
The IP and the host would come from ->getServerParams but I hear your underlying question. The name of the query parameter that gets used should be known at the time the verifier is constructed so they can be configured against the verifier instance. $configuredVerifier = new FooVerifier(captchaKey: 'my_custom_captcha_key'); One issue you didn't bring up that is demonstrated in your example is the request's body isn't parsed in a class CaptchaRequest
{
public function __construct(
/** Either a parsed query or a parsed post body */
public readonly array $parsedRequest,
public readonly UriInterface $uri,
public readonly array $serverParams,
) {}
}
This requires the caller to be aware of the implementation for interop. To me this is sounding more and more like a part of a validator spec rather than it's own captcha focused thing. |
@andrew-demb My point was that if you rely on anything else rather than a token, you need to either pass it directly to
|
p.s. may I ask you to officialy join the Working Group? @KorvinSzanto @andrew-demb @phpfui @VincentLanglet |
Update to clarify my perspective further: |
Personally I'm not convinced about such incomplete interface since
Asking to call some extra method What the purpose of writing
and we could even argue it's odd to rely on the key Anyway... |
Well, that may be true, though I've researched that topic - top Captchas by usage percentage - ALL is string-token based. As do, I assume, most of self-hosted Captchas. So, based on that I'm proposing to standartize specifically them, not some other Captchas that may exist and may be based on, idk, file-uploading
Totally agree! That's why I suppose that usage in your example should be more like: class FormService
{
public function __construct(private readonly CaptchaVerifier $captcha) {}
public function submitForm(Psr7Request $psr7_request): Psr7Response
{
$token = $psr7_request->get('captcha-token'); // something that not based on exact implementation. Though, it may vary - it's on developer's choice
$response = $this->recaptcha->verify($token);
if (!$response->isSuccess()) {
return new Psr7Response('Recaptcha failed!');
}
}
$captcha = new SomeImplementationCaptcha(); // implements CaptchaVerifierInterface
$captcha->withIp($ip)->withSomethingElse($somethingElse);
$formService = new FormService($captcha);
$formService->submitForm($psr7_request);
Well, yeah... But, as I mentioned in example above, that's on developer's conscience 🧐 |
Gentlemen, all who is interested in this PSR, or at least in discussing it - I believe, we need to at least duplicate this converstaion (maybe partially) to this official discussion in Google Groups: https://groups.google.com/g/php-fig/c/v8VymG3L9no to progress it further |
This is a pretty cool initiative and seems like it could be super handy. If you think about the main point of CAPTCHA, it’s to tell humans apart from bots (like a Voight-Kampff test but on the web). The ways people find to bypass CAPTCHAs are always evolving, and because of that, CAPTCHA implementations have to keep getting smarter too. If we limit CAPTCHA’s flexibility through a PSR, we might end up forcing developers to ditch unique, creative protection methods that don’t fit the PSR mold. Now, tying this to PSR-7 (handling all the request data), as suggested earlier, means that tons of sites that don’t use PSR-7 are going to have a tough time just implementing a regular CAPTCHA. Currently, without a set standard, it’s easier, and users are accustomed to how simple it is. Take WordPress for example; it doesn’t have PSR-7, and neither do lots of other sites; changing a CAPTCHA there is pretty straightforward. From a corporate perspective, where tasks are split into backend and frontend, a standard might indeed speed things up on the backend, but it won’t really help the frontend guys, which means it won’t speed up CAPTCHA implementation or replacement overall. It might make life easier for backend developers, which is what we're aiming for, right? Despite this criticism, I think the idea has potential if it’s developed further. Though, I have to admit, I’m not an expert in CAPTCHA technology. |
Excatly my worries. I've been worried that thousands of developers (maybe even students on their first work) will have struggling with unneeded and overloading functionality. I see that the much interface simpler the more it is easier to understand, hence use, hence it much more popular and widely implemented.
Few words on that - as I mentioned earlier, I don't really think that by narrowing token format with |
UPD4: finally got my hands on Captchas SDK library and managed to implement suggested interface in it |
You can also see my implementation for Google's ReCaptcha here: https://github.com/phpfui/recaptcha/tree/PSR |
@phpfui yeah, I had a brief look at it - this is it :-) Not quite as I thought Exception would be handled, but I guess it's just Google's legacy, that you decided to keep backwards compatibility with. They insist that network errors is just another type of |
Here's a proposal of new PSR for standartizing CAPTCHAs to one interface
Continuation of this thread: https://discord.com/channels/788815898948665366/788816084383694848/1379332512886689812
TL;DR:
Recently we've faced with a problem that government forbids to gather data abroad and we urgently need to switch Google ReCaptcha to other solutions, but the thing is that since those solutions doesn't have common interface yet, the codebase needs to be refactored, which isn't good - the task in a nutshell is just "switch vendors"
UPD1: demo-repository was created https://github.com/LeTraceurSnork/psr-captcha-verifier
UPD2: Working Group can communicate now via Discord-channel
(https://discord.gg/uAfJDfY2)
UPD3: Clarification of an interface purpose and my POV #1330 (comment)
UPD4: finally got my hands on Captchas SDK library and managed to implement suggested interface in it
https://github.com/LeTraceurSnorkLibrary/SDKaptcha
Feel free to see what I mean by this proposal on practice