-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Looks good, but no conditions without functions? #3
Comments
Correct, but this is not hard. You specify the following functions:
And then you can write:
This is how we use it ourselves: https://github.com/Usecue/InvoiceLion/blob/master/lib/InvoiceTemplate.php#L40 I hope this helps. |
Yeah..., No... But it is such a basic functionality, that you want to use it without explicitly passing the needed functions every time. Right now I just integrated them into the template class with a third (and experimental fourth) parameter $if = function($a, $b, $op='==', $mod='') {
$ret = false;
switch($op) {
case '==':
$ret = ($a == $b);
break;
case '===':
$ret ($a === $b);
break;
case '!=':
$ret = ($a != $b);
break;
case '!==':
$ret = ($a !== $b);
break;
case '>':
$ret = ($a > $b);
break;
case '<':
$ret = ($a < $b);
break;
case '>=':
$ret = ($a >= $b);
break;
case '<=':
$ret = ($a <= $b);
break;
}
if($mod == '!') {
$ret = !$ret;
}
return $ret;
}; Right now I am experimenting using this for .htaccess file generation based on environmental variables - so I just searched for a small, self-sustained class capable of inserting variables based on some conditions or even doing loops. Regarding your example Are you trying to filter out <script>-Tags and Content to prevent XSS or something? <scr<script>ipt>Test</script>ipt>
alert("Test");
</scr<script>ipt>Test</script>ipt> |
Sounds like a good idea to me. Can you do a PR?
Great that you found this lib.
Yes, but mainly to avoid execution of accidentally copy-pasted script tags.
Yes it does. Good point. Do you think this will help? https://github.com/Usecue/InvoiceLion/blob/master/lib/InvoiceTemplate.php Thank you for your help with making this tiny template engine even better. Kind regards, Maurits |
Will do if I produce something useful, I just played around with it for a bit, changing dozens of lines with no real goal
I only have some experience with finding flaws like this in other peoples code, because my job mainly resolves around keeping legacy code alive and fixing bugs that pop up. When I look at sources like this https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet#XSS_using_HTML_quote_encapsulation the only save way that comes to my mind, is to have a whitelist with allowed tags and encode everything else with htmlentities (so non-whitelisted |
Thank you for your kind reply! Please keep finding those bugs :-) |
Thank you for encouraging me :) Being me, I just had to do some research and found this: https://blog.uploadcare.com/vulnerability-in-html-design-the-script-tag-33d24642359e The interesting part:
I guess So I realize now, your code seems to be good for your purpose, but I should stop trying to filter out such things from user generated content and just use the new possibilities |
Yeah.. but this is a script in a PDF template engine, not in HTML, so the header won't help here, unfortunately. You can disable scripts altogether, but the page numbers rely on it. |
oh sorry, I did not think about any other output than html when I saw the <script> tag - I guess that makes my last statement even more true, that your approach is good for your purpose 👍 |
Hi, sorry to bother you again in here, but github seems to not have direct messages anymore. I read your article about writing a parser for template languages and found a small error <?php
function token_with_quote($str, $quote = "'", $escape = '^', $separator = '|')
{
$tokens = [];
$token = '';
$escaped = false;
$quoted = false;
for ($i = 0; $i < strlen($str); $i++) {
$c = $str[$i];
if (!$quoted) {
if ($c == $quote) {
$quoted = true;
- } elseif (substr($str, $i, strlen($separator)) == $separator) {
+ } elseif ($seplen = substr($str, $i, strlen($separator)) == $separator) {
$tokens[] = $token;
$token = '';
$i += $seplen - 1;
continue;
}
} else {
if (!$escaped) {
if ($c == $quote) {
$quoted = false;
} elseif ($c == $escape) {
$escaped = true;
}
} else {
$escaped = false;
}
}
$token .= $c;
}
$tokens[] = $token;
return $tokens;
}
-$input = "'one|uno'||'three^'^''|'four^^^'^cuatro'|"
+$input = "'one|uno'||'three^'^''|'four^^^'^cuatro'|";
$output = token_with_quote($input);
echo json_encode($output) . "\n"; function token_unquote($arr, $quote = "'", $escape = '^')
{
for ($i = 0; $i < count($arr); $i++) {
$str = trim($arr[$i]);
if (strlen($str) > 1 && $str[0] == $quote && $str[strlen($str) - 1] == $quote) {
$escaped = false;
$token = '';
$str = substr($str, 1, strlen($str) - 2);
for ($j = 0; $j < strlen($str); $j++) {
$c = $str[$j];
if (!$escaped) {
if ($c == $escape) {
$escaped = true;
continue;
}
} else {
$escaped = false;
}
$token .= $c;
}
$arr[$i] = $token;
}
}
return $arr;
}
-$input = "'one|uno'||'three^'^''|'four^^^'^cuatro'|"
+$input = "'one|uno'||'three^'^''|'four^^^'^cuatro'|";
$output = token_unquote(token_with_quote($input));
echo json_encode($output) . "\n"; The missing semicolon was reported by php, but the missing variable took me a few minutes, because it always just resulted in Hope this helps |
Yes, thank you very much! Sorry for any inconvenience this may have caused. :-) |
woops, there was an error in mine, too <?php
<?php
function token_with_quote($str, $quote = "'", $escape = '###', $separator = '|')
{
$tokens = [];
$token = '';
$escaped = false;
$quoted = false;
$strlen = strlen($str);
$seplen = strlen($separator);
$esclen = strlen($escape);
for ($i = 0; $i < $strlen; $i++) {
$c = $str[$i];
if (!$quoted) {
if ($c == $quote) {
if ($token == '') {
$quoted = true;
}
else {
// unescaped quote in unquoted token, insert escape sequence, ignore quote
$token .= $escape;
}
} elseif (($i + $seplen <= $strlen) && substr($str, $i, $seplen) == $separator) {
$tokens[] = $token;
$token = '';
$i += $seplen - 1;
continue;
} elseif ($str[$i-1] == $quote && substr($str, ($i-1-$esclen), $esclen) != $escape && (strlen($token) > $esclen && substr($token, (-1-$esclen), $esclen) != $escape)) {
// [last character was the quote sign], [it was not escaped and escape was not inserted above], [current char is not the separator (elseif above)]
for ($j = $i; $j < $strlen; $j++) {
if (ctype_space($str[$j])) {
if (($j + $seplen <= $strlen) && substr($str, $j, $seplen) == $separator) {
// there was only whitespace until next separator, so closing the quote was correct and we do nothing
break;
}
}
else {
// there was an unescaped quote in the data, insert escape sequence in front of it and set quoted status back to true
$token = substr($token, 0, -1) . $escape . substr($token, -1);
$quoted = true;
break;
}
}
}
} else {
if (substr($str, $i, $esclen) == $escape) {
$escaped = true;
$token .= $escape;
$i += $esclen -1;
continue;
}
if (!$escaped) {
if ($c == $quote) {
$quoted = false;
}
} else {
$escaped = false;
}
}
$token .= $c;
}
$tokens[] = $token;
return $tokens;
}
$input = "'one|un'o'|bl'abla|'three###'###''|'four#########'###cuatro'|token|'token2'|'to'|ken4'|tok'en5";
$output = token_with_quote($input);
$out = json_encode($output);
echo $out . PHP_EOL;
echo "Expected:" . PHP_EOL;
$expected = "[\"'one|un###'o'\",\"bl###'abla\",\"'three###'###''\",\"'four#########'###cuatro'\",\"token\",\"'token2'\",\"'to'\",\"ken4###'\",\"tok###'en5\"]";
echo $expected . PHP_EOL;
if ($expected == $out)
echo "!!! YAY !!!" . PHP_EOL;
else
echo "!!! NOPE !!!" . PHP_EOL;
echo "Input: " . PHP_EOL; This also contains some changes, to handle unescaped quotes that are in $str by mistake but could be escaped on-the-fly because they should not start or end a quote on this position anyway I really love fiddling with your work, maybe you find some of it useful |
Can you be more precise? I see that you are trying to support escape sequences that are longer than one character, which is a nice addon. Anything I missed?
Thank you very much, I really appreciate the bugs you report! |
Yes, sorry - I did multiple changes and this was just the final result First I thought there might be unescaped quotes inside unquoted as well as quoted data - we need to somehow handle this correctly or ignore it, so this was my testcase: <?php
function token_with_quote($str, $quote = "'", $escape = '^', $separator = '|')
{
$tokens = [];
$token = '';
$escaped = false;
$quoted = false;
$seplen = strlen($separator);
for ($i = 0; $i < strlen($str); $i++) {
$c = $str[$i];
if (!$quoted) {
if ($c == $quote) {
if ($token == '') {
$quoted = true;
}
else {
// unescaped quote in unquoted token, insert escape char, ignore quote
$token .= $escape;
}
} elseif (substr($str, $i, $seplen) == $separator) {
$tokens[] = $token;
$token = '';
$i += $seplen - 1;
continue;
} elseif ($str[$i-1] == $quote && $str[$i-2] != $escape && (strlen($token) > 1 && substr($token, -2, 1) != $escape)) {
// [last character was the quote sign], [it was not escaped and escape was not inserted above], [current char is not the separator (elseif above)]
for ($j = $i; $j < strlen($str); $j++) {
if (ctype_space($str[$j])) {
if (substr($str, $j, strlen($separator)) == $separator) {
// there was only whitespace until next separator, so closing the quote was correct and we do nothing
break;
}
}
else {
// there was an unescaped quote in the data, insert escape sequence in front of it and set quoted status back to true
$token = substr($token, 0, -1) . $escape . substr($token, -1);
$quoted = true;
break;
}
}
}
} else {
if (!$escaped) {
if ($c == $quote) {
$quoted = false;
} elseif ($c == $escape) {
$escaped = true;
}
} else {
$escaped = false;
}
}
$token .= $c;
}
$tokens[] = $token;
return $tokens;
}
$input = "'one|un'o'|bl'abla|'three^'^''|'four^^^'^cuatro'|token|'token2'|'to'|ken4'|tok'en5";
$output = token_with_quote($input);
echo json_encode($output) . "\n";
echo "Erwartung:" . PHP_EOL;
echo "[\"'one|un^'o'\",\"bl^'abla\",\"'three^'^''\",\"'four^^^'^cuatro'\",\"token\",\"'token2'\",\"'to'\",\"ken4^'\",\"tok^'en5\"]" .PHP_EOL;
echo "Input: " . PHP_EOL;
echo $input . PHP_EOL; The important changed I did are marked here: <?php
function token_with_quote($str, $quote = "'", $escape = '^', $separator = '|')
{
$tokens = [];
$token = '';
$escaped = false;
$quoted = false;
$seplen = strlen($separator);
for ($i = 0; $i < strlen($str); $i++) {
$c = $str[$i];
if (!$quoted) {
if ($c == $quote) {
- $quoted = true;
+ if ($token == '') {
+ $quoted = true;
+ }
+ else {
+ // unescaped quote in unquoted token, insert escape char, ignore quote
+ $token .= $escape;
+ }
} elseif (substr($str, $i, $seplen) == $separator) {
$tokens[] = $token;
$token = '';
$i += $seplen - 1;
continue;
+ } elseif ($str[$i-1] == $quote && $str[$i-2] != $escape && (strlen($token) > 1 && substr($token, -2, 1) != $escape)) {
+ // [last character was the quote sign], [it was not escaped and escape was not inserted above], [current char is not the separator (elseif above)]
+ for ($j = $i; $j < strlen($str); $j++) {
+ if (ctype_space($str[$j])) {
+ if (substr($str, $j, strlen($separator)) == $separator) {
+ // there was only whitespace until next separator, so closing the quote was correct and we do nothing
+ break;
+ }
+ }
+ else {
+ // there was an unescaped quote in the data, insert escape sequence in front of it and set quoted status back to true
+ $token = substr($token, 0, -1) . $escape . substr($token, -1);
+ $quoted = true;
+ break;
+ }
+ }
}
} else {
if (!$escaped) {
if ($c == $quote) {
$quoted = false;
} elseif ($c == $escape) {
$escaped = true;
}
} else {
$escaped = false;
}
}
$token .= $c;
}
$tokens[] = $token;
return $tokens;
}
$input = "'one|un'o'|bl'abla|'three^'^''|'four^^^'^cuatro'|token|'token2'|'to'|ken4'|tok'en5";
$output = token_with_quote($input);
echo json_encode($output) . "\n";
echo "Erwartung:" . PHP_EOL;
echo "[\"'one|un^'o'\",\"bl^'abla\",\"'three^'^''\",\"'four^^^'^cuatro'\",\"token\",\"'token2'\",\"'to'\",\"ken4^'\",\"tok^'en5\"]" .PHP_EOL;
echo "Input: " . PHP_EOL;
echo $input . PHP_EOL; Basically we don't set $quoted to true, if $token is not empty, because in this case we should be handling unquoted data and a sudden quote should be a quote that was forgotten to be escaped. For the second part, we check after $quote was set to false in quoted data, because from my understanding the only valid follow up should be the separator or some whitespace if you allow it. The second part is not perfect, because you could forget to escape a quote AND have whitespace and/or $separator follow it, so the token will be split in two and the closing quote will be escaped by the first change of escaping quotes in unquoted data. So I'm not sure if this is better or worse from error handling point of view, but the test cases in the echo's below looked good to me The next step led to the result you see above your reply, because I wanted escape strings instead of escape chars, so I had to work with new $esclen variable to mimic your style from $separator. But I had to change the part where you only set $escape to true if it was not escaped previously, because that led tom mixup if separators followed one after another and being made up of multiple times the same char, so [###][###][###]' would not become ###' but be interpreted as [###]#[###]##(unescaped)' I hope I could clear up most of my intentions to change the code to those states. Please ask back if there is something unclear |
Follow up for the dependent <?php
function token_unquote($arr, $quote = "'", $escape = '###')
{
$esclen = strlen($escape);
for ($i = 0; $i < count($arr); $i++) {
$str = trim($arr[$i]);
if (strlen($str) > 1 && $str[0] == $quote && $str[strlen($str) - 1] == $quote) {
// if the token has more than one char and the first char is the $quote and the last char is the $quote, then we just remove the quotes before moving on
$str = substr($str, 1, strlen($str) - 2); // den Teil des Tokens zwischen den quotes holen
}
$escaped = false;
$token = '';
for ($j = 0; $j < strlen($str); $j++) {
$c = $str[$j];
if (substr($str, $j, $esclen) == $escape) {
// jump over escape sequence - remove it from token
$escaped = true;
$j += $esclen -1;
continue;
} else {
$escaped = false;
}
$token .= $c;
}
$arr[$i] = $token;
}
return $arr;
}
$input = ["'one|un###'o'","bl###'abla","'three###'###''","'four#########'###cuatro'","token","'token2'","'to'","ken4###'","tok###'en5"];
$output = token_unquote($input);
echo json_encode($output) . "\n";
echo "Expected:" . PHP_EOL;
echo "[\"one|un'o\",\"bl'abla\",\"three''\",\"four'cuatro\",\"token\",\"token2\",\"to\",\"ken4'\",\"tok'en5\"]" .PHP_EOL;
echo "Input: " . PHP_EOL;
echo json_encode($input) . PHP_EOL; With this it is now possible to unquote and remove escape strings from formerly created tokens with multi-char $escape strings I closed the if-condition for removing start and end quotes earlier, to everything after is just a string which might contain escaped quotes. |
I had some time to take a deeper look into the Template-Class, now that I got the general idea, how you parse templates. Now I have some questions of why and how. i found that something like Wouldn't it be better to prefix variables with a So we could have Also in What I could not make sense of yet, was why you set |
Great!
No there is not.
It is used to reference the "value" parameter for comparison in the next block (from 'if' to 'elseif' or from 'elseif' to 'else').
There are very good templating languages out there (such as Blade and Twig), but really.. would you have been inspired by this code if it was a lot longer? Adding features/code is not a virtue.. leaving it out is. It allows people to quickly grasp the essence of the code. Things like multi character escapes or literals in the left hand side of a comparison are not essential to a templating language. Why don't you fork the project and make it more full-featured? You seem to like adding features and people may like it: I think there is room for a zero-dependency Twig or Blade. |
Ooooh, I did not see
My target is to have a zero-dependency template engine I fully understand and that I can support for many years to come. For that to happen I would have to try and write one from scratch (not exactly my strong suit) or play around with yours or others long enough to grasp it 😳 I tried to make the changed as minimal as possible, because I hoped you might be interested in some to implement them in your own style. Trying to make parsing a bit more robust for malformed templates added 19 lines, but that was just trying to understand the parser better - I dont know if it makes things better or worse Prefixing variables with So I just share some thoughts and am thankful for your input/replies - if any of that is useful for your library is only for you to decide - in the end I need to make a fork anyway, because I only need the template class, not the whole core (for now anyhow) |
Just found this little gem from this article:
https://tqdev.com/2019-php-templating-engine-in-165-lines
So far I really like it but as far as I can see it is not possible to have conditions in the template without calling a dedicated functions which must be provided to the sandbox.
This seems to be impossible:
{{if:SOME_VARIABLE!="sometext")}} Some output {{endif}}
That makes it unpleasant to work with, because it is basic functionality you are probably used to from other engines like FatFreeFramework
The text was updated successfully, but these errors were encountered: