Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(fuzzer): Add input generator for json_parse in expression fuzzer #12019

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

kagamiori
Copy link
Contributor

Summary:
Make expression fuzzer generate input vectors of valid JSON strings for the
json_parse function. To test corner cases, the JSON strings may be
randomly truncated or inserted with a space character.

Differential Revision: D67820571

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 4, 2025
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67820571

Copy link

netlify bot commented Jan 4, 2025

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 40f50b3
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/67a6ae8c07947a000860bb18

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67820571

kagamiori added a commit to kagamiori/velox that referenced this pull request Jan 17, 2025
…facebookincubator#12019)

Summary:
Pull Request resolved: facebookincubator#12019

Make expression fuzzer generate input vectors of valid JSON strings for the
json_parse function. To test corner cases, the JSON strings may be
randomly truncated or inserted with a space character.

Differential Revision: D67820571
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67820571

kagamiori added a commit to kagamiori/velox that referenced this pull request Jan 17, 2025
…facebookincubator#12019)

Summary:
Pull Request resolved: facebookincubator#12019

Make expression fuzzer generate input vectors of valid JSON strings for the
json_parse function. To test corner cases, the JSON strings may be
randomly truncated or inserted with a space character.

Differential Revision: D67820571
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67820571

kagamiori added a commit to kagamiori/velox that referenced this pull request Jan 17, 2025
…facebookincubator#12019)

Summary:
Pull Request resolved: facebookincubator#12019

Make expression fuzzer generate input vectors of valid JSON strings for the
json_parse function. To test corner cases, the JSON strings may be
randomly truncated or inserted with a space character.

Differential Revision: D67820571
@assignUser
Copy link
Collaborator

I assume this is also based/waiting on #12080?

kagamiori added a commit to kagamiori/velox that referenced this pull request Jan 23, 2025
…facebookincubator#12019)

Summary:

Make expression fuzzer generate input vectors of valid JSON strings for the
json_parse function. To test corner cases, the JSON strings may be
randomly truncated or inserted with a space character.

Differential Revision: D67820571
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67820571

kagamiori added a commit to kagamiori/velox that referenced this pull request Jan 23, 2025
…facebookincubator#12019)

Summary:

Make expression fuzzer generate input vectors of valid JSON strings for the
json_parse function. To test corner cases, the JSON strings may be
randomly truncated or inserted with a space character.

Differential Revision: D67820571
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67820571

kagamiori added a commit to kagamiori/velox that referenced this pull request Jan 23, 2025
…facebookincubator#12019)

Summary:

Make expression fuzzer generate input vectors of valid JSON strings for the
json_parse function. To test corner cases, the JSON strings may be
randomly truncated or inserted with a space character.

Differential Revision: D67820571
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67820571

kagamiori added a commit to kagamiori/velox that referenced this pull request Jan 27, 2025
…facebookincubator#12019)

Summary:

Make expression fuzzer generate input vectors of valid JSON strings for the
json_parse function. To test corner cases, the JSON strings may be
randomly truncated or inserted with a space character.

Differential Revision: D67820571
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67820571

Comment on lines 441 to 442
funcArgOverrides_["switch"] = std::bind(
&ExpressionFuzzer::generateSwitchArgs, this, std::placeholders::_1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to fit switch into the ArgsOverrideFunc interface? It'd be nice to get rid of this one off exceptional case (doesn't have to be done as part of this change).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, I don't think so. Because the function signature of "switch" only contains (conditionType, thenType, elseType). It is the generateSwitchArgs() method that randomly generate 1--5 else and then branches by calling ExpressionFuzzer::generateArg(). So generateSwitchArgs() has to be a member method of ExpressionFuzzer to be able to call generateArg().

Since the additional then and else branches are not specified in the function signature, we also cannot let generateSwitchArgs() return nullptr for them and wait for the caller of generateSwitchArgs() call generateArg().

Comment on lines 532 to 538
funcArgOverrides_[name] = std::bind(
func,
std::placeholders::_1,
std::cref(this->vectorFuzzer_->getOptions()),
std::ref(this->rng_),
std::ref(this->state_));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than using function pointers, could we make a virtual base class with a function that can be overridden?

With function pointers you just have to know the signature when you write a new one, and it will make evolving it in the future (if we need to) difficult. If we use an interface, the compiler can enforce that the signature is correct.

Comment on lines 870 to 871
VELOX_CHECK_GT(callable.args.size(), i);
args[i] = generateArg(callable.args[i], callable.constantArgs[i]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just use callable.args.at(i) and callable.constantArgs.at(i), no need to do the bounds check yourself

int32_t remainingLevelOfNesting_;
};

State& stateMultable() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stateMultable -> stateMutable

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But also, why do we need to expose it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was for the last version of the code. I forgot to remove it. Will remove.

Comment on lines 275 to 278
const std::unordered_map<std::string, std::shared_ptr<ArgGenerator>>&
argGenerators)
argGenerators,
const std::unordered_map<std::string, ArgsOverrideFuncPtr>&
argsOverrideFuncs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the names it's unclear what the difference is between ArgGenerator and ArgsOverrideFuncPtr, I thought ArgGenerator was doing the same thing until I read the code.

Could we rename these to something like ArgTypesGenerator and ArgValuesGenerator (or something better).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. I'm renaming them in #12283.

kagamiori added a commit to kagamiori/velox that referenced this pull request Feb 7, 2025
…facebookincubator#12019)

Summary:

Make expression fuzzer generate input vectors of valid JSON strings for the
json_parse function. To test corner cases, the JSON strings may be
randomly truncated or inserted with a space character.

Differential Revision: D67820571
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67820571

kagamiori added a commit to kagamiori/velox that referenced this pull request Feb 7, 2025
…facebookincubator#12019)

Summary:

Make expression fuzzer generate input vectors of valid JSON strings for the
json_parse function. To test corner cases, the JSON strings may be
randomly truncated or inserted with a space character.

Differential Revision: D67820571
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67820571

kagamiori added a commit to kagamiori/velox that referenced this pull request Feb 7, 2025
…facebookincubator#12019)

Summary:

Make expression fuzzer generate input vectors of valid JSON strings for the
json_parse function. To test corner cases, the JSON strings may be
randomly truncated or inserted with a space character.

Differential Revision: D67820571
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67820571

Copy link
Contributor

@kevinwilfong kevinwilfong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates!


namespace facebook::velox::fuzzer {

class generateJsonParseArgs : public ArgValuesGenerator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

class names should be upper case: GenerateJsonParseArgs

nit: JsonParseArgsGenerator?

Comment on lines 54 to 80
/*std::vector<core::TypedExprPtr> generateJsonParseArg(
const CallableSignature& signature,
const VectorFuzzer::Options& options,
FuzzerGenerator& rng,
ExpressionFuzzer::State& state) {
VELOX_CHECK_EQ(signature.args.size(), 1);
std::vector<core::TypedExprPtr> inputExpressions;

state.inputRowTypes_.emplace_back(signature.args[0]);
state.inputRowNames_.emplace_back(
fmt::format("c{}", state.inputRowTypes_.size() - 1));

const auto representedType = facebook::velox::randType(rng, 3);
const auto seed = rand<uint32_t>(rng);
const auto nullRatio = options.nullRatio;
state.customInputGenerators_.emplace_back(
std::make_shared<fuzzer::JsonInputGenerator>(
seed,
signature.args[0],
nullRatio,
fuzzer::getRandomInputGenerator(seed, representedType, nullRatio),
true));

inputExpressions.push_back(std::make_shared<core::FieldAccessTypedExpr>(
signature.args[0], state.inputRowNames_.back()));
return inputExpressions;
}*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be deleted

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot it. Thank you for catching this.

Comment on lines +845 to +850
// Special case for switch because it has a variable number of arguments not
// specified in the signature. Other functions' argument override should be
// specified through funcArgOverrides_.
if (callable.name == "switch") {
return generateSwitchArgs(callable);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I like this better!

/// nullptr at the corresponding index. ExpressionFuzzer then generates random
/// arguments for these unspecified ones with the types specified in the
/// function signature. (Functions of variable arity must determine the number
/// of arguments in the overridden method. Arguments at indices beyong the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: beyong -> beyond

…facebookincubator#12019)

Summary:

Make expression fuzzer generate input vectors of valid JSON strings for the
json_parse function. To test corner cases, the JSON strings may be
randomly truncated or inserted with a space character.

Reviewed By: kevinwilfong

Differential Revision: D67820571
kagamiori added a commit to kagamiori/velox that referenced this pull request Feb 8, 2025
…facebookincubator#12019)

Summary:

Make expression fuzzer generate input vectors of valid JSON strings for the
json_parse function. To test corner cases, the JSON strings may be
randomly truncated or inserted with a space character.

Reviewed By: kevinwilfong

Differential Revision: D67820571
@kagamiori kagamiori force-pushed the export-D67820571 branch 2 times, most recently from 4bd683e to 40f50b3 Compare February 8, 2025 01:08
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67820571

1 similar comment
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67820571

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants