[FLINK-37724] Adds AsyncTableFunction as a fully supported UDF type #26567

Open
wants to merge 1 commit into
base: master
Original file line number Diff line number Diff line change
@@ -30,7 +30,7 @@
<td><h5>table.exec.async-scalar.buffer-capacity</h5><br> <span class="label label-primary">Streaming</span></td>
<td style="word-wrap: break-word;">10</td>
<td>Integer</td>
<td>The max number of async i/o operation that the async lookup join can trigger.</td>
<td>The max number of async i/o operation that the async scalar function can trigger.</td>
</tr>
<tr>
<td><h5>table.exec.async-scalar.max-attempts</h5><br> <span class="label label-primary">Streaming</span></td>
@@ -62,6 +62,36 @@
<td>Boolean</td>
<td>Set whether to use the SQL/Table operators based on the asynchronous state api. Default value is false.</td>
</tr>
<tr>
<td><h5>table.exec.async-table.buffer-capacity</h5><br> <span class="label label-primary">Streaming</span></td>
<td style="word-wrap: break-word;">10</td>
<td>Integer</td>
<td>The max number of async i/o operations that the async table function can trigger.</td>
</tr>
<tr>
<td><h5>table.exec.async-table.max-attempts</h5><br> <span class="label label-primary">Streaming</span></td>
<td style="word-wrap: break-word;">3</td>
<td>Integer</td>
<td>The max number of async retry attempts to make before task execution is failed.</td>
</tr>
<tr>
<td><h5>table.exec.async-table.retry-delay</h5><br> <span class="label label-primary">Streaming</span></td>
<td style="word-wrap: break-word;">100 ms</td>
<td>Duration</td>
<td>The delay to wait before trying again.</td>
</tr>
<tr>
<td><h5>table.exec.async-table.retry-strategy</h5><br> <span class="label label-primary">Streaming</span></td>
<td style="word-wrap: break-word;">FIXED_DELAY</td>
<td><p>Enum</p></td>
<td>Restart strategy which will be used, FIXED_DELAY by default.<br /><br />Possible values:<ul><li>"NO_RETRY"</li><li>"FIXED_DELAY"</li></ul></td>
</tr>
<tr>
<td><h5>table.exec.async-table.timeout</h5><br> <span class="label label-primary">Streaming</span></td>
<td style="word-wrap: break-word;">3 min</td>
<td>Duration</td>
<td>The async timeout for the asynchronous operation to complete.</td>
</tr>
<tr>
<td><h5>table.exec.deduplicate.insert-update-after-sensitive-enabled</h5><br> <span class="label label-primary">Streaming</span></td>
<td style="word-wrap: break-word;">true</td>
18 changes: 18 additions & 0 deletions docs/layouts/shortcodes/generated/sink_configuration.html
@@ -0,0 +1,18 @@
<table class="configuration table table-bordered">
Contributor

This is unrelated to the async table function?

<thead>
<tr>
<th class="text-left" style="width: 20%">Key</th>
<th class="text-left" style="width: 15%">Default</th>
<th class="text-left" style="width: 10%">Type</th>
<th class="text-left" style="width: 55%">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><h5>sink.committer.retries</h5></td>
<td style="word-wrap: break-word;">10</td>
<td>Integer</td>
<td>The number of retries a Flink application attempts for committable operations (such as transactions) on retriable errors, as specified by the sink connector, before Flink fails and potentially restarts.</td>
</tr>
</tbody>
</table>

@@ -34,6 +34,7 @@
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

import static org.apache.flink.shaded.asm9.org.objectweb.asm.Type.getConstructorDescriptor;
import static org.apache.flink.shaded.asm9.org.objectweb.asm.Type.getMethodDescriptor;
@@ -373,4 +374,28 @@ public static void validateLambdaType(Class<?> baseClass, Type t) {
+ "Otherwise the type has to be specified explicitly using type information.");
}
}

/**
* Will return true if the type of the given generic class type.
Contributor

Suggested change
* Will return true if the type of the given generic class type.
* Will return true if the type of the given generic class type matches clazz.

*
* @param clazz The generic class to check against
* @param type The type to be checked
*/
public static boolean isGenericOfClass(Class<?> clazz, Type type) {
Optional<ParameterizedType> parameterized = getParameterizedType(type);
return clazz.equals(type)
|| parameterized.isPresent() && clazz.equals(parameterized.get().getRawType());
}

/**
* Returns an optional of a ParameterizedType, if that's what the type is.
*
* @param type The type to check
* @return optional which is present if the type is a ParameterizedType
*/
public static Optional<ParameterizedType> getParameterizedType(Type type) {
Comment on lines +384 to +396
Contributor

Add unit tests for these two functions?

return Optional.of(type)
.filter(p -> p instanceof ParameterizedType)
.map(ParameterizedType.class::cast);
}
}
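
One way to approach the unit tests requested in the review is sketched below. It is a standalone illustration, not the PR's test code: the two helpers are copied from the diff, while the `GenericTypeChecks` class, the `Samples` fixture, and the `fieldType` helper are hypothetical names introduced only to obtain `java.lang.reflect.Type` instances without any Flink dependency.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Collection;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

public class GenericTypeChecks {

    // Copied from the diff: true for the raw class itself or for a
    // parameterized type whose raw type equals clazz.
    public static boolean isGenericOfClass(Class<?> clazz, Type type) {
        Optional<ParameterizedType> parameterized = getParameterizedType(type);
        return clazz.equals(type)
                || (parameterized.isPresent()
                        && clazz.equals(parameterized.get().getRawType()));
    }

    // Copied from the diff: present only if the type is a ParameterizedType.
    public static Optional<ParameterizedType> getParameterizedType(Type type) {
        return Optional.of(type)
                .filter(p -> p instanceof ParameterizedType)
                .map(ParameterizedType.class::cast);
    }

    // Hypothetical fixture: field declarations are an easy source of generic types.
    @SuppressWarnings("rawtypes")
    static class Samples {
        CompletableFuture<String> parameterizedFuture;
        CompletableFuture rawFuture;
        List<String> list;
        Collection<String> collection;
    }

    // Hypothetical helper: looks up the generic type of a fixture field.
    public static Type fieldType(String name) {
        try {
            return Samples.class.getDeclaredField(name).getGenericType();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Parameterized and raw futures both match CompletableFuture.class.
        System.out.println(isGenericOfClass(CompletableFuture.class, fieldType("parameterizedFuture")));
        System.out.println(isGenericOfClass(CompletableFuture.class, fieldType("rawFuture")));
        // The comparison uses equals, so a List<String> does not match Collection.class.
        System.out.println(isGenericOfClass(Collection.class, fieldType("list")));
        System.out.println(isGenericOfClass(Collection.class, fieldType("collection")));
    }
}
```

A case worth covering explicitly is the subtype one: because the helper compares raw types with `equals` rather than `isAssignableFrom`, `List<String>` is not treated as a generic of `Collection`.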

@@ -409,7 +409,7 @@ public class ExecutionConfigOptions {
.intType()
.defaultValue(10)
.withDescription(
"The max number of async i/o operation that the async lookup join can trigger.");
"The max number of async i/o operation that the async scalar function can trigger.");

@Documentation.TableOption(execMode = Documentation.ExecMode.STREAMING)
public static final ConfigOption<Duration> TABLE_EXEC_ASYNC_SCALAR_TIMEOUT =
@@ -443,6 +443,49 @@ public class ExecutionConfigOptions {
"The max number of async retry attempts to make before task "
+ "execution is failed.");

// ------------------------------------------------------------------------
// Async Table Function
// ------------------------------------------------------------------------
@Documentation.TableOption(execMode = Documentation.ExecMode.STREAMING)
public static final ConfigOption<Integer> TABLE_EXEC_ASYNC_TABLE_BUFFER_CAPACITY =
key("table.exec.async-table.buffer-capacity")
.intType()
.defaultValue(10)
.withDescription(
"The max number of async i/o operations that the async table function can trigger.");

@Documentation.TableOption(execMode = Documentation.ExecMode.STREAMING)
public static final ConfigOption<Duration> TABLE_EXEC_ASYNC_TABLE_TIMEOUT =
key("table.exec.async-table.timeout")
.durationType()
.defaultValue(Duration.ofMinutes(3))
.withDescription(
"The async timeout for the asynchronous operation to complete.");
Contributor

Is this the timeout for each retry or the total timeout for all retries? I suppose it's for each retry.


@Documentation.TableOption(execMode = Documentation.ExecMode.STREAMING)
public static final ConfigOption<RetryStrategy> TABLE_EXEC_ASYNC_TABLE_RETRY_STRATEGY =
key("table.exec.async-table.retry-strategy")
.enumType(RetryStrategy.class)
.defaultValue(RetryStrategy.FIXED_DELAY)
.withDescription(
"Restart strategy which will be used, FIXED_DELAY by default.");

@Documentation.TableOption(execMode = Documentation.ExecMode.STREAMING)
public static final ConfigOption<Duration> TABLE_EXEC_ASYNC_TABLE_RETRY_DELAY =
key("table.exec.async-table.retry-delay")
.durationType()
.defaultValue(Duration.ofMillis(100))
.withDescription("The delay to wait before trying again.");

@Documentation.TableOption(execMode = Documentation.ExecMode.STREAMING)
public static final ConfigOption<Integer> TABLE_EXEC_ASYNC_TABLE_MAX_ATTEMPTS =
key("table.exec.async-table.max-attempts")
.intType()
.defaultValue(3)
.withDescription(
"The max number of async retry attempts to make before task "
+ "execution is failed.");

// ------------------------------------------------------------------------
// MiniBatch Options
// ------------------------------------------------------------------------

@@ -48,13 +48,17 @@
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

import static org.apache.flink.api.java.typeutils.TypeExtractionUtils.getAllDeclaredMethods;
import static org.apache.flink.api.java.typeutils.TypeExtractionUtils.getParameterizedType;
import static org.apache.flink.api.java.typeutils.TypeExtractionUtils.isGenericOfClass;
import static org.apache.flink.util.Preconditions.checkState;

/**
@@ -477,11 +481,12 @@ private static void validateImplementationMethods(
validateImplementationMethod(functionClass, false, false, SCALAR_EVAL);
} else if (AsyncScalarFunction.class.isAssignableFrom(functionClass)) {
validateImplementationMethod(functionClass, false, false, ASYNC_SCALAR_EVAL);
validateAsyncImplementationMethod(functionClass, ASYNC_SCALAR_EVAL);
validateAsyncImplementationMethod(functionClass, false, ASYNC_SCALAR_EVAL);
} else if (TableFunction.class.isAssignableFrom(functionClass)) {
validateImplementationMethod(functionClass, true, false, TABLE_EVAL);
} else if (AsyncTableFunction.class.isAssignableFrom(functionClass)) {
validateImplementationMethod(functionClass, true, false, ASYNC_TABLE_EVAL);
validateAsyncImplementationMethod(functionClass, true, ASYNC_TABLE_EVAL);
} else if (AggregateFunction.class.isAssignableFrom(functionClass)) {
validateImplementationMethod(functionClass, true, false, AGGREGATE_ACCUMULATE);
validateImplementationMethod(functionClass, true, true, AGGREGATE_RETRACT);
@@ -541,7 +546,9 @@ private static void validateImplementationMethod(
}

private static void validateAsyncImplementationMethod(
Class<? extends UserDefinedFunction> clazz, String... methodNameOptions) {
Class<? extends UserDefinedFunction> clazz,
boolean verifyFutureContainsCollection,
String... methodNameOptions) {
final Set<String> nameSet = new HashSet<>(Arrays.asList(methodNameOptions));
final List<Method> methods = getAllDeclaredMethods(clazz);
for (Method method : methods) {
@@ -558,18 +565,31 @@ private static void validateAsyncImplementationMethod(
if (method.getParameterCount() >= 1) {
Type firstParam = method.getGenericParameterTypes()[0];
firstParam = ExtractionUtils.resolveVariableWithClassContext(clazz, firstParam);
if (CompletableFuture.class.equals(firstParam)
|| firstParam instanceof ParameterizedType
&& CompletableFuture.class.equals(
((ParameterizedType) firstParam).getRawType())) {
foundParam = true;
if (isGenericOfClass(CompletableFuture.class, firstParam)) {
Optional<ParameterizedType> parameterized = getParameterizedType(firstParam);
if (!verifyFutureContainsCollection) {
foundParam = true;
} else if (parameterized.isPresent()
&& parameterized.get().getActualTypeArguments().length > 0) {
firstParam = parameterized.get().getActualTypeArguments()[0];
if (isGenericOfClass(Collection.class, firstParam)) {
foundParam = true;
}
}
}
}
if (!foundParam) {
throw new ValidationException(
String.format(
"Method '%s' of function class '%s' must have a first argument of type java.util.concurrent.CompletableFuture.",
method.getName(), clazz.getName()));
if (!verifyFutureContainsCollection) {
throw new ValidationException(
String.format(
"Method '%s' of function class '%s' must have a first argument of type java.util.concurrent.CompletableFuture.",
method.getName(), clazz.getName()));
} else {
throw new ValidationException(
String.format(
"Method '%s' of function class '%s' must have a first argument of type java.util.concurrent.CompletableFuture<java.util.Collection>.",
method.getName(), clazz.getName()));
}
}
}
}
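
The effect of the new `verifyFutureContainsCollection` flag can be illustrated with plain reflection. The following is a simplified sketch under stated assumptions, not the PR's implementation: it skips the `resolveVariableWithClassContext` step, and `AsyncEvalSignatureCheck`, `Fixtures`, and `fixture` are hypothetical names standing in for real UDF classes.

```java
import java.lang.reflect.Method;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class AsyncEvalSignatureCheck {

    /** True if the first parameter is a CompletableFuture, raw or parameterized. */
    public static boolean hasFutureFirstParam(Method method) {
        if (method.getParameterCount() < 1) {
            return false;
        }
        Type first = method.getGenericParameterTypes()[0];
        return CompletableFuture.class.equals(rawTypeOf(first));
    }

    /** True if the first parameter is a CompletableFuture whose type argument is a Collection. */
    public static boolean hasFutureOfCollectionFirstParam(Method method) {
        if (!hasFutureFirstParam(method)) {
            return false;
        }
        Type first = method.getGenericParameterTypes()[0];
        if (!(first instanceof ParameterizedType)) {
            return false; // a raw CompletableFuture carries no type argument to inspect
        }
        Type[] typeArgs = ((ParameterizedType) first).getActualTypeArguments();
        return typeArgs.length > 0 && Collection.class.equals(rawTypeOf(typeArgs[0]));
    }

    // Returns the raw type of a parameterized type, or the type itself otherwise.
    private static Type rawTypeOf(Type type) {
        return type instanceof ParameterizedType
                ? ((ParameterizedType) type).getRawType()
                : type;
    }

    // Hypothetical eval-style methods covering the accepted and rejected shapes.
    static class Fixtures {
        public void evalCollection(CompletableFuture<Collection<String>> future, String key) {}

        public void evalScalar(CompletableFuture<String> future, String key) {}

        public void evalList(CompletableFuture<List<String>> future, String key) {}

        public void evalNoFuture(String key) {}
    }

    // Hypothetical helper: finds a fixture method by name.
    public static Method fixture(String name) {
        for (Method m : Fixtures.class.getDeclaredMethods()) {
            if (m.getName().equals(name)) {
                return m;
            }
        }
        throw new IllegalArgumentException("No such fixture method: " + name);
    }
}
```

In this sketch `evalScalar` passes the scalar-style check but fails the table-style one, which corresponds to the two different `ValidationException` messages in the hunk. `evalList` also fails the table-style check, because the raw type is compared for equality with `Collection`.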

@@ -43,6 +43,7 @@
import java.util.concurrent.CompletableFuture;
import java.util.stream.Stream;

import static org.apache.flink.api.java.typeutils.TypeExtractionUtils.getParameterizedType;
import static org.apache.flink.table.types.extraction.ExtractionUtils.collectAnnotationsOfClass;
import static org.apache.flink.table.types.extraction.ExtractionUtils.collectAnnotationsOfMethod;
import static org.apache.flink.table.types.extraction.ExtractionUtils.extractionError;
@@ -208,7 +209,8 @@ static MethodVerification createParameterVerification(boolean requireAccumulator
* Verification that checks a method by parameters (arguments only) with mandatory {@link
* CompletableFuture}.
*/
static MethodVerification createParameterAndCompletableFutureVerification(Class<?> baseClass) {
static MethodVerification createParameterAndCompletableFutureVerification(
Class<?> baseClass, @Nullable Class<?> nestedArgumentClass) {
return (method, state, arguments, result) -> {
checkNoState(state);
checkScalarArgumentsOnly(arguments);
@@ -220,11 +222,24 @@ static MethodVerification createParameterAndCompletableFutureVerification(Class<
final Class<?> resultClass = result.toClass();
Type genericType = method.getGenericParameterTypes()[0];
genericType = resolveVariableWithClassContext(baseClass, genericType);
if (!(genericType instanceof ParameterizedType)) {
Optional<ParameterizedType> parameterized = getParameterizedType(genericType);
if (!parameterized.isPresent()) {
throw extractionError(
"The method '%s' needs generic parameters for the CompletableFuture at position %d.",
method.getName(), 0);
}
// If nestedArgumentClass is given, it is assumed to be a generic parameters of
// argumentClass, also at the position genericPos
if (nestedArgumentClass != null) {
genericType = parameterized.get().getActualTypeArguments()[0];
parameterized = getParameterizedType(genericType);
if (!parameterized.isPresent()
|| !parameterized.get().getRawType().equals(nestedArgumentClass)) {
throw extractionError(
"The method '%s' expects nested generic type CompletableFuture<%s> for the %d arg.",
method.getName(), nestedArgumentClass.getName(), 0);
}
}
final Type returnType = ((ParameterizedType) genericType).getActualTypeArguments()[0];
Class<?> returnTypeClass = getClassFromType(returnType);
// Parameters should be validated using strict autoboxing.
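
The nested unwrapping in this hunk, from `CompletableFuture<Collection<T>>` down to `T`, can be tried in isolation. The sketch below is an approximation built only on `java.lang.reflect`: `NestedFutureTypeExtraction`, `extractResultType`, `Fixtures`, `fixture`, and `rejects` are hypothetical names, and `IllegalArgumentException` stands in for the `extractionError` helper used in the diff.

```java
import java.lang.reflect.Method;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Collection;
import java.util.concurrent.CompletableFuture;

public class NestedFutureTypeExtraction {

    /**
     * Returns the result type carried by the future in the first parameter.
     * Assumes the method has at least one parameter. With nestedArgumentClass == null
     * the future's own type argument is returned; otherwise the future must wrap
     * nestedArgumentClass and that wrapper's type argument is returned.
     */
    public static Type extractResultType(Method method, Class<?> nestedArgumentClass) {
        Type genericType = method.getGenericParameterTypes()[0];
        if (!(genericType instanceof ParameterizedType)) {
            throw new IllegalArgumentException(
                    "Method '" + method.getName() + "' needs generic parameters for the CompletableFuture.");
        }
        if (nestedArgumentClass != null) {
            // Step one level down: CompletableFuture<X> -> X, which must be nestedArgumentClass<...>.
            genericType = ((ParameterizedType) genericType).getActualTypeArguments()[0];
            if (!(genericType instanceof ParameterizedType)
                    || !((ParameterizedType) genericType).getRawType().equals(nestedArgumentClass)) {
                throw new IllegalArgumentException(
                        "Method '" + method.getName() + "' expects CompletableFuture<"
                                + nestedArgumentClass.getName() + "> as its first argument.");
            }
        }
        return ((ParameterizedType) genericType).getActualTypeArguments()[0];
    }

    // Hypothetical eval-style methods for the table and scalar cases.
    static class Fixtures {
        public void evalTable(CompletableFuture<Collection<Long>> future, int key) {}

        public void evalScalar(CompletableFuture<String> future, int key) {}
    }

    // Hypothetical helper: finds a fixture method by name.
    public static Method fixture(String name) {
        for (Method m : Fixtures.class.getDeclaredMethods()) {
            if (m.getName().equals(name)) {
                return m;
            }
        }
        throw new IllegalArgumentException("No such fixture method: " + name);
    }

    // Hypothetical helper: true if extraction fails for the given combination.
    public static boolean rejects(Method method, Class<?> nestedArgumentClass) {
        try {
            extractResultType(method, nestedArgumentClass);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }
}
```

Passing `null` reproduces the scalar path (`forAsyncScalarFunction` in the diff), while passing `Collection.class` reproduces the table path (`forAsyncTableFunction`).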

@@ -45,6 +45,7 @@

import javax.annotation.Nullable;

import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
@@ -106,7 +107,7 @@ public static TypeInference forAsyncScalarFunction(
null,
null,
createOutputFromGenericInMethod(0, 0, true),
createParameterAndCompletableFutureVerification(function));
createParameterAndCompletableFutureVerification(function, null));
return extractTypeInference(mappingExtractor, false);
}

@@ -172,7 +173,8 @@ public static TypeInference forAsyncTableFunction(
null,
null,
createOutputFromGenericInClass(AsyncTableFunction.class, 0, true),
createParameterAndCompletableFutureVerification(function));
createParameterAndCompletableFutureVerification(
function, Collection.class));
return extractTypeInference(mappingExtractor, false);
}


@@ -198,7 +198,9 @@ private static TypeStrategy deriveSystemOutputStrategy(
FunctionKind functionKind,
@Nullable List<StaticArgument> staticArgs,
TypeStrategy outputStrategy) {
if (functionKind != FunctionKind.TABLE && functionKind != FunctionKind.PROCESS_TABLE) {
if (functionKind != FunctionKind.TABLE
&& functionKind != FunctionKind.PROCESS_TABLE
&& functionKind != FunctionKind.ASYNC_TABLE) {
return outputStrategy;
}
return new SystemOutputStrategy(functionKind, staticArgs, outputStrategy);