Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Storage Write API functionality doesn't work - UNKNOWN: failed to find stream #342

Open
tarasrng opened this issue Jul 23, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@tarasrng
Copy link

What happened?

I'm trying to implement a very simple PoC using Storage Write API and test it using the emulator (v0.6.3), but I keep getting this error:

com.google.api.gax.rpc.UnknownException: io.grpc.StatusRuntimeException: UNKNOWN: failed to find stream from projects/test-project/datasets/local-bigquery/tables/local-table/_default

Enabling the debug level doesn't help to find the root cause, container logs have nothing suspicious.

Actual code that uses Storage Write API:

        JSONArray rowContentArray = new JSONArray();
        JSONObject rowContent;
        rowContent = new JSONObject();

        rowContent.put("timestamp", DateTimeFormatter.ISO_INSTANT.format(Instant.ofEpochMilli(timestamp).atOffset(ZoneOffset.UTC)));
        rowContent.put("id", deviceIdentifier);
        rowContent.put("json", objectMapper.writeValueAsString(json));
        rowContent.put("partition", partition);

        rowContentArray.put(rowContent);

        TableSchema tableSchema = TableSchema.newBuilder()
                .addFields(TableFieldSchema.newBuilder()
                        .setName("id")
                        .setType(TableFieldSchema.Type.STRING)
                        .setMode(TableFieldSchema.Mode.NULLABLE)
                        .build())
                .addFields(TableFieldSchema.newBuilder()
                        .setName("timestamp")
                        .setType(TableFieldSchema.Type.TIMESTAMP)
                        .setMode(TableFieldSchema.Mode.NULLABLE)
                        .build())
                .addFields(TableFieldSchema.newBuilder()
                        .setName("json")
                        .setType(TableFieldSchema.Type.JSON)
                        .setMode(TableFieldSchema.Mode.NULLABLE)
                        .build())
                .addFields(TableFieldSchema.newBuilder()
                        .setName("partition")
                        .setType(TableFieldSchema.Type.INT64)
                        .setMode(TableFieldSchema.Mode.NULLABLE)
                        .build())
                .build();
                
                        BigQueryWriteSettings settings = BigQueryWriteSettings.newBuilder()
                .setEndpoint(properties.getGrpcHost())
                .setCredentialsProvider(NoCredentialsProvider.create())
                .setTransportChannelProvider(
                        EnhancedBigQueryReadStubSettings.defaultGrpcTransportProviderBuilder()
                                .setChannelConfigurator(ManagedChannelBuilder::usePlaintext)
                                .build()
                )
                .build();

        bigQueryWriteClient = BigQueryWriteClient.create(settings);
        writer = JsonStreamWriter.newBuilder(parentTable.toString(), tableSchema, bigQueryWriteClient)
                .setExecutorProvider(FixedExecutorProvider.create(Executors.newScheduledThreadPool(100)))
                .setChannelProvider(BigQueryWriteSettings.defaultGrpcTransportProviderBuilder()
                        .setKeepAliveTime(org.threeten.bp.Duration.ofMinutes(1))
                        .setKeepAliveTimeout(org.threeten.bp.Duration.ofMinutes(1))
                        .setKeepAliveWithoutCalls(true)
                        .setChannelsPerCpu(2)
                        .build())
                .setEnableConnectionPool(true)
                .setRetrySettings(retrySettings)
                .build()

        ApiFuture<AppendRowsResponse> future = writer.append(rowContentArray);

BTW the emulator works fine when using older insertAll API.

I would appreciate any help in understanding the root cause and how I can troubleshoot the problem.

What did you expect to happen?

The integration test should run successfully

How can we reproduce it (as minimally and precisely as possible)?

This is a sample project demonstrating the issue:
https://github.com/tarasrng/big-query-emulator-write
Test run command (requires Java 21 installed):
./mvnw clean install

Anything else we need to know?

Actual code:
https://github.com/tarasrng/big-query-emulator-write/blob/main/src/main/java/org/example/bigqueryemulatorwrite/repository/BigQueryRepository.java

Table schema creation:
https://github.com/tarasrng/big-query-emulator-write/blob/main/src/main/java/org/example/bigqueryemulatorwrite/repository/migration/SchemaCreator.java

Emulator config:
https://github.com/tarasrng/big-query-emulator-write/blob/main/src/main/java/org/example/bigqueryemulatorwrite/config/BigQueryEmulatorConfig.java

Test:
https://github.com/tarasrng/big-query-emulator-write/blob/main/src/test/java/org/example/bigqueryemulatorwrite/BigQueryEmulatorWriteApplicationTests.java

@tarasrng tarasrng added the bug Something isn't working label Jul 23, 2024
@andersdavoust
Copy link

We are having the same issue. We would very much appreciate this functionlity.

@ismailsimsek
Copy link

ismailsimsek commented Oct 30, 2024

having same error. this might mean we need to explicitly create the stream first. FYI

after explicitly creating the stream. im getting following error.

[UNAUTHENTICATED: Credentials require channel with PRIVACY_AND_INTEGRITY security level. Observed security level: NONE](https://stackoverflow.com/questions/65359948/unauthenticated-credentials-require-channel-with-privacy-and-integrity-security)

https://stackoverflow.com/questions/65359948/unauthenticated-credentials-require-channel-with-privacy-and-integrity-security

this example might help. in which example stream is created explicitly. instead using default stream. https://cloud.google.com/bigquery/docs/write-api-batch#java

@mluiten
Copy link

mluiten commented Jan 9, 2025

Apparently, PR #226 assumed that the default stream is formatted as projects/<project>/datasets/<datasets>/tables/<table>/streams/_default, but at least for the Java client, this is not correct, as it's using projects/<project>/datasets/<datasets>/tables/<table>/_default -- so without the streams at the end... and there is no way to create the default stream because you can only create streams using the former format.

So not sure what that PR was all about and who tested it, but at least for Java it's not working at all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants