42 changes: 42 additions & 0 deletions docs/flink-actions.md
@@ -0,0 +1,42 @@
---
title: "Flink Actions"
url: flink-actions
aliases:
- "flink/flink-actions"
menu:
main:
parent: Flink
weight: 500
---
<!--
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements. See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
-->

## Rewrite files action

Iceberg provides an API to rewrite small files into large files by submitting a Flink batch job. The behavior of this Flink action is the same as Spark's [rewriteDataFiles](../maintenance/#compact-data-files).

```java
import org.apache.iceberg.Table;
import org.apache.iceberg.flink.TableLoader;
import org.apache.iceberg.flink.actions.Actions;
import org.apache.iceberg.flink.actions.RewriteDataFilesActionResult;

TableLoader tableLoader = TableLoader.fromHadoopTable("hdfs://nn:8020/warehouse/path");
Table table = tableLoader.loadTable();
RewriteDataFilesActionResult result = Actions.forTable(table)
.rewriteDataFiles()
.execute();
```

For more information about the options of the rewrite files action, see the [RewriteDataFilesAction](../../../javadoc/{{% icebergVersion %}}/org/apache/iceberg/flink/actions/RewriteDataFilesAction.html) Javadoc.
234 changes: 234 additions & 0 deletions docs/flink-ddl.md
@@ -0,0 +1,234 @@
---
title: "Flink DDL"
url: flink-ddl
aliases:
- "flink/flink-ddl"
menu:
main:
parent: Flink
weight: 200
---
<!--
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements. See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
-->

## DDL commands

### `CREATE CATALOG`

#### Catalog Configuration

**Review comment (Contributor):** Should we split this one out into flink-configuration.md? If you're looking for DDL, you'll first encounter a long list of catalog configuration options.

**Reply (Contributor, Author):** Added configuration page.

A catalog is created and named by executing the following query (replace `<catalog_name>` with your catalog name and
`<config_key>`=`<config_value>` with catalog implementation config):

```sql
CREATE CATALOG <catalog_name> WITH (
'type'='iceberg',
'<config_key>'='<config_value>'
);
```

The following properties can be set globally and are not limited to a specific catalog implementation:

* `type`: Must be `iceberg`. (Required)
* `catalog-type`: `hive`, `hadoop` or `rest` for built-in catalogs, or left unset for custom catalog implementations using `catalog-impl`. (Optional)
* `catalog-impl`: The fully-qualified class name of a custom catalog implementation. Must be set if `catalog-type` is unset. (Optional)
* `property-version`: Version number to describe the property version. This property can be used for backwards compatibility in case the property format changes. The current property version is `1`. (Optional)
* `cache-enabled`: Whether to enable catalog cache; default value is `true` (see the example below). (Optional)
* `cache.expiration-interval-ms`: How long catalog entries are locally cached, in milliseconds. Negative values like `-1` disable expiration; setting it to `0` is not allowed. Default value is `-1`. (Optional)
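
For example, a minimal sketch of a Hadoop catalog created with metadata caching disabled; the catalog name and warehouse path are only placeholders:

```sql
CREATE CATALOG no_cache_catalog WITH (
'type'='iceberg',
'catalog-type'='hadoop',
'warehouse'='hdfs://nn:8020/warehouse/path',
'cache-enabled'='false'
);
```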

#### Hive catalog

This creates an Iceberg catalog named `hive_catalog` that can be configured using `'catalog-type'='hive'`, which loads tables from a Hive metastore:

```sql
CREATE CATALOG hive_catalog WITH (
'type'='iceberg',
'catalog-type'='hive',
'uri'='thrift://localhost:9083',
'clients'='5',
'property-version'='1',
'warehouse'='hdfs://nn:8020/warehouse/path'
);
```

The following properties can be set if using the Hive catalog:

* `uri`: The Hive metastore's thrift URI. (Required)
* `clients`: The Hive metastore client pool size. Default value is `2`. (Optional)
* `warehouse`: The Hive warehouse location. Users should specify this path if they neither set `hive-conf-dir` to specify a location containing a `hive-site.xml` configuration file nor add a correct `hive-site.xml` to the classpath.
* `hive-conf-dir`: Path to a directory containing a `hive-site.xml` configuration file which will be used to provide custom Hive configuration values (see the example below). The value of `hive.metastore.warehouse.dir` from `<hive-conf-dir>/hive-site.xml` (or the Hive configuration file on the classpath) will be overwritten with the `warehouse` value if both `hive-conf-dir` and `warehouse` are set when creating the Iceberg catalog.
* `hadoop-conf-dir`: Path to a directory containing `core-site.xml` and `hdfs-site.xml` configuration files which will be used to provide custom Hadoop configuration values.
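
For example, a sketch of a Hive catalog that picks up its warehouse location from `hive-conf-dir` rather than an explicit `warehouse` property; the `/etc/hive/conf` directory is only an assumed location for `hive-site.xml`:

```sql
CREATE CATALOG hive_catalog WITH (
'type'='iceberg',
'catalog-type'='hive',
'uri'='thrift://localhost:9083',
'hive-conf-dir'='/etc/hive/conf'
);
```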

#### Hadoop catalog

Iceberg also supports a directory-based catalog in HDFS that can be configured using `'catalog-type'='hadoop'`:

```sql
CREATE CATALOG hadoop_catalog WITH (
'type'='iceberg',
'catalog-type'='hadoop',
'warehouse'='hdfs://nn:8020/warehouse/path',
'property-version'='1'
);
```

The following properties can be set if using the Hadoop catalog:

* `warehouse`: The HDFS directory to store metadata files and data files. (Required)

Execute the SQL command `USE CATALOG hadoop_catalog` to set the current catalog.
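
For example, after creating the catalog above:

```sql
USE CATALOG hadoop_catalog;
```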

#### REST catalog

This creates an Iceberg catalog named `rest_catalog` that can be configured using `'catalog-type'='rest'`, which loads tables from a REST catalog:

```sql
CREATE CATALOG rest_catalog WITH (
'type'='iceberg',
'catalog-type'='rest',
'uri'='https://localhost/'
);
```

The following properties can be set if using the REST catalog:

* `uri`: The URL to the REST Catalog. (Required)
* `credential`: A credential to exchange for a token in the OAuth2 client credentials flow (see the sketch below). (Optional)
* `token`: A token which will be used to interact with the server. (Optional)
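
As a sketch, a REST catalog that authenticates via the OAuth2 client credentials flow might be created as follows; `<credential>` is a placeholder for whatever credential your REST server expects:

```sql
CREATE CATALOG rest_catalog WITH (
'type'='iceberg',
'catalog-type'='rest',
'uri'='https://localhost/',
'credential'='<credential>'
);
```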

#### Custom catalog

Flink also supports loading a custom Iceberg `Catalog` implementation by specifying the `catalog-impl` property:

```sql
CREATE CATALOG my_catalog WITH (
'type'='iceberg',
'catalog-impl'='com.my.custom.CatalogImpl',
'my-additional-catalog-config'='my-value'
);
```

#### Create through YAML config

Catalogs can be registered in `sql-client-defaults.yaml` before starting the SQL client.

```yaml
catalogs:
- name: my_catalog
type: iceberg
catalog-type: hadoop
warehouse: hdfs://nn:8020/warehouse/path
```

#### Create through SQL Files

The Flink SQL Client supports the `-i` startup option to execute an initialization SQL file that sets up the environment when the SQL Client starts.

```sql
-- define available catalogs
CREATE CATALOG hive_catalog WITH (
'type'='iceberg',
'catalog-type'='hive',
'uri'='thrift://localhost:9083',
'warehouse'='hdfs://nn:8020/warehouse/path'
);

USE CATALOG hive_catalog;
```

Use the `-i <init.sql>` option to initialize the SQL Client session:

```bash
/path/to/bin/sql-client.sh -i /path/to/init.sql
```

### `CREATE DATABASE`

By default, Iceberg will use the `default` database in Flink. Use the following example to create a separate database to avoid creating tables under the `default` database:

```sql
CREATE DATABASE iceberg_db;
USE iceberg_db;
```

### `CREATE TABLE`

```sql
CREATE TABLE `hive_catalog`.`default`.`sample` (
id BIGINT COMMENT 'unique id',
data STRING
);
```

Table create commands support the commonly used [Flink create clauses](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/) including:

* `PARTITIONED BY (column1, column2, ...)` to configure partitioning; Flink does not yet support hidden partitioning.
* `COMMENT 'table document'` to set a table description.
* `WITH ('key'='value', ...)` to set [table configuration](../configuration) which will be stored in Iceberg table properties.

Currently, it does not support computed columns, primary keys, or watermark definitions. A sketch combining these clauses is shown below.
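
For example, a sketch of a table definition using the `COMMENT` and `WITH` clauses; the table properties shown are only illustrative:

```sql
CREATE TABLE `hive_catalog`.`default`.`sample` (
id BIGINT COMMENT 'unique id',
data STRING
) COMMENT 'a sample iceberg table'
WITH ('format-version'='2', 'write.format.default'='parquet');
```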

### `PARTITIONED BY`

To create a partitioned table, use `PARTITIONED BY`:

```sql
CREATE TABLE `hive_catalog`.`default`.`sample` (
id BIGINT COMMENT 'unique id',
data STRING
) PARTITIONED BY (data);
```

Iceberg supports hidden partitioning, but Flink doesn't support partitioning by a function on columns, so there is currently no way to define hidden partitions through Flink DDL.

### `CREATE TABLE LIKE`

To create a table with the same schema, partitioning, and table properties as another table, use `CREATE TABLE LIKE`.

```sql
CREATE TABLE `hive_catalog`.`default`.`sample` (
id BIGINT COMMENT 'unique id',
data STRING
);

CREATE TABLE `hive_catalog`.`default`.`sample_like` LIKE `hive_catalog`.`default`.`sample`;
```

For more details, refer to the [Flink `CREATE TABLE` documentation](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql/create/).


### `ALTER TABLE`

Iceberg only supports altering table properties:

```sql
ALTER TABLE `hive_catalog`.`default`.`sample` SET ('write.format.default'='avro');
```
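
Flink's `ALTER TABLE ... SET` clause accepts a list of key-value pairs, so several table properties can be updated in a single statement; a sketch with illustrative values:

```sql
ALTER TABLE `hive_catalog`.`default`.`sample`
SET ('write.format.default'='parquet', 'write.target-file-size-bytes'='536870912');
```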

### `ALTER TABLE .. RENAME TO`

```sql
ALTER TABLE `hive_catalog`.`default`.`sample` RENAME TO `hive_catalog`.`default`.`new_sample`;
```

### `DROP TABLE`

To delete a table, run:

```sql
DROP TABLE `hive_catalog`.`default`.`sample`;
```