Commit f7181aa

treff7es authored and committed
[GOBBLIN-1385] Renaming references from incubator gobblin to gobblin
Renaming references from incubator gobblin to gobblin.
Removing the disclaimer as it is not needed anymore.
Closes apache#3223 from treff7es/rename_incubator_gobblin_to_gobblin
1 parent 2c5122d commit f7181aa

41 files changed: +181 -182 lines changed
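
A rename of this kind can be checked mechanically. The following is a minimal sketch and is not part of the commit: `rename_incubator_refs` and `find_stale_refs` are hypothetical helper names, and the sample text reuses one link from the README diff below. It assumes the old name always appears as the literal string `incubator-gobblin`.

```python
import re
from typing import List

# Assumption: the old repository name always appears as this literal string.
OLD_NAME = re.compile(r"incubator-gobblin")


def rename_incubator_refs(text: str) -> str:
    """Replace every 'incubator-gobblin' reference with 'gobblin'."""
    return OLD_NAME.sub("gobblin", text)


def find_stale_refs(text: str) -> List[int]:
    """Return the 1-based numbers of lines that still mention the old name."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if OLD_NAME.search(line)
    ]


sample = (
    "* [Sample project](https://github.com/apache/incubator-gobblin/tree/master/gobblin-example)\n"
    "* [Gobblin documentation](https://gobblin.apache.org/docs/)\n"
)
renamed = rename_incubator_refs(sample)
print(find_stale_refs(sample))   # line numbers that still use the old name
print(find_stale_refs(renamed))  # empty once the rename has been applied
```

Running `find_stale_refs` over each changed file after the rename gives a quick way to confirm that no old reference was missed.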

CHANGELOG.md (+2 -2)

@@ -1762,9 +1762,9 @@ GOBBLIN 0.6.0
 NEW FEATURES

 * [Compaction] Added M/R compaction/de-duping for hourly data
-* [Compaction] Added late data handling for hourly and daily M/R compaction: https://github.com/apache/incubator-gobblin/wiki/Compaction#handling-late-records; added support for triggering M/R compaction if late data exceeds a threshold
+* [Compaction] Added late data handling for hourly and daily M/R compaction: https://github.com/apache/gobblin/wiki/Compaction#handling-late-records; added support for triggering M/R compaction if late data exceeds a threshold
 * [I/O] Added support for using Hive SerDe's through HiveWritableHdfsDataWriter
-* [I/O] Added the concept of data partitioning to writers: https://github.com/apache/incubator-gobblin/wiki/Partitioned-Writers
+* [I/O] Added the concept of data partitioning to writers: https://github.com/apache/gobblin/wiki/Partitioned-Writers
 * [Runtime] Added CliLocalJobLauncher for launching single jobs from the command line.
 * [Converters] Added AvroSchemaFieldRemover that can remove specific fields from a (possibly recursive) Avro schema.
 * [DQ] Added new row-level policies RecordTimestampLowerBoundPolicy and AvroRecordTimestampLowerBoundPolicy for checking if a record timestamp is too far in the past.

DISCLAIMER (-1)

This file was deleted.

README.md (+4 -4)

@@ -1,10 +1,10 @@
 # Apache Gobblin
-[![Build Status](https://api.travis-ci.org/apache/incubator-gobblin.svg?branch=master)](https://travis-ci.org/apache/incubator-gobblin)
+[![Build Status](https://api.travis-ci.org/apache/gobblin.svg?branch=master)](https://travis-ci.org/apache/gobblin)
 [![Documentation Status](https://readthedocs.org/projects/gobblin/badge/?version=latest)](https://gobblin.readthedocs.org/en/latest/?badge=latest)
 [![Maven Central](https://maven-badges.herokuapp.com/maven-central/org.apache.gobblin/gobblin-api/badge.svg)](https://search.maven.org/search?q=g:org.apache.gobblin)
 [![Stack Overflow](http://img.shields.io/:stack%20overflow-gobblin-brightgreen.svg)](http://stackoverflow.com/questions/tagged/gobblin)
 [![Join us on Slack](https://img.shields.io/badge/slack-apache--gobblin-brightgreen.svg)](https://communityinviter.com/apps/apache-gobblin/apache-gobblin)
-[![codecov.io](https://codecov.io/github/apache/incubator-gobblin/branch/master/graph/badge.svg)](https://codecov.io/github/apache/incubator-gobblin)
+[![codecov.io](https://codecov.io/github/apache/gobblin/branch/master/graph/badge.svg)](https://codecov.io/github/apache/gobblin)

 Apache Gobblin is a highly scalable data management solution for structured and byte-oriented data in heterogeneous data ecosystems.

@@ -59,11 +59,11 @@ The distribution will be created in build/gobblin-distribution/distributions dir
 # Quick Links

 * [Gobblin documentation](https://gobblin.apache.org/docs/)
-* [Running Gobblin on Docker from your laptop](https://github.com/apache/incubator-gobblin/blob/master/gobblin-docs/user-guide/Docker-Integration.md)
+* [Running Gobblin on Docker from your laptop](https://github.com/apache/gobblin/blob/master/gobblin-docs/user-guide/Docker-Integration.md)
 * [Getting started guide](https://gobblin.apache.org/docs/Getting-Started/)
 * [Gobblin architecture](https://gobblin.apache.org/docs/Gobblin-Architecture/)
 * Community Slack: [Get your invite](https://communityinviter.com/apps/apache-gobblin/apache-gobblin)
 * [List of companies known to use Gobblin](https://gobblin.apache.org/docs/Powered-By/)
-* [Sample project](https://github.com/apache/incubator-gobblin/tree/master/gobblin-example)
+* [Sample project](https://github.com/apache/gobblin/tree/master/gobblin-example)
 * [How to build Gobblin from source code](https://gobblin.apache.org/docs/user-guide/Building-Gobblin/)
 * [Issue tracker - Apache Jira](https://issues.apache.org/jira/projects/GOBBLIN/issues/)

dev/README.md (+4 -4)

@@ -63,10 +63,10 @@ Users can configure this automatically by running `gobblin-pr setup_git_remotes`

 ```bash
 $ git remote -v
-apache https://git-wip-us.apache.org/repos/asf/incubator-gobblin.git (fetch)
-apache https://git-wip-us.apache.org/repos/asf/incubator-gobblin.git (push)
-github https://github.com/apache/incubator-gobblin.git (fetch)
-github https://github.com/apache/incubator-gobblin.git (push)
+apache https://git-wip-us.apache.org/repos/asf/gobblin.git (fetch)
+apache https://git-wip-us.apache.org/repos/asf/gobblin.git (push)
+github https://github.com/apache/gobblin.git (fetch)
+github https://github.com/apache/gobblin.git (push)
 origin https://github.com/<USER>/gobblin (fetch)
 origin https://github.com/<USER>/gobblin (push)
 ```

dev/gobblin-pr (+8 -8)

@@ -86,8 +86,8 @@ APACHE_REMOTE_NAME = os.environ.get("APACHE_REMOTE_NAME", "apache")
 # https://github.com/settings/tokens. This tool only requires the "public_repo" scope.
 GITHUB_OAUTH_KEY = os.environ.get("GITHUB_OAUTH_KEY")

-GITHUB_BASE = "https://github.com/apache/incubator-gobblin/pull"
-GITHUB_API_BASE = "https://api.github.com/repos/apache/incubator-gobblin"
+GITHUB_BASE = "https://github.com/apache/gobblin/pull"
+GITHUB_API_BASE = "https://api.github.com/repos/apache/gobblin"
 GITHUB_USER = 'asfgit'

 JIRA_BASE = "https://issues.apache.org/jira/browse"
@@ -1050,25 +1050,25 @@ def setup_git_remotes():
     GITHUB_REMOTE_NAME and APACHE_REMOTE_NAME environment variables:

       git remote -v
-      apache https://git-wip-us.apache.org/repos/asf/incubator-gobblin.git (fetch)
-      apache https://git-wip-us.apache.org/repos/asf/incubator-gobblin.git (push)
-      github https://github.com/apache/incubator-gobblin.git (fetch)
-      github https://github.com/apache/incubator-gobblin.git (push)
+      apache https://git-wip-us.apache.org/repos/asf/gobblin.git (fetch)
+      apache https://git-wip-us.apache.org/repos/asf/gobblin.git (push)
+      github https://github.com/apache/gobblin.git (fetch)
+      github https://github.com/apache/gobblin.git (push)

     If these remotes already exist, the tool will display an error.
     """))
     continue_maybe('Do you want to continue?')

     error = False
     try:
-        run_cmd('git remote add apache https://git-wip-us.apache.org/repos/asf/incubator-gobblin.git')
+        run_cmd('git remote add apache https://git-wip-us.apache.org/repos/asf/gobblin.git')
     except:
         click.echo(click.style(reflow(
             '>>ERROR: Could not create apache remote. If it already exists, '
             'run `git remote remove apache` to delete it.', fg='red')))
         error = True
     try:
-        run_cmd('git remote add github https://github.com/apache/incubator-gobblin.git')
+        run_cmd('git remote add github https://github.com/apache/gobblin.git')
     except:
         click.echo(click.style(reflow(
             '>>ERROR: Could not create github remote. If it already exists, '

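Contributors whose clones still point at the old repository may find it useful to see which remote commands would bring them in line with the URLs above. The sketch below is an illustration only and is not how `gobblin-pr` behaves: the tool runs `git remote add` and reports an error if a remote already exists. `plan_remote_commands` is a hypothetical helper that takes the current remotes as a plain mapping, so it does not call git itself.

```python
from typing import Dict, List

# Remote URLs taken from the updated lines of the diff above.
EXPECTED_REMOTES = {
    "apache": "https://git-wip-us.apache.org/repos/asf/gobblin.git",
    "github": "https://github.com/apache/gobblin.git",
}


def plan_remote_commands(existing: Dict[str, str]) -> List[str]:
    """Return the git commands needed to match EXPECTED_REMOTES.

    `existing` maps remote name to URL, for example as parsed from the
    output of `git remote -v`.
    """
    commands = []
    for name, url in EXPECTED_REMOTES.items():
        if name not in existing:
            # Remote is missing entirely: add it.
            commands.append(f"git remote add {name} {url}")
        elif existing[name] != url:
            # Remote exists but points elsewhere, e.g. at incubator-gobblin.
            commands.append(f"git remote set-url {name} {url}")
    return commands


stale_clone = {
    "apache": "https://git-wip-us.apache.org/repos/asf/incubator-gobblin.git",
}
for command in plan_remote_commands(stale_clone):
    print(command)
```

Keeping the planning step free of side effects makes it easy to review the commands before running any of them.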
gobblin-docs/Getting-Started.md (+8 -8)

@@ -81,17 +81,17 @@ For this example, we will once again run the Wikipedia example. The records will

 ## Preliminary

-Each Gobblin job minimally involves several constructs, e.g. [Source](https://github.com/apache/incubator-gobblin/blob/master/gobblin-api/src/main/java/org/apache/gobblin/source/Source.java), [Extractor](https://github.com/apache/incubator-gobblin/blob/master/gobblin-api/src/main/java/org/apache/gobblin/source/extractor/Extractor.java), [DataWriter](https://github.com/apache/incubator-gobblin/blob/master/gobblin-api/src/main/java/org/apache/gobblin/writer/DataWriter.java) and [DataPublisher](https://github.com/apache/incubator-gobblin/blob/master/gobblin-api/src/main/java/org/apache/gobblin/publisher/DataPublisher.java). As the names suggest, Source defines the source to pull data from, Extractor implements the logic to extract data records, DataWriter defines the way the extracted records are output, and DataPublisher publishes the data to the final output location. A job may optionally have one or more Converters, which transform the extracted records, as well as one or more PolicyCheckers that check the quality of the extracted records and determine whether they conform to certain policies.
+Each Gobblin job minimally involves several constructs, e.g. [Source](https://github.com/apache/gobblin/blob/master/gobblin-api/src/main/java/org/apache/gobblin/source/Source.java), [Extractor](https://github.com/apache/gobblin/blob/master/gobblin-api/src/main/java/org/apache/gobblin/source/extractor/Extractor.java), [DataWriter](https://github.com/apache/gobblin/blob/master/gobblin-api/src/main/java/org/apache/gobblin/writer/DataWriter.java) and [DataPublisher](https://github.com/apache/gobblin/blob/master/gobblin-api/src/main/java/org/apache/gobblin/publisher/DataPublisher.java). As the names suggest, Source defines the source to pull data from, Extractor implements the logic to extract data records, DataWriter defines the way the extracted records are output, and DataPublisher publishes the data to the final output location. A job may optionally have one or more Converters, which transform the extracted records, as well as one or more PolicyCheckers that check the quality of the extracted records and determine whether they conform to certain policies.

-Some of the classes relevant to this example include [WikipediaSource](https://github.com/apache/incubator-gobblin/blob/master/gobblin-example/src/main/java/org/apache/gobblin/example/wikipedia/WikipediaSource.java), [WikipediaExtractor](https://github.com/apache/incubator-gobblin/blob/master/gobblin-example/src/main/java/org/apache/gobblin/example/wikipedia/WikipediaExtractor.java), [WikipediaConverter](https://github.com/apache/incubator-gobblin/blob/master/gobblin-example/src/main/java/org/apache/gobblin/example/wikipedia/WikipediaConverter.java), [AvroHdfsDataWriter](https://github.com/apache/incubator-gobblin/blob/master/gobblin-core/src/main/java/org/apache/gobblin/writer/AvroHdfsDataWriter.java) and [BaseDataPublisher](https://github.com/apache/incubator-gobblin/blob/master/gobblin-core/src/main/java/org/apache/gobblin/publisher/BaseDataPublisher.java).
+Some of the classes relevant to this example include [WikipediaSource](https://github.com/apache/gobblin/blob/master/gobblin-example/src/main/java/org/apache/gobblin/example/wikipedia/WikipediaSource.java), [WikipediaExtractor](https://github.com/apache/gobblin/blob/master/gobblin-example/src/main/java/org/apache/gobblin/example/wikipedia/WikipediaExtractor.java), [WikipediaConverter](https://github.com/apache/gobblin/blob/master/gobblin-example/src/main/java/org/apache/gobblin/example/wikipedia/WikipediaConverter.java), [AvroHdfsDataWriter](https://github.com/apache/gobblin/blob/master/gobblin-core/src/main/java/org/apache/gobblin/writer/AvroHdfsDataWriter.java) and [BaseDataPublisher](https://github.com/apache/gobblin/blob/master/gobblin-core/src/main/java/org/apache/gobblin/publisher/BaseDataPublisher.java).

-To run Gobblin in standalone daemon mode we need a Gobblin configuration file (such as uses [application.conf](https://github.com/apache/incubator-gobblin/blob/master/conf/standalone/application.conf)). And for each job we wish to run, we also need a job configuration file (such as [wikipedia.pull](https://github.com/apache/incubator-gobblin/blob/master/gobblin-example/src/main/resources/wikipedia.pull)). The Gobblin configuration file, which is passed to Gobblin as a command line argument, should contain a property `jobconf.dir` which specifies where the job configuration files are located. By default, `jobconf.dir` points to environment variable `GOBBLIN_JOB_CONFIG_DIR`. Each file in `jobconf.dir` with extension `.job` or `.pull` is considered a job configuration file, and Gobblin will launch a job for each such file. For more information on Gobblin deployment in standalone mode, refer to the [Standalone Deployment](user-guide/Gobblin-Deployment#Standalone-Deployment) page.
+To run Gobblin in standalone daemon mode we need a Gobblin configuration file (such as uses [application.conf](https://github.com/apache/gobblin/blob/master/conf/standalone/application.conf)). And for each job we wish to run, we also need a job configuration file (such as [wikipedia.pull](https://github.com/apache/gobblin/blob/master/gobblin-example/src/main/resources/wikipedia.pull)). The Gobblin configuration file, which is passed to Gobblin as a command line argument, should contain a property `jobconf.dir` which specifies where the job configuration files are located. By default, `jobconf.dir` points to environment variable `GOBBLIN_JOB_CONFIG_DIR`. Each file in `jobconf.dir` with extension `.job` or `.pull` is considered a job configuration file, and Gobblin will launch a job for each such file. For more information on Gobblin deployment in standalone mode, refer to the [Standalone Deployment](user-guide/Gobblin-Deployment#Standalone-Deployment) page.

 A list of commonly used configuration properties can be found here: [Configuration Properties Glossary](user-guide/Configuration-Properties-Glossary).

 ## Steps

-* Create a folder to store the job configuration file. Put [wikipedia.pull](https://github.com/apache/incubator-gobblin/blob/master/gobblin-example/src/main/resources/wikipedia.pull) in this folder, and set environment variable `GOBBLIN_JOB_CONFIG_DIR` to point to this folder. Also, make sure that the environment variable `JAVA_HOME` is set correctly.
+* Create a folder to store the job configuration file. Put [wikipedia.pull](https://github.com/apache/gobblin/blob/master/gobblin-example/src/main/resources/wikipedia.pull) in this folder, and set environment variable `GOBBLIN_JOB_CONFIG_DIR` to point to this folder. Also, make sure that the environment variable `JAVA_HOME` is set correctly.

 * Create a folder as Gobblin's working directory. Gobblin will write job output as well as other information there, such as locks and state-store (for more information, see the [Standalone Deployment](user-guide/Gobblin-Deployment#Standalone-Deployment) page). Set environment variable `GOBBLIN_WORK_DIR` to point to that folder.

@@ -152,12 +152,12 @@ java -jar avro-tools-1.8.1.jar tojson --pretty [job_output].avro > output.json

 `output.json` will contain all retrieved records in JSON format.

-Note that since this job configuration file we used ([wikipedia.pull](https://github.com/apache/incubator-gobblin/blob/master/gobblin-example/src/main/resources/wikipedia.pull)) doesn't specify a job schedule, the job will run immediately and will run only once. To schedule a job to run at a certain time and/or repeatedly, set the `job.schedule` property with a cron-based syntax. For example, `job.schedule=0 0/2 * * * ?` will run the job every two minutes. See [this link](http://www.quartz-scheduler.org/documentation/quartz-2.1.x/tutorials/crontrigger.html) (Quartz CronTrigger) for more details.
+Note that since this job configuration file we used ([wikipedia.pull](https://github.com/apache/gobblin/blob/master/gobblin-example/src/main/resources/wikipedia.pull)) doesn't specify a job schedule, the job will run immediately and will run only once. To schedule a job to run at a certain time and/or repeatedly, set the `job.schedule` property with a cron-based syntax. For example, `job.schedule=0 0/2 * * * ?` will run the job every two minutes. See [this link](http://www.quartz-scheduler.org/documentation/quartz-2.1.x/tutorials/crontrigger.html) (Quartz CronTrigger) for more details.

 # Other Example Jobs

-Besides the Wikipedia example, we have another example job [SimpleJson](https://github.com/apache/incubator-gobblin/blob/master/gobblin-example/src/main/resources/simplejson.pull), which extracts records from JSON files and store them in Avro files.
+Besides the Wikipedia example, we have another example job [SimpleJson](https://github.com/apache/gobblin/blob/master/gobblin-example/src/main/resources/simplejson.pull), which extracts records from JSON files and store them in Avro files.

-To create your own jobs, simply implement the relevant interfaces such as [Source](https://github.com/apache/incubator-gobblin/blob/master/gobblin-api/src/main/java/org/apache/gobblin/source/Source.java), [Extractor](https://github.com/apache/incubator-gobblin/blob/master/gobblin-api/src/main/java/org/apache/gobblin/source/extractor/Extractor.java), [Converter](https://github.com/apache/incubator-gobblin/blob/master/gobblin-api/src/main/java/org/apache/gobblin/converter/Converter.java) and [DataWriter](https://github.com/apache/incubator-gobblin/blob/master/gobblin-api/src/main/java/org/apache/gobblin/writer/DataWriter.java). In the job configuration file, set properties such as `source.class` and `converter.class` to point to these classes.
+To create your own jobs, simply implement the relevant interfaces such as [Source](https://github.com/apache/gobblin/blob/master/gobblin-api/src/main/java/org/apache/gobblin/source/Source.java), [Extractor](https://github.com/apache/gobblin/blob/master/gobblin-api/src/main/java/org/apache/gobblin/source/extractor/Extractor.java), [Converter](https://github.com/apache/gobblin/blob/master/gobblin-api/src/main/java/org/apache/gobblin/converter/Converter.java) and [DataWriter](https://github.com/apache/gobblin/blob/master/gobblin-api/src/main/java/org/apache/gobblin/writer/DataWriter.java). In the job configuration file, set properties such as `source.class` and `converter.class` to point to these classes.

-On a side note: while users are free to directly implement the Extractor interface (e.g., WikipediaExtractor), Gobblin also provides several extractor implementations based on commonly used protocols, e.g., [KafkaExtractor](https://github.com/apache/incubator-gobblin/blob/master/gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/source/extractor/extract/kafka/KafkaExtractor.java), [RestApiExtractor](https://github.com/apache/incubator-gobblin/blob/master/gobblin-core/src/main/java/org/apache/gobblin/source/extractor/extract/restapi/RestApiExtractor.java), [JdbcExtractor](https://github.com/apache/incubator-gobblin/blob/master/gobblin-modules/gobblin-sql/src/main/java/org/apache/gobblin/source/jdbc/JdbcExtractor.java), [SftpExtractor](https://github.com/apache/incubator-gobblin/blob/master/gobblin-core/src/main/java/org/apache/gobblin/source/extractor/extract/sftp/SftpExtractor.java), etc. Users are encouraged to extend these classes to take advantage of existing implementations.
+On a side note: while users are free to directly implement the Extractor interface (e.g., WikipediaExtractor), Gobblin also provides several extractor implementations based on commonly used protocols, e.g., [KafkaExtractor](https://github.com/apache/gobblin/blob/master/gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/source/extractor/extract/kafka/KafkaExtractor.java), [RestApiExtractor](https://github.com/apache/gobblin/blob/master/gobblin-core/src/main/java/org/apache/gobblin/source/extractor/extract/restapi/RestApiExtractor.java), [JdbcExtractor](https://github.com/apache/gobblin/blob/master/gobblin-modules/gobblin-sql/src/main/java/org/apache/gobblin/source/jdbc/JdbcExtractor.java), [SftpExtractor](https://github.com/apache/gobblin/blob/master/gobblin-core/src/main/java/org/apache/gobblin/source/extractor/extract/sftp/SftpExtractor.java), etc. Users are encouraged to extend these classes to take advantage of existing implementations.
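
The `job.schedule` value quoted in the Getting-Started text above uses Quartz cron syntax, which starts with a seconds field and so has one more field than classic Unix cron. The sketch below only labels the fields of such an expression and does not evaluate schedules. `parse_properties` and `label_quartz_fields` are hypothetical helpers, and the two-line job snippet is made up for illustration (per the text above, the real `wikipedia.pull` specifies no schedule). The `source.class` value is the fully qualified name implied by the WikipediaSource path linked in the diff.

```python
from typing import Dict

# Field order in a Quartz cron expression; the trailing year field is optional.
QUARTZ_FIELDS = [
    "seconds", "minutes", "hours", "day_of_month", "month", "day_of_week", "year",
]


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    properties = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        properties[key.strip()] = value.strip()
    return properties


def label_quartz_fields(expression: str) -> Dict[str, str]:
    """Pair each whitespace-separated field with its Quartz field name."""
    parts = expression.split()
    if not 6 <= len(parts) <= 7:
        raise ValueError(f"expected 6 or 7 fields, got {len(parts)}")
    return dict(zip(QUARTZ_FIELDS, parts))


# Illustrative job snippet (an assumption, not the contents of wikipedia.pull).
job_text = """
source.class=org.apache.gobblin.example.wikipedia.WikipediaSource
job.schedule=0 0/2 * * * ?
"""
job = parse_properties(job_text)
fields = label_quartz_fields(job["job.schedule"])
print(fields["minutes"])  # the step expression that yields "every two minutes"
```

Here `0/2` in the minutes field means "starting at minute 0, every 2 minutes", and `?` in the day-of-week field means "no specific value".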
