Skip to content

Commit 0cd9153

Browse files
heemin32VijayanBmend-for-github-com[bot]
authored
IP2Geo processor implementation (opensearch-project#362)
* Implement creation of ip2geo feature (#257) * Update gradle version to 7.6 (#265) Signed-off-by: Vijayan Balasubramanian <[email protected]> * Implement creation of ip2geo feature * Implementation of ip2geo datasource creation * Implementation of ip2geo processor creation Signed-off-by: Heemin Kim <[email protected]> --------- Signed-off-by: Vijayan Balasubramanian <[email protected]> Signed-off-by: Heemin Kim <[email protected]> Co-authored-by: Vijayan Balasubramanian <[email protected]> * Added unit tests with some refactoring of codes (#271) * Add Unit tests * Set cache true for search query * Remove in memory cache implementation (Two way door decision) * Relying on search cache without custom cache * Renamed datasource state from FAILED to CREATE_FAILED * Renamed class name from *Helper to *Facade * Changed updateIntervalInDays to updateInterval * Changed value type of default update_interval from TimeValue to Long * Read setting value from cluster settings directly Signed-off-by: Heemin Kim <[email protected]> * Sync from main (#280) * Update gradle version to 7.6 (#265) Signed-off-by: Vijayan Balasubramanian <[email protected]> * Exclude lombok generated code from jacoco coverage report (#268) Signed-off-by: Heemin Kim <[email protected]> * Make jacoco report to be generated faster in local (#267) Signed-off-by: Heemin Kim <[email protected]> * Update dependency org.json:json to v20230227 (#273) Co-authored-by: mend-for-github-com[bot] <50673670+mend-for-github-com[bot]@users.noreply.github.com> * Baseline owners and maintainers (#275) Signed-off-by: Vijayan Balasubramanian <[email protected]> --------- Signed-off-by: Vijayan Balasubramanian <[email protected]> Signed-off-by: Heemin Kim <[email protected]> Co-authored-by: Vijayan Balasubramanian <[email protected]> Co-authored-by: mend-for-github-com[bot] <50673670+mend-for-github-com[bot]@users.noreply.github.com> * Add datasource name validation (#281) Signed-off-by: Heemin Kim <[email protected]> * Refactoring of code (#282) 1. Change variable name from datasourceName to name 2. Change variable name from id to name 3. Added helper methods in test code Signed-off-by: Heemin Kim <[email protected]> * Change field name from md5 to sha256 (#285) Signed-off-by: Heemin Kim <[email protected]> * Implement get datasource api (#279) Signed-off-by: Heemin Kim <[email protected]> * Update index option (#284) 1. Make geodata index as hidden 2. Make geodata index as read only allow delete after creation is done 3. Refresh datasource index immediately after update Signed-off-by: Heemin Kim <[email protected]> * Make some fields in manifest file as mandatory (#289) Signed-off-by: Heemin Kim <[email protected]> * Create datasource index explicitly (#283) Signed-off-by: Heemin Kim <[email protected]> * Add wrapper class of job scheduler lock service (#290) Signed-off-by: Heemin Kim <[email protected]> * Remove all unused client attributes (#293) Signed-off-by: Heemin Kim <[email protected]> * Update copyright header (#298) Signed-off-by: Heemin Kim <[email protected]> * Run system index handling code with stashed thread context (#297) Signed-off-by: Heemin Kim <[email protected]> * Reduce lock duration and renew the lock during update (#299) Signed-off-by: Heemin Kim <[email protected]> * Implements delete datasource API (#291) Signed-off-by: Heemin Kim <[email protected]> * Set User-Agent in http request (#300) Signed-off-by: Heemin Kim <[email protected]> * Implement datasource update API (#292) Signed-off-by: Heemin Kim <[email protected]> * Refactoring test code (#302) Make buildGeoJSONFeatureProcessorConfig method to be more general Signed-off-by: Heemin Kim <[email protected]> * Add ip2geo processor integ test for failure case (#303) Signed-off-by: Heemin Kim <[email protected]> * Bug fix and refactoring of code (#305) 1. Bugfix: Ingest metadata can be null if there is no processor created 2. Refactoring: Moved private method to another class for better testing support 3. Refactoring: Set some private static final variable as public so that unit test can use it 4. Refactoring: Changed string value to static variable Signed-off-by: Heemin Kim <[email protected]> * Add integration test for Ip2GeoProcessor (#306) Signed-off-by: Heemin Kim <[email protected]> * Add ConcurrentModificationException (#308) Signed-off-by: Heemin Kim <[email protected]> * Add integration test for UpdateDatasource API (#307) Signed-off-by: Heemin Kim <[email protected]> * Bug fix on lock management and few performance improvements (#310) * Release lock before response back to caller for update/delete API * Release lock in background task for creation API * Change index settings to improve indexing performance Signed-off-by: Heemin Kim <[email protected]> * Change index setting from read_only_allow_delete to write (#311) read_only_allow_delete does not block write to an index. The disk-based shard allocator may add and remove this block automatically. Therefore, use index.blocks.write instead. Signed-off-by: Heemin Kim <[email protected]> * Fix bug in get datasource API and improve memory usage (#313) Signed-off-by: Heemin Kim <[email protected]> * Change package for Strings.hasText (#314) (#317) Signed-off-by: Heemin Kim <[email protected]> * Remove jitter and move index setting from DatasourceFacade to DatasourceExtension (#319) Signed-off-by: Heemin Kim <[email protected]> * Do not index blank value and do not enrich null property (#320) Signed-off-by: Heemin Kim <[email protected]> * Move index setting keys to constants (#321) Signed-off-by: Heemin Kim <[email protected]> * Return null index name for expired data (#322) Return null index name for expired data so that it can be deleted by clean up process. Clean up process exclude current index from deleting. Signed-off-by: Heemin Kim <[email protected]> * Add new fields in datasource (#325) Signed-off-by: Heemin Kim <[email protected]> * Delete index once it is expired (#326) Signed-off-by: Heemin Kim <[email protected]> * Add restoring event listener (#328) In the listener, we trigger a geoip data update Signed-off-by: Heemin Kim <[email protected]> * Reverse forcemerge and refresh order (#331) Otherwise, opensearch does not clear old segment files Signed-off-by: Heemin Kim <[email protected]> * Removed parameter and settings (#332) * Removed first_only parameter * Removed max_concurrency and batch_size setting first_only parameter was added as current geoip processor has it. However, the parameter have no benefit for ip2geo processor as we don't do a sequantial search for array data but use multi search. max_concurrency and batch_size setting is removed as these are only reveal internal implementation and could be a future blocker to improve performance later. Signed-off-by: Heemin Kim <[email protected]> * Add a field in datasource for current index name (#333) Signed-off-by: Heemin Kim <[email protected]> * Delete GeoIP data indices after restoring complete (#334) We don't want to use restored GeoIP data indices. Therefore we delete the indices once restoring process complete. When GeoIP metadata index is restored, we create a new GeoIP data index instead. Signed-off-by: Heemin Kim <[email protected]> * Use bool query for array form of IPs (#335) Signed-off-by: Heemin Kim <[email protected]> * Run update/delete request in a new thread (#337) This is not to block transport thread Signed-off-by: Heemin Kim <[email protected]> * Remove IP2Geo processor validation (#336) Cannot query index to get data to validate IP2Geo processor. Will add validation when we decide to store some of data in cluster state metadata. Signed-off-by: Heemin Kim <[email protected]> * Acquire lock sychronously (#339) By acquiring lock asychronously, the remaining part of the code is being run by transport thread which does not allow blocking code. We want only single update happen in a node using single thread. However, it cannot be acheived if I acquire lock asynchronously and pass the listener. Signed-off-by: Heemin Kim <[email protected]> * Added a cache to store datasource metadata (#338) Signed-off-by: Heemin Kim <[email protected]> * Changed class name and package (#341) Signed-off-by: Heemin Kim <[email protected]> * Refactoring of code (#342) 1. Changed class name from Ip2GeoCache to Ip2GeoCachedDao 2. Moved the Ip2GeoCachedDao from cache to dao package Signed-off-by: Heemin Kim <[email protected]> * Add geo data cache (#340) Signed-off-by: Heemin Kim <[email protected]> * Add cache layer to reduce GeoIp data retrieval latency (#343) Signed-off-by: Heemin Kim <[email protected]> * Use _primary in query preference and few changes (#347) 1. Use _primary preference to get datasource metadata so that it can read the latest data. RefreshPolicy.IMMEDIATE won't refresh replica shards immediately according to #346 2. Update datasource metadata index mapping 3. Move batch size from static value to setting Signed-off-by: Heemin Kim <[email protected]> * Wait until GeoIP data to be replicated to all data nodes (#348) Signed-off-by: Heemin Kim <[email protected]> * Update packages according to a change in OpenSearch core (opensearch-project#354) * Update packages according to a change in OpenSearch core Signed-off-by: Heemin Kim <[email protected]> * Update packages according to a change in OpenSearch core (opensearch-project#353) Signed-off-by: Heemin Kim <[email protected]> --------- Signed-off-by: Heemin Kim <[email protected]> --------- Signed-off-by: Vijayan Balasubramanian <[email protected]> Signed-off-by: Heemin Kim <[email protected]> Co-authored-by: Vijayan Balasubramanian <[email protected]> Co-authored-by: mend-for-github-com[bot] <50673670+mend-for-github-com[bot]@users.noreply.github.com>
1 parent 806755f commit 0cd9153

File tree

96 files changed

+10419
-25
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

96 files changed

+10419
-25
lines changed

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -17,6 +17,9 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
1717
### Enhancements
1818
### Bug Fixes
1919
### Infrastructure
20+
* Make jacoco report to be generated faster in local ([#267](https://github.com/opensearch-project/geospatial/pull/267))
21+
* Exclude lombok generated code from jacoco coverage report ([#268](https://github.com/opensearch-project/geospatial/pull/268))
2022
### Documentation
2123
### Maintenance
24+
* Change package for Strings.hasText ([#314](https://github.com/opensearch-project/geospatial/pull/314))
2225
### Refactoring

build.gradle

Lines changed: 53 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,8 @@
55

66
import org.opensearch.gradle.test.RestIntegTestTask
77

8+
import java.util.concurrent.Callable
9+
810
apply plugin: 'java'
911
apply plugin: 'idea'
1012
apply plugin: 'opensearch.opensearchplugin'
@@ -35,6 +37,7 @@ opensearchplugin {
3537
classname "${projectPath}.${pathToPlugin}.${pluginClassName}"
3638
licenseFile rootProject.file('LICENSE')
3739
noticeFile rootProject.file('NOTICE')
40+
extendedPlugins = ['opensearch-job-scheduler']
3841
}
3942

4043
// This requires an additional Jar not published as part of build-tools
@@ -142,6 +145,10 @@ publishing {
142145
}
143146

144147

148+
configurations {
149+
zipArchive
150+
}
151+
145152
//****************************************************************************/
146153
// Dependencies
147154
//****************************************************************************/
@@ -154,6 +161,9 @@ dependencies {
154161
implementation "org.apache.commons:commons-lang3:3.12.0"
155162
implementation "org.locationtech.spatial4j:spatial4j:${versions.spatial4j}"
156163
implementation "org.locationtech.jts:jts-core:${versions.jts}"
164+
implementation "org.apache.commons:commons-csv:1.10.0"
165+
zipArchive group: 'org.opensearch.plugin', name:'opensearch-job-scheduler', version: "${opensearch_build}"
166+
compileOnly "org.opensearch:opensearch-job-scheduler-spi:${opensearch_build}"
157167
}
158168

159169
licenseHeaders.enabled = true
@@ -206,8 +216,6 @@ integTest {
206216
testClusters.integTest {
207217
testDistribution = "ARCHIVE"
208218

209-
// This installs our plugin into the testClusters
210-
plugin(project.tasks.bundlePlugin.archiveFile)
211219
// Cluster shrink exception thrown if we try to set numberOfNodes to 1, so only apply if > 1
212220
if (_numNodes > 1) numberOfNodes = _numNodes
213221
// When running integration tests it doesn't forward the --debug-jvm to the cluster anymore
@@ -220,6 +228,49 @@ testClusters.integTest {
220228
debugPort += 1
221229
}
222230
}
231+
232+
// This installs our plugin into the testClusters
233+
plugin(project.tasks.bundlePlugin.archiveFile)
234+
plugin(provider(new Callable<RegularFile>(){
235+
@Override
236+
RegularFile call() throws Exception {
237+
return new RegularFile() {
238+
@Override
239+
File getAsFile() {
240+
return configurations.zipArchive.asFileTree.getSingleFile()
241+
}
242+
}
243+
}
244+
}))
245+
246+
// opensearch-geospatial plugin is being added to the list of plugins for the testCluster during build before
247+
// the opensearch-job-scheduler plugin, which is causing build failures. From the stack trace, this looks like a bug.
248+
//
249+
// Exception in thread "main" java.lang.IllegalArgumentException: Missing plugin [opensearch-job-scheduler], dependency of [opensearch-geospatial]
250+
// at org.opensearch.plugins.PluginsService.addSortedBundle(PluginsService.java:515)
251+
//
252+
// A temporary hack is to reorder the plugins list after evaluation but prior to task execution when the plugins are installed.
253+
// See https://github.com/opensearch-project/anomaly-detection/blob/fd547014fdde5114bbc9c8e49fe7aaa37eb6e793/build.gradle#L400-L422
254+
nodes.each { node ->
255+
def plugins = node.plugins
256+
def firstPlugin = plugins.get(0)
257+
plugins.remove(0)
258+
plugins.add(firstPlugin)
259+
}
260+
}
261+
262+
testClusters.yamlRestTest {
263+
plugin(provider(new Callable<RegularFile>(){
264+
@Override
265+
RegularFile call() throws Exception {
266+
return new RegularFile() {
267+
@Override
268+
File getAsFile() {
269+
return configurations.zipArchive.asFileTree.getSingleFile()
270+
}
271+
}
272+
}
273+
}))
223274
}
224275

225276
run {
Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
/*
2+
* SPDX-License-Identifier: Apache-2.0
3+
*
4+
* The OpenSearch Contributors require contributions made to
5+
* this file be licensed under the Apache-2.0 license or a
6+
* compatible open source license.
7+
*/
8+
9+
package org.opensearch.geospatial.annotation;
10+
11+
public @interface VisibleForTesting {
12+
}
Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,18 @@
1+
/*
2+
* Copyright OpenSearch Contributors
3+
* SPDX-License-Identifier: Apache-2.0
4+
*/
5+
6+
package org.opensearch.geospatial.constants;
7+
8+
/**
9+
* Collection of keys for index setting
10+
*/
11+
public class IndexSetting {
12+
public static final String NUMBER_OF_SHARDS = "index.number_of_shards";
13+
public static final String NUMBER_OF_REPLICAS = "index.number_of_replicas";
14+
public static final String REFRESH_INTERVAL = "index.refresh_interval";
15+
public static final String AUTO_EXPAND_REPLICAS = "index.auto_expand_replicas";
16+
public static final String HIDDEN = "index.hidden";
17+
public static final String BLOCKS_WRITE = "index.blocks.write";
18+
}
Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,37 @@
1+
/*
2+
* Copyright OpenSearch Contributors
3+
* SPDX-License-Identifier: Apache-2.0
4+
*/
5+
6+
package org.opensearch.geospatial.exceptions;
7+
8+
import java.io.IOException;
9+
10+
import org.opensearch.OpenSearchException;
11+
import org.opensearch.core.common.io.stream.StreamInput;
12+
import org.opensearch.core.rest.RestStatus;
13+
14+
/**
15+
* General ConcurrentModificationException corresponding to the {@link RestStatus#BAD_REQUEST} status code
16+
*
17+
* The exception is thrown when multiple mutation API is called for a same resource at the same time
18+
*/
19+
public class ConcurrentModificationException extends OpenSearchException {
20+
21+
public ConcurrentModificationException(String msg, Object... args) {
22+
super(msg, args);
23+
}
24+
25+
public ConcurrentModificationException(String msg, Throwable cause, Object... args) {
26+
super(msg, cause, args);
27+
}
28+
29+
public ConcurrentModificationException(StreamInput in) throws IOException {
30+
super(in);
31+
}
32+
33+
@Override
34+
public final RestStatus status() {
35+
return RestStatus.BAD_REQUEST;
36+
}
37+
}
Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,38 @@
1+
/*
2+
* Copyright OpenSearch Contributors
3+
* SPDX-License-Identifier: Apache-2.0
4+
*/
5+
6+
package org.opensearch.geospatial.exceptions;
7+
8+
import java.io.IOException;
9+
10+
import org.opensearch.OpenSearchException;
11+
import org.opensearch.core.common.io.stream.StreamInput;
12+
import org.opensearch.core.rest.RestStatus;
13+
14+
/**
15+
* IncompatibleDatasourceException corresponding to the {@link RestStatus#BAD_REQUEST} status code
16+
*
17+
* The exception is thrown when a user tries to update datasource with new endpoint which is not compatible
18+
* with current datasource
19+
*/
20+
public class IncompatibleDatasourceException extends OpenSearchException {
21+
22+
public IncompatibleDatasourceException(String msg, Object... args) {
23+
super(msg, args);
24+
}
25+
26+
public IncompatibleDatasourceException(String msg, Throwable cause, Object... args) {
27+
super(msg, cause, args);
28+
}
29+
30+
public IncompatibleDatasourceException(StreamInput in) throws IOException {
31+
super(in);
32+
}
33+
34+
@Override
35+
public final RestStatus status() {
36+
return RestStatus.BAD_REQUEST;
37+
}
38+
}
Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,35 @@
1+
/*
2+
* Copyright OpenSearch Contributors
3+
* SPDX-License-Identifier: Apache-2.0
4+
*/
5+
6+
package org.opensearch.geospatial.exceptions;
7+
8+
import java.io.IOException;
9+
10+
import org.opensearch.OpenSearchException;
11+
import org.opensearch.core.common.io.stream.StreamInput;
12+
import org.opensearch.core.rest.RestStatus;
13+
14+
/**
15+
* Generic ResourceInUseException corresponding to the {@link RestStatus#BAD_REQUEST} status code
16+
*/
17+
public class ResourceInUseException extends OpenSearchException {
18+
19+
public ResourceInUseException(String msg, Object... args) {
20+
super(msg, args);
21+
}
22+
23+
public ResourceInUseException(String msg, Throwable cause, Object... args) {
24+
super(msg, cause, args);
25+
}
26+
27+
public ResourceInUseException(StreamInput in) throws IOException {
28+
super(in);
29+
}
30+
31+
@Override
32+
public final RestStatus status() {
33+
return RestStatus.BAD_REQUEST;
34+
}
35+
}
Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,27 @@
1+
/*
2+
* Copyright OpenSearch Contributors
3+
* SPDX-License-Identifier: Apache-2.0
4+
*/
5+
6+
package org.opensearch.geospatial.ip2geo.action;
7+
8+
import org.opensearch.action.ActionType;
9+
import org.opensearch.action.support.master.AcknowledgedResponse;
10+
11+
/**
12+
* Ip2Geo datasource delete action
13+
*/
14+
public class DeleteDatasourceAction extends ActionType<AcknowledgedResponse> {
15+
/**
16+
* Delete datasource action instance
17+
*/
18+
public static final DeleteDatasourceAction INSTANCE = new DeleteDatasourceAction();
19+
/**
20+
* Delete datasource action name
21+
*/
22+
public static final String NAME = "cluster:admin/geospatial/datasource/delete";
23+
24+
private DeleteDatasourceAction() {
25+
super(NAME, AcknowledgedResponse::new);
26+
}
27+
}
Lines changed: 58 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,58 @@
1+
/*
2+
* Copyright OpenSearch Contributors
3+
* SPDX-License-Identifier: Apache-2.0
4+
*/
5+
6+
package org.opensearch.geospatial.ip2geo.action;
7+
8+
import java.io.IOException;
9+
10+
import lombok.AllArgsConstructor;
11+
import lombok.Getter;
12+
import lombok.Setter;
13+
14+
import org.opensearch.action.ActionRequest;
15+
import org.opensearch.action.ActionRequestValidationException;
16+
import org.opensearch.core.common.io.stream.StreamInput;
17+
import org.opensearch.core.common.io.stream.StreamOutput;
18+
19+
/**
20+
* GeoIP datasource delete request
21+
*/
22+
@Getter
23+
@Setter
24+
@AllArgsConstructor
25+
public class DeleteDatasourceRequest extends ActionRequest {
26+
/**
27+
* @param name the datasource name
28+
* @return the datasource name
29+
*/
30+
private String name;
31+
32+
/**
33+
* Constructor
34+
*
35+
* @param in the stream input
36+
* @throws IOException IOException
37+
*/
38+
public DeleteDatasourceRequest(final StreamInput in) throws IOException {
39+
super(in);
40+
this.name = in.readString();
41+
}
42+
43+
@Override
44+
public ActionRequestValidationException validate() {
45+
ActionRequestValidationException errors = null;
46+
if (name == null || name.isBlank()) {
47+
errors = new ActionRequestValidationException();
48+
errors.addValidationError("Datasource name should not be empty");
49+
}
50+
return errors;
51+
}
52+
53+
@Override
54+
public void writeTo(final StreamOutput out) throws IOException {
55+
super.writeTo(out);
56+
out.writeString(name);
57+
}
58+
}

0 commit comments

Comments
 (0)