Bulk API #4

Open
alolis opened this issue Nov 4, 2014 · 13 comments

Comments

@alolis
Contributor

alolis commented Nov 4, 2014

Any plans for supporting ES Bulk API?
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-bulk.html

@will-iam
Contributor

will-iam commented Nov 5, 2014

Indeed, this is mandatory for implementing every feature. I'll see what I can do, unless you want to do it?

@alolis
Contributor Author

alolis commented Nov 5, 2014

At the moment I am in the middle of another project that I need to deliver as soon as possible, so I do not have the time right now. I might be able to implement it once I am done with my current task, if you haven't done it by then.

@will-iam
Contributor

will-iam commented Nov 5, 2014

Ok, good luck with your project. I'll try to do it soon.

@alolis
Contributor Author

alolis commented Nov 27, 2014

Well, I did have a go at it. Not sure if you are going to like it, my C++ is a bit rusty :)

It does work for my needs but it would be nice if we manage to make something more official.

#5

@alolis
Contributor Author

alolis commented Nov 27, 2014

Another useful thing would be a Json::Object.validate(). Maybe I will try to add it when I get some free time again.

@will-iam
Contributor

What would be the purpose? To check whether the string respects the JSON format?

2014-11-27 9:44 GMT+01:00 Alexander Lolis notifications@github.com:

Another useful thing would be a Json::Object.validate(). Maybe I will try
to add it when I get some free time again.



@alolis
Contributor Author

alolis commented Nov 27, 2014

Yes exactly. It probably makes more sense to use it for the bulk operation to validate the json string before sending it.
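
For illustration, a pre-send check of the kind discussed here might look like the minimal sketch below. The function names `looksLikeJsonObject` and `looksLikeBulkBody` are hypothetical and not part of the library, and this is only a structural check (balanced braces and brackets outside string literals, one object per line, trailing newline), not a full JSON validator.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if `line` starts with '{', ends with '}', and its
// braces and brackets are balanced outside string literals.
// This is a structural check only, not a full JSON parser.
bool looksLikeJsonObject(const std::string& line) {
    if (line.size() < 2 || line.front() != '{' || line.back() != '}') return false;
    std::string stack;      // currently open '{' and '[' characters
    bool inString = false;  // inside a "..." literal
    bool escaped = false;   // previous character was a backslash inside a literal
    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (inString) {
            if (escaped) escaped = false;
            else if (c == '\\') escaped = true;
            else if (c == '"') inString = false;
            continue;
        }
        switch (c) {
            case '"': inString = true; break;
            case '{': case '[': stack += c; break;
            case '}':
                if (stack.empty() || stack.back() != '{') return false;
                stack.pop_back();
                // The outermost object may only close on the last character.
                if (stack.empty() && i + 1 != line.size()) return false;
                break;
            case ']':
                if (stack.empty() || stack.back() != '[') return false;
                stack.pop_back();
                break;
            default: break;
        }
    }
    return !inString && stack.empty();
}

// Hypothetical helper: a bulk body must be non-empty, end with '\n', and
// consist only of lines that pass looksLikeJsonObject.
bool looksLikeBulkBody(const std::string& body) {
    if (body.empty() || body.back() != '\n') return false;
    std::size_t start = 0;
    while (start < body.size()) {
        const std::size_t end = body.find('\n', start);  // always found: body ends with '\n'
        if (!looksLikeJsonObject(body.substr(start, end - start))) return false;
        start = end + 1;
    }
    return true;
}
```

A caller could run `looksLikeBulkBody(body)` before handing the string to `bulk()` and skip the HTTP round trip when it returns false. Lines with trailing spaces or `\r\n` endings are rejected by this sketch.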

@will-iam
Contributor

Ok, indeed it would be useful.

2014-11-27 10:41 GMT+01:00 Alexander Lolis notifications@github.com:

Yes exactly. It probably makes more sense to use it for the bulk operation
to validate the json string before sending it.



@will-iam will-iam closed this as completed Dec 2, 2014
@will-iam
Contributor

will-iam commented Dec 9, 2014

Hi Alolis,
On this fork, https://github.com/tawnkramer/cpp-elasticsearch, Tawn is working on the migration to Windows. He would also like to discuss the return value of ElasticSearch::bulk.
Here is his question:
It returns a bool now that tests the return of the http request against 200. The http request may succeed, but there may be errors. I found that if

jResult.member("errors") && jResult.getValue("errors").getBoolean() == true

then there were problems. Perhaps we could test that as well before returning true?

@will-iam will-iam reopened this Dec 9, 2014
@alolis
Contributor Author

alolis commented Dec 10, 2014

Hi William,

I actually did that on purpose, because errors in the jResult do not necessarily mean that the function failed. My thought was that it should be up to the caller to inspect the jResult further, or not, and to decide whether the call was a failure or a success depending on their use case.

For example in my case, I do not consider the bulk() call a failure if there is a duplicate key error upon bulk insert; it's something I expect under specific conditions. The important part is the http status code and it's nice to have consistency since all other functions within the library check the http status code as well.
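
One way to picture this split between transport success and caller-defined success is the sketch below. `BulkOutcome` and `isAcceptable` are hypothetical names, not types or functions from the library, and the error labels are placeholders supplied by the caller rather than actual Elasticsearch error types.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical summary of one bulk call; not a type from the library.
struct BulkOutcome {
    bool httpOk;                          // the HTTP status check passed
    std::vector<std::string> itemErrors;  // one label per failed item, empty if none
};

// The call counts as a success when the request itself went through and every
// item error, if any, is one the caller has declared tolerable.
bool isAcceptable(const BulkOutcome& outcome,
                  const std::function<bool(const std::string&)>& tolerated) {
    if (!outcome.httpOk) return false;
    for (const std::string& err : outcome.itemErrors) {
        if (!tolerated(err)) return false;
    }
    return true;
}
```

A caller that expects duplicate keys on insert would pass a predicate accepting only that label, while a stricter caller would pass one that always returns false.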

My 2 cents :)

A.

@tawnkramer

Hi Alolis,

I see your point about success in this case meaning different things. I'm still learning about elasticsearch, so what constitutes a failure is not yet clear to me. Bulk does return the json object, so it's not hiding any results.

Some other API calls, such as ElasticSearch::index, check the JSON return value for "created". But they also don't return the JSON result object.

On a separate note, I think there's some room to improve the speed of bulk operations. In my case, creating and stringifying 32K JSON objects takes about 30 seconds on a fast machine with plenty of RAM. The objects themselves are simple 4-member records. When I wrote a routine to create the string directly from the objects using some ad-hoc code, it took less than 1 second. This does not include the HTTP request portion.

I spent some time experimenting, but I'm not sure exactly how best to improve the speed. If I arrive at a solution, I will share it.
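
As a rough illustration of the direct string approach described above (not Tawn's actual routine, which is not shown in this thread), a bulk body can be assembled by appending to a single `std::string`. The `Record` struct, its field names, and the helper names are assumptions made for this sketch.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical 4-member record standing in for the objects described above.
struct Record {
    std::string id;
    std::string user;
    std::string event;
    int count;
};

// Append `s` to `out` as a JSON string literal, escaping quotes, backslashes
// and control characters.
void appendJsonString(std::string& out, const std::string& s) {
    out += '"';
    for (unsigned char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {
                    char buf[8];
                    std::snprintf(buf, sizeof buf, "\\u%04x", c);
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    out += '"';
}

// Build a newline-delimited bulk body: one action line and one source line
// per record, each terminated by '\n'.
std::string buildBulkBody(const std::string& index, const std::vector<Record>& records) {
    std::string body;
    body.reserve(records.size() * 128);  // rough guess to limit reallocations
    for (const Record& r : records) {
        body += "{\"index\":{\"_index\":";
        appendJsonString(body, index);
        body += ",\"_id\":";
        appendJsonString(body, r.id);
        body += "}}\n{\"user\":";
        appendJsonString(body, r.user);
        body += ",\"event\":";
        appendJsonString(body, r.event);
        body += ",\"count\":";
        body += std::to_string(r.count);
        body += "}\n";
    }
    return body;
}
```

The resulting string could then be passed to `bulk()` via `c_str()`. The trade-off is that the field layout is hard-coded, so this only suits records whose shape is known in advance.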

thanks,
Tawn

@Mrliu88888888

I would like to know how to use the Bulk API. Could you please give me a demo, @alolis @tawnkramer @will-iam? Thank you.

@Mrliu88888888

```cpp
#include <iostream>
#include <string>
#include "elasticsearch/elasticsearch.h"  // assumed header path; adjust to your checkout

int main(int, char**) {
    // Instantiate the elasticsearch client.
    ElasticSearch es("192.168.0.120:9200");

    // Bulk body: one action line followed by one source line.
    // Every line, including the last, must end with '\n', and the
    // action line names the index it targets.
    std::string body =
        "{ \"index\" : { \"_index\" : \"test\", \"_type\" : \"_doc\", \"_id\" : \"1\" } }\n"
        "{ \"field1\" : \"value1\" }\n";
    std::cout << body;

    Json::Object result;
    if (!es.bulk(body.c_str(), result))
        std::cerr << "bulk request failed" << std::endl;
    return 0;
}
```
My version is elasticsearch-7.6.2
