Document N-dimensional arrays #160

Open
wants to merge 45 commits into
base: main
8fdb8c3
Improve docs for other types
mtopolnik Mar 28, 2025
1f68487
Add docs for ARRAY
mtopolnik Mar 28, 2025
915c077
Merge branch 'main' into mt_array
mtopolnik Mar 28, 2025
7f03a57
Don't use footnotes
mtopolnik Mar 28, 2025
abf268a
Add IPv4 limitation to ILP Limitations
mtopolnik Mar 31, 2025
0741ba6
Generally improve docs
mtopolnik Mar 31, 2025
3dcf965
Move array docs to Concepts
mtopolnik Mar 31, 2025
91f1a53
Remove outdated note
mtopolnik Apr 9, 2025
1aad93c
Merge branch 'main' into mt_array
mtopolnik May 13, 2025
3b68314
Document dim_length() and out-of-bounds access
mtopolnik May 13, 2025
a9f2ed5
Merge branch 'main' into mt_array
mtopolnik May 20, 2025
3f7f7f7
New page for array functions
mtopolnik May 20, 2025
77ad0e9
Update sidebars
mtopolnik May 20, 2025
1e4ab1e
broken link fixed
jerrinot May 21, 2025
cf1d39d
Proper SQL examples with results in Array Concept
mtopolnik May 21, 2025
3d5600c
execute many
jerrinot May 21, 2025
589c2ac
inserting arrays with asyncpg
jerrinot May 21, 2025
edeef25
Fix parsing errors, improve
mtopolnik May 21, 2025
bea799d
Improve examples in arry functions
mtopolnik May 21, 2025
13acbbe
Document protocol version config in ILP
mtopolnik May 21, 2025
bf01db4
Improve rendering of examples
mtopolnik May 21, 2025
20ae7ba
Demote subheadings in Aggregate Functions
mtopolnik May 21, 2025
98950c2
Touch up Finance page
mtopolnik May 21, 2025
5eb4f66
better wording
jerrinot May 23, 2025
263542b
better wording
jerrinot May 23, 2025
6f4abb2
links to anchors
jerrinot May 23, 2025
9445872
Merge branch 'main' into mt_array
mtopolnik May 23, 2025
00d8d6f
Add ARRAY literal section in Concepts
mtopolnik May 23, 2025
2041417
inserting arrays with psycopg3
jerrinot May 23, 2025
16431e2
explain binary array transfers with psycopg3
jerrinot May 23, 2025
ee55757
binary for all psycopg3 transfers
jerrinot May 26, 2025
24183bb
asyncpg and timezones explained
jerrinot May 26, 2025
774ff8e
inserting arrays via pgwire
jerrinot May 26, 2025
a1d7584
Add note on limited Beta support
mtopolnik May 26, 2025
5cc187d
Auto-style on java_ilp.md
mtopolnik May 26, 2025
97a3fa4
Document array ingestion for Java ILP client
mtopolnik May 26, 2025
5b6bcdc
Touch up explanation of "transpose"
mtopolnik May 27, 2025
0f0bd91
Touch up performance explanation
mtopolnik May 27, 2025
f6f8cdc
Rephrase explanation of storing the array
mtopolnik May 27, 2025
454b3a4
Move sample array to higher level
mtopolnik May 27, 2025
df4a9a4
document `ndarray` and `protocol_version` in java and rust client.
kafka1991 May 28, 2025
4a02f24
add c/c++/rust array example.
kafka1991 May 28, 2025
f3d0402
Improve ILP config page
mtopolnik May 28, 2025
c2e6b79
Improve Rust ILP page
mtopolnik May 28, 2025
7a4659f
fix c/java protocol_version part.
kafka1991 May 29, 2025
170 changes: 170 additions & 0 deletions documentation/clients/ingest-c-and-cpp.md
@@ -188,6 +188,62 @@ using the original event timestamps when ingesting data into QuestDB. Using the
current timestamp will hinder the ability to deduplicate rows which is
[important for exactly-once processing](/docs/reference/api/ilp/overview/#exactly-once-delivery-vs-at-least-once-delivery).

### Array Insertion

Currently, the C++ interface supports `std::array` for data, while the C interface offers a lower-level and more flexible option:

```cpp
#include <questdb/ingress/line_sender.hpp>
#include <array>
#include <cstdint>
#include <iostream>
#include <vector>

using namespace std::literals::string_view_literals;
using namespace questdb::ingress::literals;

int main()
{
    try
    {
        // Arrays require protocol version 2. TCP does not negotiate the
        // version, so it is set explicitly. 9009 is the default ILP/TCP port.
        auto sender = questdb::ingress::line_sender::from_conf(
            "tcp::addr=127.0.0.1:9009;protocol_version=2;");
        const auto table_name = "cpp_market_orders_byte_strides"_tn;
        const auto symbol_col = "symbol"_cn;
        const auto book_col = "order_book"_cn;
        size_t rank = 3;
        // A 2 x 3 x 2 array of doubles stored in row-major order.
        std::vector<uintptr_t> shape{2, 3, 2};
        // Strides in bytes: 6, 2 and 1 elements of 8 bytes each.
        std::vector<intptr_t> strides{48, 16, 8};
        std::array<double, 12> arr_data = {
            48123.5,
            2.4,
            48124.0,
            1.8,
            48124.5,
            0.9,
            48122.5,
            3.1,
            48122.0,
            2.7,
            48121.5,
            4.3};

        questdb::ingress::line_sender_buffer buffer = sender.new_buffer();
        buffer.table(table_name)
            .symbol(symbol_col, "BTC-USD"_utf8)
            .column<true>(book_col, rank, shape, strides, arr_data)
            .at(questdb::ingress::timestamp_nanos::now());
        sender.flush(buffer);
        return 0;
    }
    catch (const questdb::ingress::line_sender_error& err)
    {
        std::cerr << "[ERROR] " << err.what() << std::endl;
        return 1;
    }
}
```

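The `true` template argument tells `column` that `strides` is expressed in bytes. If your strides are expressed in element counts instead (`{6, 2, 1}` for this array), call `column<false>(book_col, 3, shape, strides, arr_data)` (note the `false` template argument).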
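
As a rough illustration of how the two stride conventions relate, the sketch below derives row-major strides from a shape. `row_major_strides` is a hypothetical helper written for this example, not part of the QuestDB client API, and it assumes the data is laid out contiguously in row-major order.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the QuestDB client): computes row-major
// strides for `shape`. Pass elem_size = sizeof(double) to get strides in
// bytes, or elem_size = 1 to get strides counted in elements.
std::vector<std::intptr_t> row_major_strides(
    const std::vector<std::uintptr_t>& shape, std::size_t elem_size)
{
    std::vector<std::intptr_t> strides(shape.size());
    std::intptr_t step = static_cast<std::intptr_t>(elem_size);
    // Walk from the innermost dimension outwards: each stride is the
    // element size times the product of all inner dimension lengths.
    for (std::size_t i = shape.size(); i-- > 0;)
    {
        strides[i] = step;
        step *= static_cast<std::intptr_t>(shape[i]);
    }
    return strides;
}
```

For the `{2, 3, 2}` shape used above, `row_major_strides(shape, sizeof(double))` gives the byte strides and `row_major_strides(shape, 1)` gives the element strides.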

## C

:::note
@@ -424,6 +480,99 @@ original event timestamps when ingesting data into QuestDB. Using the current
timestamp will hinder the ability to deduplicate rows which is
[important for exactly-once processing](/docs/reference/api/ilp/overview/#exactly-once-delivery-vs-at-least-once-delivery).

### Array Insertion

```c
#include <questdb/ingress/line_sender.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main()
{
    line_sender_error* err = NULL;
    line_sender* sender = NULL;
    line_sender_buffer* buffer = NULL;

    // Assumes a local QuestDB instance with ILP over TCP on the default
    // port 9009. Arrays require protocol_version=2, which TCP does not
    // negotiate automatically.
    const char* conf_str = "tcp::addr=localhost:9009;protocol_version=2;";

    line_sender_utf8 conf_str_utf8 = {0, NULL};
    if (!line_sender_utf8_init(
            &conf_str_utf8, strlen(conf_str), conf_str, &err))
        goto on_error;

    sender = line_sender_from_conf(conf_str_utf8, &err);
    if (!sender)
        goto on_error;

    buffer = line_sender_buffer_new_for_sender(sender);
    line_sender_buffer_reserve(buffer, 64 * 1024);

    line_sender_table_name table_name = QDB_TABLE_NAME_LITERAL("market_orders_byte_strides");
    line_sender_column_name symbol_col = QDB_COLUMN_NAME_LITERAL("symbol");
    line_sender_column_name book_col = QDB_COLUMN_NAME_LITERAL("order_book");

    if (!line_sender_buffer_table(buffer, table_name, &err))
        goto on_error;

    line_sender_utf8 symbol_val = QDB_UTF8_LITERAL("BTC-USD");
    if (!line_sender_buffer_symbol(buffer, symbol_col, symbol_val, &err))
        goto on_error;

    // A 2 x 3 x 2 array of doubles in row-major order, with strides
    // expressed in bytes (6, 2 and 1 elements of 8 bytes each).
    size_t array_rank = 3;
    uintptr_t array_shape[] = {2, 3, 2};
    intptr_t array_strides[] = {48, 16, 8};

    double array_data[] = {
        48123.5,
        2.4,
        48124.0,
        1.8,
        48124.5,
        0.9,
        48122.5,
        3.1,
        48122.0,
        2.7,
        48121.5,
        4.3};

    if (!line_sender_buffer_column_f64_arr_byte_strides(
            buffer,
            book_col,
            array_rank,
            array_shape,
            array_strides,
            (const uint8_t*)array_data,
            sizeof(array_data),
            &err))
        goto on_error;

    if (!line_sender_buffer_at_nanos(buffer, line_sender_now_nanos(), &err))
        goto on_error;

    if (!line_sender_flush(sender, buffer, &err))
        goto on_error;

    // Release the buffer and the sender on the success path as well.
    line_sender_buffer_free(buffer);
    line_sender_close(sender);
    return 0;

on_error:;
    size_t err_len = 0;
    const char* err_msg = line_sender_error_msg(err, &err_len);
    fprintf(stderr, "Error: %.*s\n", (int)err_len, err_msg);
    line_sender_error_free(err);
    line_sender_buffer_free(buffer);
    line_sender_close(sender);
    return 1;
}
```

The strides in this example are expressed in bytes. If your strides are expressed in element counts instead (`{6, 2, 1}` for this array), call `line_sender_buffer_column_f64_arr_elem_strides`.
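
To make the meaning of byte strides concrete, here is a minimal sketch of how a single element is located in the flat data buffer. `element_at` is a hypothetical helper introduced only for illustration; it is not part of the QuestDB client, and it assumes the caller passes an in-bounds index.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of the QuestDB client): reads the element
// at `index` from a flat buffer of doubles using per-dimension byte strides.
// The caller is assumed to pass an in-bounds index of length `rank`.
double element_at(
    const double* data,
    const std::intptr_t* byte_strides,
    const std::size_t* index,
    std::size_t rank)
{
    // The byte offset is the dot product of the index and the strides.
    std::intptr_t offset = 0;
    for (std::size_t d = 0; d < rank; ++d)
        offset += byte_strides[d] * static_cast<std::intptr_t>(index[d]);

    // Copy the value out of the byte-addressed location.
    double value;
    std::memcpy(
        &value,
        reinterpret_cast<const unsigned char*>(data) + offset,
        sizeof(value));
    return value;
}
```

With strides `{48, 16, 8}`, index `{1, 0, 1}` maps to byte offset 56, the eighth value in the flat buffer. Dividing each byte stride by `sizeof(double)` yields the equivalent element strides.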

## Other Considerations for both C and C++

### Configuration options
@@ -473,6 +622,27 @@ demonstrated on the examples in this document.

This call will return false if the flush wouldn't be data-transactional.

### Protocol Version

To enhance data ingestion performance, QuestDB introduced an upgrade to the
text-based InfluxDB Line Protocol which encodes arrays and f64 values in binary
form. Arrays are supported only in this upgraded protocol version.

You can select the protocol version with the `protocol_version` setting in the
configuration string.

HTTP transport automatically negotiates the protocol version by default. To
avoid the slight latency cost at connection time, you can set the version
explicitly with `protocol_version=1;` or `protocol_version=2;`.

TCP transport does not negotiate the protocol version and uses version 1 by
default. You must explicitly set `protocol_version=2;` in order to ingest
arrays, as in this example:

```text
tcp::addr=localhost:9009;protocol_version=2;
```
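
The configuration strings described above can be assembled programmatically. The sketch below is one possible way to do it; `make_conf` is a hypothetical helper, not part of the client library, and it only concatenates the `addr` and `protocol_version` settings mentioned in this section.

```cpp
#include <string>

// Hypothetical helper (not part of the QuestDB client): builds a
// configuration string from a transport ("http" or "tcp"), an address,
// and an optional protocol version. Pass 0 to omit `protocol_version`,
// which leaves HTTP to negotiate it and TCP to fall back to version 1.
std::string make_conf(
    const std::string& transport,
    const std::string& addr,
    int protocol_version)
{
    std::string conf = transport + "::addr=" + addr + ";";
    if (protocol_version != 0)
        conf += "protocol_version=" + std::to_string(protocol_version) + ";";
    return conf;
}
```

The resulting string can be passed to `line_sender::from_conf` in C++. For array ingestion over TCP, the version argument would need to be 2.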

## Next Steps

Please refer to the [ILP overview](/docs/reference/api/ilp/overview) for details
37 changes: 37 additions & 0 deletions documentation/clients/ingest-python.md
@@ -233,6 +233,43 @@ with Sender.from_conf(conf) as sender:
sender.dataframe(df, table_name='trades', at='timestamp')
```

## Insert numpy.ndarray

- Direct Array Insertion:

```python
from questdb.ingress import Sender, TimestampNanos
import numpy as np

# 1-D array of float64 values
arr1 = np.array([1.2345678901234567, 2.3456789012345678], dtype=np.float64)
# 2-D array with shape (2, 3)
arr2 = np.arange(6, dtype=np.float64).reshape(2, 3)
# Non-contiguous (strided) view: every other column of arr2
arr3 = arr2[:, ::2]

conf = 'http::addr=localhost:9000;'
with Sender.from_conf(conf) as sender:
    sender.row(
        'tango',
        columns={'arr1': arr1, 'arr2': arr2, 'arr3': arr3},
        at=TimestampNanos.now())
    sender.flush()
```

- DataFrame Insertion:

```python
import pandas as pd
from questdb.ingress import Sender
import numpy as np

# Each row holds one float64 array plus its designated timestamp.
df = pd.DataFrame({
    'array': [np.array([1.0], np.float64), np.array([2.0], np.float64)],
    'timestamp': pd.to_datetime(['2022-03-08T18:03:57.609765Z', '2022-03-08T18:03:57.710419Z'])})

conf = 'http::addr=localhost:9000;'
with Sender.from_conf(conf) as sender:
    sender.dataframe(df, table_name='tango', at='timestamp')
```

## Configuration options

The minimal configuration string needs to have the protocol, host, and port, as