Skip to content

Commit

Permalink
📝 Updated docs
Browse files Browse the repository at this point in the history
  • Loading branch information
nikhilbadyal committed Dec 7, 2024
1 parent 11789e3 commit 50c2767
Showing 1 changed file with 147 additions and 33 deletions.
180 changes: 147 additions & 33 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,8 +1,7 @@
# EsXport
[![codecov](https://codecov.io/gh/nikhilbadyal/esxport/graph/badge.svg?token=zaoNlW2YXq)](https://codecov.io/gh/nikhilbadyal/esxport)

An adept Python CLI utility & module designed for querying Elasticsearch and exporting result as a CSV file.

A Python-based CLI utility and module designed for querying Elasticsearch and exporting results as a CSV file.

Requirements
------------
Expand All @@ -24,40 +23,155 @@ pip install "esxport[dev]"
Usage
-----

```bash
esxport --help
```
### CLI Usage

Run `esxport --help` for detailed information on available options:


Arguments
OPTIONS
---------
```text
Usage: esxport [OPTIONS]
Options:
-q, --query JSON Query string in Query DSL syntax.
[required]
-o, --output-file PATH CSV file location. [required]
-i, --index-prefixes TEXT Index name prefix(es). [required]
-u, --url URL Elasticsearch host URL. [default:
https://localhost:9200]
-U, --user TEXT Elasticsearch basic authentication user.
[default: elastic]
-p, --password TEXT Elasticsearch basic authentication password.
[required]
-f, --fields TEXT List of _source fields to present be in
output. [default: _all]
-S, --sort ELASTIC SORT List of fields to sort on in form
<field>:<direction>
-d, --delimiter TEXT Delimiter to use in CSV file. [default: ,]
-m, --max-results INTEGER Maximum number of results to return.
[default: 10]
-s, --scroll-size INTEGER Scroll size for each batch of results.
[default: 100]
-q, --query JSON Query string in Query DSL syntax. [required]
-o, --output-file PATH CSV file location. [required]
-i, --index-prefixes TEXT Index name prefix(es). [required]
-u, --url URL Elasticsearch host URL. [default: https://localhost:9200]
-U, --user TEXT Elasticsearch basic authentication user. [default: elastic]
-p, --password TEXT Elasticsearch basic authentication password. [required]
-f, --fields TEXT List of _source fields to present in the output. [default: _all]
-S, --sort ELASTIC SORT List of fields to sort in the format `<field>:<direction>`.
-d, --delimiter TEXT Delimiter to use in the CSV file. [default: ,]
-m, --max-results INTEGER Maximum number of results to return. [default: 10]
-s, --scroll-size INTEGER Scroll size for each batch of results. [default: 100]
-e, --meta-fields [_id|_index|_score]
Add meta-fields in output.
--verify-certs Verify SSL certificates.
--ca-certs PATH Location of CA bundle.
--client-cert PATH Location of Client Auth cert.
--client-key PATH Location of Client Cert Key.
-v, --version Show version and exit.
--debug Debug mode on.
--help Show this message and exit.
Add meta-fields to the output.
--verify-certs Verify SSL certificates.
--ca-certs PATH Location of CA bundle.
--client-cert PATH Location of Client Auth cert.
--client-key PATH Location of Client Cert Key.
-v, --version Show version and exit.
--debug Enable debug mode.
--help Show this message and exit.
```


Module Usage
---------
In addition to the CLI, EsXport can now be used as a Python module. Below is an example of how to integrate it into
your Python application:

```python
from esxport import CliOptions, EsXport

kwargs = {
"query": {
"query": {"match_all": {}},
"size": 1000
},
"output_file": "output.csv",
"index_prefixes": ["my-index-prefix"],
"url": "https://localhost:9200",
"user": "elastic",
"password": "password",
"verify_certs": False,
"debug": True,
"max_results": 1000,
"scroll_size": 100,
"sort": ["field_name:asc"],
"ca_certs": "path/to/ca.crt"
}

# Create CLI options and initialize EsXport
cli_options = CliOptions(kwargs)
es = EsXport(cli_options)

# Export data
es.export()
```

Class Descriptions
------------------

### `CliOptions`

A configuration class to manage CLI arguments programmatically when using the module.

#### Attributes

| **Attribute** | **Type** | **Description** | **Default** |
|------------------|-------------|---------------------------------------------------------|-------------------------------|
| `query` | `dict` | Elasticsearch Query DSL syntax for filtering data. | N/A |
| `output_file` | `str` | Path to save the exported CSV file. | N/A |
| `url` | `str` | Elasticsearch host URL. | `"https://localhost:9200"` |
| `user` | `str` | Basic authentication username for Elasticsearch. | `"elastic"` |
| `password` | `str` | Basic authentication password for Elasticsearch. | N/A |
| `index_prefixes` | `list[str]` | List of index prefixes to query. | N/A |
| `fields` | `list[str]` | List of `_source` fields to include in the output. | `["_all"]` |
| `sort` | `list[str]` | Fields to sort the output in the format `field_name:asc | desc`. | N/A |
| `delimiter` | `str` | Delimiter for the CSV output. | `","` |
| `max_results` | `int` | Maximum number of results to fetch. | `10` |
| `scroll_size` | `int` | Batch size for scroll queries. | `100` |
| `meta_fields` | `list[str]` | Metadata fields to include in the output. | `["_id", "_index", "_score"]` |
| `verify_certs` | `bool` | Whether to verify SSL certificates. | `False` |
| `ca_certs` | `str` | Path to the CA certificate bundle. | N/A |
| `client_cert` | `str` | Path to the client certificate for authentication. | N/A |
| `client_key` | `str` | Path to the client key for authentication. | N/A |
| `debug` | `bool` | Enable debugging. | `False` |

---

#### Example Initialization

```python
from esxport import CliOptions

cli_options = CliOptions({
"query": {"query": {"match_all": {}}},
"output_file": "data.csv",
"url": "https://localhost:9200",
"user": "elastic",
"password": "password",
"index_prefixes": ["my-index-prefix"],
"fields": ["field1", "field2"],
"sort": ["field1:asc"],
"max_results": 1000,
"scroll_size": 100
})
```


### `EsXport`

The main class for executing the export operation.

#### Methods

| **Method** | **Description** |
|-----------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------|
| `__init__(opts: CliOptions, es_client: ElasticsearchClient \| None = None)` | Initializes the `EsXport` object with options (`CliOptions`) and an optional Elasticsearch client. |
| `export()` | Executes the query and exports the results to the specified CSV file. |

---

#### Example Initialization and Usage

```python
from esxport import CliOptions, EsXport

# Define CLI options
cli_options = CliOptions({
"query": {"query": {"match_all": {}}},
"output_file": "output.csv",
"url": "https://localhost:9200",
"user": "elastic",
"password": "password",
"index_prefixes": ["my-index-prefix"]
})

# Initialize EsXport
esxport = EsXport(cli_options)

# Export data
esxport.export()

0 comments on commit 50c2767

Please sign in to comment.