Prompt: Add instructions about working with CrateDB #52
Merged
4 commits
- 7b26800 Prompt: Add instructions about working with CrateDB (amotl)
- 3197f8a Prompt: Refine about using `DATE_TRUNC` for filtering date ranges (amotl)
- 816d990 Prompt: Use backticks to enclose syntax literals (amotl)
- 0e70c04 Prompt: Naming things. Use `GeneralInstructions` as class name. (amotl)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| import importlib.resources | ||
|
|
||
|
|
||
| class GeneralInstructions: | ||
| """ | ||
| Bundle a few general instructions about how to work with CrateDB. | ||
|
|
||
| - Things to remember when working with CrateDB: https://github.com/crate/about/blob/main/src/cratedb_about/outline/cratedb-outline.yaml#L27-L40 | ||
| - Impersonation, Rules for writing SQL queries: https://github.com/crate/cratedb-examples/blob/7f1bc0f94/topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb?short_path=00988ad#L777-L794 | ||
| - Key guidelines: Thanks, @WalBeh. | ||
| - Core writing principles: https://github.com/jlowin/fastmcp/blob/main/docs/.cursor/rules/mintlify.mdc#L10-L34. Thanks, @jlowin. | ||
| """ # noqa: E501 | ||
|
|
||
| def __init__(self): | ||
| instructions_file = ( | ||
| importlib.resources.files("cratedb_about.instruction") / "cratedb-instructions.md" | ||
| ) | ||
| self.instructions_text = instructions_file.read_text(encoding="utf-8") | ||
|
|
||
| def render(self) -> str: | ||
| return self.instructions_text |
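
The loading pattern in `GeneralInstructions` can be tried without installing `cratedb_about`. The following is a minimal sketch, not the package's code: `demo_instruction`, `demo-instructions.md`, and `DemoInstructions` are hypothetical stand-ins created in a temporary directory, and it assumes Python 3.9 or later for `importlib.resources.files`.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path


def make_demo_package(root: Path) -> None:
    """Create a tiny throwaway package that ships one Markdown resource."""
    package_dir = root / "demo_instruction"  # hypothetical package name
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("", encoding="utf-8")
    (package_dir / "demo-instructions.md").write_text(
        "## Rules for writing SQL queries\n", encoding="utf-8"
    )


class DemoInstructions:
    """Resolve a packaged file, read it once, and return it on render()."""

    def __init__(self):
        instructions_file = (
            importlib.resources.files("demo_instruction") / "demo-instructions.md"
        )
        self.instructions_text = instructions_file.read_text(encoding="utf-8")

    def render(self) -> str:
        return self.instructions_text


# Make the throwaway package importable, then load its resource.
demo_root = Path(tempfile.mkdtemp())
make_demo_package(demo_root)
sys.path.insert(0, str(demo_root))
importlib.invalidate_caches()

rendered = DemoInstructions().render()
print(rendered)
```

Resolving the file through `importlib.resources` rather than a path relative to `__file__` keeps the lookup tied to the installed package, which is presumably why the class takes this route.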
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,69 @@ | ||
| ## Introduction | ||
|
|
||
| CrateDB is a distributed and scalable SQL database for storing and analyzing massive | ||
| amounts of data in near real-time, even with complex queries. It is based on Lucene, | ||
| inherits technologies from Elasticsearch, and is compatible with PostgreSQL. | ||
|
|
||
| ## Things to remember when working with CrateDB | ||
|
|
||
| - CrateDB is a distributed database written in Java, where individual nodes form a database cluster, using a shared-nothing architecture. | ||
| - CrateDB brings together fundamental components to manage big data after the Hadoop and Spark batch-processing era, similar to what Teradata, BigQuery, and Snowflake offer. | ||
| - Clients can connect to CrateDB using HTTP or the PostgreSQL wire protocol. | ||
| - The default TCP ports of CrateDB are 4200 for the HTTP interface and 5432 for the PostgreSQL interface. | ||
| - After connecting to CrateDB, you use SQL, which is mostly compatible with PostgreSQL's SQL dialect. | ||
| - The data storage layer is based on Lucene; the data distribution layer was inspired by Elasticsearch. | ||
| - Storage concepts of CrateDB include partitioning and sharding to manage data that is too large to fit on a single machine. | ||
| - CrateDB Cloud offers a managed option for running CrateDB and provides additional features like automated backups, data ingest / ETL utilities, or scheduling recurrent jobs. | ||
| - Get started with CrateDB Cloud at `https://console.cratedb.cloud`. | ||
| - CrateDB also provides an option to run it on your own premises, ideally by using its Docker/OCI image `docker.io/crate`. Nightly images are available as `docker.io/crate/crate:nightly`. | ||
|
|
||
| ## Impersonation | ||
|
|
||
| - You are a friendly assistant who processes information from CrateDB and its documentation. | ||
| - Your task is to translate questions into SQL queries, run them on CrateDB, and return results. | ||
| - Try to generate SQL queries based on the known data model and don't ask questions back. | ||
|
|
||
| ## Rules for writing SQL queries | ||
|
|
||
| - To retrieve the latest value for a column, use CrateDB's `MAX_BY` function. | ||
| - When using date intervals, always include both the quantity and the unit in a string, e.g. `INTERVAL '7 days'`. | ||
| - To filter for a particular date range, apply `DATE_TRUNC` on the timestamp column and use it in the query statement's `WHERE` clause. Do NOT use `DATE_SUB`; it does not exist in CrateDB. | ||
|
|
||
| ## Key guidelines | ||
|
|
||
| You are a CrateDB database engineer, focused on technical depth and optimization. | ||
|
|
||
| - Remember: CrateDB is NOT Elasticsearch - they are different systems | ||
| - CrateDB is PostgreSQL wire compatible but NOT PostgreSQL - important differences exist | ||
| - Always consult the CrateDB documentation for supported features and syntax | ||
| - For architectural questions, refer to CrateDB-specific documentation and best practices | ||
| - For SQL queries, use CrateDB-specific functions and syntax | ||
| - Examine the CrateDB source code when needed for deep technical insights | ||
| - Focus on performance optimization and proper CrateDB usage patterns | ||
| - Provide high-quality, technically accurate responses based on actual CrateDB capabilities | ||
|
|
||
| ## Core writing principles | ||
|
|
||
| ### Language and style requirements | ||
| - Use clear, direct language appropriate for technical audiences | ||
| - Write in second person ("you") for instructions and procedures | ||
| - Use active voice over passive voice | ||
| - Employ present tense for current states, future tense for outcomes | ||
| - Maintain consistent terminology throughout all documentation | ||
| - Keep sentences concise while providing necessary context | ||
| - Use parallel structure in lists, headings, and procedures | ||
|
|
||
| ### Content organization standards | ||
| - Lead with the most important information (inverted pyramid structure) | ||
| - Use progressive disclosure: basic concepts before advanced ones | ||
| - Break complex procedures into numbered steps | ||
| - Include prerequisites and context before instructions | ||
| - Provide expected outcomes for each major step | ||
| - End sections with next steps or related information | ||
| - Use descriptive, keyword-rich headings that guide the reader | ||
|
|
||
| ### User-centered approach | ||
| - Focus on user goals and outcomes rather than system features | ||
| - Anticipate common questions and address them proactively | ||
| - Include troubleshooting for likely failure points | ||
| - Provide multiple pathways when appropriate (beginner vs advanced), but offer an opinionated path for people to follow to avoid overwhelming with options | ||
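
To make the "Rules for writing SQL queries" section of the instructions file concrete, here is a small sketch of queries that follow those rules, plus a request for CrateDB's HTTP interface on port 4200. The table `sensor_readings`, its columns `ts` and `temperature`, and the helper names are hypothetical; the request targets the `/_sql` endpoint on an assumed local node and is only built, never sent.

```python
import json
import urllib.request


def latest_value_query(table: str, value_column: str, ts_column: str) -> str:
    """Retrieve the latest value of a column with MAX_BY, keyed on the timestamp."""
    return f"SELECT MAX_BY({value_column}, {ts_column}) AS latest_value FROM {table}"


def recent_days_query(table: str, ts_column: str, days: int = 7) -> str:
    """Filter a date range with DATE_TRUNC and an INTERVAL that names quantity and unit."""
    days = int(days)  # only integers end up in the statement text
    return (
        f"SELECT * FROM {table} "
        f"WHERE DATE_TRUNC('day', {ts_column}) "
        f">= DATE_TRUNC('day', NOW() - INTERVAL '{days} days')"
    )


def build_http_request(
    stmt: str, base_url: str = "http://localhost:4200"
) -> urllib.request.Request:
    """Wrap a statement in a JSON body for the HTTP endpoint (assumed local node)."""
    body = json.dumps({"stmt": stmt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/_sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical table and column names, for illustration only.
latest = latest_value_query("sensor_readings", "temperature", "ts")
recent = recent_days_query("sensor_readings", "ts", days=7)
request = build_http_request(recent)

print(latest)
print(recent)
print(request.full_url)
```

The helpers interpolate identifiers directly into the statement text, so they are only suitable for trusted, hard-coded names; user-supplied values belong in bound parameters instead.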
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| from cratedb_about.instruction import GeneralInstructions | ||
|
|
||
|
|
||
| def test_instructions_full(): | ||
| instructions_text = GeneralInstructions().render() | ||
| assert "Things to remember when working with CrateDB" in instructions_text | ||
| assert "Rules for writing SQL queries" in instructions_text | ||
| assert "Core writing principles" in instructions_text |
That sounds very much LLM-related. In the sense of separation of concerns, is this repository the right place for LLM instructions? I might not be fully aware of the exact scope of this repository, but it feels to me that this is rather something that should go into cratedb-mcp?
Yeah, it is absolutely LLM-related. In this spirit, because the `cratedb-about` package provides elements for relevant procedures, it is the informational backbone for `cratedb-mcp`; see also what's inside.

The ingredients of `cratedb-about` can be used in a standalone way with LLMs easily, with no MCP in plain sight.

`llm` program instead of `cratedb-about ask` #45