This file provides quick, practical guidance for coding agents working in this repo. It is intentionally short and focused on day-to-day work.
- Apache Pinot is a real-time distributed OLAP datastore for low-latency analytics over streaming and batch data.
- Core runtime roles: broker (query routing), server (segment storage/execution), controller (cluster metadata/management), minion (async tasks).
- pinot-broker: broker query planning and scatter-gather.
- pinot-controller: controller APIs, table/segment metadata, Helix management.
- pinot-server: server query execution, segment loading, indexing.
- pinot-minion: background tasks (segment conversion, purge, etc.).
- pinot-common / pinot-spi: shared utils, config, and SPI interfaces.
- pinot-segment-local / pinot-segment-spi: segment generation, indexes, storage.
- pinot-query-planner / pinot-query-runtime: multi-stage query (MSQ) engine.
- pinot-connectors: external connectors for integrating Pinot with other systems.
- pinot-plugins: all Pinot plugins, grouped into the families listed below.
- pinot-tools: CLI and quickstart scripts.
- pinot-integration-tests: end-to-end validation suites.
- pinot-distribution: packaging artifacts.
- pinot-input-format: input format plugin family.
- pinot-arrow: Apache Arrow input format support.
- pinot-avro: Avro input format support.
- pinot-avro-base: shared Avro utilities and base classes.
- pinot-clp-log: CLP log input format support.
- pinot-confluent-avro: Confluent Schema Registry Avro input support.
- pinot-confluent-json: Confluent Schema Registry JSON input support.
- pinot-confluent-protobuf: Confluent Schema Registry Protobuf input support.
- pinot-orc: ORC input format support.
- pinot-json: JSON input format support.
- pinot-parquet: Parquet input format support.
- pinot-csv: CSV input format support.
- pinot-thrift: Thrift input format support.
- pinot-protobuf: Protobuf input format support.
- pinot-file-system: filesystem plugin family.
- pinot-adls: Azure Data Lake Storage (ADLS) filesystem support.
- pinot-hdfs: Hadoop HDFS filesystem support.
- pinot-gcs: Google Cloud Storage filesystem support.
- pinot-s3: Amazon S3 filesystem support.
- pinot-batch-ingestion: batch ingestion plugin family.
- pinot-batch-ingestion-common: shared batch ingestion APIs and utilities.
- pinot-batch-ingestion-spark-base: shared Spark ingestion base classes.
- pinot-batch-ingestion-spark-3: Spark 3 ingestion implementation.
- pinot-batch-ingestion-hadoop: Hadoop MapReduce ingestion implementation.
- pinot-batch-ingestion-standalone: standalone batch ingestion implementation.
- pinot-stream-ingestion: stream ingestion plugin family.
- pinot-kafka-base: shared Kafka ingestion base classes.
- pinot-kafka-3.0: Kafka 3.x ingestion implementation.
- pinot-kafka-4.0: Kafka 4.x ingestion implementation.
- pinot-kinesis: AWS Kinesis ingestion implementation.
- pinot-pulsar: Apache Pulsar ingestion implementation.
- pinot-minion-tasks: minion task plugin family.
- pinot-minion-builtin-tasks: built-in minion task implementations.
- pinot-metrics: metrics reporter plugin family.
- pinot-dropwizard: Dropwizard Metrics reporter implementation.
- pinot-yammer: Yammer Metrics reporter implementation.
- pinot-compound-metrics: compound metrics implementation.
- pinot-segment-writer: segment writer plugin family.
- pinot-segment-writer-file-based: file-based segment writer implementation.
- pinot-segment-uploader: segment uploader plugin family.
- pinot-segment-uploader-default: default segment uploader implementation.
- pinot-environment: environment provider plugin family.
- pinot-azure: Azure environment provider implementation.
- pinot-timeseries-lang: time series language plugin family.
- pinot-timeseries-m3ql: M3QL language plugin implementation.
- assembly-descriptor: Maven assembly descriptor for plugin packaging.
- Build JDK: Use JDK 11+ (CI runs 11/17/21); code targets Java 11.
- Runtime JRE: Broker/server/controller/minion run on Java 11+.
- Default build: `./mvnw clean install`
- Faster dev build: `./mvnw verify -Ppinot-fastdev`
- Full binary/shaded build: `./mvnw clean install -DskipTests -Pbin-dist -Pbuild-shaded-jar`
- Build a module with deps: `./mvnw -pl pinot-server -am test`
- Single test example: `./mvnw -pl pinot-segment-local -Dtest=RangeIndexTest test`
- Quickstart (after build): `build/bin/quick-start-batch.sh`
- Single integration test example: `./mvnw -pl pinot-integration-tests -am -Dtest=OfflineClusterIntegrationTest -Dsurefire.failIfNoSpecifiedTests=false test`
- Add class-level Javadoc for new classes; describe behavior and thread-safety.
- Use Javadoc comments with either `/** ... */` or `///` syntax (per JEP 467); code targets Java 11.
- Keep license headers on all new source files.
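A minimal sketch of the class-level Javadoc convention above: document behavior and thread-safety up front. The class and method names here are illustrative, not from the Pinot codebase, and the standard ASF license header (omitted for brevity) would precede the imports.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Tracks per-segment query counts for a single server instance.
 *
 * <p>Behavior: counts increase monotonically and are never reset for the
 * lifetime of the instance.
 *
 * <p>Thread-safety: this class is thread-safe; all state lives in a
 * {@link ConcurrentHashMap} and is updated atomically.
 */
public class SegmentQueryCounter {
  private final Map<String, Long> _counts = new ConcurrentHashMap<>();

  /** Atomically increments and returns the count for the given segment. */
  public long increment(String segmentName) {
    return _counts.merge(segmentName, 1L, Long::sum);
  }

  /** Returns the current count, or 0 if the segment was never queried. */
  public long get(String segmentName) {
    return _counts.getOrDefault(segmentName, 0L);
  }
}
```

Note the `_`-prefixed member field, matching the repo's Checkstyle conventions.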
- Use `./mvnw license:format` to add headers to new files.
- Preserve backward compatibility across mixed-version broker/server/controller deployments.
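One common way to preserve backward compatibility during a mixed-version rolling upgrade is to read any newly introduced config key with a default that reproduces the old behavior, so components that don't yet write the key keep working. A hedged sketch: the key name, default, and helper below are hypothetical, not Pinot APIs.

```java
import java.util.Map;

public class CompatExample {
  // Hypothetical new config key; older brokers/controllers never set it.
  static final String NEW_KEY = "maxRowsPerQuery";
  // Default chosen to match the behavior before the key existed.
  static final int DEFAULT_MAX_ROWS = 100_000;

  /** Falls back to the pre-upgrade default when the key is absent. */
  static int maxRows(Map<String, String> config) {
    String value = config.get(NEW_KEY);
    return value == null ? DEFAULT_MAX_ROWS : Integer.parseInt(value);
  }
}
```

The same principle applies to wire formats: new fields should be optional with defaults, never required, while any older role may still produce them.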
- Prefer imports over fully qualified class names (e.g., use `import com.foo.Bar` and refer to `Bar`, not `com.foo.Bar` inline).
- Prefer targeted unit tests; use integration tests when behavior crosses roles.
- Checkstyle rules and related config files live under `config/`.
- Use the Maven wrapper (`./mvnw` on Unix-like systems or `mvnw.cmd` on Windows) to run `spotless:apply` to format code and `checkstyle:check` to validate style.
- Run `./mvnw license:check` to validate license headers.
Before pushing a commit, always run the following checks on the affected modules and fix any failures:
- `./mvnw spotless:apply -pl <module>`: auto-format code.
- `./mvnw checkstyle:check -pl <module>`: validate style conformance.
- `./mvnw license:format -pl <module>`: add missing license headers to new files.
- `./mvnw license:check -pl <module>`: verify all files have correct license headers.
Do not push until all four checks pass cleanly.
- Query changes often touch broker planning and server execution; verify both.
- Segment/index changes usually live under `pinot-segment-local` and `pinot-segment-spi`.
- Config or API changes should update relevant configs and docs where applicable.
- See `README.md` for build and quickstart details, and `CONTRIBUTING.md` for style, licensing, and contribution guidance.