
Example Streaming Application (Python Variant)

Overview

This example shows a streaming application that can be deployed using the PNDA Deployment Manager. (See the platform-deployment-manager project for details.)

The application is a tar file containing binaries and configuration files required to perform some stream processing.

This example application reads events from Kafka and performs basic counting analytics.

The results are reported via the PNDA metrics logger for display in the console and Grafana dashboards.

The application expects Avro-encoded events with three generic integer fields (a, b and c) and a timestamp, gen_ts, in milliseconds since 1970. For example: a=1;b=2;c=3;gen_ts=1466166149000. These events are generated by the sample data source.
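As a rough illustration of the event shape described above, the sketch below parses the a=1;b=2;c=3;gen_ts=... notation into a dictionary and converts gen_ts to a UTC datetime. It works on the text notation only and does not perform any Avro decoding. The helper names parse_event and event_time are hypothetical and are not taken from job.py.

```python
from datetime import datetime, timezone


def parse_event(text):
    """Parse 'a=1;b=2;c=3;gen_ts=...' into a dict of integers.

    Hypothetical helper: it handles the notation used in this README,
    not the Avro encoding that the application actually consumes.
    """
    fields = dict(pair.split("=", 1) for pair in text.split(";"))
    return {name: int(value) for name, value in fields.items()}


def event_time(event):
    """Convert gen_ts (milliseconds since 1970) to a UTC datetime."""
    return datetime.fromtimestamp(event["gen_ts"] / 1000, tz=timezone.utc)


event = parse_event("a=1;b=2;c=3;gen_ts=1466166149000")
print(event)
print(event_time(event).isoformat())  # 2016-06-17T12:22:29+00:00
```

Dividing by 1000 before the conversion matters because gen_ts is in milliseconds, while datetime.fromtimestamp expects seconds.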

Requirements

Build

Note that the build currently works only on OS X or Linux, until package-py-deps.sh is modified to run in a cross-platform way.

To build the example application, run:

mvn clean package

Run this command at the root of the repository. It builds the application package and creates the file spark-streaming-example-app-python-{version}.tar.gz in the app-package/target directory.
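To check what a built package contains, something like the following sketch may help. It uses only the standard tarfile module; list_package_files is a hypothetical helper name, and the exact directory layout inside the archive is not assumed.

```python
import tarfile


def list_package_files(package_path):
    """Return the sorted names of the regular files in a .tar.gz package."""
    with tarfile.open(package_path, "r:gz") as package:
        return sorted(m.name for m in package.getmembers() if m.isfile())
```

Call it with the path of the archive created under app-package/target, substituting the actual version for {version}.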

Files in the package

  • application.properties: config file used by the Spark Streaming application.
  • log4j.properties: defines the log level and behaviour for the Spark Streaming framework (not the Python code).
  • properties.json: contains default properties that may be overridden at application creation time.
  • job.py: implements the stream processing in Python.

Deploying the package and creating an application

The PNDA console can be used to deploy the application package to a cluster and then to create an application instance. The console is available on port 80 on the edge node.

When creating an application in the console, ensure that the input_topic property is set to a real Kafka topic.

"input_topic": "avro.events.samples",
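The relationship between the defaults in properties.json and the values supplied at creation time can be pictured as a simple dictionary update. This is a minimal sketch under that assumption; the Deployment Manager's actual merge logic is not shown in this README, the function name effective_properties is hypothetical, and the default value below is a placeholder rather than the real file contents.

```python
import json


def effective_properties(defaults_json, overrides):
    """Return the defaults from properties.json text, updated with overrides."""
    properties = json.loads(defaults_json)
    properties.update(overrides)
    return properties


# Placeholder default; the real properties.json contents are not shown here.
defaults = '{"input_topic": "placeholder.topic"}'
merged = effective_properties(defaults, {"input_topic": "avro.events.samples"})
print(merged["input_topic"])  # avro.events.samples
```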

To make the package available for deployment it must be uploaded to a package repository. The default implementation is an OpenStack Swift container. The package may be uploaded via the PNDA repository manager which abstracts the container used, or by manually uploading the package to the container.

Run sample data source

This is the same as for the spark-streaming project; see that project for instructions.