This repository has been archived by the owner on May 3, 2024. It is now read-only.


cortx-s3-slack-bot


CORTX S3 Slack Bot

A Slack bot that uses Elasticsearch and AWS to interact with a Cortx S3 server

Repository link https://github.com/sarthakarora1208/cortx-s3-slack-bot
Devpost link https://devpost.com/software/cortx-s3-slack-bot
Video link https://youtu.be/G_Pu86H5nSg

What does the CORTX S3 SLACK BOT do?

File Syncing and Data Backup inside Slack

Cortx S3 Slack Bot lets users access files in an S3 bucket directly from Slack using slash commands. Simple commands like /cortx-s3-get filename and /cortx-s3-delete filename find or delete files. Whenever a new file is shared on any public channel, it is automatically added to the Cortx S3 test bucket, so your Slack files stay safe if a teammate accidentally deletes a file that you need.

File Searching

Most of the time we don't know the exact name of the file we are looking for. We also need to check whether the file is actually present in the S3 bucket. Polling the bucket over and over again to find a file or check for its existence is slow and computationally expensive. To make lookups faster, a layer of Elasticsearch sits between the Slack bot and the S3 bucket and indexes all the files in the bucket. A user can find any file using the /cortx-s3-search command, which opens a file search dialog. Elasticsearch's autocomplete functionality guides the user by suggesting likely completions and alternative filenames as they type.
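The README does not say which Elasticsearch feature backs the autocomplete, so the following is only a minimal sketch of one common option, a match_phrase_prefix query. The field name file_name and the result size are assumptions made for illustration, not names taken from the project.

```python
def build_autocomplete_query(partial_name, field="file_name", size=5):
    """Build a search body that matches file names starting with the typed text.

    `field` and `size` are hypothetical defaults chosen for this sketch.
    """
    return {
        "size": size,
        "query": {
            "match_phrase_prefix": {
                field: {"query": partial_name}
            }
        },
    }


if __name__ == "__main__":
    # With a real client, this body could be passed to es.search(index=..., body=...).
    print(build_autocomplete_query("repo"))
```

Keeping the query builder separate from the client call makes it possible to inspect or test the request body without a running Elasticsearch instance.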

Employee/Intern Onboarding

Whenever a new employee or intern joins the #cortx-s3-test channel, they are greeted by our Cortx bot and asked to upload their resume. After uploading the resume, they notify the Slack bot with the /cortx-s3-upload-resume resume.pdf command. The bot processes the file, extracts Personally Identifiable Information (PII) such as name, email and phone number from the document, and updates the csv file. Administrators can get the details of all employees within Slack using the /cortx-s3-resume-data slash command.

In App Screenshots

How we built it

This integration has 5 components

  1. Slack Bot
  2. Cortx S3 Server
  3. Elasticsearch
  4. AWS Comprehend
  5. AWS Textract

The project is set up to work in a python3 virtual environment. The Slack app is built using the Bolt for Python framework. To connect to the Cortx S3 server, AWS Comprehend and AWS Textract, we use their respective boto3 clients. We connect to Elasticsearch using the Python Elasticsearch Client.

The Slack app listens to all sorts of events happening around your workspace: messages being posted, files being shared, users joining the team, and more. To listen for events, the Slack app uses the Events API. To enable custom interactivity like the search modal, we use Block Kit.

Slash commands perform a very simple task: they take whatever text you enter after the command itself (along with some other predefined values), send it to a URL, then accept whatever the script returns and post it as a Slackbot message to the person who issued the command or in a public channel. We use 5 slash commands to interact with the Cortx S3 bucket.

File Sync

Whenever a new file is shared in any public Slack channel, the file_share event is sent to the Slack app. The file is first indexed into Elasticsearch and then added to the Cortx S3 bucket, using the file name as the key.
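The two-step order described above (index first, then upload) can be sketched roughly as follows. This is not the project's code: the index name files and the document shape are assumptions, and the RecordingClient class is a hypothetical stand-in so the sketch runs without a network. With real clients, es would be an Elasticsearch client and s3 a boto3 S3 client, whose index and put_object methods take the keyword arguments shown.

```python
class RecordingClient:
    """Hypothetical stand-in for a real client: records calls instead of doing I/O."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Any method name is accepted and logged with its keyword arguments.
        def method(**kwargs):
            self.calls.append((name, kwargs))
        return method


def sync_file(es, s3, file_name, content, index="files", bucket="testbucket"):
    """Index the file name first, then store the bytes under the same key."""
    # The index name and document fields are assumptions for this sketch.
    es.index(index=index, id=file_name, body={"file_name": file_name})
    s3.put_object(Bucket=bucket, Key=file_name, Body=content)
    return file_name


if __name__ == "__main__":
    es, s3 = RecordingClient(), RecordingClient()
    sync_file(es, s3, "report.pdf", b"%PDF-")
    print(es.calls, s3.calls)
```

Using the file name as both the Elasticsearch document id and the S3 key keeps the two stores aligned, so an existence check against the index maps directly to an object in the bucket.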

Slash Commands

  • /cortx-s3-get
  • /cortx-s3-search
  • /cortx-s3-delete
  • /cortx-s3-upload-resume
  • /cortx-s3-resume-data

/cortx-s3-get filename

After fetching the filename from the command['text'] parameter, we check whether the file exists using the es.exists function (es is the Elasticsearch client). If the file is found, we return it to the user as a direct message.
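A minimal sketch of that lookup, assuming an index named files and the file name used as the document id (both are assumptions). The FakeEs and FakeS3 classes are hypothetical stand-ins that mimic only the two calls used here, es.exists(index=..., id=...) and s3.get_object(Bucket=..., Key=...), so the example runs offline. In a Bolt app a function like this would typically be called from a listener registered with @app.command("/cortx-s3-get").

```python
import io


class FakeEs:
    """Hypothetical stand-in exposing only exists()."""

    def __init__(self, ids):
        self.ids = set(ids)

    def exists(self, index, id):
        return id in self.ids


class FakeS3:
    """Hypothetical stand-in exposing only get_object()."""

    def __init__(self, objects):
        self.objects = objects

    def get_object(self, Bucket, Key):
        # boto3 returns the payload as a readable stream under "Body".
        return {"Body": io.BytesIO(self.objects[Key])}


def get_file(command, es, s3, index="files", bucket="testbucket"):
    """Return the file bytes, or None when the name is not in the index."""
    file_name = command["text"].strip()
    if not es.exists(index=index, id=file_name):
        return None
    return s3.get_object(Bucket=bucket, Key=file_name)["Body"].read()


if __name__ == "__main__":
    es, s3 = FakeEs(["notes.txt"]), FakeS3({"notes.txt": b"hello"})
    print(get_file({"text": "notes.txt"}, es, s3))
```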

/cortx-s3-search

This command opens a modal inside Slack with a search bar. As the user types, matching file names are suggested.
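The setup steps later in this README configure an Options Load URL, which suggests the search bar is a select menu with an external data source. Under that assumption, the response to each keystroke is a list of option objects. The sketch below only formats such a list; the helper name and the cap of 100 entries are choices made for this illustration.

```python
def build_options(file_names, limit=100):
    """Format file names as the options payload for an external select menu.

    `limit` is a cautious cap chosen for this sketch.
    """
    return {
        "options": [
            {"text": {"type": "plain_text", "text": name}, "value": name}
            for name in file_names[:limit]
        ]
    }


if __name__ == "__main__":
    print(build_options(["report.pdf", "resume.pdf"]))
```

In Bolt for Python, a payload of this shape is usually passed to ack from an options listener, with the file names coming from the Elasticsearch search results.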

/cortx-s3-delete filename

After fetching the filename from the command['text'] parameter, we check whether the file exists using the es.exists function (es is the Elasticsearch client). If the file is found, we ask the user to confirm that they want to permanently delete the file from the S3 bucket. If the user clicks yes, the file is permanently deleted.
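The README does not show how the confirmation is rendered. One plausible way, sketched here under that caveat, is a Block Kit message with a yes and a no button. The action_id values confirm_delete and cancel_delete are hypothetical names invented for this example.

```python
def build_delete_prompt(file_name):
    """Return Block Kit blocks asking the user to confirm a permanent delete."""
    return [
        {
            "type": "section",
            "text": {
                "type": "mrkdwn",
                "text": f"Permanently delete `{file_name}` from the S3 bucket?",
            },
        },
        {
            "type": "actions",
            "elements": [
                {
                    "type": "button",
                    "text": {"type": "plain_text", "text": "Yes"},
                    "style": "danger",
                    "action_id": "confirm_delete",  # hypothetical id
                    "value": file_name,
                },
                {
                    "type": "button",
                    "text": {"type": "plain_text", "text": "No"},
                    "action_id": "cancel_delete",  # hypothetical id
                    "value": file_name,
                },
            ],
        },
    ]


if __name__ == "__main__":
    print(build_delete_prompt("old-report.pdf"))
```

Carrying the file name in each button's value lets the action handler know which file to delete without keeping server-side state between the command and the click.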


/cortx-s3-upload-resume resume.pdf

When the command is invoked, we get the name of the file from the command['text'] parameter. The Slack app searches for the file in the S3 bucket and downloads it for local processing. The text is extracted from the .jpeg or .pdf resume file with AWS Textract, which uses OCR (Optical Character Recognition). The text is then passed to AWS Comprehend, which identifies Personally Identifiable Information (PII) of the employee, such as name, email and phone number. This data is appended to the resume-data.csv file.
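The two AWS responses can be reduced to the fields the bot needs roughly as follows. This is a sketch, not the project's process_resume.py: it assumes the text comes from Textract's detect_document_text (which returns Blocks with a BlockType and Text) and the entities from Comprehend's detect_pii_entities (which returns Entities with Type, BeginOffset and EndOffset). The helper names and the sample dictionaries are made up for illustration.

```python
def text_from_textract(response):
    """Join the LINE blocks of a Textract response into one string."""
    lines = [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]
    return "\n".join(lines)


def pii_from_comprehend(text, response, wanted=("NAME", "EMAIL", "PHONE")):
    """Map each wanted PII type to the first matching substring of text."""
    found = {}
    for entity in response.get("Entities", []):
        kind = entity["Type"]
        if kind in wanted and kind not in found:
            # Comprehend reports character offsets into the submitted text.
            found[kind] = text[entity["BeginOffset"]:entity["EndOffset"]]
    return found


if __name__ == "__main__":
    # Hand-written sample responses, shaped like the real ones.
    textract_response = {
        "Blocks": [
            {"BlockType": "PAGE"},
            {"BlockType": "LINE", "Text": "Jane Doe"},
            {"BlockType": "LINE", "Text": "jane@example.com"},
        ]
    }
    text = text_from_textract(textract_response)
    comprehend_response = {
        "Entities": [
            {"Type": "NAME", "BeginOffset": 0, "EndOffset": 8},
            {"Type": "EMAIL", "BeginOffset": 9, "EndOffset": 25},
        ]
    }
    print(pii_from_comprehend(text, comprehend_response))
```

Because Comprehend returns offsets rather than the matched strings, the same text that was sent to it must be kept around to slice the values out.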

/cortx-s3-resume-data

When invoked, this command shows the names and email addresses of the employees in a table inside Slack, populated with the data from the resume-data.csv file.
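As an illustration of how csv rows could become a table in a message, here is a small sketch that renders selected columns as aligned plain text. The column names name and email are assumptions about resume-data.csv, and the real bot may well build the table with Block Kit instead.

```python
import csv
import io


def format_resume_table(csv_text, columns=("name", "email")):
    """Render the chosen csv columns as an aligned plain-text table."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [[row.get(col, "") for col in columns] for row in reader]
    # Each column is as wide as its longest cell, header included.
    widths = [
        max([len(col)] + [len(row[i]) for row in rows])
        for i, col in enumerate(columns)
    ]
    lines = [" | ".join(col.ljust(w) for col, w in zip(columns, widths))]
    for row in rows:
        lines.append(" | ".join(cell.ljust(w) for cell, w in zip(row, widths)))
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "name,email,phone\nJane Doe,jane@example.com,555-0100\n"
    print(format_resume_table(sample))
```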


CORTX-S3 Slack Bot Installation Instructions

Requirements

Getting Started



Python 3.6+

To test the integration you need to have Python installed on your computer. You can get a suitable release from the Python website. You can check your Python version by running python3 --version.

We recommend using a virtual environment for development. Read about it here.

Use the following steps to clone the repository, create a virtual environment and install all the packages.

Cloning the repo

# Python 3.6+ required
git clone https://github.com/sarthakarora1208/cortx-s3-slack-bot
cd cortx-s3-slack-bot
python3 -m venv env
source env/bin/activate
pip3 install -r requirements.txt



Cortx S3 Server

To connect to a Cortx server, set the endpoint_url, aws_access_key_id and aws_secret_access_key in the .env file. If you are using a Cloudshare environment and followed the instructions from https://raw.githubusercontent.com/Seagate/cortx/wiki/CORTX-Cloudshare-Setup-for-April-Hackathon-2021, you can simply copy the Server URL (CORTX endpoint) from the Connection Details section -> External address on your cortx-va-1.03 VM and paste it into the .env file.

ENDPOINT_URL=""
AWS_ACCESS_KEY_ID="AKIAtEpiGWUcQIelPRlD1Pi6xQ"
AWS_SECRET_ACCESS_KEY="YNV6xS8lXnCTGSy1x2vGkmGnmdJbZSapNXaSaRhK"

You need to have a bucket named 'testbucket' in order for the code to work.
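A rough sketch of how these three values might be turned into a boto3 S3 client. It assumes the .env values have already been loaded into the process environment (for example by a dotenv loader); the helper names are hypothetical. boto3 is imported lazily so the settings check can be run on its own.

```python
import os


def cortx_s3_kwargs(env=os.environ):
    """Collect the keyword arguments for boto3.client("s3", ...)."""
    endpoint = env.get("ENDPOINT_URL", "")
    if not endpoint:
        # The sample .env ships with ENDPOINT_URL="" and must be filled in.
        raise ValueError("ENDPOINT_URL is empty; set it in the .env file")
    return {
        "endpoint_url": endpoint,
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
    }


def make_s3_client(env=os.environ):
    """Create an S3 client that talks to the Cortx endpoint instead of AWS."""
    import boto3  # lazy import: only needed when a client is actually created

    return boto3.client("s3", **cortx_s3_kwargs(env))


if __name__ == "__main__":
    demo_env = {
        "ENDPOINT_URL": "http://example.invalid",
        "AWS_ACCESS_KEY_ID": "key",
        "AWS_SECRET_ACCESS_KEY": "secret",
    }
    print(cortx_s3_kwargs(demo_env))
```

Failing early on an empty endpoint gives a clearer message than the connection error that would otherwise surface on the first request.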




ngrok

Using ngrok as a local proxy

To develop locally we'll be using ngrok, which allows you to expose a public endpoint that Slack can use to send your app events. If you haven't already, install ngrok from their website.

Read more about ngrok



AWS Account

You need a verified AWS account to test process_resume.py.

You can find your credentials in the file ~/.aws/credentials (C:\Users\USER_NAME\.aws\credentials for Windows users). Add the following lines to the .env file, using your own values.

AMAZON_AWS_ACCESS_KEY_ID="YOUR_ACCESS_KEY_ID"
AMAZON_AWS_SECRET_ACCESS_KEY="YOUR_SECRET_ACCESS_KEY"



Elasticsearch 7.12.0

The Slack bot uses Elasticsearch to index files in the S3 bucket.

Download Elasticsearch from their website.

You can run Elasticsearch on your own hardware, or use the hosted Elasticsearch Service on Elastic Cloud.

If you choose a hosted option, change the config variables in the .env file.

ELASTIC_DOMAIN='http://localhost'
ELASTIC_PORT=9200

You can test the Elasticsearch client by running the connector script.

python3 elastic_connector.py

A successful connection will yield:




Slack

You need to have Slack installed on your computer. If you don't have Slack, you can download it for Windows or Mac. Log in to your account; if you don't have an account, you can create one.

If you are an existing user, you need to make a new channel #cortx-s3-test and you must be able to add new apps.

Otherwise, create a new workspace (https://slack.com/create) and add a new channel #cortx-s3-test.

To get started, create a new Slack app at https://api.slack.com/apps

Bolt is a foundational framework that makes it easier to build Slack apps with the platform's latest features. We will be using it to make our Slack bot.

  1. Click on Create an App button

  2. Give the app name as cortx-bot and choose the development workspace

  3. Requesting scopes - Scopes give your app permission to do things (for example, post messages) in your development workspace. You can select the scopes to add to your app by navigating over to the OAuth & Permissions sidebar.

  4. Add the following scopes to the Bot Token Scopes by clicking on the Add an OAuth Scope button


| OAuth Scope | Description |
| --- | --- |
| channels:history | View messages and other content in public channels that cortx-bot has been added to |
| channels:join | Join public channels in a workspace |
| channels:read | View basic information about public channels in a workspace |
| chat:write | Send messages as @cortxbot |
| chat:write.customize | Send messages as @cortxbot with a customized username and avatar |
| chat:write.public | Send messages to channels @cortxbot isn't a member of |
| commands | Add shortcuts and/or slash commands that people can use |
| files:read | View files shared in channels and conversations that cortx-bot has been added to |
| files:write | Upload, edit, and delete files as cortx-bot |


  1. Add the following scopes to the User Token Scopes by clicking on the Add an OAuth Scope button

| OAuth Scope | Description |
| --- | --- |
| channels:history | View messages and other content in public channels that cortx-bot has been added to |
| files:read | View files shared in channels and conversations that cortx-bot has been added to |


  1. Install your own app by selecting the Install App button at the top of the OAuth & Permissions page, or from the sidebar.

  2. After clicking through one more green Install App To Workspace button, you'll be sent through the Slack OAuth UI.

  3. After installation, you'll land back on the OAuth & Permissions page and find a Bot User OAuth Access Token and a User OAuth Token. Click the copy button for each of them and add both tokens to the .env file. (The bot token starts with xoxb, whereas the user token is longer and starts with xoxp.)

SLACK_USER_TOKEN=xoxp-your-user-token
SLACK_BOT_TOKEN=xoxb-your-bot-token

  1. In addition to the access token, you'll need a signing secret. Your app's signing secret verifies that incoming requests are coming from Slack. Navigate to the Basic Information page from your app management page. Under App Credentials, copy the value for Signing Secret and add it to the .env file.
SLACK_SIGNING_SECRET=your-signing-secret
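
Bolt performs the signature check for you, so the following is only a sketch of what the signing secret is used for. Slack signs each request by computing an HMAC-SHA256 over the string v0:timestamp:body and sends the result in the X-Slack-Signature header, with the timestamp in X-Slack-Request-Timestamp. The function names and the 300-second freshness window used here are choices made for this illustration.

```python
import hashlib
import hmac
import time


def slack_signature(signing_secret, timestamp, body):
    """Compute the v0 signature for a raw request body."""
    base = f"v0:{timestamp}:{body}".encode("utf-8")
    digest = hmac.new(signing_secret.encode("utf-8"), base, hashlib.sha256)
    return f"v0={digest.hexdigest()}"


def is_valid_request(signing_secret, timestamp, body, signature,
                     now=None, max_age=300):
    """Return True when the signature matches and the request is recent."""
    now = time.time() if now is None else now
    if abs(now - int(timestamp)) > max_age:
        return False  # stale timestamp: reject to limit replayed requests
    expected = slack_signature(signing_secret, timestamp, body)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    ts = str(int(time.time()))
    sig = slack_signature("your-signing-secret", ts, "command=%2Fcortx-s3-get")
    print(is_valid_request("your-signing-secret", ts, "command=%2Fcortx-s3-get", sig))
```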

  1. Make sure you have followed the steps in Cloning the repo, then start the Bolt app. The HTTP server uses a built-in development adapter, which handles and parses incoming events from Slack on port 3000.
python3 app.py

Open a new terminal, make sure ngrok is installed, and tell ngrok to use port 3000 (which Bolt for Python uses by default):

ngrok http 3000

For local Slack development, we'll use your ngrok URL from above, so copy it to your clipboard.

For example: https://your-own-url.ngrok.io
  1. Subscribing to events - Your app can listen to all sorts of events happening around your workspace: messages being posted, files being shared, and more. On your app configuration page, select the Event Subscriptions sidebar. You'll be presented with an input box to enter a Request URL, which is where Slack sends the events your app is subscribed to. Enter the URL described below and click the Save button.

By default Bolt for Python listens for all incoming requests at the /slack/events route, so for the Request URL you can enter your ngrok URL appended with /slack/events.

Request URL: https://your-own-url.ngrok.io/slack/events

If the challenge succeeds, a Verified label appears right next to the Request URL.
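
Bolt answers this challenge automatically. For context, the handshake works like this: Slack posts a JSON body with type url_verification and a challenge value, and the endpoint has to echo the challenge back. The sketch below shows just that step; the function name is hypothetical.

```python
import json


def handle_event_request(raw_body):
    """Echo the challenge for a url_verification request, else return None."""
    payload = json.loads(raw_body)
    if payload.get("type") == "url_verification":
        return {"challenge": payload["challenge"]}
    return None  # any other event type would go to the normal handlers


if __name__ == "__main__":
    body = json.dumps({"type": "url_verification", "challenge": "abc123"})
    print(handle_event_request(body))
```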

On the same page, open the Subscribe to bot events menu at the bottom of the page and click Add Bot User Event.

Similarly, open Subscribe to events on behalf of user and click Add Workspace Event.

Add the following events:

| Event Name | Description | Required Scope |
| --- | --- | --- |
| file_share | A file was shared | files:read |
| message.channels | A message was posted to a channel | channels:history |



  1. Next, select the Interactivity & Shortcuts sidebar and toggle the switch on. Again, for the Request URL, enter your ngrok URL appended with /slack/events
Request URL: https://your-own-url.ngrok.io/slack/events

  1. Scroll down to the Select Menus section and, in the Options Load URL field, enter your ngrok URL appended with /slack/events
Options Load URL: https://your-own-url.ngrok.io/slack/events

  1. Finally, we come to the slash commands. Slack's custom slash commands perform a very simple task: they take whatever text you enter after the command itself (along with some other predefined values), send it to a URL, then accept whatever the script returns and post it as a Slackbot message to the person who issued the command. We have 5 slash commands to add to the workspace.

Head over to the Slash Commands sidebar and click on the Create New Command button to open the Create New Command page. Add the Command, Request URL, Short Description and Usage Hint according to the table provided below.

Click on Save to return to the Slash Commands page.

| Command | Request URL | Short Description | Usage Hint |
| --- | --- | --- | --- |
| /cortx-s3-get | https://your-own-url.ngrok.io/slack/events | Get a file from s3 bucket | filename |
| /cortx-s3-search | https://your-own-url.ngrok.io/slack/events | Search for a file in S3 | |
| /cortx-s3-delete | https://your-own-url.ngrok.io/slack/events | Deletes the given file from the s3 bucket | filename |
| /cortx-s3-upload-resume | https://your-own-url.ngrok.io/slack/events | Upload resume to database | resume.pdf |
| /cortx-s3-resume-data | https://your-own-url.ngrok.io/slack/events | Get resume data from s3 | |

  1. Watch the video to learn more about using these slash commands

  2. Open Slack and upload a file in any channel, noting the file name

  3. Then type /cortx-s3-search and search for your file


ISSUES

Elasticsearch Error

Solution - Elasticsearch may not be running. Start Elasticsearch first, then start the app.

Endpoint URL error

Solution - Add your endpoint URL (ENDPOINT_URL) to the .env file.