Repository link https://github.com/sarthakarora1208/cortx-s3-slack-bot
Devpost link https://devpost.com/software/cortx-s3-slack-bot
Video link https://youtu.be/G_Pu86H5nSg
Cortx S3 Slack Bot lets users access files in an S3 bucket directly from Slack using slash commands. Simple commands like /cortx-s3-get filename and /cortx-s3-delete filename find or delete files. Whenever a new file is shared on any public channel, it is automatically added to the Cortx S3 test bucket, so your Slack files stay safe in case a teammate accidentally deletes a file that you need.
Most of the time we don't know the exact name of the file we are looking for. We also need to check whether the file is actually present in the S3 bucket. Polling the bucket over and over again to find a file or check for its existence is a computationally expensive and slow operation. To enable faster indexing of all the files on the S3 bucket, there is a layer of Elasticsearch between the Slack bot and the S3 bucket. A user can find any file using the /cortx-s3-search command, which opens a file search dialog. Elasticsearch's autocomplete functionality guides the user by prompting them with likely completions and alternatives to the filename as they type it.
Whenever a new employee or intern joins the #cortx-s3-test channel, they are greeted by our Cortx bot and asked to upload their resume. After uploading it, they notify the Slack bot with the /cortx-s3-upload-resume resume.pdf command. The bot processes the file, extracts Personally Identifiable Information (PII) such as name, email and phone number from the document, and updates the CSV file.
The administrators can get all the details of the employees within Slack using the /cortx-s3-resume-data slash command.
This integration has 5 components:
- Slack Bot
- Cortx S3 Server
- Elasticsearch
- AWS Comprehend
- AWS Textract
The project is set up to work in a Python 3 virtual environment. The Slack app is built using the Bolt for Python framework. To connect to the Cortx S3 server, AWS Comprehend and AWS Textract, we use their respective boto3 clients. We connect to Elasticsearch using the Python Elasticsearch Client.
The Slack app listens to all sorts of events happening around your workspace: messages being posted, files being shared, users joining the team, and more. To listen for events, the Slack app uses the Events API. To enable custom interactivity like the search modal, we use Block Kit.
Slash commands perform a very simple task: they take whatever text you enter after the command itself (along with some other predefined values), send it to a URL, then accept whatever the script returns and post it as a Slackbot message to the person who issued the command or in a public channel. Here are the 5 slash commands we use to interact with the Cortx S3 bucket:
- /cortx-s3-get
- /cortx-s3-search
- /cortx-s3-delete
- /cortx-s3-upload-resume
- /cortx-s3-resume-data
Whenever a new file is shared in any public Slack channel, the file_share event is sent to the Slack app. The file is first indexed into Elasticsearch and then added to the Cortx S3 bucket with the file name as its key.
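The file sync triggered by the file_share event (index the file name first, then store the file under that name) can be sketched roughly as follows. This is a minimal illustration, not the project's actual handler: the function name, the index name `files` and the use of the file name as the document id are assumptions, `es` and `s3` stand for an Elasticsearch client and a boto3 S3 client, and the `document=` keyword assumes a recent Python Elasticsearch Client (older releases take `body=` instead).

```python
from typing import Any


def sync_shared_file(es: Any, s3: Any, filename: str, content: bytes,
                     index: str = "files", bucket: str = "testbucket") -> None:
    """Index a shared file's name, then store the file under the same key.

    `index` is a hypothetical index name; `bucket` matches the test bucket
    name used in this README.
    """
    # Index first so the file becomes searchable by name.
    es.index(index=index, id=filename, document={"filename": filename})
    # Then upload the bytes to the S3 bucket, keyed by the file name.
    s3.put_object(Bucket=bucket, Key=filename, Body=content)
```

Passing the clients in as arguments keeps the function easy to exercise with stand-in objects before pointing it at a real Cortx endpoint.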
For /cortx-s3-get, after fetching the filename from the command['text'] parameter, we check if the file exists using the es.exists function (es = Elasticsearch client). If the file is found, we return the file to the user as a direct message.
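A minimal sketch of that lookup, under some assumptions: the index name `files`, the function name and the use of the file name as the document id are hypothetical, and fetching the bytes with the boto3 `get_object` call is our guess at how the file is retrieved before it is sent to the user.

```python
from typing import Any, Optional


def fetch_file(es: Any, s3: Any, command: dict,
               index: str = "files", bucket: str = "testbucket") -> Optional[bytes]:
    """Return the requested file's bytes, or None when it is not indexed."""
    filename = command.get("text", "").strip()
    # Check the index first; this avoids a round trip to the bucket
    # when the file name is empty or unknown.
    if not filename or not es.exists(index=index, id=filename):
        return None
    response = s3.get_object(Bucket=bucket, Key=filename)
    return response["Body"].read()
```

A `None` result would let the caller reply with a "file not found" message instead of attempting an upload to the user.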
The /cortx-s3-search command opens a modal inside Slack with a search bar; as the user types, file names matching the text are suggested.
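One way such suggestions could be produced is sketched below. It is an illustration only: this README does not say which Elasticsearch autocomplete feature is used, so the `match_phrase_prefix` query, the `filename` field and both function names are assumptions. The second function turns a search response into the option objects that a Block Kit select menu expects.

```python
from typing import Any, Dict, List


def build_prefix_query(typed: str, field: str = "filename",
                       size: int = 10) -> Dict[str, Any]:
    """Build a search body matching documents whose field starts with the text."""
    return {"size": size, "query": {"match_phrase_prefix": {field: typed}}}


def hits_to_options(response: Dict[str, Any],
                    field: str = "filename") -> List[Dict[str, Any]]:
    """Convert an Elasticsearch search response into Block Kit option objects."""
    options = []
    for hit in response.get("hits", {}).get("hits", []):
        name = hit["_source"][field]
        options.append({"text": {"type": "plain_text", "text": name},
                        "value": name})
    return options
```

The options list is what the app would return from the Options Load URL while the user is typing in the modal.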
For /cortx-s3-delete, after fetching the filename from the command['text'] parameter, we check if the file exists using the es.exists function (es = Elasticsearch client). If the file is found, we ask the user to confirm that they want to permanently delete the file from the S3 bucket. If the user clicks yes, the file is permanently deleted.
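The confirmation step could look roughly like the sketch below, which builds a Block Kit section whose button carries a confirm dialog, plus a helper that performs the deletion. The function names, the `confirm_delete` action id and the index name are hypothetical, and removing the Elasticsearch entry alongside the S3 object is an assumption made here to keep the index in step with the bucket.

```python
from typing import Any, Dict, List


def plain_text(text: str) -> Dict[str, str]:
    """Build a Block Kit plain_text object."""
    return {"type": "plain_text", "text": text}


def delete_confirmation_blocks(filename: str) -> List[Dict[str, Any]]:
    """Build a message with a Delete button guarded by a confirm dialog."""
    button = {
        "type": "button",
        "text": plain_text("Delete"),
        "style": "danger",
        "action_id": "confirm_delete",  # hypothetical action id
        "value": filename,
        "confirm": {
            "title": plain_text("Are you sure?"),
            "text": plain_text("This permanently deletes the file."),
            "confirm": plain_text("Yes"),
            "deny": plain_text("No"),
        },
    }
    section_text = {"type": "mrkdwn",
                    "text": f"Delete `{filename}` from the S3 bucket?"}
    return [{"type": "section", "text": section_text, "accessory": button}]


def delete_file(es: Any, s3: Any, filename: str,
                index: str = "files", bucket: str = "testbucket") -> None:
    """Remove the object from the bucket and its entry from the index."""
    s3.delete_object(Bucket=bucket, Key=filename)
    es.delete(index=index, id=filename)
```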
When the /cortx-s3-upload-resume command is invoked, we get the name of the file from the command['text'] parameter. The Slack app searches for the file on the S3 bucket and downloads it for local processing. The text is extracted from the .jpeg or .pdf resume file by AWS Textract using OCR (Optical Character Recognition). The text is then passed to AWS Comprehend, which identifies Personally Identifiable Information (PII) of the employee, such as name, email and phone number, in the document. This data is appended to the resume-data.csv file.
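The post-processing of the two AWS responses can be sketched as below. The functions operate on response dictionaries shaped like those returned by Textract's `detect_document_text` and Comprehend's `detect_pii_entities`; the function names, the choice to keep only the first entity of each type and the column order written to the CSV file are assumptions, not necessarily what process_resume.py does.

```python
import csv
from typing import Any, Dict

# Comprehend PII entity types for name, email and phone number.
PII_TYPES = ("NAME", "EMAIL", "PHONE")


def text_from_textract(response: Dict[str, Any]) -> str:
    """Join the LINE blocks of a Textract response into plain text."""
    lines = [block["Text"] for block in response.get("Blocks", [])
             if block.get("BlockType") == "LINE"]
    return "\n".join(lines)


def pii_from_comprehend(text: str, response: Dict[str, Any]) -> Dict[str, str]:
    """Pick the first entity of each wanted type, sliced out of the text."""
    found: Dict[str, str] = {}
    for entity in response.get("Entities", []):
        kind = entity["Type"]
        if kind in PII_TYPES and kind not in found:
            found[kind] = text[entity["BeginOffset"]:entity["EndOffset"]]
    return found


def append_row(path: str, pii: Dict[str, str]) -> None:
    """Append one name, email, phone row to the CSV file."""
    with open(path, "a", newline="") as handle:
        csv.writer(handle).writerow([pii.get(kind, "") for kind in PII_TYPES])
```

Comprehend reports entities as offsets into the submitted text, which is why the original text is needed to recover the actual values.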
When /cortx-s3-resume-data is invoked, we get the names and email addresses of the employees in a table inside Slack, populated with the data from the resume-data.csv file.
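As a rough idea of how CSV rows can be turned into a table-like Slack message, the sketch below pads each column and wraps the result in a code block so that Slack renders it in a monospaced font. How the bot actually renders its table is not described here, so the function name and the layout are assumptions.

```python
import csv
import io


def format_resume_table(csv_text: str) -> str:
    """Render CSV text as an aligned, monospaced table for a Slack message."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if not rows:
        return "No resume data yet."
    # Pad short rows so every row has the same number of columns.
    ncols = max(len(row) for row in rows)
    rows = [row + [""] * (ncols - len(row)) for row in rows]
    widths = [max(len(row[i]) for row in rows) for i in range(ncols)]
    lines = [" | ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
             for row in rows]
    fence = "`" * 3  # Slack renders text between triple backticks as code
    return fence + "\n" + "\n".join(lines) + "\n" + fence
```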
To test the integration you need to have Python installed on your computer. You can get a suitable release from python.org. You can check your Python version with python3 --version.
We recommend using a virtual environment for development. Read about it here.
Use the following steps to create a virtual environment, clone the repository and install all the packages.
# Python 3.6+ required
git clone https://github.com/sarthakarora1208/cortx-s3-slack-bot
cd cortx-s3-slack-bot
python3 -m venv env
source env/bin/activate
pip3 install -r requirements.txt
To connect to a Cortx server you need to set endpoint_url, aws_access_key_id and aws_secret_access_key in the .env file. If you are using a Cloudshare environment and followed the instructions from https://raw.githubusercontent.com/Seagate/cortx/wiki/CORTX-Cloudshare-Setup-for-April-Hackathon-2021, you can simply copy the Server URL (CORTX endpoint) from the Connection Details section -> External address on your cortx-va-1.03 VM and paste it into the .env file.
ENDPOINT_URL=""
AWS_ACCESS_KEY_ID="AKIAtEpiGWUcQIelPRlD1Pi6xQ"
AWS_SECRET_ACCESS_KEY="YNV6xS8lXnCTGSy1x2vGkmGnmdJbZSapNXaSaRhK"
You need to have a bucket named 'testbucket' in order for the code to work.
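A small sketch of how these settings might be collected before creating the S3 client is shown below. It assumes the .env values have already been loaded into the process environment (for example with python-dotenv's `load_dotenv()`); the function name is hypothetical.

```python
import os
from typing import Dict, Mapping

REQUIRED = ("ENDPOINT_URL", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def s3_client_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect the Cortx connection settings, failing early if one is unset."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing settings in .env: " + ", ".join(missing))
    return {
        "endpoint_url": env["ENDPOINT_URL"],
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
    }
```

The result can be passed straight to boto3, as in `boto3.client("s3", **s3_client_kwargs())`, so a forgotten ENDPOINT_URL surfaces as a clear error instead of a request to the default AWS endpoint.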
To develop locally we'll be using ngrok, which allows you to expose a public endpoint that Slack can use to send your app events. If you haven't already, install ngrok from their website.
You need a verified AWS account to test process_resume.py.
You can find your credentials file at ~/.aws/credentials (C:\Users\USER_NAME\.aws\credentials for Windows users); copy the following lines into the .env file, filling in your own values.
AMAZON_AWS_ACCESS_KEY_ID="YOUR_ACCESS_KEY_ID"
AMAZON_AWS_SECRET_ACCESS_KEY="YOUR_SECRET_ACCESS_KEY"
The Slack bot uses Elasticsearch to index the files on the S3 bucket.
You can download Elasticsearch from their website.
You can run Elasticsearch on your own hardware, or use the hosted Elasticsearch Service on Elastic Cloud.
You can change the config variables in the .env file if you choose a hosted option.
ELASTIC_DOMAIN='http://localhost'
ELASTIC_PORT=9200
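For illustration, the two variables can be combined into a single URL for the client as sketched below. The function name is hypothetical, and stripping surrounding quotes is only a defensive assumption in case the loader leaves them in place.

```python
from typing import Mapping


def elastic_url(env: Mapping[str, str]) -> str:
    """Combine ELASTIC_DOMAIN and ELASTIC_PORT into one URL."""
    domain = env.get("ELASTIC_DOMAIN", "http://localhost").strip("'\"").rstrip("/")
    port = int(env.get("ELASTIC_PORT", "9200"))
    return f"{domain}:{port}"
```

The resulting string can be handed to the Python Elasticsearch Client, for example `Elasticsearch(elastic_url(os.environ))`.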
You can test the Elasticsearch client by running elastic_connector.py.
python3 elastic_connector.py
A successful connection will yield:
You need to have Slack installed on your computer. If you don't have Slack, you can get it from here for Windows or Mac. Log in to your account; if you don't have an account, you can make one here.
If you are an existing user, you need to make a new channel #cortx-s3-test and you must be able to add new apps.
If you are a new user, you need to create a new workspace (https://slack.com/create) and add a new channel #cortx-s3-test.
To get started, you'll need to create a new Slack app. Go to: https://api.slack.com/apps
Bolt is a foundational framework that makes it easier to build Slack apps with the platform's latest features. We will be using it to make our Slack bot.
- Click on the Create an App button.
- Give the app name as cortx-bot and choose the development workspace.
- Requesting scopes - Scopes give your app permission to do things (for example, post messages) in your development workspace. You can select the scopes to add to your app by navigating over to the OAuth & Permissions sidebar.
- Add the following scopes to the Bot Token Scopes by clicking on the Add an OAuth Scope button.
OAuth Scope | Description |
---|---|
channels:history | View messages and other content in public channels that cortx-bot has been added to |
channels:join | Join public channels in a workspace |
channels:read | View basic information about public channels in a workspace |
chat:write | Send messages as @cortxbot |
chat:write.customize | Send messages as @cortxbot with a customized username and avatar |
chat:write.public | Send messages to channels @cortxbot isn't a member of |
commands | Add shortcuts and/or slash commands that people can use |
files:read | View files shared in channels and conversations that cortx-bot has been added to |
files:write | Upload, edit, and delete files as cortx-bot |
- Add the following scopes to the User Token Scopes by clicking on the Add an OAuth Scope button.
OAuth Scope | Description |
---|---|
channels:history | View messages and other content in public channels that cortx-bot has been added to |
files:read | View files shared in channels and conversations that cortx-bot has been added to |
- Install your own app by selecting the Install App button at the top of the OAuth & Permissions page, or from the sidebar.
- After clicking through one more green Install App To Workspace button, you'll be sent through the Slack OAuth UI.
- After installation, you'll land back on the OAuth & Permissions page and find a Bot User OAuth Access Token and a User OAuth Token. Click on the copy button for each of them. These tokens need to be added to the .env file. (The bot token starts with xoxb, whereas the user token is longer and starts with xoxp.)
SLACK_USER_TOKEN=xoxp-your-user-token
SLACK_BOT_TOKEN=xoxb-your-bot-token
- In addition to the access token, you'll need a signing secret. Your app's signing secret verifies that incoming requests are coming from Slack. Navigate to the Basic Information page from your app management page. Under App Credentials, copy the value for Signing Secret and add it to the .env file.
SLACK_SIGNING_SECRET=your-signing-secret
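Bolt for Python performs this verification for you once the signing secret is configured. Purely to illustrate what the check involves, the sketch below follows Slack's documented v0 signing scheme: an HMAC-SHA256 over `v0:<timestamp>:<raw body>`, compared with the X-Slack-Signature header. The function name and the five-minute tolerance are choices made for this example.

```python
import hashlib
import hmac
import time
from typing import Optional


def is_valid_slack_signature(signing_secret: str, timestamp: str, body: str,
                             signature: str, now: Optional[float] = None,
                             max_age: int = 300) -> bool:
    """Check an X-Slack-Signature header against the raw request body."""
    current = time.time() if now is None else now
    try:
        # Reject old requests to limit replay attacks.
        if abs(current - int(timestamp)) > max_age:
            return False
    except ValueError:
        return False
    base = f"v0:{timestamp}:{body}".encode("utf-8")
    digest = hmac.new(signing_secret.encode("utf-8"), base, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest("v0=" + digest, signature)
```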
- Make sure you have followed the steps in Cloning the repo, then start the Bolt app. The HTTP server uses a built-in development adapter, which handles and parses incoming events from Slack on port 3000.
python3 app.py
Open a new terminal and make sure ngrok is installed, then tell ngrok to use port 3000 (which Bolt for Python uses by default):
ngrok http 3000
For local Slack development, we'll use your ngrok URL from above, so copy it to your clipboard.
For example: https://your-own-url.ngrok.io
- Subscribing to events - Your app can listen to all sorts of events happening around your workspace: messages being posted, files being shared, and more. On your app configuration page, select the Event Subscriptions sidebar. You'll be presented with an input box to enter a Request URL, which is where Slack sends the events your app is subscribed to. Hit the Save button after entering it.
By default Bolt for Python listens for all incoming requests at the /slack/events route, so for the Request URL you can enter your ngrok URL appended with /slack/events.
Request URL: https://your-own-url.ngrok.io/slack/events
If the challenge was successful, you will see Verified right next to the Request URL.
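The challenge mentioned here is Slack's url_verification handshake: Slack posts a JSON body containing a challenge value and expects that value back in the response. Bolt answers it automatically; the sketch below, with a hypothetical function name, only shows the idea.

```python
import json
from typing import Optional


def challenge_response(raw_body: str) -> Optional[str]:
    """Return the challenge to echo back, or None for any other payload."""
    payload = json.loads(raw_body)
    if payload.get("type") == "url_verification":
        return payload.get("challenge")
    return None
```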
On the same page, click on the Subscribe to bot events menu at the bottom of the page, then click on Add Bot User Event. Similarly, click on Subscribe to events on behalf of user, then click on Add Workspace Event. Add the following events:
EventName | Description | Required Scope |
---|---|---|
file_share | A file was shared | files:read |
message.channels | A message was posted to a channel | channels:history |
- Next, select the Interactivity & Shortcuts sidebar and toggle the switch on. Again, for the Request URL enter your ngrok URL appended with /slack/events.
Request URL: https://your-own-url.ngrok.io/slack/events
- Scroll down to the Select Menus section; in the Options Load URL, enter your ngrok URL appended with /slack/events.
Options Load URL: https://your-own-url.ngrok.io/slack/events
- Finally we come to the slash commands. Slack's custom slash commands perform a very simple task: they take whatever text you enter after the command itself (along with some other predefined values), send it to a URL, then accept whatever the script returns and posts it as a Slackbot message to the person who issued the command. We have 5 slash commands to be added in the workspace.
Head over to the Slash Commands sidebar and click on the Create New Command button to open the Create New Command page.
Add the Command, Request URL, Short Description and Usage Hint according to the table provided below.
Click on Save to return to the Slash Commands page.
Command | Request URL | Short Description | Usage Hint |
---|---|---|---|
/cortx-s3-get | https://your-own-url.ngrok.io/slack/events | Get a file from s3 bucket | filename |
/cortx-s3-search | https://your-own-url.ngrok.io/slack/events | Search for a file in S3 | |
/cortx-s3-delete | https://your-own-url.ngrok.io/slack/events | Deletes the given file from the s3 bucket | filename |
/cortx-s3-upload-resume | https://your-own-url.ngrok.io/slack/events | Upload resume to database | resume.pdf |
/cortx-s3-resume-data | https://your-own-url.ngrok.io/slack/events | Get resume data from s3 | |
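Since all five commands in the table post to the same Request URL, the app tells them apart by the `command` field of the form-encoded payload that Slack sends. Bolt does this parsing and routing for you; the sketch below, with a hypothetical function name, only illustrates what arrives at the endpoint.

```python
from typing import Dict
from urllib.parse import parse_qs

COMMANDS = {
    "/cortx-s3-get",
    "/cortx-s3-search",
    "/cortx-s3-delete",
    "/cortx-s3-upload-resume",
    "/cortx-s3-resume-data",
}


def parse_slash_command(raw_body: str) -> Dict[str, str]:
    """Extract the command, its text and the calling user from a payload."""
    parsed = parse_qs(raw_body, keep_blank_values=True)
    fields = {key: values[0] for key, values in parsed.items()}
    command = fields.get("command", "")
    if command not in COMMANDS:
        raise ValueError(f"Unknown command: {command!r}")
    return {
        "command": command,
        "text": fields.get("text", "").strip(),
        "user_id": fields.get("user_id", ""),
    }
```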
- Watch the video to learn more about using these slash commands.
- Open Slack and upload a file in any channel; note the file name.
- Then type /cortx-s3-search and search for your file.