Commit 51758d8

Add Automation script/workflow for updating mentor data for ad-hoc monthly prep (#575)
* script to update mentors.yml with monthly ad-hoc availability
* updates, readmes, minor fixes
* add workflow for adhoc-availability [wip to restrict to only parent repo]
* update meetup automation to use new generic token, prev to be deleted
* replace gdown with rclone for retrieving google drive files securely
* use copyid syntax from documentation
* updates
* updates
* documentation updates
* updates + doc comments for clarity
1 parent 0734177 commit 51758d8

7 files changed

Lines changed: 254 additions & 5 deletions

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
+name: Update Mentor Availabilities for Monthly Ad-Hoc Prep
+
+on:
+  workflow_dispatch:
+    inputs:
+      month:
+        description: "Month number (e.g. 10 for October)"
+        required: true
+      file_id:
+        description: "Google Drive file ID of the Excel sheet with the availabilities"
+        required: true
+
+jobs:
+  update-mentors:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v5
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.12'
+
+      - name: Cache pip
+        uses: actions/cache@v4
+        with:
+          path: ~/.cache/pip
+          key: ${{ runner.os }}-pip-${{ hashFiles('tools/requirements.txt') }}
+          restore-keys: |
+            ${{ runner.os }}-pip-
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r tools/requirements.txt
+
+      - name: Install and Configure rclone with Google Cloud service account
+        run: |
+          curl https://rclone.org/install.sh | sudo bash
+          echo '${{ secrets.GOOGLECLOUD_SERVICE_KEY_RETRIEVE_ADHOC_FILE_JSON }}' > service_account.json
+          rclone config create gdrive drive scope=drive service_account_file=service_account.json
+
+      - name: Download spreadsheet from Google Drive
+        run: |
+          rclone backend copyid gdrive: ${{ github.event.inputs.file_id }} tools/adhoc.xlsx
+
+      - name: Run script
+        run: |
+          cd tools
+          python automation_prepare_adhoc_availability.py adhoc.xlsx ${{ github.event.inputs.month }}
+
+      - name: Cleanup files
+        if: always()
+        run: rm -f service_account.json tools/adhoc.xlsx
+
+      - name: Create or Update Pull Request
+        uses: peter-evans/create-pull-request@v7
+        with:
+          token: ${{ secrets.GHA_ACTIONS_ALLOW_TOKEN }}
+          commit-message: "updated mentor hours, availabilities, and sort for monthly adhoc prep"
+          branch: "automation/adhoc-monthly-prep"
+          team-reviewers: |
+            Women-Coding-Community/leaders
+          title: "[WCC Bot] Monthly Ad-hoc Prep - Month ${{ github.event.inputs.month }}"
+          body: |
+            This PR was created automatically by a GitHub Action that handles mentor data updates for the monthly ad-hoc preparation.
+            Only `_data/mentors.yml` should be updated.
+
+            Please review the changes to `_data/mentors.yml` and ensure that they are as expected before merging (review the availability sheet used).
+          labels: |
+            automation
+            adhoc-prep
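
Review note: the workflow passes the `month` and `file_id` inputs straight into shell commands. A minimal sketch of an up-front check, assuming it would run as a small Python step before the download. The function name `validate_inputs` and the file ID pattern (letters, digits, `-` and `_`) are assumptions made for illustration; they are not part of this commit.

```python
import re

# Assumption: Drive file IDs consist only of letters, digits, '-' and '_'.
FILE_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def validate_inputs(month: str, file_id: str) -> tuple[int, str]:
    """Hypothetical helper: reject inputs that are not a plain month number / file ID."""
    # Accept only ASCII digits so that int() below cannot be surprised.
    if not (month.isascii() and month.isdigit()) or not 1 <= int(month) <= 12:
        raise ValueError(f"month must be a number from 1 to 12, got {month!r}")
    # fullmatch() ensures no spaces, quotes or shell metacharacters slip through.
    if not FILE_ID_PATTERN.fullmatch(file_id):
        raise ValueError(f"file_id contains unexpected characters: {file_id!r}")
    return int(month), file_id


if __name__ == "__main__":
    print(validate_inputs("10", "1AbC_d-9"))
```

Rejecting anything outside a narrow pattern keeps spaces and shell metacharacters out of the later `rclone` and `python` command lines.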

.github/workflows/import_meetup_events.yml

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ jobs:
       - name: Create or Update Pull Request
         uses: peter-evans/create-pull-request@v7
         with:
-          token: ${{ secrets.MEETUP_IMPORT_ACTIONS_TOKEN }}
+          token: ${{ secrets.GHA_ACTIONS_ALLOW_TOKEN }}
           commit-message: "Automated import of Meetup events"
           branch: "automation/import-meetup-events"
           team-reviewers: "Women-Coding-Community/leaders"

_data/mentors.yml

Lines changed: 2 additions & 2 deletions
@@ -2824,7 +2824,7 @@
 - Grow beyond senior level
 extra: Focus areas for aspiring data scientists; transitioning into data science, building a roadmap, preparing for technical interviews, optimizing resume/LinkedIn, creating and showcasing portfolio projects, and communicating insights effectively.
 network:
-- linkedin: https://www.linkedin.com/in/ilayda-yilmaz/
+- linkedin: https://www.linkedin.com/in/ilayda-yilmaz/
 
 - name: Damola Taiwo
 disabled: false
@@ -2891,7 +2891,7 @@
 - Grow from mid-level to senior-level
 - Grow from beginner to mid-level
 - Switch career to IT
-extra:
+extra: |
 Data engineering best practices; Data architecture design and implementation; Transitioning from on-premises to cloud solutions (with focus on Databricks);
 Growing from junior to mid-level; Growing from mid-level to senior-level; Career development and strategic planning in data roles;
 Technology architecture & strategy; Career growth & soft skills; Navigating the job market and building a professional brand; Effective communication in technical teams

tools/README.md

Lines changed: 20 additions & 2 deletions
@@ -5,7 +5,11 @@ There are two automation scripts:
 
 2) `download_image.py`: downloads image from a specified URL and saves in `assets/images/mentors`
 
-3) `automation_create_mentor_spreadsheets.py`: creates spreadhseets for each longterm mentor with filenames like `WCC - Long Term - MentorName.xlsx`. All the files are saved in a folder named `Long Term Mentors`. It uses the data from `Mentorship Programme long-term Registration Form for Mentees (Responses).xlsx` sheetname `Revised Mentees`as input.
+3) `meetup_import.py`: imports new upcoming events from the WCC MeetUp page using the iCal feed: https://www.meetup.com/women-coding-community/events/ical/
+
+4) `automation_create_mentor_spreadsheets.py`: creates spreadsheets for each long-term mentor with filenames like `WCC - Long Term - MentorName.xlsx`. All the files are saved in a folder named `Long Term Mentors`. It uses the data from `Mentorship Programme long-term Registration Form for Mentees (Responses).xlsx`, sheet name `Revised Mentees`, as input.
+
+5) `automation_prepare_adhoc_availability.py`: updates mentor data with the availability hours specified in `samples/adhoc-prep.xlsx`, in preparation for monthly ad-hoc mentorship.
 
 ### Dependencies
 
@@ -75,4 +79,18 @@ sh run_meetup_import.sh
 📁 Long Term Mentors
 │── WCC - Long Term - Nonna Shakhova.xlsx
 │── WCC - Long Term - Rajani Rao.xlsx
-│── WCC - Long Term - Gabriel Oliveira.xlsx └── (more mentor files...)
+│── WCC - Long Term - Gabriel Oliveira.xlsx └── (more mentor files...)
+
+#### E) `automation_prepare_adhoc_availability.py`
+
+```shell
+sh run_adhoc_prep_automation.sh
+```
+**Note:**
+- If running locally, make sure to update `adhoc-prep.xlsx` with the new data for the mentors.
+- If using GitHub Actions, the GHA workflow for this script uses a Google Cloud service account to retrieve the file from Google Drive. The service key has been configured for the womencodingcommunity Google Drive account, and the file to be used/updated has been shared with the service account email.
+Hence, to run the GHA workflow, you only need to provide:
+  - the month value (e.g. 9 for September), and
+  - the file ID of the Excel sheet to use
+
+For more information on the GC service account configurations, you can read the [README](blog_automation/README.md) in the blog automation folder.
Lines changed: 144 additions & 0 deletions
@@ -0,0 +1,144 @@
+"""
+Updates availability and hours for mentors in preparation for ad-hoc registration for a specified month.
+"""
+# !/usr/bin/env python
+
+import logging
+import sys
+import pandas as pd
+from ruamel.yaml import YAML
+
+yaml = YAML()
+yaml.width = 4096
+
+TYPE_LONG_TERM = "long-term"
+TYPE_AD_HOC = "ad-hoc"
+TYPE_BOTH = "both"
+
+MONTHS_MAP = {
+    4: 'April',
+    5: 'May',
+    6: 'June',
+    7: 'July',
+    8: 'August',
+    9: 'September',
+    10: 'October',
+    11: 'November'
+}
+
+
+def get_available_mentor_sort(mentor, current_availability):
+    """
+    Returns sort value for available mentors ONLY if:
+    - mentor is new (current availability still contains full list of months), sort to highest: 500
+    - mentor has >3 available hours, sort to highest: 500
+    - 3 or less hours, sort: 200
+
+    Note: sort logic for unavailable mentors is split on purpose (see update_mentor_availability function)
+
+    Guide: https://docs.google.com/document/d/1GwlleBNScHCQ3K8rgvYIB3upIr1BylgWjGR2jxwYWtI/edit?tab=t.0
+    """
+
+    # Treat a missing 'hours' value as 0 so the comparison cannot fail on None
+    if len(current_availability) > 1 or (mentor.get('hours') or 0) > 3:
+        return 500
+
+    return 200
+
+
+def get_unavailable_mentor_sort(mentor):
+    """
+    Returns sort value for unavailable mentor if:
+    - mentor is ad-hoc only or both but no available hours for the month, sort: 100
+    - mentor is long-term only, sort: 10
+    - mentor is deactivated, sort: 1
+    """
+    if mentor.get("disabled", True):
+        return 1
+
+    if mentor.get("type") == TYPE_LONG_TERM:
+        return 10
+
+    return 100
+
+
+def get_availability_update_dict(available_mentors):
+    """
+    Returns a dictionary mapping mentor to their available hours (from spreadsheet file)
+    - If the hours cell in the spreadsheet is empty (NaN or blank), the value will be None (indicating existing hours should be kept)
+    """
+    availability_update_dict = {}
+
+    for _, row in available_mentors.iterrows():
+        mentor_name = row['Mentor Name'].strip()
+        updated_hours = row['Availability (Hours)']
+
+        if pd.isna(updated_hours) or str(updated_hours).strip() == "":
+            availability_update_dict[mentor_name] = None
+        else:
+            availability_update_dict[mentor_name] = updated_hours
+
+    return availability_update_dict
+
+
+def update_mentor_availability(month, xlsx_file_path, yml_file_path):
+    """
+    Updates mentor availability and hours in the mentors.yml file based on the provided xlsx file for a given month.
+    - Mentors not listed in the xlsx file are set to unavailable for the month.
+    - Mentors listed in the xlsx file have their availability set to the specified month and hours updated if provided.
+    - All mentors are re-sorted according to the guide: https://docs.google.com/document/d/1GwlleBNScHCQ3K8rgvYIB3upIr1BylgWjGR2jxwYWtI/edit?tab=t.0
+    """
+
+    df_available_mentors = pd.read_excel(xlsx_file_path)
+    availability_updates = get_availability_update_dict(df_available_mentors)
+
+    with open(yml_file_path, 'r') as input_yml:
+        mentors = yaml.load(input_yml) or []
+
+    for mentor in mentors:
+        yml_name = mentor['name'].strip()
+
+        # if mentor is not included in availability file: update sort, set availability to none, and move to next mentor
+        if yml_name not in availability_updates:
+            mentor['sort'] = get_unavailable_mentor_sort(mentor)
+            mentor['availability'] = []
+            continue
+
+        # otherwise: mentor is available, update sort and reset availability to the current month only
+        current_availability = mentor.get('availability', [])
+        mentor['sort'] = get_available_mentor_sort(mentor, current_availability)
+        mentor['availability'] = [month]
+
+        # Only update hours if updated hours is not None
+        updated_hours = availability_updates.get(yml_name)
+        if updated_hours is not None:
+            logging.info(f"Updating hours for {yml_name} to: {updated_hours}")
+            mentor['hours'] = updated_hours
+
+    with open(yml_file_path, 'w') as f:
+        yaml.default_flow_style = True
+        yaml.dump(mentors, f)
+
+    # Fall back to the raw number so a month outside MONTHS_MAP does not raise after the file is written
+    print(f"Mentor availabilities updated for month {MONTHS_MAP.get(month, month)}.")
+
+
+def run_automation():
+    logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
+
+    mentors_yml_file_path = "../_data/mentors.yml"
+
+    if len(sys.argv) == 3:
+        xlsx_file_path = sys.argv[1]
+        month = int(sys.argv[2])
+
+        logging.info("Using values: xlsx: %s, month: %s", xlsx_file_path, month)
+    else:
+        xlsx_file_path = "samples/adhoc-prep.xlsx"
+        month = 11
+
+        logging.info("Default values: xlsx: %s, month: %s", xlsx_file_path, month)
+
+    update_mentor_availability(month, xlsx_file_path, mentors_yml_file_path)
+
+
+if __name__ == "__main__":
+    run_automation()
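
Review note: the sort rules described in the two docstrings can be read as one decision table. The sketch below restates them in a single pure function so the values are easy to check by hand. `sort_value` and its `listed` flag are hypothetical names introduced here, and treating a missing `hours` value as 0 is an assumption rather than something the guide states.

```python
TYPE_LONG_TERM = "long-term"


def sort_value(mentor: dict, listed: bool) -> int:
    """Hypothetical helper: sort value for a mentor, given whether the
    availability sheet lists them for the month."""
    if not listed:
        # Unavailable mentors: deactivated < long-term only < ad-hoc/both.
        if mentor.get("disabled", True):
            return 1
        if mentor.get("type") == TYPE_LONG_TERM:
            return 10
        return 100

    months = mentor.get("availability") or []
    hours = mentor.get("hours") or 0  # assumption: missing hours count as 0
    # New mentors still carry several months; they and mentors with >3 hours go on top.
    if len(months) > 1 or hours > 3:
        return 500
    return 200


if __name__ == "__main__":
    examples = [
        ({"availability": [4, 5, 6], "hours": 2}, True),      # new mentor
        ({"availability": [10], "hours": 5}, True),           # more than 3 hours
        ({"availability": [10], "hours": 3}, True),           # 3 or fewer hours
        ({"disabled": False, "type": "long-term"}, False),    # long-term only
        ({"disabled": True}, False),                          # deactivated
    ]
    for mentor, listed in examples:
        print(sort_value(mentor, listed), mentor)
```

Keeping the rule free of file I/O makes it straightforward to unit-test separately from the YAML round trip.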

tools/run_adhoc_prep_automation.sh

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+# Create the virtual environment on macOS/Linux
+python3 -m venv myenv
+
+# Activate the virtual environment ('.' is the POSIX form; 'source' is not available in every sh)
+. myenv/bin/activate
+
+# Install packages
+pip install -r requirements.txt
+
+# Enter the parameters: FILE_PATH_XLSX MONTH
+# Example: samples/adhoc-prep.xlsx 9
+# month: the ad-hoc month as a number, e.g. 4 -> April, 11 -> November
+python3 automation_prepare_adhoc_availability.py samples/adhoc-prep.xlsx 10
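
Review note: `MONTHS_MAP` in the Python script covers April (4) to November (11) only. A minimal sketch of rejecting other month numbers before any file is touched, assuming the ad-hoc programme really is limited to those months. `parse_month` is a hypothetical helper, not part of this commit.

```python
MONTHS_MAP = {
    4: "April", 5: "May", 6: "June", 7: "July",
    8: "August", 9: "September", 10: "October", 11: "November",
}


def parse_month(raw: str) -> int:
    """Hypothetical helper: convert a CLI argument to a supported month number."""
    month = int(raw)  # raises ValueError for non-numeric input
    if month not in MONTHS_MAP:
        supported = ", ".join(str(m) for m in sorted(MONTHS_MAP))
        raise ValueError(f"month must be one of {supported}; got {month}")
    return month


if __name__ == "__main__":
    month = parse_month("10")
    print(f"Preparing ad-hoc availability for {MONTHS_MAP[month]}")
```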

tools/samples/adhoc-prep.xlsx

9.22 KB
Binary file not shown.
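
Review note: the sample sheet is binary, but the script reads two columns from it, `Mentor Name` and `Availability (Hours)`, and keeps a mentor's existing hours when the hours cell is blank. The sketch below mimics that mapping with plain dictionaries instead of pandas. The mentor names are made up, `parse_hours` is a hypothetical helper, and treating non-numeric text such as "n/a" as blank is an assumption that goes beyond what the committed code does.

```python
import math


def parse_hours(value):
    """Hypothetical helper: return hours as a number, or None when the cell
    is blank, NaN, or not a number (None means: keep the existing hours)."""
    if value is None:
        return None
    if isinstance(value, str):
        value = value.strip()
        if not value:
            return None
    try:
        hours = float(value)
    except (TypeError, ValueError):
        return None
    if not math.isfinite(hours):
        return None
    # Keep whole numbers as int so the YAML shows 4 rather than 4.0.
    return int(hours) if hours.is_integer() else hours


# Made-up rows in the shape the script expects from the spreadsheet.
rows = [
    {"Mentor Name": " Jane Doe ", "Availability (Hours)": 4},
    {"Mentor Name": "John Roe", "Availability (Hours)": float("nan")},
    {"Mentor Name": "Ann Poe", "Availability (Hours)": "n/a"},
]

updates = {
    row["Mentor Name"].strip(): parse_hours(row["Availability (Hours)"])
    for row in rows
}

if __name__ == "__main__":
    print(updates)
```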