Draft
Changes from 4 commits
3 changes: 3 additions & 0 deletions .env-example
@@ -4,6 +4,9 @@ END_DATE = ""
ORGANIZATION = "organization"
REPOSITORY = "organization/repository"
START_DATE = ""
SPONSOR_INFO = "False"
LINK_TO_PROFILE = "True"
ACKNOWLEDGE_COAUTHORS = "True"

# GITHUB APP
GH_APP_ID = ""
21 changes: 12 additions & 9 deletions README.md
@@ -84,20 +84,23 @@ This action can be configured to authenticate with GitHub App Installation or Pe

#### Other Configuration Options

| field | required | default | description |
| ------------------- | ----------------------------------------------- | ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `GH_ENTERPRISE_URL` | False | "" | The `GH_ENTERPRISE_URL` is used to connect to an enterprise server instance of GitHub. github.com users should not enter anything here. |
| `ORGANIZATION` | Required to have `ORGANIZATION` or `REPOSITORY` | | The name of the GitHub organization which you want the contributor information of all repos from. ie. github.com/github would be `github` |
| `REPOSITORY` | Required to have `ORGANIZATION` or `REPOSITORY` | | The name of the repository and organization which you want the contributor information from. ie. `github/contributors` or a comma separated list of multiple repositories `github/contributor,super-linter/super-linter` |
| `START_DATE` | False | Beginning of time | The date from which you want to start gathering contributor information. ie. Aug 1st, 2023 would be `2023-08-01`. |
| `END_DATE` | False | Current Date | The date at which you want to stop gathering contributor information. Must be later than the `START_DATE`. ie. Aug 2nd, 2023 would be `2023-08-02` |
| `SPONSOR_INFO` | False | False | If you want to include sponsor information in the output. This will include the sponsor count and the sponsor URL. This will impact action performance. ie. SPONSOR_INFO = "False" or SPONSOR_INFO = "True" |
| `LINK_TO_PROFILE` | False | True | If you want to link usernames to their GitHub profiles in the output. ie. LINK_TO_PROFILE = "True" or LINK_TO_PROFILE = "False" |
| field | required | default | description |
| ----------------------- | ----------------------------------------------- | ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `GH_ENTERPRISE_URL` | False | "" | The `GH_ENTERPRISE_URL` is used to connect to an enterprise server instance of GitHub. github.com users should not enter anything here. |
| `ORGANIZATION` | Required to have `ORGANIZATION` or `REPOSITORY` | | The name of the GitHub organization which you want the contributor information of all repos from. ie. github.com/github would be `github` |
| `REPOSITORY` | Required to have `ORGANIZATION` or `REPOSITORY` | | The name of the repository and organization which you want the contributor information from. ie. `github/contributors` or a comma separated list of multiple repositories `github/contributor,super-linter/super-linter` |
| `START_DATE` | False | Beginning of time | The date from which you want to start gathering contributor information. ie. Aug 1st, 2023 would be `2023-08-01`. |
| `END_DATE` | False | Current Date | The date at which you want to stop gathering contributor information. Must be later than the `START_DATE`. ie. Aug 2nd, 2023 would be `2023-08-02` |
| `SPONSOR_INFO` | False | False | If you want to include sponsor information in the output. This will include the sponsor count and the sponsor URL. This will impact action performance. ie. SPONSOR_INFO = "False" or SPONSOR_INFO = "True" |
| `LINK_TO_PROFILE` | False | True | If you want to link usernames to their GitHub profiles in the output. ie. LINK_TO_PROFILE = "True" or LINK_TO_PROFILE = "False" |
| `ACKNOWLEDGE_COAUTHORS` | False | True | If you want to include co-authors from commit messages as contributors. Co-authors are identified via the `Co-authored-by:` trailer in commit messages using the GitHub noreply email format (e.g., `username@users.noreply.github.com`). This will impact action performance as it requires scanning all commits. ie. ACKNOWLEDGE_COAUTHORS = "True" or ACKNOWLEDGE_COAUTHORS = "False" |

**Note**: If `start_date` and `end_date` are specified then the action will determine if the contributor is new. A new contributor is one that has contributed in the date range specified but not before the start date.
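
The new-contributor rule in the note above can be sketched as a set difference. This is only an illustration under assumed inputs: `find_new_contributors` and the sample usernames are hypothetical, and the action itself delegates this check to `contributor_stats.is_new_contributor`.

```python
from __future__ import annotations


def find_new_contributors(in_range: set[str], before_start: set[str]) -> set[str]:
    """Return usernames that contributed in the date range but never before it."""
    return in_range - before_start


# Hypothetical data: contributors inside the range vs. before START_DATE
in_range = {"alice", "bob", "carol"}
before_start = {"alice"}

new_contributors = find_new_contributors(in_range, before_start)
print(sorted(new_contributors))  # → ['bob', 'carol']
```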

**Performance Note:** Using start and end dates slows the action down by approximately 63X. For example, if the action takes 1.7 seconds without dates, it will take about 1 minute and 47 seconds with them.
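
As a quick check of the arithmetic in the performance note, the figures can be reproduced as follows. The variable names are illustrative, and the 63X factor is the approximate value quoted in the note, not a measured guarantee.

```python
baseline_seconds = 1.7  # runtime without dates, taken from the note
slowdown_factor = 63    # approximate slowdown quoted in the note

# 1.7 s * 63 is roughly 107 s; split that into minutes and seconds
total_seconds = round(baseline_seconds * slowdown_factor)
minutes, seconds = divmod(total_seconds, 60)
print(f"{minutes} minute(s) and {seconds} seconds")  # → 1 minute(s) and 47 seconds
```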

**Co-authors Note:** When `ACKNOWLEDGE_COAUTHORS` is enabled, the action will scan commit messages for `Co-authored-by:` trailers and include those users as contributors. Only co-authors with GitHub noreply email addresses (e.g., `username@users.noreply.github.com`) will be recognized, as this is the standard format used by GitHub for [creating commits with multiple authors](https://docs.github.com/en/pull-requests/committing-changes-to-your-project/creating-and-editing-commits/creating-a-commit-with-multiple-authors).

### Example workflows

**Be sure to change at least these values: `<YOUR_ORGANIZATION_GOES_HERE>`, `<YOUR_GITHUB_HANDLE_HERE>`**
138 changes: 135 additions & 3 deletions contributors.py
@@ -1,6 +1,7 @@
# pylint: disable=broad-exception-caught
"""This file contains the main() and other functions needed to get contributor information from the organization or repository"""

import re
from typing import List

import auth
@@ -27,6 +28,7 @@ def main():
end_date,
sponsor_info,
link_to_profile,
acknowledge_coauthors,
) = env.get_env_vars()

# Auth to GitHub.com
@@ -46,7 +48,13 @@ def main():

# Get the contributors
contributors = get_all_contributors(
organization, repository_list, start_date, end_date, github_connection, ghe
organization,
repository_list,
start_date,
end_date,
github_connection,
ghe,
acknowledge_coauthors,
)

# Check for new contributor if user provided start_date and end_date
@@ -60,6 +68,7 @@ def main():
end_date=start_date,
github_connection=github_connection,
ghe=ghe,
acknowledge_coauthors=acknowledge_coauthors,
)
for contributor in contributors:
contributor.new_contributor = contributor_stats.is_new_contributor(
@@ -103,6 +112,7 @@ def get_all_contributors(
end_date: str,
github_connection: object,
ghe: str,
acknowledge_coauthors: bool = False,
):
"""
Get all contributors from the organization or repository
@@ -113,6 +123,8 @@ def get_all_contributors(
start_date (str): The start date of the date range for the contributor list.
end_date (str): The end date of the date range for the contributor list.
github_connection (object): The authenticated GitHub connection object from PyGithub
ghe (str): The GitHub Enterprise URL to use for authentication
acknowledge_coauthors (bool): Whether to acknowledge co-authors from commit messages

Returns:
all_contributors (list): A list of ContributorStats objects
@@ -130,7 +142,9 @@ def get_all_contributors(
all_contributors = []
if repos:
for repo in repos:
repo_contributors = get_contributors(repo, start_date, end_date, ghe)
repo_contributors = get_contributors(
repo, start_date, end_date, ghe, acknowledge_coauthors
)
if repo_contributors:
all_contributors.append(repo_contributors)

@@ -140,20 +154,61 @@ def get_all_contributors(
return all_contributors


def get_contributors(repo: object, start_date: str, end_date: str, ghe: str):
def get_coauthors_from_message(commit_message: str) -> List[str]:
    """
    Extract co-author usernames from a commit message.

    Co-authored-by trailers follow the format:
        Co-authored-by: Name <email>
    Or with a GitHub username:
        Co-authored-by: Name <username@users.noreply.github.com>

    Args:
        commit_message (str): The commit message to parse

    Returns:
        List[str]: List of GitHub usernames extracted from co-author trailers
    """
    # Match Co-authored-by trailers - case insensitive
    # Format: Co-authored-by: Name <email>
    pattern = r"Co-authored-by:\s*[^<]*<([^>]+)>"
    matches = re.findall(pattern, commit_message, re.IGNORECASE)

    usernames = []
    for email in matches:
        # Check if it's a GitHub noreply email format: username@users.noreply.github.com
        noreply_pattern = r"^(\d+\+)?([^@]+)@users\.noreply\.github\.com$"
        noreply_match = re.match(noreply_pattern, email)
        if noreply_match:
            usernames.append(noreply_match.group(2))
    return usernames
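
To illustrate which trailers this parsing recognizes, here is a condensed, standalone sketch that reuses the two regular expressions from the diff. The helper name `coauthor_usernames`, the module-level constants, and the sample commit message are made up for the example and are not part of the PR.

```python
import re
from typing import List

# Same patterns as in the diff, precompiled for the example
TRAILER = re.compile(r"Co-authored-by:\s*[^<]*<([^>]+)>", re.IGNORECASE)
NOREPLY = re.compile(r"^(\d+\+)?([^@]+)@users\.noreply\.github\.com$")


def coauthor_usernames(commit_message: str) -> List[str]:
    """Return usernames from trailers that use a GitHub noreply address."""
    usernames = []
    for email in TRAILER.findall(commit_message):
        match = NOREPLY.match(email)
        if match:
            # group(2) is the username, without the optional "ID+" prefix
            usernames.append(match.group(2))
    return usernames


# Hypothetical commit message with three trailers
message = (
    "Add co-author support\n"
    "\n"
    "Co-authored-by: Octo Cat <12345+octocat@users.noreply.github.com>\n"
    "co-authored-by: Mona <mona@users.noreply.github.com>\n"
    "Co-authored-by: Jane Doe <jane@example.com>\n"
)
print(coauthor_usernames(message))  # → ['octocat', 'mona']
```

The third trailer is skipped because its address is not a GitHub noreply address, which matches the limitation described in the README's co-authors note.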


def get_contributors(
repo: object,
start_date: str,
end_date: str,
ghe: str,
acknowledge_coauthors: bool = False,
Member

@copilot why set this to false if the configuration file has true by default. Maybe it's better not to have a default here?

Contributor Author

Removed default values from both get_contributors() and get_all_contributors() function parameters in commit 4454fe7. The acknowledge_coauthors parameter is now always explicitly passed from the configuration, eliminating the inconsistency.

):
"""
Get contributors from a single repository and filter by start end dates if present.

Args:
repo (object): The repository object from PyGithub
start_date (str): The start date of the date range for the contributor list.
end_date (str): The end date of the date range for the contributor list.
ghe (str): The GitHub Enterprise URL to use for authentication
acknowledge_coauthors (bool): Whether to acknowledge co-authors from commit messages

Returns:
contributors (list): A list of ContributorStats objects
"""
all_repo_contributors = repo.contributors()
contributors = []
# Track usernames already added as contributors
contributor_usernames = set()

try:
for user in all_repo_contributors:
# Ignore contributors with [bot] in their name
@@ -187,6 +242,15 @@ def get_contributors(repo: object, start_date: str, end_date: str, ghe: str):
"",
)
contributors.append(contributor)
contributor_usernames.add(user.login)

# Get co-authors from commit messages if enabled
if acknowledge_coauthors:
coauthor_contributors = get_coauthor_contributors(
repo, start_date, end_date, ghe, contributor_usernames
)
contributors.extend(coauthor_contributors)

except Exception as e:
print(f"Error getting contributors for repository: {repo.full_name}")
print(e)
@@ -195,5 +259,73 @@ def get_contributors(repo: object, start_date: str, end_date: str, ghe: str):
return contributors


def get_coauthor_contributors(
    repo: object,
    start_date: str,
    end_date: str,
    ghe: str,
    existing_usernames: set,
) -> List[contributor_stats.ContributorStats]:
    """
    Get contributors who were co-authors on commits in the repository.

    Args:
        repo (object): The repository object
        start_date (str): The start date of the date range for the contributor list.
        end_date (str): The end date of the date range for the contributor list.
        ghe (str): The GitHub Enterprise URL
        existing_usernames (set): Set of usernames already added as contributors

    Returns:
        List[ContributorStats]: A list of ContributorStats objects for co-authors
    """
    coauthor_counts: dict = {}  # username -> count
    endpoint = ghe if ghe else "https://github.com"

    try:
        # Get all commits in the date range
        if start_date and end_date:
            commits = repo.commits(since=start_date, until=end_date)
        else:
            commits = repo.commits()

        for commit in commits:
            # Get commit message from the commit object
            commit_message = commit.commit.message if commit.commit else ""
            if not commit_message:
                continue

            # Extract co-authors from commit message
            coauthors = get_coauthors_from_message(commit_message)
            for username in coauthors:
                if username not in existing_usernames:
                    coauthor_counts[username] = coauthor_counts.get(username, 0) + 1

    except Exception as e:
        print(f"Error getting co-authors for repository: {repo.full_name}")
        print(e)
        return []

    # Create ContributorStats objects for co-authors
    coauthor_contributors = []
    for username, count in coauthor_counts.items():
        if start_date and end_date:
            commit_url = f"{endpoint}/{repo.full_name}/commits?author={username}&since={start_date}&until={end_date}"
        else:
            commit_url = f"{endpoint}/{repo.full_name}/commits?author={username}"

        contributor = contributor_stats.ContributorStats(
            username,
            False,
            "",  # No avatar URL available for co-authors
            count,
            commit_url,
            "",
        )
        coauthor_contributors.append(contributor)

    return coauthor_contributors
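
The counting step in `get_coauthor_contributors` can be shown in isolation. The sketch below assumes the co-author usernames have already been extracted for each commit. `count_new_coauthors` and the sample data are hypothetical names for illustration; the PR uses a plain dict rather than `collections.Counter`.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def count_new_coauthors(
    coauthors_per_commit: Iterable[List[str]], existing_usernames: Set[str]
) -> Dict[str, int]:
    """Count co-authored commits per user, skipping users already listed."""
    counts: Counter = Counter()
    for coauthors in coauthors_per_commit:
        for username in coauthors:
            if username not in existing_usernames:
                counts[username] += 1
    return dict(counts)


# Hypothetical input: co-authors found on three commits
per_commit = [["octocat", "mona"], ["mona"], ["hubot"]]
already_listed = {"octocat"}  # returned by repo.contributors() in the real code
print(count_new_coauthors(per_commit, already_listed))  # → {'mona': 2, 'hubot': 1}
```

One consequence of this design, visible in the example, is that a user who is already a regular contributor does not get extra credit for co-authored commits.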


if __name__ == "__main__":
main()
4 changes: 4 additions & 0 deletions env.py
@@ -85,6 +85,7 @@ def get_env_vars(
str,
bool,
bool,
bool,
]:
"""
Get the environment variables for use in the action.
@@ -105,6 +106,7 @@ def get_env_vars(
end_date (str): The end date to get contributor information to.
sponsor_info (bool): Whether to get sponsor information on the contributor
link_to_profile (bool): Whether to link username to GitHub profile in markdown output
acknowledge_coauthors (bool): Whether to acknowledge co-authors from commit messages
"""

if not test:
@@ -145,6 +147,7 @@ def get_env_vars(

sponsor_info = get_bool_env_var("SPONSOR_INFO", False)
link_to_profile = get_bool_env_var("LINK_TO_PROFILE", False)
acknowledge_coauthors = get_bool_env_var("ACKNOWLEDGE_COAUTHORS", True)

# Separate repositories_str into a list based on the comma separator
repositories_list = []
@@ -166,4 +169,5 @@ def get_env_vars(
end_date,
sponsor_info,
link_to_profile,
acknowledge_coauthors,
)
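
The body of `get_bool_env_var` is not shown in this diff. As a rough sketch of how a `"True"`/`"False"` setting such as `ACKNOWLEDGE_COAUTHORS` might be read with a default, consider the following. `read_bool_env` is a hypothetical stand-in and may differ from the repository's actual helper.

```python
import os


def read_bool_env(name: str, default: bool) -> bool:
    """Read a boolean from the environment, falling back to a default.

    Assumption: only the string "true" (any casing) counts as True.
    """
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() == "true"


os.environ["ACKNOWLEDGE_COAUTHORS"] = "False"
print(read_bool_env("ACKNOWLEDGE_COAUTHORS", True))  # → False

os.environ.pop("ACKNOWLEDGE_COAUTHORS")
print(read_bool_env("ACKNOWLEDGE_COAUTHORS", True))  # → True
```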