Saying Goodbye to “yolo”, “oops”, and “fix” with AI-powered Commit Messages

Michaël Scherding
6 min read · Jul 23, 2023

Ever used “updated” or “yolo” as a commit message? Discover how we can harness OpenAI to craft insightful and engaging commit messages automatically. Dive in!

We’ve all been there. 7 PM, eyes bleary, fingers tired, and just one more commit before closing the computer. But what to name it? “yolo”? “update”? “oops, forgot this”?

Commit messages, those little notes we write to summarize code changes, often end up as afterthoughts in our rush to move on. In today’s fun ride, we’re going to explore how we can hand off this task to our trusty AI assistant, leaving behind the days of random commit messages and embracing the era of meaningful, automated descriptions!

Disclaimer

While leveraging the power of the ChatGPT API and external services like OpenAI can be incredibly beneficial, it’s crucial to exercise caution when dealing with sensitive code. The ChatGPT model processes data externally, which means anything the script sends about your changes is shared with the OpenAI infrastructure. Make sure that data contains nothing sensitive that you are not comfortable sharing with an external company.

The ‘Oops’ Syndrome

Every code repository has its fair share of quirky commit messages. Some classics include:

  • “yolo”
  • “fix”
  • “I can’t even…”
  • “hope this works”
  • “just testing, will delete later” (it never gets deleted)

While they’re fun and very relatable, they don’t exactly paint a clear picture of the changes made. Enter OpenAI and GitHub Actions!

Diving Deeper: Understanding the subprocess Library

Before we dive into the intricacies of our code, let’s take a moment to understand one of the key players: the subprocess library.

In the vast world of Python, subprocess stands out as a Swiss Army knife for spawning new processes, connecting to their input/output/error pipes, and obtaining their return codes. Simply put, it's our bridge between Python scripts and the system shell.

Why do we need subprocess?

Our goal here is to automate the generation of meaningful commit messages. To accomplish this, our script needs to interact directly with Git, fetching information about recent commits and updating commit messages. Git commands are typically typed into a shell, so our Python script needs a way to launch them itself and read back what they print. Enter subprocess.

By harnessing the power of subprocess, our script can:

  1. Run shell commands as if we’re typing them out in a terminal.
  2. Capture the output of these commands, allowing us to process and utilize this data within our script.

For instance, in our code:

commit_count = int(subprocess.getoutput("git rev-list --count HEAD"))

Here, we’re using subprocess to run the Git command that fetches the number of commits made. We then capture this output and convert it to an integer, allowing us to make decisions based on the number of commits.
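
One caveat: subprocess.getoutput returns whatever the command prints, error text included, and does not raise when the command fails, so int() would stumble on a message like "fatal: not a git repository". Here is a minimal sketch of a stricter variant, assuming we are fine with passing the command as a list of arguments. The names run_command and count_commits are hypothetical helpers, not part of the original script.

```python
import subprocess


def run_command(args):
    """Run a command (a list of arguments) and return its stdout, stripped of surrounding whitespace."""
    # check=True raises CalledProcessError when the command exits with a non-zero code
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def count_commits():
    """Return the number of commits reachable from HEAD (hypothetical helper)."""
    return int(run_command(["git", "rev-list", "--count", "HEAD"]))
```

With this version a failing Git command stops the script with a clear exception instead of feeding an error message into int().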

Making Magic Happen

Our plan is simple. Every time we make changes to the code, we’ll let an AI scan those changes and craft a meaningful message. No more midnight brain fogs!

import subprocess


def get_code_diff():
    """
    List the files that were changed in the latest commit.
    """
    # Count the commits reachable from HEAD
    commit_count = int(subprocess.getoutput("git rev-list --count HEAD"))

    # A first commit has no parent to compare against
    if commit_count == 1:
        return "init commit"

    return subprocess.getoutput("git diff HEAD^ --name-only")

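This function checks our latest code changes. If the repository holds only one commit, there is no parent commit to compare against, so it returns the placeholder “init commit”. Otherwise, it lists the names of the files that changed since the previous commit.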
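
Note that with --name-only the model sees file names only, not the changes themselves. If you are comfortable sharing more with the external service, a possible variant sends the actual diff, cut off at a size limit. This is only a sketch: MAX_DIFF_CHARS, truncate_diff, and get_full_diff are hypothetical names, and the 6000-character budget is an arbitrary assumption rather than an OpenAI limit.

```python
import subprocess

# Arbitrary, assumed budget to keep the prompt small
MAX_DIFF_CHARS = 6000


def truncate_diff(diff_text, max_chars=MAX_DIFF_CHARS):
    """Return diff_text unchanged if short enough, otherwise cut it and add a marker."""
    if len(diff_text) <= max_chars:
        return diff_text
    return diff_text[:max_chars] + "\n[diff truncated]"


def get_full_diff():
    """Return the (possibly truncated) diff between the previous commit and HEAD."""
    return truncate_diff(subprocess.getoutput("git diff HEAD^ HEAD"))
```

The trade-off is the one from the disclaimer: a full diff gives the model much more to work with, but it also sends your code outside your infrastructure.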

Engaging in Chit-Chat with Our AI Buddy

import openai  # this code targets the pre-1.0 openai package (openai.ChatCompletion)


def get_commit_message(diff):
    """
    Get the commit message using OpenAI for the provided code difference.

    Args:
        diff (str): Description of code difference.

    Returns:
        tuple[str, str]: Generated title (with an emoticon prefix) and body.
    """
    # First commit: nothing to describe, so return the placeholder title with an empty body.
    # Returning a pair keeps `title, body = get_commit_message(diff)` working in main().
    if diff == "init commit":
        return diff, ""

    # Constructing a prompt to guide the model
    context = ("You are an AI code reviewer. Generate a commit message with a title and a body. "
               "The title should start with an appropriate emoticon: \n\n"
               "✨ for new features\n"
               "🐛 for bug fixes\n"
               "📚 for documentation updates\n"
               "🚀 for performance improvements\n"
               "🧹 for cleaning up code\n"
               "⚙️ for configuration changes\n\n"
               "Start with 'Title:' for the title and 'Body:' for the detailed description. Here are the code changes:")

    # Using the ChatCompletion interface to interact with gpt-3.5-turbo
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": context
            },
            {
                "role": "user",
                "content": diff
            }
        ]
    )

    # Extracting the assistant's message from the response
    message_from_assistant = response.choices[0].message['content']

    # Extract the title and body from the model's response
    # (this assumes the model followed the 'Title:' / 'Body:' format)
    title = message_from_assistant.split('Title:')[1].split('Body:')[0].strip()
    body = message_from_assistant.split('Body:')[1].strip()

    return title, body
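
The function above relies on openai.ChatCompletion, which exists only in versions of the openai package before 1.0. If you install a newer release, the same request goes through a client object instead. Below is a rough sketch of what that could look like; build_messages and request_commit_message are hypothetical names, and the openai import is done inside the function only so that the message-building part can be tried without the package installed.

```python
def build_messages(context, diff):
    """Assemble the chat messages: the instructions as 'system', the changes as 'user'."""
    return [
        {"role": "system", "content": context},
        {"role": "user", "content": diff},
    ]


def request_commit_message(context, diff, model="gpt-3.5-turbo"):
    """Send the prompt with the openai>=1.0 client and return the raw reply text."""
    from openai import OpenAI  # requires openai 1.0 or later

    client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable
    response = client.chat.completions.create(
        model=model,
        messages=build_messages(context, diff),
    )
    return response.choices[0].message.content
```

The reply text can then be split into title and body exactly as before.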

Why Emoticons?

Emoticons serve as visual cues, helping developers instantly recognize the nature of the commit.

Here’s the breakdown:

  1. ✨ New Features — It’s like a little sparkle moment. You’ve added something new!
  2. 🐛 Bug Fixes — A bug! Well, now it’s squashed. This commit did some debugging.
  3. 📚 Documentation Updates — Books represent knowledge, right? This commit added or changed documentation.
  4. 🚀 Performance Improvements — Rocket-fast! This commit made things run more efficiently.
  5. 🧹 Cleaning Up Code — Just like sweeping the floor, this commit tidied up the codebase.
  6. ⚙️ Configuration Changes — Gears for settings or config. This commit fiddled with some setup.
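
Since the prefixes form a small fixed set, the script could also check which one the model picked. The sketch below is an assumption about how one might do that, with COMMIT_EMOJI and classify_title as hypothetical names. The gear is matched on its base character only, because it is often followed by an invisible variation selector.

```python
# Prefix -> meaning, mirroring the list above (hypothetical helper table)
COMMIT_EMOJI = {
    "✨": "new feature",
    "🐛": "bug fix",
    "📚": "documentation update",
    "🚀": "performance improvement",
    "🧹": "code cleanup",
    "\u2699": "configuration change",  # gear, with or without the U+FE0F selector
}


def classify_title(title):
    """Return the meaning of the title's emoji prefix, or None if it has no known prefix."""
    stripped = title.lstrip()
    for emoji, meaning in COMMIT_EMOJI.items():
        if stripped.startswith(emoji):
            return meaning
    return None
```

A None result is a cheap signal that the model ignored the instructions and the title may deserve a second look.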

Guided Structure with ‘Title:’ and ‘Body:’

By asking the model to prepend its suggestions with ‘Title:’ and ‘Body:’, we’re setting clear boundaries. It’s like handing over a form to be filled out, ensuring the AI’s output is structured and easily parsed.
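
Models do not always fill out the form perfectly, though, and a reply without 'Title:' or 'Body:' would make a plain split(...)[1] raise an IndexError. Here is a more forgiving parser, as a sketch only: parse_commit_message and its fallback_title default are hypothetical, not part of the original script.

```python
def parse_commit_message(text, fallback_title="🧹 Update code"):
    """Split a model reply into (title, body), tolerating missing markers."""
    head, found_body, tail = text.partition("Body:")
    head = head.replace("Title:", "", 1).strip()

    if found_body:
        title, body = head, tail.strip()
    else:
        # No 'Body:' marker: treat the first line as the title, the rest as the body
        lines = head.splitlines()
        title = lines[0].strip() if lines else ""
        body = "\n".join(lines[1:]).strip()

    # Keep the title on a single line; fall back if nothing usable came back
    title = title.splitlines()[0].strip() if title else fallback_title
    return title, body
```

For a well-formed reply this gives the same title and body as the split-based parsing; the difference only shows when the model strays from the format.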

Amending Our Commit Message

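Finally, we apply our freshly minted commit message by rewriting the latest commit with git commit --amend. The first -m becomes the title and the second -m becomes the body: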

def main():
    diff = get_code_diff()
    title, body = get_commit_message(diff)
    subprocess.run(["git", "commit", "--amend", "-m", title, "-m", body], check=True)


if __name__ == "__main__":
    main()
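
If the body can be empty, for instance for a first commit, one option is to build the argument list separately and leave out the second -m in that case. This is a small sketch under that assumption; build_amend_command is a hypothetical helper name.

```python
def build_amend_command(title, body=""):
    """Build the argument list for `git commit --amend` from a title and an optional body."""
    command = ["git", "commit", "--amend", "-m", title]
    if body.strip():
        # Each additional -m becomes its own paragraph in the commit message
        command += ["-m", body.strip()]
    return command
```

It would be used as subprocess.run(build_amend_command(title, body), check=True). Passing a list rather than a single shell string also means the title and body need no quoting or escaping.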

Breakdown of the .yml

The YAML configuration file is responsible for defining GitHub Actions workflows. Let’s break down our workflow, generate_commit_message.yml, which automates the commit message generation.

name: Generate Commit Message

on:
  push:
    branches:
      - main

jobs:
  generate_message:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v2
        with:
          fetch-depth: 2

      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: "3.9"

      - name: Install OpenAI
        # The script uses openai.ChatCompletion, which was removed in openai 1.0
        run: pip install "openai<1.0"

      - name: Set up Git user
        run: |
          git config --global user.name "github-actions[bot]"
          git config --global user.email "github-actions[bot]@users.noreply.github.com"

      - name: Generate Commit Message
        run: python .github/actions/generate_commit_message.py
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}

      - name: Set up Git for push
        run: |
          git remote set-url origin https://x-access-token:${{ secrets.GITHUB_TOKEN }}@github.com/mchl-schrdng/code_review.git

      - name: Push changes
        run: git push origin main --force-with-lease

The workflow, named “Generate Commit Message,” runs on every push to the main branch. It defines a single job, generate_message, that runs on the ubuntu-latest runner. Using actions/checkout@v2 with fetch-depth: 2, we check out the repository along with its two most recent commits, which is what lets the script compare the latest commit against its parent. We then set up Python 3.9.

The script relies on the openai library to call OpenAI's API, so we install it. Commits are attributed to github-actions[bot], as configured in the Git user step. The Python script does the core work: it collects the latest changes, sends them to OpenAI, and amends the commit with the generated message. The OpenAI API key is read from GitHub's secrets and passed to the script through the OPENAI_API_KEY environment variable.

To push the amended commit back, we point the Git remote URL at a token-authenticated address. Because amending rewrites history, a plain push would be rejected, so we push with --force-with-lease, which overwrites the remote branch only if it has not moved since we fetched it.

Results

And tada, you now have really great commit titles, each followed by a descriptive message body.

Conclusion

And there you have it! With a sprinkle of OpenAI magic and the power of GitHub Actions, commit messages transform from mundane to magnificent. No more “updated file.” Instead, revel in detailed, emoticon-rich messages like “🐛 Fixed the pesky login bug” followed by a detailed description.

Happy committing! 🎉
