Skip to main content
AI & Automation
13 min read George Spanos

From Pull Request Diff to AI-Assisted Code Review

A closer look at Code Review Agent, an open foundation agent project for reviewing code and CI/CD changes across GitHub, GitLab, Jenkins, and local development workflows.

#agentic-ai #langgraph #code-review #ci-cd #developer-productivity #software-quality

Code review is one of the most important parts of software delivery, but it is also one of the easiest places for teams to lose time.

A pull request may look small on the surface, but the reviewer still has to think through several questions at once:

  • Does this change introduce a bug?
  • Are there security issues hiding in the diff?
  • Will this hurt performance or reliability?
  • Does the change follow the team’s standards?
  • Did the CI/CD configuration change in a risky way?
  • Is this worth blocking, or is it just a suggestion?

For small teams, this is even harder. Senior engineers are usually reviewing code while also building features, fixing incidents, supporting customers, and keeping delivery moving. The result is predictable: reviews become rushed, inconsistent, or overly dependent on the one person who understands the system best.

Code Review Agent is our foundation agent project for making that review loop more useful. It is an LLM-powered, multi-language code and CI/CD review agent built with LangGraph. It can review a local git diff, a pull request, a merge request, or a CI job diff and produce practical findings around bugs, security issues, performance concerns, and maintainability.

The goal is not to replace human reviewers.

The goal is to give teams a flexible, lower-cost review assistant that can run where they need it, follow their review rules, and surface issues before a human spends time hunting through the diff.


The Problem: Code Review Does Not Scale Cleanly

Most teams do code review manually, and that works until the review load grows.

A team starts with a few pull requests per week. Then it adds more repositories, more developers, more deployment pipelines, and more application surface area. Before long, review quality depends on who happens to be available that day.

That creates a workflow problem:

  • Reviewers focus on style while missing real logic risks.
  • Important CI/CD changes get skimmed because they are not application code.
  • Junior developers wait too long for feedback.
  • Senior engineers become bottlenecks.
  • Small teams cannot justify expensive enterprise tooling for every repository.
  • AI review tools may exist, but they are often tied to a specific platform or subscription model.

The missing piece is not another chat window. The missing piece is a repeatable review workflow that can run inside the team’s existing development process.

Code Review Agent treats code review as a controlled automation workflow. It reads the diff, detects the type of files that changed, applies the right review skill, aggregates findings, and reports the result through the configured output channel.

That makes it useful as a foundation project. It is not limited to one narrow demo. It can become the base for different review workflows across software, DevOps, CI/CD, and internal platform teams.


How the Workflow Runs

The agent can run locally or inside CI.

For local use, a developer can review uncommitted changes or a branch diff before opening a pull request:

make review

./.venv/bin/code-review --repo . --reporter terminal

git diff origin/main | ./.venv/bin/code-review --reporter terminal

For CI usage, the same worker container can run against a pull request or merge request diff. The reporter determines where the result goes.

diff source
  -> ingest changed files
  -> detect language or CI target
  -> fan out to the right review skills
  -> aggregate findings
  -> render report
  -> publish through terminal, file, GitHub, or GitLab reporter

That shape is important. The agent is not a free-roaming bot with unclear behavior. It is a predictable graph with a specific job: review the diff and report evidence-backed findings.

The project currently supports GitHub, GitLab, Jenkins, file reports, and terminal output. That gives teams more deployment options than a tool that only works in one source control platform.


Review Skills Instead of Hard-Coded Expertise

The most useful design choice in this project is the skill system.

Review expertise is not buried deep in application code. It lives in portable SKILL.md files. Each skill describes how the agent should review a specific language or target.

The bundled skills currently cover:

  • Python
  • JavaScript and TypeScript
  • Java
  • Dockerfiles
  • GitHub Actions workflows
  • GitLab CI configuration
  • Jenkinsfiles

That means the agent can review both application code and delivery infrastructure. This matters because bugs are not the only risk in a pull request. A weak Dockerfile, a dangerous GitHub Actions permission change, or a broken Jenkins pipeline can create just as much operational pain as a code defect.

The skill approach also makes the project extensible. A new language skill can be added by creating a new skill folder with a SKILL.md file that defines the review guidance and file extensions. For many language additions, the graph itself does not need to change.

That is the difference between a hard-coded AI demo and a reusable foundation agent.


The Agent Reviews. It Does Not Take Over.

The project has a clear operating boundary:

The agent reads diffs and reports findings. It does not write to the reviewed repository. It does not automatically fix code. It does not approve pull requests. It does not merge anything.

That boundary matters.

For production software teams, the safest use of AI is usually not full autonomy. It is assisted review. The model can help identify suspicious code, explain why something may be risky, and suggest what a reviewer should look at. The human still owns the decision.

This is especially important for consulting and small-business environments. Many companies want AI-assisted development, but they do not want a system that silently changes production code or bypasses normal review gates.

Code Review Agent keeps the role simple:

  1. Read the proposed change.
  2. Apply the right review guidance.
  3. Return concrete findings.
  4. Let the team decide what to do next.

That is practical AI automation.


Flexible Reporting for Different Teams

Not every team reviews code the same way.

Some teams want comments directly on GitHub pull requests. Some use GitLab merge requests. Some still rely on Jenkins and archived build artifacts. Some developers want a local review before pushing their branch.

Code Review Agent supports multiple reporters:

ReporterUse Case
terminalLocal runs and CI logs
fileDurable Markdown and JSON reports
githubIdempotent GitHub pull request comment
gitlabIdempotent GitLab merge request note

The GitHub and GitLab reporters update a marked comment instead of creating duplicate comments on every run. That keeps the review signal visible without spamming the pull request conversation.

This also makes the project easier to adapt for consulting work. A client does not need to change their entire platform to try AI-assisted review. The same foundation can be wired into the tools they already use.


Configurable Gates Without Forcing a Big Rollout

AI review should not start by blocking every merge.

Teams need time to learn what the agent catches, where it is useful, and where the review instructions need tuning. That is why the project supports configurable failure thresholds.

code-review --fail-on high

The threshold can be adjusted so the agent only fails the run when findings meet or exceed a selected severity. It can also run in advisory mode when the team wants review comments without blocking delivery.

A practical rollout would look like this:

  1. Start with terminal or file reports.
  2. Move to pull request or merge request comments.
  3. Run advisory-only for a few repositories.
  4. Tune ignore rules and review skills.
  5. Begin failing CI only on high-confidence, high-impact findings.
  6. Expand to more languages, CI targets, and repositories.

That is how this kind of automation should be introduced. Visibility first. Enforcement later.


Why This Can Be a Cheaper Alternative

There are strong commercial tools in this space. GitHub Copilot can provide code review feedback in pull requests. GitLab Duo can review merge requests and provide feedback on potential errors and standards alignment.

Those tools can be useful, especially for teams already paying for the broader platform.

But not every team wants to buy a larger AI developer platform just to experiment with AI-assisted review. Some teams also need more control over where the review runs, which model is used, what review instructions are applied, and how results are reported.

That is where Code Review Agent fits.

It can be a cheaper alternative because the core workflow is open and model-configurable. Instead of paying for a platform-wide feature set, a team can run a focused review workflow with the provider they choose. The default model is OpenAI gpt-5-mini, and Anthropic and Google are selectable through configuration.

It can also be a more flexible alternative because the review skills are portable and editable. A team can define exactly what good review means for their codebase, their CI/CD standards, and their risk tolerance.

This does not mean it replaces Copilot, GitLab Duo, or other commercial tools for every team. It means there is room for a focused, customizable agent when the team wants control, portability, and a lower-cost starting point.


Use Cases Beyond Basic Code Review

Because this is a foundation agent project, the use cases are broader than “AI comments on a pull request.”

It can support several practical workflows:

  • Pre-PR self-review: Developers run the agent locally before opening a pull request.
  • Pull request review assistant: The agent posts an initial review comment so human reviewers start with a better map of the risk.
  • CI/CD configuration review: Dockerfiles, GitHub Actions, GitLab CI, and Jenkinsfiles get reviewed as first-class delivery assets.
  • Junior developer feedback loop: Less experienced developers get faster feedback before a senior engineer reviews the change.
  • Platform engineering guardrails: Platform teams can encode common review expectations into reusable skills.
  • Client codebase assessments: Consultants can run the agent against targeted diffs or modernization work and produce repeatable review reports.
  • Multi-repo review standardization: Organizations can apply the same review pattern across several repositories without copying manual checklists everywhere.
  • Experimentation with different LLM providers: Teams can test OpenAI, Anthropic, or Google models without redesigning the whole workflow.

This is where the project becomes more than a code review bot. It becomes a reusable pattern for AI-assisted engineering workflows.


Why LangGraph Fits This Shape

This project does not need an agent that randomly decides what to do next. It needs a controlled workflow with a clear beginning, middle, and end.

LangGraph fits because code review naturally breaks into steps:

  • ingest the diff
  • classify changed files
  • select the right skill
  • review each unit
  • aggregate results
  • render a report
  • publish through the selected reporter

That is a graph, not a loose conversation.

The graph structure also makes the system easier to test and reason about. Each node has a clear responsibility. The workflow can fan out across different review units, then merge results into a single report.

For business automation, that structure matters. The impressive part is not that an LLM is involved. The useful part is that the workflow is inspectable, repeatable, and safe enough to run inside software delivery.


Built With a Practical Trust Model

A code review agent has to assume that pull request content is untrusted.

That includes the diff itself, repository configuration, and any repo-local review skills. A malicious or careless pull request should not be able to rewrite the rules used to review itself.

Code Review Agent addresses this with several safeguards:

  • CI reads review.toml from the trusted base ref, not the pull request head.
  • Repo-local extra skills are ignored unless the CI operator explicitly enables them.
  • .env files are not loaded in CI.
  • Diffs are treated as untrusted data, not instructions.
  • Bundled skills are preferred over repo-local skills.
  • The worker can run with a read-only checkout and write reports elsewhere.

That is exactly the kind of thinking AI workflow projects need. Prompt injection is not just a chatbot problem. Any system that feeds user-controlled content into a model needs a trust boundary.

This project makes that boundary visible.


Where This Helps Most

Code Review Agent is a good fit for teams that want AI-assisted review without committing immediately to a full enterprise AI development platform.

It is especially useful when:

  • the team has small or overloaded engineering capacity
  • pull requests wait too long for initial feedback
  • CI/CD files are changed often but reviewed inconsistently
  • the team wants review standards that are more specific than generic AI feedback
  • multiple repositories need the same review pattern
  • the company wants to test AI review with controlled cost
  • consultants need a reusable demo or delivery accelerator for engineering clients

It is not a replacement for senior engineering judgment. It will not understand every product decision. It will not know every exception in a legacy system. It should not be treated as an approval authority.

But it can make the first review pass faster, more consistent, and easier to act on.

That is enough to matter.


What This Project Shows

Code Review Agent is valuable because it is narrow enough to be useful and flexible enough to build on.

It does not try to “do software engineering.” It focuses on one painful workflow: reviewing changes before they merge.

That same pattern applies to many other technical workflows:

  • release readiness checks
  • dependency risk summaries
  • CI failure analysis
  • infrastructure change reviews
  • security exception reviews
  • migration readiness assessments
  • test coverage review
  • documentation drift checks

The larger lesson is simple: good AI automation starts with a real workflow, not a vague promise.

The model is only one part of the system. The real value comes from the workflow around it: the trigger, the context, the guardrails, the reporting, and the human decision point.

Code Review Agent is a practical example of that pattern.


Check Out the Project

Code Review Agent is open here: https://github.com/infiniumtek/code-review-agent

If your team is overloaded with pull requests, inconsistent code reviews, or CI/CD configuration changes that are easy to miss, this project shows how a focused AI workflow can add value without taking control away from your developers.

Interested in building a focused AI workflow for your engineering or operations team? Schedule a Digital Health Check and we can help identify the review loops where automation would create real leverage without forcing your team into a tool they do not need.


This post was last reviewed and updated in May 2026. AI developer tools, LLM platforms, and CI/CD systems continue to evolve, but the operating principle is stable: use AI to assist review, keep the workflow controlled, and leave production decisions with the humans responsible for the system.

Turn Hours of Work into Minutes

Automate routine tasks so your team can focus on higher-value decisions.

About the Author

George Spanos
George Spanos

Co-founder at InfiniumTek

George believes every small business deserves high-level tech leadership at a price that makes sense. After leading large-scale technology projects for national brands, he co-founded InfiniumTek to help small business owners navigate software, security, and AI.

View full profile