semhound

Org-scale Semgrep sweeps with optional AI triage

Motivation

Security teams often need the same answer across many repositories: “Where else does this pattern show up?” That might be a bug-bounty SQLi variant, a zero-day in a dependency, or a custom policy you encode as Semgrep rules. Running Semgrep repo-by-repo means scripting discovery, cloning, execution, and reporting yourself.

semhound automates that loop at GitHub org (or user) scale: you supply the rules, it handles discovery (gh repo list), parallel shallow clones over SSH, scanning, and a single report per target with GitHub permalinks. If you want help separating noise from signal, optional AI triage adds a confidence score and a true-positive verdict per finding.
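To make the permalink idea concrete, here is a minimal Python sketch of how a finding could be turned into a commit-pinned GitHub link. The function name `github_permalink` and its parameters are hypothetical, and semhound's actual report format may differ; only the general GitHub URL shape (`/blob/<commit>/<path>#L<start>-L<end>`) is relied on here.

```python
from typing import Optional
from urllib.parse import quote


def github_permalink(
    owner: str,
    repo: str,
    commit_sha: str,
    path: str,
    start_line: int,
    end_line: Optional[int] = None,
) -> str:
    """Build a commit-pinned GitHub link to a line or line range.

    Hypothetical helper for illustration; not taken from semhound itself.
    """
    # quote() keeps "/" intact by default, so nested paths stay readable
    # while spaces and other unsafe characters are percent-encoded.
    url = f"https://github.com/{owner}/{repo}/blob/{commit_sha}/{quote(path)}#L{start_line}"
    if end_line is not None and end_line != start_line:
        url += f"-L{end_line}"
    return url


if __name__ == "__main__":
    print(github_permalink("acme", "api", "abc123", "src/db.py", 10, 12))
```

Pinning the link to a commit SHA rather than a branch name means the highlighted lines still match the finding after the branch moves on.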

The mental model is simple: tools like TruffleHog or Gitleaks are built for secrets; semhound is for any Semgrep pattern you define, swept across every repo you can access—like a hound for Semgrep findings.

What it does

  1. Discover — Lists repositories for each org or username you pass (inline or via --orgs-file).
  2. Clone — Shallow clone (--depth 1) with a blob size cap aligned to Semgrep’s default so large binaries are skipped.
  3. Scan — Runs your rules from a local --rules-dir, remote --rules-url, or both.
  4. Report — Writes <target>_scan.csv and optional SARIF (--sarif).
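The first three steps can be sketched as the commands a hand-rolled script would have to assemble. This is a rough Python illustration, not semhound's implementation: the helper names (`discover_cmd`, `clone_cmd`, `scan_cmd`) are hypothetical, the commands are only built and not executed, and the blob size cap is left out because the text above does not say how it is applied.

```python
from typing import List


def discover_cmd(owner: str, limit: int = 1000) -> List[str]:
    """Step 1: list repositories for an org or user with the GitHub CLI."""
    return ["gh", "repo", "list", owner, "--limit", str(limit), "--json", "name,sshUrl"]


def clone_cmd(ssh_url: str, dest: str) -> List[str]:
    """Step 2: shallow clone over SSH (most recent commit only)."""
    return ["git", "clone", "--depth", "1", ssh_url, dest]


def scan_cmd(repo_dir: str, configs: List[str]) -> List[str]:
    """Step 3: run Semgrep with one --config per local directory or remote URL."""
    cmd = ["semgrep", "scan", "--json", "--quiet"]
    for config in configs:
        cmd += ["--config", config]
    cmd.append(repo_dir)
    return cmd


if __name__ == "__main__":
    # Example values are placeholders, not real targets.
    print(discover_cmd("acme"))
    print(clone_cmd("git@github.com:acme/api.git", "/tmp/acme/api"))
    print(scan_cmd("/tmp/acme/api", ["rules/", "https://example.com/rules.yaml"]))
```

Each list can be passed to `subprocess.run` as-is. Repeating that for every repository, in parallel and with error handling and reporting, is the part semhound takes off your hands.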

semhound is aimed at targeted, on-demand investigations (tight rule sets, specific events), not continuous full-org scanning with huge rule packs.

Install and docs

To scan private repositories, authenticate the GitHub CLI with gh auth login and register an SSH key with GitHub, since cloning happens over SSH.

Licence

Open source under the MIT licence; see the repository for details.

Rohit Salecha
Security Engineering

Rohit Salecha is a technology geek who loves to explore anything that runs and understands binary. As a security engineer, he is passionate about learning the length, breadth, and depth of technology. Working mostly on the defensive side, he has evangelised secure software development at various organizations for more than a decade. He is ridiculously driven by the “everything as code” mantra and strongly believes that security teams must strive towards making themselves irrelevant.
