🔍 ChatGPT API Scanner

Security Research Tool for OpenAI API Key Detection

⚠️ DISCLAIMER - SECURITY RESEARCH ONLY

THIS PROJECT IS FOR SECURITY RESEARCH ONLY. Its purpose is to remind others to protect their property. DO NOT USE IT ILLEGALLY! The project authors are not responsible for any consequences resulting from misuse.

📖 About This Tool

ChatGPT-API-Scanner is a security research tool that scans GitHub repositories for accidentally exposed OpenAI API keys. This tool helps security researchers identify and report leaked credentials to prevent unauthorized usage.

Note: As of August 21, 2024, GitHub has enabled push protection to prevent API key leakage, which significantly reduces the effectiveness of this tool. GitHub now automatically scans for and blocks commits containing sensitive credentials.

✨ Features

🔎 GitHub Scanning

Searches GitHub repositories using regex patterns to find exposed API keys

✅ Key Validation

Automatically validates discovered keys against the OpenAI API

💾 SQLite Database

Stores results in a local database for analysis

🌐 Multi-Language

Searches across multiple programming languages

🚀 Installation & Usage

Prerequisites

  • Python 3.x
  • Google Chrome browser
  • GitHub account

Installation Steps

# Clone the repository
git clone https://github.com/Junyi-99/ChatGPT-API-Scanner
cd ChatGPT-API-Scanner

# Install dependencies
pip install selenium tqdm openai rich

Running the Scanner

# Basic usage
python3 src/main.py

# Start from specific iteration
python3 src/main.py --from-iter 100

# Check existing keys only
python3 src/main.py --check-existed-keys-only

# Custom keywords and languages
python3 src/main.py -k "openai" "chatgpt" -l python javascript

Important: You will be prompted to log in to your GitHub account in the browser when running the tool.

⚙️ Command Line Arguments

Parameter                          Description                               Default
--from-iter                        Start scanning from a specific iteration  None
--debug                            Enable debug mode for detailed logging    False
-ceko, --check-existed-keys-only   Only check existing keys in the database  False
-k, --keywords                     Specify search keywords                   Default list
-l, --languages                    Specify programming languages             Default list
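
As a rough illustration of how the arguments in this table could be declared, here is a minimal `argparse` sketch. It is not the tool's actual parser: the function name `build_parser`, the help strings, and the use of `None` to stand in for the built-in default lists are assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented arguments."""
    parser = argparse.ArgumentParser(description="Sketch of the scanner CLI")
    parser.add_argument("--from-iter", type=int, default=None,
                        help="Start scanning from a specific iteration")
    parser.add_argument("--debug", action="store_true",
                        help="Enable debug mode for detailed logging")
    parser.add_argument("-ceko", "--check-existed-keys-only", action="store_true",
                        help="Only check existing keys in the database")
    # None is a placeholder here; the real tool falls back to its own default lists.
    parser.add_argument("-k", "--keywords", nargs="+", default=None,
                        help="Search keywords")
    parser.add_argument("-l", "--languages", nargs="+", default=None,
                        help="Programming languages to search")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Because `-k` and `-l` use `nargs="+"`, each flag accepts one or more space-separated values, which matches the `-k "openai" "chatgpt" -l python javascript` form shown in the usage section.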

💡 How It Works

  1. GitHub Search: Uses Selenium to run regex-based searches in GitHub's web interface (the GitHub Search API doesn't support regex)
  2. Key Extraction: Extracts potential OpenAI API keys from search results
  3. Validation: Tests each key against the OpenAI API to verify that it is valid
  4. Storage: Saves results to a SQLite database (github.db)
  5. Rate Limiting: Respects GitHub and OpenAI rate limits
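
The extraction and storage steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's own code: the pattern `sk-` followed by 48 alphanumeric characters covers only one historical key shape, and the helper names, the `api_key` column, and the `unchecked` status value are hypothetical. The table name `keys` and the `status` column follow the queries in the Results section.

```python
import re
import sqlite3

# Assumed pattern: one historical key shape only; real key formats vary.
KEY_PATTERN = re.compile(r"sk-[A-Za-z0-9]{48}")


def extract_keys(text: str) -> list:
    """Return unique candidate keys in order of first appearance."""
    return list(dict.fromkeys(KEY_PATTERN.findall(text)))


def store_keys(conn: sqlite3.Connection, keys: list, status: str = "unchecked") -> None:
    """Insert candidate keys, silently skipping ones already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS keys (api_key TEXT PRIMARY KEY, status TEXT)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO keys (api_key, status) VALUES (?, ?)",
        [(key, status) for key in keys],
    )
    conn.commit()


if __name__ == "__main__":
    fake = "sk-" + "a" * 48  # placeholder string, not a real credential
    conn = sqlite3.connect(":memory:")
    store_keys(conn, extract_keys(f"OPENAI_KEY={fake}"))
    print(conn.execute("SELECT COUNT(*) FROM keys").fetchone()[0])
```

Using the key as the primary key together with `INSERT OR IGNORE` keeps repeated scans from storing the same candidate twice.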

🔒 Protecting Your API Keys

If you're a developer, here's how to keep your API keys safe:

  • Never commit API keys to version control
  • Use environment variables or secret management tools
  • Enable GitHub's push protection feature
  • Use .gitignore to exclude config files
  • Rotate keys regularly
  • Monitor API usage for suspicious activity
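
As a small example of the environment-variable advice above, the sketch below reads the key at runtime instead of hard-coding it in source. The helper name `load_api_key` is hypothetical; `OPENAI_API_KEY` is the variable name conventionally used for OpenAI credentials.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Read the API key from the environment and fail loudly if it is missing."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it in your shell instead of hard-coding the key"
        )
    return key
```

Since the key never appears in a tracked file, there is nothing for a repository search to find even if the code itself is public.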

Resources:

  • OpenAI Best Practices
  • What to Do If Key Leaked

📊 Results

Scan results are stored in the github.db SQLite database. You can view them with any SQLite browser or query them directly:

# Open the database with sqlite3
sqlite3 github.db

-- At the sqlite> prompt: query all keys
SELECT * FROM keys;

-- Query valid keys only
SELECT * FROM keys WHERE status='valid';
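
The same kind of query can also be run from Python with the standard `sqlite3` module. The sketch below assumes the schema implied by the queries above (a `keys` table with a `status` column); the function name `count_by_status` is hypothetical, and the query needs adjusting if the real schema differs.

```python
import sqlite3


def count_by_status(conn: sqlite3.Connection) -> dict:
    """Return a mapping of status value to number of stored keys."""
    rows = conn.execute(
        "SELECT status, COUNT(*) FROM keys GROUP BY status"
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = sqlite3.connect("github.db")
    try:
        print(count_by_status(conn))
    finally:
        conn.close()
```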

📚 Documentation

  • GitHub Repository
  • Report Issues

❓ FAQ

Why use Selenium instead of GitHub API?

The official GitHub Search API doesn't support regex search; only the web-based search does.

Why limit programming languages?

GitHub web search only shows the first 5 pages of results. Limiting each search to a single language breaks the results into smaller sets, so more keys can be found.
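
To illustrate the idea, the sketch below expands a keyword list and a language list into one search query per pair, so each query gets its own 5 pages of results. The query shape (a keyword, a `/regex/` term, and a `language:` qualifier) and the default pattern are assumptions made for this example, not necessarily the strings the tool sends.

```python
from itertools import product


def build_queries(keywords, languages, regex=r"sk-[a-zA-Z0-9]{48}"):
    """Build one search query per (keyword, language) pair."""
    return [
        f"{keyword} /{regex}/ language:{language}"
        for keyword, language in product(keywords, languages)
    ]


if __name__ == "__main__":
    for query in build_queries(["openai", "chatgpt"], ["python", "javascript"]):
        print(query)
```

With 2 keywords and 2 languages this yields 4 separate queries instead of one broad search.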

Why not use multithreading?

Both GitHub and OpenAI enforce rate limits, so multithreading doesn't significantly improve efficiency.
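
When throughput is capped by the remote service, a single sequential loop with a minimum delay between requests is usually enough. The `Throttle` class below is a hypothetical sketch of that approach, not the tool's own rate-limiting code; the clock and sleep functions are injectable only to make it easy to test.

```python
import time


class Throttle:
    """Enforce a minimum interval (in seconds) between consecutive calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `wait()` before each request keeps a single worker under the limit, which is why adding threads would mostly add waiting rather than speed.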

What is push protection?

Push protection is a GitHub feature that scans commits for sensitive information and blocks pushes containing API keys or other secrets.