# ModelGate
**Repository Path**: sjsjssj/ModelGate
## Basic Information
- **Project Name**: ModelGate
- **Description**: ModelGate
基于 FastAPI 的 LLM API 代理服务器,支持多提供商管理、API Key 控制、用量监控和 Web 管理面板。
功能特性
多提供商支持:智谱、DeepSeek、Ollama、Minimax 及任意 OpenAI 兼容 API
并发限流:基于信号量的提供商级并发控制
API Key 管理:支持按 Key 限制可访问的模型
流式响应:实时流式输出,支持 token
- **Primary Language**: Python
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: https://leturx.cc
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 20
- **Created**: 2026-04-13
- **Last Updated**: 2026-04-13
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# ModelGate
ModelGate is a FastAPI-based LLM gateway for multi-provider routing, API key management, request logging, and dashboard monitoring.
## Highlights
- Multi-provider routing: Zhipu, DeepSeek, Ollama, Minimax, and any OpenAI-compatible API
- OpenAI-compatible proxy endpoints: `/v1/chat/completions`, `/v1/embeddings`, `/v1/models`
- Per-provider concurrency limits with asyncio semaphore-based rate control
- API key management with per-key model access control
- Streaming request lifecycle tracking: `pending` -> `success` / `error` / `timeout`
- Upstream status code logging (200, 429, 500, etc.)
- AI-powered daily error analysis with persisted reports
- AI-powered model recommendations and timing advice for users
- Admin dashboards: overview, monitoring, configuration, error analysis, usage guide
- AI-powered usage report generation (DOCX export with stats, trends, and fun awards)
- API key time-based access rules (time windows, date ranges, weekday restrictions)
- User portal: personal stats, health score, recommendations, OpenCode config export
- OpenCode integration: auto-generated config with per-model context/output limits
- WeChat iLink Bot integration via MCP (QR login, auto-reply, message persistence)
- English / Chinese i18n with Babel
- Desktop and mobile admin UI
- Reverse proxy support via configurable base path
- Docker Compose with Nginx reverse proxy and static file serving
- Daily stats aggregation and 30-day log archiving
## Screenshots
### Admin Dashboard

### Admin Monitor

### User Dashboard

### User Report

### Mobile Dashboard

## Quick Start
```bash
pip install -r requirements.txt
python main.py
```
Default local addresses:
- Server: `http://localhost:8765`
- Admin: `http://localhost:8765/admin/home`
- User portal: `http://localhost:8765/user/login`
Windows helper: `start.bat` prompts for log level and restarts the service on port 8765.
## Docker
### Docker Run
```bash
docker build -t localhost:5002/modelgate:latest .
docker push localhost:5002/modelgate:latest
docker run -d --name modelgate \
-p 8765:8765 \
-e DATABASE_URL="postgresql+asyncpg://modelgate:password@host:5432/modelgate" \
-e PORT=8765 \
-e ADMIN_USERS="admin:YourPassword" \
-v /opt/modelgate/logs:/app/logs \
-v /opt/modelgate/reports:/app/reports \
--restart unless-stopped \
localhost:5002/modelgate:latest
```
### Docker Compose
The repository includes a `docker-compose.yml` with ModelGate + Nginx services. Nginx handles static file serving and reverse proxying with WebSocket support.
```bash
docker compose up -d
```
See [DEPLOY.md](DEPLOY.md) for full deployment instructions.
## Environment Variables
| Variable | Required | Description |
|----------|----------|-------------|
| `DATABASE_URL` | Yes | PostgreSQL connection string |
| `PORT` | No | Service port, default `8765` |
| `ADMIN_USERS` | Recommended | Admin accounts, format: `user:pass,user:pass` |
| `ADMIN_USERNAME` | No | Fallback admin username |
| `ADMIN_PASSWORD` | No | Fallback admin password |
| `LOG_LEVEL` | No | `DEBUG`, `INFO`, `WARNING`, `ERROR` |
| `ICP_NUMBER` | No | ICP filing number shown on landing page |
## Database
```sql
CREATE USER "modelgate" WITH PASSWORD 'your_password';
CREATE DATABASE "modelgate" OWNER "modelgate";
```
Schema: [`schema.sql`](schema.sql)
The app performs runtime compatibility migrations on startup (e.g., adding new columns to `request_logs`).
## API
### OpenAI-compatible Endpoints
- `POST /v1/chat/completions` - Chat completions (streaming and non-streaming)
- `POST /v1/embeddings` - Text embeddings
- `GET /v1/models` - List available models
### Model Naming
```text
provider/model
```
Examples: `zhipu/glm-4`, `deepseek/chat`, `minimax/MiniMax-M2.5`
## Dashboards
### Admin
- `/admin/home` - Overview, realtime stats, slow requests, trends
- `/admin/config` - Provider, model, and binding configuration
- `/admin/api-keys` - API key management and per-key model access
- `/admin/monitor` - Composition, hotspots, response-time analysis
- `/admin/errors` - Daily error log viewer with AI-powered analysis reports
- `/admin/reports` - AI-powered usage report generation and DOCX download
- `/admin/system-config` - Outbound User-Agent management and UA stats
- `/admin/usage` - Client configuration examples and setup guides
- `/admin/m` - Mobile admin dashboard
### User Portal
API key holders log in at `/user/login` to access:
- Personal request and token statistics (day/week/month)
- 20-minute system health score (error rate, latency, load, active users)
- AI-powered model recommendations with scored reasons
- AI-generated timing advice based on hourly usage patterns
- Active session tracking
- Model catalog with context/output limits and multimodal info
- OpenCode configuration export (`/opencode/setup.md?api_key=...`)
## API Key Time-Based Access Rules
API keys can be restricted by time of day, date ranges, and weekdays. Rules are validated on every request:
- **Time windows** — `start_time` / `end_time` (e.g., only allow 09:00–18:00)
- **Date ranges** — `start_date` / `end_date`
- **Weekday filters** — restrict to specific days of the week
- **Allow/deny semantics** — explicit `allowed` flag per rule
## WeChat iLink Bot (MCP)
ModelGate includes an MCP (Model Context Protocol) server for WeChat iLink Bot integration at `/weixin`:
- QR code login flow
- Message polling, sending, and auto-reply via internal LLM proxy
- Message persistence to database
- Per-user context threading for conversations
- See [docs/guides/weixin-mcp.md](docs/guides/weixin-mcp.md) for setup instructions
## Request Logging
`request_logs` stores: API key, provider, model, tokens, latency, status, upstream HTTP status code, client IP, user agent, and error details.
Streaming requests are inserted as `pending` first, then updated to `success`, `error`, `timeout`, or `cancelled`.
Logs older than 30 days are automatically archived to `request_logs_history`. A `request_logs_all` view unions both tables for transparent querying.
## Scheduled Tasks
| Task | Schedule | Description |
|------|----------|-------------|
| Timeout cleanup | Every 10 minutes | Mark stale pending requests (>10 min) as `timeout` |
| Daily aggregation | 00:05 | Aggregate request counts into daily/hourly stats tables |
| Log archival | 00:20 | Archive request logs older than 30 days |
## Project Structure
```text
modelgate/
├── main.py # App init, middleware, routers, exception handlers
├── core/
│ ├── config.py # Logging, caches, stats, session management
│ ├── database.py # SQLAlchemy async engine, all ORM models
│ ├── deps.py # Auth dependencies
│ ├── i18n.py # Internationalization
│ ├── app_paths.py # Base path for reverse proxy
│ ├── client_ip.py # Multi-header client IP extraction
│ └── log_sanitizer.py # Sensitive data redaction for logs
├── routes/
│ ├── proxy.py # /v1/chat/completions, /v1/embeddings, /v1/models
│ ├── auth.py # Admin login/logout
│ ├── providers.py # Provider CRUD
│ ├── models.py # Model CRUD
│ ├── provider_models.py # Provider-model bindings + auto-sync
│ ├── keys.py # API key CRUD + per-key stats/logs + time rules
│ ├── stats.py # Statistics and aggregation endpoints
│ ├── logs.py # Log viewer + AI error analysis
│ ├── pages.py # Admin HTML pages
│ ├── user.py # User portal API + pages
│ ├── opencode.py # OpenCode config generation
│ ├── reports.py # Usage report generation + DOCX export
│ ├── system_config.py # System config (outbound UA management)
│ └── weixin.py # WeChat MCP server endpoints
├── services/
│ ├── proxy.py # Main proxy logic, streaming, provider dispatch
│ ├── auth.py # API key validation + time-based access rules
│ ├── provider.py # Provider/model resolution
│ ├── scheduler.py # APScheduler tasks
│ ├── stats_aggregator.py # Daily stats aggregation, archiving
│ ├── logging.py # Request log CRUD
│ ├── tokens.py # Token estimation and response parsing
│ ├── message.py # Message preprocessing (merge, truncate)
│ ├── minimax.py # MiniMax-specific response/tool_call parsing
│ ├── sse.py # SSE stream normalization
│ ├── analysis_store.py # AI analysis task persistence
│ ├── usage_report.py # DOCX usage report generation
│ ├── system_config.py # Outbound UA auto-detection
│ └── weixin.py # WeChat iLink Bot client
├── templates/ # Jinja2 HTML (admin/, user/, public/, components/)
├── locales/ # i18n: en, zh
├── schema.sql
├── Dockerfile
└── DEPLOY.md
```
## Development
- Python 3.10+ | FastAPI | SQLAlchemy async | PostgreSQL
- Lint & format: `ruff check . && ruff format .`
- Type check: `mypy main.py core/*.py --ignore-missing-imports`
- i18n compile: `pybabel compile -d locales`
- Logs: `logs/proxy.log`, `logs/admin.log`, `logs/error.log`
## License
Apache 2.0