Your content. Your servers. Your control. HumanMark is the open-source AI detection engine that never sends your data to the cloud.
The Problem
Every AI detection service requires you to upload your content to their servers. That’s a dealbreaker for:
Compliance Nightmare: Healthcare (HIPAA), Legal (privilege), Finance (SOC2/PCI), and Government (classified) organizations cannot send sensitive data to third-party APIs.
HumanMark runs entirely on YOUR infrastructure. Your data never leaves your network. Period.
Quick Start
Get running in under 30 seconds:
With Docker:
docker run -p 8080:8080 humanmark/humanmark

With Go:
go install github.com/vinpatel/humanmark/cmd/api@latest
humanmark

From source:
git clone https://github.com/vinpatel/humanmark.git
cd humanmark && make run

Then detect AI content:
curl -X POST http://localhost:8080/verify \
-H "Content-Type: application/json" \
-d '{"text": "Your content here"}'

Response:
{
  "id": "hm_abc123",
  "verdict": "human",
  "confidence": 0.87,
  "signals": {
    "sentence_variance": 0.42,
    "vocabulary_richness": 0.78,
    "contraction_ratio": 0.15
  }
}

Why HumanMark?
Feature Comparison
| Feature | HumanMark | GPTZero | Originality.ai | Turnitin |
|---|---|---|---|---|
| Self-Hosted | ✅ | ❌ | ❌ | ❌ |
| Works Offline | ✅ | ❌ | ❌ | ❌ |
| Open Source | ✅ MIT | ❌ | ❌ | ❌ |
| Zero Cost | ✅ | ❌ | ❌ | ❌ |
| Multi-Modal | ✅ | ⚠️ | ⚠️ | ⚠️ |
| API Limits | None | Tiered | Per-check | Per-seat |
How It Works
HumanMark uses statistical forensics: no ML models, no GPU required, instant results.

Text:

| Signal | Human | AI |
|---|---|---|
| Sentence length variance | High | Low |
| Vocabulary richness | Diverse | “Safe” words |
| Contractions | “don’t”, “I’m” | “do not”, “I am” |
| Punctuation variety | ! ? ; : and dashes | Mostly periods |
| AI phrases | Rare | “As an AI…” |

Images:

| Signal | Real Photo | AI Image |
|---|---|---|
| EXIF metadata | Present | Missing |
| Camera make | Apple, Canon | None |
| Sensor noise | Natural | Too clean |
| Compression | Consistent | Irregular |
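
The first row of the image table (EXIF present or missing) can be checked with the Go standard library alone. The sketch below walks the JPEG segment list and looks for an APP1 segment that starts with the `Exif` identifier. The function name `hasExif` is hypothetical, it handles JPEG only, and it is not taken from HumanMark's source.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// hasExif reports whether a JPEG byte stream contains an APP1 Exif segment
// before the image data begins. Hypothetical helper for illustration.
func hasExif(data []byte) bool {
	// Every JPEG starts with the SOI marker FF D8.
	if len(data) < 4 || data[0] != 0xFF || data[1] != 0xD8 {
		return false
	}
	i := 2
	for i+4 <= len(data) {
		if data[i] != 0xFF {
			return false // not at a marker: malformed or unsupported stream
		}
		marker := data[i+1]
		switch {
		case marker == 0xFF:
			i++ // fill byte before a marker
			continue
		case marker == 0x01 || (marker >= 0xD0 && marker <= 0xD7):
			i += 2 // standalone markers carry no length field
			continue
		case marker == 0xDA || marker == 0xD9:
			return false // start of scan or end of image: no Exif found
		}
		// The 2-byte big-endian length counts itself but not the marker.
		segLen := int(binary.BigEndian.Uint16(data[i+2 : i+4]))
		if segLen < 2 || i+2+segLen > len(data) {
			return false
		}
		payload := data[i+4 : i+2+segLen]
		if marker == 0xE1 && bytes.HasPrefix(payload, []byte("Exif\x00\x00")) {
			return true
		}
		i += 2 + segLen
	}
	return false
}

func main() {
	// Synthetic JPEG prefix: SOI, an APP1 segment with an Exif header, EOI.
	sample := []byte{0xFF, 0xD8, 0xFF, 0xE1, 0x00, 0x0A, 'E', 'x', 'i', 'f', 0, 0, 'M', 'M', 0xFF, 0xD9}
	fmt.Println("has EXIF:", hasExif(sample))
}
```

Reading the camera make would require parsing the TIFF structure inside the Exif payload, which this sketch leaves out.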

Audio:

- File header analysis
- Encoder fingerprinting
- AI tool markers (ElevenLabs, etc.)
- Temporal consistency checks

Video:

- Container metadata analysis
- Frame-by-frame consistency
- Generation tool signatures
- Encoding profile anomalies
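
File header and container analysis, as listed above, usually start by identifying the container from its leading bytes. The following is a minimal sketch under that assumption. The function name `sniffContainer` and the set of formats it covers are illustrative choices, not HumanMark's actual checks.

```go
package main

import (
	"fmt"
	"strings"
)

// sniffContainer guesses a media container from the first bytes of a file.
// Hypothetical helper: it covers only a few well-known signatures.
func sniffContainer(head []byte) string {
	s := string(head)
	switch {
	case len(s) >= 12 && s[:4] == "RIFF" && s[8:12] == "WAVE":
		return "wav"
	case len(s) >= 12 && s[:4] == "RIFF" && s[8:12] == "AVI ":
		return "avi"
	case strings.HasPrefix(s, "ID3"):
		return "mp3" // MP3 with an ID3v2 tag; untagged MP3 is not handled
	case strings.HasPrefix(s, "fLaC"):
		return "flac"
	case strings.HasPrefix(s, "OggS"):
		return "ogg"
	case strings.HasPrefix(s, "\x1a\x45\xdf\xa3"):
		return "matroska" // EBML header used by MKV and WebM
	case len(s) >= 8 && s[4:8] == "ftyp":
		return "iso-bmff" // MP4, MOV, M4A and related formats
	}
	return "unknown"
}

func main() {
	wavHeader := []byte("RIFF\x24\x00\x00\x00WAVEfmt ")
	fmt.Println(sniffContainer(wavHeader))
}
```

Once the container is known, a detector can go on to inspect encoder tags and tool markers in that format's metadata. Those format-specific steps are beyond this sketch.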
Use Cases
Enterprise Compliance
HIPAA/GDPR
Healthcare, Finance, Legal
Organizations that cannot send data to third-party APIs. HumanMark runs on-premises: patient records, legal documents, and financial data never leave your network.

Education at Scale
100K+ Students
Universities & Districts
Process millions of assignments without per-document fees. Self-hosted = unlimited usage at fixed infrastructure costs. Often 90%+ savings vs. commercial alternatives.

Developer Integration
REST API
Content Platforms & Apps
Integrate AI detection into your product: content moderation, hiring tools, CMS plugins, browser extensions. Full REST API with SDKs coming soon.

Government & Defense
Air-Gapped
Classified Environments
Deploy in air-gapped networks with zero internet connectivity. Essential for classified environments, secure facilities, and critical infrastructure.
Performance
Benchmarked on AWS c5.xlarge (4 vCPU, 8GB RAM):
Roadmap
- Text detection (statistical analysis)
- Image detection (EXIF + forensics)
- Audio detection (metadata)
- Video detection (container analysis)
- Docker support
- Prometheus metrics
- Browser extension (planned)
- VS Code extension (planned)
- WordPress plugin (planned)
- Python & JavaScript SDKs (planned)
- Admin dashboard (planned)
Links
Star on GitHub · Discussions

License
MIT License. Use it however you want. Free forever. No premium tiers, no enterprise editions, no bait-and-switch.
Built with ❤️ by Vin Patel
