Video Detection

Manuscript analyzes video files to detect deepfakes, face swaps, and fully AI-generated video content.

Metadata Analysis

Video containers carry rich metadata that helps distinguish real footage from AI output:

| Metadata          | Real Video                     | AI-Generated                |
| ----------------- | ------------------------------ | --------------------------- |
| Encoder           | Standard codecs (H.264, H.265) | May use non-standard codecs |
| Creation software | Premiere, Final Cut, etc.      | AI tool markers             |
| Frame rate        | Standard (24/30/60 fps)        | May be unusual              |
| Bitrate pattern   | Variable (VBR)                 | Often constant (CBR)        |

Weight: 0.25
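The frame-rate row above lends itself to a simple heuristic. A minimal sketch, assuming a list of standard rates and a tolerance (both illustrative choices, not Manuscript's actual implementation):

```python
# Illustrative sketch: flag a frame rate as suspicious when it is far
# from every standard broadcast/web rate. The rate list and tolerance
# are assumptions for this example.
STANDARD_FPS = (23.976, 24.0, 25.0, 29.97, 30.0, 50.0, 59.94, 60.0)

def fps_suspicion(fps: float, tolerance: float = 0.05) -> float:
    """Return 0.0 for a standard frame rate, 1.0 for an unusual one."""
    return 0.0 if any(abs(fps - s) <= tolerance for s in STANDARD_FPS) else 1.0

print(fps_suspicion(29.97))  # 0.0 (standard NTSC rate)
print(fps_suspicion(27.3))   # 1.0 (unusual rate)
```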

Temporal Analysis

AI-generated videos often show temporal inconsistencies:

  • Frame timing: Irregular intervals
  • Motion blur: Incorrect blur patterns
  • Flickering: Subtle frame-to-frame inconsistencies

Weight: 0.15
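Irregular frame timing can be quantified. One way, sketched here as an assumption rather than Manuscript's real metric, is the coefficient of variation of inter-frame intervals:

```python
# Sketch: frame-timing irregularity as the coefficient of variation of
# the intervals between presentation timestamps (hypothetical metric).
from statistics import mean, pstdev

def timing_irregularity(timestamps: list[float]) -> float:
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]
    m = mean(intervals)
    return pstdev(intervals) / m if m else 0.0

steady = [i / 30 for i in range(10)]          # clean 30 fps clip
jittery = [0, 0.03, 0.07, 0.10, 0.15, 0.17]   # irregular intervals
print(timing_irregularity(steady) < timing_irregularity(jittery))  # True
```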

Encoding Signatures

Manuscript identifies markers left by AI video tools, including:

  • Sora
  • Runway Gen-2
  • Pika Labs
  • Stable Video Diffusion
  • DeepFake tools

Weight: 0.15

Bitrate Analysis

Real videos show natural bitrate variation. AI-generated videos may have:

  • Too-consistent bitrate
  • Unusual compression patterns
  • Incorrect keyframe intervals

Weight: 0.10
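A too-consistent bitrate can be scored from per-second bitrate samples. A minimal sketch, with the 0.25 coefficient-of-variation threshold as an assumed cutoff for "natural VBR":

```python
# Sketch: higher score = more suspiciously constant bitrate.
# The 0.25 CV threshold is an illustrative assumption.
from statistics import mean, pstdev

def bitrate_consistency_score(kbps_per_second: list[float]) -> float:
    m = mean(kbps_per_second)
    cv = pstdev(kbps_per_second) / m if m else 0.0
    return max(0.0, 1.0 - cv / 0.25)  # cv >= 0.25 treated as natural VBR

vbr = [1800, 2400, 2100, 3200, 1500]  # natural variation
cbr = [2000, 2001, 1999, 2000, 2000]  # suspiciously flat
print(bitrate_consistency_score(cbr) > bitrate_consistency_score(vbr))  # True
```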

Audio Analysis

If an audio track is present, Manuscript checks:

  • Lip sync consistency
  • Audio-video timing
  • Audio authenticity (uses audio detection)

Weight: 0.20
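One simple way to frame the lip-sync check is as correlation between a per-frame mouth-openness series and the audio volume envelope. Both input series here are hypothetical values from upstream extraction stages, and the approach is a sketch, not Manuscript's documented method:

```python
# Sketch: crude lip-sync score as the Pearson correlation between
# mouth openness per frame and the audio volume envelope.
from statistics import mean

def pearson(xs: list[float], ys: list[float]) -> float:
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

mouth_open = [0.1, 0.8, 0.9, 0.2, 0.1, 0.7]
audio_env  = [0.2, 0.7, 0.8, 0.3, 0.1, 0.6]  # tracks the mouth closely

print(pearson(mouth_open, audio_env) > 0.9)  # True: high correlation, in sync
```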

Frame Analysis

When frame analysis is enabled, Manuscript checks:

  • Face consistency across frames
  • Background stability
  • Edge artifacts
  • Reflection/shadow consistency

Weight: 0.15
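The six weights above sum to 1.0. Assuming the signals are combined as a weighted sum (an assumption; the exact combination rule isn't specified here), the overall score could be sketched as:

```python
# Sketch: weighted combination of the per-signal scores, using the
# weights listed in the sections above. The combination rule is assumed.
WEIGHTS = {
    "metadata": 0.25,
    "temporal": 0.15,
    "encoding": 0.15,
    "bitrate": 0.10,
    "audio": 0.20,
    "frame": 0.15,
}

def combined_score(signals: dict[str, float]) -> float:
    """Weighted sum of per-signal scores, each in [0, 1]."""
    return sum(WEIGHTS[name] * score for name, score in signals.items())

scores = {"metadata": 0.40, "temporal": 0.72, "encoding": 0.90,
          "bitrate": 0.55, "audio": 0.80, "frame": 0.60}
print(round(combined_score(scores), 3))  # 0.648
```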

Example

```sh
curl -X POST http://localhost:8080/verify \
  -F "video=@clip.mp4"
```

```json
{
  "id": "hm_video789",
  "verdict": "ai",
  "confidence": 0.78,
  "content_type": "video",
  "signals": {
    "metadata_score": 0.40,
    "container_analysis": 0.65,
    "temporal_pattern": 0.72,
    "encoding_signature": "runway_detected",
    "bitrate_consistency": 0.55,
    "audio_analysis": {
      "present": true,
      "verdict": "ai",
      "confidence": 0.80
    }
  },
  "processing_time_ms": 350
}
```
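A client can act on the verdict and confidence fields directly. A minimal sketch that parses a response body (truncated here to the top-level fields of the example above):

```python
# Sketch: read the verdict from a /verify JSON response body.
# Payload truncated to top-level fields for illustration.
import json

response_body = '{"id": "hm_video789", "verdict": "ai", "confidence": 0.78, "content_type": "video"}'

result = json.loads(response_body)
is_ai = result["verdict"] == "ai" and result["confidence"] >= 0.7
print(is_ai)  # True
```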

The video benchmark is pending: it requires downloading video files via API keys.

Based on related benchmarks:

  • Target Accuracy: >75%
  • Off-the-shelf detectors: 21.3% lower accuracy on Sora-like videos
  • Primary Challenges: New diffusion video models, compression

Supported Formats

  • MP4 (.mp4)
  • WebM (.webm)
  • MOV (.mov)
  • AVI (.avi)
  • MKV (.mkv)

Limitations

  1. Full frame analysis requires ffmpeg for extraction
  2. Re-encoded videos lose original signatures
  3. Short clips (<3 seconds) have insufficient data
  4. New AI models may not be in the signature database

Best Practices

  1. Original files: Use the source video when possible
  2. Minimum length: At least 3 seconds for reliable detection
  3. Include audio: An audio track improves detection
  4. Avoid re-encoding: Each re-encode removes markers

Planned Improvements

  • Full frame extraction with ffmpeg integration
  • Face consistency analysis across frames
  • Lip sync verification
  • Temporal coherence scoring
  • Real-time streaming detection

For face-swap deepfakes, Manuscript looks for:

  1. Blending boundaries around face edges
  2. Skin tone inconsistencies
  3. Eye/teeth artifacts
  4. Temporal face jittering
  5. Lighting inconsistencies
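
Temporal face jittering (item 4) can be quantified once a face tracker supplies per-frame coordinates. A minimal sketch, assuming hypothetical bounding-box centres as input:

```python
# Sketch: temporal face jitter as the mean frame-to-frame displacement
# of a face bounding-box centre. Coordinates are illustrative values
# assumed to come from an upstream face tracker.
import math

def jitter(centres: list[tuple[float, float]]) -> float:
    steps = [math.dist(a, b) for a, b in zip(centres, centres[1:])]
    return sum(steps) / len(steps)

smooth  = [(100, 100), (101, 100), (102, 101), (103, 101)]  # natural motion
jittery = [(100, 100), (106, 97), (99, 104), (107, 99)]     # unstable face
print(jitter(jittery) > jitter(smooth))  # True
```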