ArcheoVLM

Phase 4: Hybrid Intelligence Detection with OpenAI Vision Models

Apply hybrid computer vision and OpenAI GPT-4.1/GPT-o4 Vision-Language Models to detect archaeological features in Sky-View Factor imagery

Goal

Apply a hybrid computer vision and OpenAI Vision-Language Model engine to detect potential archaeological features in Sky-View Factor imagery, using GPT-4.1 for primary cluster detection and GPT-o4 for validation, with an active learning loop to reduce annotation effort and improve accuracy over successive iterations.
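
The sketch below shows one way the engine's moving parts could be wired together in a single configuration object; the class name, file paths, thresholds, and default values are illustrative assumptions, not project code.

# Illustrative sketch only: a configuration object tying together the models and
# thresholds described in the goal above. Names, paths, and defaults are assumptions.
from dataclasses import dataclass

@dataclass
class HybridConfig:
    yolo_weights: str = "runs/svf_yolov9/weights/best.pt"  # fine-tuned YOLOv9 checkpoint (assumed path)
    primary_vlm: str = "gpt-4.1"          # primary cluster-detection model
    validation_vlm: str = "gpt-o4"        # secondary validation model (name as used in this document)
    yolo_conf_threshold: float = 0.25     # minimum YOLO confidence to keep a detection
    vlm_conf_threshold: float = 0.5       # minimum VLM-reported confidence to keep a feature
    annotation_batch_size: int = 50       # SVF tiles sent for expert annotation per active learning iteration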

Hybrid Intelligence Architecture
Combined computer vision and OpenAI vision model approach

YOLO Object Detection

YOLOv9

Trained on single-channel grayscale Sky-View Factor (SVF) images to detect specific archaeological feature classes; a training and inference sketch follows the list below

• Geoglyphs
• Mounds
• Linear earthworks
• Geometric anomalies
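
A minimal training-and-inference sketch, assuming the ultralytics Python package, pretrained yolov9c.pt weights, and a dataset config file (here called svf_dataset.yaml) listing the four feature classes; file names and hyperparameters are illustrative. Single-channel SVF rasters are assumed to be exported as standard 8-bit image tiles, which ultralytics loads as 3-channel input.

# Sketch only: fine-tune YOLOv9 on annotated SVF tiles and run inference on a new tile.
# "svf_dataset.yaml", the tile path, and the hyperparameters are assumptions.
from ultralytics import YOLO

# Start from pretrained YOLOv9 weights and fine-tune on the annotated SVF seed set.
model = YOLO("yolov9c.pt")
model.train(data="svf_dataset.yaml", epochs=100, imgsz=640)

# Run inference on an unlabeled SVF tile and print the detections.
results = model.predict("tiles/svf_tile_0001.png", conf=0.25)
for box in results[0].boxes:
    cls_name = results[0].names[int(box.cls)]
    x1, y1, x2, y2 = box.xyxy[0].tolist()
    print(f"{cls_name}: conf={float(box.conf):.2f}, bbox=({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")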

OpenAI Vision-Language Models

GPT-4.1 Primary Analysis

Advanced Chain-of-Thought prompting for cluster detection and feature identification in SVF imagery; an example API call follows the list below

GPT-o4 Validation

Secondary analysis and verification with a second, independent OpenAI vision model pass to improve accuracy

• Geometric anomaly identification
• Structured JSON output
• Confidence scoring
• Feature descriptions
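
A minimal sketch of the primary GPT-4.1 call referenced above, assuming the official openai Python SDK; the prompt wording and the JSON fields requested are illustrative, not the project's actual prompt or schema.

# Sketch only: send an SVF tile to GPT-4.1 with a Chain-of-Thought style prompt
# and request structured JSON output. Prompt text and JSON fields are illustrative.
import base64, json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def analyze_svf_tile(image_path: str) -> dict:
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")

    response = client.chat.completions.create(
        model="gpt-4.1",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": (
                "You are an archaeological remote-sensing analyst. Reason step by step "
                "about clusters and geometric anomalies in the Sky-View Factor image, then "
                "return JSON with a 'reasoning' string summarizing your analysis and a "
                "'features' list; each feature has 'type', 'bbox' (pixel coordinates), "
                "'confidence' (0-1), and 'description'."
            )},
            {"role": "user", "content": [
                {"type": "text", "text": "Analyze this SVF tile for potential archaeological features."},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ]},
        ],
    )
    return json.loads(response.choices[0].message.content)
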
Sky-View Factor Analysis Workflow
OpenAI vision model processing of archaeological imagery

1. SVF Generation

LiDAR Processing

Generate Sky-View Factor imagery from LiDAR data

2. GPT-4.1 Analysis

Primary Detection

Cluster detection and feature identification

3. GPT-o4 Validation

Secondary Analysis

Verification and enhanced feature description

4. YOLO Integration

CV Validation

Computer vision cross-validation

5. Consensus

Final Results

Consolidated multi-model analysis
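
A minimal sketch of one way the consensus step could work, assuming each model's detections have first been normalized to records with a pixel-space 'bbox', a 'type', and a 'confidence'; the IoU threshold and the two-model agreement rule are illustrative assumptions.

# Sketch only: consolidate detections from YOLO, GPT-4.1, and GPT-o4 by spatial overlap.
# Record format, IoU threshold, and the two-model agreement rule are assumptions.

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def consolidate(detections_by_model, iou_threshold=0.5, min_models=2):
    """detections_by_model: {model_name: [{'bbox': (...), 'type': str, 'confidence': float}, ...]}"""
    flat = [(m, d) for m, dets in detections_by_model.items() for d in dets]
    consensus, used = [], set()
    for i, (model_i, det_i) in enumerate(flat):
        if i in used:
            continue
        group = [(model_i, det_i)]
        for j, (model_j, det_j) in enumerate(flat[i + 1:], start=i + 1):
            if j not in used and model_j != model_i and iou(det_i["bbox"], det_j["bbox"]) >= iou_threshold:
                group.append((model_j, det_j))
                used.add(j)
        # Keep only features that at least two different models agree on.
        if len({m for m, _ in group}) >= min_models:
            consensus.append({
                "bbox": det_i["bbox"],
                "models": sorted({m for m, _ in group}),
                "types": sorted({d["type"] for _, d in group}),
                "confidence": max(d["confidence"] for _, d in group),
            })
    return consensus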

OpenAI Vision Model Benefits

GPT-4.1 and GPT-o4 can reason about spatial context and archaeological patterns in Sky-View Factor imagery, helping flag subtle features that a purely computer-vision pipeline might miss

Detection Outputs
Structured archaeological feature identification from OpenAI models

YOLO Detections

• Bounding box coordinates
• Feature class labels
• Classification confidence scores
• Localization uncertainty metrics

GPT-4.1 Analysis

• Cluster detection in SVF imagery
• Archaeological feature classification
• Geometric pattern recognition
• Structured JSON descriptions

GPT-o4 Validation

• Secondary feature verification
• Enhanced confidence scoring
• Detailed archaeological descriptions
• Cross-model validation results
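
A minimal sketch of a common record that could hold the per-model outputs listed above; field names and types are illustrative assumptions, not a fixed project schema.

# Sketch only: one record type covering the YOLO and VLM outputs listed above.
# Field names and types are assumptions.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class FeatureDetection:
    source_model: str                     # "yolov9", "gpt-4.1", or "gpt-o4"
    feature_type: str                     # e.g. "geoglyph", "mound", "linear_earthwork", "geometric_anomaly"
    bbox: tuple                           # (x1, y1, x2, y2) pixel coordinates in the SVF tile
    confidence: float                     # classification or VLM-reported confidence, 0-1
    description: str = ""                 # free-text archaeological description (VLM outputs)
    localization_uncertainty: Optional[float] = None   # YOLO localization uncertainty metric, if computed
    validated_by: list = field(default_factory=list)   # models that confirmed the same feature
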
Execution Checklist
Phase 4 tasks and deliverables with OpenAI integration
Set up the annotation environment (e.g., Labelbox, CVAT)
Have an expert archaeologist annotate the initial seed set of ~200 SVF images
Configure and train the initial YOLOv9 model on the seed set
Develop the GPT-4.1 vision analysis script with Chain-of-Thought prompting
Implement GPT-o4 secondary analysis for validation and verification
Develop the Active Learning module script for continuous improvement (see the sketch after this checklist)
Begin Active Learning Loop iterations with both OpenAI models
Run YOLO and GPT-4.1/GPT-o4 inference on the unlabeled SVF image pool
Consolidate final detections from YOLO, GPT-4.1, and GPT-o4 models
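
A minimal sketch of an uncertainty-sampling step for the Active Learning module, assuming per-tile YOLO confidences and a flag marking YOLO/VLM disagreement are already available; the scoring rule and batch size are illustrative assumptions.

# Sketch only: pick the next batch of SVF tiles for expert annotation.
# Tiles where YOLO is least confident, or where YOLO and the VLMs disagree,
# are the most informative to label next. Scoring rule and batch size are assumptions.

def select_for_annotation(tile_results, batch_size=50):
    """tile_results: [{'tile': str, 'yolo_max_conf': float, 'models_disagree': bool}, ...]"""
    def informativeness(r):
        score = 1.0 - r["yolo_max_conf"]   # low confidence -> high score
        if r["models_disagree"]:
            score += 0.5                   # boost tiles where YOLO and the VLMs conflict
        return score

    ranked = sorted(tile_results, key=informativeness, reverse=True)
    return [r["tile"] for r in ranked[:batch_size]]
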
Expected Outputs

Trained YOLO Model

Fine-tuned YOLOv9 object detection model for archaeological features in SVF imagery

GPT-4.1/GPT-o4 Analysis

OpenAI vision model results with cluster detections, confidence scores, and feature descriptions

Consolidated Detections

Master list of features validated by multiple AI models