ArcheoVLM

Phase 4: Hybrid Intelligence Detection with OpenAI Vision Models

Apply hybrid computer vision and OpenAI GPT-4.1/GPT-o4 Vision-Language Models to detect archaeological features in Sky-View Factor imagery

Goal

Apply a hybrid computer vision and OpenAI Vision-Language Model engine to detect potential archaeological features in Sky-View Factor imagery, using GPT-4.1 for primary cluster detection and GPT-o4 for validation, with an active learning loop to reduce annotation effort and improve accuracy over successive iterations.
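
The sketch below shows one way the engine's moving parts could be wired together in a single configuration object; the class name, file paths, thresholds, and default values are illustrative assumptions, not project code.

# Illustrative sketch only: a configuration object tying together the models and
# thresholds described in the goal above. Names, paths, and defaults are assumptions.
from dataclasses import dataclass

@dataclass
class HybridConfig:
    yolo_weights: str = "runs/svf_yolov9/weights/best.pt"  # fine-tuned YOLOv9 checkpoint (assumed path)
    primary_vlm: str = "gpt-4.1"          # primary cluster-detection model
    validation_vlm: str = "gpt-o4"        # secondary validation model (name as used in this document)
    yolo_conf_threshold: float = 0.25     # minimum YOLO confidence to keep a detection
    vlm_conf_threshold: float = 0.5       # minimum VLM-reported confidence to keep a feature
    annotation_batch_size: int = 50       # SVF tiles sent for expert annotation per active learning iteration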

Hybrid Intelligence Architecture
Combined computer vision and OpenAI vision model approach

YOLO Object Detection

YOLOv9

Trained on single-channel grayscale Sky-View Factor (SVF) images to detect specific archaeological feature classes; a training and inference sketch follows the list below

• Geoglyphs
• Mounds
• Linear earthworks
• Geometric anomalies
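
A minimal training-and-inference sketch, assuming the ultralytics Python package, pretrained yolov9c.pt weights, and a dataset config file (here called svf_dataset.yaml) listing the four feature classes; file names and hyperparameters are illustrative. Single-channel SVF rasters are assumed to be exported as standard 8-bit image tiles, which ultralytics loads as 3-channel input.

# Sketch only: fine-tune YOLOv9 on annotated SVF tiles and run inference on a new tile.
# "svf_dataset.yaml", the tile path, and the hyperparameters are assumptions.
from ultralytics import YOLO

# Start from pretrained YOLOv9 weights and fine-tune on the annotated SVF seed set.
model = YOLO("yolov9c.pt")
model.train(data="svf_dataset.yaml", epochs=100, imgsz=640)

# Run inference on an unlabeled SVF tile and print the detections.
results = model.predict("tiles/svf_tile_0001.png", conf=0.25)
for box in results[0].boxes:
    cls_name = results[0].names[int(box.cls)]
    x1, y1, x2, y2 = box.xyxy[0].tolist()
    print(f"{cls_name}: conf={float(box.conf):.2f}, bbox=({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")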

OpenAI Vision-Language Models

GPT-4.1 Primary Analysis

Advanced Chain-of-Thought prompting for cluster detection and feature identification in SVF imagery; an example API call follows the list below

GPT-o4 Validation

Secondary analysis and verification with a second, independent OpenAI vision model pass to improve accuracy

• Geometric anomaly identification
• Structured JSON output
• Confidence scoring
• Feature descriptions
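
A minimal sketch of the primary GPT-4.1 call referenced above, assuming the official openai Python SDK; the prompt wording and the JSON fields requested are illustrative, not the project's actual prompt or schema.

# Sketch only: send an SVF tile to GPT-4.1 with a Chain-of-Thought style prompt
# and request structured JSON output. Prompt text and JSON fields are illustrative.
import base64, json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def analyze_svf_tile(image_path: str) -> dict:
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")

    response = client.chat.completions.create(
        model="gpt-4.1",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": (
                "You are an archaeological remote-sensing analyst. Reason step by step "
                "about clusters and geometric anomalies in the Sky-View Factor image, then "
                "return JSON with a 'reasoning' string summarizing your analysis and a "
                "'features' list; each feature has 'type', 'bbox' (pixel coordinates), "
                "'confidence' (0-1), and 'description'."
            )},
            {"role": "user", "content": [
                {"type": "text", "text": "Analyze this SVF tile for potential archaeological features."},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ]},
        ],
    )
    return json.loads(response.choices[0].message.content)
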
Sky-View Factor Analysis Workflow
OpenAI vision model processing of archaeological imagery

1. SVF Generation

LiDAR Processing

Generate Sky-View Factor imagery from LiDAR data

2. GPT-4.1 Analysis

Primary Detection

Cluster detection and feature identification

3. GPT-o4 Validation

Secondary Analysis

Verification and enhanced feature description

4. YOLO Integration

CV Validation

Computer vision cross-validation

5. Consensus

Final Results

Consolidated multi-model analysis
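
A minimal sketch of one way the consensus step could work, assuming each model's detections have first been normalized to records with a pixel-space 'bbox', a 'type', and a 'confidence'; the IoU threshold and the two-model agreement rule are illustrative assumptions.

# Sketch only: consolidate detections from YOLO, GPT-4.1, and GPT-o4 by spatial overlap.
# Record format, IoU threshold, and the two-model agreement rule are assumptions.

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def consolidate(detections_by_model, iou_threshold=0.5, min_models=2):
    """detections_by_model: {model_name: [{'bbox': (...), 'type': str, 'confidence': float}, ...]}"""
    flat = [(m, d) for m, dets in detections_by_model.items() for d in dets]
    consensus, used = [], set()
    for i, (model_i, det_i) in enumerate(flat):
        if i in used:
            continue
        group = [(model_i, det_i)]
        for j, (model_j, det_j) in enumerate(flat[i + 1:], start=i + 1):
            if j not in used and model_j != model_i and iou(det_i["bbox"], det_j["bbox"]) >= iou_threshold:
                group.append((model_j, det_j))
                used.add(j)
        # Keep only features that at least two different models agree on.
        if len({m for m, _ in group}) >= min_models:
            consensus.append({
                "bbox": det_i["bbox"],
                "models": sorted({m for m, _ in group}),
                "types": sorted({d["type"] for _, d in group}),
                "confidence": max(d["confidence"] for _, d in group),
            })
    return consensus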

OpenAI Vision Model Benefits

GPT-4.1 and GPT-o4 can reason about spatial context and archaeological patterns in Sky-View Factor imagery, helping flag subtle features that a purely computer-vision pipeline might miss

Detection Outputs
Structured archaeological feature identification from OpenAI models

YOLO Detections

• Bounding box coordinates
• Feature class labels
• Classification confidence scores
• Localization uncertainty metrics

GPT-4.1 Analysis

• Cluster detection in SVF imagery
• Archaeological feature classification
• Geometric pattern recognition
• Structured JSON descriptions

GPT-o4 Validation

• Secondary feature verification
• Enhanced confidence scoring
• Detailed archaeological descriptions
• Cross-model validation results
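
A minimal sketch of a common record that could hold the per-model outputs listed above; field names and types are illustrative assumptions, not a fixed project schema.

# Sketch only: one record type covering the YOLO and VLM outputs listed above.
# Field names and types are assumptions.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class FeatureDetection:
    source_model: str                     # "yolov9", "gpt-4.1", or "gpt-o4"
    feature_type: str                     # e.g. "geoglyph", "mound", "linear_earthwork", "geometric_anomaly"
    bbox: tuple                           # (x1, y1, x2, y2) pixel coordinates in the SVF tile
    confidence: float                     # classification or VLM-reported confidence, 0-1
    description: str = ""                 # free-text archaeological description (VLM outputs)
    localization_uncertainty: Optional[float] = None   # YOLO localization uncertainty metric, if computed
    validated_by: list = field(default_factory=list)   # models that confirmed the same feature
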
Execution Checklist
Phase 4 tasks and deliverables with OpenAI integration
Set up the annotation environment (e.g., Labelbox, CVAT)
Have an expert archaeologist annotate the initial seed set of ~200 SVF images
Configure and train the initial YOLOv9 model on the seed set
Develop the GPT-4.1 vision analysis script with Chain-of-Thought prompting
Implement GPT-o4 secondary analysis for validation and verification
Develop the Active Learning module script for continuous improvement (see the sketch after this checklist)
Begin Active Learning Loop iterations with both OpenAI models
Run YOLO and GPT-4.1/GPT-o4 inference on the unlabeled SVF image pool
Consolidate final detections from YOLO, GPT-4.1, and GPT-o4 models
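
A minimal sketch of an uncertainty-sampling step for the Active Learning module, assuming per-tile YOLO confidences and a flag marking YOLO/VLM disagreement are already available; the scoring rule and batch size are illustrative assumptions.

# Sketch only: pick the next batch of SVF tiles for expert annotation.
# Tiles where YOLO is least confident, or where YOLO and the VLMs disagree,
# are the most informative to label next. Scoring rule and batch size are assumptions.

def select_for_annotation(tile_results, batch_size=50):
    """tile_results: [{'tile': str, 'yolo_max_conf': float, 'models_disagree': bool}, ...]"""
    def informativeness(r):
        score = 1.0 - r["yolo_max_conf"]   # low confidence -> high score
        if r["models_disagree"]:
            score += 0.5                   # boost tiles where YOLO and the VLMs conflict
        return score

    ranked = sorted(tile_results, key=informativeness, reverse=True)
    return [r["tile"] for r in ranked[:batch_size]]
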
Expected Outputs

Trained YOLO Model

Fine-tuned YOLOv9 object detection model for archaeological features in SVF imagery

GPT-4.1/GPT-o4 Analysis

OpenAI vision model results with cluster detections, confidence scores, and feature descriptions

Consolidated Detections

Master list of features validated by multiple AI models