ESRI Well + Surface + Survey Automation Plan
Objective
Create a structured, project-bound workflow that:
- Discovers relevant ESRI/ArcGIS datasets for a target AOI.
- Classifies data into drilling-relevant domains (
well,surface,survey) while excluding non-target buckets (for example water-only rows when not needed). - Stores artifacts under deterministic project/customer paths.
- Feeds curated data into existing ingestion, CRS/ECEF validation, audit, and output generation pipelines.
Project Binding Rule (Required)
Every scrape run must bind to:
project_id(existing project or generated new project id)client_id(customer)location_or_project_id(site/field/run scope)upload_id(source run identity)
All search and analytics artifacts are written under:
open_data:<project>/open_data/client/<client>/<location>/...derived:<project>/derived/client/<client>/<location>/...
This supports:
- New project onboarding: generate and persist workspace profile first.
- Existing project expansion: append data and re-run with same context.
Automated Collection Sequence
- AOI setup:
- Use direct bbox OR derive from processed upload centroid.
- Normalize AOI to EPSG:4326 envelope.
- ESRI discovery:
- Query ArcGIS
FeatureLayer/MapServerlayers with AOI intersection. - Pull:
- intersecting
record_count - sample attributes
- field names for semantic classification
- Relevance classification:
- Infer
relevance_kindusing layer name + fields: wellsurfacesurveywaterreference- Convert to category:
subsurface/wellsurface/elevationsurvey/controlhydrology/water
- Ranking + queue:
- Merge ESRI sources into normalized source list.
- Rank using overlap/resolution/recency/license/format weights.
- Generate prioritized download queue under project-bound storage keys.
- Large tabular dataset gate (WCR, 100k+ rows):
- Sample/scan rows.
- Classify rows into:
well_oil_gaswater_relatedsurvey_relatedsurface_relatedother- Recommend filter policy before stage execution.
- Existing pipeline handoff:
- Register selected files as reference/source datasets.
- Run:
- ingest/normalize
- CRS + vertical + epoch checks
- ECEF consistency checks
- processing audit validation
- Emit outputs:
- audit JSON/MD
- well bundle JSON
- Autodesk exports (PTS/ASC)
- metadata index
Minimum Data Needed Per Domain
Well domain:
API/APINumber, well type/status, operator, lat/lon, spud/completion dates.
Surface domain:
- elevation/terrain layers (
DEM/DSM/LiDAR) with source CRS and resolution metadata.
Survey domain:
- control points/stations, northing/easting/elevation, datum/epoch references.
Current API Support
POST /data-scraper/esri/relevance-plan- Discovers ESRI layers in AOI.
- Produces relevance-ranked shortlist + queue.
- Optionally classifies a large WCR CSV from object key or URL.
- Returns integrated execution plan and storage layout.
Next Build Extensions
- Persist ESRI layer catalog in DB (tenant-aware) with active/inactive flags.
- Add scheduler for incremental AOI refresh per project.
- Add strict production domain policy templates:
- oil-and-gas focused
- groundwater focused
- mixed energy/water mode
- Attach confidence scoring against audited benchmark manifests.