# Operations And Output Guide

## 1) Site Access
Use this as the main access map:
- Machine-readable: `docs/SITE_ACCESS.json`
- Human-readable quick list:
  - API Gateway: http://localhost:8080
  - Control Plane API: http://localhost:8081
  - Data Plane API: http://localhost:8082
  - Admin Web: http://localhost:3100
  - Client Web: http://localhost:3101
  - MinIO API: http://localhost:19000
  - MinIO Console: http://localhost:19001
  - PostgreSQL: `postgresql://earthbond:earthbond@localhost:5433/earthbond`
  - HTML manuals index: http://localhost:3101/manuals/index.html
Default POC login:
- Username: `admin`
- Password: `admin123`
Authentication UI:
- Account controls are now in the header (top-right **Account** menu):
  - login / verify / logout
  - token status shown directly in the menu
Manuals in UI:
- The client web now exposes a **Docs + Manuals (HTML)** panel.
- It reads `/manuals/manifest.json` and renders docs inside an embedded viewer frame.
- To regenerate docs HTML from markdown: `make build-ui-manuals`
Workspace management:
- Client web now includes **Project Workspace** profile management:
  - save/load/delete workspace profiles (`project_id`, customer, description, tenant)
  - auto-generate project IDs for new runs
  - optional full module sync on workspace load
  - scraper runs are expected to execute under this workspace context so outputs are structured under the same project/customer path
Dashboard workflow control:
- **Beta Production Control** now includes dashboard orchestration:
  - module layout mode: `single_page` or `structured`
  - module jump navigation (workspace, ingest, processing, visualization, scraper, outputs, docs)
  - module-level progress bars with dependency-aware execution
  - `Run Dashboard Workflow` runs module waves in parallel where dependencies allow
  - `Continue on error` mode keeps non-dependent modules running; show-stopper errors trigger a popup + auto-jump to the failing module
- **Operations Dashboard** provides segment-first controls:
  - per-segment run/open buttons
  - per-segment progress bars
  - overall progress bar across the full pipeline
  - designed to start with `Workspace + Upload`, then continue through ingest/processing/output
## 2) Output Formats

Primary outputs currently supported:

- `JSON` metadata in API responses and DB (`ops.job_results.result`)
- Autodesk-friendly `PTS` point cloud output
- Autodesk 2D terrain `ASC` (ESRI ASCII DEM; Civil 3D import-ready)
- `CSV` and `JSONL` export for ML workflows
Export endpoint: `POST /jobs/export/autodesk`

Request body fields:

- `project_id`
- `upload_id` or `object_key`
- `format=pts|asc|csv|jsonl`
- `classification_codes` (optional class filter)
- `stride` (optional decimation)
- `max_points` (optional cap)
- `grid_cell_size` (optional, meters; used by `asc`)
- `elevation_stat` (optional `max|mean|min`; used by `asc`)
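A request body for this endpoint can be sketched as a small payload builder. This is illustrative only: the field names mirror the documented list, but `build_export_payload` is a hypothetical helper and the IDs are placeholders.

```python
# Sketch of a POST /jobs/export/autodesk request body. Field names mirror
# the documented list; project/upload IDs below are placeholders.
ALLOWED_OPTIONS = {"classification_codes", "stride", "max_points",
                   "grid_cell_size", "elevation_stat"}

def build_export_payload(project_id, upload_id, fmt="pts", **options):
    if fmt not in ("pts", "asc", "csv", "jsonl"):
        raise ValueError(f"unsupported format: {fmt}")
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError(f"unsupported export options: {sorted(unknown)}")
    return {"project_id": project_id, "upload_id": upload_id,
            "format": fmt, **options}

payload = build_export_payload("proj-demo", "upl-001", fmt="asc",
                               grid_cell_size=1.0, elevation_stat="mean")
```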
Export artifact location pattern in object storage:

- `<project_id>/derived/<upload_id-or-job_id>/autodesk/<source_basename>.<format>`
- Classification summary: `<project_id>/derived/<upload_id-or-job_id>/autodesk/<source_basename>.classification.json`
- Terrain map metadata for ASC exports: `<project_id>/derived/<upload_id-or-job_id>/autodesk/<source_basename>.terrain-map.json`
- UI shortcut: after a successful `Job Status` call, the client UI auto-sets `Bucket Prefix` to this export folder and refreshes `Folder Tree`.
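The artifact key pattern above can be sketched as a small helper. `export_artifact_keys` is a hypothetical name for illustration; the service derives the real keys internally.

```python
# Sketch of the export artifact key pattern described above: one artifact
# per format, plus the classification summary, plus terrain-map metadata
# for ASC exports only.
def export_artifact_keys(project_id, parent_id, source_basename, fmt):
    base = f"{project_id}/derived/{parent_id}/autodesk/{source_basename}"
    keys = {
        "artifact": f"{base}.{fmt}",
        "classification": f"{base}.classification.json",
    }
    if fmt == "asc":
        keys["terrain_map"] = f"{base}.terrain-map.json"
    return keys

keys = export_artifact_keys("proj-demo", "upl-001", "survey_tile", "asc")
# keys["artifact"] == "proj-demo/derived/upl-001/autodesk/survey_tile.asc"
```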
2D terrain map preview endpoint: `GET /maps/terrain/preview`

- Query params:
  - `job_id` or `object_key` (ASC object)
  - `source_crs` (optional; required when the CRS cannot be inferred)
  - `contour_interval` (optional)
  - `max_dimension` (optional grid downsample cap)
- Response includes:
  - `bounds_wgs84` (for map fit)
  - `hillshade` (`rows`, `cols`, grayscale `values`)
  - `contours_geojson` + `contour_levels`
ESRI relevance planning endpoint: `POST /data-scraper/esri/relevance-plan`

- Purpose:
  - discover ESRI/ArcGIS layers in the AOI
  - classify relevance into well/surface/survey/water/reference domains
  - merge with the search ranking queue
  - optionally classify large WCR CSV rows (`wcr_object_key` or `wcr_source_url`)
  - return an integrated execution plan and project-structured storage layout
- Supports auth fields for protected ArcGIS layers:
  - `esri_auth_mode` (`public|token`)
  - `esri_token` (optional direct token)
  - or `esri_token_url` + `esri_username` + `esri_password` (+ optional `esri_referer`)
Output distribution endpoints:
- `GET /storage/object-link` (single object download URL; public or presigned)
- `POST /storage/bundle-prefix` (zip bundle for a folder prefix; returns download URL)
Validated Output Hub (client UI):
- Use the **Validated Output Hub** panel for a clean output delivery view.
- Features:
  - structured prefix selection (`derived`, `reports`, `open_data`, `raw`, or custom)
  - validated/deliverable tagging for output rows
  - select all (page), clear, single download, and full-folder bundle download
  - summary counts and total size for the selected output scope
Client Web UI 2D map panel:
- Open http://localhost:3101.
- Export to `ASC DEM` first (`Queue Autodesk Export`).
- Set `task_id` and click `Use Task ID`, then `Load 2D Map`.
- Base layer switch: `OpenStreetMap`, `Google Street (tile)`, `Satellite (Esri)`.
- Overlay toggles: `Hillshade` on/off + opacity; `Contours` on/off.
- If the map request fails with a CRS detail, fill `Source CRS` (`EPSG:xxxx`) and reload.
## 3) Metadata Browsing

Endpoints:

- `GET /projects/{project_id}/metadata`
- `GET /projects/{project_id}/metadata/{job_id}`
UI:
- Client web includes a metadata browser with:
  - processed-file list
  - readable detail panel
  - class profile + CRS + ECEF summary
  - location validation controls:
    - `Location Validation` toggle (show/hide location validation block)
    - `Force Map Link` (build map link from centroid even when strict URL is unavailable)
    - `Open Map` (open Google Maps for selected metadata item)
  - tile issue controls:
    - `Detect Tile Issues` (checks wrong-location/outlier tiles)
    - `Fix Tile Set` (removes detected outlier tiles from active selection)
    - `Include Outlier Tiles` checkbox for explicit yes/no override during render
    - yes/no confirmation prompt appears when outlier tiles are detected and currently included
  - uploaded-file list (recent uploads)
  - uploaded folder-style tree view under `<project_id>/raw/`
  - storage Finder (folder level) with clickable rows:
    - `Review Uploaded`: refresh uploaded list + reset Finder at `<project_id>/raw/`
    - `Finder Root`: jump to `<project_id>/raw/`
    - `Finder Up`: move one level up (bounded to project raw root)
    - `Finder Open`: open selected folder or set selected file as `object_key`
    - `Finder Refresh`: reload current Finder prefix
    - breadcrumb + summary state for quick storage navigation context
- folder batch upload for multi-file projects (keeps relative paths in object keys)
- ZIP bundle upload:
  - `Upload ZIP Bundle` accepts one `.zip`, extracts folder structure server-side, and stores files under `<project_id>/raw/<zip_upload_id>/<relative_path_from_zip>`
  - each extracted file is registered in `ops.upload_sessions` with:
    - `source=direct_zip_upload`
    - `zip_upload_id`
    - `zip_entry_path`
    - inferred `file_role` (`primary` for LAS/LAZ, `metadata` for XML/TXT/HTML/PDF/CSV, `reference` otherwise)
- processing shortcuts:
  - `Locate Latest LAS/LAZ` auto-fills `upload_id` + `object_key` and jumps Finder to the file location
  - `Start Processing` auto-detects the selected file signature and routes to:
    - `POST /jobs/ingest` for binary point-cloud LAS/LAZ (`LASF` signature)
    - `POST /well-logs/jobs/normalize` for text LAS well logs (`~Version` marker)
  - detection endpoint used by UI: `GET /uploads/detect-kind?project_id=...&upload_id=...|object_key=...`
- workflow visibility:
  - `Workflow Tracker` in Upload panel shows `Upload -> Selected -> Queued -> Processing`
  - includes a `Next:` hint line so the operator knows the immediate next action
  - processing stage auto-updates from `/jobs/status` polling once a task is queued
- upload automation controls:
  - `Auto-run pipeline after upload` toggle
  - mode: `Run Next` or `Run Full Pipeline`
  - `Run Next Now`, `Run Full Now`, and `Sync Context` available directly in the Upload panel
  - cross-module context sync auto-updates metadata, processing visibility, Finder context, and beta board after selection/process events
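The signature-based routing described above (binary `LASF` point clouds vs text `~Version` well logs) can be sketched as a simplified stand-in for the detect-kind logic; the real endpoint inspects the stored object, and `detect_upload_kind` is a hypothetical name.

```python
# Simplified sketch of the detect-kind routing rule: binary point-cloud
# LAS/LAZ starts with the "LASF" header signature, while a text LAS well
# log starts with a "~Version" (section marker beginning "~V").
def detect_upload_kind(first_bytes: bytes) -> str:
    if first_bytes.startswith(b"LASF"):
        return "pointcloud"   # route to POST /jobs/ingest
    if first_bytes.lstrip().startswith(b"~V"):
        return "welllog"      # route to POST /well-logs/jobs/normalize
    return "unknown"

# detect_upload_kind(b"LASF\x01\x02") -> "pointcloud"
# detect_upload_kind(b"~Version 1.2\n") -> "welllog"
```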
- well-log stage controls in Upload panel:
  - `Run Interpretation` queues `POST /well-logs/jobs/interpret` for the current `well_id`
  - `Run Bypassed Pay` queues `POST /well-logs/jobs/bypassed-pay` for the current `well_id`
  - `Download Output Bundle` exports one JSON containing:
    - status-pack (`GET /projects/{project_id}/wells/{well_id}/status-pack`)
    - linked normalize/interpret/bypassed-pay job status payloads
- Well Data Processing Visibility now includes a well-workflow job feed:
  - `Refresh Well Jobs` loads `GET /projects/{project_id}/well-jobs`
  - optional well filter (`processing-well-id`) narrows to one well
  - live sync updates both dataset cards and well-job cards
- Seamless Workflow Runner:
  - one-click `Run Seamless` orchestrates end-to-end flow from the selected object
  - mode: `auto` (detect point-cloud vs well-log LAS), `pointcloud`, `welllog`
  - point-cloud path can auto-run:
    - ingest
    - metadata refresh + optional 3D render
    - export + optional 2D terrain map load
    - optional ECEF consistency audit
  - well-log path can auto-run:
    - normalize -> interpret -> bypassed-pay
    - status-pack refresh + optional output bundle download
  - `Refresh Snapshot` captures current IDs, tracker state, and module status in one JSON panel
- Beta Production Control:
  - top-level guided board for operator use (`beta` and `production-sim` channels)
  - `Refresh Board` computes current pipeline context and missing items
  - `Run Next` executes one deterministic next action based on detected file kind:
    - point-cloud: process/wait -> metadata -> 3D -> 2D terrain export
    - well-log: normalize -> interpret -> bypassed-pay -> status pack
  - `Run Full Pipeline` executes seamless workflow + audit validation
  - `Open Output Hub` jumps UI to the 3D/2D or well processing section
  - `Open Docs Hub` opens the manuals index in the embedded docs panel
- Resource Link Library:
  - persistent link catalog for scraper sources (open data/member/vendor/internal)
  - add/update/delete links, open source page, and push selected URL directly into scraper direct-download input
- drag-and-drop folder/file upload zone with per-file progress bars
- WebGL controls for:
  - max rendered points (up to `1,000,000`)
  - quality presets (`performance`, `balanced`, `high`, `max`)
  - sampling mode (`stride`, `random`)
  - detect points (preview/raw estimate)
  - set max from detected points
  - point size
  - focus trim percent (outlier-resistant camera framing)
  - color mode (`classification`, `xyz`, `elevation`, `intensity`)
  - class RGB overrides for classification mode
  - elevation min/max
  - intensity min/max
  - class quick presets (`All`, `Ground`, `Building`, `Vegetation`, `Water`)
- mouse navigation in WebGL:
  - left-drag = 360° orbit
  - right-drag or shift-drag = pan
  - wheel = zoom
  - click point = inspect attributes/coordinates
  - double-click point = re-focus camera on that area
  - `Focus Dense` button = focus densest local cluster for close inspection
- multi-file render selection:
  - check multiple processed files, then click `Render Checked Files`
## 4) Multi-LAZ Same Grid Workflow (Vector DB)

Goal:

- Add many LAZ/LAS files into one logical grid for the same project.

How:

- Use the same `grid_id` in upload/ingest for every related file.
- Each completed ingest auto-indexes a tile into PostGIS:
  - table: `ops.pointcloud_tiles`
- Grid definitions/metadata live in:
  - table: `ops.pointcloud_grids`

Grid endpoints:

- `POST /projects/{project_id}/grids` (create/update grid)
- `GET /projects/{project_id}/grids`
- `GET /projects/{project_id}/grids/{grid_id}/tiles`
- `GET /projects/{project_id}/grids/{grid_id}/summary`
This is the current vector-DB layer for pointcloud tiles. It supports:
- grouping tiles by grid
- class-code filtering
- bbox filtering
- total point and extent summaries
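The bbox filtering above can be pictured as a minimal extent-overlap check. The tile field names (`min_x`, etc.) are assumptions for illustration; the real filter runs against the PostGIS tile index.

```python
# Minimal bbox-overlap filter mirroring the tile bbox filtering above:
# keep each tile whose extent intersects the query bbox.
def tiles_in_bbox(tiles, bbox):
    """bbox = (min_x, min_y, max_x, max_y) in the grid's CRS."""
    qx0, qy0, qx1, qy1 = bbox
    return [
        t for t in tiles
        if not (t["max_x"] < qx0 or t["min_x"] > qx1
                or t["max_y"] < qy0 or t["min_y"] > qy1)
    ]

tiles = [
    {"tile": "a", "min_x": 0, "min_y": 0, "max_x": 10, "max_y": 10},
    {"tile": "b", "min_x": 50, "min_y": 50, "max_x": 60, "max_y": 60},
]
# tiles_in_bbox(tiles, (5, 5, 20, 20)) keeps only tile "a"
```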
## 5) Mesh Note

Current implementation is vector-index-first (tile index + filtered exports).
Mesh generation can be added as a next-phase worker:

- input: one grid (`project_id` + `grid_id`)
- process: triangulation/surface reconstruction
- output: Autodesk `OBJ`/`FBX` or Cesium `3D Tiles`
- storage: `derived/<grid_id>/mesh/*`
## 6) WebGL Dependency

- The client UI serves Three.js locally from `apps/client-web/vendor/three.min.js`.
- This removes runtime dependency on external CDNs and is required for reliable local/proxy environments.
- The 2D map uses local Leaflet assets:
  - `apps/client-web/vendor/leaflet.js`
  - `apps/client-web/vendor/leaflet.css`
## 7) Documentation Update Rule

When features change, update in the same commit:

- `README.md`
- `docs/INDEX.md`
- Stage index under `docs/stages/*`
- `docs/operations/REPORT_INDEX.md` (if run artifacts changed)
- `docs/SITE_ACCESS.json`
- `docs/OPERATIONS_AND_OUTPUT_GUIDE.md`
- `docs/CHANGELOG.md`
## 8) Admin Storage Management

Admin Web now includes MinIO storage controls via the Data API:

- list objects (paginated) by `project_id` or explicit prefix
- select all objects on the current page
- select all objects across all pages for the current query scope (up to a cap)
- clear selection
- delete one object
- delete selected objects
- dry-run prefix delete (preview count/sample keys)
- execute prefix delete
- optional DB cleanup marks related `ops.upload_sessions` and `ops.pointcloud_tiles` rows as `deleted`
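The dry-run prefix delete can be pictured with an in-memory sketch; the real endpoint lists keys from MinIO, and `dry_run_prefix_delete` is a hypothetical helper name.

```python
# In-memory sketch of a dry-run prefix delete: report how many keys the
# prefix delete would remove, plus a small sample, without deleting.
def dry_run_prefix_delete(keys, prefix, sample_size=5):
    matched = [k for k in keys if k.startswith(prefix)]
    return {"count": len(matched), "sample_keys": matched[:sample_size]}

keys = [
    "proj-demo/raw/a.laz",
    "proj-demo/raw/b.laz",
    "proj-demo/derived/upl-001/autodesk/a.pts",
]
preview = dry_run_prefix_delete(keys, "proj-demo/raw/")
# preview == {"count": 2, "sample_keys": ["proj-demo/raw/a.laz", "proj-demo/raw/b.laz"]}
```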
Storage endpoints:
- `GET /storage/objects`
- `DELETE /storage/object`
- `DELETE /storage/prefix`
## 11) Well Data Processing Dataset Visibility

New data-plane endpoints for tracking non-pointcloud dataset packs:

- `GET /projects/{project_id}/processing-datasets`
  - filters: `source_tag`, `processing_stage`, `dataset_kind`, `limit`
- `POST /processing-datasets/jobs/register`
  - creates a completed register job + profile (`row_count`, `column_count`, schema sample) for one uploaded dataset

Upload metadata tags supported for tracking:

- `source_tag`
- `processing_stage` (for example `stage1`, `stage2`)
- `dataset_kind`

Client web UI:

- **Well Data Processing Visibility** panel in http://localhost:3101
- shows datasets used in processing with stage filters and register-job profile detail
## 12) Welllog Traceability + Known-Result Audit
Well-log processing now includes deterministic trace signatures so output drift can be traced to the exact stage/config change.
Trace payload location:
- normalize output: `traceability`
- interpret output: `traceability` (includes `upstream_normalize_signature`)
- classify output: `traceability` (includes `upstream_interpret_signature`)
- status pack API rows include persisted `traceability` from DB tables
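A deterministic stage signature of this kind can be sketched as a stable hash over stage name, config, and upstream signature. The exact canonical payload here is an assumption; the real services define their own.

```python
import hashlib
import json

# Sketch: hash a canonical JSON payload so the same stage + config +
# upstream input always yields the same signature. The field set here is
# an assumption, not the platform's actual canonical payload.
def stage_signature(stage, config, upstream=None):
    canonical = json.dumps(
        {"stage": stage, "config": config, "upstream": upstream},
        sort_keys=True, separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

norm_sig = stage_signature("normalize", {"unit_policy": "si"})
interp_sig = stage_signature("interpret", {"cutoffs": [0.1, 0.35]},
                             upstream=norm_sig)
# Any upstream config change produces a different interpret signature,
# which is how output drift is traced to the responsible stage.
```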
Run deterministic audit (golden fixture):
`make run-welllog-known-result-audit`
Audit artifacts:
- `docs/operations/WELLLOG_KNOWN_RESULT_AUDIT_REPORT.json`
- `docs/operations/WELLLOG_KNOWN_RESULT_AUDIT_REPORT.md`
If an intentional formula/policy change is made, refresh baseline once, then commit with rationale:
`python3 scripts/dev/run_welllog_known_result_audit.py --refresh-expected`
## 9) Geodetic + ECEF Validation Standard

Pointcloud metadata now follows a strict centroid validation flow:

- `location_validation.centroid_native` stores the centroid in the source CRS (`x`, `y`, `z_m`, `srs`)
- `location_validation.centroid_geodetic` stores the centroid in geographic coordinates (`lat_deg`, `lon_deg`, `height_m`, `srs=EPSG:4326`)
- `location_validation.centroid_status` is `VALID` only when CRS + transform + bounds checks pass
- longitude normalized into `[-180, 180]`
- hard-fail conditions include:
  - unknown CRS
  - transformation failure
  - `abs(lat_deg) > 90`
  - `abs(lon_deg) > 180`
- heuristic flag for mislabeled projected coordinates: `abs(native centroid x/y) > 1000` while the declared CRS is geographic
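These checks can be condensed into one validation function. This is a simplification under assumed inputs: the real flow also records native/geodetic centroids with CRS metadata, and the heuristic is a flag rather than a status; it is reported as `"SUSPECT"` here purely for illustration.

```python
# Sketch of the centroid validation rules above. Bounds are checked on
# the raw geodetic values; the heuristic for mislabeled projected
# coordinates is reported as "SUSPECT" for illustration.
def centroid_status(lat_deg, lon_deg, native_x, native_y,
                    crs_known=True, transform_ok=True,
                    declared_geographic=False):
    if not crs_known or not transform_ok:
        return "INVALID"                      # unknown CRS / transform failure
    if abs(lat_deg) > 90 or abs(lon_deg) > 180:
        return "INVALID"                      # hard-fail bounds checks
    if declared_geographic and (abs(native_x) > 1000 or abs(native_y) > 1000):
        return "SUSPECT"                      # mislabeled projected coordinates
    return "VALID"

# Minimal longitude normalization into [-180, 180):
def normalize_lon(lon_deg):
    return ((lon_deg + 180.0) % 360.0) - 180.0
```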
ECEF (`ecef_conversion`) is validated with:

- CRS->ECEF via pyproj
- roundtrip check (`source->geodetic` vs `ecef->geodetic`)
- independent WGS84 formula cross-check (EPSG 9602 equations)
- radius plausibility range check
Cross-dataset distortion guard:
- endpoint: `POST /projects/{project_id}/ecef/consistency`
- compares each dataset pair using:
  - geodetic surface distance (`pyproj.Geod` great-circle distance)
  - ECEF chord distance (`sqrt((x2-x1)^2 + (y2-y1)^2 + (z2-z1)^2)`)
  - expected chord from arc distance (`2 * R * sin(arc_m / (2R))`)
- flags a pair when relative error exceeds tolerance (`tolerance_relative`, default `0.02`)
- output includes: `flagged_jobs`, `worst_pairs`, `pass_ratio`
- client UI flow:
  - open **Metadata + 3D**
  - select one or more metadata rows
  - click **ECEF Consistency Audit**
  - optional: click **Download ECEF JSON**
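The pair comparison can be reproduced with the documented formulas. This is a sketch: the real service obtains `arc_m` from `pyproj.Geod`, while the spherical `R_EARTH_M` here is a simplification.

```python
import math

# Sketch of the ECEF consistency pair check: compare the ECEF chord
# distance against the chord expected from the geodetic arc distance,
# 2 * R * sin(arc_m / (2R)).
R_EARTH_M = 6_371_000.0

def chord_distance(p1, p2):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p1, p2)))

def pair_flagged(ecef1, ecef2, arc_m, tolerance_relative=0.02):
    chord = chord_distance(ecef1, ecef2)
    expected = 2.0 * R_EARTH_M * math.sin(arc_m / (2.0 * R_EARTH_M))
    rel_error = abs(chord - expected) / max(expected, 1e-9)
    return rel_error > tolerance_relative, rel_error
```

For two points a quarter-circle apart on a sphere of radius `R_EARTH_M`, the measured and expected chords agree, so the pair is not flagged; distorting the arc by 10% pushes the relative error past the default `0.02` tolerance.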
Reference basis is tracked in `docs/ECEF_GEODETIC_REFERENCES.md`.
## 10) Open Data Auto-Search Standard
The platform now implements a canonical open-data discovery sequence for LiDAR uploads.
Core flow:

- Derive AOI from upload metadata centroid (`EPSG:4326`) or explicit bbox input.
- Search providers.
- Normalize results.
- Rank using weighted scoring:
  - overlap, resolution, recency, coverage, license openness, format readiness
- Generate standard artifacts:
  - `queries.json`
  - `results_normalized.json`
  - `shortlist.json`
  - `download_queue.json`
- Optional download stage writes datasets under canonical download folders.
- Generate analytics artifacts:
  - `summary.json`
  - `risk_report.json`
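The weighted scoring step can be sketched as follows; the weights and field names are illustrative assumptions, not the platform's tuned values.

```python
# Illustrative ranking over normalized results: each factor is assumed
# pre-scaled into [0, 1]; the weights below are example values only.
EXAMPLE_WEIGHTS = {
    "overlap": 0.30, "resolution": 0.20, "recency": 0.15,
    "coverage": 0.15, "license_openness": 0.10, "format_readiness": 0.10,
}

def rank_results(results, weights=EXAMPLE_WEIGHTS):
    def score(result):
        return sum(w * result.get(factor, 0.0)
                   for factor, w in weights.items())
    return sorted(results, key=score, reverse=True)

ranked = rank_results([
    {"dataset_id": "b", "overlap": 0.2, "resolution": 0.9},
    {"dataset_id": "a", "overlap": 1.0, "resolution": 0.8,
     "recency": 0.7, "license_openness": 1.0},
])
# ranked[0]["dataset_id"] == "a"
```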
Canonical object storage patterns:
- Search artifacts:
  - `<project_id>/open_data/client/<client_id>/<location_or_project_id>/search/<upload_id>/queries.json`
  - `<project_id>/open_data/client/<client_id>/<location_or_project_id>/search/<upload_id>/results_normalized.json`
  - `<project_id>/open_data/client/<client_id>/<location_or_project_id>/search/<upload_id>/shortlist.json`
  - `<project_id>/open_data/client/<client_id>/<location_or_project_id>/search/<upload_id>/download_queue.json`
- Downloaded datasets:
  - `<project_id>/open_data/client/<client_id>/<location_or_project_id>/downloads/lidar/<provider>/<dataset_id>/...`
  - `<project_id>/open_data/client/<client_id>/<location_or_project_id>/downloads/hazards/flooding/<provider>/<dataset_id>/...`
  - `<project_id>/open_data/client/<client_id>/<location_or_project_id>/downloads/hazards/seismic/<provider>/<dataset_id>/...`
  - `<project_id>/open_data/client/<client_id>/<location_or_project_id>/downloads/hazards/fire/<provider>/<dataset_id>/...`
- Analytics artifacts:
  - `<project_id>/derived/client/<client_id>/<location_or_project_id>/analytics/<upload_id>/summary.json`
  - `<project_id>/derived/client/<client_id>/<location_or_project_id>/analytics/<upload_id>/risk_report.json`
Endpoints:
- `POST /data-scraper/search` (returns ranked/normalized artifacts in response; can persist artifacts when context is provided)
- `POST /data-scraper/esri/relevance-plan` (supports `routing_mode=existing_project|new_project|auto` and returns `project_routing` + structured storage layout)
- `POST /data-scraper/auto-search-from-upload` (derive AOI directly from processed upload metadata)
- `POST /data-scraper/download` (background download job + analytics artifact generation)
- `GET /data-scraper/audited-sources` (loads audit manifest source URLs + hashes/row counts for trusted cross-reference)
- `POST /data-scraper/direct-download` (downloads a specific source URL into object storage for the selected project)
Client UI source-link workflow:
- Open the **Open Data Scraper** panel.
- Set `Scrape Routing`:
  - `Existing Project`: append to the currently selected project workspace.
  - `New Project`: generate/use a new project context and auto-save the workspace profile.
  - `Auto`: uses the existing project if provided, otherwise creates a structured project context.
- Click `Audited Sources` to load validated provider links from `data/external/ca_well_test_pack/audit/manifest.json`.
  - Docker fallback: if that host path is unavailable, the API uses the bundled manifest `apps/data-plane-api/src/data_plane_api/audited_sources_manifest.json`.
  - Optional override: `GET /data-scraper/audited-sources?manifest_object_key=<object-key>` loads the manifest from MinIO/S3.
- Click `Open Source` to view the provider page in a browser.
- Click `Direct Download` (row action) or use the `Direct Download URL` input to ingest a specific source file.
- Copy `job_id` into `Scraper Job ID` and click `Scraper Job Status` until status is `completed`.
- Downloaded objects are stored under `<project_id>/open_data/manual_downloads/<job_id>/<filename>`.