solar-map-ph: rooftop solar detection on free satellite imagery

  • civic-tech
  • satellite
  • case-study

There is no public registry of rooftop solar in the Philippines. Meralco’s net-metering program records about 170 MW inside its franchise. Independent estimates put the actual installed commercial rooftop fleet at closer to 370 MW. The residential “guerrilla solar” gap, where households install panels without going through the formal program, is plausibly a third of the total again. solar-map-ph is a reproducible detector for the gap.

The model is small on purpose

The detection pipeline is a frozen openai/clip-vit-large-patch14 encoder feeding a scikit-learn logistic regression head, calibrated with a Platt sigmoid on a 20% source-disjoint holdout. That is the whole model. No fine-tuning. No specialized solar architecture. The training set is about 294 NCR positives bootstrapped from OpenStreetMap generator:source=solar tags plus 200 negatives. Each tile is embedded once, then classified.
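As a rough illustration of that shape (not the project's actual training script), here is a minimal sketch of a linear head plus Platt-style calibration on a group-disjoint holdout. The embeddings are synthetic stand-ins, the embedding dimension and the `groups` source ids are assumptions made up for the example, and the Platt step is approximated with a one-dimensional, nearly unregularized logistic fit on the head's raw margins.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)

# Synthetic stand-in for frozen encoder output. In a real pipeline these rows
# would be image embeddings computed once per tile; the dimension here is
# arbitrary and chosen only to keep the sketch fast.
n_pos, n_neg, dim = 294, 200, 64
X = rng.normal(size=(n_pos + n_neg, dim))
y = np.concatenate([np.ones(n_pos, dtype=int), np.zeros(n_neg, dtype=int)])
X[y == 1, :4] += 1.5  # give the synthetic positives a learnable offset

# Hypothetical source ids. Splitting by group keeps every source entirely on
# one side of the split, which is what "source-disjoint" is taken to mean here.
groups = rng.integers(0, 50, size=len(y))
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, hold_idx = next(splitter.split(X, y, groups))

# Linear head trained on the frozen embeddings only.
head = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])

# Platt-style calibration: a one-dimensional logistic fit that maps the head's
# raw margin to a probability, fitted on the held-out sources.
hold_margins = head.decision_function(X[hold_idx]).reshape(-1, 1)
platt = LogisticRegression(C=1e6, max_iter=1000).fit(hold_margins, y[hold_idx])


def predict_calibrated(embeddings):
    """Return calibrated P(solar) for an array of tile embeddings."""
    margins = head.decision_function(embeddings).reshape(-1, 1)
    return platt.predict_proba(margins)[:, 1]


probs = predict_calibrated(X[hold_idx])
```

The encoder never appears in the training loop, which is the point of the design: embeddings are computed once and cached, and everything after that is a few hundred rows of linear algebra.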

The interesting result is that this beats purpose-built encoders on this task. The model card has the ablation: CLIP-ViT-L wins by 4 F1 points over facebook/dinov2-large and by 14 F1 points over allenai/satlas-pretrain. A satellite-pretrained encoder loses to a general visual encoder on rooftop solar in the Philippines, by a wide margin. That is not the answer most projects start with. It is the answer a held-out split delivered.

Production uses Platt calibration. Isotonic regression is run alongside it and reported in clf_v4_calibration.json for comparison. Isotonic has a slightly lower Brier score, but Platt is strictly monotonic in the raw score and has only two parameters, and the practical gap is small.
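For readers unfamiliar with the metric, the Brier score is the mean squared difference between the predicted probability and the 0/1 outcome, so lower is better. A minimal sketch follows; the two probability lists are invented purely to show the arithmetic and are not the project's numbers.

```python
def brier_score(probabilities, labels):
    """Mean squared error between predicted probabilities and 0/1 labels."""
    if len(probabilities) != len(labels) or len(labels) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    total = sum((p - y) ** 2 for p, y in zip(probabilities, labels))
    return total / len(labels)


# Illustrative values only: two hypothetical calibrators scored on the same
# four holdout tiles.
labels = [1, 1, 0, 0]
calibrator_a = [0.9, 0.8, 0.2, 0.1]
calibrator_b = [0.9, 0.7, 0.3, 0.1]

score_a = brier_score(calibrator_a, labels)  # about 0.025
score_b = brier_score(calibrator_b, labels)  # about 0.050
```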

Headline numbers

NCR (v1.0, calibrated): 515 rooftops detected across 41 cities in Greater Metro Manila, of which 280 are high-confidence (model score at or above 0.85) and 235 are below-threshold candidates included for review. 87% of the high-confidence detections (242 of 277) had no solar on any prior public map within 200 m of the detected tile centroid (the same proximity threshold used in DeepSolar and SPECTRUM). Aggregate installed capacity is 69.9 MWp across the 384 buildings that the SAM auto-mask successfully localized to an OSM footprint. Capacity per building is the segmented panel area divided by 6 m² per kWp, capped at the building footprint.
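The two rules in that paragraph are simple enough to sketch. The block below assumes the cap means "panel area cannot exceed footprint area" and that the 200 m test is a great-circle distance between the tile centroid and each previously mapped site. The function names and the coordinates in the test are hypothetical, not taken from the repository.

```python
import math

M2_PER_KWP = 6.0            # from the text: 6 m² of panel area per kWp
EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius, spherical approximation


def estimate_kwp(panel_area_m2, footprint_area_m2):
    """Convert segmented panel area to kWp, capped at the building footprint."""
    usable_area = max(min(panel_area_m2, footprint_area_m2), 0.0)
    return usable_area / M2_PER_KWP


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_new_detection(centroid, known_sites, radius_m=200.0):
    """True if no previously mapped site lies within radius_m of the centroid."""
    lat, lon = centroid
    return all(haversine_m(lat, lon, s_lat, s_lon) > radius_m for s_lat, s_lon in known_sites)
```

For example, a 60 m² panel mask on a 100 m² roof gives 10 kWp, and a 120 m² mask on a 60 m² roof is capped to the same 10 kWp.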

Calibrated precision 96% and recall 80% at threshold 0.85 on the honest 20% source-disjoint holdout. Expect roughly 1 in 25 NCR high-confidence detections to be a false positive.
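The "1 in 25" figure is just the precision restated: at 96% precision, 4% of flagged rooftops are wrong, and 1 / 0.04 = 25. A small sketch of how precision and recall at a fixed threshold can be computed, using made-up scores and labels rather than the project's holdout:

```python
import math


def precision_recall_at(scores, labels, threshold):
    """Precision and recall when everything scoring >= threshold is flagged."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def one_in_n_false_positives(precision):
    """Express precision as 'one false positive per N detections'."""
    if precision >= 1.0:
        return math.inf
    return 1.0 / (1.0 - precision)


# Illustrative toy data: four true rooftops, two non-rooftops.
toy_scores = [0.95, 0.90, 0.88, 0.70, 0.86, 0.20]
toy_labels = [1, 1, 1, 1, 0, 0]
toy_precision, toy_recall = precision_recall_at(toy_scores, toy_labels, 0.85)
```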

v1.1 applied the same clf_v4.joblib cross-domain, without retraining, to seven additional Philippine distribution-utility franchise areas: VECO (Cebu), DLPC (Davao), MORE (Iloilo), CEPALCO (Cagayan de Oro), ALECO (Legazpi), the Calabarzon belt south of Meralco, and CENECO (Bacolod). Combined: 177 high-confidence plus 167 candidate detections. The spot-check sample (32 top-scoring tiles inspected across all regions) confirmed 28 real rooftop solar installations, 3 ground-mount utility farms (the dominant cross-domain failure mode), and 1 blue stadium-roof false positive in Legazpi.

v1.2 closed the cross-domain gap. A domain-shift measurement gated a region-stratified retrain (clf_v5, sha256 prefix 5cc0a093c5279fd9). Cebu, Iloilo, and Calabarzon now carry a conservative per-domain Platt calibration on a scan-realistic holdout. Davao, CDO, Legazpi, and Bacolod remain uncalibrated candidate inventory, with no precision claim and no fabricated confidence interval.

The bit-exact reproducible recipe: make train && make hash-verify && make calibrate && make demo. Deterministic, no network, no GPU, about 30 seconds on a laptop.
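The recipe includes a hash-verify step, and the retrained model is identified by a sha256 prefix. One plausible way to check an artifact against such a prefix is sketched below. This is an assumption about what such a check could look like, not a copy of the repository's Makefile target, and the helper names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_hex(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the full hex digest."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_prefix(path, expected_prefix):
    """True if the file's SHA-256 digest starts with the published prefix."""
    return sha256_hex(path).startswith(expected_prefix.lower())
```

Usage would be along the lines of `matches_prefix("clf_v5.joblib", "5cc0a093c5279fd9")`, where the file path is a guess at where the artifact lives after training.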

The publication boundary

City-level aggregates are released: detection counts, density per km², kWp totals, composite scores. Per-building polygons are released only for commercial, industrial, and public-purpose roofs (institutional subjects). Residential rooftops are released as an aggregate count only, no geometry, no addresses, no OSM way id. The v1.1 cross-domain regions ship at 240 m tile granularity only, no per-building geometry at all.

Residential leaks are blocked at build time. scripts/check_no_residential_leaks.py fails the build if any feature with is_residential=true or a residential building=* tag reaches the NCR per-building dataset. scripts/check_region_no_pii.py fails the build if any v1.1 region GeoJSON ships per-building geometry, addresses, or PII-adjacent property keys. The homeowner roof-lookup tool runs entirely in the browser. The address the user types goes directly to third-party services (Photon, Nominatim, Overpass, Esri tiles, PVGIS) and never reaches a server we operate.
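A build gate of this kind can be very small. The sketch below shows the general idea for the residential check only. It is not the contents of scripts/check_no_residential_leaks.py: the set of building=* values treated as residential is an assumption for illustration, and the function names are hypothetical.

```python
import json

# Assumed, illustrative list of OSM building=* values counted as residential.
RESIDENTIAL_BUILDING_VALUES = {
    "house", "residential", "apartments", "detached",
    "semidetached_house", "terrace", "bungalow", "dormitory",
}


def find_residential_leaks(feature_collection):
    """Return indexes of GeoJSON features that look residential."""
    leaks = []
    for index, feature in enumerate(feature_collection.get("features", [])):
        props = feature.get("properties") or {}
        flag = props.get("is_residential")
        flagged = flag is True or str(flag).lower() == "true"
        tagged = str(props.get("building", "")).lower() in RESIDENTIAL_BUILDING_VALUES
        if flagged or tagged:
            leaks.append(index)
    return leaks


def check_or_fail(geojson_text):
    """Exit non-zero (failing a build step) if any residential feature is present."""
    leaks = find_residential_leaks(json.loads(geojson_text))
    if leaks:
        raise SystemExit(f"residential features found at indexes {leaks}")
```

Raising SystemExit with a message makes the process exit with a non-zero status, which is enough for make or a CI runner to stop the release.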

Under RA 10173 §3(g), personal information is data that “directly and certainly” identifies a person, alone or “put together with other information.” A satellite-derived rooftop polygon by itself does not name anyone. The “put together with other information” exposure is exactly why residential geometry is withheld entirely.

What it is not

Not engineering advice. The homeowner tool is informational; consult a certified installer. Not a permit registry, tax record, or code-compliance audit. Not affiliated with any Philippine distribution utility; franchise names are referenced as the regulatory geography covered by each region scan.

Not a uniformly calibrated registry. NCR is calibrated against a held-out validation split. The v1.2 per-domain calibrations apply only to Cebu, Iloilo, and Calabarzon. The other four regions are explicitly uncalibrated and reported as candidate inventory.

All data sourced from public records (Esri World Imagery, OpenStreetMap, ESA, Microsoft, NOAA, NASA). solar-map-ph computes statistical indicators only. Specific allegations, if any, require independent investigation and corroboration.

Repo: github.com/xmpuspus/solar-map-ph. DOI: 10.5281/zenodo.20178050.