ph-civic-data-mcp: 25 tools, 10 sources, zero API keys
- mcp
- civic-tech
- case-study
ph-civic-data-mcp is a Model Context Protocol server that exposes Philippine civic and scientific data sources to any MCP-compatible agent. Twenty-five tools across ten sources: PHIVOLCS earthquakes, PAGASA weather and typhoons, PhilGEPS procurement, PSA population and PXWeb statistics, NASA POWER for solar radiation, Open-Meteo for air quality, NASA MODIS for satellite vegetation indices, USGS FDSN for global earthquakes, NOAA IBTrACS for historical typhoon tracks, and World Bank Open Data for macro indicators.
Boots with zero API keys. The install is a single command: uvx ph-civic-data-mcp. Drop one block into your claude_desktop_config.json or .claude/settings.json and the tools show up.
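One plausible shape for that block, following the standard MCP server-entry format; check the repo README for the canonical version:

```json
{
  "mcpServers": {
    "ph-civic-data": {
      "command": "uvx",
      "args": ["ph-civic-data-mcp"]
    }
  }
}
```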
Why an MCP server
The data is already public. Every source above publishes openly. The friction is that each source publishes in its own schema. PHIVOLCS is HTML scraping. PSA is PXWeb JSON. PAGASA exposes undocumented endpoints. PhilGEPS is weekly Excel exports. Wiring all of that into an agent yourself is two weeks of plumbing per source, and it has to be redone every time an agency changes its layout.
MCP is the right shape for this kind of integration. The agent does not need to know that PHIVOLCS is HTML and PSA is PXWeb. It calls get_recent_earthquakes(min_magnitude=4.0) or get_population_stats(region="NCR") and the server handles the source-specific decoding.
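A toy sketch of that dispatch idea, with hypothetical decoder names (this is not the server's actual code): the agent calls one uniform tool name and a per-source decoder handles the format behind it.

```python
# Illustrative only: map uniform tool names to source-specific decoders,
# so the caller never sees whether the backend is scraped HTML or PXWeb JSON.

def decode_phivolcs_html(params):
    # placeholder for the HTML-scraping path
    return [{"magnitude": 4.2, "location": "Surigao del Sur"}]

def decode_psa_pxweb(params):
    # placeholder for the PXWeb JSON path
    return {"region": params["region"], "population": 13_484_462}

TOOL_DECODERS = {
    "get_recent_earthquakes": decode_phivolcs_html,
    "get_population_stats": decode_psa_pxweb,
}

def call_tool(name, **params):
    # one entry point, regardless of how the source publishes
    return TOOL_DECODERS[name](params)
```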
The interesting tools are the cross-source ones. flag_infra_anomalies correlates PhilGEPS procurement notices with PHIVOLCS earthquake records and PAGASA typhoon footprints to surface contracts that were awarded on the same day a major hazard hit the same area. It’s a heuristic. The output is a list to look at, not a finding. The disclaimer in the README is the right framing: specific allegations, if any, require independent investigation and corroboration.
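A minimal sketch of what a heuristic like this could look like, using made-up record fields and a same-day, same-province join; the real tool's matching is more involved, and the output here, as there, is a list to look at, not a finding.

```python
from datetime import date

def flag_same_day_awards(contracts, hazards):
    """Toy heuristic: flag contracts awarded on the same day a hazard
    hit the same province. Surfaces candidates for human review only."""
    hazard_keys = {(h["date"], h["province"]) for h in hazards}
    return [c for c in contracts
            if (c["award_date"], c["province"]) in hazard_keys]

contracts = [{"award_date": date(2024, 7, 24), "province": "Batangas",
              "title": "Road repair"}]
hazards = [{"date": date(2024, 7, 24), "province": "Batangas",
            "type": "typhoon"}]
```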
v0.3.1 was a correctness pass
The most recent release is not a new feature pass. It’s a correctness pass on v0.3.0, fixing the kind of bugs an automated test suite tends to miss because the symptom only shows up when a real user reads the output. Three of them:
get_weather_alerts was synthesizing advisories out of PAGASA’s navigation chrome. The site’s menu items, sidebar links, and breadcrumb labels were being parsed as “alerts” by the original extractor. Replaced with a strict text classifier that only returns advisories from the actual advisory feed.
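The fix can be sketched as a strict keep-only filter; the chrome set and keyword list below are invented for illustration, not the classifier actually shipped.

```python
import re

# Hypothetical navigation-chrome strings that the old extractor
# was mistaking for alerts.
NAV_CHROME = {"home", "about", "contact us", "weather", "services", "sitemap"}

def looks_like_advisory(text):
    """Keep only strings that read like an actual advisory,
    not site navigation."""
    t = text.strip().lower()
    if t in NAV_CHROME or len(t) < 20:
        return False
    # require advisory-style keywords, not just any page text
    return bool(re.search(r"\b(warning|advisory|signal no\.?|heavy rainfall)\b", t))
```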
flag_infra_anomalies was firing on stoplist tokens like “city” and “barangay”. The location join was matching every infra project with “city” in its address against any city-level hazard, which produced thousands of false positives. Fixed with an explicit stoplist and a tokenizer that requires proper-noun match.
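A sketch of the repaired join under the same idea, with an illustrative stoplist: generic place words never count as a match, only proper-noun tokens do.

```python
# Illustrative stoplist; the shipped one is larger.
STOPLIST = {"city", "municipality", "barangay", "province", "of", "the"}

def location_tokens(text):
    """Proper-noun-ish tokens only: 'Quezon City' joins on 'Quezon',
    never on the generic token 'city'."""
    return {w for w in text.replace(",", " ").split()
            if w.lower() not in STOPLIST and w[:1].isupper()}

def locations_match(a, b):
    return bool(location_tokens(a) & location_tokens(b))
```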
The PSGC location resolver was failing on reasonable variants like "City of Manila" (vs. "Manila City") and on district-in-city addresses like "Sta. Mesa, Manila". Added alias handling for the canonical PSGC variants across all 81 provinces.
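The alias handling amounts to folding the common PSGC naming variants onto one canonical form before comparing. A minimal sketch, assuming only the prefix/suffix variants named above:

```python
import re

def normalize_city_name(name):
    """Fold PSGC naming variants onto one form:
    'City of Manila' -> 'manila', 'Manila City' -> 'manila'."""
    n = name.strip().lower()
    n = re.sub(r"^city of\s+", "", n)   # "City of X" prefix form
    n = re.sub(r"\s+city$", "", n)      # "X City" suffix form
    return n

def same_city(a, b):
    return normalize_city_name(a) == normalize_city_name(b)
```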
These are the kinds of bugs an LLM-driven test suite tends to skip. Every one was caught by reading the actual server output for a specific user query and asking: “that doesn’t look right, where did that string come from?”
What it doesn’t do
It doesn’t store anything. Every call hits the upstream source live. No database, no cache yet. That’s a feature for an LLM context (the data is fresh) and a cost for high-volume callers (it’s slower than necessary, and rude to the upstream sources). The right caching layer is per-tool and not yet built.
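One shape a per-tool cache could take, sketched as a TTL decorator with a per-tool expiry (this is not part of the project, just the idea): earthquake feeds go stale in minutes, census tables in years, so each tool picks its own window.

```python
import functools
import time

def ttl_cache(seconds):
    """Per-tool TTL cache: reuse a result until it expires,
    then hit the upstream source again."""
    def decorator(fn):
        store = {}
        @functools.wraps(fn)
        def wrapper(*args):
            now = time.monotonic()
            hit = store.get(args)
            if hit is not None and now - hit[0] < seconds:
                return hit[1]          # still fresh: serve cached value
            value = fn(*args)          # expired or missing: refetch
            store[args] = (now, value)
            return value
        return wrapper
    return decorator

@ttl_cache(seconds=300)  # hypothetical 5-minute window for quake data
def get_recent_earthquakes(min_magnitude):
    ...  # live PHIVOLCS fetch would go here
```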
Repo: github.com/xmpuspus/ph-civic-data-mcp. Install: uvx ph-civic-data-mcp.