Soil Landscapes of Canada — ETL Handoff
Status: v1 shipped (soil_order only). This doc is the handoff for v2 attribute enrichment, intended for the ETL project (operations repo).
The web side already queries public.soil_order_canada from three surfaces — parcel report, Territory & Prospecting CSV, batch enrichment. Adding attributes to the table immediately surfaces them everywhere with no further web work (the columns flow through the existing LEFT JOIN ... ON ST_Contains(slc.geom, ...)).
What ships today
public.soil_order_canada (
id,
soil_order,
soil_order_id,
geom geometry(MultiPolygon, 4326)
)
soil_order is one of: Brunisolic, Chernozemic, Cryosolic, Gleysolic, Luvisolic, Organic, Podzolic, Regosolic, Solonetzic, Vertisolic, Unclassified.
PMTiles archive: maps.townshipcanada.com/soil_order_canada.pmtiles (single property: soil_order).
v2 — richer attributes from the same AAFC SLC source
The AAFC Soil Landscapes of Canada dataset ships as a related set of tables. We currently use ~5% of what's published. Adding the columns below requires re-running the ingest pipeline once, then surfaces immediately across the web product.
Target schema
ALTER TABLE public.soil_order_canada
ADD COLUMN soil_group TEXT, -- e.g. 'Black Chernozem', 'Gray Luvisol'
ADD COLUMN soil_subgroup TEXT, -- e.g. 'Orthic Black Chernozem'
ADD COLUMN drainage_class TEXT, -- 'rapid' | 'well' | 'moderately well' | 'imperfectly' | 'poorly' | 'very poorly'
ADD COLUMN parent_material TEXT, -- 'glacial till' | 'lacustrine' | 'fluvial' | 'eolian' | 'organic' | 'residual' | 'morainal' | 'colluvial'
ADD COLUMN slope_class TEXT, -- 'A' (0-0.5%) | 'B' (0.5-2%) | 'C' (2-5%) | 'D' (5-10%) | 'E' (10-15%) | 'F' (15-30%) | 'G' (30%+)
ADD COLUMN is_solonetzic BOOLEAN DEFAULT FALSE, -- true when soil_order = 'Solonetzic' OR salinity flag set
ADD COLUMN slc_version TEXT; -- AAFC publication date (e.g. 'v3.2-2020')
-- Add a function index on lower(soil_order) for the friendly-name lookup
CREATE INDEX IF NOT EXISTS soil_order_canada_order_lower_idx
ON public.soil_order_canada (lower(soil_order));
AAFC source files
The SLC v3.x ships as four related tables. All available at the Open Government Canada portal (search "Soil Landscapes of Canada"):
| Table | What it carries | Key columns |
|---|---|---|
| PLY | Polygon attributes (landform, slope, surface form) | POLY_ID, SLOPE, LOCAL_SF, LANDSCAPE |
| CMP | Component table — soil components within each polygon, with percent | POLY_ID, CMP_NO, PERCENT, SOIL_ID, KIND, WATER |
| SNF | Soil names file — detailed soil profile attributes | SOIL_ID, SUBGRP, GREATGRP, ORDER, KIND, DRAIN, PARENT_MAT |
| SLF | Soil layer file — horizon-by-horizon attributes (texture, OM, pH) | SOIL_ID, LAYER_NO, TEXTURE, OM, PH |
For v2 we need PLY + CMP + SNF. SLF is deferred (richer than this round needs).
Aggregation strategy — component-percent-weighted dominant
Each SLC polygon has multiple soil components (CMP rows) with a PERCENT field summing to 100. For the polygon-level attributes we want, the rule is dominant component wins (highest PERCENT), with ties broken by lowest CMP_NO. This matches AAFC's own surfacing convention.
Pseudo-SQL for the load:
-- Build a "dominant component per polygon" view
WITH dominant_cmp AS (
SELECT DISTINCT ON (poly_id)
poly_id,
soil_id,
kind,
water AS drainage_raw
FROM etl.slc_cmp
ORDER BY poly_id, percent DESC, cmp_no ASC
),
-- Join SNF for the soil-name attributes
joined AS (
SELECT
d.poly_id,
s.order AS soil_order,
s.greatgrp AS soil_group,
s.subgrp AS soil_subgroup,
-- AAFC drainage codes: R=rapid, W=well, MW=mod well, I=imperfect, P=poor, VP=very poor
CASE s.drain
WHEN 'R' THEN 'rapid'
WHEN 'W' THEN 'well'
WHEN 'MW' THEN 'moderately well'
WHEN 'I' THEN 'imperfectly'
WHEN 'P' THEN 'poorly'
WHEN 'VP' THEN 'very poorly'
ELSE NULL
END AS drainage_class,
-- AAFC parent material codes are 2-3 letter; map to human-readable
CASE s.parent_mat
WHEN 'T' THEN 'glacial till'
WHEN 'L' THEN 'lacustrine'
WHEN 'F' THEN 'fluvial'
WHEN 'E' THEN 'eolian'
WHEN 'O' THEN 'organic'
WHEN 'R' THEN 'residual'
WHEN 'M' THEN 'morainal'
WHEN 'C' THEN 'colluvial'
ELSE s.parent_mat
END AS parent_material,
p.slope AS slope_class,
(s.order = 'Solonetzic' OR s.subgrp ILIKE '%solonetz%') AS is_solonetzic
FROM dominant_cmp d
JOIN etl.slc_snf s ON s.soil_id = d.soil_id
JOIN etl.slc_ply p ON p.poly_id = d.poly_id
)
UPDATE public.soil_order_canada o
SET
soil_group = j.soil_group,
soil_subgroup = j.soil_subgroup,
drainage_class = j.drainage_class,
parent_material = j.parent_material,
slope_class = j.slope_class,
is_solonetzic = j.is_solonetzic,
slc_version = 'v3.2-2020'
FROM joined j
WHERE o.soil_order_id = j.poly_id;
The exact AAFC code-to-label mappings (drainage, parent material, slope) are in the SLC documentation PDF accompanying the dataset publication. Cross-check the codes there before committing the load — AAFC has updated the code list across publications.
PMTiles regeneration
The PMTiles archive at maps.townshipcanada.com/soil_order_canada.pmtiles needs to be regenerated to include the new properties. Add the new columns to the tippecanoe step:
ogr2ogr -f GeoJSONSeq /tmp/slc.geojsonl PG:'host=... dbname=...' \
-sql "SELECT soil_order, soil_group, drainage_class, parent_material,
is_solonetzic, geom
FROM public.soil_order_canada"
tippecanoe \
-o soil_order_canada.pmtiles \
--layer=soil_order_canada \
--minimum-zoom=0 \
--maximum-zoom=12 \
--include=soil_order \
--include=soil_group \
--include=drainage_class \
--include=parent_material \
--include=is_solonetzic \
/tmp/slc.geojsonl
The --coalesce-densest-as-needed flag is not needed here (SLC polygons are large; density isn't the issue).
Validation checks before swapping into production
Once the new PMTiles is built:
- Coverage: Verify no NULL
drainage_classrows where there's a non-NULLsoil_order(other thanUnclassifiedand ocean/water polygons) - Distribution: Sanity-check the drainage_class distribution against AAFC's published summary statistics
- Spot-check known parcels:
NW 22-44-9-W5(Sylvan Lake area, AB) — should beBlack Chernozem,well drained,glacial tillSE 14-29-21-W2M(Estevan area, SK) — should beBrown ChernozemorDark Brown Chernozem, mixed drainageS 7-33-21-W3(correction-section quarter SK) — should beBlack Chernozem, watch forSolonetzicflag
- Solonetzic flag accuracy: spot-check known salt-affected areas in southern AB and central SK
- PMTiles size: the v1 archive is ~280MB. v2 should be ~350-400MB. If it goes over 500MB, drop
soil_subgroupfrom tiles (still queryable from DB).
Web side — no code changes required
Once the columns are populated in public.soil_order_canada, the following surfaces automatically include them:
- Parcel report API (
server/api/v1/parcels/[lld]/report.get.js) — change the SELECT ingetSoilOrder()to pull all new columns - Territory CSV (
server/api/territory/[id]/export.csv.get.js) — add columns to the SELECT + CSV headers - Territory parcel list (
server/api/territory/[id]/parcels.get.js) — add columns to the SELECT - Batch enrichment (
server/utils/queryBuilder.js) — extend thesoil_order_canadaenrichment join to SELECT additional columns
A web-side PR after the ETL completes will be ~30 lines across these four files.
v3 — AGRASID 4.1 for higher-resolution Alberta coverage
National SLC is 1:1,000,000. AGRASID 4.1 (already in the LSRS pipeline) is 1:100,000 for Alberta — 10× finer. For a 160-acre quarter section, AGRASID often distinguishes within-parcel variation that SLC collapses.
Target
Add a province-specific table that supersedes SLC for AB queries:
CREATE TABLE public.agrasid_soil_ab (
id SERIAL PRIMARY KEY,
poly_id INTEGER,
soil_order TEXT,
soil_group TEXT,
soil_subgroup TEXT,
drainage_class TEXT,
parent_material TEXT,
slope_class TEXT,
is_solonetzic BOOLEAN DEFAULT FALSE,
agrasid_version TEXT,
geom geometry(MultiPolygon, 4326)
);
CREATE INDEX agrasid_soil_ab_geom_idx ON public.agrasid_soil_ab USING GIST (geom);
Same column set as SLC + agrasid_version.
Source
AGRASID 4.1 is published by Alberta Soil Information Viewer. Same component / soil-names structure as SLC but Alberta-specific and higher resolution. The LSRS pipeline already ingests AGRASID for the LSRS class compute — extend that pipeline to materialize the per-polygon soil attributes here.
Web side query pattern
The web side will query the higher-resolution table first, falling back to SLC:
SELECT
COALESCE(ab.soil_order, slc.soil_order) AS soil_order,
COALESCE(ab.drainage_class, slc.drainage_class) AS drainage_class,
...
COALESCE(ab.soil_subgroup, slc.soil_subgroup) AS soil_subgroup,
COALESCE('AGRASID 4.1', 'SLC v3.2-2020') AS source
FROM (SELECT 1) dummy
LEFT JOIN public.agrasid_soil_ab ab ON ST_Contains(ab.geom, $1::geometry)
LEFT JOIN public.soil_order_canada slc ON ST_Contains(slc.geom, $1::geometry)
LIMIT 1;
This pattern degrades gracefully when AGRASID coverage is missing (north of the AB agricultural belt) and the response surfaces the source attribution so the parcel report can show "AGRASID 4.1" vs "SLC v3.2" per parcel.
v4 — provincial datasets for SK, MB, BC
Following the AB AGRASID precedent:
- SK: SKSIS (Saskatchewan Soil Information System) — published by SaskAgriculture, similar attribute structure to AGRASID
- MB: Manitoba Soil Database — provincial publication, slightly less detail than SKSIS
- BC: BC Soil Information Finder Service — covers the BC agricultural corridor
Each province ships as its own table public.<source>_soil_<province> and the query pattern extends:
COALESCE(ab.soil_order, sk.soil_order, mb.soil_order, bc.soil_order, slc.soil_order) AS soil_order
Coverage check (province column) determines which provincial table is consulted; SLC is always the fallback.
Sequencing recommendation
- v2 — SLC enrichment (this doc, primary deliverable) — biggest content unlock; same ingest source, new columns
- v3 — AGRASID for AB — finer resolution where LSRS coverage is shipped (matching pattern)
- v4 — SKSIS for SK — synchronizes with the SK LSRS compute pipeline (which lands roughly the same time)
- v4.2 — MB Soil DB — follows SK
- v4.3 — BC Soil Information Finder — last, smaller agricultural land base
Each step is independently shippable. v2 should land within a 2-week window once the SLC source files are pulled and the join SQL is validated against the spot-check parcels above.
Open questions for ops to confirm
- SLC version cadence: AAFC publishes major SLC updates roughly every 5-10 years. The current public release is v3.2 (2020). Confirm the publication we want to ingest is v3.2 or whether v3.3 has dropped.
- Tippecanoe attribute trimming: if PMTiles size blows past 500MB, decide whether to drop
soil_subgroupfrom tiles (kept in DB) or to ship at higher minimum zoom (currently--minimum-zoom=0). - Validation parcels: the three spot-check parcels above are starter examples. Confirm with the agronomy advisor (if available) that these match expected attribute values.
Web-side companion work
The web side ships these in parallel (no data dependency):
- Friendlier soil-order display names (Chernozemic → "Chernozem — Prairie grassland soil zone") — already shipped
- Map colour ramp using semantic earth tones — already shipped
- Click-to-identify popover on the SLC layer — already shipped
- Per-soil-order explainer pSEO pages (Chernozemic, Luvisolic, Solonetzic) — already shipped
Once v2 ingest completes, ~30 lines of web changes surface the new attributes everywhere automatically. No follow-up PR needed for surface area, only for surface formatting.