
Overview

BPD parcel data is well-suited for AI and ML applications. The structured, consistent schema across 140M parcels makes it practical for embedding generation, natural language search, and agentic property research workflows.

Embedding generation

Flat file exports are the recommended delivery method for embedding workflows. Load the full parcel corpus into your pipeline and generate embeddings from parcel text fields:
Python
import httpx

# Example: build a text representation of a parcel for embedding
def parcel_to_text(parcel: dict) -> str:
    parts = []
    if parcel.get("owner"):
        parts.append(f"Owner: {parcel['owner']}")
    if parcel.get("situs"):
        # Join only the address components that are present
        address = [parcel["situs"], parcel.get("city"), parcel.get("state")]
        parts.append(f"Address: {', '.join(a for a in address if a)}")
    if parcel.get("acres"):
        parts.append(f"Acres: {parcel['acres']}")
    if parcel.get("land_use"):
        parts.append(f"Land use: {parcel['land_use']}")
    if parcel.get("cdl_majority_category"):
        crop = parcel["cdl_majority_category"]
        # Append the coverage percentage only when it is available
        if parcel.get("cdl_majority_percent") not in (None, ""):
            crop += f" ({parcel['cdl_majority_percent']}%)"
        parts.append(f"Crop cover: {crop}")
    if parcel.get("zoning"):
        parts.append(f"Zoning: {parcel['zoning']}")
    if parcel.get("parcel_value"):
        # float() also handles values that arrive as numeric strings
        parts.append(f"Assessed value: ${float(parcel['parcel_value']):,.0f}")
    return " | ".join(parts)
This approach works with any embedding model. For large-scale batch embedding, platforms like Databricks are commonly used to parallelize embedding generation across the full dataset.
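As a rough sketch of the batch step, the snippet below streams rows from a flat file export and passes parcel text to an embedding function in fixed-size batches. It assumes the export is a CSV with a header row (the actual delivery format may differ). `embed_parcels` and `batched` are illustrative helper names, not part of BPD, and `embed_batch` is a hypothetical placeholder for whatever embedding model or service you use.

```python
import csv
from itertools import islice
from typing import Callable, Iterable, Iterator, TextIO


def batched(items: Iterable, size: int) -> Iterator[list]:
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


def embed_parcels(
    source: TextIO,
    to_text: Callable[[dict], str],
    embed_batch: Callable[[list[str]], list[list[float]]],
    batch_size: int = 256,
) -> Iterator[tuple[dict, list[float]]]:
    """Stream (row, vector) pairs from a CSV export without loading it all into memory."""
    rows = csv.DictReader(source)  # assumes a CSV export with a header row
    for batch in batched(rows, batch_size):
        texts = [to_text(row) for row in batch]
        vectors = embed_batch(texts)  # hypothetical: returns one vector per input text
        yield from zip(batch, vectors)
```

Because `csv.DictReader` yields every value as a string, numeric fields such as `parcel_value` need converting (for example with `float`) before any numeric formatting is applied in the text-building function.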
Once embeddings are generated, you can implement natural language property search:
Python
# Query example: "large agricultural parcels in Iowa with corn cover"
# 1. Embed the query text
# 2. Retrieve semantically similar parcels from your vector index
# 3. Optionally post-filter with BPD API structured filters

import httpx

API_KEY = "YOUR_API_KEY"
BASE = "https://api.buyparceldata.com"

# Structured query complement — filter by field after vector retrieval
response = httpx.post(
    f"{BASE}/parcels/query",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "filters": [
            {"field": "state_fp", "op": "eq", "value": "19"},
            {"field": "acres", "op": "gte", "value": 100},
            {"field": "cdl_majority_category", "op": "ilike", "value": "Corn"},
        ],
        "order": {"field": "acres", "direction": "desc"},
        "limit": 50,
    },
)
response.raise_for_status()  # surface HTTP errors before reading the body
parcels = response.json()["results"]
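
Steps 1 and 2 in the comments above can be sketched with a brute-force cosine-similarity search. This is a minimal illustration in pure Python, assuming the index is an in-memory dict that maps a parcel identifier to its embedding vector. `cosine_similarity`, `top_k_similar`, and the parcel identifiers are hypothetical names; a production system would normally use a dedicated vector index instead.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_similar(query_vec: list[float], index: dict, k: int = 5) -> list:
    """Return the k (parcel_id, score) pairs most similar to query_vec, best first."""
    scored = [(pid, cosine_similarity(query_vec, vec)) for pid, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-dimensional vectors standing in for real embeddings
index = {"parcel-a": [1.0, 0.0], "parcel-b": [0.6, 0.8], "parcel-c": [0.0, 1.0]}
matches = top_k_similar([1.0, 0.0], index, k=2)
print([pid for pid, _ in matches])  # ['parcel-a', 'parcel-b']
```

The identifiers returned by the similarity search can then be narrowed with structured filters, as in step 3.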

Agentic workflows

BPD integrates cleanly into agentic AI workflows that need to answer questions about property, ownership, or land use. Typical patterns:
  • Geographic research agent: accepts a location or address, calls /parcels/point or /parcels/area, returns structured parcel data to the agent context
  • Owner lookup agent: queries /parcels/query with owner + state_fp filters to find all parcels owned by a given entity
  • Land screening agent: applies multi-field filters (acreage, crop type, zoning, adjacency) to identify candidate parcels for a specific use case
The API’s structured filter system maps naturally to agent tool parameters. Each filter field and operator can be exposed as a typed tool argument.
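
One way to picture this mapping: the sketch below turns typed tool arguments into the `filters` payload shape used in the `/parcels/query` example earlier on this page. `ParcelFilter`, `build_query_payload`, and `owner_lookup_payload` are hypothetical helper names. The operator set is limited to the three that appear in that example (`eq`, `gte`, `ilike`); the API may support others. Matching the owner with `eq` is also an assumption.

```python
from dataclasses import asdict, dataclass
from typing import Literal, Union, get_args

# Only the operators shown in this page's query example; the real API may accept more.
Operator = Literal["eq", "gte", "ilike"]
ALLOWED_OPS = get_args(Operator)


@dataclass(frozen=True)
class ParcelFilter:
    """One typed tool argument: a field name, an operator, and a value."""
    field: str
    op: Operator
    value: Union[str, int, float]

    def __post_init__(self):
        # Reject operators the agent invents before they reach the API
        if self.op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {self.op!r}")


def build_query_payload(filters: list[ParcelFilter], limit: int = 50) -> dict:
    """Assemble a JSON body in the shape used by the /parcels/query example."""
    return {"filters": [asdict(f) for f in filters], "limit": limit}


def owner_lookup_payload(owner: str, state_fp: str) -> dict:
    """Request body for the owner lookup pattern: owner + state_fp filters."""
    return build_query_payload([
        ParcelFilter("owner", "eq", owner),  # exact match is an assumption
        ParcelFilter("state_fp", "eq", state_fp),
    ])
```

Validating the operator at construction time keeps malformed tool calls from turning into failed API requests, which tends to make agent retries cheaper.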

Flat files vs. API for AI use cases

Use case | Recommended
Batch embedding generation | Flat files
Building a vector index over all parcels | Flat files
Real-time parcel lookup in an agent | API
Filtering parcels by structured criteria | API
Offline analysis and model training | Flat files

See Flat files for bulk delivery options.