AI and ML workflows - BuyParcelData

Overview

BuyParcelData (BPD) parcel data is well suited to AI and ML applications. A structured, consistent schema across 140M parcels makes it practical to build embedding generation, natural language search, and agentic property research workflows.

Embedding generation

Flat file exports are the recommended delivery method for embedding workflows. Load the full parcel corpus into your pipeline and generate embeddings from parcel text fields:

Python

import httpx

# Example: build a text representation of a parcel for embedding
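def parcel_to_text(parcel: dict) -> str:
    """Join the populated parcel fields into one pipe-delimited string."""
    parts = []
    if parcel.get("owner"):
        parts.append(f"Owner: {parcel['owner']}")
    if parcel.get("situs"):
        # Skip a missing city or state so the address has no dangling commas
        address = [parcel["situs"], parcel.get("city"), parcel.get("state")]
        parts.append("Address: " + ", ".join(str(a) for a in address if a))
    if parcel.get("acres"):
        parts.append(f"Acres: {parcel['acres']}")
    if parcel.get("land_use"):
        parts.append(f"Land use: {parcel['land_use']}")
    if parcel.get("cdl_majority_category"):
        crop = f"Crop cover: {parcel['cdl_majority_category']}"
        # Only append the percentage when it is present
        if parcel.get("cdl_majority_percent") is not None:
            crop += f" ({parcel['cdl_majority_percent']}%)"
        parts.append(crop)
    if parcel.get("zoning"):
        parts.append(f"Zoning: {parcel['zoning']}")
    if parcel.get("parcel_value"):
        # float() also accepts values read as strings from a flat file
        parts.append(f"Assessed value: ${float(parcel['parcel_value']):,.0f}")
    return " | ".join(parts)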

This approach works with any embedding model. For large-scale batch embedding, platforms like Databricks are commonly used to parallelize embedding generation across the full dataset.
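As a minimal sketch of batch preparation, the snippet below streams rows from a flat-file export and groups their text representations into fixed-size batches ready to send to an embedding model. The CSV format, the column subset, the sample values, and the helper names `row_to_text` and `iter_text_batches` are all assumptions for illustration. The call into the embedding model is left out because it depends on the model you choose.

```python
import csv
import io
from itertools import islice
from typing import Iterable, Iterator


def row_to_text(row: dict) -> str:
    """Simplified stand-in for parcel_to_text: join the populated fields."""
    fields = ("owner", "situs", "acres", "land_use")
    return " | ".join(f"{name}: {row[name]}" for name in fields if row.get(name))


def iter_text_batches(rows: Iterable[dict], batch_size: int) -> Iterator[list[str]]:
    """Yield lists of at most batch_size parcel texts, skipping empty rows."""
    texts = (text for text in map(row_to_text, rows) if text)
    while batch := list(islice(texts, batch_size)):
        yield batch


# Tiny in-memory sample standing in for a flat-file export (values are made up).
sample = io.StringIO(
    "owner,situs,acres,land_use\n"
    "A SMITH,100 MAIN ST,120.5,Agricultural\n"
    ",,,\n"
    "B JONES,22 OAK AVE,0.4,Residential\n"
    "C LEE,9 ELM RD,35,Agricultural\n"
)
batches = list(iter_text_batches(csv.DictReader(sample), batch_size=2))
```

Because the rows are consumed lazily, the same generator could be pointed at a file opened with `open(path, newline="")` without loading the whole export into memory.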

Natural language search

Once embeddings are generated, you can implement natural language property search:

Python

import httpx

# Query example: "large agricultural parcels in Iowa with corn cover"
# 1. Embed the query text
# 2. Retrieve semantically similar parcels from your vector index
# 3. Optionally post-filter with BPD API structured filters

API_KEY = "YOUR_API_KEY"
BASE = "https://api.buyparceldata.com"

# Structured query complement — filter by field after vector retrieval
response = httpx.post(
    f"{BASE}/parcels/query",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "filters": [
            {"field": "state_fp", "op": "eq", "value": "19"},
            {"field": "acres", "op": "gte", "value": 100},
            {"field": "cdl_majority_category", "op": "ilike", "value": "Corn"},
        ],
        "order": {"field": "acres", "direction": "desc"},
        "limit": 50,
    },
)
response.raise_for_status()  # surface auth or validation errors before parsing
parcels = response.json()["results"]
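The retrieval step (step 2 in the comments above) can be sketched without committing to a particular vector database. The brute-force cosine-similarity search below is only an illustration: the three-dimensional vectors and parcel IDs are made up, `cosine` and `top_k` are hypothetical helpers, and an index over the full corpus would normally use an approximate nearest-neighbor library instead.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the IDs of the k parcels whose embeddings are closest to the query."""
    ranked = sorted(index, key=lambda pid: cosine(query_vec, index[pid]), reverse=True)
    return ranked[:k]


# Toy index: parcel ID -> embedding (illustrative values only).
index = {
    "parcel-a": [0.9, 0.1, 0.0],
    "parcel-b": [0.0, 1.0, 0.0],
    "parcel-c": [0.7, 0.3, 0.1],
}
query_vec = [1.0, 0.0, 0.0]  # stands in for the embedded query text
nearest = top_k(query_vec, index, k=2)
```

The IDs in `nearest` are the candidates that a structured filter query can then narrow down.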

Agentic workflows

BPD integrates cleanly into agentic AI workflows that need to answer questions about property, ownership, or land use. Typical patterns:

Geographic research agent: accepts a location or address, calls /parcels/point or /parcels/area, returns structured parcel data to the agent context
Owner lookup agent: queries /parcels/query with owner + state_fp filters to find all parcels owned by a given entity
Land screening agent: applies multi-field filters (acreage, crop type, zoning, adjacency) to identify candidate parcels for a specific use case

The API’s structured filter system maps naturally to agent tool parameters. Each filter field and operator can be exposed as a typed tool argument.
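One way to expose the filter system as a tool is a typed function whose arguments translate into the filter payload. The sketch below assumes the payload shape used in the query example on this page. `build_land_screen_query` and its parameter names are hypothetical, and no request is sent.

```python
from typing import Optional


def build_land_screen_query(
    state_fp: str,
    min_acres: Optional[float] = None,
    crop: Optional[str] = None,
    limit: int = 50,
) -> dict:
    """Translate typed tool arguments into a /parcels/query request body."""
    filters = [{"field": "state_fp", "op": "eq", "value": state_fp}]
    if min_acres is not None:
        filters.append({"field": "acres", "op": "gte", "value": min_acres})
    if crop:
        filters.append({"field": "cdl_majority_category", "op": "ilike", "value": crop})
    return {
        "filters": filters,
        "order": {"field": "acres", "direction": "desc"},
        "limit": limit,
    }


# An agent calling the tool with typed arguments gets a ready-to-send body.
payload = build_land_screen_query("19", min_acres=100, crop="Corn")
```

Keeping the arguments optional lets the agent supply only the criteria it has, and each omitted argument simply drops its filter from the request.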

Flat files vs. API for AI use cases

Use case	Recommended delivery method
Batch embedding generation	Flat files
Building a vector index over all parcels	Flat files
Real-time parcel lookup in an agent	API
Filtering parcels by structured criteria	API
Offline analysis and model training	Flat files

See Flat files for bulk delivery options.
