From Notion Database to Sales Intelligence Agent

Building an MCP-Powered Pipeline That Turns Static CRM Data Into Actionable Intelligence
The Challenge
When Your Sales Team Is Drowning in Data
Last month, our sales team came to the AI Center of Excellence with a familiar refrain: "Can AI help us stop drowning in data?" While they had access to the kind of enterprise CRM you'd expect at a company our size, their operational reality was different: a curated Notion database tailored to their specific workflows, tracking the metrics and relationships that actually mattered for their portfolio. Their vision? AI that could surface risks before they became surprises, draft contextual outreach, and turn static records into living intelligence.
I could have pushed them back toward the "official" system. But KISS kicked in: Keep It Simple, Stupid. They're already using Notion as their source of truth. Why fight that battle when we could augment what they have?
Three weeks later, we have a working system: an AI agent named Prospector that lives in their chat workflow, answers natural language questions about portfolio risk, drafts contextual outreach, and generates executive presentations on demand, all backed by real-time web intelligence that refreshes automatically.
Here's how we built it.
The Intelligence Gap
The sales team was managing a substantial enterprise software portfolio. The data was there, but it was static. Nobody had time to research whether Company A was laying off its IT team or whether Company B had budget freezes coming.
Sales reps were making decisions in information vacuums. A client might be three months from churn due to acquisition rumors, but the rep wouldn't know until the "we're not renewing" email landed in their inbox.
The scale problem:
This wasn't a data quality issue. The sales team was managing a $100M+ portfolio spanning over 600 client relationships, and frankly, their Notion database was impressive. Account details, renewal dates, relationship history, contact info: all well-maintained.
But here's the thing: keeping 600 records accurate is a full-time job. Keeping 600 records enriched with current market intelligence is impossible.
No human team can Google every client weekly, read the earnings reports, catch the layoff announcements, and notice the acquisition rumors while also doing their actual job of selling.
The data was solid. The intelligence wasn't there because it couldn't be. Not manually. Not at this scale.
Architecture Decision: Augment, Don't Replace
Before writing code, I mapped the options:
Option 3 won. The sales team already lives in Notion. Adding an intelligence layer on top means they get smarter data without changing their workflow, and without waiting six months for IT to prioritize CRM customizations.
The architecture that emerged:
┌─────────────────────────────────────────────────────────────────┐
│ User Interface                                                  │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐       │
│ │   Notion    │ │  BrainTrust │ │  FastAPI Swagger    │       │
│ │  Database   │ │ Chat Agent  │ │     (Debug)         │       │
│ └──────┬──────┘ └──────┬──────┘ └──────────┬──────────┘       │
└─────────┼──────────────────┼─────────────────────┼─────────────┘
          │                  │                     │
          ▼                  ▼                     ▼
┌─────────────────────────────────────────────────────────────────┐
│ FastAPI Backend                                                 │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐       │
│ │  /actions   │ │  /insights  │ │   /enrichment       │       │
│ │  endpoints  │ │  endpoints  │ │    endpoints        │       │
│ └─────────────┘ └─────────────┘ └─────────────────────┘       │
└─────────────────────────────┬───────────────────────────────────┘
                              │
          ┌───────────────────┼───────────────────┐
          ▼                   ▼                   ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│   Notion API    │ │   Firecrawl     │ │   Claude API    │
│ (Data Layer)    │ │  (Web Intel)    │ │   (Reasoning)   │
└─────────────────┘ └─────────────────┘ └─────────────────┘
Key design principles:
Notion stays source of truth
No data migration, no sync conflicts
Intelligence is additive
New fields enrich existing records, never overwrite user data
API-first backend
Everything accessible via REST for agent integration
Human-in-the-loop by default
AI recommends, humans approve
The Intelligence Pipeline: Firecrawl + Evidence-Based Scoring
The enrichment pipeline runs on a simple premise: companies broadcast signals about their health constantly. News articles, press releases, job postings, earnings reports. We just need to collect and interpret them.
Firecrawl Integration
Firecrawl handles the web intelligence extraction. The key is using Firecrawl's agent mode with Pydantic schemas for structured extraction. No separate LLM call needed:
from firecrawl import Firecrawl
from pydantic import BaseModel, Field
from typing import Optional, List

# Pydantic schemas for structured extraction
class CompanyInfo(BaseModel):
    """Schema for company information extraction."""
    # default=None keeps these fields optional under Pydantic v2, where an
    # Optional field without a default is still required.
    employee_count: Optional[str] = Field(default=None, description="Employee count or range (e.g., '1000-5000')")
    industry: Optional[str] = Field(default=None, description="Primary business category")
    company_size: Optional[str] = Field(default=None, description="Size category: startup/SMB/enterprise")
    headquarters: Optional[str] = Field(default=None, description="City, State/Country")

class CompanyNews(BaseModel):
    """Schema for company news extraction."""
    has_layoffs: bool = Field(default=False, description="Recent layoff announcements")
    has_ma: bool = Field(default=False, description="M&A activity (acquiring or being acquired)")
    is_expanding: bool = Field(default=False, description="Growth/expansion signals")
    sentiment: Optional[str] = Field(default=None, description="Overall news sentiment: positive/neutral/negative")
    key_events: List[str] = Field(default_factory=list, description="Recent significant events")
    sources: List[str] = Field(default_factory=list, description="URLs of source articles")

def enrich_company(firecrawl: Firecrawl, company_name: str) -> dict:
    """
    Enrich a single company using Firecrawl agent.
    Returns structured data via Pydantic schema extraction.
    """
    # Get company info
    info_result = firecrawl.agent(
        prompt=f"Find current information about {company_name}: employee count, industry, company size, headquarters location.",
        schema=CompanyInfo,
        max_pages=3
    )
    
    # Get recent news and events
    news_result = firecrawl.agent(
        prompt=f"Find recent news about {company_name} from the last 6 months. Look for: layoffs, acquisitions, mergers, expansion, funding, leadership changes.",
        schema=CompanyNews,
        max_pages=5
    )
    
    return {
        "company_name": company_name,
        "employee_count": info_result.employee_count,
        "industry": info_result.industry,
        "company_size": info_result.company_size,
        "headquarters": info_result.headquarters,
        "has_layoffs": news_result.has_layoffs,
        "has_ma": news_result.has_ma,
        "is_expanding": news_result.is_expanding,
        "sentiment": news_result.sentiment,
        "key_events": news_result.key_events,
        "sources": news_result.sources,
    }
The beauty of this approach: Firecrawl's agent handles the search, scraping, AND extraction in one call. The Pydantic schema enforces structure, so downstream code works with typed fields instead of parsing free text.
Evidence-Based Risk Scoring
Here's the critical insight: sentiment analysis alone is useless for churn prediction. A company can have negative press coverage and still be a loyal customer. What matters is evidence of specific business events that correlate with contract risk.
I built an evidence-based scoring system that only flags real churn indicators:
class RiskScorer:
    """
    Evidence-based risk scoring for customer churn prediction.
    Each signal must be backed by a specific event, not just sentiment.
    """
    
    # Critical signals - immediate action required
    CRITICAL_SIGNALS = {
        "budget_cuts": 3,              # Explicit cost reduction programs
        "spending_freeze": 3,          # Procurement freezes announced
        "acquired_by_competitor": 3,   # Being bought by someone who has competing product
        "bankruptcy_filing": 3,        # Chapter 11 or equivalent
    }
    
    # High-risk signals - schedule outreach this week
    HIGH_RISK_SIGNALS = {
        "major_layoffs": 2,            # >10% workforce reduction
        "it_department_cuts": 2,       # Specifically IT/tech layoffs
        "cto_cio_departure": 2,        # Key technical stakeholder leaving
        "negative_earnings": 2,        # Missed earnings + negative guidance
        "contract_consolidation": 2,   # Announced vendor consolidation initiative
    }
    
    # Watch signals - monitor closely
    WATCH_SIGNALS = {
        "minor_layoffs": 1,            # <10% workforce reduction
        "leadership_transition": 1,    # C-suite changes (not tech-specific)
        "restructuring": 1,            # Org changes without clear direction
        "mixed_earnings": 1,           # Beat/miss with uncertain outlook
    }
    
    # Positive signals - reduce risk score
    POSITIVE_SIGNALS = {
        "acquisition_mode": -2,        # Company acquiring others (growth)
        "expansion_announced": -2,     # New markets, new offices
        "funding_round": -2,           # Fresh capital injection
        "record_revenue": -1,          # Strong financial performance
        "hiring_spree": -1,            # Significant hiring (especially IT)
    }
    
    def calculate_risk(self, events: list[CompanyEvent]) -> RiskAssessment:
        """
        Calculate risk score from list of detected events.
        Returns risk level + supporting evidence.
        """
        score = 0
        evidence = []
        
        for event in events:
            signal_weight = self._get_signal_weight(event.event_type)
            if signal_weight != 0:
                score += signal_weight
                evidence.append({
                    "signal": event.event_type,
                    "weight": signal_weight,
                    "date": event.event_date,
                    "summary": event.event_summary,
                    "source": event.source_url
                })
        
        risk_level = self._score_to_level(score)
        
        return RiskAssessment(
            level=risk_level,
            score=score,
            evidence=evidence,
            recommended_action=self._get_recommended_action(risk_level)
        )
    
    def _score_to_level(self, score: int) -> str:
        if score >= 5:
            return "CRITICAL"
        elif score >= 3:
            return "HIGH"
        elif score >= 1:
            return "WATCH"
        elif score <= -2:
            return "OPPORTUNITY"
        else:
            return "STABLE"
Why evidence-based scoring matters:
The system won't flag a company as high-risk just because there's negative sentiment in a news article. It needs to detect a specific event type (layoffs, budget cuts, acquisition) with a source URL. This dramatically reduces false positives and gives sales reps something concrete to reference in conversations.
❌ Bad: Sentiment-only approach
if sentiment == "negative":
    risk = "HIGH"
# Too many false positives
✓ Good: Evidence-based approach
if any(e.event_type == "major_layoffs" for e in events):
    risk = "HIGH"
# Actionable, verifiable signal
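The RiskScorer listing references a few pieces it does not show (CompanyEvent and the weight lookup). A minimal sketch of how the weight lookup and score arithmetic could fit together, assuming the signal tables are merged into a single lookup and unknown event types weigh 0. The dataclass and the helper names below are illustrative stand-ins, not the production code:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the CompanyEvent type that RiskScorer consumes.
@dataclass
class CompanyEvent:
    event_type: str
    event_date: Optional[str] = None
    event_summary: str = ""
    source_url: Optional[str] = None

# A subset of the signal tables above, merged into one lookup.
SIGNAL_WEIGHTS = {
    "budget_cuts": 3,
    "major_layoffs": 2,
    "minor_layoffs": 1,
    "expansion_announced": -2,
    "hiring_spree": -1,
}

def get_signal_weight(event_type: str) -> int:
    """Unknown event types contribute nothing to the score."""
    return SIGNAL_WEIGHTS.get(event_type, 0)

def score_to_level(score: int) -> str:
    """Same thresholds as RiskScorer._score_to_level."""
    if score >= 5:
        return "CRITICAL"
    if score >= 3:
        return "HIGH"
    if score >= 1:
        return "WATCH"
    if score <= -2:
        return "OPPORTUNITY"
    return "STABLE"

def total_score(events: list[CompanyEvent]) -> int:
    """Sum the weights of all detected events."""
    return sum(get_signal_weight(e.event_type) for e in events)

events = [CompanyEvent("budget_cuts"), CompanyEvent("major_layoffs")]
print(total_score(events), score_to_level(total_score(events)))  # 5 CRITICAL
```

Under these thresholds a single critical signal (weight 3) lands on HIGH; CRITICAL requires a combined score of 5 or more, for example budget cuts plus major layoffs.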
Risk Level to Action Mapping
Each risk level maps to a specific recommended action:
The Automation Stack: AWS Infrastructure
One of the first decisions was where to run this thing. The system has two distinct workloads with different requirements:
Trying to run both on the same infrastructure would mean either paying for idle compute (running enrichment on an always-on server) or dealing with cold starts (running the API on Lambda). The answer: use both.
The Architecture
┌─────────────────────────────────────────────────────────────────────────────┐
│ AWS INFRASTRUCTURE                                                          │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│ ┌──────────────┐ ┌──────────────────────────────────────────────┐         │
│ │ EventBridge  │ │ EC2 (t3.small)                               │         │
│ │ (Daily 7am)  │ │ sales-pipeline.inceptapi.com                 │         │
│ └──────┬───────┘ │                                              │         │
│        │         │ ┌────────────────────────────────────────┐   │         │
│        ▼         │ │ FastAPI Server (uvicorn)               │   │         │
│ ┌──────────────┐ │ │ • /actions/* endpoints                 │   │         │
│ │ Orchestrator │ │ │ • /insights/* endpoints                │   │         │
│ │   Lambda     │ │ │ • /enrich endpoint (triggers Lambda)   │   │         │
│ │              │ │ │ • OpenAPI spec for MCP                 │   │         │
│ │ Reads Notion │ │ └────────────────────────────────────────┘   │         │
│ │ Finds stale  │ │                                              │         │
│ │  companies   │ └──────────────────────────────────────────────┘         │
│ └──────┬───────┘        ▲                                                  │
│        │                │                                                  │
│        │ Invokes (batches of 10)    Agent queries                         │
│        ▼                │                                                  │
│ ┌──────────────┐ ┌──────┴───────┐                                         │
│ │   Worker     │ │  BrainTrust  │                                         │
│ │   Lambda     │ ×N parallel │ Hive Agent   │                            │
│ │              │ └──────────────┘                                         │
│ │ • Firecrawl  │                                                          │
│ │ • Risk score │                                                          │
│ │ • Notion sync│                                                          │
│ └──────────────┘                                                          │
│                                                                            │
│ ┌──────────────────────────────────────────────────────────────────────┐  │
│ │ Secrets Manager: jay-portfolio-enrichment/secrets                    │  │
│ │ • NOTION_API_KEY, NOTION_DATABASE_ID, FIRECRAWL_API_KEY, API_KEY    │  │
│ └──────────────────────────────────────────────────────────────────────┘  │
│                                                                            │
└─────────────────────────────────────────────────────────────────────────────┘
Why Lambda for Enrichment
The enrichment pipeline is a perfect Lambda use case:
Burst pattern
Enriching 600 companies once a day means ~600 function invocations, then nothing until tomorrow. Paying for a server to sit idle 23 hours a day makes no sense.
Natural parallelization
Each company enrichment is independent. The orchestrator fans out to 10 concurrent workers, processes a batch, then fans out again. Total enrichment time for the full portfolio: ~15 minutes instead of ~10 hours sequential.
Built-in resilience
Lambda automatically retries failed asynchronous invocations. If Firecrawl times out on one company, that worker retries without blocking the others.
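# orchestrator.py - Fan-out pattern
import concurrent.futures
import logging
from typing import Any, Dict

import boto3

# get_secrets, NotionReader, and invoke_worker are defined elsewhere in the project.
logger = logging.getLogger()
logger.setLevel(logging.INFO)
lambda_client = boto3.client('lambda')  # Created once per container, reused across invocations
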
def handler(event: Dict, context: Any) -> Dict:
    """
    Orchestrator Lambda - reads Notion, dispatches workers in parallel.
    Triggered daily by EventBridge at 12:00 UTC (7am EST).
    """
    secrets = get_secrets()  # From AWS Secrets Manager
    reader = NotionReader(
        api_key=secrets['NOTION_API_KEY'],
        database_id=secrets['NOTION_DATABASE_ID']
    )
    
    # Find companies needing refresh (not enriched in 7+ days)
    companies = reader.get_companies_to_enrich(
        max_companies=event.get('max_companies', 50),
        stale_days=event.get('stale_days', 7)
    )
    
    logger.info(f"Found {len(companies)} companies to enrich")
    
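    # Fan out to workers, 10 at a time
    results = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        futures = [
            executor.submit(invoke_worker, company, lambda_client)
            for company in companies
        ]
        for future in concurrent.futures.as_completed(futures):
            try:
                results.append(future.result())
            except Exception as exc:
                # future.result() re-raises worker errors; record the failure
                # instead of letting one company abort the whole run
                logger.error(f"Worker invocation failed: {exc}")
                results.append({'success': False, 'error': str(exc)})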
    
    successful = sum(1 for r in results if r.get('success'))
    
    return {
        'statusCode': 200,
        'body': {
            'total': len(companies),
            'successful': successful,
            'failed': len(companies) - successful
        }
    }
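The fan-out pattern can be exercised locally without AWS. A minimal sketch, with a stand-in function in place of the real Lambda invocation; fan_out and fake_worker are hypothetical names, and the simulated timeout only illustrates how a single failure is recorded without stopping the other workers:

```python
import concurrent.futures
from typing import Callable

def fan_out(companies: list[dict], worker: Callable[[dict], dict], max_workers: int = 10) -> dict:
    """Run worker(company) for every company, at most max_workers at a time."""
    results = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to its company so failures can be attributed
        future_to_company = {executor.submit(worker, c): c for c in companies}
        for future in concurrent.futures.as_completed(future_to_company):
            company = future_to_company[future]
            try:
                results.append(future.result())
            except Exception as exc:
                # Record the failure and keep collecting the remaining results
                results.append({"company": company["name"], "success": False, "error": str(exc)})
    successful = sum(1 for r in results if r.get("success"))
    return {"total": len(companies), "successful": successful, "failed": len(companies) - successful}

def fake_worker(company: dict) -> dict:
    """Stand-in for the Lambda invocation; fails for one specific company."""
    if company["name"] == "Bad Co":
        raise TimeoutError("simulated Firecrawl timeout")
    return {"company": company["name"], "success": True}

companies = [{"name": f"Company {i}"} for i in range(24)] + [{"name": "Bad Co"}]
summary = fan_out(companies, fake_worker)
print(summary)  # {'total': 25, 'successful': 24, 'failed': 1}
```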
Why EC2 for the API
The FastAPI server runs on a t3.small EC2 instance. Why not Lambda?
Always-on requirement
The BrainTrust agent needs sub-second responses. Lambda cold starts add 2-3 seconds of latency.
Connection pooling
The API maintains persistent connections to Notion. Lambda's stateless model would mean reconnecting on every request.
Cost math
A t3.small runs ~$15/month. The equivalent Lambda invocations would cost roughly the same, but with worse latency.
The API server is deployed via a simple shell script on the EC2 instance:
#!/bin/bash
# deploy-api.sh
set -e

# Update and install Docker
sudo yum update -y
sudo yum install docker -y
sudo service docker start
sudo usermod -a -G docker ec2-user

# Install Docker Compose
sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose

# Pull and run the latest Docker image
cd /home/ec2-user/api
docker-compose down # Stop existing services
docker-compose pull # Pull latest images
docker-compose up -d # Start new services in detached mode

echo "Deployment complete."
EventBridge Scheduling
The daily enrichment pipeline is triggered by an AWS EventBridge rule configured with a cron expression. This ensures the orchestrator Lambda runs reliably every morning.
# EventBridge cron configuration
Name: daily-enrichment-trigger
Description: Triggers the Orchestrator Lambda each weekday at 7 AM EST (12:00 UTC).
Schedule: cron(0 12 ? * MON-FRI *) # Monday to Friday, 12:00 UTC
Target: OrchestratorLambdaFunctionARN
Input: { "max_companies": 50, "stale_days": 7 }
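The max_companies and stale_days inputs bound how much work each run picks up. The selection logic inside get_companies_to_enrich is not shown in this article; a plausible sketch, assuming each record carries its last enrichment date (or None if never enriched) and that never-enriched and oldest records are picked first. The function name and record shape are illustrative:

```python
from datetime import date, timedelta

def select_stale(records: list[dict], today: date,
                 stale_days: int = 7, max_companies: int = 50) -> list[dict]:
    """Return records not enriched in stale_days or more, oldest first, capped at max_companies."""
    cutoff = today - timedelta(days=stale_days)
    stale = [
        r for r in records
        if r["last_enriched"] is None or r["last_enriched"] <= cutoff
    ]
    # Never-enriched records sort first (date.min), then the oldest enrichment dates
    stale.sort(key=lambda r: r["last_enriched"] or date.min)
    return stale[:max_companies]

today = date(2025, 1, 15)
records = [
    {"name": "A", "last_enriched": date(2025, 1, 14)},  # fresh, skipped
    {"name": "B", "last_enriched": date(2025, 1, 8)},   # exactly 7 days old, stale
    {"name": "C", "last_enriched": None},               # never enriched
    {"name": "D", "last_enriched": date(2025, 1, 1)},   # 14 days old
]
print([r["name"] for r in select_stale(records, today)])  # ['C', 'D', 'B']
```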
Secrets Management
All sensitive API keys and credentials are securely stored and retrieved from AWS Secrets Manager.
# get_secrets() function
import boto3
import json

def get_secrets():
    secret_name = "jay-portfolio-enrichment/secrets"
    region_name = "us-east-1" # Or your specific region

    # Create a Secrets Manager client
    client = boto3.client(
        service_name='secretsmanager',
        region_name=region_name
    )

    try:
        get_secret_value_response = client.get_secret_value(
            SecretId=secret_name
        )
    except Exception as e:
        # Handle exceptions, e.g., secret not found
        raise e

    # Decrypts secret using the associated KMS key.
    secret = get_secret_value_response['SecretString']
    return json.loads(secret)
Worker Function
The worker Lambda function performs the core enrichment and risk scoring for each company, then synchronizes the results back to Notion.
# worker.py
import json
from firecrawl import Firecrawl
from pydantic import BaseModel, Field
from typing import Optional, List, Dict, Any
from datetime import datetime
import os

# RiskScorer, the Pydantic schemas, and the Notion sync logic come from the other sections:
# CompanyInfo, CompanyNews, CompanyEvent, RiskScorer, sync_to_notion(...), get_secrets()

def handler(event: Dict, context: Any) -> Dict:
    """
    Worker Lambda - performs enrichment, risk scoring, and Notion sync for a single company.
    """
    # Load secrets (e.g., Firecrawl API key, Notion API key)
    secrets = get_secrets() # Re-use the get_secrets() function

    firecrawl_api_key = secrets['FIRECRAWL_API_KEY']
    notion_api_key = secrets['NOTION_API_KEY']
    
    firecrawl = Firecrawl(api_key=firecrawl_api_key)
    risk_scorer = RiskScorer() # Initialize risk scorer

    company_id = event['company_id']
    company_name = event['company_name']
    
    try:
        # 1. Enrich company using Firecrawl (as shown in previous section)
        info_result = firecrawl.agent(
            prompt=f"Find current information about {company_name}: employee count, industry, company size, headquarters location.",
            schema=CompanyInfo,
            max_pages=3
        )
        news_result = firecrawl.agent(
            prompt=f"Find recent news about {company_name} from the last 6 months. Look for: layoffs, acquisitions, mergers, expansion, funding, leadership changes.",
            schema=CompanyNews,
            max_pages=5
        )
        
        enrichment_data = {
            "employee_count": info_result.employee_count,
            "industry": info_result.industry,
            "company_size": info_result.company_size,
            "headquarters": info_result.headquarters,
            "has_layoffs": news_result.has_layoffs,
            "has_ma": news_result.has_ma,
            "is_expanding": news_result.is_expanding,
            "sentiment": news_result.sentiment,
            "key_events": news_result.key_events,
            "sources": news_result.sources,
        }

        # 2. Convert the news flags into CompanyEvent objects for risk scoring.
        # calculate_risk() reads event_type, event_date, event_summary, and source_url
        # as attributes, so plain dicts would fail. Simplified for brevity: a fuller
        # version would parse news_result.key_events and match each event to its source.
        # Assumes CompanyEvent accepts these four fields as keyword arguments.
        detected = []
        if news_result.has_layoffs: detected.append("major_layoffs")
        # has_ma does not say who is acquiring whom; it is treated as a growth signal here
        if news_result.has_ma: detected.append("acquisition_mode")
        if news_result.is_expanding: detected.append("expansion_announced")
        events = [
            CompanyEvent(event_type=t, event_date=None, event_summary="", source_url=None)
            for t in detected
        ]

        risk_assessment = risk_scorer.calculate_risk(events)

        # 3. Sync to Notion
        sync_to_notion(
            page_id=company_id, 
            enrichment=enrichment_data, 
            risk=risk_assessment.dict(), # Convert Pydantic model to dict
            api_key=notion_api_key
        )
        
        return {
            'statusCode': 200,
            'body': json.dumps({
                'company_id': company_id,
                'status': 'success',
                'risk_level': risk_assessment.level
            })
        }
    except Exception as e:
        print(f"Error enriching company {company_name} ({company_id}): {e}")
        return {
            'statusCode': 500,
            'body': json.dumps({
                'company_id': company_id,
                'status': 'failed',
                'error': str(e)
            })
        }
Cost Breakdown
Here's what this actually costs to run:
Compare that to enterprise sales intelligence tools that charge $500+/seat/month. This system costs less than one user license and covers the entire team.
Notion Field Updates
The worker updates AI-managed fields in Notion (all prefixed with [Auto] to distinguish from user data):
from datetime import date
from typing import Dict

from notion_client import Client

# ACTION_MAP (risk level -> recommended action text) is defined elsewhere in the project.

def sync_to_notion(page_id: str, enrichment: Dict, risk: Dict, api_key: str) -> bool:
    """
    Update Notion page with enrichment results.
    Only updates AI-managed fields, never overwrites user data.
    """
    notion = Client(auth=api_key)

    # RiskAssessment exposes the level as `level`; accept either key.
    risk_level = risk.get('risk_level') or risk.get('level')
    # `reasoning` may be absent; fall back to an empty note. Capped at 2000 characters.
    reasoning = (risk.get('reasoning') or '')[:2000]
    sentiment = enrichment.get('sentiment')

    properties = {
        # Company intel (`or ''` guards against None from the Optional schema fields)
        "[Auto] Employee Count": {"rich_text": [{"text": {"content": enrichment.get('employee_count') or ''}}]},
        "[Auto] Industry": {"rich_text": [{"text": {"content": enrichment.get('industry') or ''}}]},
        "[Auto] Company Size": {"rich_text": [{"text": {"content": enrichment.get('company_size') or ''}}]},

        # Risk assessment
        "[Auto] Risk Level": {"select": {"name": risk_level} if risk_level else None},
        "[Auto] News Sentiment": {"select": {"name": sentiment} if sentiment else None},
        "[Auto] Enrichment Notes": {"rich_text": [{"text": {"content": reasoning}}]},

        # Signals (boolean flags)
        "[Auto] Has Layoffs": {"checkbox": bool(enrichment.get('has_layoffs', False))},
        "[Auto] Has M&A": {"checkbox": bool(enrichment.get('has_ma', False))},
        "[Auto] Is Expanding": {"checkbox": bool(enrichment.get('is_expanding', False))},

        # Metadata
        "[Auto] Last Enriched": {"date": {"start": date.today().isoformat()}},
    }

    # Create action if needed
    if risk_level in ['CRITICAL', 'HIGH']:
        action = ACTION_MAP.get(risk_level, 'Review Account')
        properties["[Auto] Recommendation Action"] = {"rich_text": [{"text": {"content": action}}]}
        properties["[Auto] Recommendation Status"] = {"select": {"name": "Pending"}}
        properties["[Auto] Recommendation Reasoning"] = {"rich_text": [{"text": {"content": reasoning}}]}

    notion.pages.update(page_id=page_id, properties=properties)
    return True
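The property payloads above repeat the same nested shapes. A small sketch of builder helpers, assuming the rich_text, select, and checkbox payload formats used in the listing and the same 2000-character cap on text content; the helper names are illustrative and not part of the Notion SDK:

```python
from typing import Optional

NOTION_TEXT_LIMIT = 2000  # same cap the listing applies with [:2000]

def rich_text(value: Optional[str]) -> dict:
    """Build a rich_text property; None becomes an empty string, long text is truncated."""
    content = (value or "")[:NOTION_TEXT_LIMIT]
    return {"rich_text": [{"text": {"content": content}}]}

def select(value: Optional[str]) -> dict:
    """Build a select property; a missing value clears the select."""
    return {"select": {"name": value} if value else None}

def checkbox(value: object) -> dict:
    """Build a checkbox property from any truthy or falsy value."""
    return {"checkbox": bool(value)}

properties = {
    "[Auto] Industry": rich_text(None),
    "[Auto] Risk Level": select("HIGH"),
    "[Auto] Has Layoffs": checkbox(True),
    "[Auto] Enrichment Notes": rich_text("x" * 5000),
}
print(len(properties["[Auto] Enrichment Notes"]["rich_text"][0]["text"]["content"]))  # 2000
```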
The FastAPI Backend: REST Interface for Agent Integration
The backend exposes the enrichment data and action management through clean REST endpoints. This became the foundation for the agent integration.
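# main.py
from datetime import date, datetime

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

# notion_service, the response models (ActionItem, PortfolioSummary, ...), and helpers
# such as RISK_PRIORITY and trigger_worker are defined elsewhere in the project.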

app = FastAPI(
    title="Sales Intelligence API",
    description="Portfolio intelligence and action management",
    version="1.0.0"
)

# ============ Actions Endpoints ============

@app.get("/actions/pending", response_model=list[ActionItem])
async def get_pending_actions(
    risk_level: str | None = None,
    limit: int = 20
):
    """
    Get pending action items, optionally filtered by risk level.
    Returns actions sorted by priority (CRITICAL first).
    """
    actions = await notion_service.get_actions(
        status="pending",
        risk_level=risk_level,
        limit=limit
    )
    return sorted(actions, key=lambda a: RISK_PRIORITY[a.risk_level])

@app.post("/actions/{action_id}/complete")
async def complete_action(
    action_id: str,
    outcome: ActionOutcome
):
    """
    Mark an action as complete with outcome tracking.
    """
    await notion_service.update_action(
        action_id=action_id,
        status="completed",
        outcome=outcome.result,
        notes=outcome.notes,
        completed_at=datetime.now()
    )
    return {"status": "completed", "action_id": action_id}

# ============ Insights Endpoints ============

@app.get("/insights/portfolio-summary", response_model=PortfolioSummary)
async def get_portfolio_summary():
    """
    Executive overview of portfolio health.
    """
    companies = await notion_service.get_all_companies()
    
    return PortfolioSummary(
        total_companies=len(companies),
        total_arr=sum(c.arr for c in companies if c.arr),
        risk_distribution={
            level: len([c for c in companies if c.risk_level == level])
            for level in ['CRITICAL', 'HIGH', 'WATCH', 'STABLE', 'OPPORTUNITY']
        },
        # Guard against division by zero when the database is empty
        enrichment_coverage=(len([c for c in companies if c.last_enriched]) / len(companies)) if companies else 0.0,
        pending_actions=await notion_service.count_pending_actions()
    )

@app.get("/insights/churn-risk", response_model=list[ChurnRiskAccount])
async def get_churn_risk_accounts(
    quarter: str | None = None,
    min_arr: float | None = None
):
    """
    Get accounts with elevated churn risk, filtered by renewal quarter and ARR.
    """
    companies = await notion_service.get_companies(
        risk_levels=['CRITICAL', 'HIGH'],
        renewal_quarter=quarter,
        min_arr=min_arr
    )
    
    return [
        ChurnRiskAccount(
            company_name=c.name,
            arr=c.arr,
            risk_level=c.risk_level,
            risk_evidence=c.risk_evidence,
            renewal_date=c.renewal_date,
            days_to_renewal=(c.renewal_date - date.today()).days if c.renewal_date else None,
            recommended_action=get_recommended_action(c.risk_level)
        )
        for c in companies
    ]

@app.get("/insights/expansion-opportunities", response_model=list[ExpansionOpportunity])
async def get_expansion_opportunities(
    min_employees: int | None = None,
    industry: str | None = None
):
    """
    Get accounts showing growth signals suitable for upsell.
    """
    companies = await notion_service.get_companies(
        risk_levels=['OPPORTUNITY'],
        min_employees=min_employees,
        industry=industry
    )
    
    return [
        ExpansionOpportunity(
            company_name=c.name,
            current_arr=c.arr,
            employee_count=c.employee_count,
            growth_signals=[e for e in c.events if e.event_type in GROWTH_EVENTS],
            expansion_potential=estimate_expansion_potential(c)
        )
        for c in companies
    ]

# ============ Company Endpoints ============

@app.get("/companies/{company_id}", response_model=CompanyDetail)
async def get_company_detail(company_id: str):
    """
    Get full detail for a single company including enrichment data.
    """
    company = await notion_service.get_company(company_id)
    if not company:
        raise HTTPException(status_code=404, detail="Company not found")
    return company

@app.post("/companies/{company_id}/enrich")
async def trigger_enrichment(company_id: str):
    """
    Manually trigger enrichment for a specific company.
    """
    company = await notion_service.get_company(company_id)
    if not company:
        raise HTTPException(status_code=404, detail="Company not found")
    
    # Invoke worker Lambda directly
    await trigger_worker(company_id, company.name, company.domain)
    
    return {"status": "enrichment_triggered", "company": company.name}
The OpenAPI spec generated from these endpoints becomes the foundation for the agent integration.
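The arithmetic behind /insights/portfolio-summary can be checked in isolation. A sketch using a plain dataclass in place of the Notion-backed company model; the Company fields mirror the attributes the endpoint reads, the sample accounts are made up, and the function guards against an empty database:

```python
from dataclasses import dataclass
from typing import Optional

RISK_LEVELS = ["CRITICAL", "HIGH", "WATCH", "STABLE", "OPPORTUNITY"]

# Hypothetical stand-in for the company model returned by notion_service.
@dataclass
class Company:
    name: str
    arr: Optional[float]
    risk_level: Optional[str]
    last_enriched: Optional[str]

def summarize(companies: list[Company]) -> dict:
    """Compute the same aggregates as the portfolio-summary endpoint."""
    total = len(companies)
    enriched = sum(1 for c in companies if c.last_enriched)
    return {
        "total_companies": total,
        "total_arr": sum(c.arr for c in companies if c.arr),
        "risk_distribution": {
            level: sum(1 for c in companies if c.risk_level == level)
            for level in RISK_LEVELS
        },
        # Share of records that have been enriched at least once
        "enrichment_coverage": enriched / total if total else 0.0,
    }

companies = [
    Company("Acme", 120000.0, "HIGH", "2025-01-10"),
    Company("Globex", None, "STABLE", None),
    Company("Initech", 80000.0, "HIGH", "2025-01-12"),
    Company("Umbrella", 50000.0, None, "2025-01-12"),
]
summary = summarize(companies)
print(summary["total_arr"], summary["enrichment_coverage"])  # 250000.0 0.75
```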
Agent Integration
Meet Prospector: Where Users Actually Work
Here's the thing about enterprise tools: nobody wants another application to check. Account managers live in Google Chat. They're not going to context-switch to a separate dashboard, no matter how beautiful it is.
We run an internal platform called BrainTrust, an AI agent orchestration system that meets users where they are. Instead of building a chatbot from scratch, I created a specialized agent called "Prospector" that plugs into BrainTrust and gives the sales team conversational access to the pipeline intelligence.
How BrainTrust Works
BrainTrust uses a concept called Hive: you upload an OpenAPI spec, and it automatically creates an MCP (Model Context Protocol) server. The AI agent can then call your API endpoints as tools, with full type safety and parameter validation.
┌─────────────────────────────────────────────────────────────────────────┐
│ USER INTERACTION LAYER                                                  │
│ ┌─────────────┐ ┌──────────────┐ ┌─────────────────────────┐          │
│ │ Google Chat │────│ Hive Agent   │────│ MCP Server (from OpenAPI)│     │
│ │   (User)    │ │ (Prospector) │ │ sales-pipeline API      │          │
│ └─────────────┘ └──────────────┘ └─────────────────────────┘          │
│        ↑                                                                │
│        │                                                                │
│        │ Other MCP Tools:                                              │
│        │ - Gmail (send emails)                                         │
│        │ - Google Docs (create docs)                                   │
│        │ - Calendar (schedule calls)                                   │
│        │                                                                │
└─────────────────────────────────────────────────────────────────────────┘
         │
         ▼
┌─────────────────────────────────────────────────────────────────────────┐
│ API LAYER (FastAPI)                                                     │
│ api_server.py - REST API on sales-pipeline.inceptapi.com               │
│                                                                         │
│ Endpoints:                                                              │
│ • /actions/pending - Get actions needing attention                     │
│ • /actions/company/{n} - Get actions for specific company              │
│ • /actions/{id}/approve - Approve action + track execution             │
│ • /actions/{id}/deny - Deny action with reason                         │
│ • /insights/churn-risk - At-risk accounts by quarter                   │
│ • /insights/portfolio-summary - Executive overview                     │
│ • /enrich - Trigger AI enrichment                                      │
└─────────────────────────────────────────────────────────────────────────┘
The key insight: I didn't build an agent platform. I built an API and leveraged BrainTrust to make it conversational.
OpenAPI → MCP: Zero Agent Code
Here's an excerpt of the OpenAPI spec that defines what Prospector can do (response definitions and some endpoints are omitted for brevity):
# openapi.yaml (uploaded to BrainTrust Hive)
openapi: 3.0.3
info:
  title: Sales Pipeline Actions API
  description: |
    API for managing AI-driven actions in the ESW Sales Pipeline.
    Portfolio intelligence and action management capabilities:
    - Portfolio enrichment with external intelligence (Firecrawl AI)
    - AI-driven risk analysis and action recommendations
    - Human-in-the-loop action approval workflow
    - Portfolio insights and analytics
    
    **Production URL:** https://sales-pipeline.inceptapi.com
  version: 1.0.0

paths:
  /actions/pending:
    get:
      summary: Get all pending actions
      description: |
        Get all pending actions that need attention.
        Returns entries with AI Actions that are not yet Completed or Denied.
      operationId: getPendingActions
      parameters:
        - name: limit
          in: query
          schema:
            type: integer
            default: 100

  /actions/{page_id}/approve:
    post:
      summary: Approve a pending action
      description: |
        Approve a pending action and mark it as completed.
        Optionally provide execution details to track what was done.
      operationId: approveAction
      parameters:
        - name: page_id
          in: path
          required: true
          schema:
            type: string
      requestBody:
        content:
          application/json:
            schema:
              type: object
              properties:
                execution_type:
                  type: string
                  description: What happened (e.g., "Email Sent", "Call Scheduled")
                execution_link:
                  type: string
                  description: Link to artifact (e.g., Google Doc URL)

  /insights/churn-risk:
    get:
      summary: Get churn risk analysis
      description: |
        Get companies at risk of churning in specified timeframe.
        Sorted by risk severity (CRITICAL first), then by ARR descending.
      operationId: getChurnRisk
      parameters:
        - name: quarter
          in: query
          description: Quarter like Q1-2025
          schema:
            type: string
            pattern: '^Q[1-4]-\d{4}$'

  /enrich:
    post:
      summary: Trigger portfolio enrichment
      description: |
        Enriches companies with Firecrawl AI (news, sentiment, risk signals)
        and automatically creates AI Actions in Notion for high-risk accounts.
      operationId: enrichPortfolio
      requestBody:
        content:
          application/json:
            schema:
              type: object
              properties:
                company_name:
                  type: string
                  description: Single company to enrich
Upload this to BrainTrust Hive, and you get an MCP server. The agent automatically knows:
What tools are available (getPendingActions, approveAction, getChurnRisk, etc.)
What parameters each tool accepts
What the response schemas look like
No LangChain. No custom tool definitions. No agent code at all.
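To make the mapping concrete, here is a minimal sketch of how an OpenAPI operation can be turned into a tool description. It is not BrainTrust Hive's implementation, which isn't shown in this article. The `name`, `description`, and `inputSchema` keys mirror the fields MCP uses to describe a tool; the `method` and `path` keys, the `spec_to_tools` function, and the choice to flatten body fields into the same argument object are assumptions made for illustration.

```python
from typing import Any

# Two operations copied by hand from the spec above, as a dict so the
# sketch needs no YAML parser.
SPEC: dict[str, Any] = {
    "paths": {
        "/actions/pending": {
            "get": {
                "operationId": "getPendingActions",
                "summary": "Get all pending actions",
                "parameters": [
                    {"name": "limit", "in": "query",
                     "schema": {"type": "integer", "default": 100}},
                ],
            }
        },
        "/actions/{page_id}/approve": {
            "post": {
                "operationId": "approveAction",
                "summary": "Approve a pending action",
                "parameters": [
                    {"name": "page_id", "in": "path", "required": True,
                     "schema": {"type": "string"}},
                ],
                "requestBody": {"content": {"application/json": {"schema": {
                    "type": "object",
                    "properties": {
                        "execution_type": {"type": "string"},
                        "execution_link": {"type": "string"},
                    },
                }}}},
            }
        },
    }
}

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def spec_to_tools(spec: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn each OpenAPI operation into a tool description (hypothetical shape)."""
    tools = []
    for path, item in spec["paths"].items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue
            properties: dict[str, Any] = {}
            required: list[str] = []
            # Path and query parameters become top-level tool arguments.
            for param in op.get("parameters", []):
                properties[param["name"]] = param.get("schema", {})
                if param.get("required"):
                    required.append(param["name"])
            # JSON body fields are flattened into the same argument object.
            body = (op.get("requestBody", {}).get("content", {})
                    .get("application/json", {}).get("schema", {}))
            properties.update(body.get("properties", {}))
            required.extend(body.get("required", []))
            tools.append({
                "name": op["operationId"],
                "description": op.get("summary", ""),
                "method": method.upper(),
                "path": path,
                "inputSchema": {"type": "object", "properties": properties,
                                "required": required},
            })
    return tools


tools = spec_to_tools(SPEC)
for tool in tools:
    print(tool["name"], sorted(tool["inputSchema"]["properties"]))
```

The point of the sketch is that everything the agent needs (tool name, description, argument schema) is already in the spec, which is why no hand-written tool definitions are required.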
How It Actually Works
When a user asks a question in Google Chat, here's the flow:
User (Google Chat): "Are there any actions to take?"
         ↓
Hive Agent: Calls getPendingActions via MCP
         ↓
API: Returns list of pending actions with reasoning
         ↓
Agent: "Yes, Acme Corp has a 'Schedule Call' action. Risk level is HIGH due to budget cuts mentioned in recent news. Would you like me to schedule a call?"
         ↓
User: "Yes, schedule it for next Tuesday"
         ↓
Agent: Uses Calendar MCP tool to schedule call
         ↓
Agent: Calls approveAction with execution_type="Call Scheduled"
         ↓
Notion: Action History updated, ready for next action
The agent has access to multiple MCP tools: not just the sales pipeline, but also Gmail, Google Docs, and Calendar. This means it can execute the full workflow: identify the risk → draft an email → schedule a call → mark the action complete.
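A rough sketch of that flow, with in-memory stubs standing in for the sales-pipeline API and the Calendar tool. The function names, the `Action` fields, the fake calendar link format, and the sample record are all assumptions for illustration. In the real system the agent decides which tools to call; here the sequence is hard-coded so the state changes are easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    page_id: str
    company: str
    action: str
    risk: str
    status: str = "Pending"
    history: list[str] = field(default_factory=list)


# In-memory stand-in for the Notion-backed API; the record is made up.
ACTIONS = {
    "page-1": Action("page-1", "Acme Corp", "Schedule Call", "HIGH"),
}


def get_pending_actions() -> list[Action]:
    """Stub for getPendingActions: everything not yet Completed or Denied."""
    return [a for a in ACTIONS.values() if a.status not in ("Completed", "Denied")]


def schedule_call(company: str, when: str) -> str:
    """Stub for a Calendar tool; returns a fake event reference."""
    return f"calendar://{company.lower().replace(' ', '-')}/{when}"


def approve_action(page_id: str, execution_type: str, execution_link: str = "") -> Action:
    """Stub for approveAction: mark completed and record what was done."""
    action = ACTIONS[page_id]
    action.status = "Completed"
    action.history.append(f"{execution_type} {execution_link}".strip())
    return action


def handle_turn(user_confirms: bool, when: str) -> str:
    """One pass through the flow: look up, report, act only after confirmation."""
    pending = get_pending_actions()
    if not pending:
        return "No pending actions."
    action = pending[0]
    if not user_confirms:
        return f"{action.company} has a '{action.action}' action ({action.risk} risk)."
    link = schedule_call(action.company, when)
    approve_action(action.page_id, "Call Scheduled", link)
    return f"Scheduled and marked complete: {action.company}"


print(handle_turn(user_confirms=False, when="2025-01-14"))  # agent reports the action
print(handle_turn(user_confirms=True, when="2025-01-14"))   # user says yes
print(handle_turn(user_confirms=True, when="2025-01-14"))   # nothing left to do
```

Note that the action only moves to Completed after the user confirms, which matches the approve step in the flow above.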
The Compound Effect of MCP
This is where the architecture pays off. Because everything is exposed as MCP tools, the agent can compose actions:
User: "What needs my attention today?"
Prospector:
Calls getPendingActions → 3 items
Calls getChurnRisk(quarter="Q1-2025") → 2 additional accounts
Synthesizes: "You have 5 accounts needing attention. The highest priority is Acme Corp (CRITICAL risk, $150K ARR, renewal in 45 days). They announced budget cuts last week. Want me to draft a cost-optimization pitch?"
User: "Yes, draft something"
Prospector:
Calls getActionsForCompany("Acme Corp") → Gets full context
Uses Gmail MCP to draft email with cost-optimization messaging
"Here's a draft email. I've highlighted the efficiency gains from consolidation. Send it?"
User: "Send it and mark the action done"
Prospector:
Sends via Gmail MCP
Calls approveAction with execution_type="Email Sent", execution_link="<gmail link>"
"Done. Action marked complete. Next up is TechCorp..."
The sales team isn't learning a new tool. They're having a conversation that happens to orchestrate five different systems.
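The "what needs my attention" step boils down to merging two result sets and ranking them. Here is a minimal sketch of that composition, using the ordering the spec describes for churn risk (severity first, then ARR descending) and the spec's `quarter` pattern. The `Account` shape, the severity ranking, the dedupe rule, and every ARR figure other than Acme Corp's $150K are assumptions for illustration, not the production logic.

```python
import re
from typing import TypedDict

# Same pattern as the `quarter` parameter in the spec.
QUARTER_RE = re.compile(r"^Q[1-4]-\d{4}$")
# Assumed ranking of the risk levels that appear in this article.
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "WATCH": 2}


class Account(TypedDict):
    company: str
    risk: str
    arr: int  # annual recurring revenue in USD


def check_quarter(quarter: str) -> str:
    """Reject quarter strings the API would not accept."""
    if not QUARTER_RE.fullmatch(quarter):
        raise ValueError(f"quarter must look like Q1-2025, got {quarter!r}")
    return quarter


def merge_and_rank(pending: list[Account], churn: list[Account]) -> list[Account]:
    """Combine both result sets, dedupe by company, sort by severity then ARR."""
    by_company: dict[str, Account] = {}
    for account in pending + churn:
        current = by_company.get(account["company"])
        # Keep the more severe entry if a company appears in both lists.
        if current is None or SEVERITY_ORDER[account["risk"]] < SEVERITY_ORDER[current["risk"]]:
            by_company[account["company"]] = account
    return sorted(by_company.values(),
                  key=lambda a: (SEVERITY_ORDER[a["risk"]], -a["arr"]))


# Made-up results standing in for getPendingActions and getChurnRisk.
pending: list[Account] = [
    {"company": "Acme Corp", "risk": "CRITICAL", "arr": 150_000},
    {"company": "TechCorp", "risk": "HIGH", "arr": 80_000},
    {"company": "Global Dynamics", "risk": "WATCH", "arr": 40_000},
]
churn: list[Account] = [
    {"company": "Acme Corp", "risk": "HIGH", "arr": 150_000},  # duplicate
    {"company": "Zenith Holdings", "risk": "CRITICAL", "arr": 90_000},
    {"company": "Innovate Solutions", "risk": "HIGH", "arr": 120_000},
]

check_quarter("Q1-2025")
ranked = merge_and_rank(pending, churn)
for account in ranked:
    print(account["risk"], account["company"], account["arr"])
```

In practice the agent does this synthesis in natural language rather than in code, but the ranking rule it is expected to follow is the same.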
Real Interaction Examples
Query: "What needs my attention today?"
Prospector: You have 4 pending actions across 3 priority levels:

1. CRITICAL: Acme Corp (churn risk)
   - Action: Follow up on recent budget cuts.
   - AI Reasoning: Firecrawl AI detected budget cut news; increased risk score.

2. HIGH: Innovate Solutions (expansion opportunity)
   - Action: Prepare upsell proposal for new product line.
   - AI Reasoning: LinkedIn Sales Navigator indicates new funding round and hiring surge.

3. WATCH: Global Dynamics (renewal upcoming)
   - Action: Check in with account, ensure satisfaction.
   - AI Reasoning: No recent activity or sentiment changes, proactive outreach recommended.

4. WATCH: Zenith Holdings (stalled deal)
   - Action: Re-engage with decision-maker after recent acquisition news.
   - AI Reasoning: Google Alerts shows acquisition, potential change in leadership.

Would you like to prioritize any of these, or should I draft a summary email for your manager?
Query: "Tell me about Acme Corp"
Prospector: Acme Corp is a CRITICAL churn risk account, with an ARR of $150K, and their renewal is in 45 days.

Key Risk Factors:
- Recent news of budget cuts (Firecrawl AI)
- Executive leadership changes (LinkedIn)
- Decreased product usage observed over last 2 months (internal data)

Last Action: Follow-up email sent on 2024-03-10 by John Doe.

Recommended Action: Schedule a call to discuss cost optimization and value alignment. Would you like me to draft a meeting invite?
Query: "Schedule a call for Acme Corp next Tuesday at 2 PM PST"
Prospector: Got it. Scheduling a 30-minute meeting with Acme Corp for next Tuesday (2024-03-26) at 2 PM PST. I'll include a placeholder agenda to discuss "Optimizing Your Investment with [Our Product]".

Confirm scheduling?
Multi-Agent Collaboration
The BrainTrust platform supports multiple agents. Prospector is specialized for sales intelligence, but it frequently collaborates with other internal agents to complete tasks.
Prospector (Sales Agent): Identifies a high-value account for upsell.
Marketing Agent: Generates a personalized case study based on Prospector's findings.
Support Agent: Provides a summary of recent support tickets and satisfaction scores for the account.
Prospector: Synthesizes this information into a comprehensive upsell pitch for the account manager.
This allows for complex, cross-functional workflows without manual coordination.
Where Users Actually Interact: Google Chat
Integrating directly into Google Chat is crucial for adoption. Account managers spend their day in chat for quick, informal communication. By bringing the sales intelligence directly to them, we eliminate friction.
Instant Access: No need to open a new tab or application.
Contextual Conversations: Sales data and actions are discussed in the same channels where team collaboration happens.
Workflow Automation: Complex tasks are initiated and tracked from a familiar interface.
Notifications: Proactive alerts about high-risk accounts or new opportunities appear directly in their chat feed.
This approach keeps the learning curve low, which gives the assistant a realistic chance of becoming a routine part of the sales workflow.
Status Update
Where We Are Now
Full disclosure: I'm writing this article the week we started demonstrating the system. I don't have six months of production metrics to share. What I have is a working pipeline, a functional agent, and a sales team that's genuinely excited to use it.
What's Built:
Enrichment pipeline running daily against 600+ accounts
Risk scoring generating evidence-based flags
FastAPI backend exposing all intelligence via REST
Prospector agent answering natural language queries in Google Chat
Action management workflow for human-in-the-loop approvals
What's Happening This Week:
Tomorrow I'm meeting with the sales team to walk through the system end-to-end. We'll look at real enrichment data, review the risk flags it's generating, and discuss what's working vs. what needs adjustment.
The skeleton is there. But sales intelligence isn't something you build in isolation. The risk scoring weights? Those need validation from people who actually know which signals predict churn. The action recommendations? Those need input from reps who understand what "reach out" actually means in context. The enrichment fields? Those might need additions I haven't thought of.
Early Signals:
The team's reaction so far has been encouraging. When I demoed the agent answering questions about portfolio risk, the response wasn't "that's cool" but "when can I start using this?" That's the reaction you want.
The real test comes when it's in daily use. Does the risk scoring surface actual problems? Do the action recommendations feel useful or noisy? Does the agent save time or create friction? Those answers come from usage, not demos.
What I'm Watching:
Which risk signals correlate with actual renewal outcomes
How often AI-suggested actions get approved vs. modified vs. rejected
What questions the team asks that the system can't answer yet
Where the enrichment data is wrong or incomplete
I'll follow up with real results once we have them. For now, this is a story about building the thing, not measuring its impact.
The Human-AI Handoff Model
The key design decision throughout this project: AI recommends, humans decide.
Prospector doesn't send emails without approval. It doesn't mark actions complete without confirmation. It doesn't update customer records with speculative information. Every output is a draft, a suggestion, a data point for human judgment.
The workflow:
1. AI identifies opportunity or risk (based on evidence)
2. AI researches context and drafts response
3. Human reviews, adjusts, approves
4. AI executes and tracks completion
5. Both learn from outcome
This creates a feedback loop where the AI gets better at context (we track which suggestions get approved vs. modified) and the human stays focused on relationship work.
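The gate in steps 3 and 4, plus the approved-vs-modified tracking, can be sketched in a few lines. This is an illustration under assumptions, not Prospector's code: the `Suggestion` and `Decision` types, the `ApprovalRequired` exception, and the sample log are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    APPROVED = "approved"
    MODIFIED = "modified"
    REJECTED = "rejected"


@dataclass
class Suggestion:
    company: str
    draft: str
    decision: Optional[Decision] = None   # None means not yet reviewed
    final_text: Optional[str] = None


class ApprovalRequired(Exception):
    """Raised when a suggestion is executed without human sign-off."""


def review(s: Suggestion, decision: Decision, edited: Optional[str] = None) -> Suggestion:
    """Record the human decision; a modified draft carries the human's text."""
    if decision is Decision.MODIFIED and edited is None:
        raise ValueError("a modified suggestion needs the edited text")
    s.decision = decision
    s.final_text = {
        Decision.APPROVED: s.draft,
        Decision.MODIFIED: edited,
        Decision.REJECTED: None,
    }[decision]
    return s


def execute(s: Suggestion) -> str:
    """Only reviewed, non-rejected suggestions are ever sent."""
    if s.decision is None or s.decision is Decision.REJECTED:
        raise ApprovalRequired(f"{s.company}: not approved")
    return f"sent to {s.company}: {s.final_text}"


def decision_rates(log: list[Suggestion]) -> dict[str, float]:
    """Share of reviewed suggestions that were approved, modified, or rejected."""
    reviewed = [s for s in log if s.decision is not None]
    if not reviewed:
        return {}
    counts = Counter(s.decision for s in reviewed)
    return {d.value: counts[d] / len(reviewed) for d in Decision}


# Made-up review log for illustration.
log = [
    review(Suggestion("Acme Corp", "Cost-optimization pitch"), Decision.APPROVED),
    review(Suggestion("TechCorp", "Renewal check-in"), Decision.MODIFIED,
           "Renewal check-in, shorter"),
    review(Suggestion("Global Dynamics", "Upsell note"), Decision.REJECTED),
    review(Suggestion("Zenith Holdings", "Re-engagement email"), Decision.APPROVED),
    Suggestion("Innovate Solutions", "Upsell proposal"),  # not yet reviewed
]

print(execute(log[0]))
print(decision_rates(log))
```

Keeping `execute` unable to run without a recorded decision makes "AI recommends, humans decide" a property of the code path rather than a convention, and the rates give a simple signal for where drafts need work.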
What AI handles well:
Data gathering at scale
Pattern detection across portfolio
Draft generation with consistent structure
Task tracking and reminders
Presentation assembly
What humans handle better:
Reading between the lines of a negotiation
Navigating procurement politics
Deciding when to push vs. accommodate
Building genuine relationships
Handling exceptions and edge cases
The 80/20 rule applies: AI handles 80% of the information processing so humans can focus 80% of their time on the 20% that actually requires human judgment.
What's Next
1. Immediate (Next 30 Days)
Incorporate feedback from sales team on risk signal relevance
Tune scoring weights based on their domain knowledge
Add any enrichment fields they're missing
Track which action recommendations get approved vs. ignored
2. Near-Term (Q1)
Churn probability scoring using renewal countdown + risk trends
Expansion likelihood based on company growth signals
Calendar integration for automated meeting prep
Support ticket sentiment as a leading indicator
3. Future Exploration
LinkedIn hiring signals for stakeholder changes
Closed-loop learning: track prediction accuracy and tune weights
CRM sync for teams using Salesforce alongside Notion
The Bottom Line
AI can't close deals for you, but it can make sure you're having the right conversations with the right people at the right time.
The technical stack (Firecrawl + FastAPI + MCP + BrainTrust) is less important than the design principles:
Augment existing workflows: Don't ask people to change how they work.
Evidence over sentiment: Actionable signals require verifiable events.
Human-in-the-loop by default: AI recommends, humans approve.
API-first architecture: Everything should be accessible to agents.
The magic happens when AI amplifies human expertise instead of trying to replace it. Prospector doesn't make sales calls, but it can ensure every sales call is informed by the best available intelligence.
That's the difference between automation and intelligence augmentation.
Building in Public
This is a building-in-public story. I'll follow up with real adoption data and lessons learned once we've had the system in production for a few months. Subscribe if you want to see how it actually plays out.
Questions about the architecture? Running into similar challenges with sales intelligence? Drop a comment. I'm curious how others are approaching this problem.