Aethera
Engineering Case Study

How I Built Aethera

From raw web scraping to a high-performance, geolocation-aware directory. A look inside the 2026 HealthTech stack.

1

The Scraper Pipeline

The core challenge was acquiring clean, verified data from heavily protected directories like Psychology Today. I built a multi-stage Python pipeline that uses BeautifulSoup for static parsing and Playwright for JavaScript-heavy pages.

Key Metrics

  • 1,063 Unique Providers
  • Multiple States Scraped
  • Anti-bot bypass logic
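# Step 1: Python Scraper Mockup
from urllib.parse import quote_plus

from bs4 import BeautifulSoup

def scrape_region(region_url):
    # fetch_with_stealth is the pipeline's own fetch helper (not shown)
    html = fetch_with_stealth(region_url)
    soup = BeautifulSoup(html, 'lxml')
    # ... DOM parsing logic ...

async def enrich_provider(page, name, city, state):
    # Search Bing for the provider's own site (query format is an assumption)
    query = quote_plus(f"{name} {city} {state}")
    search_url = f"https://www.bing.com/search?q={query}"
    await page.goto(search_url)
    # ... select a result anchor `a` from the results page ...
    # Decode Base64 tracking links
    link = decode_bing_url(a.get('href'))

    # ... load `link` and collect its visible text as site_text ...
    # Extract phone numbers from the site's organic text via NLP
    phones = extract_valid_phones(site_text)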

Fig 1.5: OSINT Playwright Enrichment

2

OSINT Data Enrichment

Proxy directories intentionally mask provider phone numbers and trap clinic websites behind redirects. I work around this with a Playwright and NLP pipeline that searches Bing for each provider, decodes the Base64 tracking links in the results, and extracts organic text from the private practice's own site.
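
The two helpers named in the mockup, decode_bing_url and extract_valid_phones, are not shown in this write-up. A minimal sketch of what they might look like follows. It assumes Bing redirect links carry the destination in a `u` query parameter made of an `a1` prefix plus URL-safe Base64, and it swaps the NLP step for a plain regular expression over US-style numbers, so treat it as an illustration rather than the pipeline's actual code.

```python
import base64
import re
from urllib.parse import parse_qs, urlparse


def decode_bing_url(href: str) -> str:
    """Return the destination hidden in a Bing redirect link, or href unchanged."""
    params = parse_qs(urlparse(href).query)
    token = params.get("u", [""])[0]
    # Assumption: the token is "a1" followed by URL-safe Base64 without padding.
    if not token.startswith("a1"):
        return href
    payload = token[2:]
    payload += "=" * (-len(payload) % 4)  # restore stripped padding
    try:
        return base64.urlsafe_b64decode(payload).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return href


# US-style numbers only; area code and exchange must start with 2-9.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?1[\s.-]?)?\(?([2-9]\d{2})\)?[\s.-]?([2-9]\d{2})[\s.-]?(\d{4})(?!\d)"
)


def extract_valid_phones(text: str) -> list[str]:
    """Return unique phone numbers found in text, normalized to (AAA) EEE-LLLL."""
    found = []
    for area, exchange, line in PHONE_RE.findall(text):
        number = f"({area}) {exchange}-{line}"
        if number not in found:
            found.append(number)
    return found
```

Normalizing every match to one format before de-duplicating means the same number written as (503) 555-0142 and 503.555.0142 is stored once.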

Playwright · Base64 Decode · NLP Parsing
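// Haversine Radius Calculation
$sql = "SELECT id, name, lat, lng FROM providers";
// Assumes an open PDO connection in $pdo
$providers = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);
$radius = 10; // Miles
$filtered = [];

foreach ($providers as $p) {
    // Skip rows without coordinates; a null check keeps 0.0 valid
    if ($p['lat'] === null || $p['lng'] === null) continue;
    // haversine() is expected to return the distance in miles
    $dist = haversine($userLat, $userLng, $p['lat'], $p['lng']);
    if ($dist <= $radius) {
        $filtered[] = $p;
    }
}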

Fig 2.0: SQLite-Compatible Radial Search Logic

3

Spatial Intelligence

Most directories fail at location because they rely on zip codes. I mapped every city to high-precision latitude/longitude coordinates (city_coords.py). This enables the 10-mile radius search via the Haversine formula, so results are filtered by actual distance from the user rather than by zip-code match, which makes for a better customer experience (CX).
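
The Haversine formula itself is not shown in this write-up, so here is a minimal Python sketch of it together with a radius filter. The function providers_within, the dictionary keys, and the Earth radius constant (mean radius, about 3958.8 miles) are illustrative assumptions, not the contents of city_coords.py.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumed constant


def haversine(lat1, lng1, lat2, lng2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lng2 - lng1)
    # a = sin^2(d_phi / 2) + cos(phi1) * cos(phi2) * sin^2(d_lambda / 2)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def providers_within(providers, user_lat, user_lng, radius_miles=10.0):
    """Return providers whose coordinates lie within radius_miles of the user."""
    nearby = []
    for p in providers:
        # Skip rows with missing coordinates; 0.0 is still a valid value.
        if p.get("lat") is None or p.get("lng") is None:
            continue
        if haversine(user_lat, user_lng, p["lat"], p["lng"]) <= radius_miles:
            nearby.append(p)
    return nearby
```

Filtering in application code like this works with any database, at the cost of scanning every row on each search, which is reasonable for a directory of roughly a thousand providers.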

Accuracy

~0.1 Miles

Load Time

< 40ms

4

Premium Design Language

Clinical directories are often cold and complex. Aethera uses a custom HealthTech Glassmorphism design system that pairs high-contrast typography (Playfair Display) with a functional 8px grid.

OHP Green Engine

Specific color-coding (Sage & Emerald) for Oregon Health Plan providers, making low-cost care easily visible.

Adaptive Funnels

Therapists take center stage; experimental "Treatments" are hidden deeper to ensure high-intent medical browsing.

Shadow Profiles

A dual-database approach that maps scraped public data into claimable "Shadow Profiles" for provider conversion.
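
One way to picture the dual-database idea is a scraped-data store feeding an application store, where each copied row starts unclaimed. The sketch below uses two SQLite connections; the table names, columns, and both function names are hypothetical and only illustrate the mapping and the claim step.

```python
import sqlite3


def build_shadow_profiles(scraped: sqlite3.Connection, app: sqlite3.Connection) -> None:
    """Copy scraped rows into the app database as unclaimed profiles."""
    app.execute(
        "CREATE TABLE IF NOT EXISTS profiles ("
        " id INTEGER PRIMARY KEY, source_id INTEGER UNIQUE,"
        " name TEXT, city TEXT, claimed_by TEXT)"
    )
    rows = scraped.execute(
        "SELECT id, name, city FROM scraped_providers"
    ).fetchall()
    # INSERT OR IGNORE keeps re-runs from duplicating existing profiles.
    app.executemany(
        "INSERT OR IGNORE INTO profiles (source_id, name, city) VALUES (?, ?, ?)",
        rows,
    )
    app.commit()


def claim_profile(app: sqlite3.Connection, profile_id: int, provider_email: str) -> bool:
    """Mark a profile as claimed; returns False if it is missing or already claimed."""
    cur = app.execute(
        "UPDATE profiles SET claimed_by = ? WHERE id = ? AND claimed_by IS NULL",
        (provider_email, profile_id),
    )
    app.commit()
    return cur.rowcount == 1
```

Keeping the claim check inside the UPDATE statement means two providers cannot both claim the same profile.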

Recent Shipping & Engineering Roadmap

Stripe Monetization & Onboarding

Implemented multi-tier subscriptions, a secure provider dashboard, and clean dynamic URL routing.

2

Dynamic SEO Sitemap & NLP Tools

Building out automated AI tools to rewrite bios and generate local search pages for "Therapist in [City]".
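
As a rough sketch of the sitemap half of this item, the snippet below turns a list of cities into "Therapist in [City]" page URLs and a sitemap document. The /therapists/state/city URL pattern and all function names are assumptions made for illustration; the real routing scheme is not described here.

```python
import re
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def slugify(text: str) -> str:
    """Lowercase text and collapse non-alphanumeric runs into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def city_page_url(base_url: str, city: str, state: str) -> str:
    """Build the (hypothetical) landing-page URL for one city."""
    return f"{base_url.rstrip('/')}/therapists/{slugify(state)}/{slugify(city)}"


def build_sitemap(base_url: str, cities: list[tuple[str, str]]) -> str:
    """Return sitemap XML with one <url><loc> entry per (city, state) pair."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for city, state in cities:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = city_page_url(base_url, city, state)
    return ET.tostring(urlset, encoding="unicode")
```

Generating the sitemap from the same city list that drives the radius search would keep the two in sync as new cities are scraped.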

3

Review Engine

A HIPAA-compliant patient feedback loop to verify quality of care across the network.