A Developer's Guide to the openFDA API

The openFDA API is one of the most valuable free resources available to healthcare technology developers. Launched in 2014, it provides programmatic access to millions of records across drug labels, adverse event reports, product recalls, device complaints, and food enforcement actions. For teams building formulary management systems, clinical decision support tools, or drug safety monitoring platforms, it is an essential data source.

This guide covers the practical details: what endpoints are available, how the query syntax works, what the rate limits are, and the patterns we have found most effective when building production systems on top of it.

Available Endpoints

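The openFDA API is organized by domain (drugs, devices, food, tobacco, animal and veterinary, and others), each with multiple endpoints. The drug endpoints are the most relevant to this guide: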

Drug Label (/drug/label) - Structured Product Labeling (SPL) data for marketed drugs. This is the richest drug endpoint, containing indications, contraindications, warnings, dosage information, and pharmacology data.
Drug Adverse Events (/drug/event) - Individual case safety reports from the FDA Adverse Event Reporting System (FAERS). Over 20 million records and growing.
Drug Recalls (/drug/enforcement) - Recall and enforcement action data, including classification, reason, and distribution information.
Drug NDC (/drug/ndc) - National Drug Code directory data including packaging, route, and active ingredients.
Device, Food, and other domains - Similar patterns for medical devices, food safety, tobacco, and animal/veterinary data.

Query Syntax

All openFDA endpoints use a common query syntax based on Elasticsearch. The base URL pattern is:

https://api.fda.gov/drug/label.json?search=QUERY&limit=N

The search parameter accepts field-level queries using the format field:value. Here is an example that fetches drug label data for atorvastatin:

# Fetch labels containing "atorvastatin" in the generic name field
curl "https://api.fda.gov/drug/label.json?\
search=openfda.generic_name:atorvastatin&\
limit=5"

You can combine conditions with AND and OR operators:

# Find serious adverse events for metformin (to restrict to deaths, use seriousnessdeath:1 instead of serious:1)
curl "https://api.fda.gov/drug/event.json?\
search=patient.drug.medicinalproduct:metformin+AND+\
serious:1&\
limit=10"
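When queries are assembled in code rather than typed by hand, it helps to build the search expression in one place. The following is a minimal Python sketch, not part of any official client: build_search_url and its parameters are hypothetical names chosen for illustration. It joins field:value pairs with +AND+ or +OR+ and wraps string values in percent-encoded double quotes so that multi-word values are matched as a phrase.

```python
from urllib.parse import quote

BASE_URL = "https://api.fda.gov"


def build_search_url(endpoint, terms, operator="AND", limit=10):
    """Build an openFDA query URL from (field, value) pairs.

    `endpoint` is a path such as "drug/event"; `terms` is a list of
    (field, value) tuples. Illustrative helper, not an official API.
    """
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")

    parts = []
    for field, value in terms:
        if isinstance(value, str):
            # Quote string values so multi-word terms match as a phrase;
            # quote() percent-encodes the double quotes and any spaces.
            encoded = quote(f'"{value}"')
        else:
            # Leave numeric flags such as serious:1 unquoted.
            encoded = str(value)
        parts.append(f"{field}:{encoded}")

    search = f"+{operator}+".join(parts)
    return f"{BASE_URL}/{endpoint}.json?search={search}&limit={limit}"


if __name__ == "__main__":
    print(build_search_url(
        "drug/event",
        [("patient.drug.medicinalproduct", "metformin"), ("serious", 1)],
    ))
```

Keeping the encoding in one helper avoids the most common failure with hand-built URLs, which is an unescaped space or quote in a drug name.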

Counting and Aggregation

The count parameter lets you aggregate results without fetching individual records. This is extremely useful for analytics dashboards:

# Count adverse events by reaction for lisinopril
curl "https://api.fda.gov/drug/event.json?\
search=patient.drug.medicinalproduct:lisinopril&\
count=patient.reaction.reactionmeddrapt.exact"

This returns a ranked list of reaction terms and their frequencies, which is exactly the kind of data a formulary team needs when evaluating drug safety profiles.
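A count response carries its rows under a results key, each with a term and a count. As a rough sketch of how a dashboard might post-process that, the hypothetical helper below (reaction_shares is our own name) converts counts into each term's share of the returned rows. The sample payload is made up for illustration and is not real FAERS data.

```python
def reaction_shares(count_response, top_n=5):
    """Return (term, count, share) tuples for the top_n terms.

    `share` is the term's fraction of the counts present in this response
    only; it is not an incidence rate.
    """
    results = count_response.get("results", [])
    total = sum(row["count"] for row in results)
    if total == 0:
        return []
    ranked = sorted(results, key=lambda row: row["count"], reverse=True)
    return [(row["term"], row["count"], row["count"] / total)
            for row in ranked[:top_n]]


if __name__ == "__main__":
    # Made-up payload in the shape of a count response
    sample = {
        "results": [
            {"term": "COUGH", "count": 60},
            {"term": "DIZZINESS", "count": 30},
            {"term": "NAUSEA", "count": 10},
        ]
    }
    for term, count, share in reaction_shares(sample, top_n=2):
        print(f"{term}: {count} reports ({share:.0%} of returned rows)")
```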

Rate Limits and API Keys

Without an API key, openFDA allows 240 requests per minute and 1,000 requests per day, per IP address. With a free API key (available at open.fda.gov), the per-minute limit stays at 240 but the daily limit rises to 120,000 requests per key. For production systems, an API key is effectively mandatory. Pass it as the api_key query parameter:

https://api.fda.gov/drug/label.json?api_key=YOUR_KEY&search=...
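To stay under the per-minute limit, a client can space out its own requests instead of waiting for the server to reject them. The class below is one possible sketch, assuming a single-process, single-threaded client. Throttle is a hypothetical name, and the clock and sleep arguments exist only so the behavior can be tested without real delays.

```python
import time


class Throttle:
    """Enforce a minimum interval between calls (single-threaded sketch)."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        if max_per_minute <= 0:
            raise ValueError("max_per_minute must be positive")
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    throttle = Throttle(max_per_minute=240)
    for _ in range(3):
        throttle.wait()  # call this immediately before each HTTP request
        print("request slot at", time.monotonic())
```

Even spacing is the simplest policy. A token bucket would allow short bursts, but that is more machinery than most batch jobs need.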

Practical Patterns for Production Use

Pattern 1: Local Caching with TTL

Drug label data does not change frequently. Most labels are updated quarterly or less. Caching API responses locally with a 24-hour TTL dramatically reduces API calls while keeping data reasonably current:

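import hashlib, json, os, time

import requests

CACHE_DIR = "./openfda_cache"
CACHE_TTL = 86400  # 24 hours, in seconds

def get_cached(url):
    # Use a hash of the full URL as the cache file name
    key = hashlib.md5(url.encode()).hexdigest()
    cache_path = f"{CACHE_DIR}/{key}.json"

    # Serve from disk if the cached copy is younger than the TTL
    if os.path.exists(cache_path):
        age = time.time() - os.path.getmtime(cache_path)
        if age < CACHE_TTL:
            with open(cache_path) as f:
                return json.load(f)

    response = requests.get(url, timeout=30)
    # Raise on HTTP errors so error responses are never written to the cache
    response.raise_for_status()
    data = response.json()

    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(cache_path, 'w') as f:
        json.dump(data, f)

    return data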

Pattern 2: Batch Lookups with Retry

When processing a full formulary against openFDA (often thousands of NDCs), you need a robust batch pipeline with exponential backoff:

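import time

def batch_lookup(ndc_list, batch_size=10, max_attempts=3):
    results = {}
    for i in range(0, len(ndc_list), batch_size):
        batch = ndc_list[i:i+batch_size]
        # OR together one exact-match clause per NDC in this batch
        query = "+OR+".join(
            f'openfda.package_ndc:"{ndc}"' for ndc in batch
        )
        url = f"https://api.fda.gov/drug/label.json?search={query}&limit=100"

        for attempt in range(max_attempts):
            try:
                data = get_cached(url)
                # Index each returned label under every package NDC it lists
                for result in data.get("results", []):
                    for ndc in result.get("openfda", {}).get("package_ndc", []):
                        results[ndc] = result
                break
            except Exception as exc:
                # openFDA answers 404 when nothing matches; retrying will not help
                status = getattr(getattr(exc, "response", None), "status_code", None)
                if status == 404:
                    break
                # Exponential backoff (1s, 2s, ...), skipped after the final attempt
                if attempt < max_attempts - 1:
                    time.sleep(2 ** attempt)

    return results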

Pattern 3: Adverse Event Signal Detection

For formulary teams evaluating drug safety, the adverse event endpoint can surface signals that inform tier placement and coverage decisions:

from urllib.parse import quote

# Compare adverse event profiles for two drugs in the same class
def compare_safety(drug_a, drug_b, top_n=20):
    """Fetch the top adverse reaction counts for two drugs, keyed by drug name."""
    base = "https://api.fda.gov/drug/event.json"
    results = {}

    for drug in [drug_a, drug_b]:
        # Quote and percent-encode the name so multi-word drugs match as a phrase
        term = quote(f'"{drug}"')
        url = (f"{base}?search=patient.drug.medicinalproduct:{term}"
               f"&count=patient.reaction.reactionmeddrapt.exact"
               f"&limit={top_n}")
        data = get_cached(url)
        # Map each reaction term to its report count
        results[drug] = {
            r["term"]: r["count"]
            for r in data.get("results", [])
        }

    return results
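Raw counts are hard to compare directly because one drug may simply have far more reports than the other. One way to put two profiles on the same footing, sketched below, is to convert each drug's counts to shares of its own returned reports and rank the terms by the gap. share_differences is a hypothetical helper that takes the dictionary shape returned by compare_safety. The shares describe reporting patterns only and say nothing about incidence.

```python
def share_differences(profiles, drug_a, drug_b):
    """Rank reaction terms by the gap in reporting share between two drugs.

    `profiles` maps drug name -> {reaction term: report count}. Returns
    (term, share_a, share_b, share_a - share_b) tuples, largest gap first.
    """
    def shares(counts):
        total = sum(counts.values())
        if total == 0:
            return {}
        return {term: count / total for term, count in counts.items()}

    a = shares(profiles.get(drug_a, {}))
    b = shares(profiles.get(drug_b, {}))

    rows = []
    for term in set(a) | set(b):
        share_a = a.get(term, 0.0)
        share_b = b.get(term, 0.0)
        rows.append((term, share_a, share_b, share_a - share_b))

    # Largest absolute gap first; ties broken alphabetically for stable output
    rows.sort(key=lambda row: (-abs(row[3]), row[0]))
    return rows


if __name__ == "__main__":
    # Made-up counts for two placeholder drugs
    demo = {
        "drug_a": {"COUGH": 75, "NAUSEA": 25},
        "drug_b": {"NAUSEA": 50, "HEADACHE": 50},
    }
    for term, sa, sb, diff in share_differences(demo, "drug_a", "drug_b"):
        print(f"{term}: {sa:.2f} vs {sb:.2f} (diff {diff:+.2f})")
```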

Limitations to Know About

OpenFDA is powerful but has specific limitations that affect system design:

No pricing data. OpenFDA does not include AWP, WAC, or any pricing information. You need Medi-Span, First Databank, or similar commercial sources for that.
NDC coverage gaps. Some older or niche products may not appear in the NDC endpoint. Cross-reference with the FDA's NDC Directory download for completeness.
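Adverse event reporting bias. FAERS is a spontaneous reporting system: reports are not verified for causality, may be duplicated, and come without a denominator of exposed patients, so the data cannot be used to calculate incidence rates. It is useful for signal detection and hypothesis generation, not for absolute risk quantification, and comparisons between drugs should be read with that bias in mind.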
No historical snapshots. The API returns current data. If you need to know what a drug label said six months ago, you need to maintain your own versioned copies.
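For the last limitation, keeping your own versioned copies can be as simple as writing a dated file whenever a record's content changes. The sketch below is one way to do that under stated assumptions: save_snapshot is a hypothetical helper, key is any filesystem-safe identifier you choose for the label, and a record counts as changed when the hash of its JSON serialization has not been stored before.

```python
import hashlib
import json
from datetime import date
from pathlib import Path


def save_snapshot(key, record, snapshot_dir, today=None):
    """Store `record` under snapshot_dir/key unless this content was already saved.

    Returns the path written, or None when an identical snapshot exists.
    """
    # Serialize with sorted keys so the hash does not depend on key order
    payload = json.dumps(record, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]

    folder = Path(snapshot_dir) / key
    folder.mkdir(parents=True, exist_ok=True)

    # Skip the write if any earlier snapshot has the same content hash
    if any(folder.glob(f"*_{digest}.json")):
        return None

    stamp = (today or date.today()).isoformat()
    path = folder / f"{stamp}_{digest}.json"
    path.write_text(payload, encoding="utf-8")
    return path


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        # Made-up record standing in for one label result
        print(save_snapshot("example-label", {"warnings": ["a"]}, tmp))
        print(save_snapshot("example-label", {"warnings": ["a"]}, tmp))  # None
```

Because file names start with an ISO date, sorting a label's folder by name gives its change history in order.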

Despite these limitations, openFDA is a foundation of any modern healthcare data stack. Combined with RxNorm for drug relationships and DailyMed for prescribing information, it provides a remarkably comprehensive free data layer for pharmaceutical technology applications.