Lesson 20: Working with APIs - Part 1
Learning Objectives
After completing this lesson, you will be able to:
- ✅ Use the requests library
- ✅ Work with HTTP methods
- ✅ Handle headers and authentication
- ✅ Process responses
- ✅ Implement error handling
- ✅ Handle rate limiting
Requests Library
requests is one of the most popular libraries for working with HTTP APIs in Python.
Installation
pip install requests
HTTP Methods
1. GET Request
GET is used to retrieve data.
import requests

# Basic GET request
response = requests.get('https://api.github.com/users/octocat')

print(f"Status Code: {response.status_code}")
print(f"Content Type: {response.headers['content-type']}")
print(f"Response: {response.json()}")
GET with Query Parameters
import requests

# Query parameters as dictionary
params = {
    'q': 'python',
    'sort': 'stars',
    'order': 'desc',
    'per_page': 5
}

response = requests.get(
    'https://api.github.com/search/repositories',
    params=params
)

if response.status_code == 200:
    data = response.json()
    print(f"Total repositories: {data['total_count']}")
    for repo in data['items']:
        print(f"- {repo['name']}: {repo['stargazers_count']} stars")
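To make the effect of `params` concrete, the sketch below builds a comparable query string with the standard library's `urllib.parse.urlencode`. It only illustrates the kind of URL that results and is not a description of how requests is implemented; `build_url` is a hypothetical helper name. With a real response, `response.url` shows the URL that requests actually sent.

```python
from urllib.parse import urlencode


def build_url(base_url: str, params: dict) -> str:
    """Append URL-encoded query parameters to a base URL (illustrative helper)."""
    return f"{base_url}?{urlencode(params)}"


params = {'q': 'python', 'sort': 'stars', 'order': 'desc', 'per_page': 5}
url = build_url('https://api.github.com/search/repositories', params)
print(url)
```

Values with spaces or special characters are escaped as part of the encoding, which is why passing a dictionary is usually safer than concatenating the query string by hand.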
2. POST Request
POST is used to create new resources.
import requests
import json

# POST with JSON data
url = 'https://jsonplaceholder.typicode.com/posts'

data = {
    'title': 'My New Post',
    'body': 'This is the content of my post.',
    'userId': 1
}

# Method 1: Using json parameter (recommended)
response = requests.post(url, json=data)

# Method 2: Using data parameter with JSON string
# response = requests.post(
#     url,
#     data=json.dumps(data),
#     headers={'Content-Type': 'application/json'}
# )

print(f"Status Code: {response.status_code}")
print(f"Response: {response.json()}")
POST with Form Data
import requests

url = 'https://httpbin.org/post'

# Form data
form_data = {
    'username': 'john_doe',
    'email': 'john_doe@example.com',  # placeholder address
    'age': '30'
}

response = requests.post(url, data=form_data)

print(response.json()['form'])
POST with File Upload
import requests

url = 'https://httpbin.org/post'

# Upload single file (the with block closes the file after the request)
with open('document.pdf', 'rb') as f:
    response = requests.post(url, files={'file': f})
print(response.json())

# Upload multiple files
with open('document1.pdf', 'rb') as f1, open('document2.pdf', 'rb') as f2:
    files = {'file1': f1, 'file2': f2}
    response = requests.post(url, files=files)

# Upload with additional data
with open('document.pdf', 'rb') as f:
    response = requests.post(
        url,
        files={'file': f},
        data={'description': 'My document'}
    )
3. PUT Request
PUT is used to update existing resources, typically by replacing them in full.
import requests

url = 'https://jsonplaceholder.typicode.com/posts/1'

data = {
    'id': 1,
    'title': 'Updated Title',
    'body': 'Updated content.',
    'userId': 1
}

response = requests.put(url, json=data)

print(f"Status Code: {response.status_code}")
print(f"Updated: {response.json()}")
4. PATCH Request
PATCH is used for partial updates.
import requests

url = 'https://jsonplaceholder.typicode.com/posts/1'

# Only update title
data = {
    'title': 'Partially Updated Title'
}

response = requests.patch(url, json=data)

print(f"Status Code: {response.status_code}")
print(f"Updated: {response.json()}")
5. DELETE Request
DELETE is used to remove resources.
import requests

url = 'https://jsonplaceholder.typicode.com/posts/1'

response = requests.delete(url)

print(f"Status Code: {response.status_code}")

# APIs signal a successful deletion differently: commonly 204 No Content,
# but some return 200 OK or 202 Accepted
if response.status_code in (200, 202, 204):
    print("Resource deleted successfully")
Headers and Authentication
Custom Headers
import requests

url = 'https://api.github.com/users/octocat'

headers = {
    'User-Agent': 'MyApp/1.0',
    'Accept': 'application/vnd.github.v3+json',
    'X-Custom-Header': 'CustomValue'
}

response = requests.get(url, headers=headers)
print(response.json())
Basic Authentication
import requests
from requests.auth import HTTPBasicAuth

url = 'https://api.example.com/protected'

# Method 1: Using auth parameter
response = requests.get(
    url,
    auth=HTTPBasicAuth('username', 'password')
)

# Method 2: Using tuple (shorthand)
response = requests.get(
    url,
    auth=('username', 'password')
)

print(response.status_code)
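For reference, Basic authentication sends the credentials in an `Authorization` header as the word `Basic` followed by the Base64 encoding of `username:password`. The sketch below builds that header value by hand to show what ends up on the wire; `basic_auth_header` is a hypothetical helper, and the UTF-8 encoding is an assumption made for this illustration.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    # Assumption for this sketch: credentials are encoded as UTF-8
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


header = basic_auth_header("username", "password")
print(header)  # Basic dXNlcm5hbWU6cGFzc3dvcmQ=
```

Base64 is an encoding, not encryption, so Basic authentication should only be used over HTTPS.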
Bearer Token Authentication
import requests

url = 'https://api.github.com/user'
token = 'ghp_your_github_token_here'

headers = {
    'Authorization': f'Bearer {token}',
    'Accept': 'application/vnd.github.v3+json'
}

response = requests.get(url, headers=headers)

if response.status_code == 200:
    user = response.json()
    print(f"Username: {user['login']}")
    print(f"Name: {user['name']}")
API Key Authentication
import requests

# Method 1: API key in header
url = 'https://api.example.com/data'
headers = {
    'X-API-Key': 'your_api_key_here'
}

response = requests.get(url, headers=headers)

# Method 2: API key in query parameter
params = {
    'api_key': 'your_api_key_here',
    'format': 'json'
}

response = requests.get(url, params=params)
OAuth 2.0 Authentication
import requests

# Step 1: Get access token
token_url = 'https://oauth.example.com/token'

token_data = {
    'grant_type': 'client_credentials',
    'client_id': 'your_client_id',
    'client_secret': 'your_client_secret'
}

token_response = requests.post(token_url, data=token_data)
token_response.raise_for_status()  # Stop here if the token request failed
access_token = token_response.json()['access_token']

# Step 2: Use access token
api_url = 'https://api.example.com/data'
headers = {
    'Authorization': f'Bearer {access_token}'
}

response = requests.get(api_url, headers=headers)
print(response.json())
Response Handling
Response Properties
import requests

response = requests.get('https://api.github.com/users/octocat')

# Status code
print(f"Status Code: {response.status_code}")

# Headers
print(f"Content Type: {response.headers['content-type']}")
print(f"All Headers: {response.headers}")

# Content
print(f"Text: {response.text}")  # Raw text
print(f"JSON: {response.json()}")  # Parsed JSON

# Encoding
print(f"Encoding: {response.encoding}")

# URL
print(f"URL: {response.url}")

# Elapsed time
print(f"Time: {response.elapsed.total_seconds()} seconds")
Status Code Checking
import requests

response = requests.get('https://api.github.com/users/octocat')

# Method 1: Direct comparison
if response.status_code == 200:
    print("Success!")
elif response.status_code == 404:
    print("Not found!")
elif response.status_code == 500:
    print("Server error!")

# Method 2: Using status codes constants
if response.status_code == requests.codes.ok:
    print("Success!")

# Method 3: Raise exception for error codes
try:
    response.raise_for_status()
    print("Success!")
except requests.exceptions.HTTPError as e:
    print(f"HTTP Error: {e}")
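The checks above compare against individual codes, but HTTP status codes are grouped into classes by their first digit, and it is often enough to branch on the class. The sketch below is a small illustrative helper (`status_category` is a hypothetical name, not part of requests) that maps a code to its class.

```python
def status_category(status_code: int) -> str:
    """Return the HTTP status class for a status code."""
    if 100 <= status_code < 200:
        return "informational"
    if 200 <= status_code < 300:
        return "success"
    if 300 <= status_code < 400:
        return "redirection"
    if 400 <= status_code < 500:
        return "client error"
    if 500 <= status_code < 600:
        return "server error"
    return "unknown"


for code in (200, 204, 301, 404, 429, 503):
    print(code, status_category(code))
```

As a rule of thumb, client errors (4xx) usually mean the request itself has to change, while server errors (5xx) and 429 are the usual candidates for a retry.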
JSON Response
import requests

response = requests.get('https://api.github.com/users/octocat')

try:
    data = response.json()
    print(f"Username: {data['login']}")
    print(f"Followers: {data['followers']}")
except requests.exceptions.JSONDecodeError:
    print("Response is not valid JSON")
Binary Response
import requests

# Download image
response = requests.get('https://via.placeholder.com/150')

if response.status_code == 200:
    with open('image.png', 'wb') as f:
        f.write(response.content)
    print("Image downloaded!")
Streaming Response
import requests

url = 'https://example.com/large-file.zip'

# Stream large files
response = requests.get(url, stream=True)

with open('large-file.zip', 'wb') as f:
    for chunk in response.iter_content(chunk_size=8192):
        if chunk:
            f.write(chunk)

print("File downloaded!")

# Track progress (a streamed body can be read only once, so request again)
response = requests.get(url, stream=True)
file_size = int(response.headers.get('content-length', 0))
downloaded = 0

with open('large-file.zip', 'wb') as f:
    for chunk in response.iter_content(chunk_size=8192):
        if chunk:
            f.write(chunk)
            downloaded += len(chunk)
            if file_size:  # Content-Length may be missing
                percent = (downloaded / file_size) * 100
                print(f"Downloaded: {percent:.1f}%", end='\r')
Error Handling
Exception Types
import requests
from requests.exceptions import (
    RequestException,
    HTTPError,
    ConnectionError,
    Timeout,
    TooManyRedirects
)

url = 'https://api.example.com/data'

try:
    response = requests.get(url, timeout=5)
    response.raise_for_status()
    data = response.json()

except ConnectionError:
    print("Failed to connect to the server")

except Timeout:
    print("Request timed out")

except HTTPError as e:
    print(f"HTTP error occurred: {e}")
    print(f"Status Code: {e.response.status_code}")

except TooManyRedirects:
    print("Too many redirects")

except RequestException as e:
    print(f"An error occurred: {e}")
Retry Logic
import time

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


# Method 1: Manual retry
def get_with_retry(url, max_retries=3, delay=1):
    """Make request with retry logic."""
    for attempt in range(max_retries):
        try:
            response = requests.get(url, timeout=5)
            response.raise_for_status()
            return response
        except requests.exceptions.RequestException:
            if attempt < max_retries - 1:
                print(f"Attempt {attempt + 1} failed. Retrying in {delay}s...")
                time.sleep(delay)
                delay *= 2  # Exponential backoff
            else:
                raise


# Method 2: Using urllib3 Retry
def create_session_with_retry():
    """Create session with automatic retry."""
    session = requests.Session()

    retry_strategy = Retry(
        total=3,  # Total retries
        status_forcelist=[429, 500, 502, 503, 504],  # Retry on these codes
        # allowed_methods replaces the older method_whitelist (urllib3 >= 1.26).
        # POST is not idempotent, so retrying it can repeat side effects.
        allowed_methods=["HEAD", "GET", "OPTIONS", "POST"],
        backoff_factor=1  # Exponential backoff between retries
    )

    adapter = HTTPAdapter(max_retries=retry_strategy)
    session.mount("http://", adapter)
    session.mount("https://", adapter)

    return session


# Usage
session = create_session_with_retry()
response = session.get('https://api.example.com/data')
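To see what an exponential backoff schedule looks like in numbers, the sketch below computes the wait before each retry. It is an illustration only, not the formula urllib3 uses internally; `backoff_delays` and its parameters are hypothetical names. Passing a random generator applies "full jitter" (each wait is drawn uniformly between 0 and the computed delay), a common way to keep many clients from retrying at the same instant.

```python
import random
from typing import List, Optional


def backoff_delays(retries: int, base: float = 1.0, factor: float = 2.0,
                   cap: float = 30.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Return the wait time (in seconds) before each retry.

    Without rng the delays are deterministic: base * factor**attempt, capped.
    With rng each delay is drawn uniformly from [0, capped delay].
    """
    delays = []
    for attempt in range(retries):
        delay = min(cap, base * factor ** attempt)
        if rng is not None:
            delay = rng.uniform(0, delay)
        delays.append(delay)
    return delays


print(backoff_delays(4))            # [1.0, 2.0, 4.0, 8.0]
print(backoff_delays(6, cap=10.0))  # [1.0, 2.0, 4.0, 8.0, 10.0, 10.0]
```

The cap keeps a long outage from producing waits of several minutes.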
Timeout Configuration
import requests

url = 'https://api.example.com/data'

# Single timeout (applies to both connect and read)
try:
    response = requests.get(url, timeout=5)
except requests.exceptions.Timeout:
    print("Request timed out")

# Separate connect and read timeouts
try:
    response = requests.get(url, timeout=(3, 10))  # (connect, read)
except requests.exceptions.Timeout:
    print("Request timed out")

# No timeout (not recommended)
response = requests.get(url, timeout=None)
3 Real-World Applications
1. GitHub API Client
import requests
from typing import List, Dict, Optional


class GitHubClient:
    """GitHub API client."""

    BASE_URL = 'https://api.github.com'

    def __init__(self, token: Optional[str] = None):
        self.session = requests.Session()
        self.session.headers.update({
            'Accept': 'application/vnd.github.v3+json',
            'User-Agent': 'Python-GitHubClient/1.0'
        })

        if token:
            self.session.headers['Authorization'] = f'token {token}'

    def get_user(self, username: str) -> Dict:
        """Get user information."""
        response = self.session.get(f'{self.BASE_URL}/users/{username}')
        response.raise_for_status()
        return response.json()

    def get_user_repos(self, username: str) -> List[Dict]:
        """Get user repositories."""
        repos = []
        page = 1

        while True:
            response = self.session.get(
                f'{self.BASE_URL}/users/{username}/repos',
                params={'page': page, 'per_page': 100}
            )
            response.raise_for_status()

            data = response.json()
            if not data:
                break

            repos.extend(data)
            page += 1

        return repos

    def search_repositories(self, query: str, sort: str = 'stars') -> List[Dict]:
        """Search repositories."""
        response = self.session.get(
            f'{self.BASE_URL}/search/repositories',
            params={'q': query, 'sort': sort}
        )
        response.raise_for_status()
        return response.json()['items']

    def create_gist(self, description: str, files: Dict[str, str],
                    public: bool = True) -> Dict:
        """Create a gist."""
        data = {
            'description': description,
            'public': public,
            'files': {
                filename: {'content': content}
                for filename, content in files.items()
            }
        }

        response = self.session.post(
            f'{self.BASE_URL}/gists',
            json=data
        )
        response.raise_for_status()
        return response.json()


# Usage
client = GitHubClient(token='your_token_here')

# Get user
user = client.get_user('octocat')
print(f"Name: {user['name']}")
print(f"Followers: {user['followers']}")

# Get repositories
repos = client.get_user_repos('octocat')
print(f"Total repos: {len(repos)}")

for repo in repos[:5]:
    print(f"- {repo['name']}: {repo['stargazers_count']} stars")

# Search repositories
python_repos = client.search_repositories('python machine learning')
print("\nTop Python ML repos:")
for repo in python_repos[:5]:
    print(f"- {repo['full_name']}: {repo['stargazers_count']} stars")
2. Weather API Client with Caching
import requests
import time
from typing import Dict, Optional
import hashlib
import json


class WeatherClient:
    """Weather API client with caching."""

    BASE_URL = 'https://api.openweathermap.org/data/2.5'

    def __init__(self, api_key: str):
        self.api_key = api_key
        self.session = requests.Session()
        self.cache = {}
        self.cache_ttl = 600  # 10 minutes

    def _get_cache_key(self, endpoint: str, params: Dict) -> str:
        """Generate cache key."""
        key_data = f"{endpoint}:{json.dumps(params, sort_keys=True)}"
        return hashlib.md5(key_data.encode()).hexdigest()

    def _get_from_cache(self, key: str) -> Optional[Dict]:
        """Get data from cache if valid."""
        if key in self.cache:
            data, timestamp = self.cache[key]
            if time.time() - timestamp < self.cache_ttl:
                return data
        return None

    def _save_to_cache(self, key: str, data: Dict):
        """Save data to cache."""
        self.cache[key] = (data, time.time())

    def _make_request(self, endpoint: str, params: Dict) -> Dict:
        """Make API request with caching."""
        # Check cache (the key is built before the API key is added)
        cache_key = self._get_cache_key(endpoint, params)
        cached_data = self._get_from_cache(cache_key)
        if cached_data is not None:
            print("Returning cached data")
            return cached_data

        # Make request; copy params so the caller's dict is not mutated
        response = self.session.get(
            f'{self.BASE_URL}/{endpoint}',
            params={**params, 'appid': self.api_key}
        )
        response.raise_for_status()
        data = response.json()

        # Save to cache
        self._save_to_cache(cache_key, data)

        return data

    def get_current_weather(self, city: str, units: str = 'metric') -> Dict:
        """Get current weather for a city."""
        return self._make_request('weather', {
            'q': city,
            'units': units
        })

    def get_forecast(self, city: str, units: str = 'metric') -> Dict:
        """Get 5-day forecast."""
        return self._make_request('forecast', {
            'q': city,
            'units': units
        })

    def get_weather_by_coords(self, lat: float, lon: float,
                              units: str = 'metric') -> Dict:
        """Get weather by coordinates."""
        return self._make_request('weather', {
            'lat': lat,
            'lon': lon,
            'units': units
        })


# Usage
client = WeatherClient(api_key='your_api_key_here')

# Get current weather
weather = client.get_current_weather('London')
print(f"City: {weather['name']}")
print(f"Temperature: {weather['main']['temp']}°C")
print(f"Description: {weather['weather'][0]['description']}")

# Second call - returns cached data
weather = client.get_current_weather('London')

# Get forecast
forecast = client.get_forecast('Paris')
print(f"\n5-day forecast for {forecast['city']['name']}:")
for item in forecast['list'][:5]:
    print(f"- {item['dt_txt']}: {item['main']['temp']}°C")
3. REST API Wrapper with Rate Limiting
import requests
import time
from collections import deque
from datetime import datetime, timedelta


class RateLimitedClient:
    """API client with rate limiting."""

    def __init__(self, base_url: str, requests_per_minute: int = 60):
        self.base_url = base_url.rstrip('/')
        self.session = requests.Session()
        self.requests_per_minute = requests_per_minute
        self.request_times = deque()

    def _wait_if_needed(self):
        """Wait if rate limit would be exceeded."""
        now = datetime.now()
        minute_ago = now - timedelta(minutes=1)

        # Remove old requests
        while self.request_times and self.request_times[0] < minute_ago:
            self.request_times.popleft()

        # Check if we need to wait
        if len(self.request_times) >= self.requests_per_minute:
            sleep_time = (self.request_times[0] - minute_ago).total_seconds()
            if sleep_time > 0:
                print(f"Rate limit reached. Waiting {sleep_time:.1f}s...")
                time.sleep(sleep_time)
                self._wait_if_needed()  # Recheck

    def _record_request(self):
        """Record request timestamp."""
        self.request_times.append(datetime.now())

    def request(self, method: str, endpoint: str, **kwargs) -> requests.Response:
        """Make rate-limited request."""
        self._wait_if_needed()

        url = f'{self.base_url}/{endpoint.lstrip("/")}'
        response = self.session.request(method, url, **kwargs)

        self._record_request()

        return response

    def get(self, endpoint: str, **kwargs) -> requests.Response:
        """GET request."""
        return self.request('GET', endpoint, **kwargs)

    def post(self, endpoint: str, **kwargs) -> requests.Response:
        """POST request."""
        return self.request('POST', endpoint, **kwargs)

    def put(self, endpoint: str, **kwargs) -> requests.Response:
        """PUT request."""
        return self.request('PUT', endpoint, **kwargs)

    def delete(self, endpoint: str, **kwargs) -> requests.Response:
        """DELETE request."""
        return self.request('DELETE', endpoint, **kwargs)


# Usage
client = RateLimitedClient(
    base_url='https://api.example.com',
    requests_per_minute=10
)

# Make requests - automatically rate limited
for i in range(15):
    print(f"Request {i + 1}")
    response = client.get('/data', params={'id': i})
    print(f"Status: {response.status_code}")
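Client-side limiting like the class above complements what the server reports. When a server answers 429 Too Many Requests it may include a `Retry-After` header, which HTTP defines as either a number of seconds or an HTTP date. The sketch below converts such a value into seconds to wait; `retry_after_seconds` is a hypothetical helper, and whether a particular API sends this header is an assumption to verify against its documentation.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(header_value: Optional[str],
                        now: Optional[datetime] = None) -> Optional[float]:
    """Convert a Retry-After header value into seconds to wait.

    Returns None when the header is missing or cannot be parsed.
    """
    if not header_value:
        return None
    value = header_value.strip()
    if value.isdigit():
        return float(value)  # delay given directly in seconds
    try:
        retry_at = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (retry_at - now).total_seconds())


print(retry_after_seconds("120"))  # 120.0
```

With requests, the value would typically come from `response.headers.get('Retry-After')` after checking `response.status_code == 429`.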
Best Practices
1. Use a Session
import requests

# ❌ Don't: Create new connection for each request
for i in range(100):
    response = requests.get('https://api.example.com/data')

# ✅ Do: Reuse session
session = requests.Session()
session.headers.update({'User-Agent': 'MyApp/1.0'})

for i in range(100):
    response = session.get('https://api.example.com/data')
2. Always Set Timeout
import requests

# ❌ Don't: No timeout
response = requests.get('https://api.example.com/data')

# ✅ Do: Set timeout
response = requests.get('https://api.example.com/data', timeout=5)
3. Handle Errors
import requests

# ❌ Don't: Ignore errors
response = requests.get('https://api.example.com/data')
data = response.json()

# ✅ Do: Handle errors
try:
    response = requests.get('https://api.example.com/data', timeout=5)
    response.raise_for_status()
    data = response.json()
except requests.exceptions.RequestException as e:
    print(f"Error: {e}")
4. Use Environment Variables for Secrets
import os
import requests

# ❌ Don't: Hard-code secrets
api_key = 'my_secret_api_key'

# ✅ Do: Use environment variables
api_key = os.environ.get('API_KEY')

if not api_key:
    raise ValueError("API_KEY environment variable not set")

response = requests.get(
    'https://api.example.com/data',
    headers={'Authorization': f'Bearer {api_key}'}
)
Practice Exercises
Exercise 1: REST API Client
Create a client for the JSONPlaceholder API with CRUD operations.
Exercise 2: Rate Limiter
Implement a rate limiter using the token bucket algorithm.
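As a possible starting point, here is a minimal token bucket sketch. The bucket holds up to `capacity` tokens and refills at `rate` tokens per second; each request spends one token, so short bursts are allowed while the long-run rate stays bounded. The class and method names are illustrative, and the injectable `clock` is an assumption added to make the logic testable without real waiting.

```python
import time
from typing import Callable


class TokenBucket:
    """Token bucket: up to `capacity` tokens, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity  # start full so an initial burst is allowed
        self.last_refill = clock()

    def _refill(self) -> None:
        """Add the tokens earned since the last refill, up to capacity."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now

    def try_acquire(self, tokens: float = 1.0) -> bool:
        """Take tokens if available; return False without blocking otherwise."""
        self._refill()
        if self.tokens >= tokens:
            self.tokens -= tokens
            return True
        return False

    def wait_time(self, tokens: float = 1.0) -> float:
        """Seconds until `tokens` tokens are available (0 if available now)."""
        self._refill()
        missing = tokens - self.tokens
        return max(0.0, missing / self.rate)


bucket = TokenBucket(rate=2.0, capacity=3.0)
print([bucket.try_acquire() for _ in range(3)])  # [True, True, True]
```

A blocking variant could call `time.sleep(bucket.wait_time())` before each request; making it thread-safe would additionally need a lock.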
Exercise 3: Retry Logic
Create a decorator for automatic retry with exponential backoff.
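One way to approach this is a decorator factory that retries the wrapped function on selected exceptions and multiplies the wait after every failure. The sketch below is a minimal version under stated assumptions: the names `retry`, `base_delay`, and `factor` are illustrative, and the `sleep` parameter exists only so the waiting can be replaced in tests.

```python
import functools
import time
from typing import Callable, Tuple, Type


def retry(max_attempts: int = 3, base_delay: float = 1.0, factor: float = 2.0,
          exceptions: Tuple[Type[BaseException], ...] = (Exception,),
          sleep: Callable[[float], None] = time.sleep):
    """Retry the decorated function with exponentially growing waits."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            delay = base_delay
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    sleep(delay)
                    delay *= factor  # exponential backoff
        return wrapper
    return decorator


attempts = []


@retry(max_attempts=4, base_delay=0.01, exceptions=(ConnectionError,))
def flaky_call():
    """Stand-in for an API call that fails twice before succeeding."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("temporary failure")
    return "ok"


print(flaky_call(), "after", len(attempts), "attempts")  # ok after 3 attempts
```

With requests, `exceptions` would typically be `(requests.exceptions.RequestException,)` or a narrower subset such as timeouts and connection errors.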
Exercise 4: API Caching
Implement a caching layer with TTL for API responses.
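A possible skeleton for this exercise is a small in-memory cache that stores each value together with its expiry time. The sketch below is illustrative only: `TTLCache` and `get_or_fetch` are hypothetical names, and the injectable `clock` is an assumption that lets expiry be tested without sleeping.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """In-memory cache whose entries expire `ttl_seconds` after being stored."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[Hashable, Tuple[Any, float]] = {}

    def set(self, key: Hashable, value: Any) -> None:
        """Store a value together with its expiry time."""
        self._store[key] = (value, self._clock() + self._ttl)

    def get(self, key: Hashable, default: Any = None) -> Any:
        """Return the cached value, or default if missing or expired."""
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # drop the stale entry
            return default
        return value

    def get_or_fetch(self, key: Hashable, fetch: Callable[[], Any]) -> Any:
        """Return the cached value, calling fetch() only on a miss or expiry."""
        missing = object()
        value = self.get(key, missing)
        if value is missing:
            value = fetch()
            self.set(key, value)
        return value


cache = TTLCache(ttl_seconds=600)
print(cache.get_or_fetch("london", lambda: {"temp": 20}))  # {'temp': 20}
```

In an API client, `fetch` would wrap the actual request, and the key would be derived from the endpoint and its parameters.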
Exercise 5: Concurrent Requests
Use concurrent.futures to make multiple API requests concurrently.
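The general pattern can be sketched with `ThreadPoolExecutor` and `as_completed`. To keep the example runnable without network access, `fake_fetch` below is a stand-in for a real HTTP call; `fetch_all` and the example URLs are hypothetical as well. Failures are stored next to successes so that one bad URL does not abort the whole batch.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def fetch_all(urls: Iterable[str], fetch: Callable[[str], object],
              max_workers: int = 5) -> Dict[str, object]:
    """Run fetch(url) in a thread pool; map each URL to its result or exception."""
    results: Dict[str, object] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_url = {executor.submit(fetch, url): url for url in urls}
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # keep the error instead of stopping
                results[url] = exc
    return results


def fake_fetch(url: str) -> int:
    """Stand-in for a real HTTP call such as session.get(url, timeout=5)."""
    if "bad" in url:
        raise ValueError(f"cannot fetch {url}")
    return len(url)


urls = ['https://api.example.com/a', 'https://api.example.com/bad']
for url, result in sorted(fetch_all(urls, fake_fetch).items()):
    print(url, '->', result)
```

Threads suit this task because the work is I/O-bound; `max_workers` should stay within the rate limits of the API being called.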
Summary
In Part 1 we covered:
- ✅ Requests Library - HTTP client for Python
- ✅ HTTP Methods - GET, POST, PUT, PATCH, DELETE
- ✅ Headers & Authentication - Basic, Bearer, API Key, OAuth
- ✅ Response Handling - Status codes, JSON, streaming
- ✅ Error Handling - Exceptions, retry logic, timeouts
- ✅ Real Applications - GitHub client, Weather client, Rate limiter
Part 2 will cover: Sessions, async requests, webhooks, and API best practices! 🚀
Next lesson: Lesson 20.2: Advanced API Techniques 🌐