Lesson 20: Working with APIs - Part 1

Lesson Objectives

After completing this lesson, you will be able to:

  • ✅ Use the requests library
  • ✅ Work with HTTP methods
  • ✅ Handle headers and authentication
  • ✅ Process responses
  • ✅ Implement error handling
  • ✅ Handle rate limiting

Requests Library

requests is one of the most popular Python libraries for working with HTTP APIs. It is a third-party package, so it has to be installed before use.

Installation

pip install requests

HTTP Methods

1. GET Request

GET retrieves data from the server.

import requests

# Basic GET request
response = requests.get('https://api.github.com/users/octocat')

print(f"Status Code: {response.status_code}")
print(f"Content Type: {response.headers['content-type']}")
print(f"Response: {response.json()}")

GET with Query Parameters

import requests

# Query parameters as a dictionary
params = {
    'q': 'python',
    'sort': 'stars',
    'order': 'desc',
    'per_page': 5
}

response = requests.get(
    'https://api.github.com/search/repositories',
    params=params
)

if response.status_code == 200:
    data = response.json()
    print(f"Total repositories: {data['total_count']}")

    for repo in data['items']:
        print(f"- {repo['name']}: {repo['stargazers_count']} stars")
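To see what the params dictionary actually does, it can help to build the request without sending it. The following is a minimal sketch, assuming only that requests is installed; it uses requests.Request(...).prepare() to inspect the final URL and needs no network access.

```python
import requests

# Build the request without sending it, so no network access is needed.
params = {'q': 'python', 'sort': 'stars', 'order': 'desc', 'per_page': 5}
prepared = requests.Request(
    'GET',
    'https://api.github.com/search/repositories',
    params=params,
).prepare()

# The dictionary is URL-encoded and appended to the URL as the query string.
print(prepared.url)
```

Passing a dictionary this way lets requests handle the URL encoding, which is usually safer than concatenating the query string by hand.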

2. POST Request

POST creates new resources.

import requests
import json

# POST with JSON data
url = 'https://jsonplaceholder.typicode.com/posts'

data = {
    'title': 'My New Post',
    'body': 'This is the content of my post.',
    'userId': 1
}

# Method 1: Using the json parameter (recommended)
response = requests.post(url, json=data)

# Method 2: Using the data parameter with a JSON string
# response = requests.post(
#     url,
#     data=json.dumps(data),
#     headers={'Content-Type': 'application/json'}
# )

print(f"Status Code: {response.status_code}")
print(f"Response: {response.json()}")
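The difference between the json and data parameters is easiest to see on the prepared request. This is a small offline sketch (the shortened payload is only for illustration): it prepares the same dictionary both ways and prints the Content-Type header and body that requests would send.

```python
import requests

url = 'https://jsonplaceholder.typicode.com/posts'
payload = {'title': 'My New Post', 'userId': 1}

# Prepare (but do not send) the same payload in both styles.
as_json = requests.Request('POST', url, json=payload).prepare()
as_form = requests.Request('POST', url, data=payload).prepare()

# json= serializes to JSON; data= with a dict produces a URL-encoded form body.
for label, prepared in (('json=', as_json), ('data=', as_form)):
    print(label, prepared.headers['Content-Type'], prepared.body)
```

With json= the Content-Type header is set to application/json automatically, which is why Method 2 above has to set it by hand.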

POST with Form Data

import requests

url = 'https://httpbin.org/post'

# Form data
form_data = {
    'username': 'john_doe',
    'email': 'john_doe@example.com',  # placeholder address
    'age': '30'
}

response = requests.post(url, data=form_data)

print(response.json()['form'])

POST with File Upload

import requests

url = 'https://httpbin.org/post'

# Upload a single file (the with block closes the file afterwards)
with open('document.pdf', 'rb') as f:
    response = requests.post(url, files={'file': f})
print(response.json())

# Upload multiple files
with open('document1.pdf', 'rb') as f1, open('document2.pdf', 'rb') as f2:
    files = {'file1': f1, 'file2': f2}
    response = requests.post(url, files=files)

# Upload with additional form data
with open('document.pdf', 'rb') as f:
    response = requests.post(
        url,
        files={'file': f},
        data={'description': 'My document'}
    )

3. PUT Request

PUT updates an existing resource by replacing it with the data you send.

import requests

url = 'https://jsonplaceholder.typicode.com/posts/1'

data = {
    'id': 1,
    'title': 'Updated Title',
    'body': 'Updated content.',
    'userId': 1
}

response = requests.put(url, json=data)

print(f"Status Code: {response.status_code}")
print(f"Updated: {response.json()}")

4. PATCH Request

PATCH applies a partial update: only the fields you send are changed.

import requests

url = 'https://jsonplaceholder.typicode.com/posts/1'

# Only update the title
data = {
    'title': 'Partially Updated Title'
}

response = requests.patch(url, json=data)

print(f"Status Code: {response.status_code}")
print(f"Updated: {response.json()}")

5. DELETE Request

DELETE removes a resource.

import requests

url = 'https://jsonplaceholder.typicode.com/posts/1'

response = requests.delete(url)

print(f"Status Code: {response.status_code}")

# A successful DELETE is commonly answered with 204 No Content or 200 OK
# (JSONPlaceholder replies with 200 and an empty JSON object)
if response.status_code in (200, 204):
    print("Resource deleted successfully")

Headers and Authentication

Custom Headers

import requests

url = 'https://api.github.com/users/octocat'

headers = {
    'User-Agent': 'MyApp/1.0',
    'Accept': 'application/vnd.github.v3+json',
    'X-Custom-Header': 'CustomValue'
}

response = requests.get(url, headers=headers)
print(response.json())

Basic Authentication

import requests
from requests.auth import HTTPBasicAuth

url = 'https://api.example.com/protected'

# Method 1: Using the auth parameter
response = requests.get(
    url,
    auth=HTTPBasicAuth('username', 'password')
)

# Method 2: Using a tuple (shorthand)
response = requests.get(
    url,
    auth=('username', 'password')
)

print(response.status_code)
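Under the hood, Basic authentication is just an Authorization header. The sketch below prepares a request without sending it and compares the header that requests generates with one built by hand using the standard base64 module; the credentials are the same placeholders as above.

```python
import base64

import requests

# Prepare (but do not send) a request that uses Basic authentication.
prepared = requests.Request(
    'GET',
    'https://api.example.com/protected',
    auth=('username', 'password'),
).prepare()

# The header value is "Basic " followed by base64("username:password").
expected = 'Basic ' + base64.b64encode(b'username:password').decode('ascii')

print(prepared.headers['Authorization'])
print(prepared.headers['Authorization'] == expected)
```

Base64 is an encoding, not encryption, so Basic authentication should only be sent over HTTPS.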

Bearer Token Authentication

import requests

url = 'https://api.github.com/user'
token = 'ghp_your_github_token_here'

headers = {
    'Authorization': f'Bearer {token}',
    'Accept': 'application/vnd.github.v3+json'
}

response = requests.get(url, headers=headers)

if response.status_code == 200:
    user = response.json()
    print(f"Username: {user['login']}")
    print(f"Name: {user['name']}")

API Key Authentication

import requests

# Method 1: API key in a header
url = 'https://api.example.com/data'
headers = {
    'X-API-Key': 'your_api_key_here'
}

response = requests.get(url, headers=headers)

# Method 2: API key in a query parameter
params = {
    'api_key': 'your_api_key_here',
    'format': 'json'
}

response = requests.get(url, params=params)

OAuth 2.0 Authentication

import requests

# Step 1: Get an access token (client credentials grant)
token_url = 'https://oauth.example.com/token'

token_data = {
    'grant_type': 'client_credentials',
    'client_id': 'your_client_id',
    'client_secret': 'your_client_secret'
}

token_response = requests.post(token_url, data=token_data)
token_response.raise_for_status()  # Stop here if the token request failed
access_token = token_response.json()['access_token']

# Step 2: Use the access token
api_url = 'https://api.example.com/data'
headers = {
    'Authorization': f'Bearer {access_token}'
}

response = requests.get(api_url, headers=headers)
print(response.json())

Response Handling

Response Properties

import requests

response = requests.get('https://api.github.com/users/octocat')

# Status code
print(f"Status Code: {response.status_code}")

# Headers
print(f"Content Type: {response.headers['content-type']}")
print(f"All Headers: {response.headers}")

# Content
print(f"Text: {response.text}")  # Raw text
print(f"JSON: {response.json()}")  # Parsed JSON

# Encoding
print(f"Encoding: {response.encoding}")

# URL
print(f"URL: {response.url}")

# Elapsed time
print(f"Time: {response.elapsed.total_seconds()} seconds")

Status Code Checking

import requests

response = requests.get('https://api.github.com/users/octocat')

# Method 1: Direct comparison
if response.status_code == 200:
    print("Success!")
elif response.status_code == 404:
    print("Not found!")
elif response.status_code == 500:
    print("Server error!")

# Method 2: Using status code constants
if response.status_code == requests.codes.ok:
    print("Success!")

# Method 3: Raise an exception for error codes (4xx and 5xx)
try:
    response.raise_for_status()
    print("Success!")
except requests.exceptions.HTTPError as e:
    print(f"HTTP Error: {e}")

JSON Response

import requests

response = requests.get('https://api.github.com/users/octocat')

try:
    data = response.json()
    print(f"Username: {data['login']}")
    print(f"Followers: {data['followers']}")
# requests.exceptions.JSONDecodeError is available in requests 2.27 and later
except requests.exceptions.JSONDecodeError:
    print("Response is not valid JSON")

Binary Response

import requests

# Download an image
response = requests.get('https://via.placeholder.com/150')

if response.status_code == 200:
    with open('image.png', 'wb') as f:
        f.write(response.content)
    print("Image downloaded!")

Streaming Response

import requests

url = 'https://example.com/large-file.zip'

# Stream large files instead of loading them into memory
response = requests.get(url, stream=True)

with open('large-file.zip', 'wb') as f:
    for chunk in response.iter_content(chunk_size=8192):
        if chunk:
            f.write(chunk)
print("File downloaded!")

# Track progress (a streamed body can only be read once, so request it again)
response = requests.get(url, stream=True)
file_size = int(response.headers.get('content-length', 0))
downloaded = 0

with open('large-file.zip', 'wb') as f:
    for chunk in response.iter_content(chunk_size=8192):
        if chunk:
            f.write(chunk)
            downloaded += len(chunk)
            if file_size:  # Content-Length may be missing
                percent = (downloaded / file_size) * 100
                print(f"Downloaded: {percent:.1f}%", end='\r')

Error Handling

Exception Types

import requests
from requests.exceptions import (
    RequestException,
    HTTPError,
    ConnectionError,
    Timeout,
    TooManyRedirects
)

url = 'https://api.example.com/data'

try:
    response = requests.get(url, timeout=5)
    response.raise_for_status()
    data = response.json()

except Timeout:
    # Checked first: ConnectTimeout subclasses both Timeout and ConnectionError
    print("Request timed out")

except ConnectionError:
    print("Failed to connect to the server")

except HTTPError as e:
    print(f"HTTP error occurred: {e}")
    print(f"Status Code: {e.response.status_code}")

except TooManyRedirects:
    print("Too many redirects")

except RequestException as e:
    # Base class of all requests exceptions, so it must come last
    print(f"An error occurred: {e}")

Retry Logic

import time

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Method 1: Manual retry
def get_with_retry(url, max_retries=3, delay=1):
    """Make request with retry logic."""
    for attempt in range(max_retries):
        try:
            response = requests.get(url, timeout=5)
            response.raise_for_status()
            return response
        except requests.exceptions.RequestException as e:
            if attempt < max_retries - 1:
                print(f"Attempt {attempt + 1} failed. Retrying in {delay}s...")
                time.sleep(delay)
                delay *= 2  # Exponential backoff
            else:
                raise

# Method 2: Using urllib3 Retry
def create_session_with_retry():
    """Create session with automatic retry."""
    session = requests.Session()

    retry_strategy = Retry(
        total=3,  # Total retries
        status_forcelist=[429, 500, 502, 503, 504],  # Retry on these codes
        # allowed_methods replaces method_whitelist (urllib3 1.26 and later).
        # POST is not idempotent: retry it only if the API tolerates duplicates.
        allowed_methods=["HEAD", "GET", "OPTIONS", "POST"],
        backoff_factor=1  # Wait grows exponentially between retries
    )

    adapter = HTTPAdapter(max_retries=retry_strategy)
    session.mount("http://", adapter)
    session.mount("https://", adapter)

    return session

# Usage
session = create_session_with_retry()
response = session.get('https://api.example.com/data')

Timeout Configuration

import requests

url = 'https://api.example.com/data'

# Single timeout (applies to both connect and read)
try:
    response = requests.get(url, timeout=5)
except requests.exceptions.Timeout:
    print("Request timed out")

# Separate connect and read timeouts
try:
    response = requests.get(url, timeout=(3, 10))  # (connect, read)
except requests.exceptions.Timeout:
    print("Request timed out")

# No timeout (not recommended)
response = requests.get(url, timeout=None)

3 Real-World Applications

1. GitHub API Client

import requests
from typing import List, Dict, Optional

class GitHubClient:
    """GitHub API client."""

    BASE_URL = 'https://api.github.com'

    def __init__(self, token: Optional[str] = None):
        self.session = requests.Session()
        self.session.headers.update({
            'Accept': 'application/vnd.github.v3+json',
            'User-Agent': 'Python-GitHubClient/1.0'
        })

        if token:
            self.session.headers['Authorization'] = f'token {token}'

    def get_user(self, username: str) -> Dict:
        """Get user information."""
        response = self.session.get(f'{self.BASE_URL}/users/{username}')
        response.raise_for_status()
        return response.json()

    def get_user_repos(self, username: str) -> List[Dict]:
        """Get user repositories."""
        repos = []
        page = 1

        # Keep requesting pages until an empty page comes back
        while True:
            response = self.session.get(
                f'{self.BASE_URL}/users/{username}/repos',
                params={'page': page, 'per_page': 100}
            )
            response.raise_for_status()

            data = response.json()
            if not data:
                break

            repos.extend(data)
            page += 1

        return repos

    def search_repositories(self, query: str, sort: str = 'stars') -> List[Dict]:
        """Search repositories."""
        response = self.session.get(
            f'{self.BASE_URL}/search/repositories',
            params={'q': query, 'sort': sort}
        )
        response.raise_for_status()
        return response.json()['items']

    def create_gist(self, description: str, files: Dict[str, str],
                    public: bool = True) -> Dict:
        """Create a gist."""
        data = {
            'description': description,
            'public': public,
            'files': {
                filename: {'content': content}
                for filename, content in files.items()
            }
        }

        response = self.session.post(
            f'{self.BASE_URL}/gists',
            json=data
        )
        response.raise_for_status()
        return response.json()

# Usage
client = GitHubClient(token='your_token_here')

# Get user
user = client.get_user('octocat')
print(f"Name: {user['name']}")
print(f"Followers: {user['followers']}")

# Get repositories
repos = client.get_user_repos('octocat')
print(f"Total repos: {len(repos)}")

for repo in repos[:5]:
    print(f"- {repo['name']}: {repo['stargazers_count']} stars")

# Search repositories
python_repos = client.search_repositories('python machine learning')
print("\nTop Python ML repos:")
for repo in python_repos[:5]:
    print(f"- {repo['full_name']}: {repo['stargazers_count']} stars")

2. Weather API Client with Caching

import hashlib
import json
import time
from typing import Dict, Optional

import requests

class WeatherClient:
    """Weather API client with caching."""

    BASE_URL = 'https://api.openweathermap.org/data/2.5'

    def __init__(self, api_key: str):
        self.api_key = api_key
        self.session = requests.Session()
        self.cache = {}
        self.cache_ttl = 600  # 10 minutes

    def _get_cache_key(self, endpoint: str, params: Dict) -> str:
        """Generate cache key."""
        key_data = f"{endpoint}:{json.dumps(params, sort_keys=True)}"
        return hashlib.md5(key_data.encode()).hexdigest()

    def _get_from_cache(self, key: str) -> Optional[Dict]:
        """Get data from cache if valid."""
        if key in self.cache:
            data, timestamp = self.cache[key]
            if time.time() - timestamp < self.cache_ttl:
                return data
        return None

    def _save_to_cache(self, key: str, data: Dict):
        """Save data to cache."""
        self.cache[key] = (data, time.time())

    def _make_request(self, endpoint: str, params: Dict) -> Dict:
        """Make API request with caching."""
        # Copy so the caller's dictionary is not modified
        params = {**params, 'appid': self.api_key}

        # Check cache
        cache_key = self._get_cache_key(endpoint, params)
        cached_data = self._get_from_cache(cache_key)

        if cached_data is not None:
            print("Returning cached data")
            return cached_data

        # Make request
        response = self.session.get(
            f'{self.BASE_URL}/{endpoint}',
            params=params
        )
        response.raise_for_status()

        data = response.json()

        # Save to cache
        self._save_to_cache(cache_key, data)

        return data

    def get_current_weather(self, city: str, units: str = 'metric') -> Dict:
        """Get current weather for a city."""
        return self._make_request('weather', {
            'q': city,
            'units': units
        })

    def get_forecast(self, city: str, units: str = 'metric') -> Dict:
        """Get 5-day forecast."""
        return self._make_request('forecast', {
            'q': city,
            'units': units
        })

    def get_weather_by_coords(self, lat: float, lon: float,
                              units: str = 'metric') -> Dict:
        """Get weather by coordinates."""
        return self._make_request('weather', {
            'lat': lat,
            'lon': lon,
            'units': units
        })

# Usage
client = WeatherClient(api_key='your_api_key_here')

# Get current weather
weather = client.get_current_weather('London')
print(f"City: {weather['name']}")
print(f"Temperature: {weather['main']['temp']}°C")
print(f"Description: {weather['weather'][0]['description']}")

# Second call - returns cached data
weather = client.get_current_weather('London')

# Get forecast
forecast = client.get_forecast('Paris')
print(f"\n5-day forecast for {forecast['city']['name']}:")
for item in forecast['list'][:5]:
    print(f"- {item['dt_txt']}: {item['main']['temp']}°C")

3. REST API Wrapper with Rate Limiting

import requests
import time
from typing import Dict, Any, Optional
from collections import deque
from datetime import datetime, timedelta

class RateLimitedClient:
    """API client with rate limiting."""

    def __init__(self, base_url: str, requests_per_minute: int = 60):
        self.base_url = base_url.rstrip('/')
        self.session = requests.Session()
        self.requests_per_minute = requests_per_minute
        self.request_times = deque()

    def _wait_if_needed(self):
        """Wait if rate limit would be exceeded."""
        now = datetime.now()
        minute_ago = now - timedelta(minutes=1)

        # Remove old requests
        while self.request_times and self.request_times[0] < minute_ago:
            self.request_times.popleft()

        # Check if we need to wait
        if len(self.request_times) >= self.requests_per_minute:
            sleep_time = (self.request_times[0] - minute_ago).total_seconds()
            if sleep_time > 0:
                print(f"Rate limit reached. Waiting {sleep_time:.1f}s...")
                time.sleep(sleep_time)
                self._wait_if_needed()  # Recheck

    def _record_request(self):
        """Record request timestamp."""
        self.request_times.append(datetime.now())

    def request(self, method: str, endpoint: str, **kwargs) -> requests.Response:
        """Make rate-limited request."""
        self._wait_if_needed()

        url = f'{self.base_url}/{endpoint.lstrip("/")}'
        response = self.session.request(method, url, **kwargs)

        self._record_request()

        return response

    def get(self, endpoint: str, **kwargs) -> requests.Response:
        """GET request."""
        return self.request('GET', endpoint, **kwargs)

    def post(self, endpoint: str, **kwargs) -> requests.Response:
        """POST request."""
        return self.request('POST', endpoint, **kwargs)

    def put(self, endpoint: str, **kwargs) -> requests.Response:
        """PUT request."""
        return self.request('PUT', endpoint, **kwargs)

    def delete(self, endpoint: str, **kwargs) -> requests.Response:
        """DELETE request."""
        return self.request('DELETE', endpoint, **kwargs)

# Usage
client = RateLimitedClient(
    base_url='https://api.example.com',
    requests_per_minute=10
)

# Make requests - automatically rate limited
for i in range(15):
    print(f"Request {i + 1}")
    response = client.get('/data', params={'id': i})
    print(f"Status: {response.status_code}")
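Client-side throttling lowers the chance of hitting a server's limit but does not rule it out. When a server answers 429 Too Many Requests it often sends a Retry-After header, either as a number of seconds or as an HTTP date. The helper below is a minimal sketch using only the standard library; the name retry_after_seconds and the optional now parameter are illustrative assumptions, not part of requests.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def retry_after_seconds(headers: Mapping[str, str],
                        now: Optional[datetime] = None) -> Optional[float]:
    """Return how long to wait according to a Retry-After header, if any."""
    value = headers.get('Retry-After')
    if value is None:
        return None

    value = value.strip()
    # Form 1: a non-negative integer number of seconds.
    if value.isdigit():
        return float(value)

    # Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        target = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # Unparseable value: let the caller pick a default.
    if target.tzinfo is None:
        target = target.replace(tzinfo=timezone.utc)

    now = now or datetime.now(timezone.utc)
    return max(0.0, (target - now).total_seconds())


print(retry_after_seconds({'Retry-After': '30'}))
```

With requests, response.headers can be passed in directly because it is a case-insensitive mapping; a plain dict, as in the demo line, must use the exact key Retry-After. A caller could sleep for the returned value before retrying and fall back to exponential backoff when the result is None.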

Best Practices

1. Use a Session

import requests

# ❌ Don't: Create a new connection for each request
for i in range(100):
    response = requests.get('https://api.example.com/data')

# ✅ Do: Reuse a session (connections and default headers are shared)
session = requests.Session()
session.headers.update({'User-Agent': 'MyApp/1.0'})

for i in range(100):
    response = session.get('https://api.example.com/data')

2. Always Set Timeout

import requests

# ❌ Don't: No timeout (the call can hang indefinitely)
response = requests.get('https://api.example.com/data')

# ✅ Do: Set a timeout
response = requests.get('https://api.example.com/data', timeout=5)

3. Handle Errors

import requests

# ❌ Don't: Ignore errors
response = requests.get('https://api.example.com/data')
data = response.json()

# ✅ Do: Handle errors
try:
    response = requests.get('https://api.example.com/data', timeout=5)
    response.raise_for_status()
    data = response.json()
except requests.exceptions.RequestException as e:
    print(f"Error: {e}")

4. Use Environment Variables for Secrets

import os
import requests

# ❌ Don't: Hard-code secrets
api_key = 'my_secret_api_key'

# ✅ Do: Use environment variables
api_key = os.environ.get('API_KEY')

if not api_key:
    raise ValueError("API_KEY environment variable not set")

response = requests.get(
    'https://api.example.com/data',
    headers={'Authorization': f'Bearer {api_key}'}
)

Practice Exercises

Exercise 1: REST API Client

Build a client for the JSONPlaceholder API that supports CRUD operations.

Exercise 2: Rate Limiter

Implement a rate limiter using the token bucket algorithm.
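As a possible starting point, here is a minimal token bucket sketch. The class name TokenBucket, its method names, and the injectable clock parameter are assumptions made for illustration. The bucket holds up to capacity tokens, refills at rate tokens per second, and each request spends one token.

```python
import time
from typing import Callable


class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity  # Start with a full bucket.
        self.updated = clock()

    def _refill(self) -> None:
        """Add the tokens earned since the last update, capped at capacity."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Take `cost` tokens if available; return False instead of blocking."""
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

    def wait_time(self, cost: float = 1.0) -> float:
        """Seconds until `cost` tokens are available (0 if available now)."""
        self._refill()
        missing = cost - self.tokens
        return max(0.0, missing / self.rate)


bucket = TokenBucket(rate=2, capacity=5)
print([bucket.try_acquire() for _ in range(6)])
```

Unlike the sliding-window deque in RateLimitedClient, a token bucket allows short bursts up to capacity while still enforcing the average rate. A blocking version could call time.sleep(bucket.wait_time()) before each request.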

Exercise 3: Retry Logic

Write a decorator that retries automatically with exponential backoff.
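One way to approach this exercise is sketched below. The decorator name retry and its parameters (max_attempts, base_delay, exceptions, sleep) are illustrative choices, and the demo uses the built-in ConnectionError with a recording sleep function so that it runs instantly and offline.

```python
import functools
import time
from typing import Callable, Tuple, Type


def retry(max_attempts: int = 3,
          base_delay: float = 1.0,
          exceptions: Tuple[Type[BaseException], ...] = (Exception,),
          sleep: Callable[[float], None] = time.sleep):
    """Retry the wrapped function, doubling the delay after each failure."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            delay = base_delay
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # Out of attempts: surface the last error.
                    sleep(delay)
                    delay *= 2  # Exponential backoff.
        return wrapper
    return decorator


# Demonstration: a function that fails twice, then succeeds.
calls = {'count': 0}
delays = []


@retry(max_attempts=4, base_delay=0.5, exceptions=(ConnectionError,),
       sleep=delays.append)  # Record delays instead of actually sleeping.
def flaky():
    calls['count'] += 1
    if calls['count'] < 3:
        raise ConnectionError('temporary failure')
    return 'ok'


result = flaky()
print(result, calls['count'], delays)  # ok 3 [0.5, 1.0]
```

For real API calls, passing exceptions=(requests.exceptions.RequestException,) and leaving sleep at its default would be the natural configuration.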

Exercise 4: API Caching

Implement a caching layer with a TTL for API responses.

Exercise 5: Concurrent Requests

Use concurrent.futures to make multiple API requests concurrently.
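A possible skeleton for this exercise uses ThreadPoolExecutor and as_completed from the standard library. The function fetch_all and the stand-in fake_fetch are hypothetical names; the fetch function is passed in as a parameter so the sketch runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def fetch_all(urls: Iterable[str],
              fetch: Callable[[str], object],
              max_workers: int = 5) -> Dict[str, object]:
    """Call fetch(url) in a thread pool; map each URL to its result or exception."""
    results: Dict[str, object] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        future_to_url = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # Keep going if one request fails.
                results[url] = exc
    return results


# Stand-in fetcher so the sketch runs offline. A real one might be:
#   lambda url: requests.get(url, timeout=5).json()
def fake_fetch(url: str) -> int:
    if url.endswith('/bad'):
        raise ValueError(f'cannot fetch {url}')
    return len(url)


urls = ['https://api.example.com/a', 'https://api.example.com/bad']
results = fetch_all(urls, fake_fetch)
for url in urls:
    print(url, '->', results[url])
```

Threads suit this workload because HTTP requests spend most of their time waiting on the network. Keeping max_workers modest also helps avoid tripping the API's rate limit.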

Summary

In Part 1 we covered:

  1. Requests Library - HTTP client cho Python
  2. HTTP Methods - GET, POST, PUT, PATCH, DELETE
  3. Headers & Authentication - Basic, Bearer, API Key, OAuth
  4. Response Handling - Status codes, JSON, streaming
  5. Error Handling - Exceptions, retry logic, timeouts
  6. Real Applications - GitHub client, Weather client, Rate limiter

Part 2 will cover sessions, async requests, webhooks, and API best practices! 🚀


Next lesson: Lesson 20.2: Advanced API Techniques 🌐