Lesson 8: Working with JSON
Learning Objectives
After completing this lesson, you will be able to:
- ✅ Use the json module
- ✅ Serialize and deserialize data
- ✅ Create custom JSON encoders/decoders
- ✅ Work with APIs
- ✅ Handle JSON errors
- ✅ Apply best practices for JSON operations
What Is JSON?
JSON (JavaScript Object Notation) is a lightweight, text-based data format used to exchange data between systems.
Why JSON?
# JSON is a universal format:
# - Human-readable
# - Language-agnostic
# - Widely supported
# - Compact

# Python dict
data = {
    "name": "John Doe",
    "age": 30,
    "email": "[email protected]",
    "active": True
}

# JSON string
json_string = '{"name": "John Doe", "age": 30, "email": "[email protected]", "active": true}'

# APIs, configs, and data storage all use JSON!
json Module Basics
The json module provides functions for working with JSON.
dumps() - Python to JSON String
import json

# Dict to JSON string
data = {
    "name": "Alice",
    "age": 25,
    "skills": ["Python", "Django", "PostgreSQL"],
    "active": True,
    "salary": None
}

json_string = json.dumps(data)
print(json_string)
# {"name": "Alice", "age": 25, "skills": ["Python", "Django", "PostgreSQL"], "active": true, "salary": null}

print(type(json_string))  # <class 'str'>

# Pretty print with indent
json_pretty = json.dumps(data, indent=2)
print(json_pretty)
# {
#   "name": "Alice",
#   "age": 25,
#   "skills": [
#     "Python",
#     "Django",
#     "PostgreSQL"
#   ],
#   "active": true,
#   "salary": null
# }

# Sort keys
json_sorted = json.dumps(data, indent=2, sort_keys=True)
print(json_sorted)
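While indent makes output easier to read, the opposite goal (the smallest possible string, for example when sending data over a network) can be reached with the separators parameter of json.dumps. The following is a minimal sketch; the sample data is made up for illustration.

```python
import json

data = {"name": "Alice", "skills": ["Python", "Django"], "active": True}

# With indent=None the default separators are (', ', ': ').
default_output = json.dumps(data)

# Dropping the spaces after ',' and ':' gives the most compact encoding.
compact_output = json.dumps(data, separators=(",", ":"))

print(default_output)
print(compact_output)
print(len(default_output), len(compact_output))
```

Both strings decode to the same Python object; only the whitespace differs.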
loads() - JSON String to Python
import json

# JSON string to Python dict
json_string = '{"name": "Bob", "age": 30, "active": true}'

data = json.loads(json_string)
print(data)        # {'name': 'Bob', 'age': 30, 'active': True}
print(type(data))  # <class 'dict'>

# Access data
print(data['name'])  # Bob
print(data['age'])   # 30

# JSON array to Python list
json_array = '["Python", "JavaScript", "Go"]'
languages = json.loads(json_array)
print(languages)        # ['Python', 'JavaScript', 'Go']
print(type(languages))  # <class 'list'>
dump() - Write to File
import json

data = {
    "users": [
        {"id": 1, "name": "Alice", "email": "[email protected]"},
        {"id": 2, "name": "Bob", "email": "[email protected]"}
    ],
    "count": 2
}

# Write to file
with open('data.json', 'w') as f:
    json.dump(data, f, indent=2)

# File content:
# {
#   "users": [
#     {
#       "id": 1,
#       "name": "Alice",
#       "email": "[email protected]"
#     },
#     ...
#   ],
#   "count": 2
# }
load() - Read from File
import json
import os

# Read from file
with open('data.json', 'r') as f:
    data = json.load(f)

print(data)
print(data['count'])             # 2
print(data['users'][0]['name'])  # Alice

# Handle missing file
if os.path.exists('config.json'):
    with open('config.json', 'r') as f:
        config = json.load(f)
else:
    config = {"default": True}
    with open('config.json', 'w') as f:
        json.dump(config, f)
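The same read-with-fallback pattern can be written with pathlib and an explicit encoding, which avoids depending on the platform's default text encoding. This is a sketch only: load_json is a hypothetical helper name, and the temporary directory is used just to keep the example self-contained.

```python
import json
import tempfile
from pathlib import Path


def load_json(path, default=None):
    """Return parsed JSON from path, or default if the file does not exist."""
    path = Path(path)
    if not path.exists():
        return default
    # Decode explicitly as UTF-8 instead of relying on the platform default.
    return json.loads(path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "config.json"

    # The file does not exist yet, so the default is returned.
    missing = load_json(config_path, default={"default": True})

    # Write a file containing non-ASCII text, then read it back.
    config_path.write_text(
        json.dumps({"name": "Nguyễn Văn A"}, ensure_ascii=False),
        encoding="utf-8",
    )
    loaded = load_json(config_path)

print(missing)
print(loaded)
```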
Data Type Mapping
Python to JSON
import json

# Python types → JSON types
data = {
    "string": "text",          # → "text"
    "integer": 42,             # → 42
    "float": 3.14,             # → 3.14
    "boolean": True,           # → true
    "none": None,              # → null
    "list": [1, 2, 3],         # → [1, 2, 3]
    "tuple": (1, 2, 3),        # → [1, 2, 3] (becomes array)
    "dict": {"key": "value"}   # → {"key": "value"}
}

print(json.dumps(data, indent=2))

# Note: Tuples become arrays in JSON!
JSON to Python
import json

json_data = '''{
    "string": "text",
    "number": 42,
    "float": 3.14,
    "boolean": true,
    "null": null,
    "array": [1, 2, 3],
    "object": {"key": "value"}
}'''

data = json.loads(json_data)

print(type(data['string']))   # <class 'str'>
print(type(data['number']))   # <class 'int'>
print(type(data['float']))    # <class 'float'>
print(type(data['boolean']))  # <class 'bool'>
print(type(data['null']))     # <class 'NoneType'>
print(type(data['array']))    # <class 'list'>
print(type(data['object']))   # <class 'dict'>
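Because the two mappings are not exact inverses, a value that goes through dumps and then loads may come back changed. The short sketch below, using made-up sample data, shows two common cases: tuples come back as lists, and non-string dictionary keys come back as strings.

```python
import json

original = {
    "point": (1, 2),     # tuple value
    1: "integer key",    # non-string key
}

restored = json.loads(json.dumps(original))

print(restored)
print(restored == original)
```

If the exact Python types matter after a round trip, they have to be restored explicitly after loading.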
Custom JSON Encoders
To serialize custom objects, create a custom encoder.
Basic Custom Encoder
import json
from datetime import datetime


class DateTimeEncoder(json.JSONEncoder):
    """Encode datetime objects to ISO format."""

    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)


# Use custom encoder
data = {
    "event": "Meeting",
    "timestamp": datetime.now()
}

# Without encoder - ERROR
# json.dumps(data)  # TypeError: Object of type datetime is not JSON serializable

# With encoder - SUCCESS
json_string = json.dumps(data, cls=DateTimeEncoder)
print(json_string)
# {"event": "Meeting", "timestamp": "2025-10-27T10:30:00.123456"}
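For a one-off conversion, a full JSONEncoder subclass is not strictly required: json.dumps also accepts a default callable that is invoked for objects it cannot serialize. The sketch below assumes a hypothetical helper named encode_extra and a fixed timestamp so the output is reproducible.

```python
import json
from datetime import datetime


def encode_extra(obj):
    """Convert otherwise unsupported objects; raise TypeError for the rest."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


data = {"event": "Meeting", "timestamp": datetime(2025, 10, 27, 10, 30)}

json_string = json.dumps(data, default=encode_extra)
print(json_string)
# {"event": "Meeting", "timestamp": "2025-10-27T10:30:00"}
```

A subclass is still the better fit when the same conversion rules are reused in many places.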
Advanced Custom Encoder
import json
from datetime import datetime, date
from decimal import Decimal


class AdvancedEncoder(json.JSONEncoder):
    """Handle multiple custom types."""

    def default(self, obj):
        # Handle datetime (checked before date, because datetime is a subclass of date)
        if isinstance(obj, datetime):
            return {
                '_type': 'datetime',
                'value': obj.isoformat()
            }

        # Handle date
        if isinstance(obj, date):
            return {
                '_type': 'date',
                'value': obj.isoformat()
            }

        # Handle Decimal
        if isinstance(obj, Decimal):
            return {
                '_type': 'decimal',
                'value': str(obj)
            }

        # Handle sets
        if isinstance(obj, set):
            return {
                '_type': 'set',
                'value': list(obj)
            }

        return super().default(obj)


# Test
data = {
    "created_at": datetime.now(),
    "date": date.today(),
    "price": Decimal('99.99'),
    "tags": {'python', 'json', 'tutorial'}
}

json_string = json.dumps(data, cls=AdvancedEncoder, indent=2)
print(json_string)
Class Instance Encoder
import json
from datetime import datetime  # needed for created_at below


class User:
    def __init__(self, id, name, email):
        self.id = id
        self.name = name
        self.email = email
        self.created_at = datetime.now()

    def to_dict(self):
        """Convert to dictionary."""
        return {
            'id': self.id,
            'name': self.name,
            'email': self.email,
            'created_at': self.created_at.isoformat()
        }


class UserEncoder(json.JSONEncoder):
    """Encode User objects."""

    def default(self, obj):
        if isinstance(obj, User):
            return obj.to_dict()
        return super().default(obj)


# Usage
user = User(1, "Alice", "[email protected]")

json_string = json.dumps(user, cls=UserEncoder, indent=2)
print(json_string)
# {
#   "id": 1,
#   "name": "Alice",
#   "email": "[email protected]",
#   "created_at": "2025-10-27T10:30:00.123456"
# }

# Multiple users
users = [
    User(1, "Alice", "[email protected]"),
    User(2, "Bob", "[email protected]")
]

json_string = json.dumps(users, cls=UserEncoder, indent=2)
print(json_string)
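When a class only holds plain data, one alternative to a hand-written to_dict method is a dataclass combined with dataclasses.asdict. This is a sketch of that alternative, not the approach used above; UserRecord and its fields are hypothetical and deliberately limited to JSON-friendly types.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class UserRecord:
    id: int
    name: str
    active: bool = True


users = [UserRecord(1, "Alice"), UserRecord(2, "Bob", active=False)]

# asdict() turns each dataclass instance into a plain dict that json can encode.
json_string = json.dumps([asdict(u) for u in users])
print(json_string)
# [{"id": 1, "name": "Alice", "active": true}, {"id": 2, "name": "Bob", "active": false}]
```

Fields such as datetime values would still need a custom encoder or a default function.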
Custom JSON Decoders
Custom decoders let you deserialize with custom logic.
object_hook
import json
from datetime import datetime


def datetime_decoder(dct):
    """Decode datetime objects."""
    if '_type' in dct and dct['_type'] == 'datetime':
        return datetime.fromisoformat(dct['value'])
    return dct


# JSON with datetime
json_string = '''{
    "event": "Meeting",
    "timestamp": {
        "_type": "datetime",
        "value": "2025-10-27T10:30:00.123456"
    }
}'''

data = json.loads(json_string, object_hook=datetime_decoder)
print(data)
print(type(data['timestamp']))  # <class 'datetime.datetime'>
print(data['timestamp'])        # 2025-10-27 10:30:00.123456
Advanced Decoder
import json
from datetime import datetime, date
from decimal import Decimal


def advanced_decoder(dct):
    """Handle multiple custom types."""
    if '_type' not in dct:
        return dct

    type_name = dct['_type']
    value = dct['value']

    if type_name == 'datetime':
        return datetime.fromisoformat(value)

    if type_name == 'date':
        return date.fromisoformat(value)

    if type_name == 'decimal':
        return Decimal(value)

    if type_name == 'set':
        return set(value)

    return dct


# Decode
json_string = '''{
    "created_at": {
        "_type": "datetime",
        "value": "2025-10-27T10:30:00.123456"
    },
    "price": {
        "_type": "decimal",
        "value": "99.99"
    },
    "tags": {
        "_type": "set",
        "value": ["python", "json"]
    }
}'''

data = json.loads(json_string, object_hook=advanced_decoder)
print(data['created_at'])  # datetime object
print(data['price'])       # Decimal('99.99')
print(data['tags'])        # {'python', 'json'}
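An encoder and a decoder that agree on the same '_type' tags can be checked together with a round trip. The sketch below is a reduced version covering only Decimal and set; TaggedEncoder and tagged_decoder are hypothetical names, and the set is sorted before encoding (which assumes its items are comparable) so the output is stable.

```python
import json
from decimal import Decimal


class TaggedEncoder(json.JSONEncoder):
    """Wrap unsupported values in a dict that records their original type."""

    def default(self, obj):
        if isinstance(obj, Decimal):
            return {"_type": "decimal", "value": str(obj)}
        if isinstance(obj, set):
            # Sorted for deterministic output; assumes comparable items.
            return {"_type": "set", "value": sorted(obj)}
        return super().default(obj)


def tagged_decoder(dct):
    """Reverse TaggedEncoder: rebuild the original type from the tag."""
    type_name = dct.get("_type")
    if type_name == "decimal":
        return Decimal(dct["value"])
    if type_name == "set":
        return set(dct["value"])
    return dct


original = {"price": Decimal("99.99"), "tags": {"python", "json"}}

encoded = json.dumps(original, cls=TaggedEncoder)
restored = json.loads(encoded, object_hook=tagged_decoder)

print(encoded)
print(restored == original)  # True
```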
Error Handling
import json

# JSONDecodeError
invalid_json = '{"name": "Alice", "age": 30'  # Missing closing brace

try:
    data = json.loads(invalid_json)
except json.JSONDecodeError as e:
    print(f"Invalid JSON: {e}")
    print(f"Line {e.lineno}, Column {e.colno}")
    # Invalid JSON: Expecting ',' delimiter: line 1 column 28 (char 27)
    # Line 1, Column 28


# Handle with default value
def safe_load_json(json_string, default=None):
    """Safely load JSON with fallback."""
    try:
        return json.loads(json_string)
    except json.JSONDecodeError:
        return default


data = safe_load_json('invalid json', default={})
print(data)  # {}


# File not found
def load_config(filename, default=None):
    """Load config file with fallback."""
    try:
        with open(filename, 'r') as f:
            return json.load(f)
    except FileNotFoundError:
        return default
    except json.JSONDecodeError:
        print(f"Invalid JSON in {filename}")
        return default


config = load_config('config.json', default={'debug': False})
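Besides lineno and colno, a JSONDecodeError also carries msg (the bare message), doc (the text being parsed), and pos (the character index of the failure). These can be combined into a more helpful report for multi-line input. The following is a sketch; describe_json_error is a hypothetical helper.

```python
import json


def describe_json_error(text):
    """Return None if text parses, else a message pointing at the failure."""
    try:
        json.loads(text)
    except json.JSONDecodeError as e:
        lines = e.doc.splitlines()
        # The error may sit just past the last line, so guard the lookup.
        line = lines[e.lineno - 1] if e.lineno <= len(lines) else ""
        pointer = " " * (e.colno - 1) + "^"
        return f"{e.msg} (line {e.lineno}, column {e.colno})\n{line}\n{pointer}"
    return None


bad = '{\n  "name": "Alice",\n  "age": \n}'
print(describe_json_error(bad))
print(describe_json_error('{"ok": true}'))
```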
Real-world Examples
1. Configuration Manager
import json
import os


class ConfigManager:
    """Manage application configuration."""

    def __init__(self, config_file='config.json'):
        self.config_file = config_file
        self.config = self.load()

    def load(self):
        """Load configuration from file."""
        if os.path.exists(self.config_file):
            with open(self.config_file, 'r') as f:
                return json.load(f)
        return self.default_config()

    def save(self):
        """Save configuration to file."""
        with open(self.config_file, 'w') as f:
            json.dump(self.config, f, indent=2)

    def get(self, key, default=None):
        """Get configuration value."""
        return self.config.get(key, default)

    def set(self, key, value):
        """Set configuration value."""
        self.config[key] = value
        self.save()

    def default_config(self):
        """Default configuration."""
        return {
            'debug': False,
            'host': 'localhost',
            'port': 8000,
            'database': {
                'host': 'localhost',
                'port': 5432,
                'name': 'mydb'
            }
        }


# Usage
config = ConfigManager()
print(config.get('debug'))  # False

config.set('debug', True)
print(config.get('debug'))  # True

db_host = config.get('database')['host']
print(db_host)  # localhost
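The save method above overwrites the file in place, so a crash in the middle of a write could leave a truncated config. One common mitigation, sketched here as an assumption rather than as part of ConfigManager, is to write to a temporary file in the same directory and then swap it in with os.replace. The helper name save_json_atomic is hypothetical.

```python
import json
import os
import tempfile


def save_json_atomic(data, filename):
    """Write data to filename so readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(filename))
    # Create the temporary file next to the target so the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)
        os.replace(tmp_path, filename)
    except BaseException:
        # Clean up the temporary file if writing or renaming failed.
        os.unlink(tmp_path)
        raise


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "config.json")
    save_json_atomic({"debug": True}, target)
    with open(target, encoding="utf-8") as f:
        saved = json.load(f)
    leftovers = os.listdir(tmp)

print(saved)      # {'debug': True}
print(leftovers)  # ['config.json']
```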
2. API Response Handler
import json
from typing import Optional, Dict, Any


class APIResponse:
    """Handle API responses."""

    def __init__(self, status_code: int, body: str):
        self.status_code = status_code
        self.raw_body = body
        self._data = None

    @property
    def data(self) -> Optional[Dict[Any, Any]]:
        """Parse JSON data."""
        if self._data is None:
            try:
                self._data = json.loads(self.raw_body)
            except json.JSONDecodeError:
                self._data = None
        return self._data

    @property
    def is_success(self) -> bool:
        """Check if request was successful."""
        return 200 <= self.status_code < 300

    def get(self, key: str, default=None):
        """Get value from response data."""
        # Only dicts support .get(); a JSON array or scalar body returns the default.
        if isinstance(self.data, dict):
            return self.data.get(key, default)
        return default


# Usage (simulation)
response_body = '{"users": [{"id": 1, "name": "Alice"}], "count": 1}'
response = APIResponse(200, response_body)

print(response.is_success)    # True
print(response.data)          # {'users': [...], 'count': 1}
print(response.get('count'))  # 1
3. Cache System
import json
import time
from typing import Any, Optional


class JSONCache:
    """Simple file-based cache using JSON."""

    def __init__(self, cache_file='cache.json', ttl=3600):
        self.cache_file = cache_file
        self.ttl = ttl  # Time to live in seconds
        self.cache = self._load()

    def _load(self):
        """Load cache from file."""
        try:
            with open(self.cache_file, 'r') as f:
                return json.load(f)
        except (FileNotFoundError, json.JSONDecodeError):
            return {}

    def _save(self):
        """Save cache to file."""
        with open(self.cache_file, 'w') as f:
            json.dump(self.cache, f, indent=2)

    def get(self, key: str) -> Optional[Any]:
        """Get value from cache."""
        if key not in self.cache:
            return None

        entry = self.cache[key]

        # Check if expired
        if time.time() - entry['timestamp'] > self.ttl:
            del self.cache[key]
            self._save()
            return None

        return entry['value']

    def set(self, key: str, value: Any):
        """Set value in cache."""
        self.cache[key] = {
            'value': value,
            'timestamp': time.time()
        }
        self._save()

    def clear(self):
        """Clear all cache."""
        self.cache = {}
        self._save()


# Usage
cache = JSONCache(ttl=60)  # 60 seconds TTL

cache.set('user:1', {'name': 'Alice', 'email': '[email protected]'})
user = cache.get('user:1')
print(user)  # {'name': 'Alice', 'email': '[email protected]'}

time.sleep(61)
user = cache.get('user:1')
print(user)  # None (expired)
4. Data Export/Import
import json
from datetime import datetime


class DataExporter:
    """Export data to JSON format."""

    @staticmethod
    def export_users(users, filename):
        """Export users to JSON file."""
        data = {
            'exported_at': datetime.now().isoformat(),
            'count': len(users),
            'users': [
                {
                    'id': user['id'],
                    'name': user['name'],
                    'email': user['email']
                }
                for user in users
            ]
        }

        with open(filename, 'w') as f:
            json.dump(data, f, indent=2)

        print(f"Exported {len(users)} users to {filename}")

    @staticmethod
    def import_users(filename):
        """Import users from JSON file."""
        with open(filename, 'r') as f:
            data = json.load(f)

        print(f"Imported {data['count']} users")
        print(f"Exported at: {data['exported_at']}")

        return data['users']


# Usage
users = [
    {'id': 1, 'name': 'Alice', 'email': '[email protected]'},
    {'id': 2, 'name': 'Bob', 'email': '[email protected]'}
]

DataExporter.export_users(users, 'users_export.json')
imported = DataExporter.import_users('users_export.json')
print(imported)
5. JSON Schema Validator
import json


class JSONValidator:
    """Validate JSON structure."""

    @staticmethod
    def validate_user(data):
        """Validate user data structure."""
        required_fields = ['id', 'name', 'email']

        # Check if all required fields exist
        for field in required_fields:
            if field not in data:
                return False, f"Missing required field: {field}"

        # Validate types
        if not isinstance(data['id'], int):
            return False, "id must be an integer"

        if not isinstance(data['name'], str):
            return False, "name must be a string"

        if not isinstance(data['email'], str):
            return False, "email must be a string"

        # Validate email format (basic)
        if '@' not in data['email']:
            return False, "email must contain @"

        return True, "Valid"

    @staticmethod
    def validate_json_file(filename, validator_func):
        """Validate JSON file."""
        try:
            with open(filename, 'r') as f:
                data = json.load(f)

            is_valid, message = validator_func(data)
            return is_valid, message

        except FileNotFoundError:
            return False, f"File not found: {filename}"
        except json.JSONDecodeError as e:
            return False, f"Invalid JSON: {e}"


# Usage
user_data = {
    'id': 1,
    'name': 'Alice',
    'email': '[email protected]'
}

is_valid, message = JSONValidator.validate_user(user_data)
print(f"Valid: {is_valid}, Message: {message}")
# Valid: True, Message: Valid
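The validator above stops at the first problem it finds. A variant that reports every problem at once can be more convenient when checking user-supplied files. The sketch below is one possible shape for that: collect_user_errors is a hypothetical function, the email address is a placeholder, and booleans are rejected explicitly because bool is a subclass of int in Python.

```python
def collect_user_errors(data):
    """Return a list of every problem found; an empty list means valid."""
    errors = []
    expected_types = {"id": int, "name": str, "email": str}

    for field, expected in expected_types.items():
        if field not in data:
            errors.append(f"Missing required field: {field}")
        # isinstance(True, int) is True, so reject bool values explicitly.
        elif isinstance(data[field], bool) or not isinstance(data[field], expected):
            errors.append(f"{field} must be of type {expected.__name__}")

    return errors


print(collect_user_errors({"id": 1, "name": "Alice", "email": "alice@example.com"}))
# []
print(collect_user_errors({"id": True, "name": 42}))
# ['id must be of type int', 'name must be of type str', 'Missing required field: email']
```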
Best Practices
import json
from datetime import datetime  # needed for examples 3 and 8

# 1. Always use indent for readability
data = {'name': 'Alice', 'age': 25}
json.dumps(data, indent=2)  # Good
# json.dumps(data)          # Bad for debugging


# 2. Handle errors properly
def safe_json_load(filename):
    try:
        with open(filename, 'r') as f:
            return json.load(f)
    except FileNotFoundError:
        print(f"File not found: {filename}")
        return None
    except json.JSONDecodeError as e:
        print(f"Invalid JSON: {e}")
        return None


# 3. Use custom encoders for custom types
class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)


# 4. Validate JSON structure
def validate_config(config):
    required = ['host', 'port', 'debug']
    return all(key in config for key in required)


# 5. Use context managers for files
with open('data.json', 'r') as f:
    data = json.load(f)
# File auto-closes

# 6. Sort keys for consistency
json.dumps(data, sort_keys=True, indent=2)

# 7. Use ensure_ascii=False for Unicode
data = {'name': 'Nguyễn Văn A'}
json.dumps(data, ensure_ascii=False, indent=2)
# {
#   "name": "Nguyễn Văn A"
# }

# 8. Separate data and metadata
data_to_save = {
    'metadata': {
        'version': '1.0',
        'created_at': datetime.now().isoformat()
    },
    'data': {
        'users': [...]
    }
}
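Sorting keys matters most when JSON strings are compared, diffed, or hashed: two dicts with the same content but a different insertion order produce different strings unless sort_keys=True is set. A small sketch with made-up config values:

```python
import json

a = {"host": "localhost", "port": 8000}
b = {"port": 8000, "host": "localhost"}

# Same content, different insertion order.
plain_a, plain_b = json.dumps(a), json.dumps(b)
sorted_a, sorted_b = json.dumps(a, sort_keys=True), json.dumps(b, sort_keys=True)

print(a == b)                # True  (dict comparison ignores order)
print(plain_a == plain_b)    # False (strings keep insertion order)
print(sorted_a == sorted_b)  # True  (sorted keys give a stable string)
```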
Practice Exercises
Exercise 1: Contact Manager
Build a contact manager with JSON storage:
- Add/edit/delete contacts
- Save to JSON file
- Load from JSON file
- Search contacts
Exercise 2: Custom Serializer
Create a custom encoder/decoder for:
- datetime objects
- Decimal numbers
- Custom class instances
Exercise 3: API Mock Server
Build a mock API with JSON responses:
- Load responses from JSON files
- Return data based on endpoint
- Handle errors
Exercise 4: Config Editor
Build a CLI tool to edit a JSON config:
- Read config
- Update values
- Validate structure
- Save changes
Exercise 5: Data Transformer
Transform JSON data:
- Read from one format
- Transform structure
- Write to new format
- Handle nested data
Summary
✅ json module: dumps, loads, dump, load
✅ Serialization: Python → JSON string
✅ Deserialization: JSON string → Python
✅ Custom encoders: JSONEncoder.default()
✅ Custom decoders: object_hook parameter
✅ Error handling: JSONDecodeError
✅ File operations: dump/load with context managers
✅ Best practices: indent, sort_keys, ensure_ascii, validation
Next Lesson
Lesson 9: Working with Dates and Times - datetime module, timezone handling, date arithmetic! 🚀
Remember:
- Use indent=2 when output is meant to be read by people
- Handle JSONDecodeError properly
- Use custom encoders for custom types
- Validate JSON structure
- Use context managers for files! 🎯