Lesson 8: Working with JSON

Lesson Objectives

After completing this lesson, you will be able to:

  • ✅ Use the json module
  • ✅ Serialize and deserialize data
  • ✅ Create custom JSON encoders/decoders
  • ✅ Work with APIs
  • ✅ Handle JSON errors
  • ✅ Apply best practices for JSON operations

What Is JSON?

JSON (JavaScript Object Notation) is a lightweight, text-based data format used to exchange data between systems.

Why JSON?

    # JSON is a universal format:
    # - Human-readable
    # - Language-agnostic
    # - Widely supported
    # - Compact

    # Python dict
    data = {
        "name": "John Doe",
        "age": 30,
        "email": "[email protected]",
        "active": True
    }

    # JSON string
    json_string = '{"name": "John Doe", "age": 30, "email": "[email protected]", "active": true}'

    # APIs, configs, and data storage all use JSON!

json Module Basics

The json module provides functions for working with JSON.

dumps() - Python to JSON String

    import json

    # Dict to JSON string
    data = {
        "name": "Alice",
        "age": 25,
        "skills": ["Python", "Django", "PostgreSQL"],
        "active": True,
        "salary": None
    }

    json_string = json.dumps(data)
    print(json_string)
    # {"name": "Alice", "age": 25, "skills": ["Python", "Django", "PostgreSQL"], "active": true, "salary": null}

    print(type(json_string))  # <class 'str'>

    # Pretty print with indent
    json_pretty = json.dumps(data, indent=2)
    print(json_pretty)
    # {
    #   "name": "Alice",
    #   "age": 25,
    #   "skills": [
    #     "Python",
    #     "Django",
    #     "PostgreSQL"
    #   ],
    #   "active": true,
    #   "salary": null
    # }

    # Sort keys
    json_sorted = json.dumps(data, indent=2, sort_keys=True)
    print(json_sorted)
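
For data exchanged between programs rather than read by people, compact output may be preferable to indented output. A minimal sketch using the separators parameter of json.dumps; the payload dict is a made-up example:

```python
import json

# Hypothetical payload, used only for illustration
payload = {"name": "Alice", "skills": ["Python", "Django"], "active": True}

default_output = json.dumps(payload)

# separators=(item_separator, key_separator); dropping the spaces
# after "," and ":" gives the most compact encoding
compact_output = json.dumps(payload, separators=(",", ":"))

print(default_output)
# {"name": "Alice", "skills": ["Python", "Django"], "active": true}
print(compact_output)
# {"name":"Alice","skills":["Python","Django"],"active":true}
print(len(default_output), len(compact_output))
```

Both strings decode to the same Python object; only the whitespace differs.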

loads() - JSON String to Python

    import json

    # JSON string to Python dict
    json_string = '{"name": "Bob", "age": 30, "active": true}'

    data = json.loads(json_string)
    print(data)        # {'name': 'Bob', 'age': 30, 'active': True}
    print(type(data))  # <class 'dict'>

    # Access data
    print(data['name'])  # Bob
    print(data['age'])   # 30

    # JSON array to Python list
    json_array = '["Python", "JavaScript", "Go"]'
    languages = json.loads(json_array)
    print(languages)        # ['Python', 'JavaScript', 'Go']
    print(type(languages))  # <class 'list'>
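
By default, loads() turns JSON numbers with a fractional part into Python floats. When exact decimal values matter (prices, for example), the parse_float parameter can be used instead. A small sketch, with a made-up JSON document:

```python
import json
from decimal import Decimal

# Hypothetical JSON document containing a monetary value
json_string = '{"item": "book", "price": 19.99, "quantity": 3}'

as_float = json.loads(json_string)

# parse_float is called with the text of every JSON float;
# integers such as "quantity" are not affected
as_decimal = json.loads(json_string, parse_float=Decimal)

print(type(as_float["price"]))    # <class 'float'>
print(type(as_decimal["price"]))  # <class 'decimal.Decimal'>
print(as_decimal["price"] * as_decimal["quantity"])  # 59.97
```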

dump() - Write to File

    import json

    data = {
        "users": [
            {"id": 1, "name": "Alice", "email": "[email protected]"},
            {"id": 2, "name": "Bob", "email": "[email protected]"}
        ],
        "count": 2
    }

    # Write to file
    with open('data.json', 'w') as f:
        json.dump(data, f, indent=2)

    # File content:
    # {
    #   "users": [
    #     {
    #       "id": 1,
    #       "name": "Alice",
    #       "email": "[email protected]"
    #     },
    #     ...
    #   ],
    #   "count": 2
    # }

load() - Read from File

    import json
    import os

    # Read from file
    with open('data.json', 'r') as f:
        data = json.load(f)

    print(data)
    print(data['count'])             # 2
    print(data['users'][0]['name'])  # Alice

    # Handle missing file
    if os.path.exists('config.json'):
        with open('config.json', 'r') as f:
            config = json.load(f)
    else:
        config = {"default": True}
        with open('config.json', 'w') as f:
            json.dump(config, f)
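
The file examples above rely on the platform's default text encoding. When the data contains non-ASCII text, passing encoding="utf-8" explicitly is a safer habit. A sketch of a write/read round trip; it uses a temporary directory (an assumption made here so the example does not touch real files):

```python
import json
import tempfile
from pathlib import Path

data = {"name": "Nguyễn Văn A", "count": 2}

# A temporary directory keeps the sketch from touching real files
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "data.json"

    # Explicit UTF-8 avoids depending on the platform's default encoding
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)

    with open(path, "r", encoding="utf-8") as f:
        loaded = json.load(f)

print(loaded == data)  # True
```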

Data Type Mapping

Python to JSON

    import json

    # Python types → JSON types
    data = {
        "string": "text",           # → "text"
        "integer": 42,              # → 42
        "float": 3.14,              # → 3.14
        "boolean": True,            # → true
        "none": None,               # → null
        "list": [1, 2, 3],          # → [1, 2, 3]
        "tuple": (1, 2, 3),         # → [1, 2, 3] (becomes array)
        "dict": {"key": "value"}    # → {"key": "value"}
    }

    print(json.dumps(data, indent=2))

    # Note: tuples become arrays in JSON!
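
One consequence of this mapping is that a round trip through JSON does not always give back the original object: tuples come back as lists, and dictionary keys always come back as strings. A short sketch with made-up data:

```python
import json

# int and None keys are allowed by json.dumps, but are written as strings
original = {1: "one", None: "nothing", "point": (3, 4)}

encoded = json.dumps(original)
decoded = json.loads(encoded)

print(encoded)              # {"1": "one", "null": "nothing", "point": [3, 4]}
print(decoded == original)  # False: keys are now str, the tuple is now a list

# Keys of other types (a tuple here) are rejected with TypeError
tuple_key_error = None
try:
    json.dumps({(1, 2): "pair"})
except TypeError as e:
    tuple_key_error = str(e)
    print(f"TypeError: {e}")
```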

JSON to Python

    import json

    json_data = '''{
        "string": "text",
        "number": 42,
        "float": 3.14,
        "boolean": true,
        "null": null,
        "array": [1, 2, 3],
        "object": {"key": "value"}
    }'''

    data = json.loads(json_data)

    print(type(data['string']))   # <class 'str'>
    print(type(data['number']))   # <class 'int'>
    print(type(data['float']))    # <class 'float'>
    print(type(data['boolean']))  # <class 'bool'>
    print(type(data['null']))     # <class 'NoneType'>
    print(type(data['array']))    # <class 'list'>
    print(type(data['object']))   # <class 'dict'>

Custom JSON Encoders

To serialize custom objects, create a custom encoder.

Basic Custom Encoder

    import json
    from datetime import datetime

    class DateTimeEncoder(json.JSONEncoder):
        """Encode datetime objects to ISO format."""

        def default(self, obj):
            if isinstance(obj, datetime):
                return obj.isoformat()
            return super().default(obj)

    # Use custom encoder
    data = {
        "event": "Meeting",
        "timestamp": datetime.now()
    }

    # Without encoder - ERROR
    # json.dumps(data)  # TypeError: Object of type datetime is not JSON serializable

    # With encoder - SUCCESS
    json_string = json.dumps(data, cls=DateTimeEncoder)
    print(json_string)
    # {"event": "Meeting", "timestamp": "2025-10-27T10:30:00.123456"}
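
For simple cases, a subclass is not strictly necessary: json.dumps also accepts a default= function that is called for any object it cannot serialize. A minimal sketch; the function name to_serializable is chosen here for illustration:

```python
import json
from datetime import date, datetime

def to_serializable(obj):
    """Fallback for json.dumps: convert dates, reject everything else."""
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    # Raising TypeError mirrors what the default encoder does
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

# A fixed timestamp keeps the output reproducible
data = {"event": "Meeting", "timestamp": datetime(2025, 10, 27, 10, 30)}

json_string = json.dumps(data, default=to_serializable)
print(json_string)
# {"event": "Meeting", "timestamp": "2025-10-27T10:30:00"}
```

A subclass of JSONEncoder tends to be the better fit when the same conversion rules are reused in many places; a default= function is usually enough for a single call site.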

Advanced Custom Encoder

    import json
    from datetime import datetime, date
    from decimal import Decimal

    class AdvancedEncoder(json.JSONEncoder):
        """Handle multiple custom types."""

        def default(self, obj):
            # Handle datetime (checked before date, because datetime
            # is a subclass of date)
            if isinstance(obj, datetime):
                return {
                    '_type': 'datetime',
                    'value': obj.isoformat()
                }

            # Handle date
            if isinstance(obj, date):
                return {
                    '_type': 'date',
                    'value': obj.isoformat()
                }

            # Handle Decimal
            if isinstance(obj, Decimal):
                return {
                    '_type': 'decimal',
                    'value': str(obj)
                }

            # Handle sets
            if isinstance(obj, set):
                return {
                    '_type': 'set',
                    'value': list(obj)
                }

            return super().default(obj)

    # Test
    data = {
        "created_at": datetime.now(),
        "date": date.today(),
        "price": Decimal('99.99'),
        "tags": {'python', 'json', 'tutorial'}
    }

    json_string = json.dumps(data, cls=AdvancedEncoder, indent=2)
    print(json_string)

Class Instance Encoder

    import json
    from datetime import datetime  # needed for created_at below

    class User:
        def __init__(self, id, name, email):
            self.id = id
            self.name = name
            self.email = email
            self.created_at = datetime.now()

        def to_dict(self):
            """Convert to dictionary."""
            return {
                'id': self.id,
                'name': self.name,
                'email': self.email,
                'created_at': self.created_at.isoformat()
            }

    class UserEncoder(json.JSONEncoder):
        """Encode User objects."""

        def default(self, obj):
            if isinstance(obj, User):
                return obj.to_dict()
            return super().default(obj)

    # Usage
    user = User(1, "Alice", "[email protected]")

    json_string = json.dumps(user, cls=UserEncoder, indent=2)
    print(json_string)
    # {
    #   "id": 1,
    #   "name": "Alice",
    #   "email": "[email protected]",
    #   "created_at": "2025-10-27T10:30:00.123456"
    # }

    # Multiple users
    users = [
        User(1, "Alice", "[email protected]"),
        User(2, "Bob", "[email protected]")
    ]

    json_string = json.dumps(users, cls=UserEncoder, indent=2)
    print(json_string)
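
If the class is a dataclass, the to_dict() method can often be replaced by dataclasses.asdict from the standard library. A sketch under that assumption; the Product class is hypothetical and not part of the lesson's examples:

```python
import json
from dataclasses import asdict, dataclass

@dataclass
class Product:  # hypothetical class, for illustration only
    id: int
    name: str
    tags: list

products = [Product(1, "Keyboard", ["usb"]), Product(2, "Mouse", [])]

# asdict() converts each instance (and nested dataclasses) to a plain dict
json_string = json.dumps([asdict(p) for p in products])
print(json_string)
# [{"id": 1, "name": "Keyboard", "tags": ["usb"]}, {"id": 2, "name": "Mouse", "tags": []}]
```

Fields holding types that JSON cannot represent (datetime, Decimal, and so on) would still need an encoder such as the ones above.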

Custom JSON Decoders

To deserialize with custom logic, use a custom decoder.

object_hook

    import json
    from datetime import datetime

    def datetime_decoder(dct):
        """Decode datetime objects."""
        if '_type' in dct and dct['_type'] == 'datetime':
            return datetime.fromisoformat(dct['value'])
        return dct

    # JSON with datetime
    json_string = '''{
        "event": "Meeting",
        "timestamp": {
            "_type": "datetime",
            "value": "2025-10-27T10:30:00.123456"
        }
    }'''

    data = json.loads(json_string, object_hook=datetime_decoder)
    print(data)
    print(type(data['timestamp']))  # <class 'datetime.datetime'>
    print(data['timestamp'])        # 2025-10-27 10:30:00.123456
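
It helps to know when the hook runs: object_hook is called once for every JSON object in the document, with nested objects decoded before the objects that contain them. The sketch below only records what the hook receives; record_hook and seen are names made up for this illustration:

```python
import json

seen = []

def record_hook(dct):
    """Record the keys of every dict handed to the hook, then return it unchanged."""
    seen.append(sorted(dct.keys()))
    return dct

json_string = '{"outer": {"inner": {"leaf": 1}}, "other": {"x": 2}}'
data = json.loads(json_string, object_hook=record_hook)

for keys in seen:
    print(keys)
# ['leaf']
# ['inner']
# ['x']
# ['other', 'outer']
```

This is why a hook such as datetime_decoder must return the dict unchanged when it does not recognize it: it sees every object, not only the tagged ones.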

Advanced Decoder

    import json
    from datetime import datetime, date
    from decimal import Decimal

    def advanced_decoder(dct):
        """Handle multiple custom types."""
        if '_type' not in dct:
            return dct

        type_name = dct['_type']
        value = dct['value']

        if type_name == 'datetime':
            return datetime.fromisoformat(value)

        if type_name == 'date':
            return date.fromisoformat(value)

        if type_name == 'decimal':
            return Decimal(value)

        if type_name == 'set':
            return set(value)

        return dct

    # Decode
    json_string = '''{
        "created_at": {
            "_type": "datetime",
            "value": "2025-10-27T10:30:00.123456"
        },
        "price": {
            "_type": "decimal",
            "value": "99.99"
        },
        "tags": {
            "_type": "set",
            "value": ["python", "json"]
        }
    }'''

    data = json.loads(json_string, object_hook=advanced_decoder)
    print(data['created_at'])  # 2025-10-27 10:30:00.123456 (a datetime object)
    print(data['price'])       # 99.99 (a Decimal object)
    print(data['tags'])        # {'python', 'json'} (set order may vary)
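
An encoder and a decoder that agree on the same "_type" tags can be checked with a round trip. The sketch below is a reduced, self-contained variant of the ideas above (Decimal and set only); TaggedEncoder and tagged_decoder are illustrative names:

```python
import json
from decimal import Decimal

class TaggedEncoder(json.JSONEncoder):
    """Wrap Decimal and set values in a {"_type": ..., "value": ...} dict."""

    def default(self, obj):
        if isinstance(obj, Decimal):
            return {"_type": "decimal", "value": str(obj)}
        if isinstance(obj, set):
            # sorted() gives stable output; assumes the elements are comparable
            return {"_type": "set", "value": sorted(obj)}
        return super().default(obj)

def tagged_decoder(dct):
    """Reverse TaggedEncoder; untagged dicts pass through unchanged."""
    type_name = dct.get("_type")
    if type_name == "decimal":
        return Decimal(dct["value"])
    if type_name == "set":
        return set(dct["value"])
    return dct

original = {"price": Decimal("99.99"), "tags": {"python", "json"}}

encoded = json.dumps(original, cls=TaggedEncoder)
restored = json.loads(encoded, object_hook=tagged_decoder)

print(encoded)
# {"price": {"_type": "decimal", "value": "99.99"}, "tags": {"_type": "set", "value": ["json", "python"]}}
print(restored == original)  # True
```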

Error Handling

    import json

    # JSONDecodeError
    invalid_json = '{"name": "Alice", "age": 30'  # Missing closing brace

    try:
        data = json.loads(invalid_json)
    except json.JSONDecodeError as e:
        print(f"Invalid JSON: {e}")
        print(f"Line {e.lineno}, Column {e.colno}")
        # Invalid JSON: Expecting ',' delimiter: line 1 column 28 (char 27)
        # Line 1, Column 28

    # Handle with default value
    def safe_load_json(json_string, default=None):
        """Safely load JSON with fallback."""
        try:
            return json.loads(json_string)
        except json.JSONDecodeError:
            return default

    data = safe_load_json('invalid json', default={})
    print(data)  # {}

    # File not found
    def load_config(filename, default=None):
        """Load config file with fallback."""
        try:
            with open(filename, 'r') as f:
                return json.load(f)
        except FileNotFoundError:
            return default
        except json.JSONDecodeError:
            print(f"Invalid JSON in {filename}")
            return default

    config = load_config('config.json', default={'debug': False})
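
Besides lineno and colno, JSONDecodeError also exposes msg, doc (the text being parsed) and pos (the index where parsing failed). These can be combined into a more helpful message. A sketch; describe_json_error is a hypothetical helper name:

```python
import json

def describe_json_error(text, context=10):
    """Return None if text parses, otherwise a short description of the failure."""
    try:
        json.loads(text)
    except json.JSONDecodeError as e:
        # Show a few characters on either side of the failure position
        start = max(e.pos - context, 0)
        snippet = e.doc[start:e.pos + context]
        return f"{e.msg} at line {e.lineno}, column {e.colno}, near {snippet!r}"
    return None

print(describe_json_error('{"name": "Alice"}'))  # None
print(describe_json_error('{"name": "Alice", "age": }'))
```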

Real-world Examples

1. Configuration Manager

    import json
    import os

    class ConfigManager:
        """Manage application configuration."""

        def __init__(self, config_file='config.json'):
            self.config_file = config_file
            self.config = self.load()

        def load(self):
            """Load configuration from file."""
            if os.path.exists(self.config_file):
                with open(self.config_file, 'r') as f:
                    return json.load(f)
            return self.default_config()

        def save(self):
            """Save configuration to file."""
            with open(self.config_file, 'w') as f:
                json.dump(self.config, f, indent=2)

        def get(self, key, default=None):
            """Get configuration value."""
            return self.config.get(key, default)

        def set(self, key, value):
            """Set configuration value and persist it."""
            self.config[key] = value
            self.save()

        def default_config(self):
            """Default configuration."""
            return {
                'debug': False,
                'host': 'localhost',
                'port': 8000,
                'database': {
                    'host': 'localhost',
                    'port': 5432,
                    'name': 'mydb'
                }
            }

    # Usage (outputs assume config.json does not exist yet,
    # so the default configuration is used)
    config = ConfigManager()
    print(config.get('debug'))  # False

    config.set('debug', True)
    print(config.get('debug'))  # True

    db_host = config.get('database')['host']
    print(db_host)  # localhost
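
Note that load() returns the file's contents as-is, so a config file that omits a key does not pick up the default for it. One possible extension is to merge the loaded values over the defaults. The helper below is a sketch of that idea, not part of ConfigManager; merge_defaults is a hypothetical name:

```python
import json

def merge_defaults(defaults, overrides):
    """Return a new dict: defaults, recursively updated with overrides."""
    merged = dict(defaults)  # shallow copy; nested dicts are replaced, not mutated
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_defaults(merged[key], value)
        else:
            merged[key] = value
    return merged

defaults = {
    "debug": False,
    "port": 8000,
    "database": {"host": "localhost", "port": 5432},
}

# Simulates a config file that sets only some of the keys
loaded = json.loads('{"debug": true, "database": {"host": "db.internal"}}')

config = merge_defaults(defaults, loaded)
print(config)
# {'debug': True, 'port': 8000, 'database': {'host': 'db.internal', 'port': 5432}}
```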

2. API Response Handler

    import json
    from typing import Optional, Dict, Any

    class APIResponse:
        """Handle API responses."""

        def __init__(self, status_code: int, body: str):
            self.status_code = status_code
            self.raw_body = body
            self._data = None

        @property
        def data(self) -> Optional[Dict[str, Any]]:
            """Parse JSON data (None if the body is not valid JSON)."""
            if self._data is None:
                try:
                    self._data = json.loads(self.raw_body)
                except json.JSONDecodeError:
                    self._data = None
            return self._data

        @property
        def is_success(self) -> bool:
            """Check if request was successful."""
            return 200 <= self.status_code < 300

        def get(self, key: str, default=None):
            """Get value from response data."""
            # Only JSON objects (dicts) support key lookup;
            # a JSON array or scalar falls back to the default
            if isinstance(self.data, dict):
                return self.data.get(key, default)
            return default

    # Usage (simulation)
    response_body = '{"users": [{"id": 1, "name": "Alice"}], "count": 1}'
    response = APIResponse(200, response_body)

    print(response.is_success)    # True
    print(response.data)          # {'users': [...], 'count': 1}
    print(response.get('count'))  # 1
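
The other half of working with an API is sending JSON. A sketch of preparing a POST request body with the standard library's urllib.request; the URL is a placeholder, and the request is only constructed here, never sent:

```python
import json
import urllib.request

payload = {"name": "Alice", "active": True}

# HTTP bodies are bytes, so the JSON text is encoded as UTF-8
body = json.dumps(payload).encode("utf-8")

request = urllib.request.Request(
    "https://api.example.com/users",  # hypothetical endpoint
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(request.get_method())  # POST
print(request.data)          # b'{"name": "Alice", "active": true}'
# Request stores header names capitalized as "Content-type"
print(request.get_header("Content-type"))  # application/json
```

Actually sending it would be a call to urllib.request.urlopen(request), and the response body could then be handed to a class like APIResponse.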

3. Cache System

    import json
    import time
    from typing import Any, Optional

    class JSONCache:
        """Simple file-based cache using JSON."""

        def __init__(self, cache_file='cache.json', ttl=3600):
            self.cache_file = cache_file
            self.ttl = ttl  # Time to live in seconds
            self.cache = self._load()

        def _load(self):
            """Load cache from file."""
            try:
                with open(self.cache_file, 'r') as f:
                    return json.load(f)
            except (FileNotFoundError, json.JSONDecodeError):
                return {}

        def _save(self):
            """Save cache to file."""
            with open(self.cache_file, 'w') as f:
                json.dump(self.cache, f, indent=2)

        def get(self, key: str) -> Optional[Any]:
            """Get value from cache."""
            if key not in self.cache:
                return None

            entry = self.cache[key]

            # Check if expired
            if time.time() - entry['timestamp'] > self.ttl:
                del self.cache[key]
                self._save()
                return None

            return entry['value']

        def set(self, key: str, value: Any):
            """Set value in cache."""
            self.cache[key] = {
                'value': value,
                'timestamp': time.time()
            }
            self._save()

        def clear(self):
            """Clear all cache."""
            self.cache = {}
            self._save()

    # Usage
    cache = JSONCache(ttl=60)  # 60 seconds TTL

    cache.set('user:1', {'name': 'Alice', 'email': '[email protected]'})
    user = cache.get('user:1')
    print(user)  # {'name': 'Alice', 'email': '[email protected]'}

    time.sleep(61)
    user = cache.get('user:1')
    print(user)  # None (expired)
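
Because _save() rewrites the cache file in place, a crash in the middle of a write could leave a truncated file behind (which _load() would then treat as an empty cache). A common mitigation is to write to a temporary file and rename it over the target. A sketch of that technique; atomic_write_json is a hypothetical helper, not part of JSONCache:

```python
import json
import os
import tempfile

def atomic_write_json(path, data):
    """Write data as JSON to a temp file, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the rename
    # stays on the same filesystem
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)
        os.replace(tmp_path, path)  # replaces the target in one step
    except BaseException:
        # Clean up the temp file if writing or renaming failed
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

# Demonstration in a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "cache.json")
    atomic_write_json(target, {"user:1": {"value": "Alice", "timestamp": 0}})

    with open(target, "r", encoding="utf-8") as f:
        written = json.load(f)
    leftovers = [name for name in os.listdir(tmp_dir) if name.endswith(".tmp")]

print(written)    # {'user:1': {'value': 'Alice', 'timestamp': 0}}
print(leftovers)  # []
```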

4. Data Export/Import

    import json
    from datetime import datetime

    class DataExporter:
        """Export data to JSON format."""

        @staticmethod
        def export_users(users, filename):
            """Export users to JSON file."""
            data = {
                'exported_at': datetime.now().isoformat(),
                'count': len(users),
                'users': [
                    {
                        'id': user['id'],
                        'name': user['name'],
                        'email': user['email']
                    }
                    for user in users
                ]
            }

            with open(filename, 'w') as f:
                json.dump(data, f, indent=2)

            print(f"Exported {len(users)} users to {filename}")

        @staticmethod
        def import_users(filename):
            """Import users from JSON file."""
            with open(filename, 'r') as f:
                data = json.load(f)

            print(f"Imported {data['count']} users")
            print(f"Exported at: {data['exported_at']}")

            return data['users']

    # Usage
    users = [
        {'id': 1, 'name': 'Alice', 'email': '[email protected]'},
        {'id': 2, 'name': 'Bob', 'email': '[email protected]'}
    ]

    DataExporter.export_users(users, 'users_export.json')
    imported = DataExporter.import_users('users_export.json')
    print(imported)

5. JSON Schema Validator

    import json

    class JSONValidator:
        """Validate JSON structure."""

        @staticmethod
        def validate_user(data):
            """Validate user data structure."""
            required_fields = ['id', 'name', 'email']

            # Check if all required fields exist
            for field in required_fields:
                if field not in data:
                    return False, f"Missing required field: {field}"

            # Validate types
            if not isinstance(data['id'], int):
                return False, "id must be an integer"

            if not isinstance(data['name'], str):
                return False, "name must be a string"

            if not isinstance(data['email'], str):
                return False, "email must be a string"

            # Validate email format (basic)
            if '@' not in data['email']:
                return False, "email must contain @"

            return True, "Valid"

        @staticmethod
        def validate_json_file(filename, validator_func):
            """Validate JSON file."""
            try:
                with open(filename, 'r') as f:
                    data = json.load(f)

                is_valid, message = validator_func(data)
                return is_valid, message

            except FileNotFoundError:
                return False, f"File not found: {filename}"
            except json.JSONDecodeError as e:
                return False, f"Invalid JSON: {e}"

    # Usage
    user_data = {
        'id': 1,
        'name': 'Alice',
        'email': '[email protected]'
    }

    is_valid, message = JSONValidator.validate_user(user_data)
    print(f"Valid: {is_valid}, Message: {message}")
    # Valid: True, Message: Valid
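
One subtlety in type checks like the ones above: in Python, bool is a subclass of int, so a JSON true passes an isinstance(value, int) check. If that matters for a field such as id, a stricter check could look like this sketch (is_strict_int is a hypothetical helper):

```python
import json

def is_strict_int(value):
    """True for int values, but not for bool (which subclasses int)."""
    return isinstance(value, int) and not isinstance(value, bool)

# A document where "id" is mistakenly a boolean
record = json.loads('{"id": true, "name": "Alice"}')

print(isinstance(record["id"], int))  # True: the plain check accepts it
print(is_strict_int(record["id"]))    # False
print(is_strict_int(1))               # True
```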

Best Practices

    import json
    from datetime import datetime

    # 1. Use indent when people will read the output
    data = {'name': 'Alice', 'age': 25}
    json.dumps(data, indent=2)  # Good for debugging and config files
    # json.dumps(data)          # Compact, but harder to read when debugging

    # 2. Handle errors properly
    def safe_json_load(filename):
        try:
            with open(filename, 'r') as f:
                return json.load(f)
        except FileNotFoundError:
            print(f"File not found: {filename}")
            return None
        except json.JSONDecodeError as e:
            print(f"Invalid JSON: {e}")
            return None

    # 3. Use custom encoders for custom types
    class CustomEncoder(json.JSONEncoder):
        def default(self, obj):
            if isinstance(obj, datetime):
                return obj.isoformat()
            return super().default(obj)

    # 4. Validate JSON structure
    def validate_config(config):
        required = ['host', 'port', 'debug']
        return all(key in config for key in required)

    # 5. Use context managers for files
    with open('data.json', 'r') as f:
        data = json.load(f)  # File auto-closes

    # 6. Sort keys for consistency
    json.dumps(data, sort_keys=True, indent=2)

    # 7. Use ensure_ascii=False for Unicode
    data = {'name': 'Nguyễn Văn A'}
    json.dumps(data, ensure_ascii=False)
    # '{"name": "Nguyễn Văn A"}'

    # 8. Separate data and metadata
    data_to_save = {
        'metadata': {
            'version': '1.0',
            'created_at': datetime.now().isoformat()
        },
        'data': {
            'users': [...]
        }
    }
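
To make the sort_keys point concrete: two dicts that compare equal can still serialize to different strings when their keys were inserted in a different order, which matters when JSON text is compared, hashed, or diffed. A small sketch:

```python
import json

a = {"name": "Alice", "age": 25}
b = {"age": 25, "name": "Alice"}  # same content, different insertion order

print(a == b)                          # True: dict equality ignores order
print(json.dumps(a) == json.dumps(b))  # False: output follows insertion order

sorted_a = json.dumps(a, sort_keys=True)
sorted_b = json.dumps(b, sort_keys=True)
print(sorted_a == sorted_b)            # True
print(sorted_a)                        # {"age": 25, "name": "Alice"}
```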

Practice Exercises

Exercise 1: Contact Manager

Build a contact manager with JSON storage:

  • Add/edit/delete contacts
  • Save to JSON file
  • Load from JSON file
  • Search contacts

Exercise 2: Custom Serializer

Create a custom encoder/decoder for:

  • datetime objects
  • Decimal numbers
  • Custom class instances

Exercise 3: API Mock Server

Create a mock API with JSON responses:

  • Load responses from JSON files
  • Return data based on endpoint
  • Handle errors

Exercise 4: Config Editor

Build a CLI tool to edit a JSON config:

  • Read config
  • Update values
  • Validate structure
  • Save changes

Exercise 5: Data Transformer

Transform JSON data:

  • Read from one format
  • Transform structure
  • Write to new format
  • Handle nested data

Summary

  • json module: dumps, loads, dump, load
  • Serialization: Python → JSON string
  • Deserialization: JSON string → Python
  • Custom encoders: JSONEncoder.default()
  • Custom decoders: object_hook parameter
  • Error handling: JSONDecodeError
  • File operations: dump/load with context managers
  • Best practices: indent, sort_keys, ensure_ascii, validation

Next Lesson

Lesson 9: Working with Dates and Times - datetime module, timezone handling, date arithmetic! 🚀


Remember:

  • Use indent=2 when the output is meant to be read by people
  • Handle JSONDecodeError properly
  • Use custom encoders for custom types
  • Validate JSON structure
  • Use context managers for files! 🎯