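# StructBERT-WebUI Hands-On Tutorial: Catching Python requests Exceptions (Timeout / Connection Refused / JSON Parse Failure)

## 1. Project Overview

The StructBERT text similarity service is a high-accuracy Chinese sentence similarity tool built on Alibaba's StructBERT model. The WebUI lets you judge how semantically close two Chinese sentences are, either through a simple web page or through API calls.

The similarity score ranges from 0 to 1; the closer it is to 1, the more similar the two sentences are in meaning. For example:

- "今天天气很好" ("The weather is nice today") and "今天阳光明媚" ("It is sunny today") → similarity about 0.85 (very close in meaning)
- "今天天气很好" ("The weather is nice today") and "我喜欢吃苹果" ("I like eating apples") → similarity about 0.12 (different meanings)

### 1.1 Core Use Cases

The tool is useful in several practical scenarios:

- **Duplicate-text detection**: judge whether two articles or paragraphs may be plagiarized, using the similarity score to spot repeated content quickly.
- **Question-answer matching**: in a customer service system, match a user's question against the standard questions in a knowledge base to find the most relevant answer.
- **Semantic search**: improve search so that the system recognizes the semantic link between "手机没电了" ("my phone is out of battery") and "充电宝在哪借" ("where can I borrow a power bank").
- **Content recommendation**: recommend other articles or products that are semantically related to what the user has been reading.

## 2. Environment Setup and Quick Start

### 2.1 Confirming Service Status

First confirm that the StructBERT service is running. The service is configured to start on boot, so it normally does not need to be started by hand.

```bash
# Check the service process
ps aux | grep "python.*app.py"

# Test the service health endpoint
curl http://127.0.0.1:5000/health
```

A healthy service returns:

```json
{
  "status": "healthy",
  "model_loaded": true
}
```

### 2.2 Opening the Web Interface

Open a browser and go to the web interface:

```
http://gpu-pod698386bfe177c841fb0af650-5000.web.gpu.csdn.net/
```

If the page shows the service status in green, everything is working and you can start using it.

## 3. Python requests Exception Handling in Practice

When you call the StructBERT API, a network request can fail in several ways. Handling those failures explicitly makes your program more robust.

### 3.1 A Basic Request Wrapper

Start with a basic request function that includes the essential exception handling:

```python
import requests
from typing import Optional


class StructBERTClient:
    def __init__(self, base_url: str = "http://127.0.0.1:5000"):
        self.base_url = base_url
        self.timeout = 30  # default timeout: 30 seconds

    def calculate_similarity(self, sentence1: str, sentence2: str) -> Optional[float]:
        """Compute the similarity of two sentences, with full exception handling."""
        url = f"{self.base_url}/similarity"

        try:
            # Build the request payload
            data = {"sentence1": sentence1, "sentence2": sentence2}

            # Send the request with a timeout
            response = requests.post(
                url,
                json=data,
                timeout=self.timeout,
                headers={"Content-Type": "application/json"},
            )

            # Raise HTTPError for 4xx/5xx status codes
            response.raise_for_status()

            # Parse the JSON response and return the similarity score
            result = response.json()
            return result.get("similarity")

        except requests.exceptions.Timeout:
            print(f"Request timed out: no response within {self.timeout} seconds")
            return None
        except requests.exceptions.ConnectionError:
            print("Connection failed: cannot reach the StructBERT service")
            print("Check that the service is running: bash scripts/start.sh")
            return None
        except requests.exceptions.HTTPError as e:
            print(f"HTTP error: {e}")
            return None
        except ValueError:
            # response.json() raises a ValueError subclass on invalid JSON,
            # whichever JSON backend requests is using
            print("JSON parsing failed: the server returned invalid JSON")
            return None
        except Exception as e:
            print(f"Unknown error: {e}")
            return None


# Usage example
client = StructBERTClient()
similarity = client.calculate_similarity("今天天气很好", "今天阳光明媚")

if similarity is not None:
    print(f"Similarity: {similarity:.4f}")
else:
    print("Calculation failed, please check the service status")
```

### 3.2 Handling Connection Refused

A connection-refused error usually means the service is not running, for example because it was never started or because it could not bind its port while another process was holding it:

```python
import requests


def check_service_availability(base_url: str = "http://127.0.0.1:5000") -> bool:
    """Check whether the StructBERT service is available."""
    try:
        health_url = f"{base_url}/health"
        response = requests.get(health_url, timeout=5)

        if response.status_code == 200:
            data = response.json()
            return data.get("status") == "healthy" and data.get("model_loaded") is True
        return False

    except requests.exceptions.ConnectionError:
        print("❌ Connection refused: the service may not be running")
        print("💡 What to try:")
        print("  1. Start the service: bash /root/nlp_structbert_project/scripts/start.sh")
        print("  2. Check the port: netstat -tlnp | grep 5000")
        print("  3. Read the log: tail -f /root/nlp_structbert_project/logs/startup.log")
        return False
    except requests.exceptions.Timeout:
        print("⏰ Connection timed out: the service is responding too slowly")
        return False
    except Exception as e:
        print(f"⚠️ Service check failed: {e}")
        return False


# Usage example
if check_service_availability():
    print("✅ Service is running normally")
else:
    print("❌ Service unavailable, please start it first")
```

### 3.3 Handling Request Timeouts

Choosing a sensible timeout matters: it should be short enough to avoid long waits, yet long enough for the service to finish its work:

```python
import time
from typing import Dict, Optional

import requests


def smart_request_with_retry(url: str, data: Dict, max_retries: int = 3) -> Optional[Dict]:
    """Request helper with retries and a timeout that grows on each attempt."""
    retry_delays = [2, 5, 10]  # delay before each retry (seconds)
    timeouts = [15, 30, 45]    # timeout for each attempt (seconds)

    for attempt in range(max_retries):
        # Reuse the last entry if max_retries exceeds the length of the lists
        current_timeout = timeouts[min(attempt, len(timeouts) - 1)]
        try:
            print(f"Attempt {attempt + 1}, timeout: {current_timeout} seconds")

            response = requests.post(
                url,
                json=data,
                timeout=current_timeout,
                headers={"Content-Type": "application/json"},
            )
            response.raise_for_status()
            return response.json()

        except requests.exceptions.Timeout:
            print(f"⚠️ Attempt {attempt + 1} timed out")
            if attempt < max_retries - 1:
                delay = retry_delays[min(attempt, len(retry_delays) - 1)]
                print(f"Waiting {delay} seconds before retrying...")
                time.sleep(delay)
            else:
                print("❌ All retry attempts timed out")
                return None
        except requests.exceptions.ConnectionError:
            print("❌ Connection error, please check the service status")
            return None

    return None


# Usage example
def calculate_similarity_with_retry(sentence1: str, sentence2: str) -> Optional[float]:
    url = "http://127.0.0.1:5000/similarity"
    data = {"sentence1": sentence1, "sentence2": sentence2}

    result = smart_request_with_retry(url, data)
    return result.get("similarity") if result else None
```

### 3.4 Handling JSON Parse Failures

JSON parsing can fail when the server returns data in an unexpected format:

```python
import json
import re
from typing import Dict, Optional

import requests


def safe_json_parsing(response: requests.Response) -> Optional[Dict]:
    """Parse JSON defensively and handle the common failure modes."""
    try:
        # Check the response content type
        content_type = response.headers.get("Content-Type", "")
        if "application/json" not in content_type:
            print(f"⚠️ Response is not JSON: {content_type}")
            print(f"Response body: {response.text[:200]}...")
            return None

        # Try to parse the JSON
        return response.json()

    except ValueError as e:
        # Invalid JSON surfaces as a ValueError subclass
        print(f"❌ JSON parsing failed: {e}")
        print(f"Raw response: {response.text[:200]}...")

        # Try to repair a common problem: extra text around the JSON object
        try:
            cleaned_text = response.text.strip()
            if not cleaned_text.startswith("{"):
                json_match = re.search(r"\{.*\}", cleaned_text, re.DOTALL)
                if json_match:
                    cleaned_text = json_match.group(0)
            return json.loads(cleaned_text)
        except ValueError:
            return None

    except Exception as e:
        print(f"❌ Unknown error while parsing: {e}")
        return None


# Using it inside a request function
def make_request_with_safe_parsing(url: str, data: Dict) -> Optional[Dict]:
    try:
        response = requests.post(url, json=data, timeout=30)
        response.raise_for_status()
        return safe_json_parsing(response)
    except requests.exceptions.RequestException as e:
        print(f"Request failed: {e}")
        return None
```

## 4. A Complete Exception-Handling Example

### 4.1 A Production-Grade Client

Below is a complete client implementation suited to production use:

```python
import logging
import time
from typing import Dict, List, Optional, Tuple

import requests

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


class RobustStructBERTClient:
    def __init__(self, base_url: str = "http://127.0.0.1:5000", timeout: int = 30):
        self.base_url = base_url.rstrip("/")
        self.timeout = timeout
        self.session = requests.Session()

        # Configure the session
        self.session.headers.update({
            "Content-Type": "application/json",
            "User-Agent": "StructBERT-Python-Client/1.0",
        })

    def check_health(self) -> Tuple[bool, str]:
        """Check the health of the service."""
        try:
            response = self.session.get(f"{self.base_url}/health", timeout=5)

            if response.status_code == 200:
                data = response.json()
                if data.get("status") == "healthy" and data.get("model_loaded"):
                    return True, "service is healthy"
                return False, f"unexpected service state: {data}"
            return False, f"HTTP error: {response.status_code}"

        except requests.exceptions.ConnectionError:
            return False, "connection refused: the service may not be running"
        except requests.exceptions.Timeout:
            return False, "connection timed out: the service is responding too slowly"
        except Exception as e:
            return False, f"health check failed: {e}"

    def calculate_similarity(self, sentence1: str, sentence2: str,
                             max_retries: int = 2) -> Optional[float]:
        """Compute sentence similarity, with retries."""
        url = f"{self.base_url}/similarity"
        data = {"sentence1": sentence1, "sentence2": sentence2}

        for attempt in range(max_retries + 1):
            try:
                response = self.session.post(url, json=data, timeout=self.timeout)

                # Check the HTTP status
                if response.status_code != 200:
                    logger.warning(f"HTTP {response.status_code}: {response.text}")
                    if attempt < max_retries:
                        time.sleep(1 * (attempt + 1))
                        continue
                    return None

                # Parse the JSON (invalid JSON raises a ValueError subclass)
                try:
                    result = response.json()
                    return result.get("similarity")
                except ValueError:
                    logger.error(f"JSON parsing failed: {response.text}")
                    return None

            except requests.exceptions.Timeout:
                logger.warning(f"Request timed out (attempt {attempt + 1}/{max_retries + 1})")
                if attempt < max_retries:
                    time.sleep(2 * (attempt + 1))
                    continue
                return None
            except requests.exceptions.ConnectionError:
                logger.error("Connection refused: check that the service is running")
                return None
            except requests.exceptions.RequestException as e:
                logger.error(f"Request error: {e}")
                return None

        return None

    def batch_similarity(self, source: str, targets: List[str]) -> Optional[List[Dict]]:
        """Compute similarity between one source sentence and several targets."""
        url = f"{self.base_url}/batch_similarity"
        data = {"source": source, "targets": targets}

        try:
            response = self.session.post(url, json=data, timeout=self.timeout + 10)
            response.raise_for_status()

            result = response.json()
            return result.get("results", [])

        except requests.exceptions.Timeout:
            logger.error("Batch request timed out: there may be too many target sentences")
            return None
        except requests.exceptions.ConnectionError:
            logger.error("Connection failed: service unavailable")
            return None
        except Exception as e:
            logger.error(f"Batch calculation failed: {e}")
            return None


# Usage example
def demonstrate_robust_client():
    client = RobustStructBERTClient()

    # Check the service status
    is_healthy, message = client.check_health()
    print(f"Service status: {'healthy' if is_healthy else 'unhealthy'} - {message}")

    if is_healthy:
        # Compute a similarity score
        similarity = client.calculate_similarity("今天天气很好", "今天阳光明媚")

        if similarity is not None:
            print(f"Similarity: {similarity:.4f}")

            # Batch example
            results = client.batch_similarity(
                "如何重置密码",
                ["密码忘记怎么办", "怎样修改登录密码", "如何注册新账号"],
            )

            if results:
                print("Batch results:")
                for result in results:
                    print(f"  {result['sentence']}: {result['similarity']:.4f}")
        else:
            print("Similarity calculation failed")


if __name__ == "__main__":
    demonstrate_robust_client()
```

### 4.2 Exception-Handling Best Practices

In real projects, the following exception-handling strategy is recommended:

```python
import json
import logging

import requests

logger = logging.getLogger(__name__)


class AdvancedErrorHandler:
    @staticmethod
    def handle_request_exception(e: Exception, operation: str = "request") -> None:
        """Log a message that matches the exception type."""
        error_mapping = {
            requests.exceptions.Timeout: f"{operation} timed out, check the network or service load",
            requests.exceptions.ConnectionError: f"{operation} connection failed, the service may not be running",
            requests.exceptions.HTTPError: f"{operation} HTTP error",
            json.JSONDecodeError: "malformed response data",
            ValueError: "invalid parameter or data",
        }

        for error_type, message in error_mapping.items():
            if isinstance(e, error_type):
                logger.error(f"{message}: {e}")
                return

        logger.error(f"unknown {operation} error: {e}")

    @staticmethod
    def get_recovery_suggestion(e: Exception) -> str:
        """Return recovery advice based on the exception type."""
        if isinstance(e, requests.exceptions.ConnectionError):
            return (
                "💡 Recovery suggestions:\n"
                "1. Check that the StructBERT service is running: bash scripts/start.sh\n"
                "2. Check whether port 5000 is in use: netstat -tlnp | grep 5000\n"
                "3. Read the service log: tail -f logs/startup.log"
            )
        elif isinstance(e, requests.exceptions.Timeout):
            return (
                "💡 Recovery suggestions:\n"
                "1. Increase the timeout: client.timeout = 60\n"
                "2. Check server load: top or htop\n"
                "3. Reduce the request size and avoid very long texts"
            )
        elif isinstance(e, json.JSONDecodeError):
            return (
                "💡 Recovery suggestions:\n"
                "1. Check that the service returns valid JSON\n"
                "2. Inspect the raw response: response.text\n"
                "3. Ask the service maintainers to check the API format"
            )

        return "See the log for detailed error information"


# Usage: the handler needs an exception that actually propagates.
# RobustStructBERTClient.calculate_similarity() catches errors itself and
# returns None, so this example issues the request directly.
try:
    response = requests.post(
        "http://127.0.0.1:5000/similarity",
        json={"sentence1": "句子1", "sentence2": "句子2"},
        timeout=30,
    )
    response.raise_for_status()
    result = response.json()
except Exception as e:
    AdvancedErrorHandler.handle_request_exception(e, "similarity calculation")
    suggestion = AdvancedErrorHandler.get_recovery_suggestion(e)
    print(suggestion)
```

## 5. In Practice: Building a Robust Similarity Application

### 5.1 A Smarter Retry Mechanism

In real applications a plain retry loop is often not enough; the retry policy should take the error type and recent failure history into account:

```python
import time

import requests


class SmartRetryMechanism:
    def __init__(self):
        self.failure_count = 0
        self.last_failure_time = 0

    def should_retry(self, exception: Exception) -> bool:
        """Decide whether a retry is worthwhile."""
        current_time = time.time()

        # Connection errors are retried
        if isinstance(exception, requests.exceptions.ConnectionError):
            return True

        # Timeouts are retried unless failures have been frequent
        if isinstance(exception, requests.exceptions.Timeout):
            # Too many recent failures: pause retries
            if self.failure_count > 3 and current_time - self.last_failure_time < 60:
                return False
            return True

        # Other errors are usually not retried
        return False

    def record_failure(self):
        """Record a failure."""
        self.failure_count += 1
        self.last_failure_time = time.time()

    def record_success(self):
        """Record a success and reset the counter."""
        self.failure_count = 0


def smart_retry_request(client, sentence1, sentence2, max_attempts=3):
    """Retry with exponential backoff, guided by SmartRetryMechanism."""
    retry_mechanism = SmartRetryMechanism()
    url = f"{client.base_url}/similarity"
    data = {"sentence1": sentence1, "sentence2": sentence2}

    for attempt in range(max_attempts):
        try:
            # Call the session directly: client.calculate_similarity() catches
            # exceptions and returns None, which would hide the error type
            # that should_retry() needs to see.
            response = client.session.post(url, json=data, timeout=client.timeout)
            response.raise_for_status()
            result = response.json().get("similarity")
            if result is None:
                raise ValueError("service returned an empty result")

            retry_mechanism.record_success()
            return result

        except Exception as e:
            retry_mechanism.record_failure()

            if retry_mechanism.should_retry(e) and attempt < max_attempts - 1:
                wait_time = 2 ** attempt  # exponential backoff: 1, 2, 4, ... seconds
                print(f"Attempt {attempt + 1} failed, retrying in {wait_time} seconds...")
                time.sleep(wait_time)
            else:
                print(f"Giving up: {e}")
                return None

    return None
```

### 5.2 Service Degradation Strategy

When the StructBERT service is unavailable, you can fall back to a simpler method:

```python
class FallbackSimilarityService:
    """Degraded mode: alternatives for when the primary service is unavailable."""

    @staticmethod
    def jaccard_similarity(s1: str, s2: str) -> float:
        """Simple Jaccard similarity over the sets of characters."""
        set1 = set(s1)
        set2 = set(s2)
        intersection = len(set1.intersection(set2))
        union = len(set1.union(set2))
        return intersection / union if union > 0 else 0.0

    @staticmethod
    def word_based_similarity(s1: str, s2: str) -> float:
        """Jaccard similarity over whitespace-separated words.

        Only meaningful for space-delimited text. Chinese sentences are not
        split by str.split(), so use jaccard_similarity for them instead.
        """
        words1 = set(s1.split())
        words2 = set(s2.split())
        common_words = words1.intersection(words2)
        all_words = words1.union(words2)
        return len(common_words) / len(all_words) if all_words else 0.0


class RobustSimilarityCalculator:
    """Similarity calculator with a fallback strategy."""

    def __init__(self, primary_client):
        self.primary_client = primary_client
        self.fallback_service = FallbackSimilarityService()

    def calculate(self, sentence1: str, sentence2: str) -> float:
        """Prefer the primary service and degrade on failure."""
        try:
            # Try the primary service
            result = self.primary_client.calculate_similarity(sentence1, sentence2)
            if result is not None:
                return result

            # The primary service returned None without raising
            print("Primary service returned no result, using the fallback")
        except Exception as e:
            print(f"Primary service failed ({e}), using the fallback")

        # Character-level Jaccard works for Chinese input in both failure cases
        return self.fallback_service.jaccard_similarity(sentence1, sentence2)


# Usage example (RobustStructBERTClient is defined in section 4.1)
primary_client = RobustStructBERTClient()
calculator = RobustSimilarityCalculator(primary_client)

# A result is returned whether or not the primary service is available
similarity = calculator.calculate("今天天气很好", "今天阳光明媚")
print(f"Similarity: {similarity:.4f}")
```

Keep in mind that the fallback measures character overlap, not meaning, so its scores are not comparable with the scores returned by StructBERT.

## 6. Summary

### 6.1 Key Takeaways

This tutorial covered how to handle exceptions thoroughly when using the StructBERT WebUI service:

- **Connection refused**: when the service is not running or its port is unavailable, report a clear error message together with recovery suggestions.
- **Request timeouts**: set sensible timeouts and retry with exponential backoff.
- **Safe JSON parsing**: handle malformed server responses so that a parse failure does not crash the program.
- **Service degradation**: provide a fallback when the primary service is unavailable, so the system stays usable.

### 6.2 Recommended Practices

When using the StructBERT service in a real project, the following practices are recommended:

- **Always check service status**: check the health endpoint before sending requests to avoid needless failures.
- **Set sensible timeouts**: tune them to the actual network conditions and service performance.
- **Retry intelligently**: for retryable errors such as timeouts and connection errors, retry with exponential backoff.
- **Give meaningful error messages**: do not just log the error; also give users clear recovery suggestions.
- **Plan for degradation**: in business-critical scenarios, prepare a fallback to keep the system available.
- **Monitor and log**: record the request success rate and the error types in detail to support troubleshooting and tuning.

By following these practices you can build a robust, reliable StructBERT integration that gives users a stable similarity service.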
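
The monitoring recommendation above (recording the request success rate and the error types) is not demonstrated elsewhere in this tutorial, so here is a minimal sketch of one way it might look. `RequestStats` and its `track` method are hypothetical names invented for this illustration; they are not part of the StructBERT service or of requests. The helper uses only the standard library and keys failures by exception class name.

```python
from collections import Counter
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RequestStats:
    """Hypothetical helper: counts successes and failures by exception type."""

    def __init__(self) -> None:
        self.successes = 0
        self.failures: Counter = Counter()

    def track(self, func: Callable[[], T]) -> Optional[T]:
        """Run func, record the outcome, and return its result (None on failure)."""
        try:
            result = func()
        except Exception as e:
            # Key failures by class name, e.g. "Timeout" or "ConnectionError"
            self.failures[type(e).__name__] += 1
            return None
        self.successes += 1
        return result

    @property
    def total(self) -> int:
        """Number of tracked calls so far."""
        return self.successes + sum(self.failures.values())

    @property
    def success_rate(self) -> float:
        """Fraction of tracked calls that succeeded (0.0 if none were made)."""
        return self.successes / self.total if self.total else 0.0
```

To get useful error types, wrap a call that lets exceptions propagate, for example a function that calls `requests.post(...)` followed by `raise_for_status()`. A client method that catches its own exceptions and returns `None` would be counted as a success here.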

Authorship statement: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.
