# Qwen2.5-VL-Chord in Practice: Batch-Calling model.infer() from Python Scripts

## 1. Project Overview

### 1.1 What Is the Chord Visual Grounding Service

Chord is a visual grounding service built on the Qwen2.5-VL multimodal large model. It interprets a natural-language description, locates the matching objects in an image, and returns their bounding-box coordinates. It is particularly well suited to workloads that run grounding over many images.

### 1.2 Core Features

- **Natural-language grounding**: locate targets in an image from a text description
- **Batch processing**: process many images in a single run
- **Multi-target detection**: locate several target objects at once
- **High-performance inference**: GPU-accelerated, with bfloat16 precision support
- **Python API**: a concise programming interface

### 1.3 Use Cases

- Batch image annotation and dataset construction
- Batch image retrieval for smart photo albums
- Batch processing for robot visual navigation
- Batch defect localization in industrial quality inspection
- Batch violation detection in content moderation

## 2. Environment Setup and Configuration

### 2.1 Check the Service Status

Before writing any Python, make sure the Chord service is running:

```bash
supervisorctl status chord
```

The expected output looks like this:

```
chord RUNNING pid 135976, uptime 0:01:34
```

### 2.2 Import the Required Libraries

The script needs the following dependencies:

```python
import sys
import os
import time
import json
from pathlib import Path
from typing import List, Dict, Optional

from PIL import Image

# Add the Chord service path so that `model` can be imported
sys.path.append('/root/chord-service/app')
from model import ChordModel
```

### 2.3 Set the Project Paths

```python
# Base path configuration
BASE_DIR = Path("/root/chord-service")
MODEL_PATH = "/root/ai-models/syModelScope/chord"
INPUT_DIR = BASE_DIR / "input_images"
OUTPUT_DIR = BASE_DIR / "output_results"
LOG_DIR = BASE_DIR / "logs"

# Create the required directories (including missing parents)
for directory in [INPUT_DIR, OUTPUT_DIR, LOG_DIR]:
    directory.mkdir(parents=True, exist_ok=True)
```

## 3. Basic Single-Image Processing

### 3.1 Initialize the Model Instance

Start by creating and configuring a model instance:

```python
def initialize_model(device: str = "auto", model_path: str = MODEL_PATH) -> ChordModel:
    """
    Initialize the Chord model.

    Args:
        device: Device type: "auto", "cuda", or "cpu"
        model_path: Path to the model

    Returns:
        A loaded ChordModel instance
    """
    try:
        model = ChordModel(model_path=model_path, device=device)
        model.load()
        print("Model initialized successfully")
        return model
    except Exception as e:
        print(f"Model initialization failed: {e}")
        raise


# Usage example
model = initialize_model(device="cuda")
```

### 3.2 Single-Image Inference

The basic flow for one image is: load it, call `model.infer()`, and collect the result.

```python
def process_single_image(model: ChordModel, image_path: str, prompt: str) -> Dict:
    """
    Process a single image.

    Args:
        model: An initialized model instance
        image_path: Path to the image
        prompt: Text prompt

    Returns:
        A result dict; on failure it contains an "error" key instead of boxes
    """
    try:
        # Load the image
        if not os.path.exists(image_path):
            raise FileNotFoundError(f"Image not found: {image_path}")
        image = Image.open(image_path)

        # Record the start time
        start_time = time.time()

        # Run inference
        result = model.infer(
            image=image,
            prompt=prompt,
            max_new_tokens=512
        )

        # Compute the processing time
        processing_time = time.time() - start_time

        # Assemble the result
        result_info = {
            "image_path": image_path,
            "prompt": prompt,
            "boxes": result.get("boxes", []),
            "text_output": result.get("text", ""),
            "image_size": result.get("image_size", (0, 0)),
            "processing_time": round(processing_time, 2),
            "timestamp": time.strftime("%Y-%m-%d %H:%M:%S")
        }

        print(f"Image processed: {image_path}")
        print(f"  Targets detected: {len(result_info['boxes'])}")
        print(f"  Processing time: {result_info['processing_time']}s")
        return result_info

    except Exception as e:
        print(f"Image processing failed {image_path}: {e}")
        return {
            "image_path": image_path,
            "prompt": prompt,
            "error": str(e),
            "timestamp": time.strftime("%Y-%m-%d %H:%M:%S")
        }
```

### 3.3 Basic Usage Example

```python
# Example: process a single image
def example_single_processing():
    """Single-image processing example."""
    # Initialize the model
    model = initialize_model()

    # Set the image path and prompt
    image_path = "/path/to/your/image.jpg"
    prompt = "Find the person in the image"

    # Process the image
    result = process_single_image(model, image_path, prompt)

    # Print the result
    print("Inference result:")
    print(json.dumps(result, indent=2, ensure_ascii=False))
    return result


# Run the example
if __name__ == "__main__":
    example_single_processing()
```

## 4. Batch Processing

### 4.1 Core Batch Function

Now for the main topic: calling `model.infer()` over a whole directory of images. Note that the function below still calls `model.infer()` once per image; `batch_size` only groups images for progress reporting and for saving intermediate results.

```python
def batch_process_images(
    model: ChordModel,
    image_dir: str,
    prompts: List[str],
    output_file: Optional[str] = None,
    batch_size: int = 10,
    same_prompt_for_all: bool = False
) -> List[Dict]:
    """
    Process a directory of images in batches.

    Args:
        model: An initialized model instance
        image_dir: Path to the image directory
        prompts: List of prompts
        output_file: Path of the result file
        batch_size: Number of images per batch
        same_prompt_for_all: Whether to use the same prompt for every image

    Returns:
        A list with the result of every image
    """
    default_prompt = "Find the target in the image"

    # Collect image files. Comparing the lowercased suffix avoids the
    # duplicates that separate "*.jpg" / "*.JPG" globs can produce on
    # case-insensitive filesystems; sorting makes the order reproducible.
    image_extensions = {'.jpg', '.jpeg', '.png', '.bmp', '.webp'}
    image_files = sorted(
        p for p in Path(image_dir).glob("*")
        if p.is_file() and p.suffix.lower() in image_extensions
    )

    if not image_files:
        print("No image files found")
        return []

    print(f"Found {len(image_files)} images")

    all_results = []
    processed_count = 0

    # Process in batches
    for i in range(0, len(image_files), batch_size):
        batch_files = image_files[i:i + batch_size]
        batch_results = []
        print(f"\nProcessing batch {i // batch_size + 1} ({len(batch_files)} images)")

        for j, image_file in enumerate(batch_files):
            # Choose the prompt
            if not prompts:
                prompt = default_prompt
            elif same_prompt_for_all:
                prompt = prompts[0]
            else:
                # Cycle through the prompts by global image index so the
                # rotation does not restart with every batch
                prompt = prompts[(i + j) % len(prompts)]

            # Process one image
            result = process_single_image(model, str(image_file), prompt)
            batch_results.append(result)
            processed_count += 1

            # Show progress
            if (j + 1) % 5 == 0 or (j + 1) == len(batch_files):
                print(f"  Progress: {processed_count}/{len(image_files)}")

        all_results.extend(batch_results)

        # Optional: save intermediate results after every batch
        if output_file:
            save_results(all_results, output_file)

    print(f"\nBatch processing finished: {len(all_results)} images processed")
    return all_results
```

### 4.2 Saving Results

```python
def save_results(results: List[Dict], output_path: str) -> None:
    """
    Save processing results to a JSON file.

    Args:
        results: List of processing results
        output_path: Path of the output file
    """
    try:
        # Make sure the output directory exists
        output_dir = os.path.dirname(output_path)
        if output_dir:
            os.makedirs(output_dir, exist_ok=True)

        # Save as JSON
        with open(output_path, 'w', encoding='utf-8') as f:
            json.dump(results, f, indent=2, ensure_ascii=False)
        print(f"Results saved to: {output_path}")
    except Exception as e:
        print(f"Failed to save results: {e}")


def load_results(input_path: str) -> List[Dict]:
    """
    Load processing results from a JSON file.

    Args:
        input_path: Path of the input file

    Returns:
        The loaded list of results (empty on failure)
    """
    try:
        with open(input_path, 'r', encoding='utf-8') as f:
            return json.load(f)
    except Exception as e:
        print(f"Failed to load results: {e}")
        return []
```

### 4.3 Batch Processing Example

```python
def example_batch_processing():
    """Batch processing example."""
    # Initialize the model
    model = initialize_model()

    # Configure the run
    image_directory = "/path/to/your/images"
    output_json = str(OUTPUT_DIR / "batch_results.json")

    # Define several prompts
    prompts = [
        "Find the person in the image",
        "Locate all cars",
        "Find the person wearing red",
        "Detect animals",
        "Find electronic devices"
    ]

    # Run the batch
    results = batch_process_images(
        model=model,
        image_dir=image_directory,
        prompts=prompts,
        output_file=output_json,
        batch_size=8,
        same_prompt_for_all=False  # rotate prompts across images
    )

    # Summarize the results
    analyze_results(results)
    return results


def analyze_results(results: List[Dict]) -> None:
    """Print summary statistics for a list of results."""
    successful = [r for r in results if 'error' not in r]
    failed = [r for r in results if 'error' in r]

    print("\nProcessing summary:")
    print(f"  Succeeded: {len(successful)} images")
    print(f"  Failed: {len(failed)} images")

    if successful:
        total_boxes = sum(len(r.get('boxes', [])) for r in successful)
        avg_boxes = total_boxes / len(successful)
        avg_time = sum(r.get('processing_time', 0) for r in successful) / len(successful)
        print(f"  Total targets detected: {total_boxes}")
        print(f"  Average targets per image: {avg_boxes:.1f}")
        print(f"  Average processing time: {avg_time:.2f}s")

    if failed:
        print("\nFailure details:")
        for fail in failed[:5]:  # show only the first 5 failures
            print(f"  {fail['image_path']}: {fail['error']}")
```

## 5. Advanced Batch Techniques

### 5.1 Multi-Prompt Strategy

```python
def multi_prompt_strategy(
    model: ChordModel,
    image_path: str,
    prompts: List[str]
) -> Dict:
    """
    Run several prompts against a single image.

    Args:
        model: Model instance
        image_path: Path to the image
        prompts: List of prompts

    Returns:
        A dict holding the result of every prompt
    """
    results = {}
    for i, prompt in enumerate(prompts):
        print(f"  Prompt {i+1}/{len(prompts)}: {prompt}")
        result = process_single_image(model, image_path, prompt)
        results[f"prompt_{i+1}"] = {
            "prompt": prompt,
            "result": result
        }
    return results


def advanced_batch_processing():
    """Advanced batch processing example."""
    model = initialize_model()

    # Run several prompts on each image; only the first 5 images are used
    image_files = sorted(Path(INPUT_DIR).glob("*.jpg"))[:5]
    all_multi_results = {}

    for image_file in image_files:
        print(f"\nProcessing image: {image_file.name}")

        # Define several related prompts
        related_prompts = [
            "Find the person in the image",
            "Locate all faces",
            "Detect people wearing dark clothes",
            "Find people who are standing"
        ]

        multi_result = multi_prompt_strategy(model, str(image_file), related_prompts)
        all_multi_results[str(image_file)] = multi_result

    # Save the combined results
    advanced_output = OUTPUT_DIR / "multi_prompt_results.json"
    with open(advanced_output, 'w', encoding='utf-8') as f:
        json.dump(all_multi_results, f, indent=2, ensure_ascii=False)

    return all_multi_results
```

### 5.2 Batch Processing with Retries

```python
def process_with_retry(
    model: ChordModel,
    image_path: str,
    prompt: str,
    max_retries: int = 3,
    retry_delay: float = 2.0
) -> Dict:
    """
    Process a single image with retries.

    Args:
        model: Model instance
        image_path: Path to the image
        prompt: Prompt
        max_retries: Maximum number of attempts
        retry_delay: Delay between attempts, in seconds

    Returns:
        The processing result
    """
    last_error = "unknown error"
    for attempt in range(max_retries):
        # process_single_image catches its own exceptions and reports
        # failures through the "error" key, so that key is checked here
        result = process_single_image(model, image_path, prompt)
        if 'error' not in result:
            return result

        last_error = result['error']
        print(f"Attempt {attempt + 1} failed: {last_error}")
        if attempt < max_retries - 1:
            print(f"Waiting {retry_delay}s before retrying...")
            time.sleep(retry_delay)

    # Every attempt failed
    return {
        "image_path": image_path,
        "prompt": prompt,
        "error": f"All {max_retries} attempts failed; last error: {last_error}",
        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S")
    }


def robust_batch_processing():
    """Robust batch processing example."""
    model = initialize_model()
    image_files = list(Path(INPUT_DIR).glob("*.jpg"))
    results = []

    for i, image_file in enumerate(image_files):
        print(f"\nProcessing image {i+1}/{len(image_files)}: {image_file.name}")
        result = process_with_retry(
            model=model,
            image_path=str(image_file),
            prompt="Find the main target in the image",
            max_retries=3
        )
        results.append(result)

    return results
```

## 6. Performance Optimization and Monitoring

### 6.1 Monitoring Processing Performance

```python
class ProcessingMonitor:
    """Tracks throughput and failures for a batch run."""

    def __init__(self):
        self.start_time = None
        self.processed_count = 0
        self.total_boxes = 0
        self.failed_count = 0
        self.processing_times = []

    def start_batch(self):
        """Start (or restart) a batch run."""
        self.start_time = time.time()
        self.processed_count = 0
        self.total_boxes = 0
        self.failed_count = 0
        self.processing_times = []

    def record_result(self, result: Dict):
        """Record one processing result."""
        self.processed_count += 1
        if 'error' in result:
            self.failed_count += 1
        else:
            self.total_boxes += len(result.get('boxes', []))
            self.processing_times.append(result.get('processing_time', 0))

    def get_stats(self) -> Dict:
        """Return the statistics collected so far."""
        # Check the processed count rather than the timing list, so a run
        # in which every image failed still reports its failures
        if self.processed_count == 0:
            return {}

        elapsed = time.time() - self.start_time if self.start_time else 0
        avg_time = (
            sum(self.processing_times) / len(self.processing_times)
            if self.processing_times else 0
        )
        return {
            "total_processed": self.processed_count,
            "successful": self.processed_count - self.failed_count,
            "failed": self.failed_count,
            "total_boxes_detected": self.total_boxes,
            "total_time_seconds": round(elapsed, 2),
            "avg_processing_time": round(avg_time, 2),
            "images_per_minute": round((self.processed_count / elapsed) * 60, 2) if elapsed > 0 else 0
        }

    def print_stats(self):
        """Print the statistics."""
        stats = self.get_stats()
        if not stats:
            print("No statistics available yet")
            return

        print("\nPerformance summary:")
        print(f"  Images processed: {stats['total_processed']}")
        print(f"  Succeeded: {stats['successful']}")
        print(f"  Failed: {stats['failed']}")
        print(f"  Total targets detected: {stats['total_boxes_detected']}")
        print(f"  Total time: {stats['total_time_seconds']}s")
        print(f"  Average processing time: {stats['avg_processing_time']}s per image")
        print(f"  Throughput: {stats['images_per_minute']} images per minute")
```

### 6.2 Batch Processing with the Monitor

```python
def monitored_batch_processing():
    """Batch processing with performance monitoring."""
    model = initialize_model()
    monitor = ProcessingMonitor()
    image_files = list(Path(INPUT_DIR).glob("*.jpg"))
    results = []

    print("Starting monitored batch processing...")
    monitor.start_batch()

    for i, image_file in enumerate(image_files):
        # Show progress every 10 images
        if i % 10 == 0:
            print(f"  Progress: {i}/{len(image_files)}")

        result = process_single_image(model, str(image_file), "Find the target in the image")
        monitor.record_result(result)
        results.append(result)

    # Print the final statistics
    monitor.print_stats()

    # Save the results
    output_file = OUTPUT_DIR / "monitored_results.json"
    save_results(results, str(output_file))

    return results, monitor.get_stats()
```

## 7. Practical Application

### 7.1 A Complete Batch Processing Script

```python
#!/usr/bin/env python3
"""
Chord visual grounding batch processing script.
Calls model.infer() over a directory of images.
"""
import argparse
import sys

# Add the Chord service path
sys.path.append('/root/chord-service/app')


def main():
    """Entry point."""
    parser = argparse.ArgumentParser(description='Chord visual grounding batch tool')
    parser.add_argument('--input-dir', required=True, help='Input image directory')
    parser.add_argument('--output-file', default='batch_results.json', help='Output result file')
    parser.add_argument('--prompts', nargs='+', help='List of prompts')
    parser.add_argument('--batch-size', type=int, default=10, help='Batch size')
    parser.add_argument('--same-prompt', action='store_true', help='Use the same prompt for every image')
    parser.add_argument('--device', default='cuda', choices=['cuda', 'cpu', 'auto'], help='Compute device')
    args = parser.parse_args()

    # Import after argument parsing so that --help stays fast
    from model import ChordModel

    try:
        print("Starting Chord batch processing...")

        # Initialize the model
        print("Initializing model...")
        model = ChordModel(
            model_path="/root/ai-models/syModelScope/chord",
            device=args.device
        )
        model.load()
        print("Model initialized successfully")

        # Run the batch. `your_module` is a placeholder: it assumes the
        # functions from section 4 live in a separate module of your own.
        from your_module import batch_process_images
        results = batch_process_images(
            model=model,
            image_dir=args.input_dir,
            prompts=args.prompts or [],
            output_file=args.output_file,
            batch_size=args.batch_size,
            same_prompt_for_all=args.same_prompt
        )

        print(f"Done: {len(results)} results saved to {args.output_file}")

    except Exception as e:
        print(f"Processing failed: {e}")
        sys.exit(1)


if __name__ == "__main__":
    main()
```

### 7.2 Command-Line Usage Examples

```bash
# Example 1: use the same prompt for every image
python batch_chord.py \
  --input-dir /path/to/images \
  --output-file results.json \
  --prompts "Find the person in the image" \
  --same-prompt \
  --batch-size 8

# Example 2: rotate several prompts across the images
python batch_chord.py \
  --input-dir /path/to/images \
  --output-file results.json \
  --prompts "Find people" "Detect cars" "Locate animals" \
  --batch-size 5

# Example 3: run in CPU mode
python batch_chord.py \
  --input-dir /path/to/images \
  --output-file results.json \
  --prompts "Find the target" \
  --device cpu
```

## 8. Summary and Best Practices

### 8.1 Batch Processing Best Practices

This tutorial covered how to call the Chord service's `model.infer()` method efficiently over many images. Some recommended practices:

1. **Choose a sensible batch size**: in the scripts above, `batch_size` controls how often progress is reported and intermediate results are saved, because `model.infer()` still runs one image at a time; 8-16 is usually a reasonable range
2. **Vary the prompts**: prepare different prompts for different kinds of images to improve detection accuracy
3. **Handle errors**: implement retry logic to cope with occasional inference failures
4. **Monitor performance**: track processing speed and quality, and tune the pipeline accordingly
5. **Validate results**: review the output regularly to make sure quality meets expectations

### 8.2 Troubleshooting Common Problems

- **Out of memory**: switch to CPU mode with `--device cpu`; because images are inferred one at a time, lowering `batch_size` alone is unlikely to reduce GPU memory use
- **Slow processing**: check GPU utilization and make sure CUDA is configured correctly
- **Low detection accuracy**: refine the prompts and use more specific descriptions
- **Service connection problems**: check the Chord service status and make sure port 7860 is reachable

### 8.3 Directions for Further Optimization

1. **Asynchronous processing**: use asyncio to implement truly asynchronous batch processing
2. **Distributed processing**: spread large image sets across multiple machines
3. **Result post-processing**: add filtering and sorting of results
4. **Visual dashboard**: build a web interface to monitor batch progress

With these batch processing techniques, you can work through large image grounding workloads efficiently and make full use of the Qwen2.5-VL-Chord service.
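As one illustration of the result post-processing direction listed above, here is a minimal sketch that filters and sorts the boxes in a result dict. It assumes each entry in `boxes` is an `[x1, y1, x2, y2]` list of pixel coordinates, which the tutorial does not state explicitly; `box_area`, `filter_and_sort_boxes`, and `min_area` are hypothetical names introduced only for this example.

```python
from typing import Dict, List


def box_area(box: List[float]) -> float:
    """Area of an [x1, y1, x2, y2] box (assumed format); degenerate boxes count as 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def filter_and_sort_boxes(result: Dict, min_area: float = 0.0) -> Dict:
    """Return a copy of `result` with small boxes dropped and the rest sorted largest first."""
    kept = [b for b in result.get("boxes", []) if box_area(b) >= min_area]
    kept.sort(key=box_area, reverse=True)
    # Build a new dict so the original result is left untouched
    return {**result, "boxes": kept}


# Hand-written sample in the shape produced by process_single_image
sample = {
    "image_path": "demo.jpg",
    "boxes": [[0, 0, 10, 10], [5, 5, 105, 55], [20, 20, 22, 21]],
}
cleaned = filter_and_sort_boxes(sample, min_area=50)
print(cleaned["boxes"])  # [[5, 5, 105, 55], [0, 0, 10, 10]]
```

Applied to the list returned by a batch run, this could be used as `[filter_and_sort_boxes(r, min_area=50) for r in results if "error" not in r]`. The area threshold is arbitrary and should be tuned to the image resolution.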

Disclosure: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.
