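<think>We are discussing hand-written SQL and Python questions from interviews for Python test engineers. According to reference [4], the SQL questions usually involve multi-table join queries, including JOIN ... ON, GROUP BY, ORDER BY and HAVING, plus some date-format conversion functions. The Python questions may cover basic data-structure operations, string processing and file I/O, as well as testing-related scenarios such as handling JSON data (as in the example from reference [5]).
Below I collect typical hand-written SQL and Python questions separately and give a sample answer for each.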
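### I. Hand-written SQL questions
1. **Multi-table join query**
- Sample question: there are two tables, `orders` and `customers`. The orders table has the columns `order_id`, `customer_id`, `order_date`, `amount`. The customers table has the columns `customer_id`, `customer_name`, `country`. Query the total order amount for each customer country in 2023, sorted by amount in descending order.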
```sql
SELECT c.country, SUM(o.amount) AS total_amount
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_date >= '2023-01-01' AND o.order_date < '2024-01-01' -- half-open range, so timestamps during 2023-12-31 are included
GROUP BY c.country
ORDER BY total_amount DESC;
```
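2. **Grouped aggregation with filtering**
- Sample question: using the orders table above, find the customers whose 2023 order total exceeds 10,000 yuan, and show the customer name and the total amount.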
```sql
SELECT c.customer_name, SUM(o.amount) AS total_amount
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
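WHERE o.order_date >= '2023-01-01' AND o.order_date < '2024-01-01' -- half-open range, so timestamps during 2023-12-31 are included
GROUP BY c.customer_id, c.customer_name -- group by the id as well, so customers who share a name are not merged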
HAVING SUM(o.amount) > 10000;
```
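3. **Date format conversion**
- Sample question: convert the date column `order_date` of the orders table (assumed to be of type TIMESTAMP) to a string in 'YYYY-MM-DD' format, and query all orders from March 2023.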
```sql
SELECT order_id, customer_id, TO_CHAR(order_date, 'YYYY-MM-DD') AS order_date_str, amount
FROM orders
WHERE TO_CHAR(order_date, 'YYYY-MM') = '2023-03';
```
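To check a query like the first one without a database server, it can be run against a throwaway in-memory SQLite database from Python. The following is a minimal sketch, not part of the referenced material: the sample rows and the helper names `build_sample_db` and `country_totals_2023` are made up for illustration, and dates are stored as ISO-8601 text because SQLite has no TIMESTAMP type or `TO_CHAR`.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database with a few illustrative rows (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT, country TEXT);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT, amount INTEGER);
        INSERT INTO customers VALUES (1, 'Alice', 'CN'), (2, 'Bob', 'US');
        INSERT INTO orders VALUES
            (1, 1, '2023-02-01 10:00:00', 300),
            (2, 1, '2023-12-31 23:30:00', 200),
            (3, 2, '2023-06-15 09:00:00', 400),
            (4, 2, '2024-01-01 00:00:00', 999);
    """)
    return conn


def country_totals_2023(conn):
    """Return (country, total_amount) rows for 2023, largest total first."""
    query = """
        SELECT c.country, SUM(o.amount) AS total_amount
        FROM orders o
        JOIN customers c ON o.customer_id = c.customer_id
        WHERE o.order_date >= '2023-01-01' AND o.order_date < '2024-01-01'
        GROUP BY c.country
        ORDER BY total_amount DESC
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(country_totals_2023(build_sample_db()))
```

The sample data includes an order late on 2023-12-31 and one at midnight on 2024-01-01, which are the two boundary cases worth mentioning when discussing the date filter in an interview.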
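### II. Hand-written Python questions
1. **Data-structure operations**
- Sample question: a list contains several dictionaries, each representing an order, such as `[{"order_id": 1, "amount": 100}, {"order_id": 2, "amount": 200}, ...]`. Write a function that returns the total order amount.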
```python
def total_amount(orders):
    return sum(order['amount'] for order in orders)
```
2. **String processing**
- Sample question: given a string, count how many times each word occurs (ignoring case and punctuation).
```python
import re
from collections import Counter
def word_count(text):
    words = re.findall(r'\b\w+\b', text.lower())
    return Counter(words)
```
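3. **JSON data processing (as in reference [5])**
- Sample question: write a function that flattens a nested JSON object so that the output dictionary has no nesting and every key sits at the top level.
Input: `{"a":"aa","b":"bb","c":{"d":"dd","e":"ee"}}`
Output: `{'a': 'aa', 'b': 'bb', 'd': 'dd', 'e': 'ee'}`
Note: the expected output in reference [5] does not keep the parent key (`'c'`); the inner keys are lifted straight to the top level. Flattening normally preserves the path, for example `{'c.d': 'dd', 'c.e': 'ee'}`, and dropping the parent key can cause key collisions in practice. Here we follow the output format the question asks for.
If the requirement really is a direct merge (ignoring key collisions), it can be written like this: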
```python
def flatten_dict(d):
    result = {}
    for key, value in d.items():
        if isinstance(value, dict):
            inner_dict = flatten_dict(value)  # flatten recursively
            result.update(inner_dict)
        else:
            result[key] = value
    return result
```
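For the input `{"a":"aa","b":"bb","c":{"d":"dd","e":"ee"}}`, the function above returns `{'a': 'aa', 'b': 'bb', 'd': 'dd', 'e': 'ee'}`.
Note: if keys collide (for example, the outer and inner levels share a key), the value written later in iteration order overwrites the earlier one. The question's input has no such collision, so this version is acceptable.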
4. **File reading and processing**
- Sample question: read a text file and find the 3 most frequent words in it.
```python
from collections import Counter
import re
def top_words(filename, n=3):
    # Set the encoding explicitly so the result does not depend on the platform default
    with open(filename, 'r', encoding='utf-8') as file:
        text = file.read()
    words = re.findall(r'\b\w+\b', text.lower())
    word_counts = Counter(words)
    return word_counts.most_common(n)
```
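5. **Testing-related: generating mock test data**
- Sample question: write a function that generates test data: 100 orders, each with an `order_id` (starting from 1) and an `amount` (a random integer between 100 and 1000).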
```python
import random
def generate_orders(n=100):
    orders = []
    for i in range(1, n+1):
        orders.append({
            'order_id': i,
            'amount': random.randint(100, 1000)
        })
    return orders
```
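6. **Testing-related: assertions and exception handling**
- Sample question: write a function that checks that an order amount is positive and raises an exception otherwise, then write test cases for it (using pytest).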
```python
import pytest

class InvalidOrderException(Exception):
    pass

def validate_order(order):
    if order['amount'] <= 0:
        raise InvalidOrderException(f"Invalid amount: {order['amount']}")

# Test cases (kept in a separate test file)
def test_validate_order():
    valid_order = {'amount': 100}
    validate_order(valid_order)  # must not raise
    invalid_order = {'amount': -10}
    # pytest.raises fails the test if the exception is not raised
    with pytest.raises(InvalidOrderException):
        validate_order(invalid_order)
```
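The flattening exercise above silently overwrites values when two levels share a key. If an interviewer asks how to make that failure visible, one option is a variant that raises instead. This is only a sketch of that idea; the name `flatten_dict_strict` and the choice of `ValueError` are assumptions, not something the referenced question specifies.

```python
def flatten_dict_strict(d):
    """Lift nested keys to the top level, raising ValueError on duplicate keys."""
    result = {}
    for key, value in d.items():
        # A nested dict contributes its own flattened items; a plain value contributes itself.
        items = flatten_dict_strict(value) if isinstance(value, dict) else {key: value}
        for inner_key, inner_value in items.items():
            if inner_key in result:
                raise ValueError(f"duplicate key after flattening: {inner_key!r}")
            result[inner_key] = inner_value
    return result


if __name__ == "__main__":
    print(flatten_dict_strict({"a": "aa", "b": "bb", "c": {"d": "dd", "e": "ee"}}))
```

For an input such as `{"a": 1, "c": {"a": 2}}` this version raises rather than returning a dictionary with one of the two values lost.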
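### III. Summary
In interviews for Python test engineers, hand-written SQL questions focus on multi-table joins, grouped aggregation, conditional filtering and date handling. Hand-written Python questions focus on basic data-structure operations, string processing, file I/O, JSON handling and code for testing-related scenarios. When writing the code, remember to consider boundary conditions and exception handling.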
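§§Related questions§§
1. In SQL, what is the difference between LEFT JOIN and INNER JOIN? Illustrate with an example.
2. How do you recursively walk all files under a directory in Python?
3. In testing, how do you mock an API endpoint that requires authorization?
4. How do you write parametrized test cases with pytest to cover different inputs and expected outputs?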
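[^1]: Python test engineer interview questions, basics
[^2]: Collected interview questions on Python test automation
[^4]: Test engineer interview questions (SQL and Python basics)
[^5]: Hand-written Python question example: converting JSON to a dictionary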
</think>
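### Selected hand-written SQL and Python questions for Python test engineer interviews
#### I. Hand-written SQL questions (focus: multi-table joins and data processing)
1. **Multi-table join query**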
```sql
/* Latest order for each customer (customers table, orders table) */
SELECT c.customer_id, c.name, o.order_date, o.amount
FROM customers c
JOIN (
SELECT customer_id, MAX(order_date) AS latest_date
FROM orders
GROUP BY customer_id
) latest ON c.customer_id = latest.customer_id
JOIN orders o ON o.customer_id = latest.customer_id
AND o.order_date = latest.latest_date;
```
**Focus**: subquery + JOIN optimization[^4]
2. **Aggregation and filtering**
```sql
/* Customers whose order total exceeded 100,000 in a month of 2023, reported per month */
SELECT
c.customer_id,
TO_CHAR(o.order_date, 'YYYY-MM') AS month,
SUM(o.amount) AS total_amount
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_date >= '2023-01-01' AND o.order_date < '2024-01-01' -- half-open range, so timestamps during 2023-12-31 are included
GROUP BY c.customer_id, TO_CHAR(o.order_date, 'YYYY-MM')
HAVING SUM(o.amount) > 100000
ORDER BY month DESC, total_amount DESC;
```
**Focus**: GROUP BY + HAVING + date functions[^4]
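The latest-order query from item 1 can be tried out against an in-memory SQLite database. This is a rough sketch under assumptions: the table layout and rows are invented for illustration, dates are stored as ISO-8601 text, and an `ORDER BY` is added only to make the output deterministic.

```python
import sqlite3

LATEST_ORDER_SQL = """
    SELECT c.customer_id, c.name, o.order_date, o.amount
    FROM customers c
    JOIN (
        SELECT customer_id, MAX(order_date) AS latest_date
        FROM orders
        GROUP BY customer_id
    ) latest ON c.customer_id = latest.customer_id
    JOIN orders o ON o.customer_id = latest.customer_id
                 AND o.order_date = latest.latest_date
    ORDER BY c.customer_id
"""


def build_sample_db():
    """Create an in-memory database with hypothetical rows; Carol has no orders."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT, amount INTEGER);
        INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
        INSERT INTO orders VALUES
            (1, 1, '2023-01-05', 100),
            (2, 1, '2023-03-09', 250),
            (3, 2, '2023-02-11', 80);
    """)
    return conn


def latest_orders(conn):
    """Return one row per customer holding that customer's most recent order."""
    return conn.execute(LATEST_ORDER_SQL).fetchall()


if __name__ == "__main__":
    for row in latest_orders(build_sample_db()):
        print(row)
```

Two behaviors are worth pointing out when presenting this query: customers without any order drop out because both joins are inner joins, and a customer with two orders sharing the same latest `order_date` would appear twice.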
---
#### II. Hand-written Python questions (focus: data processing and testing scenarios)
1. **Flattening JSON data** (extending the example from reference [5])
```python
def flatten_json(data, prefix=''):
    """Convert nested JSON into a single-level dict with dot-separated keys."""
    result = {}
    for key, value in data.items():
        new_key = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            result.update(flatten_json(value, new_key))
        else:
            result[new_key] = value
    return result
# Test case
input_data = {"a":"aa", "b":"bb", "c": {"d":"dd", "e":"ee"}}
print(flatten_json(input_data))
# Output: {'a': 'aa', 'b': 'bb', 'c.d': 'dd', 'c.e': 'ee'}
```
2. **Abnormal order detection** (a testing-scenario implementation)
```python
def detect_abnormal_orders(orders, threshold=3):
    """
    Detect abnormal orders: flag an order when the same user placed it
    less than `threshold` minutes after their previous order.
    orders format: [{"user_id":101, "order_time": "2023-01-01 12:00:00"}, ...]
    """
    from collections import defaultdict
    from datetime import datetime, timedelta
    user_orders = defaultdict(list)
    abnormal = []
    # Group by user; sorting first keeps each user's orders in time order
    for order in sorted(orders, key=lambda x: x["order_time"]):
        user_orders[order["user_id"]].append(order)
    # Compare each order with the same user's previous order
    # (user_list avoids shadowing the `orders` parameter)
    for user_id, user_list in user_orders.items():
        if len(user_list) < 2:
            continue
        for i in range(1, len(user_list)):
            t1 = datetime.strptime(user_list[i-1]["order_time"], "%Y-%m-%d %H:%M:%S")
            t2 = datetime.strptime(user_list[i]["order_time"], "%Y-%m-%d %H:%M:%S")
            if (t2 - t1) < timedelta(minutes=threshold):
                abnormal.append(user_list[i])
    return abnormal
```
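The function above only compares each order with the one immediately before it. If "abnormal" is instead taken to mean more than a fixed number of orders inside a time window, a sliding window per user is one way to express it. The sketch below is an alternative under that assumption, not the approach of the original answer; `detect_order_bursts`, `max_orders` and `window_minutes` are illustrative names.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def detect_order_bursts(orders, max_orders=3, window_minutes=10):
    """Return the user_ids that placed more than max_orders within any window_minutes span."""
    window = timedelta(minutes=window_minutes)
    by_user = defaultdict(list)
    for order in orders:
        by_user[order["user_id"]].append(datetime.strptime(order["order_time"], TIME_FORMAT))

    flagged = set()
    for user_id, times in by_user.items():
        recent = deque()
        for t in sorted(times):
            recent.append(t)
            # Drop timestamps that fall outside the window ending at t.
            while t - recent[0] > window:
                recent.popleft()
            if len(recent) > max_orders:
                flagged.add(user_id)
                break
    return flagged


if __name__ == "__main__":
    sample = [
        {"user_id": 101, "order_time": "2023-01-01 12:00:00"},
        {"user_id": 101, "order_time": "2023-01-01 12:02:00"},
        {"user_id": 101, "order_time": "2023-01-01 12:04:00"},
        {"user_id": 101, "order_time": "2023-01-01 12:06:00"},
        {"user_id": 102, "order_time": "2023-01-01 12:00:00"},
        {"user_id": 102, "order_time": "2023-01-01 13:00:00"},
    ]
    print(detect_order_bursts(sample))
```

Each timestamp enters and leaves the deque at most once, so after sorting the scan is linear in the number of orders per user.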
---
#### III. Test-specific SQL question (database verification scenario)
```sql
/* Verify inventory sync consistency (ERP table vs platform table) */
SELECT
e.sku,
e.erp_stock,
p.platform_stock,
ABS(e.erp_stock - p.platform_stock) AS diff
FROM erp_inventory e
JOIN platform_inventory p ON e.sku = p.sku
WHERE e.last_updated > CURRENT_TIMESTAMP - INTERVAL '1 HOUR'
AND p.last_updated > CURRENT_TIMESTAMP - INTERVAL '1 HOUR'
AND ABS(e.erp_stock - p.platform_stock) > 5; -- a difference of up to 5 units is tolerated
```
**Focus**: data consistency check + time-window filtering[^2]
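Because the query uses an inner join, a SKU that exists in only one of the two tables never shows up in the result. When the rows have already been fetched into Python, the same check can be extended to report those SKUs too. This is a minimal sketch assuming each side has been loaded into a `{sku: quantity}` dictionary; `stock_mismatches` is a hypothetical helper name.

```python
def stock_mismatches(erp_stock, platform_stock, tolerance=5):
    """Compare two {sku: quantity} mappings.

    Returns (mismatched, missing): mismatched maps each shared SKU whose
    absolute difference exceeds tolerance to that difference; missing is
    the set of SKUs present on only one side.
    """
    mismatched = {}
    for sku in erp_stock.keys() & platform_stock.keys():
        diff = abs(erp_stock[sku] - platform_stock[sku])
        if diff > tolerance:
            mismatched[sku] = diff
    missing = set(erp_stock.keys() ^ platform_stock.keys())
    return mismatched, missing


if __name__ == "__main__":
    erp = {"A001": 10, "A002": 50, "A003": 7}
    platform = {"A001": 12, "A002": 30, "A004": 1}
    print(stock_mismatches(erp, platform))
```

In SQL the missing-SKU case would typically be covered with an outer join instead of the inner join shown above.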
---
#### IV. Test-specific Python question (API testing scenario)
```python
import requests
import pytest
def test_order_sync():
    """Test the order sync endpoint."""
    # 1. Build the test data
    payload = {
        "order_id": "TEST_12345",
        "items": [{"sku": "A001", "qty": 2}],
        "status": "paid"
    }
    # 2. Call the sync endpoint
    response = requests.post(
        "https://api.example.com/order/sync",
        json=payload,
        timeout=5
    )
    # 3. Verify the response
    assert response.status_code == 200
    assert response.json()["success"] is True
    # 4. Verify the database
    # query_db is not defined in this snippet; it is assumed to be a
    # project-provided helper that returns a list of row dicts
    db_record = query_db("SELECT status FROM orders WHERE order_id='TEST_12345'")
    assert db_record[0]["status"] == "paid"
```
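The test above depends on a live endpoint and a database. For a unit-level check that runs offline, the HTTP call can be isolated behind a small function and exercised with a fake session from `unittest.mock`. The following is a sketch under assumptions: `sync_order` and `make_fake_session` are hypothetical names, and the fake only imitates the `post`, `status_code` and `json()` parts of a `requests.Session` that the function uses.

```python
from unittest.mock import Mock

SYNC_URL = "https://api.example.com/order/sync"


def sync_order(session, payload):
    """Post the payload and return True when the service reports success."""
    response = session.post(SYNC_URL, json=payload, timeout=5)
    return response.status_code == 200 and response.json().get("success") is True


def make_fake_session(status_code, body):
    """Build a Mock that mimics the small part of requests.Session used above."""
    response = Mock()
    response.status_code = status_code
    response.json.return_value = body
    session = Mock()
    session.post.return_value = response
    return session


def test_sync_order_success():
    session = make_fake_session(200, {"success": True})
    payload = {"order_id": "TEST_12345", "items": [{"sku": "A001", "qty": 2}], "status": "paid"}
    assert sync_order(session, payload) is True
    # The fake records the call, so the request arguments can be checked too.
    session.post.assert_called_once_with(SYNC_URL, json=payload, timeout=5)


def test_sync_order_failure():
    session = make_fake_session(500, {"success": False})
    assert sync_order(session, {"order_id": "TEST_12345"}) is False
```

In a real project a `requests.Session()` would be passed to `sync_order`; injecting the session keeps the function testable without patching module globals.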
---