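I. Prepare the image

# 1. Pull the Docker image
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/device/paddle-npu:cann800-ubuntu20-npu-910b-base-aarch64-gcc84
# 2. Export the image (the tag is included so that only this image is saved)
docker save ccr-2vdh3abv-pub.cnc.bj.baidubce.com/device/paddle-npu:cann800-ubuntu20-npu-910b-base-aarch64-gcc84 | gzip > paddle_npu_cann800_ubuntu20_910b_aarch64_gcc84.tar.gz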

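II. Prepare the PaddleOCR packages

Note: download the packages on an aarch64 machine running Python 3.10, so that the wheels match the architecture and the Python version inside the image.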
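
The architecture and Python version of the download machine can be checked before running any `pip download` command. The following is a minimal sketch using only the standard library; the helper name `check_download_env` and its default target values are assumptions made for this illustration, not part of any Paddle tooling.

```python
import platform
import sys


def check_download_env(machine, version, want_machine="aarch64", want_version=(3, 10)):
    """Return a list of mismatches between this host and the target image."""
    problems = []
    if machine != want_machine:
        problems.append(f"architecture is {machine}, expected {want_machine}")
    if tuple(version[:2]) != tuple(want_version):
        problems.append(
            f"Python is {version[0]}.{version[1]}, "
            f"expected {want_version[0]}.{want_version[1]}"
        )
    return problems


if __name__ == "__main__":
    # platform.machine() reports the CPU architecture of the current host.
    issues = check_download_env(platform.machine(), sys.version_info)
    print("OK" if not issues else "\n".join(issues))
```

An empty list means the host matches the target; otherwise each entry names one mismatch to fix before downloading.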

pip download -d ./packs paddlepaddle -i https://www.paddlepaddle.org.cn/packages/nightly/cpu/
pip download -d ./packs paddle-custom-npu -i https://www.paddlepaddle.org.cn/packages/nightly/npu/
pip download -d ./packs "paddleocr[all]" -i https://mirrors.aliyun.com/pypi/simple/
# Note: the packages below are also required; without them the recognition model raises an error
pip download -d ./packs numpy==1.26.4 -i https://mirrors.aliyun.com/pypi/simple/
pip download -d ./packs opencv-python==3.4.18.65 -i https://mirrors.aliyun.com/pypi/simple/

III. Manual startup

1. Prerequisites

1.1 Start the container

# Load the image
docker load -i paddle_npu_cann800_ubuntu20_910b_aarch64_gcc84.tar.gz

# Start the container
docker run -it --name paddle-npu-dev -v $(pwd):/work \
    --privileged --network=host --shm-size=128G -w=/work \
    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
    -v /usr/local/dcmi:/usr/local/dcmi \
    -e ASCEND_RT_VISIBLE_DEVICES="0,1,2,3,4,5,6,7" \
    ccr-2vdh3abv-pub.cnc.bj.baidubce.com/device/paddle-npu:cann800-ubuntu20-npu-910b-base-aarch64-gcc84 /bin/bash

1.2 Install the Python dependencies

  • These are the packages downloaded in Section II; install them inside the paddle-npu-dev container
pip install paddlepaddle==3.3.0 --no-index --find-links=./packs/packages
pip install paddle-custom-npu==3.3.0 --no-index --find-links=./packs/packages
pip install "paddleocr[all]" --no-index --find-links=./packs/packages
pip install numpy==1.26.4 --no-index --find-links=./packs/packages
pip install opencv-python==3.4.18.65 --no-index --find-links=./packs/packages
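
After the offline install it can be worth confirming that the pinned versions are the ones actually present in the container. The sketch below uses `importlib.metadata` from the standard library; the `EXPECTED` table simply repeats the pins from the install commands, and the helper names are assumptions made for this example.

```python
from importlib import metadata

# Pins copied from the install commands in this section.
EXPECTED = {
    "paddlepaddle": "3.3.0",
    "paddle-custom-npu": "3.3.0",
    "numpy": "1.26.4",
    "opencv-python": "3.4.18.65",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose installed version differs to (wanted, found)."""
    return {
        name: (wanted, lookup(name))
        for name, wanted in expected.items()
        if lookup(name) != wanted
    }


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

No output means every pinned package is installed at the expected version.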

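1.3 Install the web server dependencies

These are needed by both the paddlex service and gunicorn.

  • The paddlex serving plugin needs extra dependency packages. Download them yourself, then install them inside the image
  • Get the list of Python dependencies of the paddlex service
from paddlex.utils.deps import get_serving_dep_specs

with open("paddlex_requirements.txt", "w") as f:
    f.write("\n".join(get_serving_dep_specs()))
  • Download the packages
# Download the Python packages
pip download -d ./packs -r paddlex_requirements.txt -i https://mirrors.aliyun.com/pypi/simple/
# Archive them
tar -czvf paddlex_packs.tar.gz ./packs
  • If you use gunicorn, download the gunicorn package and its dependencies as well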

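2. paddlex: single-concurrency service

This mode supports a concurrency of 1 only: each image is fully processed before the next one starts.

2.1 Install the paddlex service

  • Install the paddlex serving plugin
paddlex --install serving

2.2 Start the paddlex service

  • Start command:
paddlex --serve --pipeline OCR --device npu:0 --port 8004
  • Request format:
url: http://127.0.0.1:8004/ocr
method: POST
body (json):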
	{"file": "http://127.0.0.1:12345/test.png"}
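
The request format above can be exercised from Python. This is a minimal sketch using only `urllib` from the standard library; the URL, port, and `file` field come from the request format above, while the helper names `build_ocr_request` and `call_ocr` are assumptions made for this example. The response is returned as-is, since its exact structure should be checked against the running service.

```python
import json
import urllib.request

SERVICE_URL = "http://127.0.0.1:8004/ocr"  # taken from the request format above


def build_ocr_request(file_url, service_url=SERVICE_URL):
    """Build a POST request whose JSON body carries the image URL in "file"."""
    body = json.dumps({"file": file_url}).encode("utf-8")
    return urllib.request.Request(
        service_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_ocr(file_url, timeout=300):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(build_ocr_request(file_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    print(call_ocr("http://127.0.0.1:12345/test.png"))
```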

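3. Gunicorn + uvicorn + FastAPI (multi-concurrency service)

3.1 Install gunicorn

  • Download the gunicorn package yourself and install it

3.2 Code

  • Save the following as paddle_ocr_server.py, the module name used by the start command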
import base64
import binascii
import os
import uuid
from contextlib import asynccontextmanager
from typing import Dict, Any, List

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from paddlex import create_pipeline


def prune_result(result: dict) -> dict:
    KEYS_TO_REMOVE = ["input_path", "page_index"]

    def _process_obj(obj):
        if isinstance(obj, dict):
            return {
                k: _process_obj(v) for k, v in obj.items() if k not in KEYS_TO_REMOVE
            }
        elif isinstance(obj, list):
            return [_process_obj(item) for item in obj]
        else:
            return obj

    return _process_obj(result)


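# Create the temporary folder for images (exist_ok avoids a race between workers)
OCR_TMP = os.path.join(os.getcwd(), "ocr_tmp")
os.makedirs(OCR_TMP, exist_ok=True)


def get_random_file_path(filename):
    # Keep only the extension of the client-supplied name; the stem is random
    ext = os.path.splitext(os.path.basename(filename))[1]
    return os.path.join(OCR_TMP, uuid.uuid4().hex + ext)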


def init_ocr_pipeline():
    """Initialize the OCR pipeline (runs only once per process)."""
    global ocr_pipeline
    if ocr_pipeline is None:
        try:
            ocr_pipeline = create_pipeline(pipeline="OCR", device="npu")
            print(f"OCR initialized (PID: {os.getpid()})")
        except Exception as e:
            print(f"OCR initialization failed: {e}")
            raise RuntimeError(f"OCR initialization failed: {e}") from e


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Load the model when the application starts
    init_ocr_pipeline()
    yield


# Initialize the FastAPI application
app = FastAPI(lifespan=lifespan)

# Global variable that holds the OCR pipeline after initialization
ocr_pipeline = None


# Request body model
class OCRRequest(BaseModel):
    image_base64: str
    image_name: str


# OCR recognition endpoint
@app.post("/ocr/recognize", response_model=Dict[str, Any])
async def ocr_recognize(request: OCRRequest):
    file_path = None  # Temporary file path, cleaned up in finally
    try:
        # Base64 input: a data-URI prefix (everything up to the comma) is stripped if present
        base64_str = request.image_base64

        file_path = get_random_file_path(request.image_name)

        if "," in base64_str:
            base64_str = base64_str.split(",")[-1]

        # Decode the Base64 string into binary data
        image_data = base64.b64decode(base64_str)

        # Write the binary data to the temporary file
        with open(file_path, "wb") as f:
            f.write(image_data)

        result = ocr_pipeline.predict(
            file_path
        )

        # Collect the OCR results
        ocr_results: List[Dict[str, Any]] = []
        for item in result:
            pruned_res = prune_result(item.json["res"])
            ocr_results.append(
                {
                    "prunedResult": pruned_res,
                }
            )

        # Return the recognition results
        return {
            "code": 200,
            "message": "Recognition succeeded",
            "data": ocr_results
        }

    except binascii.Error:
        raise HTTPException(status_code=400, detail="Invalid Base64 string")
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Recognition failed: {e}")
    finally:
        # file_path is still None if the failure happened before it was assigned
        if file_path and os.path.exists(file_path):
            os.remove(file_path)
  • Start command
gunicorn paddle_ocr_server:app -w 8 -b 0.0.0.0:8004 -k uvicorn.workers.UvicornWorker
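
With `-w 8`, gunicorn starts eight worker processes, so several requests can be in flight at once. A small load script can confirm this by firing requests in parallel and timing the batch. The sketch below is an illustration only: `run_concurrently` is a hypothetical helper, and `fake_send` is a stand-in that sleeps instead of calling the service; in practice it would be replaced by a function that posts one image to the endpoint.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(send, items, max_workers=8):
    """Call send(item) for every item in parallel threads.

    Returns (results in input order, elapsed seconds).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(send, items))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in for a real request: sleeps to imitate a 0.2 s OCR call.
    def fake_send(path):
        time.sleep(0.2)
        return f"done:{path}"

    results, elapsed = run_concurrently(fake_send, [f"{i}.png" for i in range(8)])
    print(results, f"{elapsed:.2f}s")
```

If the elapsed time for eight requests is close to the time of a single request, the workers are serving them in parallel; if it is close to eight times that, requests are being serialized somewhere.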

IV. docker-compose startup

1. start.sh

#!/bin/bash
set -e

echo "Installing offline packages from /work/packages..."
pip install paddlepaddle==3.3.0 --no-index --find-links=/work/packages
pip install paddle-custom-npu==3.3.0 --no-index --find-links=/work/packages
pip install "paddleocr[all]" --no-index --find-links=/work/packages
pip install -r /work/paddlex_requirements.txt --no-index --find-links=/work/packs
pip install packages/numpy-1.26.4-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
pip install packages/opencv_python-3.4.18.65-cp36-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
pip install gunicorn --no-index --find-links=/work/packages

# Fixes the "cannot find .so" error that appears after the packages are installed
source /root/.bashrc

gunicorn paddle_ocr_server:app -w 8 -b 0.0.0.0:8004 -k uvicorn.workers.UvicornWorker

2. docker-compose.yaml

services:
  fxyj_paddle:
    image: ccr-2vdh3abv-pub.cnc.bj.baidubce.com/device/paddle-npu:cann800-ubuntu20-npu-910b-base-aarch64-gcc84
    container_name: fxyj_paddle_910b
    privileged: true
    network_mode: host
    shm_size: 128G
    working_dir: /work
    volumes:
      - .:/work
      - /usr/local/Ascend/driver:/usr/local/Ascend/driver:ro
      - /usr/local/bin/npu-smi:/usr/local/bin/npu-smi:ro
      - /usr/local/dcmi:/usr/local/dcmi:ro
    environment:
      - ASCEND_RT_VISIBLE_DEVICES=5
      - FLAGS_npu_jit_compile=0
      - PADDLE_PDX_CACHE_HOME=/work/models
    entrypoint: ["/work/start.sh"]
    command: ["/bin/bash"]
    stdin_open: true
    tty: true

3. Server + client (no API authentication)

3.1 Server

import base64
import binascii
import os
import uuid
from contextlib import asynccontextmanager
from typing import Dict, Any, List

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from paddlex import create_pipeline


class OCRUtils:

    @staticmethod
    def prune_result(result: dict) -> dict:
        KEYS_TO_REMOVE = ["input_path", "page_index"]

        def _process_obj(obj):
            if isinstance(obj, dict):
                return {
                    k: _process_obj(v) for k, v in obj.items() if k not in KEYS_TO_REMOVE
                }
            elif isinstance(obj, list):
                return [_process_obj(item) for item in obj]
            else:
                return obj

        return _process_obj(result)

    @staticmethod
    def get_random_file_path(filename):
        # Keep only the extension of the client-supplied name; the stem is random
        ext = os.path.splitext(os.path.basename(filename))[1]
        return os.path.join(OCR_TMP, uuid.uuid4().hex + ext)

    @staticmethod
    def init_ocr_pipeline():
        """Initialize the OCR pipeline (runs only once per process)."""
        global ocr_pipeline
        if ocr_pipeline is None:
            try:
                ocr_pipeline = create_pipeline(pipeline="OCR", device="npu")
                print(f"OCR initialized (PID: {os.getpid()})")
            except Exception as e:
                print(f"OCR initialization failed: {e}")
                raise RuntimeError(f"OCR initialization failed: {e}") from e


# Create the temporary folder for images (exist_ok avoids a race between workers)
OCR_TMP = os.path.join(os.getcwd(), "ocr_tmp")
os.makedirs(OCR_TMP, exist_ok=True)


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Load the model when the application starts
    OCRUtils.init_ocr_pipeline()
    yield


# Initialize the FastAPI application
app = FastAPI(lifespan=lifespan)

# Global variable that holds the OCR pipeline after initialization
ocr_pipeline = None


# Request body model
class OCRRequest(BaseModel):
    image_base64: str
    image_name: str


# OCR recognition endpoint
@app.post("/ocr/recognize", response_model=Dict[str, Any])
async def ocr_recognize(request: OCRRequest):
    file_path = None  # Temporary file path, cleaned up in finally
    try:
        # Base64 input: a data-URI prefix (everything up to the comma) is stripped if present
        base64_str = request.image_base64

        file_path = OCRUtils.get_random_file_path(request.image_name)

        if "," in base64_str:
            base64_str = base64_str.split(",")[-1]

        # Decode the Base64 string into binary data
        image_data = base64.b64decode(base64_str)

        # Write the binary data to the temporary file
        with open(file_path, "wb") as f:
            f.write(image_data)

        result = ocr_pipeline.predict(
            file_path
        )

        # Collect the OCR results
        ocr_results: List[Dict[str, Any]] = []
        for item in result:
            pruned_res = OCRUtils.prune_result(item.json["res"])
            ocr_results.append(
                {
                    "prunedResult": pruned_res,
                }
            )

        # Return the recognition results
        return {
            "code": 200,
            "message": "Recognition succeeded",
            "data": ocr_results
        }

    except binascii.Error:
        raise HTTPException(status_code=400, detail="Invalid Base64 string")
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Recognition failed: {e}")
    finally:
        # file_path is still None if the failure happened before it was assigned
        if file_path and os.path.exists(file_path):
            os.remove(file_path)

3.2 Client

import base64
import os.path

import requests

API_URL = "http://127.0.0.1:8004/ocr/recognize"

def ocr_recognize(img_path, retry_req_num=3, timeout=300):
    # Read the image and encode it as Base64
    with open(img_path, "rb") as file:
        file_data = base64.b64encode(file.read()).decode()
    payload = {
        "image_base64": file_data,
        "image_name": os.path.basename(img_path)
    }
    for try_req_n in range(retry_req_num):
        try:
            response = requests.post(API_URL, json=payload,
                                     timeout=timeout)
        except requests.exceptions.RequestException as e:
            print(f"Request {try_req_n + 1} failed, error: {e}")
            continue
        if response.status_code != 200:
            print(f"Request {try_req_n + 1} failed, status code: {response.status_code}")
            continue
        # Concatenate the recognized text
        ocr_data = response.json().get("data")
        if not ocr_data:
            return ""
        return "".join(ocr_data[0]["prunedResult"]["rec_texts"])
    return ""

if __name__ == '__main__':
    file_path = "./123.png"
    print(ocr_recognize(file_path))
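
The client above joins every entry of `rec_texts`. If the pruned result also carries a per-line confidence list, low-confidence lines can be dropped before joining. The sketch below assumes such a list exists under the key `rec_scores` and is aligned with `rec_texts`; that key name is an assumption to verify against an actual response, and `join_confident_texts` is a hypothetical helper.

```python
def join_confident_texts(pruned_result, min_score=0.5, sep=""):
    """Join recognized lines whose score is at least min_score.

    Falls back to joining every line when scores are absent or misaligned.
    """
    texts = pruned_result.get("rec_texts", [])
    scores = pruned_result.get("rec_scores")
    if scores is None or len(scores) != len(texts):
        return sep.join(texts)
    return sep.join(t for t, s in zip(texts, scores) if s >= min_score)


if __name__ == "__main__":
    sample = {"rec_texts": ["Hello", "w0rld", "!"], "rec_scores": [0.98, 0.31, 0.87]}
    print(join_confident_texts(sample))
```

The fallback keeps the behavior identical to the plain join whenever the scores are not available.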

4. Server + client (API with authentication)

4.1 docker-compose.yaml

  • Add the SECRET_KEY setting
services:
  fxyj_paddle:
    image: ccr-2vdh3abv-pub.cnc.bj.baidubce.com/device/paddle-npu:cann800-ubuntu20-npu-910b-base-aarch64-gcc84
    container_name: fxyj_paddle_910b
    privileged: true
    network_mode: host
    shm_size: 128G
    working_dir: /work
    volumes:
      - .:/work
      - /usr/local/Ascend/driver:/usr/local/Ascend/driver:ro
      - /usr/local/bin/npu-smi:/usr/local/bin/npu-smi:ro
      - /usr/local/dcmi:/usr/local/dcmi:ro
    environment:
      - ASCEND_RT_VISIBLE_DEVICES=5
      - FLAGS_npu_jit_compile=0
      - PADDLE_PDX_CACHE_HOME=/work/models
      - SECRET_KEY=your_secret_key
    entrypoint: ["/work/start.sh"]
    command: ["/bin/bash"]
    stdin_open: true
    tty: true

4.2 Server

import base64
import binascii
import hashlib
import hmac
import json
import os
import time
import uuid
from contextlib import asynccontextmanager
from typing import Dict, Any, List

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from paddlex import create_pipeline


class OCRUtils:
    @staticmethod
    def prune_result(result: dict) -> dict:
        KEYS_TO_REMOVE = ["input_path", "page_index"]

        def _process_obj(obj):
            if isinstance(obj, dict):
                return {
                    k: _process_obj(v) for k, v in obj.items() if k not in KEYS_TO_REMOVE
                }
            elif isinstance(obj, list):
                return [_process_obj(item) for item in obj]
            else:
                return obj

        return _process_obj(result)

    @staticmethod
    def init_ocr_pipeline():
        """Initialize the OCR pipeline (runs only once per process)."""
        global ocr_pipeline
        if ocr_pipeline is None:
            try:
                ocr_pipeline = create_pipeline(pipeline="OCR", device="npu")
                print(f"OCR initialized (PID: {os.getpid()})")
            except Exception as e:
                print(f"OCR initialization failed: {e}")
                raise RuntimeError(f"OCR initialization failed: {e}") from e

    @staticmethod
    def get_random_file_path(filename):
        # Keep only the extension of the client-supplied name; the stem is random
        ext = os.path.splitext(os.path.basename(filename))[1]
        return os.path.join(OCR_TMP, uuid.uuid4().hex + ext)

    @staticmethod
    def verify_ocr_sign(signed_request: dict, secret_key: str, timeout: int = 300) -> bool:
        """
        Verify the signature of an OCR request.

        Args:
            signed_request: signed request data {"timestamp":..., "ocr_data":..., "sign":...}
            secret_key: signing key (string)
            timeout: signature validity window in seconds (default: 5 minutes)

        Returns:
            True if the signature is valid, False otherwise
        """
        try:
            # 1. Extract the required fields
            timestamp = signed_request["timestamp"]
            ocr_data = signed_request["ocr_data"]
            received_sign = signed_request["sign"]

            # 2. Reject expired timestamps (limits the replay window)
            current_timestamp = int(time.time() * 1000)
            if abs(current_timestamp - timestamp) > timeout * 1000:
                print(f"Signature expired: current timestamp {current_timestamp}, request timestamp {timestamp}")
                return False

            # 3. Recompute the signature
            ocr_data_str = json.dumps(ocr_data, ensure_ascii=False, sort_keys=True)
            sign_str = f"{timestamp}{ocr_data_str}{secret_key}"
            calculated_sign = hashlib.sha256(sign_str.encode('utf-8')).hexdigest()

            # 4. Compare in constant time to resist timing attacks;
            #    the expected signature is deliberately kept out of the logs
            if not hmac.compare_digest(received_sign, calculated_sign):
                print("Signature mismatch")
                return False
            return True

        except KeyError as e:
            print(f"Request is missing a required field: {e}")
            return False
        except Exception as e:
            print(f"Signature verification failed with an error: {e}")
            return False


# Create the temporary folder for images (exist_ok avoids a race between workers)
OCR_TMP = os.path.join(os.getcwd(), "ocr_tmp")
os.makedirs(OCR_TMP, exist_ok=True)


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Load the model when the application starts
    OCRUtils.init_ocr_pipeline()
    yield


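# Initialize the FastAPI application
app = FastAPI(lifespan=lifespan)

# Global variable that holds the OCR pipeline after initialization
ocr_pipeline = None
# Secret key used for signature verification
SECRET_KEY = os.environ.get('SECRET_KEY', 'default-key')
# Log only whether the key was provided, never the key itself
print("SECRET_KEY set via environment:", 'SECRET_KEY' in os.environ)

# Nested request body models (matching the signed request structure)
class OCRData(BaseModel):
    image_base64: str  # Base64-encoded image
    image_name: str  # Original file name, used for its extension


class SignedOCRRequest(BaseModel):
    sign: str  # Signature
    timestamp: int  # Timestamp in milliseconds
    ocr_data: OCRData  # Nested OCR data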


# OCR recognition endpoint
@app.post("/ocr/recognize", response_model=Dict[str, Any])
async def ocr_recognize(request: SignedOCRRequest):
    file_path = None  # Temporary file path, cleaned up in finally

    # Convert the Pydantic model to a dict for signature verification
    request_dict = request.model_dump()
    if not OCRUtils.verify_ocr_sign(request_dict, SECRET_KEY):
        raise HTTPException(
            status_code=401,
            detail={"code": 401, "message": "Signature invalid or expired", "data": []}
        )

    try:
        # Base64 input: a data-URI prefix (everything up to the comma) is stripped if present
        ocr_data = request.ocr_data
        base64_str = ocr_data.image_base64

        file_path = OCRUtils.get_random_file_path(ocr_data.image_name)

        if "," in base64_str:
            base64_str = base64_str.split(",")[-1]

        # Decode the Base64 string into binary data
        image_data = base64.b64decode(base64_str)

        # Write the binary data to the temporary file
        with open(file_path, "wb") as f:
            f.write(image_data)

        result = ocr_pipeline.predict(
            file_path
        )

        # Collect the OCR results
        ocr_results: List[Dict[str, Any]] = []
        for item in result:
            pruned_res = OCRUtils.prune_result(item.json["res"])
            ocr_results.append(
                {
                    "prunedResult": pruned_res,
                }
            )

        # Return the recognition results
        return {
            "code": 200,
            "message": "Recognition succeeded",
            "data": ocr_results
        }

    except binascii.Error:
        raise HTTPException(status_code=400, detail="Invalid Base64 string")
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Recognition failed: {e}")
    finally:
        # file_path is still None if the failure happened before it was assigned
        if file_path and os.path.exists(file_path):
            os.remove(file_path)

4.3 Client

import base64
import hashlib
import json
import os.path
import time

import requests

API_URL = "http://127.0.0.1:8004/ocr/recognize"
SECRET_KEY = 'your_secret_key'

def generate_ocr_sign(ocr_request: dict, secret_key: str) -> dict:
    """
    Build a signed OCR request.

    Args:
        ocr_request: raw OCR request data {"image_base64": "xxx", "image_name": "xxx.png"}
        secret_key: signing key (string)

    Returns:
        Signed request dict {"timestamp": 12345677, "ocr_data": {...}, "sign": "..."}
    """
    # 1. Millisecond timestamp
    timestamp = int(time.time() * 1000)

    # 2. Build the string to sign in a fixed order (must match the server exactly)
    # Serialize ocr_request to JSON with sorted keys first
    ocr_data_str = json.dumps(ocr_request, ensure_ascii=False, sort_keys=True)
    # Concatenation rule: timestamp + ocr_data_str + secret_key
    sign_str = f"{timestamp}{ocr_data_str}{secret_key}"

    # 3. SHA-256 signature
    sign = hashlib.sha256(sign_str.encode('utf-8')).hexdigest()

    # 4. Assemble the final request structure
    result = {
        "timestamp": timestamp,
        "ocr_data": ocr_request,
        "sign": sign
    }

    return result

def ocr_recognize(img_path, secret_key, retry_req_num=3, timeout=300):
    # Read the image and encode it as Base64
    with open(img_path, "rb") as file:
        file_data = base64.b64encode(file.read()).decode()
    payload = {
        "image_base64": file_data,
        "image_name": os.path.basename(img_path)
    }
    for try_req_n in range(retry_req_num):
        # Sign inside the loop so that every retry carries a fresh timestamp
        sign_payload = generate_ocr_sign(ocr_request=payload,
                                         secret_key=secret_key)
        try:
            response = requests.post(API_URL, json=sign_payload,
                                     timeout=timeout)
        except requests.exceptions.RequestException as e:
            print(f"Request {try_req_n + 1} failed, error: {e}")
            continue
        if response.status_code != 200:
            # Keep the error body for debugging
            with open("w.txt", "w") as f:
                f.write(response.text)
            print(f"Request {try_req_n + 1} failed, status code: {response.status_code}")
            continue
        # Concatenate the recognized text
        ocr_data = response.json().get("data")
        if not ocr_data:
            return ""
        return "".join(ocr_data[0]["prunedResult"]["rec_texts"])
    return ""

if __name__ == '__main__':
    file_path = "./123.png"
    print(ocr_recognize(file_path, secret_key=SECRET_KEY))
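
The signing scheme shared by the server and client (SHA-256 over timestamp + sorted-key JSON + secret key, with a validity window) can be checked without a running service. The sketch below is a self-contained round trip using only the standard library; the function names and the `now_ms` parameter (which lets the clock be fixed for testing) are assumptions made for this example, and the comparison uses `hmac.compare_digest` for a constant-time check.

```python
import hashlib
import hmac
import json
import time


def make_sign(timestamp, ocr_data, secret_key):
    """SHA-256 over timestamp + sorted-key JSON of ocr_data + secret_key."""
    data_str = json.dumps(ocr_data, ensure_ascii=False, sort_keys=True)
    raw = f"{timestamp}{data_str}{secret_key}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def sign_request(ocr_data, secret_key, now_ms=None):
    """Wrap ocr_data in the signed envelope: timestamp, ocr_data, sign."""
    timestamp = int(time.time() * 1000) if now_ms is None else now_ms
    return {
        "timestamp": timestamp,
        "ocr_data": ocr_data,
        "sign": make_sign(timestamp, ocr_data, secret_key),
    }


def verify_request(signed, secret_key, timeout_s=300, now_ms=None):
    """Check the time window, then compare signatures in constant time."""
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    if abs(now_ms - signed["timestamp"]) > timeout_s * 1000:
        return False
    expected = make_sign(signed["timestamp"], signed["ocr_data"], secret_key)
    return hmac.compare_digest(signed["sign"], expected)


if __name__ == "__main__":
    data = {"image_base64": "aGVsbG8=", "image_name": "a.png"}
    req = sign_request(data, "your_secret_key")
    print("correct key:", verify_request(req, "your_secret_key"))
    print("wrong key:", verify_request(req, "wrong_key"))
```

Because both sides serialize `ocr_data` with `sort_keys=True`, the signature does not depend on the order in which the fields were added to the dictionary.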