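Object detection with Faster RCNN Deep Learning in C#

This sample walks through how to run a pretrained Faster R-CNN object detection ONNX model using the ONNX Runtime C# API.

The source code for this sample is available here.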

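Prerequisites

To run this sample, you'll need the following:

Install .NET Core 3.1 or higher for your OS (Mac, Windows or Linux).
Download the Faster R-CNN ONNX model to your local system.
Download this demo image to test the model. You can also use any image you like.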

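Getting started

Now that everything is set up, we can start adding code to run the model on the image. For simplicity, we'll do this in the program's Main method.

Read paths

First, read the path to the model, the path to the image to test, and the path to the output image: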

string modelFilePath = args[0];
string imageFilePath = args[1];
string outImageFilePath = args[2];
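The three lines above index into args without checking it, so a missing argument surfaces as an IndexOutOfRangeException. A minimal sketch of an up-front check follows; the ArgumentChecks class and its TryValidate method are hypothetical names introduced here for illustration and are not part of the sample.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of the original sample): checks the three
// command-line arguments before the program indexes into args.
public static class ArgumentChecks
{
    // Returns true when the arguments look usable; otherwise returns false
    // and sets error to a short description of the problem.
    public static bool TryValidate(string[] args, out string error)
    {
        error = "";
        if (args == null || args.Length < 3)
        {
            error = "Usage: dotnet run [path-to-model] [path-to-image] [path-to-output-image]";
            return false;
        }
        if (!File.Exists(args[0]))
        {
            error = $"Model file not found: {args[0]}";
            return false;
        }
        if (!File.Exists(args[1]))
        {
            error = $"Image file not found: {args[1]}";
            return false;
        }
        return true;
    }
}
```

Calling ArgumentChecks.TryValidate(args, out string error) at the top of Main and printing error before returning keeps the failure message readable.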

Read image

Next, we read in the image using ImageSharp, a cross-platform image library:

using Image<Rgb24> image = Image.Load<Rgb24>(imageFilePath, out IImageFormat format);

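Note that we specifically read the Rgb24 pixel type so that we can preprocess the image efficiently in a later step.

Resize image

Next, we resize the image to the size the model expects; it is recommended to resize the image so that both height and width fall within the range [800, 1333]. The code below scales the shorter side to 800 while preserving the aspect ratio.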

float ratio = 800f / Math.Min(image.Width, image.Height);
using Stream imageStream = new MemoryStream();
image.Mutate(x => x.Resize((int)(ratio * image.Width), (int)(ratio * image.Height)));
image.Save(imageStream, format);
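The resize above fixes only the shorter side at 800, so a very wide or very tall image can end up with a longer side above 1333. The following sketch shows one way to also cap the longer side; it assumes that staying under 1333 matters more than reaching 800 on the shorter side, which the walkthrough does not state. ResizeMath and TargetSize are hypothetical names.

```csharp
using System;

// Hypothetical helper (not part of the original sample): computes a target
// size that keeps the aspect ratio and respects both bounds where possible.
public static class ResizeMath
{
    public static (int Width, int Height) TargetSize(
        int width, int height, int minSide = 800, int maxSide = 1333)
    {
        // Scale so the shorter side reaches minSide.
        double ratio = (double)minSide / Math.Min(width, height);

        // If that pushes the longer side past maxSide, scale to maxSide instead.
        // In that case the shorter side ends up below minSide.
        if (ratio * Math.Max(width, height) > maxSide)
        {
            ratio = (double)maxSide / Math.Max(width, height);
        }

        // Round rather than truncate so floating-point error cannot drop a pixel.
        return ((int)Math.Round(ratio * width), (int)Math.Round(ratio * height));
    }
}
```

The result can be passed to the existing Resize call, for example x.Resize(size.Width, size.Height) with size taken from ResizeMath.TargetSize(image.Width, image.Height).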

Preprocess image

Next, we preprocess the image according to the model's requirements:

var paddedHeight = (int)(Math.Ceiling(image.Height / 32f) * 32f);
var paddedWidth = (int)(Math.Ceiling(image.Width / 32f) * 32f);
var mean = new[] { 102.9801f, 115.9465f, 122.7717f };

// Preprocessing image
// We use DenseTensor for multi-dimensional access
DenseTensor<float> input = new(new[] { 3, paddedHeight, paddedWidth });
image.ProcessPixelRows(accessor =>
{
    // Copy every source row and column, starting at 0.
    // The padding at the bottom and right edges keeps its default value of 0.
    for (int y = 0; y < accessor.Height; y++)
    {
        Span<Rgb24> pixelSpan = accessor.GetRowSpan(y);
        for (int x = 0; x < accessor.Width; x++)
        {
            // Channel order is B, G, R, each with its mean subtracted
            input[0, y, x] = pixelSpan[x].B - mean[0];
            input[1, y, x] = pixelSpan[x].G - mean[1];
            input[2, y, x] = pixelSpan[x].R - mean[2];
        }
    }
});

Here, we create a tensor of the required size (channels, paddedHeight, paddedWidth), access the pixel values, preprocess them, and assign them to the tensor at the appropriate indices.
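To make the padding and channel layout concrete, here is a sketch of the same preprocessing on plain arrays, with no dependency on ImageSharp or DenseTensor. It assumes the input is interleaved RGB bytes in row-major order and that the padding sits at the bottom and right edges; the Preprocess class and its members are hypothetical names.

```csharp
using System;

// Hypothetical helper (not part of the original sample): the same
// preprocessing expressed over plain arrays.
public static class Preprocess
{
    // Per-channel means in B, G, R order, as used in the walkthrough.
    private static readonly float[] Mean = { 102.9801f, 115.9465f, 122.7717f };

    // Rounds a dimension up to the next multiple of 32.
    public static int PadTo32(int size) => (size + 31) / 32 * 32;

    // Converts interleaved RGB bytes (3 bytes per pixel) into a zero-padded,
    // mean-subtracted float buffer laid out channel-first in B, G, R order.
    public static float[] ToChw(byte[] rgb, int width, int height,
        out int paddedWidth, out int paddedHeight)
    {
        paddedWidth = PadTo32(width);
        paddedHeight = PadTo32(height);
        int plane = paddedWidth * paddedHeight;
        var output = new float[3 * plane]; // padding cells stay 0

        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                int src = (y * width + x) * 3; // R, G, B of this pixel
                int dst = y * paddedWidth + x; // offset inside one channel plane
                output[dst] = rgb[src + 2] - Mean[0];         // B plane
                output[plane + dst] = rgb[src + 1] - Mean[1]; // G plane
                output[2 * plane + dst] = rgb[src] - Mean[2]; // R plane
            }
        }
        return output;
    }
}
```

The flat index channel * plane + y * paddedWidth + x corresponds to the [channel, y, x] indexer on a tensor of shape (3, paddedHeight, paddedWidth).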

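Setup inputs

Next, we create the inputs to the model. First, wrap the tensor's buffer in an OrtValue without copying it:

// Pin DenseTensor memory and use it directly in the OrtValue tensor
// It will be unpinned when the OrtValue is disposed
using var inputOrtValue = OrtValue.CreateTensorValueFromMemory(OrtMemoryInfo.DefaultInstance,
    input.Buffer, new long[] { 3, paddedHeight, paddedWidth });

Then map the model's input name to that value: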

var inputs = new Dictionary<string, OrtValue>
{
    { "image", inputOrtValue }
};

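To check the input node names of an ONNX model, you can use Netron to visualize the model and see the input/output names. In this case, the model has image as its input node name.

Run inference

Next, we create an inference session and run the input through it: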

using var session = new InferenceSession(modelFilePath);
using var runOptions = new RunOptions();
using IDisposableReadOnlyCollection<OrtValue> results = session.Run(runOptions, inputs, session.OutputNames);

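Postprocess output

Next, we postprocess the output to get the bounding boxes along with the label and confidence score associated with each box: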

var boxesSpan = results[0].GetTensorDataAsSpan<float>();
var labelsSpan = results[1].GetTensorDataAsSpan<long>();
var confidencesSpan = results[2].GetTensorDataAsSpan<float>();

const float minConfidence = 0.7f;
var predictions = new List<Prediction>();

// Each box takes 4 consecutive values; include the final box as well
for (int i = 0; i + 3 < boxesSpan.Length; i += 4)
{
    var index = i / 4;
    if (confidencesSpan[index] >= minConfidence)
    {
        predictions.Add(new Prediction
        {
            Box = new Box(boxesSpan[i], boxesSpan[i + 1], boxesSpan[i + 2], boxesSpan[i + 3]),
            Label = LabelMap.Labels[labelsSpan[index]],
            Confidence = confidencesSpan[index]
        });
    }
}

Note that we keep only boxes with a confidence of at least 0.7 to reduce false positives.
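The loop above uses Prediction, Box and LabelMap without showing their definitions. The sketch below gives one plausible shape for the first two, inferred only from how they are used here, plus the filtering step as a standalone method. All type and member definitions are assumptions rather than the sample's actual code, and the label names are supplied by the caller in place of LabelMap.Labels.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shapes inferred from usage in the walkthrough.
public class Box
{
    public float Xmin { get; }
    public float Ymin { get; }
    public float Xmax { get; }
    public float Ymax { get; }

    public Box(float xmin, float ymin, float xmax, float ymax)
    {
        Xmin = xmin;
        Ymin = ymin;
        Xmax = xmax;
        Ymax = ymax;
    }
}

public class Prediction
{
    public Box Box { get; set; }
    public string Label { get; set; }
    public float Confidence { get; set; }
}

public static class PostProcess
{
    // boxes holds 4 values per detection (xmin, ymin, xmax, ymax);
    // labels and scores hold 1 value per detection.
    public static List<Prediction> Filter(
        ReadOnlySpan<float> boxes, ReadOnlySpan<long> labels,
        ReadOnlySpan<float> scores, string[] labelNames, float minConfidence)
    {
        var result = new List<Prediction>();
        int count = Math.Min(boxes.Length / 4, Math.Min(labels.Length, scores.Length));
        for (int i = 0; i < count; i++)
        {
            if (scores[i] < minConfidence) continue;

            int offset = i * 4;
            long id = labels[i];
            result.Add(new Prediction
            {
                Box = new Box(boxes[offset], boxes[offset + 1],
                              boxes[offset + 2], boxes[offset + 3]),
                // Fall back to the numeric id when no name is available.
                Label = id >= 0 && id < labelNames.Length ? labelNames[id] : $"class {id}",
                Confidence = scores[i]
            });
        }
        return result;
    }
}
```

Iterating by detection index rather than by flat offset keeps the three outputs aligned and avoids skipping the last box.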

View prediction

Next, we draw the boxes with their labels and confidence scores on the image to see how the model did:

// File.Create truncates an existing file, so no stale bytes remain after a shorter write
using var outputImage = File.Create(outImageFilePath);
Font font = SystemFonts.CreateFont("Arial", 16);
foreach (var p in predictions)
{
    image.Mutate(x =>
    {
        x.DrawLines(Color.Red, 2f, new PointF[] {

            new PointF(p.Box.Xmin, p.Box.Ymin),
            new PointF(p.Box.Xmax, p.Box.Ymin),

            new PointF(p.Box.Xmax, p.Box.Ymin),
            new PointF(p.Box.Xmax, p.Box.Ymax),

            new PointF(p.Box.Xmax, p.Box.Ymax),
            new PointF(p.Box.Xmin, p.Box.Ymax),

            new PointF(p.Box.Xmin, p.Box.Ymax),
            new PointF(p.Box.Xmin, p.Box.Ymin)
        });
        x.DrawText($"{p.Label}, {p.Confidence:0.00}", font, Color.White, new PointF(p.Box.Xmin, p.Box.Ymin));
    });
}
image.Save(outputImage, format);

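For each box prediction, we use ImageSharp to draw red lines that form the bounding box, and we draw the label and confidence text.

Running the program

Now that the program is written, we can run it with the following command: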

dotnet run [path-to-model] [path-to-image] [path-to-output-image]

For example, running

dotnet run ~/Downloads/FasterRCNN-10.onnx ~/Downloads/demo.jpg ~/Downloads/out.jpg

detects the following objects in the image: