predict

Class: dlhdl.Workflow
Package: dlhdl

Run inference on deployed network and profile speed of neural network deployed on specified target device

Description

predict(imds) predicts responses for the image data in imds by using the deep learning network that you specified in the dlhdl.Workflow class for deployment on the specified target board and returns the results.

predict(imds,Name,Value) predicts responses for the image data in imds by using the deep learning network that you specified in the dlhdl.Workflow class for deployment on the specified target board and returns the results, with additional options specified by one or more name-value pair arguments.

Examples

Predict Outcome and Profile Results

Note

Before you run the predict function, make sure that your host computer is connected to the target device board. For more information, see Configure Board-Specific Setup Information.

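This example reads the image file zebra.jpeg. Make sure that the file is in the current folder or on the MATLAB path before you run the code: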

% Load the pretrained VGG-19 SeriesNetwork object
snet = vgg19;

% Create a Target object and define the interface to the target board
hTarget = dlhdl.Target('Intel');

% Create a workflow object for the SeriesNetwork that uses the FPGA bitstream
hW = dlhdl.Workflow('Network', snet, 'Bitstream', 'arria10soc_single','Target',hTarget);

% Load the input image and resize it to the network input size
img = imread('zebra.jpeg');
inputImg = imresize(img, [224, 224]);
imshow(inputImg);
imIn = single(inputImg);

% Deploy the workflow object
hW.deploy;

% Predict the outcome and optionally profile the results to measure performance
[prediction, speed] = hW.predict(imIn,'Profile','on');

% Look up the class name with the highest score
[~, idx] = max(prediction);
snet.Layers(end).ClassNames{idx}

### Finished writing input activations.
### Running single input activations.


              Deep Learning Processor Profiler Performance Results

                   LastLayerLatency(cycles)   LastLayerLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                  166206640                  1.10804                       1          166206873              0.9
    conv_module          156100737                  1.04067 
        conv1_1            2174602                  0.01450 
        conv1_2           15580687                  0.10387 
        pool1              1976185                  0.01317 
        conv2_1            7534356                  0.05023 
        conv2_2           14623885                  0.09749 
        pool2              1171628                  0.00781 
        conv3_1            7540868                  0.05027 
        conv3_2           14093791                  0.09396 
        conv3_3           14093717                  0.09396 
        conv3_4           14094381                  0.09396 
        pool3               766669                  0.00511 
        conv4_1            6999620                  0.04666 
        conv4_2           13725380                  0.09150 
        conv4_3           13724671                  0.09150 
        conv4_4           13725125                  0.09150 
        pool4               465360                  0.00310 
        conv5_1            3424060                  0.02283 
        conv5_2            3423759                  0.02283 
        conv5_3            3424758                  0.02283 
        conv5_4            3424461                  0.02283 
        pool5               113010                  0.00075 
    fc_module             10105903                  0.06737 
        fc6                8397997                  0.05599 
        fc7                1370215                  0.00913 
        fc8                 337689                  0.00225 
 * The clock frequency of the DL processor is: 150MHz



ans =

    'zebra'
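The profiler columns are related through the DL processor clock frequency. As a rough check, and assuming that latency in seconds is the cycle count divided by the clock frequency and that Frames/s is the frame count divided by the total latency in seconds, the Network row above can be reproduced as follows. The variable names are illustrative and are not part of the dlhdl API.

```matlab
% Values copied from the Network row of the profiler table above
clockHz       = 150e6;       % DL processor clock frequency (150 MHz)
latencyCycles = 166206640;   % LastLayerLatency(cycles)
totalCycles   = 166206873;   % Total Latency (cycles)
numFrames     = 1;           % FramesNum

% Assumed relationships (they agree with the reported figures)
latencySeconds  = latencyCycles / clockHz;
framesPerSecond = numFrames * clockHz / totalCycles;

fprintf('Latency: %.5f s, throughput: %.1f frames/s\n', ...
    latencySeconds, framesPerSecond);
```

The computed values agree with the 1.10804 seconds and 0.9 frames/s reported in the table.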

 

Obtain Prediction Results for Quantized LogoNet Network

Note

Before you run the predict function, make sure that your host computer is connected to the target device board. For more information, see Configure Board-Specific Setup Information.

Create a file in your current working directory called getLogoNetwork.m. Enter these lines into the file:

function net = getLogoNetwork()
    data = getLogoData();
    net  = data.convnet;
end

function data = getLogoData()
    if ~isfile('LogoNet.mat')
        url = 'https://www.mathworks.com/supportfiles/gpucoder/cnn_models/logo_detection/LogoNet.mat';
        websave('LogoNet.mat',url);
    end
    data = load('LogoNet.mat');
end

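This example reads the image file heineken.png. Make sure that the file is in the current folder or on the MATLAB path before you run the code.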

To quantize the network, you need the products listed under FPGA in Quantization Workflow Prerequisites.

% Load the pretrained LogoNet SeriesNetwork object
snet = getLogoNetwork();

% Create a Target object and define the interface to the target board
hTarget = dlhdl.Target('Xilinx','Interface','Ethernet');

% Create a quantized network object and calibrate it with a sample image
dlquantObj = dlquantizer(snet,'ExecutionEnvironment','FPGA');
calData = imageDatastore('heineken.png','Labels','Heineken');
dlquantObj.calibrate(calData);

% Create a workflow object for the quantized network that uses the FPGA bitstream
hW = dlhdl.Workflow('Network', dlquantObj, 'Bitstream', 'zcu102_int8','Target',hTarget);

% Load the input image and resize it to the network input size
img = imread('heineken.png');
inputImg = imresize(img, [227, 227]);
imshow(inputImg);
imIn = single(inputImg);

% Deploy the workflow object
hW.deploy;

% Predict the outcome and optionally profile the results to measure performance
[prediction, speed] = hW.predict(imIn,'Profile','on');

% Look up the class name with the highest score
[~, idx] = max(prediction);
snet.Layers(end).ClassNames{idx}

### Loading weights to FC Processor.
### FC Weights loaded. Current time is 12-Jun-2020 16:55:34
### FPGA bitstream programming has been skipped as the same bitstream is already loaded on the target FPGA.
### Deep learning network programming has been skipped as the same network is already loaded on the target FPGA.
### Finished writing input activations.
### Running single input activations.


              Deep Learning Processor Profiler Performance Results

                   LastLayerLatency(cycles)   LastLayerLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                   13604105                  0.04535                       1           13604146             22.1
    conv_module           12033763                  0.04011 
        conv_1             3339984                  0.01113 
        maxpool_1          1490805                  0.00497 
        conv_2             2866483                  0.00955 
        maxpool_2           574102                  0.00191 
        conv_3             2432474                  0.00811 
        maxpool_3           700552                  0.00234 
        conv_4              617505                  0.00206 
        maxpool_4            11951                  0.00004 
    fc_module              1570342                  0.00523 
        fc_1                937715                  0.00313 
        fc_2                599341                  0.00200 
        fc_3                 33284                  0.00011 
 * The clock frequency of the DL processor is: 300MHz
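
In both profiler tables, the conv_module and fc_module cycle counts add up to the Network figure, so the rows can be used to estimate where the time is spent. The following sketch computes each module's share of the LogoNet latency from the values above. The variable names are illustrative only.

```matlab
% Cycle counts copied from the LogoNet profiler table above
networkCycles = 13604105;
moduleNames   = {'conv_module', 'fc_module'};
moduleCycles  = [12033763, 1570342];

% Fraction of the network latency spent in each module
moduleShare = moduleCycles / networkCycles;

for k = 1:numel(moduleNames)
    fprintf('%-12s %5.1f%%\n', moduleNames{k}, 100 * moduleShare(k));
end
```

For this run, the convolution module accounts for most of the latency, which suggests looking at the convolution layers first when tuning performance.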

Input Arguments

imds — Input image

Input image, resized to match the image input layer size of the deep learning network.

Example: To read an input image, resize it to 227-by-227, and convert it to single, use:

img = imread('heineken.png');
inputImg = imresize(img, [227, 227]);
imIn = single(inputImg);

Example: imIn
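
Because the input must match the size of the image input layer, it can help to check the size and data type before calling predict. The sketch below uses a synthetic image in place of a file on disk and assumes a 227-by-227-by-3 input layer, as in the LogoNet example. The name expectedSize is illustrative and is not part of the dlhdl API.

```matlab
% Assumed input layer size (227-by-227 RGB, as in the LogoNet example)
expectedSize = [227 227 3];

% Synthetic 300-by-400 RGB image standing in for the output of imread
img = uint8(randi([0 255], 300, 400, 3));

% Resize to the spatial size of the input layer and convert to single
inputImg = imresize(img, expectedSize(1:2));
imIn = single(inputImg);

% Confirm the size and data type before passing imIn to predict
assert(isequal(size(imIn), expectedSize), 'Unexpected input size.');
assert(isa(imIn, 'single'), 'Input must be of type single.');
```

With a real network, the expected size can typically be read from the image input layer, for example snet.Layers(1).InputSize, instead of being hard-coded.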

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Profile — Flag to return profiling results

Flag to return profiling results for the deep learning network deployed to the target board.

Example: 'Profile','on'

Introduced in R2020b