Discrete FIR Filter
Finite impulse response filter
 Library:
DSP HDL Toolbox / Filtering
Description
The Discrete FIR Filter block models finite-impulse response filter architectures optimized for HDL code generation. The block accepts scalar or frame-based input, and provides an option for programmable coefficients. It provides a hardware-friendly interface with input and output control signals. To provide a cycle-accurate simulation of the generated HDL code, the block models architectural latency including pipeline registers and resource sharing.
The block provides three filter structures. The direct form systolic architecture provides a fully parallel implementation that makes efficient use of Intel® and Xilinx® DSP blocks. The direct form transposed architecture is a fully parallel implementation and is suitable for FPGA and ASIC applications. The partly serial systolic architecture provides a configurable serial implementation that makes efficient use of FPGA DSP blocks. For a filter implementation that matches multipliers, pipeline registers, and pre-adders to the DSP configuration of your FPGA vendor, specify your target device when you generate HDL code.
All three filter structures remove multipliers for zero-valued coefficients, such as in half-band filters and Hilbert transforms. When you use scalar input data, all filter structures share multipliers for symmetric and antisymmetric coefficients. Frame-based filters do not implement symmetry optimization.
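The effect of these two optimizations on multiplier count can be estimated with a short sketch. This is only an illustration for real coefficients and scalar input; the function name `count_multipliers` and its `share_symmetric` flag are hypothetical, and the count does not account for any device-specific mapping that the block performs.

```python
def count_multipliers(coeffs, share_symmetric=True):
    """Estimate multipliers for one real-valued, scalar-input filter."""
    n = len(coeffs)
    half = n // 2
    symmetric = all(coeffs[i] == coeffs[n - 1 - i] for i in range(half))
    # An odd-length antisymmetric filter must have a zero center tap.
    antisymmetric = all(
        coeffs[i] == -coeffs[n - 1 - i] for i in range(half)
    ) and (n % 2 == 0 or coeffs[half] == 0)
    if share_symmetric and n > 1 and (symmetric or antisymmetric):
        # One multiplier serves each mirrored pair, so keep the first half
        # (plus the center tap when the length is odd).
        candidates = coeffs[: (n + 1) // 2]
    else:
        candidates = coeffs
    # Zero-valued coefficients need no multiplier.
    return sum(1 for c in candidates if c != 0)


# Illustrative 7-tap half-band-style coefficients (made-up values).
halfband = [-0.03, 0.0, 0.28, 0.5, 0.28, 0.0, -0.03]
print(count_multipliers(halfband))
print(count_multipliers(halfband, share_symmetric=False))
```

With symmetry sharing the sketch reports 3 multipliers for this example, and 5 when only zero removal applies, as in the frame-based case.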
The latency between valid input data and the corresponding valid output data depends on the filter structure, serialization options, the number of coefficients, and whether the coefficient values provide optimization opportunities. For details of structure and latency, see the Algorithms section.
For a FIR filter with multichannel support, use the Discrete FIR Filter (Simulink) block instead.
Ports
Input
data
— Input data
scalar or column vector of real or complex values
Input data, specified as a scalar or column vector of real or complex values. The vector size must be a power of 2 in the range from 1 to 64. When the input data type is an integer type or a fixed-point type, the block uses fixed-point arithmetic for internal calculations.
The double and single data types are supported for simulation, but not for HDL code generation.
Data Types: fixed point | single | double | int8 | int16 | int32 | uint8 | uint16 | uint32
Complex Number Support: Yes
valid
— Indicates valid input data
scalar
Control signal that indicates if the input data is valid. When valid is 1 (true), the block captures the values from the input data port. When valid is 0 (false), the block ignores the values from the input data port.
Data Types: Boolean
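The role of the valid signal can be sketched with a small behavioral model. This is a simplification under stated assumptions: the function `fir_with_valid` is hypothetical, it has zero latency (the block itself has architecture-dependent latency), and it emits a placeholder 0 on cycles with no valid output.

```python
def fir_with_valid(samples, valids, coeffs):
    """Filter only the samples flagged valid; return (data_out, valid_out)."""
    delay_line = [0] * len(coeffs)
    data_out, valid_out = [], []
    for x, v in zip(samples, valids):
        if v:
            # Capture the sample: shift it into the tap delay line.
            delay_line = [x] + delay_line[:-1]
            data_out.append(sum(c * d for c, d in zip(coeffs, delay_line)))
            valid_out.append(True)
        else:
            # Ignore the sample: the filter state does not change.
            data_out.append(0)
            valid_out.append(False)
    return data_out, valid_out


out, vout = fir_with_valid([1, 99, 2, 3], [1, 0, 1, 1], [0.5, 0.5])
print(out, vout)
```

The value 99 arrives while valid is 0, so it never enters the delay line and does not influence later outputs.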
coeff
— Filter coefficients
real or complex row vector
Filter coefficients, specified as a row vector of real or complex values. You can change the input coefficients at any time. When you use scalar input data, the size of the coefficient vector depends on the size and symmetry of the sample coefficients specified in the Coefficients prototype parameter. The prototype specifies a sample coefficient vector that is representative of the symmetry and zero-valued locations of the expected input coefficients. The block uses the prototype to optimize the filter by sharing multipliers for symmetric or antisymmetric coefficients, and removing multipliers for zero-valued coefficients. Therefore, provide only the non-duplicate coefficients at the port. For example, if you set the Coefficients prototype parameter to a symmetric 14-tap filter, the block expects a vector of 7 values on the coeff input port. You must still provide zeros in the input coeff vector for the non-duplicate zero-valued coefficients.
When you use frame-based input data, the block does not optimize the filter for coefficient symmetry. The block still uses the Coefficients prototype to remove multipliers for zero-valued coefficients. At the coeff input port, specify a vector that is the same size as the prototype.
The double and single data types are supported for simulation, but not for HDL code generation.
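The sizing rule for the coeff port with scalar input can be sketched as follows. The helper `coeff_port_vector` is hypothetical, and it assumes that the non-duplicate coefficients are the first half of the full vector (including the center tap for odd lengths); the source states only how many values the port expects.

```python
def coeff_port_vector(coeffs, prototype):
    """Return the values to drive on the coeff port for scalar input."""
    if not prototype:
        # An empty prototype means no symmetry is assumed.
        return list(coeffs)
    n = len(prototype)
    if len(coeffs) != n:
        raise ValueError("coefficients must match the prototype length")
    half = n // 2
    symmetric = all(prototype[i] == prototype[n - 1 - i] for i in range(half))
    antisymmetric = all(prototype[i] == -prototype[n - 1 - i] for i in range(half))
    if n > 1 and (symmetric or antisymmetric):
        # Keep zeros in place: only the mirrored duplicates are dropped.
        return list(coeffs[: (n + 1) // 2])
    return list(coeffs)


# Symmetric 14-tap example with made-up integer values.
proto = [1, 2, 3, 4, 5, 6, 7, 7, 6, 5, 4, 3, 2, 1]
print(len(coeff_port_vector(proto, proto)))
```

For the symmetric 14-tap prototype the sketch returns 7 values, which matches the example in the text.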
Dependencies
To enable this port, set Coefficients source to Input port (Parallel interface).
Data Types: single | double | int8 | int16 | int32 | uint8 | uint16 | uint32 | fixed point
reset
— Clears internal states
scalar
Control signal that clears internal states. When reset is 1 (true), the block stops the current calculation and clears internal states. When reset is 0 (false) and the input valid is 1 (true), the block captures data for processing.
For more reset considerations, see the Reset Signal section on the Hardware Control Signals page.
Dependencies
To enable this port, on the Control Ports tab, select Enable reset input port.
Data Types: Boolean
Output
data
— Filtered output data
scalar or column vector of real or complex values
Filtered output data, returned as a scalar or column vector of real or complex values. The dimensions of the output match the dimensions of the input. When the input data type is a floating-point type, the output data inherits the data type of the input data. When the input data type is an integer type or a fixed-point type, the Output parameter on the Data Types tab controls the output data type.
Data Types: fixed point | single | double
Complex Number Support: Yes
valid
— Indicates valid output data
scalar
Control signal that indicates if the data from the output data port is valid. When valid is 1 (true), the block returns valid data from the output data port. When valid is 0 (false), the values from the output data port are not valid.
Data Types: Boolean
ready
— Indicates block is ready for new input data
scalar
Control signal that indicates that the block is ready for a new input data sample on the next cycle. When ready is 1 (true), you can specify the data and valid inputs for the next time step. When ready is 0 (false), the block ignores any input data in the next time step.
When using the partly serial architecture, the block processes one sample at a time. If your design waits for this block to return ready set to 0 before setting the input valid to 0 (false), then one additional cycle of input data arrives at the port. The block stores this additional data while processing the current data, and then does not set ready to 1 (true) until your model processes the additional input data.
Dependencies
To enable this port, set Filter structure to Partly serial systolic.
Data Types: Boolean
Parameters
Main
Coefficients source
— Source of filter coefficients
Property (default) | Input port (Parallel interface)
You can enter constant filter coefficients as a parameter or provide time-varying filter coefficients using an input port.
Selecting Input port (Parallel interface) enables the coeff port on the block and the Coefficients prototype parameter. Specify a prototype to enable the block to optimize the filter implementation according to the values of your coefficients. To use Input port (Parallel interface), set the Filter structure parameter to Direct form systolic or Direct form transposed.
When you use programmable coefficients with frame-based input, the block does not optimize the filter for coefficient symmetry. Also, the output after a change of coefficient values might not exactly match the output in the scalar case. This difference occurs because the subfilter calculations are done at different times relative to the input coefficient values, compared with the scalar implementation.
Coefficients
— Discrete FIR filter coefficients
[0.5, 0.5] (default) | real or complex vector
Discrete FIR filter coefficients, specified as a vector of real or complex values. You can also specify the vector as a workspace variable or as a call to a filter design function. When the input data type is a floating-point type, the block casts the coefficients to the same data type as the input. When the input data type is an integer type or a fixed-point type, you can set the data type of the coefficients on the Data Types tab.
Example: firpm(30,[0 0.1 0.2 0.5]*2,[1 1 0 0])
Dependencies
To enable this parameter, set Coefficients source to Property.
Data Types: single | double | int8 | int16 | int32 | uint8 | uint16 | uint32
Coefficients prototype
— Prototype filter coefficients
[] (default) | real or complex vector
Prototype filter coefficients, specified as a vector of real or complex values. The prototype specifies a sample coefficient vector that is representative of the symmetry and zero-valued locations of the expected input coefficients. If all of your input coefficient vectors have the same symmetry and zero-valued coefficient locations, set Coefficients prototype to one of those vectors. If your coefficients are unknown or not expected to share symmetry or zero-valued locations, set Coefficients prototype to []. The block uses the prototype to optimize the filter by sharing multipliers for symmetric or antisymmetric coefficients, and removing multipliers for zero-valued coefficients.
When you use frame-based input data, the block does not optimize the filter for coefficient symmetry. The block still uses the Coefficients prototype to remove multipliers for zero-valued coefficients. At the coeff input port, specify a vector that is the same size as the prototype.
When you use scalar input data, coefficient optimizations affect the expected size of the vector on the coeff port. Provide only the non-duplicate coefficients at the port. For example, if you set the Coefficients prototype parameter to a symmetric 14-tap filter, the block shares one multiplier between each pair of duplicate coefficients, so the block expects a vector of 7 values on the coeff port. You must still provide zeros in the input coeff vector for the non-duplicate zero-valued coefficients.
Dependencies
To enable this parameter, set Coefficients source to Input port (Parallel interface).
Data Types: single | double | int8 | int16 | int32 | uint8 | uint16 | uint32
Filter structure
— HDL filter architecture
Direct form systolic (default) | Direct form transposed | Partly serial systolic
Specify the HDL filter architecture as one of these structures:
Direct form systolic: This architecture provides a fully parallel filter implementation that makes efficient use of Intel and Xilinx DSP blocks. For architecture details, see Fully Parallel Systolic Architecture.
Direct form transposed: This architecture is a fully parallel implementation that is suitable for FPGA and ASIC applications. For architecture details, see Fully Parallel Transposed Architecture.
Partly serial systolic: This architecture provides a serial filter implementation and options for tradeoffs between throughput and resource utilization. It makes efficient use of Intel and Xilinx DSP blocks. The block implements a serial L-coefficient filter with M multipliers and requires input samples that are at least N cycles apart, such that L = N×M. You can specify either M or N. For this implementation, the block provides an output port, ready, that indicates when the block is ready for new input data. For architecture details, see Partly Serial Systolic Architecture (1 < N < L) and Fully Serial Systolic Architecture (N ≥ L). You cannot use frame-based input with the partly serial architecture.
All implementations remove multipliers for zero-valued coefficients. When you use scalar input data, all implementations share multipliers for symmetric and antisymmetric coefficients. Frame-based filters do not implement symmetry optimization.
Specify serialization factor as
— Rule to define serial implementation
Minimum number of cycles between valid input samples (default) | Maximum number of multipliers
You can specify the rule that the block uses to serialize the filter as either:
Minimum number of cycles between valid input samples: Specify a requirement for input data timing using the Number of cycles parameter.
Maximum number of multipliers: Specify a requirement for resource usage using the Number of multipliers parameter.
For a filter with L coefficients, the block implements a serial filter with not more than M multipliers and requires input samples that are at least N cycles apart, such that L = N×M. The block might remove multipliers when it applies coefficient optimizations, so the actual M or N value of the filter implementation can be lower than the value that you specified.
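The relationship L = N×M can be worked through in a short sketch. The function `serial_plan` is hypothetical. It assumes that the derived value is rounded up when L does not divide evenly, and it ignores the coefficient optimizations that can lower the actual M or N.

```python
import math


def serial_plan(num_coeffs, cycles=None, multipliers=None):
    """Return (N, M) for L coefficients, given exactly one of N or M."""
    if (cycles is None) == (multipliers is None):
        raise ValueError("specify exactly one of cycles or multipliers")
    if cycles is not None:
        # N >= L (including Inf) corresponds to a fully serial filter.
        n = min(cycles, num_coeffs)
        m = math.ceil(num_coeffs / n)
    else:
        m = min(multipliers, num_coeffs)
        n = math.ceil(num_coeffs / m)
    return n, m


print(serial_plan(32, cycles=8))
print(serial_plan(32, multipliers=2))
print(serial_plan(32, cycles=math.inf))
```

For a 32-coefficient filter, 8 cycles between samples gives 4 multipliers, a limit of 2 multipliers gives 16 cycles, and an infinite cycle count gives the fully serial case with 1 multiplier.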
Dependencies
To enable this parameter, set the Filter structure parameter to Partly serial systolic.
Number of cycles
— Serialization requirement for input timing
2 (default) | positive integer
Serialization requirement for input timing, specified as a positive integer. This parameter represents N, the minimum number of cycles between valid input samples. In this case, the block calculates M = L/N. To implement a fully serial architecture, set Number of cycles to a value greater than the filter length, L, or to Inf.
The block might remove multipliers when it applies coefficient optimizations, so the actual M and N values of the filter can be lower than the values you specified.
Dependencies
To enable this parameter, set Filter structure to Partly serial systolic and set Specify serialization factor as to Minimum number of cycles between valid input samples.
Number of multipliers
— Serialization requirement for resource usage
2 (default) | positive integer
Serialization requirement for resource usage, specified as a positive integer. This parameter represents M, the maximum number of multipliers in the filter implementation. In this case, the block calculates N = L/M. If the input data is complex, the block allocates floor(M/2) multipliers for the real part of the filter and floor(M/2) multipliers for the imaginary part of the filter. To implement a fully serial architecture, set Number of multipliers to 1 for real input with real coefficients, 2 for complex input with real coefficients or real input with complex coefficients, or 3 for complex input with complex coefficients.
The block might remove multipliers when it applies coefficient optimizations, so the actual M and N values of the filter can be lower than the value you specified.
Dependencies
To enable this parameter, set Filter structure to Partly serial systolic, and set Specify serialization factor as to Maximum number of multipliers.
Data Types
Rounding mode
— Rounding mode for typecasting the output
Floor (default) | Ceiling | Convergent | Nearest | Round | Zero
Rounding mode for typecasting the output to the data type specified by the Output parameter. When the input data type is floating point, the block ignores this parameter. For more details, see Rounding Modes.
Saturate on integer overflow
— Overflow handling for typecasting the output
off (default) | on
Overflow handling for typecasting the output to the data type specified by the Output parameter. When the input data type is floating point, the block ignores this parameter. For more details, see Overflow Handling.
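The rounding and overflow options can be illustrated with a sketch that rounds a value to an integer and then fits it into a signed word. The helper names are hypothetical. The tie-breaking rules follow the usual Simulink definitions (Nearest rounds ties toward positive infinity, Round rounds ties away from zero, Convergent rounds ties to even), and the sketch assumes that clearing Saturate on integer overflow produces two's complement wrapping.

```python
import math
from fractions import Fraction

MODES = ("Floor", "Ceiling", "Zero", "Nearest", "Round", "Convergent")


def round_to_int(x, mode):
    """Round x to an integer using one of the listed rounding modes."""
    if mode not in MODES:
        raise ValueError(f"unknown rounding mode: {mode}")
    x = Fraction(x)
    low = math.floor(x)
    frac = x - low
    if mode == "Floor":
        return low
    if mode == "Ceiling":
        return math.ceil(x)
    if mode == "Zero":
        return math.trunc(x)
    if frac != Fraction(1, 2):
        # Not a tie: the three remaining modes agree on the nearest integer.
        return low + 1 if frac > Fraction(1, 2) else low
    if mode == "Nearest":
        return low + 1                            # tie toward +infinity
    if mode == "Round":
        return low + 1 if x > 0 else low          # tie away from zero
    return low if low % 2 == 0 else low + 1       # Convergent: tie to even


def to_signed(value, word_length, saturate):
    """Fit an integer into a signed word by saturating or wrapping."""
    low = -(1 << (word_length - 1))
    high = (1 << (word_length - 1)) - 1
    if saturate:
        return max(low, min(high, value))
    return (value - low) % (1 << word_length) + low


for mode in MODES:
    print(mode, round_to_int(2.5, mode), round_to_int(-2.5, mode))
print(to_signed(130, 8, saturate=True), to_signed(130, 8, saturate=False))
```

The last line shows the difference between the two overflow settings for an 8-bit signed word: 130 saturates to 127 but wraps to -126.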
Coefficients
— Data type of discrete FIR filter coefficients
Inherit: Same word length as input (default) | <data type expression>
The block casts the filter coefficients to this data type. The quantization rounds to the nearest representable value and saturates on overflow. When the input data type is floating point, the block ignores this parameter.
The recommended data type for this parameter is Inherit: Same word length as input.
The block returns a warning or error if:
The coefficients data type does not have enough fractional length to represent the coefficients accurately.
The coefficients data type is unsigned while the coefficients include negative values.
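The coefficient cast can be sketched as round-to-nearest followed by saturation. The function `quantize_coeffs` and its parameters are hypothetical, the sketch assumes a non-negative fraction length, and it breaks ties toward positive infinity, which the source does not specify. The examples show the two conditions listed above: too little fraction length loses accuracy, and an unsigned type cannot hold negative coefficients.

```python
import math


def quantize_coeffs(coeffs, word_length, fraction_length, signed=True):
    """Cast real coefficients to fixed point: nearest, then saturate."""
    scale = 1 << fraction_length
    if signed:
        low = -(1 << (word_length - 1))
        high = (1 << (word_length - 1)) - 1
    else:
        low, high = 0, (1 << word_length) - 1
    stored = []
    for c in coeffs:
        q = math.floor(c * scale + 0.5)        # round to nearest integer
        stored.append(max(low, min(high, q)))  # saturate to the word range
    # Return the real-world values that the stored integers represent.
    return [s / scale for s in stored]


print(quantize_coeffs([0.5, -0.25, 0.3], 8, 7))
print(quantize_coeffs([-0.25], 8, 7, signed=False))
```

In the first call 0.3 becomes 0.296875 because only 7 fraction bits are available. In the second call the negative coefficient saturates to 0 because the type is unsigned.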
Dependencies
To enable this parameter, set Coefficients source to Property.
Output
— Data type of filter output
Inherit: Inherit via internal rule (default) | Inherit: Same word length as input | <data type expression>
The block casts the output of the filter to this data type. The quantization uses the settings of the Rounding mode and Saturate on integer overflow parameters. When the input data type is floating point, the block ignores this parameter.
The block increases the word length for full precision inside each filter tap and casts the final output to the specified type. The maximum final internal word length (WF) depends on the input word length (WI), the coefficient word length (WC), and the number of coefficients (L), and is given by WF = WI + WC + ceil(log2(L)).
When you specify a fixed set of coefficients, the actual full-precision internal word length is usually smaller than WF, because the coefficient values limit the potential growth.
When you use programmable coefficients, the block cannot calculate the dynamic range, and the internal data type is always WF.
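As a worked example of the formula WF = WI + WC + ceil(log2(L)), the following sketch computes the worst-case internal word length. The function name is hypothetical, and the sketch uses an integer bit-length computation in place of a floating-point logarithm.

```python
def max_internal_word_length(wi, wc, num_coeffs):
    """Return WF = WI + WC + ceil(log2(L)) for L >= 1 coefficients."""
    if num_coeffs < 1:
        raise ValueError("the filter needs at least one coefficient")
    # (L - 1).bit_length() equals ceil(log2(L)) for every integer L >= 1.
    growth_bits = (num_coeffs - 1).bit_length()
    return wi + wc + growth_bits


print(max_internal_word_length(16, 16, 26))
```

For 16-bit input, 16-bit coefficients, and 26 coefficients, the growth term is 5 bits, so the result is 37 bits.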
Control Ports
Enable reset input port
— Option to enable reset input port
off (default) | on
Select this parameter to enable the reset input port. The reset signal implements a local synchronous reset of the data path registers.
For more reset considerations, see the Reset Signal section on the Hardware Control Signals page.
Use HDL global reset
— Option to connect data path registers to generated HDL global reset signal
off (default) | on
Select this parameter to connect the generated HDL global reset signal to the data path registers. This parameter does not change the appearance of the block or modify simulation behavior in Simulink®. When you clear this parameter, the generated HDL global reset clears only the control path registers. The generated HDL global reset can be synchronous or asynchronous depending on the HDL Code Generation > Global Settings > Reset type parameter in the model Configuration Parameters.
For more reset considerations, see the Reset Signal section on the Hardware Control Signals page.
Algorithms
The filter architectures for the Discrete FIR Filter block are shared with other blocks and described in detail on the FIR Filter Architectures for FPGAs and ASICs page. The sections here show the hardware resources and synthesized clock speed for the Discrete FIR Filter block configured with each filter architecture.
Performance — Fully Parallel Systolic
This table shows post-synthesis resource utilization for the HDL code generated for a symmetric 26-tap FIR filter with 16-bit scalar input and 16-bit coefficients. The synthesis targets a Xilinx ZC706 (XC7Z045ffg900-2) FPGA. The Global HDL reset type parameter is Synchronous and Minimize clock enables is selected. The reset port is not enabled, so only control path registers are connected to the generated global HDL reset.
Resource | Uses
LUT | 36
Slice Reg | 487
Slice | 45
Xilinx LogiCORE DSP48 | 13
After place and route, the maximum clock frequency of the design is 630 MHz.
Performance — Fully Parallel Transposed
This table shows post-synthesis resource utilization for the HDL code generated for a symmetric 26-tap FIR filter with 16-bit scalar input and 16-bit coefficients. The synthesis targets a Xilinx ZC706 (XC7Z045ffg900-2) FPGA. The Global HDL reset type parameter is Synchronous and Minimize clock enables is selected. The reset port is not enabled, so only control path registers are connected to the generated global HDL reset.
Resource | Uses
LUT | 32
Slice Reg | 108
Xilinx LogiCORE DSP48 | 26
After place and route, the maximum clock frequency of the design is 541 MHz.
Performance — Partly Serial Systolic (1 < N < L)
This table shows post-synthesis resource utilization for the HDL code generated from the Partly Serial Systolic FIR Filter Implementation example. The implementation is for a 32-tap FIR filter with 16-bit scalar input, 16-bit coefficients, and a serialization factor of 8 cycles between valid input samples. The synthesis targets a Xilinx Virtex-6 (XC6VLX240T-1FF1156) FPGA. The Global HDL reset type parameter is Synchronous and Minimize clock enables is selected.
Resource | Uses
LUT | 181
FFS | 428
Xilinx LogiCORE DSP48 | 2
After place and route, the maximum clock frequency of the design is 561 MHz.
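Because a serial filter accepts one input sample every N cycles at most, the clock frequency bounds the sustainable sample rate. The sketch below performs that arithmetic. The function name is hypothetical, and the division is an inference from the input-spacing requirement, not a figure reported in the source.

```python
def max_sample_rate_hz(clock_hz, cycles_between_samples=1):
    """Upper bound on input sample rate for a given clock and spacing N."""
    if cycles_between_samples < 1:
        raise ValueError("cycles_between_samples must be at least 1")
    return clock_hz / cycles_between_samples


# 561 MHz clock with 8 cycles between valid input samples.
print(max_sample_rate_hz(561e6, 8))
```

Under these assumptions, the partly serial configuration above could sustain about 70.1 million samples per second, compared with one sample per clock cycle for a fully parallel filter.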
Performance — Fully Serial Systolic (N ≥ L)
This table shows post-synthesis resource utilization for the HDL code generated from the 32-tap filter in the Partly Serial Systolic FIR Filter Implementation example, with the Number of cycles parameter set to Inf. This configuration implements a fully serial filter. The synthesis targets a Xilinx Virtex-6 (XC6VLX240T-1FF1156) FPGA. The Global HDL reset type parameter is Synchronous and Minimize clock enables is selected.
Resource | Uses
LUT | 122
Slice Reg | 225
Xilinx LogiCORE DSP48 | 1
After place and route, the maximum clock frequency of the design is 590 MHz.
Extended Capabilities
C/C++ Code Generation
Generate C and C++ code using Simulink® Coder™.
This block supports C/C++ code generation for Simulink accelerator and rapid accelerator modes and for DPI component generation.
HDL Code Generation
Generate Verilog and VHDL code for FPGA and ASIC designs using HDL Coder™.
HDL Coder™ provides additional configuration options that affect HDL implementation and synthesized logic.
For a FIR filter with multichannel support, use the Discrete FIR Filter (Simulink) block instead.
The block provides three filter structures. The direct form systolic architecture provides a fully parallel implementation that makes efficient use of Intel and Xilinx DSP blocks. The direct form transposed architecture is a fully parallel implementation and is suitable for FPGA and ASIC applications. The partly serial systolic architecture provides a configurable serial implementation that also makes efficient use of FPGA DSP blocks. For a filter implementation that matches multipliers, pipeline registers, and pre-adders to the DSP configuration of your FPGA vendor, specify your target device when you generate HDL code.
All three filter structures remove multipliers for zero-valued coefficients, such as in half-band filters and Hilbert transforms. When you use scalar input data, all filter structures share multipliers for symmetric and antisymmetric coefficients. Frame-based filters do not implement symmetry optimization.
You can set block parameters to make tradeoffs between throughput and resource utilization.
For highest throughput, choose a fully parallel systolic or transposed architecture. The generated code can accept input data and provides filtered output data on every cycle.
For reduced area, choose the partly serial systolic architecture. Then specify a rule that the block uses to serialize the filter based on either input timing or resource usage. To specify a serial filter using an input timing rule, set Specify serialization factor as to Minimum number of cycles between valid input samples, and choose Number of cycles to be greater than or equal to 2. In this case, the filter accepts only input samples that are at least Number of cycles cycles apart. To specify a serial filter using a resource rule, set Specify serialization factor as to Maximum number of multipliers, and set Number of multipliers to be less than the number of filter coefficients. In this case, the filter accepts input samples that are at least NumCoeffs/NumMults cycles apart.
ConstrainedOutputPipeline | Number of registers to place at the outputs by moving existing delays within your design. Distributed pipelining does not redistribute these registers.
InputPipeline | Number of input pipeline stages to insert in the generated code. Distributed pipelining and constrained output pipelining can move these registers.
OutputPipeline | Number of output pipeline stages to insert in the generated code. Distributed pipelining and constrained output pipelining can move these registers.
The Discrete FIR Filter block does not support resource sharing optimization through HDL Coder settings. Instead, set the Filter structure parameter to Partly serial systolic, and configure a serialization factor based on either input timing or resource usage.
Version History
Introduced in R2017a
R2022a: Moved to DSP HDL Toolbox from DSP System Toolbox
Before R2022a, this block was named Discrete FIR Filter HDL Optimized and was included in the DSP System Toolbox™ > DSP System Toolbox HDL Support library.
R2022a: High-throughput interface
This block supports high-throughput data. You can apply input data as an N-by-1 vector, where N can be up to 64 values. You cannot use frame-based input with the partly serial architecture.
R2022a: Input coefficients must be a row vector
When you use programmable coefficients with this block, you must supply the coefficients as a row vector (1-by-N matrix). Before R2022a, the block accepted a one-dimensional array (for example, ones(5)), a column vector (M-by-1 matrix), or a row vector of coefficients.
R2022a: RAM-based partly serial architecture
This block uses a RAM-based partly serial architecture, which uses fewer resources than the former register-based architecture. Uninitialized RAM locations can result in X values at the start of your HDL simulation. You can avoid X values by having your test initialize the RAM or by enabling the Initialize all RAM blocks option in the model configuration parameters. This parameter sets the RAM locations to 0 for simulation and is ignored by synthesis tools.
R2019b: Complex coefficients
The block supports complex-valued coefficients. If both coefficients and input data are complex, the block implements each filter tap with three multipliers. If either data or coefficients are complex but not both, the block uses two multipliers for each filter tap. You can use complex coefficients with all architectures and with programmable coefficients.
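A complex product can be formed with three real multiplications instead of four by trading a multiplier for extra additions. The sketch below shows one well-known identity for doing so. The source does not state which three-multiplier scheme the block uses, so treat this as an illustration of the idea only; the function name is hypothetical.

```python
def complex_mult_3(a, b, c, d):
    """Compute (a + jb) * (c + jd) using three real multiplications."""
    k1 = c * (a + b)
    k2 = a * (d - c)
    k3 = b * (c + d)
    real = k1 - k3   # equals a*c - b*d
    imag = k1 + k2   # equals a*d + b*c
    return real, imag


print(complex_mult_3(3, 2, 1, 4))
print((3 + 2j) * (1 + 4j))
```

Both lines represent the same product, -5 + 14j. When only one operand is complex, the product needs just two real multiplications, one for each part of the complex operand.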
R2019a: Programmable coefficients
The block provides the option to specify coefficients using an input port when you select the Direct form systolic architecture. You cannot use programmable coefficients with transposed or partly serial systolic architectures.
R2019a: Optimize symmetric coefficients
The block provides optimization of symmetric and antisymmetric coefficients. This optimization reduces the number of multipliers and makes efficient use of FPGA DSP resources.
In R2018b, the block performed these optimizations only for fully parallel architectures.
R2019a: Optional reset port
The block provides an optional reset port for any architecture, including a serial systolic architecture with resource sharing. The reset port provides a local synchronous reset of the data path registers.
In R2018b, the block supported the reset port only for fully parallel architectures.
R2019a: Changes to serial filter parameters
Before R2019a, you specified the serial implementation by setting a requirement for input timing. Starting in R2019a, you can specify the serialization requirement based on either input timing or resource usage.
For a filter with L coefficients, the block implements a serial filter with not more than M multipliers and requires input samples that are at least N cycles apart, such that L = N×M.
Serial Filter Requirement | Configuration Before R2019a | Configuration in R2019a
Specify a serialization rule based on input timing, that is, N cycles. | |
Specify a serialization rule based on resource usage, that is, M multipliers. | Serialization by resource usage is not supported before R2019a. However, you can calculate N based on your multiplier requirement. |
R2018b: Transposed architecture
The block provides an option to select a direct form transposed architecture.
R2018b: Changes to parallel filter architecture
The validIn port is mandatory. The Enable valid input port parameter is no longer available.
The ready port is enabled when you select Share DSP resources and disabled when you clear Share DSP resources. The Enable ready output port parameter is no longer available.
When you select Direct form systolic without Share DSP resources enabled, the block implements an improved fully parallel architecture compared to previous releases. This architecture might have different latency than previous versions. Use the validOut signal to align with parallel delay paths. When using this architecture, the default global HDL reset now clears only the control path registers. Previous releases connected the global HDL reset to the data path registers and the control path registers. This change improves hardware performance and lowers the resources used. To implement the same fully parallel architecture as previous releases, select Share DSP resources and set Sharing factor to 1.
When you select Direct form systolic, select Share DSP resources, and use any Sharing factor, the implemented filter has the same latency and uses the same hardware resources as in previous releases. The reset behavior for this architecture is also the same as previous releases.