®
Altera Corporation 1
Comparing Performance of Dual-Port Memory Functions
July 1998, ver. 1 Application Note 99
A-AN-099-01
Introduction
The number of designs that use large memory functions has steadily increased over the last few years (see Figure 1). This increase has caused difficulties when designing for devices with segmented architectures, because large memory functions can decrease the performance and amount of logic available in these devices. Altera® FLEX® 10K devices, on the other hand, meet the high performance requirements of today’s applications because of their unique architecture. FLEX 10K devices contain separate embedded architectural elements for memory functions, called embedded array blocks (EABs), that enable these devices to implement large memory functions without impacting device performance or logic availability.Figure 1. Trend in Use of Large Memory Functions
This application note summarizes the results of performance tests comparing Altera FLEX 10K devices with Xilinx XC4000XL devices when implementing large memory functions, and discusses the memory vs.
logic tradeoff for both device families.
Memory Performance
The memory performance experiment tested the performance of FLEX 10KA, FLEX 10KE, and XC4000XL devices when implementing dual-port first-in first-out (FIFO) memory functions.
> 4 Mbits Growth
Rate
Memory Size 0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
1 Mbit to 4 Mbits 64 Kbits to
264 Kbits 16 Kbits to
64 Kbits
< 16 Kbits 256 Kbits
to 1 Mbit
1996 1997 1998
2 Altera Corporation AN 99: Comparing Performance of Dual-Port Memory Functions
Lab Setup
In the experiment, 32 × 32-bit, 64 × 32-bit, 128 × 32-bit, and 256 × 32-bit dual-port FIFO functions were implemented in turn into an
EPF10K50V-1, EPF10K50E-1, and XC4036XL-09 device, which are devices of equivalent densities based on logic cell (LC) counts. The Altera single- clock FIFO function (scfifo) and dual-clock FIFO function (dcfifo) were implemented in the EPF10K50V-1 and EPF10K50E-1 device respectively, using the MAX+PLUS II version 9.0 software with timing- driven compilation turned on. The dp_ram LogiBLOX module was implemented in the XC4036XL-09 device using the XACT Step M1 version 4.12 software with timing-driven compilation turned on.
Results
Table 1 shows the results of the experiment. Unlike Altera’s
MAX+PLUS II software, the XACT Step M1 software does not have a drop-in parameterized FIFO function. Therefore, the dp_ram function was used because it was closest in functionality to the scfifo and dcfifo functions. However, the dp_ram function only implements the dual-port RAM module and does not have the control logic present in the Altera functions. Because the control logic was ignored by the dp_ram module, the experiment only tested the dual-port RAM portion of the module.
Thus, the results in Table 1 reflect upper-bound FIFO performance for the XC4036XL-09 device; when the control logic is the critical path, the Xilinx device performs even slower.
As shown in Table 1, both FLEX 10K devices outperformed the XC4000XL device when implementing dual-port FIFO functions. In fact,
improvements in the FLEX 10KE architecture led to significantly higher performance and, more importantly, consistent high performance. The FLEX 10KE device performed consistently because the EAB was the critical path in the device. The performance of the FLEX 10KA device degraded slightly because the FIFO buffer’s control logic was the critical path in the device.
Table 1. Dual-Port Memory Performance Results
Memory Size (Bits) fMAX (MHz)
Altera EPF10K50V-1
Altera EPF10K50E-1
Xilinx XC4036XL-09
32 × 32 70 102 66
64 × 32 59 102 53
128 × 32 58 102 40
256 × 32 57 102 33
Altera Corporation 3 AN 99: Comparing Performance of Dual-Port Memory Functions
Memory vs.
Logic Tradeoff
FLEX 10K devices feature embedded RAM, in which an area of the device is dedicated to memory cells so that the overhead structures required for RAM—addressing and write control—are very fast and efficient. Thus, the largest FLEX 10K device, the EPF10K250A, offers 41 Kbits of RAM without sacrificing logic capacity. Figure 2 shows that there is no memory vs. logic tradeoff for each device in the FLEX 10K family.
Figure 2. Altera FLEX 10K Memory vs. Logic Tradeoff
In contrast, XC4000XL devices use distributed RAM. Each configurable logic block (CLB) can be used for general-purpose logic or as a 32 × 1 RAM block. Using distributed memory in applications requiring RAM blocks larger than 32 × 1 results in lower performance and lower device utilization. As RAM sizes increase, the logic capacity of these devices decreases because more CLBs are required for implementation and are therefore not available for use as logic. Figure 3 shows the considerable memory vs. logic tradeoff for each device in the XC4000XL family.
Figure 3. Xilinx XC4000XL Memory vs. Logic Tradeoff
0 5,000 10,000 15,000 20,000 25,000 30,00035,000 40,000 Logic
Elements Available
Memory Bits 0
1,000 2,000 3,000 4,000 5,000 6,000
45,000 7,000
8,000 9,000 10,000 11,000 12,000
EPF10K20 EPF10K10
EPF10K30 EPF10K40
EPF10K50 EPF10K70
EPF10K100
EPF10K130V EPF10K250A
0 5,000 10,000 15,000 20,000 25,000 30,00035,000 40,000 Logic
Elements Available
Memory Bits 0
1,000 2,000 3,000 4,000 5,000
6,000 XC4062XL
XC4052XL XC4044/XL XC4036EX/XL XC4028EX/XL XC4020E XC4010E/L
Altera, MAX, MAX+PLUS, MAX+PLUS II, FLEX, FLEX 10K, EFP10K10, EPF10K20, EPF10K30, EPF10K40, EPF10K50, EPF10K50E, EPF10K50V, EPF10K70, EPF10K100, EPF10K130V, and EPF10K250A are trademarks and/or service marks of Altera Corporation in the United States and other countries. Altera acknowledges the trademarks of other organizations for their respective products or services mentioned in this document. Altera products are protected under numerous U.S. and foreign patents and pending applications, maskwork rights, and copyrights. Altera warrants performance of its semiconductor products to current specifications in accordance with Altera’s standard warranty, but reserves the right to make changes to any products and services at any time without notice. Altera assumes no responsibility or liability arising out
of the application or use of any information, product, or service described herein except as expressly agreed to in writing by Altera Corporation. Altera customers are advised to obtain the latest version of device specifications before relying on any published information and before placing orders for products or services.
Copyright 1998 Altera Corporation. All rights reserved.
101 Innovation Drive San Jose, CA 95134 (408) 544-7000 http://www.altera.com Applications Hotline:
(800) 800-EPLD Customer Marketing:
(408) 544-7104 Literature Services:
(888) 3-ALTERA [email protected]
®
AN 99: Comparing Performance of Dual-Port Memory Functions
4 Altera Corporation
Printed on Recycled Paper.
As shown in Figures 2 and 3, as the memory size increases, the logic capacity remains constant in FLEX 10K devices but degrades significantly in XC4000XL devices.
The impact of this memory vs. logic tradeoff is obvious when you implement large functions in both device families. For example, consider what happens when you simultaneously implement three 128 × 32-bit memory functions (equal to 12,288 memory bits) into either an
EPF10K30E or an XC4036XL device, which are the smallest devices from each family that can fit these functions. Once the functions are
implemented, the EPF10K30E device, which has 1,728 total LCs, still has all 1,728 LCs available for logic; the XC4036XL device, which has 2,592 total LCs, has only 1,300 LCs available for logic. Thus, even though the EPF10K30E device has less density, it has 25% more logic available than the higher density XC4036XL device.
Conclusion
The experiment showed that FLEX 10K devices implement memory functions more efficiently than XC4000XL devices, maintaining device performance as the memory size increases. In addition, the unique FLEX 10K architecture enables these devices to implement large memory functions without a decrease in logic availability. Thus, FLEX 10K devices are the best devices for meeting today’s high performance and large memory applications.Additional Information
The following documents provide more detailed information:
■ FLEX 10K Embedded Programmable Logic Family Data Sheet
■ Product Information Bulletin 20 (Benefits of Embedded RAM in FLEX 10K Devices)
The documents are available by contacting Altera Literature Services at (888) 3-ALTERA; you can also download them from the Altera web site at http://www.altera.com.