Multi-Processor Solutions with FPGAs

by Bob Garrett, Senior Marketing Manager, Nios Marketing, Altera Corporation

Q1 2006 Issue

Page 1 of 2

Embedded designers seeking high-performance processing inevitably face the cost, performance, and power "Bermuda triangle," where the best of intentions can achieve any two of the key objectives but fail to achieve all three. Custom ASIC designs are suitable only for those few who can afford the time, expense, and risk involved. As device geometries continue to shrink and ASIC design costs continue to grow, fewer and fewer applications can justify the expense of a full-custom design.

FPGA-based embedded systems featuring multiple soft core processors offer a powerful set of new options for the embedded designer. No longer are ASIC designers alone in their ability to configure performance-optimized systems-on-chip with a custom-tailored feature set. Now developers can change the performance characteristics of their embedded system right up to the time the product goes into final test. Developers can also extend the product life cycle by getting to market quickly and then upgrading both software and hardware features remotely over the Internet.

While the term "multi-processor" can conjure up memories of academic papers on "parallel processing," commercial applications of multiple CPUs in a single device are much more straightforward. When starting a new design, developers must meet certain performance criteria. Partitioning duties among multiple soft processors provides not only the design flexibility to adapt to last-minute changes caused by evolving standards or competing products, but also the ability to keep pace with these performance criteria. Designers can use multiple soft processors as a divide-and-conquer strategy to increase overall system performance or to offload tasks from an existing processor. Designers typically use 400- to 800-MHz discrete processors to perform the myriad device tasks required, both simple and demanding. Using multiple soft processors makes more efficient use of processing power by partitioning tasks according to their time and power requirements, while providing the same or better overall performance.

The number of soft core processors that designers can implement in a single FPGA is limited only by the device's resources (i.e., its logic and memory). High-density FPGAs, for example, can contain hundreds of soft core processors. Likewise, designers can mix different types of soft core processors in one device (e.g., 16- or 32-bit, performance-optimized, or logic-area-optimized processors).
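The resource limit described above amounts to a simple budget calculation. The following is a minimal sketch, not a vendor sizing tool: the function name `max_soft_cores` and every number in the usage example are hypothetical placeholders, not figures for any real device or processor core.

```python
def max_soft_cores(device_les, device_mem_bits, core_les, core_mem_bits,
                   reserved_les=0, reserved_mem_bits=0):
    """Estimate how many identical soft cores fit in one FPGA.

    All arguments are hypothetical budgets: logic elements (LEs) and
    on-chip memory bits for the device, per core, and reserved for
    other logic such as peripherals.
    """
    if core_les <= 0 or core_mem_bits <= 0:
        raise ValueError("per-core resource usage must be positive")
    # Whichever resource runs out first sets the limit.
    by_logic = (device_les - reserved_les) // core_les
    by_memory = (device_mem_bits - reserved_mem_bits) // core_mem_bits
    return max(0, min(by_logic, by_memory))


if __name__ == "__main__":
    # Illustrative numbers only.
    print(max_soft_cores(100_000, 4_000_000, 700, 32_768))
```

In this made-up example the memory budget, not the logic budget, is the limiting resource, which is why both limits are checked.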

The application code can be split among multiple processors, depending on the tasks involved. Time-critical tasks can be assigned to dedicated processors, while less demanding duties can be shared on one or more other CPUs. This flexibility enables logical grouping of tasks and potentially higher performance while running at a reduced clock frequency, which lowers system power consumption.
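One way to picture this partitioning is the sketch below. It assumes each task's processing demand is already known in millions of cycles per second, gives every time-critical task its own processor, and packs the remaining tasks onto shared processors with a first-fit heuristic. The `Task` class, the `partition` function, and the task names and loads in the tests are all hypothetical; the heuristic is an illustration, not a method described in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    load_mhz: float            # cycles needed, in millions per second (assumed known)
    time_critical: bool = False


def partition(tasks, shared_clock_mhz):
    """Return one dict per processor: its task names and the minimum clock it needs."""
    dedicated = []
    # Each time-critical task gets a dedicated processor.
    for task in tasks:
        if task.time_critical:
            dedicated.append({"tasks": [task.name], "min_clock_mhz": task.load_mhz})

    # Remaining tasks share processors: first-fit, largest load first.
    # A background task larger than the cap simply gets its own processor.
    shared = []
    background = sorted((t for t in tasks if not t.time_critical),
                        key=lambda t: t.load_mhz, reverse=True)
    for task in background:
        for cpu in shared:
            if cpu["min_clock_mhz"] + task.load_mhz <= shared_clock_mhz:
                cpu["tasks"].append(task.name)
                cpu["min_clock_mhz"] += task.load_mhz
                break
        else:
            shared.append({"tasks": [task.name], "min_clock_mhz": task.load_mhz})
    return dedicated + shared
```

With such a plan, no processor has to run faster than the heaviest group of tasks assigned to it, which is the reduced-clock-frequency effect the paragraph describes.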

Embedded Processors in FPGAs

Building a custom device containing an exact set of peripherals, memory interfaces, and processing functions is not hard to imagine—ASIC designers have been doing it for years. Efforts to create an economically viable custom FPGA-based embedded processor device were unsuccessful until the late 1990s when FPGAs had enough on-chip memory, programmable logic, and raw performance. Today, embedded intellectual property (IP) functions designed specifically for FPGAs—including CPUs, signal processing engines, peripherals, and standard communications interfaces—are readily available and offer both cost and performance benefits over traditional discrete embedded devices.

Essentially, designers partition the problem the same way they might if they were building a multi-processor system on a printed circuit board (PCB), with each processor assigned to a specific task. For example, one processor might perform general system housekeeping such as monitoring cabinet fans, running the man-machine interface, or serving the system console, while the others handle communications, signal processing, statistics gathering, or other system tasks.

The multiple-processor approach can reduce overall device cost by moving individual processors from the device board onto the FPGA, which decreases board size. It also reduces signal routing between processors, so fewer board-level interconnects are needed, and it allows a larger number of low-level processors to run at lower clock frequencies, which can reduce the number of layers on the circuit board. See Figure 1.

Figure 1. Multiple Independent Processor

This approach can also reduce software design costs, which represent 80 percent of overall system design expenses because writing code is so time-consuming. If the task can be partitioned across multiple processors, it is easier for engineers to write, debug, and maintain the code. This potentially represents a large back-end savings: faster code development and debugging and, when the product matures, easier maintenance, because code partitioned this way is much easier to analyze.

Multi-Channel Applications

Multi-channel applications can be scaled to meet system throughput by using multiple processors in a single chip, each dedicated to handling a portion of the overall channel throughput. Each processor may run exactly the same code, or some may change algorithms on the fly to adapt to system requirements. In some cases, a master processor is added to handle general housekeeping chores such as system initialization, statistics gathering, and error handling. See Figure 2.

Figure 2. Channelized Processing
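The channelized arrangement can be sized with simple arithmetic. The sketch below assumes every channel carries the same data rate and every worker processor has the same sustainable throughput; the function name `plan_channels` and the rates used in the tests are hypothetical. The optional master processor for housekeeping is not counted.

```python
import math


def plan_channels(num_channels, channel_kbps, cpu_capacity_kbps):
    """Map channel numbers to worker processors.

    Returns a dict {cpu_index: [channel, ...]} using round-robin
    assignment, so no processor carries more channels than it can sustain.
    """
    channels_per_cpu = cpu_capacity_kbps // channel_kbps
    if channels_per_cpu < 1:
        raise ValueError("one processor cannot sustain even a single channel")
    num_cpus = math.ceil(num_channels / channels_per_cpu)
    return {cpu: list(range(cpu, num_channels, num_cpus))
            for cpu in range(num_cpus)}
```

Because every worker handles the same kind of channel, each one could run identical code, as the text notes; adding throughput then means adding another worker rather than choosing a faster processor.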

Serially-Linked Processors

Combining several processors in a chain lets system architects treat each as a stage in a larger processing pipeline. Each CPU, responsible for one piece of the overall processing task, can share data memory (arbitrated or dedicated memory interfaces if off-chip, or dual-ported memory if on-chip) to pass results from the output of one stage to the input of the next. See Figure 3.

Figure 3. Pipelined Processing
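The pipelined arrangement can be modeled in software to check how work flows from stage to stage. In the sketch below, each thread stands in for one processor and each bounded `queue.Queue` stands in for the shared or dual-ported memory between neighboring stages. This is only a behavioral model under those assumptions; the names `run_pipeline` and `_stage` are hypothetical and no FPGA-specific API is implied.

```python
import queue
import threading

_DONE = object()  # end-of-stream marker passed down the chain


def _stage(func, inbox, outbox):
    """One pipeline stage: read an item, process it, pass it on."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)
            return
        outbox.put(func(item))


def _feed(samples, inbox):
    for sample in samples:
        inbox.put(sample)
    inbox.put(_DONE)


def run_pipeline(samples, stage_funcs, depth=4):
    """Push samples through the stages in order and collect the results."""
    # One bounded buffer before each stage, plus one after the last stage.
    buffers = [queue.Queue(maxsize=depth) for _ in range(len(stage_funcs) + 1)]
    threads = [threading.Thread(target=_stage, args=(f, buffers[i], buffers[i + 1]))
               for i, f in enumerate(stage_funcs)]
    threads.append(threading.Thread(target=_feed, args=(samples, buffers[0])))
    for t in threads:
        t.start()
    results = []
    while True:
        item = buffers[-1].get()
        if item is _DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results
```

The bounded buffers mean a slow stage eventually stalls the stages ahead of it, which mirrors how the slowest stage sets the throughput of the whole chain.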

Processor Companion Chip

Discrete processor and digital signal processing (DSP) chips connected to an FPGA can also benefit from hardware acceleration, peripheral expansion, and interface bridging, regardless of whether a CPU is inside the FPGA. Chip-to-chip interface IP is readily available today to provide external access to peripherals, acceleration logic, and I/O interfaces contained within the FPGA. See Figure 4.

Figure 4. Co-Processing/Companion Chip

Establishing Processor Performance

Establishing processor performance requirements for embedded systems can be challenging, particularly when the application software is still in flux. Industry-standard benchmarks provide some guidance, but nothing is certain until the software is complete. This tends to make designers wary of underestimating their performance needs and can result in selecting a higher-performance (and higher-priced) device than necessary. If a designer could accurately predict the performance required, processor selection would be much simpler. Such estimates would consider the performance required by time-critical tasks as well as the load created by one or more low-priority tasks.
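As a rough illustration of such an estimate, the sketch below adds up the cycles per second demanded by each task and divides by a target utilization to leave headroom. The function name `required_clock_mhz`, the default utilization, and the task figures in the tests are assumptions for illustration; real cycle counts would have to come from profiling or benchmarks, which is exactly the information that is hard to obtain while the software is in flux.

```python
def required_clock_mhz(tasks, max_utilization=0.7):
    """Estimate the clock frequency one processor needs.

    tasks: iterable of (cycles_per_run, runs_per_second) pairs covering
    both time-critical and low-priority work (hypothetical inputs).
    max_utilization: fraction of the processor the tasks may consume.
    """
    if not 0 < max_utilization <= 1:
        raise ValueError("max_utilization must be in (0, 1]")
    cycles_per_second = sum(cycles * rate for cycles, rate in tasks)
    return cycles_per_second / 1e6 / max_utilization
```

Running the same estimate per processor, rather than for one large CPU, shows how splitting the task list lowers the clock each processor needs.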

FPGA-based embedded systems can provide scalable performance, allowing last-minute changes to boost system performance based on customer demands. Compute-intensive algorithms, converted to logic in an FPGA, can run orders of magnitude faster than the same algorithm run in software by a microprocessor or digital signal processor. More importantly, hardware resources can be applied to performance-hungry algorithms where they are needed most, potentially reducing the need for a high-performance CPU, reducing clock frequency, reducing power consumption, and simplifying the board design.

