Advanced Circuits, Architecture, and Computing Lab
Projects

A brief overview of research projects currently in progress is provided
below. These include: circuit- and microarchitecture-level techniques
to optimize one or more design metrics of random logic circuits,
microarchitectural components, and interconnects; system-level methods
for fault tolerance; and high-performance branch-and-bound algorithms.

Wave Triggering for Random Logic Circuit Optimization

Wave triggering is a comprehensive new design methodology we are
developing to optimize static-CMOS-based pipelined random logic
circuits (RLCs), comprising combinational logic blocks (CLBs) and I/O
latches/flip-flops, by exploiting signal timing characteristics. It
employs a clock-driven, optimized delay chain shared between
physically close RLCs that provides a sequence of timing signals every
clock cycle. These signals turn a selected subset of CLB gates and I/O
latches on or off by enabling or disabling control transistors embedded
in them, thereby putting them in sample or hold mode. We are developing
design techniques based on wave triggering to minimize RLC soft error
rate and power (glitch, short-circuit, and leakage) and to optimize
other design objectives.

Operand Encoding and Operation Bypass for Microarchitecture Optimization

Modern microprocessors feature wide (64-bit) datapaths and are mostly
designed with worst-case inputs in mind. However, such operands rarely
occur; the most frequently occurring input operand words have strings
or subwords (SWs) of 0's and 1's embedded in them. We are developing a
general methodology to optimize microarchitectural modules for
computation (e.g., functional units or FUs), communication (e.g.,
buses), and storage (e.g., pipeline registers, register files, and
caches) by exploiting such frequent operand SW values. It partitions
operand words in a predetermined manner into SWs and encodes each SW
value as either special (i.e., an all-0 or all-1 pattern) or
regular (otherwise) using encoding bits. Hardware modules are
similarly partitioned into submodules. A submodule's operation is
bypassed when its input consists of an "exploitable" combination of SW
values (e.g., combinations of special values or special and regular
values). For example, communication or storage of just the encoding
bits suffices for special values. Similarly, in FUs, output SWs of
bypassed submodules are computed using alternative, simpler,
lower-power means.
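
The partition-and-encode step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the lab's actual encoding: the 16-bit SW width, the 2 encoding bits per SW, and all function and label names (`split_subwords`, `classify`, `encode`, `decode`, `bits_to_transfer`) are hypothetical choices made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical tags for the three SW classes: all-0, all-1, and regular.
ZERO, ONE, REGULAR = "Z", "O", "R"

def split_subwords(word: int, width: int = 64, sw_bits: int = 16) -> List[int]:
    """Split a word into width // sw_bits subwords, least significant first."""
    mask = (1 << sw_bits) - 1
    return [(word >> shift) & mask for shift in range(0, width, sw_bits)]

def classify(sw: int, sw_bits: int = 16) -> str:
    """Tag a subword as special (all-0 or all-1) or regular."""
    if sw == 0:
        return ZERO
    if sw == (1 << sw_bits) - 1:
        return ONE
    return REGULAR

def encode(word: int, width: int = 64,
           sw_bits: int = 16) -> List[Tuple[str, Optional[int]]]:
    """Return (tag, payload) pairs; special SWs carry no payload."""
    encoded = []
    for sw in split_subwords(word, width, sw_bits):
        tag = classify(sw, sw_bits)
        encoded.append((tag, sw if tag == REGULAR else None))
    return encoded

def decode(encoded: List[Tuple[str, Optional[int]]], sw_bits: int = 16) -> int:
    """Rebuild the original word from the (tag, payload) pairs."""
    all_ones = (1 << sw_bits) - 1
    word = 0
    for i, (tag, payload) in enumerate(encoded):
        sw = payload if tag == REGULAR else (all_ones if tag == ONE else 0)
        word |= sw << (i * sw_bits)
    return word

def bits_to_transfer(encoded: List[Tuple[str, Optional[int]]],
                     sw_bits: int = 16, tag_bits: int = 2) -> int:
    """Bits to move or store: tags always, payload only for regular SWs."""
    return sum(tag_bits + (sw_bits if tag == REGULAR else 0)
               for tag, _ in encoded)
```

Under these assumptions, the word 0x00000000FFFF1234 splits into SWs tagged R, O, Z, Z (least significant first), so only 24 bits (four 2-bit tags plus one 16-bit payload) would need to be communicated or stored instead of 64.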

Dynamic, Activity-Aware Interconnect Modeling and Design

Technology scaling, higher clock frequencies, and growing die sizes are
making accurate early-stage modeling and optimization of global and
semi-global interconnect area, latency, bandwidth, and reliability
critical to chip success. Most previous modeling and optimization
approaches have considered either worst-case bus activity or average
switching activity factors. However, real workloads exhibit
substantial temporal and spatial (across wires) variation in bus
traffic that depends on the application and the type of traffic:
address, instruction, data, or control. Our work, ranging from the physical
design to the microarchitecture level, considers the dynamic or
run-time value and timing characteristics of these streams to address
the above challenges.
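
To make the notion of spatial and temporal variation concrete, the sketch below counts toggles per wire and per time window for a sequence of bus values. It is a toy measurement on a hypothetical trace, not the modeling approach used in this project; the function names and the trace format (a list of integers, one per cycle) are assumptions made for the example.

```python
from typing import List

def per_wire_toggles(trace: List[int], width: int) -> List[int]:
    """Count transitions on each wire (bit 0 first) over the whole trace."""
    counts = [0] * width
    for prev, cur in zip(trace, trace[1:]):
        diff = prev ^ cur  # a 1 bit marks a wire that switched this cycle
        for bit in range(width):
            counts[bit] += (diff >> bit) & 1
    return counts

def per_window_toggles(trace: List[int], window: int) -> List[int]:
    """Total toggles across all wires in consecutive windows of transitions."""
    toggles = [bin(prev ^ cur).count("1") for prev, cur in zip(trace, trace[1:])]
    return [sum(toggles[i:i + window]) for i in range(0, len(toggles), window)]

# Hypothetical address-like trace: sequential values on an 8-bit bus.
trace = list(range(16))
wire_counts = per_wire_toggles(trace, width=8)
window_counts = per_window_toggles(trace, window=5)
```

For this sequential trace the low-order wires switch far more often than the high-order ones, which a single average activity factor would hide.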

Efficient, Highly Reconfigurable Structurally Fault-Tolerant Multicomponent Systems

Many computer systems comprise numerous similar or identical components
and have interconnection structures integral to the correct and
nondegraded operation of the overall system. In these systems, faulty
components must be replaced by spare ones through reconfiguration while
the structural integrity of the system is preserved, i.e., these systems
need structural fault tolerance (SFT). Examples of such systems are
ALUs consisting of multiple bit-slices, VLSI/WSI arrays for DSP, FPGAs
comprising configurable logic blocks (CLBs), and multiprocessors with
tens to hundreds of thousands of processors interconnected in a tree,
mesh, hypercube, or other topology. A non-fault-tolerant system
consisting of a large number of even highly reliable components can
have low system reliability, since even a single component failure
renders the system faulty. This project is concerned with the
development of efficient SFT design techniques and reconfiguration
algorithms suitable for multiprocessors of arbitrary and special
topologies and for FPGAs.
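
The reliability argument above can be checked with a short calculation. The sketch assumes independent component failures and a common per-component reliability r, neither of which is stated in the text. The second function is an idealized upper bound in which any spare can replace any faulty component; it ignores the topology and reconfiguration constraints that SFT techniques actually have to handle. Function names are hypothetical.

```python
from math import comb

def series_reliability(r: float, n: int) -> float:
    """Reliability of n components when any single failure is fatal."""
    return r ** n

def reliability_with_spares(r: float, n: int, spares: int) -> float:
    """Probability that at least n of n + spares components are fault-free.

    Idealized: assumes any spare can stand in for any faulty component.
    """
    total = n + spares
    return sum(comb(total, k) * r**k * (1 - r) ** (total - k)
               for k in range(n, total + 1))

# Assumed example values: 1000 components, each 99.9% reliable.
no_spares = series_reliability(0.999, 1000)
five_spares = reliability_with_spares(0.999, 1000, 5)
```

With these assumed numbers the system without spares works with probability of only about 0.37, while five ideal spares raise that above 0.99.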

High-Performance Branch-and-Bound and Its Application to
Discrete Optimization, Computational Biology, and Data Mining

Branch-and-bound (B&B) is a popular global optimization method used to
solve NP-hard discrete optimization problems (DOPs). Its applications
run the gamut of DOPs in science, engineering, mathematics, and
operations research. Due to the enormous computation and memory
requirements and time-critical nature of many large, real-world DOPs,
high-performance B&B has been recognized as essential. We are building
a scalable B&B software system featuring sophisticated new sequential
and parallel B&B search and data management techniques. These
techniques adapt to problem and computational platform characteristics
to optimize performance subject to user quality-of-service (QoS)
requirements (in terms of solution quality, response time, etc.).
Further, our B&B system will
have a flexible interface to permit domain experts to integrate
problem-specific heuristics to enhance its effectiveness. Finally, we
are developing B&B algorithms for problems in computational biology
(e.g., scoring function design for protein-ligand docking used in drug
design) and data mining (e.g., optimal feature subset selection).
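
For readers unfamiliar with B&B, the following is a minimal textbook sketch of best-first B&B on the 0/1 knapsack problem, using the fractional-relaxation upper bound. It is not the system described above and uses none of its search or data management techniques; the problem choice and the name `knapsack_bnb` are assumptions made for illustration. Weights are assumed to be positive and values non-negative.

```python
import heapq
from typing import List

def knapsack_bnb(values: List[int], weights: List[int], capacity: int) -> int:
    """Return the best total value of a 0/1 knapsack via best-first B&B."""
    # Consider items in decreasing value density so the bound is tight.
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    v = [values[i] for i in order]
    w = [weights[i] for i in order]
    n = len(v)

    def bound(level: int, value: float, room: float) -> float:
        """Upper bound: fill remaining room greedily, last item fractional."""
        b = value
        for i in range(level, n):
            if w[i] <= room:
                room -= w[i]
                b += v[i]
            else:
                b += v[i] * room / w[i]
                break
        return b

    best = 0
    # Max-heap on the bound (negated); entries: (-bound, level, value, room).
    heap = [(-bound(0, 0, capacity), 0, 0, capacity)]
    while heap:
        neg_b, level, value, room = heapq.heappop(heap)
        if -neg_b <= best or level == n:
            continue  # prune: this subtree cannot beat the incumbent
        if w[level] <= room:  # branch 1: take item `level`
            taken = value + v[level]
            best = max(best, taken)
            left = room - w[level]
            heapq.heappush(heap, (-bound(level + 1, taken, left),
                                  level + 1, taken, left))
        # branch 2: skip item `level`
        heapq.heappush(heap, (-bound(level + 1, value, room),
                              level + 1, value, room))
    return best
```

For example, `knapsack_bnb([60, 100, 120], [10, 20, 30], 50)` returns 220. Problem-specific heuristics of the kind mentioned above would typically enter through the bound function or the initial incumbent.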