DUAL-EXECUTION MODE PROCESSOR ARCHITECTURE FOR EMBEDDED APPLICATIONS
Keywords:
Dual-Execution Mode, Queue Computation, Dynamic Switching Mechanism, Embedded Core
Abstract
This paper presents a novel 32-bit processor architecture targeted at mobile and embedded applications. The processor supports both Queue- and Stack-based programming models in a single simple core. The design focuses on executing Queue programs efficiently while also supporting Stack programs without a considerable increase in hardware over the base Queue architecture. A prototype implementation was produced by synthesizing the high-level model for a target FPGA device. We describe the architecture and present the design results in detail. The design and evaluation results show that the QSP32 core efficiently executes both Queue- and Stack-based programs and achieves an average speed of about 65 MHz. In addition, compared with the base single-mode architecture (PQP), the QSP32 core requires only about 2.54% additional hardware.
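To make the distinction between the two programming models concrete, the following is a minimal sketch, not the QSP32 instruction set or its switching mechanism. It assumes a toy operand store in which Queue mode takes operands from the head and appends results at the tail (first-in, first-out), while Stack mode takes operands from the top and pushes results back (last-in, first-out). The function name `run`, the `queue_mode` flag, and the three-operation instruction set are hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical three-operation instruction set, for illustration only.
OPS = {
    "add": lambda x, y: x + y,
    "sub": lambda x, y: x - y,
    "mul": lambda x, y: x * y,
}


def run(program, queue_mode):
    """Evaluate a toy program in Queue mode (FIFO) or Stack mode (LIFO)."""
    store = deque()
    for instr in program:
        if isinstance(instr, (int, float)):
            # A literal is a load: it is always written at the tail/top.
            store.append(instr)
            continue
        if queue_mode:
            # Queue mode: read both operands from the head of the queue.
            x = store.popleft()
            y = store.popleft()
        else:
            # Stack mode: read both operands from the top of the stack.
            y = store.pop()
            x = store.pop()
        # In both modes the result is written at the tail/top.
        store.append(OPS[instr](x, y))
    return store.pop()


# (2 + 3) * (7 - 4) written for each model.
# Queue program: operands and operators in level order, deepest level first.
queue_program = [2, 3, 7, 4, "add", "sub", "mul"]
# Stack program: ordinary postfix order.
stack_program = [2, 3, "add", 7, 4, "sub", "mul"]

queue_result = run(queue_program, queue_mode=True)
stack_result = run(stack_program, queue_mode=False)
print(queue_result, stack_result)
```

In this sketch the two modes differ only in where operands are read from, which is one way to picture how a single core might serve both models with a small amount of extra logic. The actual hardware cost of the real design is the figure reported in the abstract, not something this sketch measures.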