International Journal of Computer & Software Engineering Volume 3 (2018), Article ID 3:IJCSE-131, 15 pages
https://doi.org/10.15344/2456-4451/2018/131
Original Article
A Low-Energy Multi-Threaded Processor Design for Application Specific Embedded Systems
References
- Harizopoulos S, Ailamaki A (2006) Improving instruction cache performance in OLTP. ACM Transactions on Database Systems, 31: 887-920.
- Joseph PM, Rajan J, Kuriakose KK, Murty SAVS (2013) Exploiting SIMD instructions in modern microprocessors to optimize the performance of stream ciphers. International Journal of Computer Network and Information Security, 5: 56-66.
- Zhang K, Wang YH, Chen SM, Li ZT, Wen L, et al. (2013) Customized MMRF: Efficient matrix operations on SIMD processors. Applied Mechanics and Materials 347-350: 1727-1731.
- Huang L, Xiao N, Wang Z, Wang Y, Lai M, et al. (2013) Efficient multimedia coprocessor with enhanced SIMD engines for exploiting ILP and DLP. Parallel Computing, 39: 586-602.
- Welch E, Patru D, Saber E, Bengtson K (2012) A study of the use of SIMD instructions for two image processing algorithms. In Proceedings of the Western New York Image Processing Workshop (WNYIPW).
- Wickramasinghe M, Guo H (2014) Energy-aware thread scheduling for embedded multi-threaded processors: Architectural level design and implementation. In Proceedings of the IEEE Computer Society Annual Symposium on VLSI (ISVLSI).
- Cleary J, Callanan O, Purcell M, Gregg D (2013) Fast asymmetric thread synchronization. ACM Transactions on Architecture and Code Optimization 27: 1-27.
- Wang X, Zhao Y, Wei Y, Song S, Han B (2010) Prophet synchronization thread model and compiler support. In Proceedings of the International Symposium on Parallel and Distributed Processing with Applications (ISPA).
- Anderson JH, Ahmed T, Kalman SS (2015) Thread synchronization by transitioning threads to spin lock and sleep state. April 7 2015. US Patent 9,003,413.
- Atta I, Tozun P, Tong X, Ailamaki A, Moshovos A, et al. (2013) STREX: Boosting instruction cache reuse in OLTP workloads through stratified transaction execution. In Proceedings of International Symposium on Computer Architecture.
- Nickolls JR, Lew SD, Coon BW, Mills PC (2010) Synchronization of threads in a cooperative thread array. August 31 2010. US Patent 7,788,468.
- Foo YC (2012) Synchronization of Execution Threads on a Multi-threaded processor, US Patent 8286180B2.
- Zhang W, Liu F, Fan R (2014) Cache matching: Thread scheduling to maximize data reuse. In Proceedings of the High Performance Computing Symposium, Tampa, Florida.
- Huang YH, Tseng YY, Kuo YK, Yen TK, Lai BCC, et al. (2013) A locality-aware dynamic thread scheduler for GPGPUs. In Proceedings of the International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT), Taipei, Taiwan.
- Rogers TG, O’Connor M, Aamodt TM (2013) Cache-conscious thread scheduling for massively multithreaded processors. IEEE Micro, 33: 78-85.
- Rogers TG, O’Connor M, Aamodt TM (2012) Cache-conscious wavefront scheduling. In Proceedings of the 45th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO-45. IEEE Computer Society.
- Liang Y, Mitra T (2010) Instruction cache locking using temporal reuse profile. In Proceedings of the 47th Design Automation Conference, DAC, New York, NY, USA, ACM 10: 344-349.
- Liu T, Li M, Xue CJ (2012) Instruction cache locking for embedded systems using probability profile. Journal of Signal Processing Systems 69: 173-188.
- Qiu K, Zhao M, Xue CJ, Orailoglu A (2014) Branch prediction-directed dynamic instruction cache locking for embedded systems. ACM Transactions on Embedded Computer Systems 13: 1-156.
- Anand K, Barua R (2015) Instruction-cache locking for improving embedded systems performance. ACM Transactions on Embedded Computer Systems 14: 1-53.
- Buck B, Hollingsworth JK (2000) An API for runtime code patching. International Journal of High Performance Computing Applications 14: 317-329.
- Burger D, Austin TM (1997) The SimpleScalar tool set, version 2.0. ACM SIGARCH Computer Architecture News 25: 13-25.
- Synopsys Design Compiler. http://www.synopsys.com.
- TSMC 65nm GP Standard Cell Libraries - tcbn65gplus. https://www.cmc.ca/en/WhatWeOffer/Products/CMC-00200-01411.aspx.
- Thoziyoor S, Ahn JH, Monchiero M, Brockman JB, Jouppi NP (2008) A comprehensive memory modeling tool and its application to the design and analysis of future memory hierarchies. In Proceedings of the 35th International Symposium on Computer Architecture.
- Modelsim Simulator. http://www.mentor.com/products/fv/modelsim.