Beschreibung
In his seminal Turing Award Lecture, Backus discussed the issues stemming from the word-at-a-time style of programming inherited from the von Neumann computer. More than forty years later, computer architects must be creative to amortize the von Neumann Bottleneck (VNB) associated with fetching and decoding instructions which only keep the datapath busy for a very short period of time. In particular, vector processors promise to be one of the most efficient architectures to tackle the VNB, by amortizing the energy overhead of instruction fetching and decoding over several chunks of data. This work explores vector processing as an option to build small and efficient processing elements for large-scale clusters of cores sharing access to tightly-coupled L1 memory.