Stanford CS149 class notes
Lecture #1

A parallel computer is a collection of processing elements that cooperate to solve problems quickly
fast ≠ efficient: when a program runs faster on a parallel computer, it does not mean it is using the hardware efficiently. Sometimes the goal is raw performance, sometimes efficiency.
working to parallelize your code was often not worth the time (single-threaded CPU performance was doubling roughly every 18 months)
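
The distinction between fast and efficient can be made concrete with two standard metrics. This is a minimal sketch: the definitions (speedup = serial time / parallel time, efficiency = speedup / processor count) are the usual textbook ones rather than something stated in these notes, and the timings are made-up numbers for illustration.

```python
def speedup(t_serial, t_parallel):
    """How many times faster the parallel run is than the serial run."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, num_processors):
    """Fraction of the ideal (linear) speedup actually achieved."""
    return speedup(t_serial, t_parallel) / num_processors


# Hypothetical timings: 10 s on one processor, 4 s on 8 processors.
s = speedup(10.0, 4.0)          # 2.5x faster
e = efficiency(10.0, 4.0, 8)    # 0.3125, i.e. about 31% of the hardware's potential
print(s, e)
```

The program is 2.5x faster, yet most of the 8 processors' capacity goes unused.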
a computer program - a list of instructions
a processor executes instructions, which modify the computer’s state
state: the values of program data, which are stored in a processor’s registers or in memory
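
A toy illustration of "instructions modify state". The three-instruction set below (load, add, store) and the register names are hypothetical, invented only for this sketch; real instruction sets differ.

```python
def run(program, registers, memory):
    """Execute instructions one after another, updating registers and memory."""
    for op, *args in program:
        if op == "load":        # load dst_reg, address
            dst, addr = args
            registers[dst] = memory[addr]
        elif op == "add":       # add dst_reg, src_a, src_b
            dst, a, b = args
            registers[dst] = registers[a] + registers[b]
        elif op == "store":     # store src_reg, address
            src, addr = args
            memory[addr] = registers[src]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return registers, memory


# The computer's state: register values plus memory contents.
registers = {"r0": 0, "r1": 0, "r2": 0}
memory = [5, 7, 0, 0]

# The program: a list of instructions.
program = [
    ("load", "r0", 0),          # r0 = memory[0]
    ("load", "r1", 1),          # r1 = memory[1]
    ("add", "r2", "r0", "r1"),  # r2 = r0 + r1
    ("store", "r2", 2),         # memory[2] = r2
]

run(program, registers, memory)
print(registers, memory)
```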

Limited by power, clock frequency scaling has stopped, and there is no more benefit to be had from instruction-level parallelism (ILP) either. Power consumption is a function of clock frequency: the higher the clock frequency, the higher the power consumption.
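
A commonly used first-order model of this relationship (added here as an assumption, not a formula taken from these notes) relates dynamic power to capacitive load C, supply voltage V, and clock frequency f:

```latex
P_{\text{dynamic}} \propto C \, V^{2} \, f
```

Running at a higher frequency typically also requires a higher voltage, so under this model power grows faster than linearly with clock frequency.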
achieving efficient processing almost always comes down to accessing data efficiently.
All memory is logically an array of bytes; each byte is identified by its “address” in memory (its position in this array)
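
A small sketch of the "array of bytes" view, using a Python bytearray as a stand-in for memory. The 16-byte size, the addresses, and the little-endian layout are arbitrary choices for the example.

```python
import struct

# Model memory as a flat array of bytes; the index is the address.
memory = bytearray(16)

# Store a single byte at address 3.
memory[3] = 0xFF

# Store a 4-byte unsigned integer at address 4 (little-endian assumed).
# A multi-byte value simply occupies consecutive addresses 4, 5, 6, 7.
struct.pack_into("<I", memory, 4, 0x12345678)

bytes_at_4 = [memory[addr] for addr in range(4, 8)]
value = struct.unpack_from("<I", memory, 4)[0]

print([hex(b) for b in bytes_at_4])  # least significant byte first
print(hex(value))
```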
About caches
A cache is on-chip storage that maintains a copy of a subset of the values in memory
If an address is “in the cache,” the processor can load and store to this address more quickly than if the data resides in memory.
A cache is a hardware implementation detail that does not impact the output of a program, only its performance.
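
The points above can be illustrated with a toy simulation. This is a sketch under stated assumptions, not a model of any real processor: the DirectMappedCache class, its 8 lines of 16 bytes each, and the 32x32 matrix of one-byte values are all made up for the example, and only loads are modeled. It sums the same data in two traversal orders and counts cache hits and misses.

```python
class DirectMappedCache:
    """Toy cache: each slot remembers which memory line it currently holds."""

    def __init__(self, num_lines=8, line_size=16):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines
        self.hits = 0
        self.misses = 0

    def access(self, address):
        line_number = address // self.line_size   # which memory line holds the byte
        slot = line_number % self.num_lines       # the one slot that line may use
        if self.tags[slot] == line_number:
            self.hits += 1
        else:
            self.misses += 1                      # fetch the line, evicting the old one
            self.tags[slot] = line_number


def sum_matrix(memory, rows, cols, cache, row_major):
    """Sum a rows x cols matrix stored row by row, in the given traversal order."""
    total = 0
    outer, inner = (rows, cols) if row_major else (cols, rows)
    for i in range(outer):
        for j in range(inner):
            r, c = (i, j) if row_major else (j, i)
            address = r * cols + c
            cache.access(address)
            total += memory[address]
    return total


ROWS, COLS = 32, 32
memory = bytes(range(256)) * 4    # 1024 one-byte values

row_cache = DirectMappedCache()
col_cache = DirectMappedCache()
row_total = sum_matrix(memory, ROWS, COLS, row_cache, row_major=True)
col_total = sum_matrix(memory, ROWS, COLS, col_cache, row_major=False)

print(row_total, col_total)                 # identical results
print(row_cache.misses, col_cache.misses)   # very different miss counts
```

Both traversals compute the same sum, but in this toy model the row-by-row order misses once per 16-byte line while the column-by-column order misses on every access.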
