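05/22 Review (1/2)
• Parallel computer architectures: SISD, SIMD, MISD, MIMD
• MIMD communication models and memory organization
  – Address-space organization: shared memory (multiprocessors) vs. non-shared memory (multicomputers)
  – Communication model: LOAD/STORE instructions vs. message passing
• Shared-memory MIMD organizations
  – Centralized shared memory (SMP) vs. distributed shared memory (DSM)
• Memory behavior of shared-memory systems
  – Cache coherence: makes the caches of a multiprocessor system as transparent to the programmer as the cache of a uniprocessor
  – Memory consistency: provides rules that define correct shared-memory behavior when multiple threads execute concurrently; several execution orders are usually permitted
2021/2/1 Computer Architecture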
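05/22 Review (2/2)
• Cache coherence (definition)
  – If processor P writes X and then reads X, and no other processor writes X in between, the read always returns the value P wrote.
  – If one processor writes X and another processor then reads X, with no other write to X in between, the read returns the written value.
  – Writes to the same location are serialized: any two writes to the same location by any two processors are seen in the same order by all processors.
• Tracking shared data blocks: snooping and directories
  – Implementing cache coherence protocols: write invalidate and write update
• Centralized shared-memory architecture
• Snoopy cache-coherence protocols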
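MSI Write-Back Invalidate Protocol
• 3 states:
  – Modified: only this cache holds a valid, modified copy of the block
  – Shared: the block is clean; other caches may also hold it, and memory is up to date
  – Invalid: the block is not valid
• 4 bus transactions:
  – Read miss: services a read miss on the bus
  – Write miss: services a write miss on the bus and obtains an exclusive copy of the block
  – Invalidate: invalidates the copies of the block held by other processors
  – Write back: a replacement writes the modified block back to memory
• On a write, all other copies of the block are invalidated
  – The write is not complete until the invalidate transaction appears on the bus
  – Write serialization: bus transactions are serialized on the bus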
MSI Snoopy Cache Coherence Protocol
[Figure: two state diagrams over the states Invalid, Shared (read only), and Exclusive (read/write). One shows cache state transitions based on requests from the CPU, the other shows cache state transitions based on requests from the bus.]
• When the most recent copy of the accessed block resides in some private cache, that cache supplies the data on a read or write miss.
  – Action: write back the block; abort the memory access
MSI Snoopy Cache Coherence Protocol

| Request | Source | State transition | Action and explanation |
|---|---|---|---|
| Read hit | Processor | Shared or Modified | Normal hit: read data in private data cache (no transaction) |
| Read miss | Processor | Invalid → Shared | Normal miss: place read miss on bus, change state |
| Read miss | Processor | Shared | Replace block: place read miss on bus |
| Read miss | Processor | Modified → Shared | Write back block, place read miss on bus, change state |
| Write hit | Processor | Modified | Normal hit: write data in private data cache (no transaction) |
| Write hit | Processor | Shared → Modified | Coherence: place invalidate on bus (no data), change state |
| Write miss | Processor | Invalid → Modified | Normal miss: place write miss on bus, change state |
| Write miss | Processor | Shared → Modified | Replace block: place write miss on bus, change state |
| Write miss | Processor | Modified | Write back block, place write miss on bus |
| Read miss | Bus | Shared | Serve read miss from shared cache or memory |
| Read miss | Bus | Modified → Shared | Coherence: write back and serve read miss, change state |
| Invalidate | Bus | Shared → Invalid | Coherence: invalidate shared block in other private caches |
| Write miss | Bus | Shared → Invalid | Coherence: invalidate shared block in other private caches |
| Write miss | Bus | Modified → Invalid | Coherence: write back and serve write miss, invalidate |
Example on MSI Cache Coherence
• Assume that A1 and A2 map to the same cache block
• Initial cache state is Invalid

| Request | P1 (state, addr, value) | P2 (state, addr, value) | Bus (action, proc, addr) | Memory (addr, value) |
|---|---|---|---|---|
| P1: Write 10 to A1 | M, A1, 10 | | Wr Miss, P1, A1 | A1, 15 |
| P1: Read A1 (hit) | M, A1, 10 | | | |
| P2: Read A1 | | | Rd Miss, P2, A1 | |
| | S, A1, 10 | | Wr Back, P1, A1 | A1, 10 |
| | | S, A1, 10 | Transfer, P2, A1 | |
| P2: Write 20 to A1 | I | M, A1, 20 | Invalidate, P2, A1 | A1, 10 |
| P2: Write 40 to A2 | | M, A2, 40 | | A2, 25 |
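The example can be replayed with a small simulator. The following is a minimal sketch, not the course's reference implementation: the `Cache` class, `access`, `write_back`, and `run_example` are hypothetical names, each cache is assumed to hold a single block frame (so A1 and A2 collide), and the initial memory contents (A1 = 15, A2 = 25) are an assumption. The separate "Transfer" bus action is not modeled; a missing block is simply filled from memory after any write-back.

```python
class Cache:
    """One private cache with a single block frame (A1 and A2 collide)."""
    def __init__(self, name):
        self.name, self.state, self.addr, self.value = name, "I", None, None

def write_back(cache, memory, log):
    """Copy a dirty block to memory and record the bus transaction."""
    memory[cache.addr] = cache.value
    log.append(("WrBack", cache.name, cache.addr))

def access(caches, memory, who, op, addr, value=None):
    """Perform one processor read or write; return the bus transactions it caused."""
    me = caches[who]
    others = [c for n, c in caches.items() if n != who]
    log = []
    hit = me.state != "I" and me.addr == addr
    if op == "write" and hit and me.state == "S":
        # Write hit on a Shared block: invalidate the other copies (no data moves).
        log.append(("Invalidate", who, addr))
        for c in others:
            if c.addr == addr:
                c.state = "I"
    elif not hit:
        log.append(("RdMiss" if op == "read" else "WrMiss", who, addr))
        if me.state == "M":
            write_back(me, memory, log)      # evict our own dirty block first
        for c in others:
            if c.state != "I" and c.addr == addr:
                if c.state == "M":
                    write_back(c, memory, log)   # owner supplies the latest data
                c.state = "S" if op == "read" else "I"
        me.addr, me.value = addr, memory[addr]
    if op == "write":
        me.state, me.value = "M", value
    elif not hit:
        me.state = "S"
    return log

def run_example():
    memory = {"A1": 15, "A2": 25}            # assumed initial memory contents
    caches = {"P1": Cache("P1"), "P2": Cache("P2")}
    steps = [("P1", "write", "A1", 10), ("P1", "read", "A1", None),
             ("P2", "read", "A1", None), ("P2", "write", "A1", 20),
             ("P2", "write", "A2", 40)]
    logs = [access(caches, memory, *step) for step in steps]
    return caches, memory, logs

if __name__ == "__main__":
    caches, memory, logs = run_example()
    for log in logs:
        print(log)
    print(memory)
```

Under these assumptions the last step evicts P2's modified copy of A1, so the write-back of the value 20 to memory happens only when A2 is written.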
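Write-back Cache
• Cache block states
  – Invalid, Valid (clean), Modified (dirty)
• Processor / cache operations
  – PrRd, PrWr, block Replace
• Bus transactions
  – Bus Read (BusRd), Write-Back (BusWB)
  – Only whole cache blocks are transferred
• Adjusting the block states for cache coherence
  – Treat Valid as Shared
  – Treat Modified as Exclusive
• A new bus transaction is introduced
  – Bus Read-eXclusive (BusRdX)
  – Its basic action: read the block in with intent to modify it
[Figure: state diagram of a write-back cache block, labeled "event/bus action" with "-" meaning no bus action.
  I: PrRd/BusRd (to V); PrWr/BusRd (to M)
  V: PrRd/- (stay); PrWr/- (to M); Replace/- (to I)
  M: PrRd/-, PrWr/- (stay); Replace/BusWB (to I)]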
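The uniprocessor write-back cache above can be written down as a lookup table. This is a minimal sketch under the assumption that the diagram's edges are as described (I, V, M with PrRd, PrWr, Replace); `TRANSITIONS` and `run` are hypothetical names chosen for illustration.

```python
# Uniprocessor write-back cache block as a small state machine.
# Key: (state, event) -> (next_state, bus_transaction or None).
TRANSITIONS = {
    ("I", "PrRd"): ("V", "BusRd"),     # read miss: fetch the block
    ("I", "PrWr"): ("M", "BusRd"),     # write miss: fetch the block, then modify it
    ("V", "PrRd"): ("V", None),        # read hit
    ("V", "PrWr"): ("M", None),        # write hit makes the block dirty
    ("V", "Replace"): ("I", None),     # clean block: drop it silently
    ("M", "PrRd"): ("M", None),        # read hit on a dirty block
    ("M", "PrWr"): ("M", None),        # write hit on a dirty block
    ("M", "Replace"): ("I", "BusWB"),  # dirty block: write it back first
}

def run(events, state="I"):
    """Apply events in order; return the final state and the bus transactions issued."""
    bus = []
    for event in events:
        state, txn = TRANSITIONS[(state, event)]
        if txn is not None:
            bus.append(txn)
    return state, bus

if __name__ == "__main__":
    print(run(["PrRd", "PrWr", "PrRd", "Replace"]))
```

Note that a write miss uses plain BusRd here. Nothing on the bus tells other caches that the block is about to be modified, which is the gap BusRdX is introduced to close.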
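MSI Write-Back Invalidate Protocol
• 3 states:
  – Modified: only this cache holds a valid, modified copy of the block
  – Shared: the block is clean; other caches may also hold it, and memory is up to date
  – Invalid: the block is not valid
• 4 bus transactions:
  – Bus Read (BusRd): issued on a read miss
  – Bus Read Exclusive (BusRdX)
    • Obtains an exclusive copy of the cache block
    • Its basic action: read the block in with intent to modify it
  – Bus Write-Back (BusWB): used when a cache block is replaced
  – Flush on BusRd or BusRdX
    • The cache places the block on the bus (instead of the data coming from memory), completing a cache-to-cache transfer and updating memory
[Figure: MSI state diagram, labeled "event/bus action" with "-" meaning no bus action.
  M: PrRd/-, PrWr/- (stay); BusRd/Flush (to S); BusRdX/Flush (to I); Replace/BusWB (to I)
  S: PrRd/-, BusRd/- (stay); PrWr/BusRdX (to M); BusRdX/- (to I); Replace/- (to I)
  I: PrRd/BusRd (to S); PrWr/BusRdX (to M)]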
State Transitions in the MSI Protocol
• Processor Read
  – Cache miss: issues a BusRd transaction
  – Cache hit (S or M): no bus action
• Processor Write
  – In any state other than Modified: issues a BusRdX transaction, which invalidates the corresponding block in the other caches
  – In the Modified state: no bus action
• Observing a Bus Read
  – If the block is Modified, issue a Flush bus transaction
    • Updates memory and the cache that requested the block
    • The cache block involved in the bus transaction moves to Shared
• Observing a Bus Read Exclusive
  – Invalidate the block
  – If the block is Modified, issue a Flush bus transaction
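The four rules above reduce to two small functions, one for requests from the local processor and one for transactions snooped from the bus. This is a minimal sketch for a single cache block; `on_processor` and `on_snoop` are hypothetical names, and replacement (BusWB) is left out to keep it short.

```python
def on_processor(state, op):
    """Return (next_state, bus_transaction) for a local PrRd or PrWr."""
    if op == "PrRd":
        # A miss (Invalid) issues BusRd; a hit in S or M needs no bus action.
        return ("S", "BusRd") if state == "I" else (state, None)
    if op == "PrWr":
        # Only a block that is already Modified can be written without BusRdX.
        return ("M", None) if state == "M" else ("M", "BusRdX")
    raise ValueError(f"unknown processor operation: {op}")

def on_snoop(state, txn):
    """Return (next_state, response) when another cache's transaction is observed."""
    if txn == "BusRd":
        # A Modified owner flushes the block and keeps a Shared copy.
        return ("S", "Flush") if state == "M" else (state, None)
    if txn == "BusRdX":
        # Every other copy is invalidated; a Modified owner flushes first.
        return ("I", "Flush") if state == "M" else ("I", None)
    raise ValueError(f"unknown bus transaction: {txn}")

if __name__ == "__main__":
    print(on_processor("S", "PrWr"))
    print(on_snoop("M", "BusRd"))
```

Keeping the processor side and the snoop side separate mirrors how the protocol is usually drawn: each cache controller reacts to its own CPU and to the bus independently.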
Example on MSI Write-Back Protocol
[Figure: processors P1, P2, and P3, each with a private cache, on a shared bus with memory and I/O devices. Block u holds 5 in memory initially; P3's write changes it to 7.]

| Processor action | State P1 | State P2 | State P3 | Bus action | Data from |
|---|---|---|---|---|---|
| 1. P1 reads u | S | - | - | BusRd | Memory |
| 2. P3 reads u | S | - | S | BusRd | Memory |
| 3. P3 writes u | I | - | M | BusRdX | Memory |
| 4. P1 reads u | S | - | S | BusRd, Flush | P3 cache |
| 5. P2 reads u | S | S | S | BusRd | Memory |
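The five-step trace can be reproduced by a short state-only simulation. This is a minimal sketch, assuming a single block u and no replacements; `simulate` and `TRACE` are hypothetical names. A cache that does not hold u is reported as I.

```python
def simulate(trace, procs=("P1", "P2", "P3")):
    """Replay (processor, 'read' | 'write') accesses to one block under MSI."""
    state = {p: "I" for p in procs}           # state of block u in each cache
    rows = []
    for proc, op in trace:
        txn = None
        if op == "read" and state[proc] == "I":
            txn = "BusRd"                      # read miss
        elif op == "write" and state[proc] != "M":
            txn = "BusRdX"                     # write to a block that is not Modified
        bus, source = ([txn], "Memory") if txn else ([], "-")
        if txn:
            for other in procs:
                if other == proc:
                    continue
                if state[other] == "M":        # the owner flushes the dirty block
                    bus.append("Flush")
                    source = f"{other} cache"
                    state[other] = "S" if txn == "BusRd" else "I"
                elif txn == "BusRdX":          # all other copies are invalidated
                    state[other] = "I"
            state[proc] = "S" if txn == "BusRd" else "M"
        rows.append((f"{proc} {op}s u",
                     tuple(state[p] for p in procs),
                     ", ".join(bus),
                     source))
    return rows

TRACE = [("P1", "read"), ("P3", "read"), ("P3", "write"),
         ("P1", "read"), ("P2", "read")]

if __name__ == "__main__":
    for row in simulate(TRACE):
        print(row)
```

Step 4 is the interesting one: P3 holds u in Modified, so it answers P1's BusRd with a Flush and both caches end in Shared.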