Unaligned memory accesses

From gem5

Latest revision as of 01:04, 12 July 2007

I believe real hardware takes care of unaligned accesses by issuing two loads or stores, if necessary, from the same instruction in the load/store queue. The results are combined once the data comes back, as part of the same load/store, and things continue. That doesn't fit very well with m5, which assumes aligned accesses and has problems if it receives anything else.

ptlsim handles this by having two versions of every memory operation, one for the low portion of an access and one for the high portion, and then gluing the results together. One problem with this method is that instead of a single memory operation "stuttering", there would be two operations taking up twice as many resources. Also, if segmentation happens at the address translation step, it is harder to predict which part of a memory operation should be done in each step, because segmentation can align unaligned addresses and unalign aligned ones.
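To make the split-and-glue idea above concrete, here is a minimal sketch in Python. It is not m5 or ptlsim code. The names `split_access`, `load_unaligned` and `linear_address`, the 8-byte block size, and the little-endian byte order are all assumptions made for illustration.

```python
# Illustrative sketch only: hypothetical helpers, not m5 or ptlsim APIs.
# Assumes an 8-byte access granularity and little-endian byte order.

BLOCK_SIZE = 8  # assumed size of the largest access the memory system accepts


def split_access(addr, size, block_size=BLOCK_SIZE):
    """Split [addr, addr + size) into one or two pieces that each stay
    inside a single block. Returns a list of (address, size) tuples,
    low portion first."""
    if size <= 0 or size > block_size:
        raise ValueError("size must be between 1 and block_size")
    boundary = (addr // block_size + 1) * block_size  # start of the next block
    if addr + size <= boundary:
        return [(addr, size)]  # fits in one block, no split needed
    low_size = boundary - addr
    return [(addr, low_size), (boundary, size - low_size)]


def load_unaligned(mem, addr, size, block_size=BLOCK_SIZE):
    """Perform each piece as its own read and glue the results together."""
    value = 0
    shift = 0
    for piece_addr, piece_size in split_access(addr, size, block_size):
        part = int.from_bytes(mem[piece_addr:piece_addr + piece_size], "little")
        value |= part << shift  # the low portion lands in the low bits
        shift += 8 * piece_size
    return value


def linear_address(segment_base, offset):
    """Segmentation step: the split has to be decided on this address,
    not on the offset the instruction supplied."""
    return segment_base + offset
```

With `mem = bytes(range(32))`, `split_access(6, 4)` yields a low piece `(6, 2)` and a high piece `(8, 2)`, and `load_unaligned(mem, 6, 4)` reassembles the same value a single 4-byte read would give. The segmentation point shows up in `linear_address`: an offset of `0x1004` is 4-byte aligned, but with a segment base of `0x3` the access lands at `0x1007` and has to be split, so the split cannot be chosen before translation.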