The full session is available on gdcvault.com. Look under the GDC 2011 free content for "hotspots, flops and uops" and click the link to watch it.
smelax 11 months ago
Nice work. Keep it up.
AntiProtonBoy 1 year ago
Very interesting stuff! Is the final talk going to be available online?
ig0ro 1 year ago
@ig0ro But if you don't see it live, then your name won't be entered to win a Sandy Bridge CPU, t-shirt, or whatever the giveaway is this year. :) I do believe the content will be officially available online at some point after GDC, most likely on the GDC or Intel website.
smelax 1 year ago
In conclusion, I hope to see more material of this nature, to teach programmers how to write code efficiently for modern architectures, as opposed to the textbook architectures they may have learned about in their typical computing science curricula.
sgorsten 1 year ago
The final example, with three different operations being performed in a loop, was an excellent illustration of how modern x86 CPUs use uOps to schedule work. The reminder to think in terms of both latency and throughput, along with clear-cut examples, is of immediate benefit: expensive loops are easy to locate in any code base with even the simplest profiling tools, and the suggested optimizations do not seem particularly difficult to apply.
sgorsten 1 year ago
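The latency-versus-throughput point in the comment above can be sketched with a simple reduction. This is an illustration of the general idea, not the example from the session; the function names `sum_single` and `sum_four` are hypothetical. With one accumulator, each add has to wait for the previous one, so the loop is limited by add latency. With several independent accumulators, the CPU can keep multiple adds in flight.

```c
#include <stddef.h>

/* One accumulator: every add depends on the result of the previous add,
   so the loop forms a single serial dependency chain. */
float sum_single(const float *x, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i)
        s += x[i];
    return s;
}

/* Four independent accumulators: the four chains do not depend on each
   other, so an out-of-order core can overlap their adds. */
float sum_four(const float *x, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }
    float s = (s0 + s1) + (s2 + s3);
    for (; i < n; ++i)      /* leftover elements when n is not a multiple of 4 */
        s += x[i];
    return s;
}
```

Floating-point addition is not associative, so the two versions can differ in the last bits on general input. Whether the second version is actually faster depends on the compiler and the CPU, so it is worth measuring.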
I was unaware that a division was just as expensive as a square root, or that a reciprocal square root was as cheap as it is.
I hope that we see more videos of this nature, which can break some of the common misconceptions that arise in programming.
sgorsten 1 year ago
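One common way to act on the cost of division mentioned above is to hoist a loop-invariant divisor out of the loop. The sketch below is an illustration under that assumption, not code from the session, and the names `scale_div` and `scale_mul` are hypothetical. The second version pays for one division and then uses multiplies.

```c
#include <stddef.h>

/* Straightforward version: one division per element. */
void scale_div(float *x, size_t n, float d)
{
    for (size_t i = 0; i < n; ++i)
        x[i] = x[i] / d;
}

/* Reciprocal version: a single division up front, then one multiply
   per element. */
void scale_mul(float *x, size_t n, float d)
{
    const float inv = 1.0f / d;
    for (size_t i = 0; i < n; ++i)
        x[i] = x[i] * inv;
}
```

Multiplying by the reciprocal can round differently from a true division, which is why compilers generally do not make this change on their own without relaxed floating-point settings. On x86 with SSE there is also an approximate reciprocal square root instruction, exposed in C as the `_mm_rsqrt_ss` and `_mm_rsqrt_ps` intrinsics, which trades precision for speed.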
The branch ordering section was insightful. While most programmers are probably used to thinking about branch prediction in terms of "How frequently does this branch occur?", it is interesting to realize that a branch may occur most frequently during a particular phase of an algorithm, and as long as the branching is clustered in time, we can still obtain great benefit from branch prediction, and the branch becomes cheap.
sgorsten 1 year ago
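The idea of clustering branch outcomes in time can be sketched as follows. This is an illustration of the general principle, not code from the session, and the names `sum_above` and `partition_above` are hypothetical. The loop in `sum_above` contains a data-dependent branch. If the data is first grouped so that all elements passing the test come first, the branch is taken in one long run and then not taken in another, which is the pattern a branch predictor handles well.

```c
#include <stddef.h>

/* Sums the elements that are >= threshold. The branch outcome depends
   on the data, so its predictability depends on how the data is ordered. */
long long sum_above(const int *v, size_t n, int threshold)
{
    long long total = 0;
    for (size_t i = 0; i < n; ++i) {
        if (v[i] >= threshold)
            total += v[i];
    }
    return total;
}

/* Moves every element >= threshold to the front of the array, in place,
   and returns how many such elements there are. Afterwards the branch in
   sum_above is taken for the first k elements and not taken for the rest. */
size_t partition_above(int *v, size_t n, int threshold)
{
    size_t k = 0;
    for (size_t i = 0; i < n; ++i) {
        if (v[i] >= threshold) {
            int t = v[i];
            v[i] = v[k];
            v[k] = t;
            ++k;
        }
    }
    return k;
}
```

Partitioning only pays off if its cost is amortized over many passes. An optimizing compiler may also turn the branch into a conditional move or vectorize the loop, in which case ordering matters much less, so the effect should be measured rather than assumed.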
Overall, an excellent video that provides instantly graspable insight of high value to anyone interested in writing high-performance code.
Specific comments to follow.
sgorsten 1 year ago