On Fri, 13 Jan 2012, Isaac Marino Bavaresco wrote:

> On 13/1/2012 14:13, Herbert Graf wrote:
>> On Fri, 2012-01-13 at 15:36 +0000, smplx wrote:
>>> I believe HT makes use of "dead time" during physical core instruction
>>> execution to execute virtual core instructions, i.e. say there is a stall
>>> in the physical core instruction execution pipeline (maybe the core needs
>>> to get the results of a previous instruction but they aren't ready yet),
>>> then the core can use the "stall" to execute instructions that have
>>> nothing to do with the thread that is causing the stall, hence guaranteeing
>>> (probably) that there is something that can be executed.
>> Actually, HT is much more than that.
>>
>> HT is basically making one core look like 2 cores, the OS has no idea
>> that there aren't 2 physical cores (top would show 2 cores).
>>
>> To do this, there is some duplication in the core allowing for 2 threads
>> to run. Large parts of the core are shared by the 2 threads. The result
>> is certain parts of certain types of instructions can be run in parallel
>> (effectively 2 cores), while everything else causes a stall in the other
>> thread, effectively making it single core.
>>
>> The performance of HT is very application/thread dependent. Most of the
>> time it has some benefit, but it's pretty minimal, and NOWHERE near the
>> benefit you get from 2 actual cores.
>>
>> Note that while this SOUNDS very similar to what AMD is doing it's not.
>> AMD has 4 physical cores, but duplicates pretty much EVERYTHING integer
>> related, so that if you're running integer only code (the vast majority
>> of "regular" OS stuff and apps is mostly integer only) you have 2 cores.
>> You are only really impacted if you run FPU rich code.
>> This used to be a much bigger deal, but with the advent of very
>> excellent GPUs, a lot of FPU work traditionally done by the CPU is now
>> handled by the GPU (obviously anything graphics related, and more and
>> more general compute stuff is being offloaded to the GPU).
>>
>> HT is nice, having 2 real(ish) cores is better.

Isn't there a diminishing return, as in, the more cores you have the more
they actually interfere with each other's need to access main memory? I can
see where you would get to a point where cores are just starved and it
would be better to get them to do something else.

>
> The gains from using the stall time in one thread for another thread are
> not very large, but you must remember that the time wasted with context
> switching may be zero if running only two threads or be cut in half if
> running more threads. That may add a few more percent to the speed.
>
> Isaac
>

I haven't been keeping up with this stuff so I'm not sure, BUT isn't
main memory access a huge bottleneck? I thought modern CPUs got long
stalls whenever something had to be fetched from RAM?

Regards
Sergio Masci
-- 
http://www.piclist.com PIC/SX FAQ & list archive
View/change your membership options at
http://mailman.mit.edu/mailman/listinfo/piclist