On 13/1/2012 20:54, smplx wrote:
>
> On Fri, 13 Jan 2012, Isaac Marino Bavaresco wrote:
>
>> On 13/1/2012 14:13, Herbert Graf wrote:
>>> On Fri, 2012-01-13 at 15:36 +0000, smplx wrote:
>>>> I believe HT makes use of "dead time" during physical core instruction
>>>> execution to execute virtual core instructions. I.e. say there is a stall
>>>> in the physical core instruction execution pipeline (maybe the core needs
>>>> to get the results of a previous instruction but they aren't ready yet),
>>>> then the core can use the "stall" to execute instructions that have
>>>> nothing to do with the thread that is causing the stall, hence guaranteeing
>>>> (probably) that there is something that can be executed.
>>> Actually, HT is much more than that.
>>>
>>> HT is basically making one core look like 2 cores; the OS has no idea
>>> that there aren't 2 physical cores (top would show 2 cores).
>>>
>>> To do this, there is some duplication in the core allowing for 2 threads
>>> to run. Large parts of the core are shared by the 2 threads. The result
>>> is certain parts of certain types of instructions can be run in parallel
>>> (effectively 2 cores), while everything else causes a stall in the other
>>> thread, effectively making it single core.
>>>
>>> The performance of HT is very application/thread dependent. Most of the
>>> time it has some benefit, but it's pretty minimal, and NOWHERE near the
>>> benefit you get from 2 actual cores.
>>>
>>> Note that while this SOUNDS very similar to what AMD is doing, it's not.
>>> AMD has 4 physical cores, but duplicates pretty much EVERYTHING integer
>>> related, so that if you're running integer only code (the vast majority
>>> of "regular" OS stuff and apps is mostly integer only) you have 2 cores.
>>> You are only really impacted if you run FPU rich code.
>>> This used to be a much bigger deal, but with the advent of very
>>> excellent GPUs, a lot of FPU work traditionally done by the CPU is now
>>> handled by the GPU (obviously anything graphics related, and more and
>>> more general compute stuff is being offloaded to the GPU).
>>>
>>> HT is nice, having 2 real(ish) cores is better.
> Isn't there a diminishing return, as in, the more cores you have the more
> they actually interfere with each other's need to access main memory? I can
> see where you would get to a point where cores are just starved and it
> would be better to get them to do something else.
>
>> The gain from using the stall time in one thread for another thread is
>> not very large, but you must remember that the time wasted on context
>> switching may be zero if running only two threads, or be cut in half if
>> running more threads. That may add a few more percent to the speed.
>>
>> Isaac
>>
> I haven't been keeping up with this stuff so I'm not sure BUT isn't
> main memory access a huge bottleneck? I thought modern CPUs got long
> stalls whenever something had to be fetched from RAM?
>
> Regards
> Sergio Masci

Don't forget the cache memory; it relieves a lot of the pressure on the RAM.
Also, context switching requires a lot of updates to the MMU registers.

Isaac
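
A rough way to see why the cache takes so much pressure off the RAM is the usual weighted-average model of memory access time. The sketch below is only an illustration: the function name and the 1 ns / 100 ns latencies are assumptions picked for the example, not measurements from any particular CPU, and real cores have several cache levels plus prefetching that this model ignores.

```python
def effective_access_ns(hit_rate, cache_ns, ram_ns):
    """Average time per memory access for a single-level cache model.

    hit_rate : fraction of accesses served by the cache (0.0 to 1.0)
    cache_ns : assumed latency of a cache hit, in nanoseconds
    ram_ns   : assumed latency of going out to main memory, in nanoseconds
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0.0 and 1.0")
    # Hits cost cache_ns, misses cost ram_ns; weight each by how often it occurs.
    return hit_rate * cache_ns + (1.0 - hit_rate) * ram_ns


if __name__ == "__main__":
    # Hypothetical latencies: 1 ns for a cache hit, 100 ns for a trip to RAM.
    for rate in (0.0, 0.50, 0.90, 0.95, 0.99):
        avg = effective_access_ns(rate, 1.0, 100.0)
        print(f"hit rate {rate:4.0%}: about {avg:6.2f} ns per access")
```

With these made-up numbers, even a 95% hit rate leaves the average access several times slower than a pure cache hit, which fits both points in the thread: the cache helps a great deal, and the remaining misses still cause long stalls.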