CPU architectures and their limitations, explained

Hello everyone,

So I’m trying to find a somewhat easy (or easier) explanation of CPU architecture types and the limitations in making them more powerful.

I simply want to understand why it is hard to adopt ARM-based designs for CPUs in desktop-class computers (people say some ARM-based CPUs are even more powerful than some desktop-class ones).
Is it clock speed? Cache? The way it inputs/outputs data and connects?

On the other hand, I’m trying to understand why there is a limit on the number of transistors in a CPU core.

  • Does putting more transistors in a core and making it physically bigger result in:
    excessive power consumption?
    uncontrollable heat?
    components being physically farther apart, which slows down the electron flow and makes it impossible to reach higher speeds?
    higher production cost?
  • Can making smaller transistors (like 7 nm in AMD’s chips) fit more of them into a core and so allow a higher clock speed?
  • And why is it not easy to add more cores to a CPU?
  • Is the maximum number of threads per CPU core two, splitting it in half? (Don’t we have something like a quad-thread CPU core?)

Sorry for these basic, raw questions. I understand there are many scientific factors that can’t be explained simply.


First let me state that I have no expertise in this area, so take whatever I say with a whole teaspoon of salt :wink:

Think of the space on the die of a CPU chip as an area. Divide this area up into units of equal size. Now as a CPU designer, you want to have features, and each feature requires so many of these units of space. (Really it’s a number of transistors.) Each unit of space produces more heat the harder it works. So as a designer of such chips, you’re trying to balance how many features to include (maybe it’s cores, or machine learning units, or GPU units or whatever) so that your device meets your needs within the power budget (which is directly related to the ability to dissipate the resulting heat). Unfortunately, due to physics, you can only move heat away at a certain rate without requiring bigger cooling units, which themselves use more space and power. This basically means that every CPU is a trade-off between power input and compute output.


Well, there’s a lot to it but I’ll try to answer at least some of your questions.

Basically, a move from one architecture to another is difficult because of the lack of compatibility. Most codebases are not platform-independent, meaning you usually need to invest a huge amount of effort into porting them. This is one of the reasons why Apple puts so much pressure on developers to use the official API and Swift. With that done, it’s Apple’s job to make those two work on whatever system they wish, and the developers usually need only some minor adjustments to make their products work on another kind of system.

Electrical signals travel at a large fraction of the speed of light (the electrons themselves drift far more slowly), so distance alone isn’t the main issue. However, space is one of them: you can only make the transistors and other elements as small as the current technology allows. And don’t forget, you need to connect them as well. Obviously, the more of them there are, the more heat will potentially be generated (transistors in a CPU consume very little power if they don’t “work”; apart from some leakage, only the switching process requires energy. However, you do want them to work, otherwise why would you put them there :slight_smile: )
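To put rough numbers on the switching-power point, here is a minimal sketch using the common textbook approximation for CMOS dynamic power, P ≈ α·C·V²·f. The function name and all the example values (capacitance, voltage, activity factor) are made up for illustration and are not taken from any real chip, and the formula ignores leakage entirely.

```python
def dynamic_power(activity, capacitance_f, voltage_v, freq_hz):
    """Approximate CMOS switching power in watts: P = a * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * freq_hz

# Hypothetical baseline: 1 nF of switched capacitance, 1.0 V, 3 GHz, 20% activity.
base = dynamic_power(0.2, 1e-9, 1.0, 3e9)

# Doubling the transistor count roughly doubles the switched capacitance.
doubled = dynamic_power(0.2, 2e-9, 1.0, 3e9)

# A higher clock usually needs a higher voltage too, and voltage counts squared.
faster = dynamic_power(0.2, 1e-9, 1.2, 4e9)

print(f"baseline:       {base:.2f} W")
print(f"2x transistors: {doubled:.2f} W")
print(f"4 GHz at 1.2 V: {faster:.2f} W")
```

The takeaway is only the shape of the relationship: power grows linearly with the amount of switching circuitry and with frequency, and quadratically with voltage.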

As for smaller transistors, that’s a whole other story. One issue is the process of how you “draw” your design on a silicon wafer: the design is projected onto the wafer using light. If you want to make a transistor smaller, you also need a light source with a shorter wavelength, which gets more and more complicated as the wavelength decreases.

There’s also another, more important issue here: quantum tunneling, which I’d recommend reading about, as it’s pretty complex but also very interesting. Basically, as the structures get smaller, there’s a bigger chance for electrons to cross the potential barrier between circuits even though they are electrically insulated. It’s a little like a short circuit.
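For a feel of how quickly tunneling gets worse as insulating layers shrink, here is a rough sketch based on the standard estimate for a simple rectangular barrier, T ≈ exp(−2κd) with κ = √(2m(V−E))/ħ. The 3 eV barrier height and the widths are assumed values picked for illustration; real transistors are far more complicated than this.

```python
import math

HBAR = 1.054571817e-34          # reduced Planck constant, J*s
ELECTRON_MASS = 9.1093837e-31   # electron rest mass, kg
EV = 1.602176634e-19            # joules per electronvolt

def tunneling_probability(barrier_width_m, barrier_height_ev=3.0):
    """Rough chance of an electron tunneling through a rectangular barrier.

    barrier_height_ev is the barrier height above the electron's energy
    (V - E); 3 eV is an assumed illustrative value.
    """
    kappa = math.sqrt(2 * ELECTRON_MASS * barrier_height_ev * EV) / HBAR
    return math.exp(-2 * kappa * barrier_width_m)

for width_nm in (3, 2, 1):
    probability = tunneling_probability(width_nm * 1e-9)
    print(f"{width_nm} nm barrier: {probability:.3e}")
```

The dependence on thickness is exponential, so every bit of thinning makes the leak disproportionately worse.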

As for adding more cores: size and space are one limitation, as I’ve explained above, and cost is another. You also need an OS that is sophisticated enough to spread the load between the cores effectively, as that doesn’t happen at the CPU level. And your software has to allow for parallel processing as well. So if most of your programs process things sequentially, you won’t benefit much from extra cores, unless you run a large number of different programs at the same time.
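The usual way to quantify this is Amdahl’s law: if only a fraction p of a program can run in parallel, the speedup on n cores is 1 / ((1 − p) + p / n). A minimal sketch follows; the function name and the sample fractions are just for illustration.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup on `cores` cores when only part of the work is parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

for fraction in (0.0, 0.5, 0.9):
    for cores in (2, 4, 16):
        speedup = amdahl_speedup(fraction, cores)
        print(f"{fraction:.0%} parallel on {cores:2d} cores: {speedup:.2f}x")
```

A fully sequential program gets no speedup no matter how many cores you add, and even a program that is half parallel can never get past 2x.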

You also have to remember that even though you have multiple cores, they still share many other elements like system buses and memory, access to which has to be synchronized to make sure that data stays consistent. And that synchronization costs a lot of time.
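A software-level analogy of that synchronization, as a minimal sketch: several threads update one shared counter, and a lock makes them take turns. Real hardware uses cache-coherence protocols rather than a Python lock, so treat this only as an illustration of the idea; the function name is made up.

```python
import threading

def count_with_lock(num_threads, increments_per_thread):
    """Increment one shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments_per_thread):
            # Only one thread at a time may touch the shared value.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter

print(count_with_lock(4, 10_000))
```

The lock keeps the result consistent, but while one thread holds it the others just wait, which is the time cost in question.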

I hope this sheds some more light on those issues.


Thanks for the clear detail, that helps a lot.

So, simply put, for a non-mobile processor:
If we have a core of a x a in size with n units inside, running at x GHz,
we cannot just go to 2a x 2a with 4n units to increase the speed,
as power consumption and heat generation would be the issue,
and it would need a more complicated bus and I/O to support it.
It’s a daisy chain.
Something like that.


The CPU would theoretically have twice as much performance (let’s leave the shared components and the size aside for now), but only if you could keep those two extra cores busy with parallel tasks. Without making things too complicated, you need to consider CPU cores as separate processors.

However, nowadays a single CPU core is also capable of a degree of parallelism on its own, using techniques like pipelining (instruction execution is split into stages, so several instructions can be in flight at once, each at a different stage) or superscalar execution (a superscalar core has multiple execution units, so it can execute several instructions in a single clock cycle if they use different units). A core may also be able to execute the same instruction on multiple sets of data. In Flynn’s taxonomy, one instruction working on one data item is called SISD (single instruction, single data), one instruction working on several data items is SIMD (single instruction, multiple data), and multiple cores each running their own instruction stream are MIMD. All of that is done at the CPU level, with only limited ways to influence it from the codebase.
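As a rough illustration of why pipelining helps: if executing an instruction is split into k stages and a new instruction can enter the pipeline every cycle, then n instructions take about k + (n − 1) cycles instead of n × k. The sketch below assumes an idealized pipeline with no stalls or branch mispredictions, and the function names are made up.

```python
def unpipelined_cycles(instructions, stages):
    """Each instruction runs through every stage before the next one starts."""
    return instructions * stages

def pipelined_cycles(instructions, stages):
    """Ideal pipeline: fill it once, then finish one instruction per cycle."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)

n, k = 100, 5
print("without pipelining:", unpipelined_cycles(n, k), "cycles")
print("with pipelining:   ", pipelined_cycles(n, k), "cycles")
```

Real pipelines fall short of this ideal whenever an instruction has to wait for the result of an earlier one.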

There’s also Intel’s Hyper-Threading technology (Intel’s implementation of simultaneous multithreading), which presents a superscalar core as two logical cores to the operating system. This gives the OS more influence over how the core’s execution resources are shared; however, the OS must be properly optimized for it, or it can do more harm than good (remember, it’s still just one superscalar core, so full parallel processing isn’t possible).


@marcinm, do you have a blog or something? I’d love to read more on stuff like this.

The first thing that jumped to mind was Digitimes. I can’t recommend it because it’s not a regular part of my reading. It may not be what you want as it’s pretty “newsy” but it may be bookmark worthy in any case. They have a sub-section for chips, which is where this link goes: https://www.digitimes.com/topic/bits+chips/


I don’t have a blog of my own, but I just came across those two articles by Ars Technica, which you might find interesting:


The thing with microprocessor architecture is, the further you go, the deeper it gets :slight_smile: And it’s hard to stop :wink:


Thank you @marcinm! Food for a hungry mind.

And thank you too, @PHolder - I subscribed to Digitimes’ free newsletter.