Last time we managed to finish the work horse of our processor, the ALU. Now we can crunch numbers and do all sorts of cool bitwise magic on them. But what good does that do for us if we can’t store it?

So this time, we will create our working memory, the registers. You may have noticed the weird banner on top, well it should make more sense after we’re done with the first part of this post. But really it’s just for giggles.

*“Okay then, how do we store, say, a single bit?”* Glad you asked! We’ll use a **latch**, also known as a *flip-flop*. (They are actually a bit different; we’ll get to that.)
A latch is a device that has 2 states: one represents a `1`/`ON`/`TRUE` bit and the other a `0`/`OFF`/`FALSE` bit. We can switch the latch between those states and hence store information.

So let’s build one. For that we need 2 NOR gates:

It’s called an SR NOR Latch. It’s a latch, hence the Latch; it uses NOR gates, hence the NOR; and the SR is there because it has `Set` and `Reset` pins.

As you can see, we actually feed the outputs back into the inputs, which, as you might have guessed, can cause oscillation. This is why latches start in an unknown state and need to be reset (or preset) at least once. Logisim will show `Error` until we set a value into it. You might also have noticed the `~Q` output: it’s the complement (opposite) of `Q`, which is our actual output. That’s just how these latches are made in the real world, and we’re sticking to that.
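If you’d rather poke at the feedback loop in code than in Logisim, here is a minimal Python sketch of two cross-coupled NOR gates. The function names and the “evaluate a few times until it settles” loop are my own assumptions for illustration, not how Logisim simulates it.

```python
def nor(a, b):
    """Return 1 only when both inputs are 0."""
    return 0 if (a or b) else 1


def sr_latch(set_, reset, q, not_q):
    """Settle two cross-coupled NOR gates and return the new (Q, ~Q)."""
    for _ in range(4):  # a few passes are enough for valid inputs to settle
        new_q = nor(reset, not_q)   # upper gate: Reset and ~Q feed Q
        new_not_q = nor(set_, q)    # lower gate: Set and Q feed ~Q
        if (new_q, new_not_q) == (q, not_q):
            break                   # outputs stopped changing
        q, not_q = new_q, new_not_q
    return q, not_q


state = (0, 1)                      # assume the latch was reset once
state = sr_latch(1, 0, *state)      # pulse Set
print(state)
state = sr_latch(0, 0, *state)      # release both pins: the value is held
print(state)
```

The last call is the interesting one: with both pins at `0` the latch keeps whatever it had, which is the “memory” part.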

Let’s improve this design a bit. At the moment, to store a `1` we need to raise the `Set` input. If we want to store a `0` after that, we need to `Reset` the latch. But `Set` and `Reset` are complementary, as in they are opposites. To store a `1` we need to send 2 bits, `01`, and to store a `0` we need to send `10`. No fun. We can use the SR latch’s complementary property to our advantage. But first, as always, the symbol:

We can have a single input `D` (stands for Data) and NOT it before it goes to `Reset`. Also, let’s make this new latch a gated one. What’s a gated latch? Well, at the moment any change on the `D` pin would change the stored value. (This is why D latches are also called Delay latches, since they simply pass the same input, just delayed a bit.)
Okay, so what do we do? Well, how about an AND gate before `Set` and another before `Reset`, with 1 pin on each of them dedicated to letting the input go through? Let’s see it in action:

Nice. As you can see, I also added an “asynchronous” `Async Reset`. What I mean by that is that we can reset the latch even when `Enable` is `0`. We do this so we can put the latch into a stable state whenever we want.
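Here is a small behavioural sketch of that gated D latch in Python. It models what the latch does rather than every gate, and it assumes the reset wins if it is raised together with `Enable`; the function name is made up for this example.

```python
def gated_d_latch(d, enable, async_reset, q):
    """Return the next stored bit of a gated D latch (behavioural model)."""
    if async_reset:              # assumption: reset has priority over everything
        return 0
    set_ = d & enable            # AND gate in front of Set
    reset = (1 - d) & enable     # NOT on D, then the AND gate in front of Reset
    if set_:
        return 1
    if reset:
        return 0
    return q                     # Enable is low: keep the stored value


q = 0
q = gated_d_latch(1, 1, 0, q)    # Enable high: D goes straight in
q = gated_d_latch(0, 0, 0, q)    # Enable low: the change on D is ignored
print(q)
```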

And the symbol for this Gated D Latch is:

So, remember I told you that latches and flip-flops are not quite the same. Well, it’s time to tell you the difference. Latches are transparent, while flip-flops are *synchronous* or *edge-triggered*. What does any of that mean, I hear you ask. Well, before this point all our logic was so called *combinational logic*. That means that the output only depended on input. Kinda like pure functions. In *sequential logic* on the other hand, output not only depends on current input, it also depends on *sequence* of previous inputs. In other words it has memory, aka state. How do we tell the difference between previous signal, and next one? Well we need to somehow *synchronize* all the data that is going around.
And for that we use clocks: oscillators, crystals and the like. This is not your clock that counts time. It’s more like a device that produces a `1` output every X time units. If you know what a square wave is, you know what I mean. In any case, this wave has edges: where a pulse starts, and where it ends.

So what does all this have to do with latches and flip-flops? Well, latches are transparent, aka they change their output immediately when the input changes. We want everything to be in sync, so we can easily move data from one part of the CPU to another. That’s why we want flip-flops, not latches. We want them to change their state only on a clock change.

How do we do that? Well, we kinda did that already (sneaky, I know): the `Enable` input only allows the data to change when it’s `high`/`1`. But to have better control over this we use 2 D latches to make 1 flip-flop:

As you can see, the 2nd latch only triggers when the clock is going from `high` to `low`, so this would be a *Falling Edge Triggered* D Flip-Flop. I also added back the `Enable` pin by ANDing it with the `Clock`. The 1st latch there acts like a buffer that holds the data until the clock edge actually falls.
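To see why two latches give us edge triggering, here is a rough Python model. It assumes the first latch is open while `clock AND enable` is high and the second while the clock is low; the class and method names are hypothetical.

```python
class DFlipFlop:
    """Falling-edge D flip-flop modelled as two latches in a row."""

    def __init__(self):
        self.master = 0   # value held by the 1st latch
        self.q = 0        # value held by the 2nd latch (the real output)

    def tick(self, d, clock, enable=1):
        if clock & enable:        # 1st latch is transparent while the gated clock is high
            self.master = d
        if not clock:             # 2nd latch is transparent while the clock is low
            self.q = self.master
        return self.q


ff = DFlipFlop()
print(ff.tick(1, clock=1))   # clock high: output does not move yet
print(ff.tick(1, clock=0))   # falling edge: the buffered 1 shows up on Q
print(ff.tick(0, clock=0))   # D changes while the clock is low: ignored
```

At no point are both latches open at once, so a change on `D` can never race straight through to `Q`.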

And let’s give it a symbol:

See that weird ~~illuminati~~ triangle/arrow symbol? That’s how people usually mark clock pins. So I just follow that for consistency.

Okay, now that we have 1-bit storage, let’s scale it to our WORD size: 8 bits, a byte. We call it a WORD because in *ye olde* days we called the natural unit of data that a CPU can operate on a MACHINE WORD, and its size was however long the word was in bits. So for us it’s a Byte, or Octet (since bytes weren’t always 8 bits long either).

Anyways, we can just have 8 D Flip-Flops with control lines connected together, pretty easy:

And as always, the symbol:

Now that’s all the memory we need internally for a CPU. We can have a bunch of registers to store some values. But what if we want to store many more values? We’d need something we could access randomly, too, so we can reach any value. Hmm… So how do we build this… RAM?

Well, you’ll have to tune in next time, because this is getting a bit long.

For now, feel free to play around with current CPU:

Okay, so that’s the basics of memory. I decided to split it in 2, because it’s getting long. Next time we’ll make RAM, address the problem of *addressing* and have ⅔ of the CPU done.
As always Logisim schematics: cpu-scratch-p5.circ

P.S. If you find any errors, mistakes, and/or want to give me some feedback, feel free to send me an email

Welcome back to our little endeavour to build a CPU. Last time we finished our adder/subtractor, and it’s all fun and cool, but we’d like our CPU to do more than add and subtract. Today we will finish the ALU so we can finally move on to the overall architecture of the CPU.

Let’s wrap the arithmetic part of the ALU first. How about multiplication and division? Well, as it turns out it’s a bit more convoluted to create multipliers and dividers. And I was thinking for a long time if I even want to cover them. Here’s the deal, we can totally emulate multiplication and division with addition and subtraction. Since we’re trying to create a simple CPU, not necessarily fast and efficient I think we can leave both multiplication and division to software, not hardware. In fact that’s what early CPUs like Intel 4004 did. They didn’t have dedicated circuitry for multiplication or division, instead you would have a programmer write code to add in a loop to multiply. This approach is slower, but takes less space on a die for actual IC (integrated circuit) and, frankly, they are complicated enough, and don’t add too much to overall theory. I think it’s optional to talk about them.

I will, however, point you in the right direction if you want to make them on your own. Wikipedia has a good article on binary multipliers and division algorithms. Here’s for example an unoptimized binary multiplier I made:

It involves lots of logical shifting (which we didn’t cover, yet) and addition. As you can see it’s a bit complex, and it’s technically not full, since 8-bit by 8-bit multiplication should yield 16-bit integer, or at least set an overflow flag or something.

*Alright, but how would the software version look?*
Good question, let’s look at pseudo-code assembly example:

```
; multiply 5 by 3
load R0, 5 ; load decimal 5 into register 0, the number we keep adding
load R1, 3 ; load decimal 3 into register 1, our loop counter
load R2, 0 ; load 0 into register 2, this is our result
mult: ; label for our loop
add R2, R0 ; add register 0 to the result
sub R1, 1 ; subtract 1 from the counter (sub sets zero flag when it hits 0)
jnz mult ; if zero flag is not set by sub jump to mult
print R2 ; 15
```

Something similar can be written for division:

```
; divide 15 by 3
load R0, 15 ; load decimal 15 into register 0, this is our dividend
load R1, 3 ; load decimal 3 into register 1, this is our divisor
load R2, 0 ; load 0 into register 2, this is our result
div: ; label for our loop
sub R0, R1 ; subtract divisor from dividend
add R2, 1 ; increment the result
cmp R0, 0 ; compare dividend with 0 (sets negative flag if less, zero if same)
jg div ; jump if greater; if both zero and negative flags aren't set jump to div
print R2 ; 5
```

As you can see, we don’t handle the remainder, but you get the gist.
In theory, we can provide some sort of *standard library* that contains such routines, or even implement this in microcode (which we haven’t covered, yet).
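For comparison, here is roughly how the same two loops read in a high-level language. This is just a Python sketch of repeated addition and repeated subtraction for non-negative integers; unlike the assembly above, the division also hands back the remainder. The function names are mine.

```python
def multiply(a, b):
    """Multiply two non-negative integers by adding `a` to a result `b` times."""
    result = 0
    while b > 0:
        result += a
        b -= 1
    return result


def divide(dividend, divisor):
    """Divide by repeated subtraction and return (quotient, remainder)."""
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    quotient = 0
    while dividend >= divisor:
        dividend -= divisor
        quotient += 1
    return quotient, dividend   # whatever is left of the dividend is the remainder


print(multiply(5, 3))   # 15
print(divide(15, 3))    # (5, 0)
print(divide(17, 3))    # (5, 2)
```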

Alright, so that covers the Arithmetic part of the ALU, now let’s move on to Logic. This is going to be pretty easy for the most part. All we need are `AND`, `OR`, `XOR`, and `NOT`, as well as Logical Left and Right shifts.
The simple logical operators are super easy, it’s just 8 gates in parallel. For example, an 8-bit `AND` gate would look something like this:

All the other ones are the same, but instead of the `AND` gate we use the appropriate gate. But we don’t need to do this manually. In Logisim we can change the bit width of a gate, so we can use the built-in Logisim logic gates.

Okay so now we can move to the shifting. Logical shifting is a simple operation in which we take the operand’s bits and *shift* all of them left or right. We can achieve this by connecting the wires from input to output, but we offset them to the next one. So input bit 0 becomes bit 1 in the output.

Here’s the right shift:

And its symbol:

And of course the left shift:

As well as its symbol:

Both of them shift the input by 1 place. You can create circuits that can do more than that. You might want to just offset the inputs more, but there is a better way. We won’t use it for simplicity and consistency (with the multiplier and divider) sake. But if you want, take a look at Barrel Shifters. They use something called Multiplexer which we didn’t cover, yet. It’s a very useful gate which allows selecting different inputs depending on select bits.
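If the barrel shifter idea sounds abstract, here is a short Python sketch of it. It assumes an 8-bit logical left shifter built from three multiplexer stages (shift by 1, 2 and 4), each switched by one bit of the shift amount. That is the usual textbook layout, not necessarily the exact circuit I built.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1   # 0xFF, keeps results inside 8 bits


def shift_left_1(x):
    """The fixed 1-place left shift: bit 0 becomes bit 1, the top bit falls off."""
    return (x << 1) & MASK


def shift_right_1(x):
    """The fixed 1-place logical right shift: a 0 comes in from the left."""
    return (x & MASK) >> 1


def barrel_shift_left(x, amount):
    """Shift left by 0..7 places using three mux stages."""
    for stage in range(3):
        step = 1 << stage                 # this stage shifts by 1, 2 or 4
        if (amount >> stage) & 1:         # select bit: pick the shifted input
            x = (x << step) & MASK
    return x


print(bin(barrel_shift_left(0xFF, 3)))    # 0b11111000
```

The nice part is that any shift amount takes the same three stages, instead of looping one place at a time.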

Here’s an example of logical left barrel shifter I made:

As I said, we won’t use it. Instead we’ll do the same as with multiplication and do it in software:

```
; shift 0xFF (0b11111111) left by 3
load R0, 0xFF ; load hex 0xFF (dec 255) into register 0
load R1, 3 ; load decimal 3 into register 1
shift: ; label for our loop
lsh R0 ; left shift register 0 by 1
sub R1, 1 ; decrement register 1 (sub sets zero and neg flags)
jnz shift ; jump back to shift label while zero flag is not set
print R0 ; 0xF8 (0b11111000)
```

Now that we have all the parts of the ALU we can wire it all together, right? Well, we need to cover one more little guy and his brother. For that we will need to jump back to transistors for a second. I am talking about the Buffer and the Tri-State Buffer (or Controlled Buffer). A Buffer is a gate that simply passes its input to its output.

So, the truth table will look like this:

in | out |
---|---|
0 | 0 |
1 | 1 |

Boring, right? So why do we need it? Well usually it’s used to *buffer* signal source between different circuits. We, however, are more interested in his brother, the Tri-State Buffer. Let’s take a look at the truth table:

in | enable | out |
---|---|---|
0 | 0 | Z |
1 | 0 | Z |
0 | 1 | 0 |
1 | 1 | 1 |

*Woah, woah, woah. What is this ‘Z’ thing?* some of you might’ve thought to yourselves. The `Z` thing is known as the high-impedance or *floating* state. Logisim shows it as `X` instead of `Z`. What it basically means for us is that nothing is *driving* that wire. Nothing is *“connected”* to it. Nothing is sending a signal through it. It’s neither `0` nor `1`.

So how do we build it? Okay let’s start with buffer first, and we will keep using CMOS design here, for real world sake:

It’s basically two `NOT` gates back to back. Each pair of P-Type and N-Type transistors is an inverter, so by chaining two of them like that, we simply let the same signal go through (`!!a == a`).

The symbol for the Buffer is a simple triangle:

Now the good part, the Tri-State Buffer:

It’s a bit more involved, but it’s pretty simple. We simply cut the power and ground off from the output transistors, so they have neither. As you can see, I also labeled the inverters in the circuit.

The symbol for it is the same as the Buffer one, except we have the enable pin out the bottom:

Okay so the last thing we need to do, is to use 8 of them in parallel for our 8-bit buses:

And let’s give it a symbol to differentiate it from 1-bit one:

Okay. That’s cool and all, but why do we need this again? Well, our ALU has different operations now, but it can only output one thing. So we need a way to select which one it outputs. And that’s exactly why we needed Tri-State Buffers. When they are off, nothing is *driving* the output wires; when one of them is on, only its data is on the bus. You can think of it like this: when a Tri-State Buffer is disabled, the wires aren’t *“connected”*, so no signal, not even `0`, is going through them. We need this because bad things happen when multiple signals try to drive the same wire. In Logisim it simply gives us an `Error`: a red wire and a big `E` for the output. In real life I honestly never even tried that, but it can probably damage the device. Basically it’s bad, and you won’t have any readable data there for sure.
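Here is a tiny Python sketch of that idea, using `None` as a stand-in for `Z`. It is a simplification: it treats any two active drivers as an error, even if they happen to agree, and the function names are invented for this example.

```python
def tri_state(value, enable):
    """Pass the value through when enabled, otherwise drive nothing (Z)."""
    return value if enable else None   # None stands for the floating state


def bus(*drivers):
    """Resolve what ends up on a shared wire given every buffer's output."""
    active = [d for d in drivers if d is not None]
    if not active:
        return None                    # nobody is driving: the bus floats
    if len(active) > 1:
        raise RuntimeError("bus contention: more than one driver is enabled")
    return active[0]


adder_out, and_out = 0x0F, 0xF0
print(bus(tri_state(adder_out, 0), tri_state(and_out, 1)))   # 240, only AND drives
print(bus(tri_state(adder_out, 0), tri_state(and_out, 0)))   # None, floating
```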

Now that we have everything we need, let’s wire our ALU:

As you can see, we just combine all our operations and use Tri-State Buffers to control what we output. You may have also noticed that I moved the Zero flag *calculation* from our Adder/Subtractor into the ALU proper. That’s because I think it can be useful if other operations can set it too, while all the other flags are arithmetic only. One can also notice that the Add and Sub signals are `OR`ed. That’s because both of them use the Adder, so we output from it when either of them is set.
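As a rough software picture of that wiring, here is a Python sketch: every operation computes its result, and a selector decides which one reaches the output, just like enabling a single Tri-State Buffer. The operation names are placeholders I picked, not the actual control pins of the circuit.

```python
MASK = 0xFF   # keep everything 8 bits wide


def alu(a, b, op):
    """Return (result, zero_flag) for one selected 8-bit operation."""
    outputs = {
        "add": (a + b) & MASK,
        "sub": (a - b) & MASK,
        "and": a & b,
        "or": a | b,
        "xor": a ^ b,
        "not": ~a & MASK,
        "lsh": (a << 1) & MASK,
        "rsh": (a & MASK) >> 1,
    }
    result = outputs[op]               # only the selected unit drives the output
    zero = 1 if result == 0 else 0     # computed on the final output, for every op
    return result, zero


print(alu(5, 3, "add"))        # (8, 0)
print(alu(0xF0, 0x0F, "and"))  # (0, 1)
```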

Now for the symbol of ALU. It’s usually represented as this V shaped thing, hence this is what I’ve done:

The inputs come from top, control bits from the left, flags from the right, and of course the output from the bottom.

And that’s that! We just built the muscle of the CPU, the thing that does the work! With all these operations we can do all sorts of things. But all of that will come in the future. For now, play around with your newly baked ALU. Wanna go deeper? Try to implement multiplication and division in hardware, as well as the barrel shifters.

Till next time!

Yey! I am back! Sorry for such a long delay between the posts, was bit busy, then Ludum Dare happened, then the dilemma *“just tell the basics vs go deep”*. I decided to only cover the basics for multiple reasons. It’s going to be easier and faster to produce, less technical people don’t get scared off, the *deep* stuff is not that necessary for understanding of the overall architecture of a CPU. Maybe after I finish this series, I can make another one that covers some optimizations.

Anyways, I hope you enjoyed this one, next time we’ll cover either overall architecture or start working on memory. Tune in next time to find out!

Oh, and the download link for the Logisim circuit for this part: cpu-scratch-p4.circ. I included a play area for the ALU in the `main` circuit.


**Preface:**
I decided to ditch my plan for Part 3 as a monolithic part that covers all aspects of our ALU. It’d be too long, and I found myself cutting some interesting, yet non-essential parts out. That seems stupid to me, because with an approach like that you may as well just search for an ALU schematic online and be done with it.

Instead I will break topics up into more parts, and talk a bit more in depth about some parts of ALU.

Last time we left off with a Full Adder on our hands, as well as, with dreams of 8-bit glory. So it would be fitting to scale up our adder into this new and exciting territory.

First let’s check out our Full Adder again:

As you can see, I added text near the pins for ease of distinguishing.

Now, as some of you might’ve guessed from my tip in the last part, in order to go 8-bit with our little adder all we really need to do is stack ’em one on top of the other. To be more precise, you take two 8-bit inputs and a 1-bit carry, you split the inputs and feed them into 8 1-bit Full Adders, you also connect all the carries, and- Well, just check the picture, it’s pretty self-explanatory:

This is called a Ripple-Carry Adder, because if you put the adders horizontally instead of vertically like I did, the carries ripple from the output of one into the input of the next, like a wave. There are other ways of creating an adder, more optimized and efficient. One example is the Carry-Lookahead Adder. I won’t cover those here, because I think the Ripple-Carry Adder is very easy to understand and is sufficient for our toy CPU. Anyways, let’s make a symbol for our newly baked 8-bit adder:

Now, Logisim can’t create a truth table for schematics with multi-bit I/O. Sad. But we can test it ourselves. From this point on, I will provide the Logisim schematic file, but if you are really interested in this, you’d be better off doing this along with me.
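One way to “test it ourselves” outside of Logisim is to model the adder in a few lines of Python and compare it against ordinary addition. This is a sketch of the ripple-carry idea, with names of my own choosing, not an export of the schematic.

```python
def full_adder(a, b, carry_in):
    """Add three bits and return (sum, carry_out)."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out


def ripple_carry_add(a, b, carry_in=0, width=8):
    """Chain `width` full adders; each carry ripples into the next bit."""
    result = 0
    carry = carry_in
    for i in range(width):
        bit, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= bit << i
    return result, carry   # 8-bit sum plus the final carry out


# Check every possible pair of 8-bit inputs against Python's own addition.
for a in range(256):
    for b in range(256):
        assert ripple_carry_add(a, b) == ((a + b) & 0xFF, (a + b) >> 8)
print("all 65536 cases match")
```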

Okay, so we can add 8-bit integers, now what? How would we, say, subtract 2 integers?
This is a very good question to ask, but in order to answer it, first we need to ask another question. *How do we even represent negative numbers?*

So far we only worked with unsigned integers. Actually what we worked and will always work with are simply bit patterns. We assign meaning to them ourselves by doing specific operations on them. This is very broad topic (Type Theory) that we won’t cover here, because it’s way out of our abstraction level. But we do need to provide some basic operations in our CPU, so we need to interpret those bit patterns as something at some point. Like our adder for example, interprets them as unsigned integers. Our *future* Logic part of ALU needs to interpret them as Booleans. But for real world applications limiting ourselves with only unsigned integers is not practical. So we need a way to distinguish between unsigned and signed integers.

*So how do we do it?* Well, let’s explore a bunch of ways to do that:

If we make the most significant bit (MSB) of a byte a sign bit, we can simply define that `0` is positive and `1` is negative. The rest of the byte is the magnitude of the number.

Keep in mind that we number bits inside of a byte from right to left, so the MSB is the leftmost bit.

```
      7       0
Byte: 0000 0000
      ^
      Most significant bit
```

For example, in signed 3-bit system:

Base 10 | Base 2 |
---|---|
0 | 000 |
1 | 001 |
2 | 010 |
3 | 011 |
-0 | 100 |
-1 | 101 |
-2 | 110 |
-3 | 111 |

*“Wait, what in the name of Turing is negative zero?*” I assume some of you might’ve thought this way.

Yeah, this is really not cool, but this is how we defined it. The benefit of this system is that it is easy for humans to grasp: in the real world we are used to sticking a `-` before a number to show that it’s negative. It also means that we will need to handle the sign in our adder and subtractor. But let’s check other systems before we commit ourselves.

In this system, One’s Complement, we represent negative numbers by inverting them, aka applying bitwise NOT to them. For example `2` is `010`, so `-2` would be `101`.

So let’s check it out in our 3-bit system:

Base 10 | Base 2 |
---|---|
0 | 000 |
1 | 001 |
2 | 010 |
3 | 011 |
-3 | 100 |
-2 | 101 |
-1 | 110 |
-0 | 111 |

*“That didn’t fix anything, though!”* - readers.

Yep, it just changed notation. But this one also has a perk. You see, we don’t need to change our adder much. If we want to add 2 signed numbers we use the same adder we already have and just add back carry to it.

Adding two positive integers doesn’t change at all with this:

```
  Base 2      Base 10
    001          1
  + 010        + 2
    ---        ---
    011          3
  +   0        + 0  (carry)
    ---        ---
    011          3  (as expected)
```

Now let’s add positive and negative numbers:

```
  Base 2      Base 10
    110         -1
  + 010        + 2
    ---        ---
    000          0  (incorrect)
  +   1        + 1  (carry)
    ---        ---
    001          1  (as expected)
```
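The “add the carry back” step is easy to try out in code. Here is a minimal Python sketch for the same 3-bit system; the helper names are made up for illustration.

```python
BITS = 3
MASK = (1 << BITS) - 1   # 0b111


def ones_complement_encode(n):
    """Encode a small integer: positives as-is, negatives as NOT of the magnitude."""
    return n & MASK if n >= 0 else ~(-n) & MASK


def ones_complement_add(x, y):
    """Add two 3-bit patterns and feed the carry back into the result."""
    total = x + y
    carry = total >> BITS            # 1 if the sum spilled past 3 bits
    return (total + carry) & MASK    # add the carry back in


minus_one = ones_complement_encode(-1)              # 0b110
print(bin(ones_complement_add(minus_one, 0b010)))   # 0b1, i.e. -1 + 2 = 1
```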

Now it’s very awesome and cool, saves us building specialized signed adder, but that negative zero is really bothering me. Let’s look at yet another representation.

What if I told you, we can take One’s Complement and put it on steroids? By that I, *of course*, mean let’s do the same thing we did in One’s Complement and add 1.

What we do here is *effectively* shift our problem. We make negative zero go away. Don’t believe me? Watch.

First let’s see how the numbers look. Remember, to represent negative numbers in this system we invert the number and add one. For example `2` is `010`, so `-2` would be `101 + 1 = 110`.

So in 3-bit two’s complement system:

Base 10 | Base 2 |
---|---|
0 | 000 |
1 | 001 |
2 | 010 |
3 | 011 |
-4 | 100 |
-3 | 101 |
-2 | 110 |
-1 | 111 |

See, no negative zero! But how?

Let’s try to represent negative zero in two’s complement: `0` is `000`, so `-0` is `111 + 1 = 000`. Wow. Right? Granted we do have carry `1`, but we don’t use it. So we just solved our double zeros problem with *one magic trick, “Computer Scientists hate it”.*

Leaving clickbait behind, we also retain the benefit of having same adder for signed and unsigned integers, even better we don’t need to add carry back.
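Here is the same 3-bit playground in Python, this time for two’s complement. It is a small sketch with helper names of my own; the point is that plain addition with the carry thrown away already gives the right answer.

```python
BITS = 3
MASK = (1 << BITS) - 1   # 0b111


def negate(x):
    """Two's complement negation: invert every bit, then add one."""
    return (~x + 1) & MASK


def decode(x):
    """Read a 3-bit pattern as a signed number (MSB set means negative)."""
    return x - (1 << BITS) if x & (1 << (BITS - 1)) else x


print(bin(negate(0b010)))                      # 0b110, which is -2
print(negate(0))                               # 0, no negative zero
print(decode((negate(0b001) + 0b010) & MASK))  # -1 + 2 = 1, carry simply dropped
```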

Okay now so I assume y’all agree with me here that Two’s Complement is the best out of these three, so we will use this one. And now that we finished our theory lesson for this part, let’s finally answer our original question: *“How would we, say subtract 2 integers?”*

First of all let’s remember some basic base 10 math. Subtracting, say, `5` from `10` is the same as *adding* `-5` to `10`, aka `10 - 5 = 10 + (-5)`.

So if we convert our number to negative before adding it, we can subtract it. Sounds easy enough, all we need to do is send all 8 bits through NOT gates and then add 1. Wait, so we need an adder in an adder, or 2 cycles to do this? Nope, remember that we have a **Carry In** input in our adder that is added to the result of `a + b` inside. Awesome!

But, before we do that, one more thing. Instead of using NOT gates, let’s use XOR. Why?

Since our Adder *is* our Subtractor, we want to choose the mode of operation. So let’s add an input called **Sub?** that selects either the *add* or the *subtract* mode of our Adder/Subtractor. We can use this same input as **Carry In** to add that `1` for transforming the number into its negative. We can also use it as an input to our XOR gates to flip bits only when we need to subtract! This works because of a property of XOR gates. If you recall, XOR returns `1` only when its inputs are opposite. So we can flip one of the inputs with the other. (Check the XOR implementation in previous parts for reference.) Incredible!
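In code the trick looks like this. It is a minimal Python sketch in which `sub` plays the role of the **Sub?** line and is expected to be `0` or `1`; the function names are assumptions for the example.

```python
MASK = 0xFF   # 8 bits


def negator(b, sub):
    """XOR every bit of b with Sub?: unchanged when 0, inverted when 1."""
    return b ^ (MASK if sub else 0)


def add_sub(a, b, sub):
    """One adder for both jobs: Sub? flips b and doubles as Carry In."""
    total = a + negator(b, sub) + sub   # invert + 1 is two's complement negation
    return total & MASK                 # drop the carry out


print(add_sub(10, 5, 0))   # 15
print(add_sub(10, 5, 1))   # 5
print(add_sub(5, 10, 1))   # 251, the 8-bit pattern for -5
```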

Let’s take a look at the *“negator”* circuit:

And as always, let’s abstract it away and hide it under its own symbol:

Now we can make our Adder/Subtractor. But, again, let’s talk a bit before we do that.
Since we are working with a finite number of bits, we cannot represent all the numbers. With 8 bits we can represent unsigned integers from `0` to `255` (`2^8 - 1`; we subtract one because we start from 0), and with two’s complement we can express signed integers from `-128` to `127`. So what happens when we, say, add `1` to `255`? Let’s check it out, shall we?

```
    Base 2      Base 10
    11111111       255
  + 00000001    +    1
    --------    ------
  1 00000000         0
  ^
  Carry
```

This is called an overflow: we ran out of bits to store the number. We can detect this if we do not ignore the **Carry**, so it would be good if our Adder/Subtractor outputted it so we can save it in the future.

Detecting overflows in signed computation is a bit more involved. Consider next:

```
    Base 2      Base 10
    01111111       127
  + 10000001    + -127
    --------    ------
  1 00000000         0
  ^
  Carry
```

This is actually correct, but if we assume **Carry** is sign of overflow we’d make a mistake, here it’s better to ignore it. Let’s look at this:

```
    Base 2      Base 10
    01111111       127
  - 10000001    - -127
    --------    ------
  0 11111110        -2
  ^
  Carry
```

This is clearly wrong: `127 - (-127) = 127 + 127 = 254`, and we can’t represent a number that large as an 8-bit two’s complement signed integer. So it’s an overflow, but **Carry** is 0!

We need to detect this too somehow, and it’s actually pretty easy. If the sign bits (Most Significant Bits) of both inputs match each other but do not match the sign bit of the result, we’ve got an overflow on our hands. It is important to note that we take the MSBs after we negate the 2nd input, so we only ever check this for an actual addition.

I know you are itching to see the complete circuit, so here we go:

So there are two things I’ve left unexplained. And both are pretty much self-explanatory. The **zero** and **negative** *“flags”* are necessary when we want to check things. Like if **a** is bigger than **b**. We can check it by subtracting **b** from **a** and seeing if result is negative, but we are getting ahead of ourselves.

The **negative** flag is simply set by the MSB of the result. The **zero** flag is a bit more involved, but it is simple enough. All we need to do is OR all the bits: if any bit is `1` the result is `1`, and if all bits are `0` it’s `0`. But since we want the flag *raised*, or set to `1`, when all bits are `0`, we simply NOT it to flip it. Easy.
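To tie the flags together, here is a Python sketch of an 8-bit adder/subtractor that reports all four of them. For overflow it uses the standard two’s complement rule (both addends share a sign that the result does not), checked after the 2nd input has been flipped. Whether that matches my circuit gate for gate is beside the point; the names and the dictionary of flags are just for illustration.

```python
MASK = 0xFF
SIGN = 0x80   # the MSB of a byte


def add_sub_flags(a, b, sub=0):
    """Return (result, flags) for a + b, or a - b when sub is 1."""
    b_eff = b ^ (MASK if sub else 0)     # XOR "negator"
    total = a + b_eff + sub              # Sub? doubles as Carry In
    result = total & MASK
    same_sign_in = (a & SIGN) == (b_eff & SIGN)
    flags = {
        "carry": total >> 8,
        "overflow": 1 if same_sign_in and (result & SIGN) != (a & SIGN) else 0,
        "negative": 1 if result & SIGN else 0,   # just the MSB of the result
        "zero": 0 if result else 1,              # NOT of all bits ORed together
    }
    return result, flags


print(add_sub_flags(255, 1))         # unsigned overflow: carry is set
print(add_sub_flags(127, 0x81))      # 127 + -127: carry set, no signed overflow
print(add_sub_flags(127, 0x81, 1))   # 127 - -127: carry clear, overflow set
```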

And as per tradition, after all this hard work, let’s abstract it away:

So that’s that, phew, that was involved. But I, for one, think it was really cool. It makes one appreciate how things work inside of our computers, and we are not even close to done! Just imagine how much stuff is going on inside today’s 64-bit CPUs!

Anyways, so next time we will continue building Arithmetic part of the ALU, and hopefully finish it!

Oh, and here’s the Logisim circuit: cpu-scratch-p3.circ

P.S. I recently saw this amazing talk by Todd Fernandez about how we actually produce nano-scale transistors. I highly recommend you to give it a watch: youtube.com I can’t recommend it enough.

P.P.S. If you find any errors, mistakes, and/or want to give me some feedback, feel free to send me an email

Last time we made a NAND gate. That’s all fun and cool, but in order to proceed we need all the other Boolean operations. And as we found out last time, we can create them from our magical NAND gate.

But first things first. We don’t want to look at the transistor schematic of the NAND gate all the time. As we make more and more complicated circuits it’s just going to be a mess.

So let’s use so called Shaped icons for our logic gates (IEEE Std 91/91a-1991).

Meet NAND:

You can think of this as an interface of our NAND implementation, a black box. Imagine that the inputs and outputs are connected with our CMOS NAND implementation inside:

Okay so now that we got that out of the way, let’s start with simple NOT gate. It has only one input and it simply outputs inverted version of it.

NOT truth table:

a | out |
---|---|
0 | 1 |
1 | 0 |

Let’s also bring back the truth table of NAND gate again:

a | b | out |
---|---|---|
0 | 0 | 1 |
0 | 1 | 1 |
1 | 0 | 1 |
1 | 1 | 0 |

Spotted anything? That’s right, when **a** and **b** are the same we effectively invert one of them! So let’s just connect **a** and **b** as single input. And *voila*!

To make our lives easier in the future, we’ll assign a symbol for NOT just as we did for NAND.

Okay that’s great, now that we have NOT we can invert signals, which is quite useful. For example, to make AND gate we just need to negate the NAND gate, like this:

And the symbol for AND is:

We’re doing great! Now let’s make OR gate. OR gates return 1 if any or both inputs are 1. We can achieve this pretty easily by just negating inputs to our NAND gate.

This works because, if you look at the truth table of the NAND gate, we basically turned the inputs on their head, leaving the outputs the same.

a | b | out |
---|---|---|
1 | 1 | 1 |
1 | 0 | 1 |
0 | 1 | 1 |
0 | 0 | 0 |

And the symbol for OR is:

With all that done, we can implement one of the coolest gates ever! XOR gate, or eXclusive OR. To make it, first let’s look at our trusty truth table:

a | b | out |
---|---|---|
0 | 0 | 0 |
1 | 0 | 1 |
0 | 1 | 1 |
1 | 1 | 0 |

Aha, so it works just like OR, except when both inputs are 1 it’s 0. We can use the OR gate to implement the first 3 rows of the table. The last row looks awfully similar to the last row of the NAND gate, so let’s also use that. Now we need to combine the results together. If we simply use the AND gate we can see magic in action.

It’s not magic though, this will return 1 only when both NAND and OR produce 1. Effectively we’re masking excessive positive outputs of these gates. In fact AND is often used to mask things to 0.
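All of the constructions so far fit in a few lines of Python, which makes for a handy sanity check. This is a sketch that mirrors the wiring described above, with everything built from a single `nand` function; the trailing underscores only avoid clashing with Python keywords.

```python
def nand(a, b):
    """The only primitive: 0 when both inputs are 1, otherwise 1."""
    return 0 if (a and b) else 1


def not_(a):
    return nand(a, a)                 # tie both NAND inputs together


def and_(a, b):
    return not_(nand(a, b))           # negate the NAND


def or_(a, b):
    return nand(not_(a), not_(b))     # negate the inputs instead


def xor(a, b):
    return and_(or_(a, b), nand(a, b))   # AND masks the 1,1 case of OR


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor(a, b))
```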

The symbol for XOR is this awesome starship like shape:

Now this is ~~podracing~~ amazing and stuff, but we are nowhere near the implementation of the CPU, and we already spent considerable amount of time. But don’t worry, I left the best for dessert.

Here’s a nice property of XOR, it’s just like addition, just binary.

0 + 0 = 0

1 + 0 = 1

0 + 1 = 1

1 + 1 = 0

*“Wait what? 1 + 1 is 0? I think you stared too long at circuit traces there.”* - some of you might say. And you would be right, if not for carry. See, we are in binary here. Base 2. All we have to work with here are 0s and 1s. So to represent a number like 2 (Base 10, also known as decimal) in binary we need two digits. 10 in Base 2 is 2 in Base 10. It’s just like when you run out of digits in decimal, you carry the 1. For example, 4 + 1 = 5, but 9 + 1 = 10 (All Base 10 here).

So if we produce a carry we would make a 1-bit adder. Let’s make a truth table.

a | b | carry | sum |
---|---|---|---|
0 | 0 | 0 | 0 |
1 | 0 | 0 | 1 |
0 | 1 | 0 | 1 |
1 | 1 | 1 | 0 |

As we can see, for **sum** output all we need is XOR, and for the **carry** all we need is AND. Pretty easy.

To make sure, let’s ask Logisim to generate a truth table.

Great! What we created is called Half Adder. Why half? Well, because it doesn’t care about previous carry. So even though it generates it, in the next iteration of computation it won’t consider it.

So let’s step up our game, let’s create a Full Adder! Here comes the good news. We can make a Full Adder by using two Half ones. Makes sense, half + half = full. *Hmm, come to think of it, maybe that’s why they are named like this.* Anyways, all we need to do is add the **sum** of **a** + **b** to the **input carry** and then OR the **output carries** of those two additions.

Before that, to make our lives easier once again, let’s encapsulate our Half Adder into a box. Since it’s not in the IEEE standard we used for our Shaped gates, we can make it look however we want (within reason and what Logisim allows us to do).

I made mine a box with half of the ‘add’ sign:

Now let’s look at the Full Adder implementation:

And check its correctness with Logisim.

Awesome! Looks like we can now add bits together! Aren’t you excited yet? No? Just me? Well you should! We just made our first step, we can compute sums. Yeah, yeah. It’s only for two 1 digit, binary numbers with carries, but hey.
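If you want a second opinion besides Logisim’s truth table, here is a short Python sketch of the same construction: a Full Adder made of two Half Adders and an OR. The function names are mine.

```python
def half_adder(a, b):
    """Return (sum, carry): XOR gives the sum bit, AND gives the carry."""
    return a ^ b, a & b


def full_adder(a, b, carry_in):
    """Two half adders plus an OR on their carries."""
    partial, carry1 = half_adder(a, b)             # a + b
    total, carry2 = half_adder(partial, carry_in)  # (a + b) + input carry
    return total, carry1 | carry2


# Print the full truth table.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            print(a, b, c, full_adder(a, b, c))
```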

Oh and let’s also create a symbol for our newly baked Full Adder:

Pheh! That was a lot, wasn’t it? Next time we will finally move into 8-bit territory, excited? I am! We will start creating the heart of the computation in the CPU, the number cruncher, the ALU, or Arithmetic Logic Unit!

P.S. Noticed how Half Adders clicked together to form Full Adder, I wonder how we will make 8-Bit Adder. Hmm… ;)

P.P.S. If you find any errors, or want to give me some feedback, feel free to send me an email

In order to make a CPU we need to start from somewhere. The 0th level of abstraction here is physics (at least to our understanding), but I think that’s too low. Let’s skip that, let’s also skip electrical engineering and head straight into logic, Boolean logic that is. But before we do that I do feel the need to talk about the building block of it all.

Meet the transistor.

Transistors are very useful. Before them we used cogs, relays, and vacuum tubes. Each one was an improvement over the last, but they were still slow as heck. But what is a transistor?

A transistor is a semiconductor device used to amplify or switch electronic signals and electrical power.

from Wikipedia

But for our purposes just think of it as a switch, you know, like the one you use to turn a light bulb on or off. Except it’s not mechanical, and it can switch billions of times per second.

There are many types of transistors, but we only care about P-type and N-type. Essentially, in an N-type transistor you need to apply current to the gate to let current flow through the transistor (closing the switch), and in a P-type it’s the reverse: current flows through it only while nothing is applied to the gate.

Quick tour of Logisim:

- Squares are inputs;
- Circles are outputs;
- Dark green = false/0/low current;
- Bright green = true/1/high current;
- Blue = no current;
- Red = error.

Notice that source is an output. There’s a physical explanation of that, but we don’t need to care about it, since we are dealing with logic only, not current.

So how can a switch be used to do computation? Well, enter Boolean algebra. The same one that you use in high-level languages to check conditions! With a couple of transistors we can create familiar operations like NOT, OR, and AND.

But before we do that, let’s look at one property in Boolean algebra - functional completeness.

In logic, a functionally complete set of logical connectives or Boolean operators is one which can be used to express all possible truth tables by combining members of the set into a Boolean expression.

from Wikipedia

To simplify, some sets of Boolean operators can emulate all the other ones. For example, you can create all the operators using only AND and NOT. Better yet, the single NAND (NOT AND) operation is functionally complete all by itself! This means that by creating a NAND gate with transistors we can then use the same part to create everything else.

*“Hold on a sec, what’s a ‘gate’?”* - some of you might ask. In a nutshell it’s some device that performs some Boolean function. A NAND gate in this case is a device that takes 2 or more inputs, performs the NOT AND operation on them, and returns an output.
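To make the “everything from NAND” claim a bit more concrete, here is a small Python sketch that builds NOT, AND and OR from nothing but a 2-input NAND. All the function names are made up for this example (the trailing underscores just avoid clashing with Python keywords), and bits are modeled as 0/1 integers.

```python
def nand(a: int, b: int) -> int:
    """2-input NAND: 0 only when both inputs are 1."""
    return 1 - (a & b)


def not_(a: int) -> int:
    # Feeding the same signal into both inputs turns NAND into NOT.
    return nand(a, a)


def and_(a: int, b: int) -> int:
    # NAND followed by NOT cancels the inversion.
    return not_(nand(a, b))


def or_(a: int, b: int) -> int:
    # De Morgan: a OR b == NOT(NOT a AND NOT b).
    return nand(not_(a), not_(b))


# Check every input combination against Python's own bitwise operators.
for a in (0, 1):
    assert not_(a) == 1 - a
    for b in (0, 1):
        assert and_(a, b) == (a & b)
        assert or_(a, b) == (a | b)
print("NOT, AND and OR all match")
```

This is only one possible way to wire them up, other NAND-only constructions exist.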

So why would we want to make everything out of one gate? Well there are a lot of reasons actually, and to be honest I don’t know them all. Here’s the gist: smaller sizes, lower power consumption, less signal noise and I guess process streamlining.

But enough with words! Let’s see it all in action!

First we need to define what we want our gate to do. There are a bunch of representations of Boolean functions, but I think the most readable at a glance are truth tables. You put all your possible inputs into columns on the left, and your corresponding outputs on the right, and voila!

a | b | out |
---|---|---|
0 | 0 | 1 |
1 | 0 | 1 |
0 | 1 | 1 |
1 | 1 | 0 |

As we can see we need to output 1 in all cases except when both inputs are 1, so let’s build this gate using only transistors.

There are many ways to do this, I chose the CMOS NAND gate implementation, but it doesn’t really matter in our case.
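For those who like to see it in code, here is a rough Python sketch of the textbook CMOS NAND arrangement, treating every transistor as an ideal switch: two P-type transistors in parallel between power and the output, and two N-type transistors in series between the output and ground. This is my assumption of the usual layout, not a transcription of the Logisim circuit, and the names `n_type`, `p_type` and `cmos_nand` are made up for this example.

```python
def n_type(gate: int) -> bool:
    """Ideal N-type switch: conducts while its gate is 1."""
    return gate == 1


def p_type(gate: int) -> bool:
    """Ideal P-type switch: conducts while its gate is 0."""
    return gate == 0


def cmos_nand(a: int, b: int) -> int:
    # Pull-up network: two P-type switches in parallel, either one
    # conducting connects the output to power.
    pull_up = p_type(a) or p_type(b)
    # Pull-down network: two N-type switches in series, both must
    # conduct to connect the output to ground.
    pull_down = n_type(a) and n_type(b)
    # Exactly one network conducts for any input, so the output is
    # never floating and never shorted.
    assert pull_up != pull_down
    return 1 if pull_up else 0


for a, b in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(a, b, cmos_nand(a, b))
```

The printed rows should line up with the truth table above.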

Let’s analyze the circuit using Logisim. We can ask it to generate a truth table and an algebraic expression, which we can use to make sure our gate is doing what we want. (*Project -> Analyze Circuit -> Table*)

Success! We created our first gate! Now finally we can leave this layer of abstraction and move on!

Next time we will create all the other gates and make our first computation!

I would also highly recommend that you play around in Logisim and see what sorts of things you can create with just a few basic components!

P.S. if you find any errors, or want to give me some feedback, feel free to send me an email

Ever had the thought *“how does my computer work”*? I sure did. And I wanted to understand how it does all the cool things it does. How can I tell it to draw pixels on screen? How does it *know* what a screen is? How does it *know* what *blue* is, or what *‘c’* is, or that *“this is a string”*? How can it do multiple things at the same time? (Talking about single core here)

In this series of blog posts I will try to answer some of those questions by building my own, simple CPU, starting with just a transistor, Boolean logic and my *google-fu*.

The goal is simple really, build the simplest 8-bit CPU to understand how they work. Nothing fancy, no branch-prediction, not even any pipelining. Maybe even cheat somewhere, like not making all instructions for it, since some can be emulated with others.

If you don’t know some of those terms, don’t worry, I will introduce them later when they become relevant. It’ll all make sense in the end.

*“Wait, are you going to solder a giant CPU?”* some of you might ask. And, no, no I won’t. What we’ll do is simulate it, of course. There are many ways to do this: write an emulator in a programming language of your choice, implement an HDL (Hardware Description Language) module and either simulate it on existing sims or actually flash it onto an FPGA (Field Programmable Gate Array) and run real hardware, or… Or you could use something like Logisim.

Logisim is free and open-source software for designing and simulating logic circuits. This is what we are going to use to create our CPU. Why Logisim though, why not do it on hardware?! Well, the answer is pretty simple. To make it more visual. I think it’s easy for us humans to process visual stimuli, so I think this is a good choice to start.

By all means though, if you feel so inclined, go look into HDLs, Simulators, and FPGAs!

Tune in next time, we will build out basic logic gates from mere transistors!
