
I don't think Rob has ever designed a single CPU, much less measured the effects of different tradeoffs.


To be fair, likely none of the readers here have designed a single CPU either :)


I toyed with my own implementation of a one-stage RV32I for FPGAs; does this count? ;)


I think that's a lot more than Rob has done. What did you learn?


Well, at least I learnt that pipelining is a good thing; it speeds up execution quite a lot. :)
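
A rough way to see where that speed-up comes from, as a minimal sketch rather than a model of any real core: the stage delays, the register overhead, and the function names below are all made up for illustration, and the pipelined case ignores stalls and branch flushes entirely.

```python
# Idealized timing comparison: one-stage (single-cycle) vs. pipelined.
# All delays are hypothetical numbers, not measurements of a real design.

def single_cycle_time(stage_delays_ns, n_instructions):
    """One long clock period covers every stage, one instruction per cycle."""
    period = sum(stage_delays_ns)
    return period * n_instructions

def pipelined_time(stage_delays_ns, n_instructions, register_overhead_ns=0.0):
    """Clock period is set by the slowest stage plus pipeline-register
    overhead; the pipeline needs (stages - 1) extra cycles to fill.
    Assumes no stalls or flushes."""
    period = max(stage_delays_ns) + register_overhead_ns
    cycles = len(stage_delays_ns) + n_instructions - 1
    return period * cycles

stages = [2.0, 1.0, 2.0, 2.0, 1.0]  # hypothetical fetch/decode/execute/mem/writeback
n = 1000
t_single = single_cycle_time(stages, n)
t_piped = pipelined_time(stages, n, register_overhead_ns=0.5)
speedup = t_single / t_piped
print(t_single, t_piped, round(speedup, 2))
```

With these invented numbers the pipelined version finishes about three times sooner; real designs give some of that back to hazards and mispredicted branches.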


>one-stage

as in not pipelined?


Yes, there was no pipeline. :)


Some variation of designing, building or extending a CPU or ISA is standard in comp-sci curricula.


It's a standard thing to do in EE curricula; you normally do it in a one-semester class, and there are literally thousands of open-source synthesizable CPU cores on GitHub now. Some one-semester classes go so far as to design ASICs and, if they pass DRCs, get them fabbed through something like MOSIS or CMP.

To take three examples to show that designing a CPU is less work than writing a novel:

- Chuck Thacker's "A Tiny Computer", fairly similar to the Nova, is a page and a half of synthesizable Verilog; it runs at 66 MHz in 200 (6-input) LUTs of a Virtex-5: https://www.cl.cam.ac.uk/~swm11/examples/bluespec/Tiny3/Thac...

- James Bowman's J1A is more like Chuck Moore's MuP21 and is about three pages of synthesizable Verilog: https://github.com/jamesbowman/swapforth/blob/master/j1a/ver... and https://github.com/jamesbowman/swapforth/blob/master/j1a/ver.... You can build it with Claire Wolf's iCEStorm (yosys, etc.) and run it on any but Lattice's tiniest FPGAs; it takes up 1162 4-input LUTs.

- Ultraembedded's uriscv is about 11 pages of Verilog and implements the full RV32IMZicsr instruction set, including interrupt handling (but not virtual memory or supervisor mode): https://github.com/rolandbernard/kleine-riscv/tree/master/sr...

In all three cases, this doesn't include testbenches and other verification work, but as I understand it, that's usually only two or three times as much work as the logic design itself.

Maybe we should have a NaCpuDeMo, National CPU Design Month, like NaNoWriMo.

I haven't quite done it myself. Last time I played https://nandgame.com/ it took me a couple of hours to play through the hardware design levels. But that's not really "design" in the sense of defining the instruction set (which is, like Thacker's design, kind of Nova-like), thinking through state machine design, and trying different pipeline depths; you're mostly just doing the kind of logic minimization exercises you'd normally delegate to yosys.

In https://github.com/kragen/calculusvaporis I designed a CPU instruction set, wrote a simulator for it, wrote and tested some simple programs, designed a CPU at the RTL level, and sketched out gate-level logic designs to get an estimate of how big it would be. But I haven't simulated the RTL to verify it, written it down in an HDL, or breadboarded the circuit, so I'm reluctant to say that this qualifies as "designing a single CPU" either. (Since it's not 01982 anymore maybe you should also include a simple compiler backend before you say a new ISA is really designed?)
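
To give a sense of how small the "wrote a simulator for it" step can be, here is a minimal sketch of an instruction-set simulator. The instruction set is a hypothetical four-operation accumulator machine invented for this example; it is not the calculusvaporis instruction set, and the names `run`, `LOAD`, `ADD`, `STORE`, `JNZ`, and `HALT` are all assumptions.

```python
# Toy accumulator-machine simulator (hypothetical ISA, for illustration only).

def run(program, memory, max_steps=1000):
    """Execute (opcode, operand) pairs until HALT; return the final memory."""
    acc, pc = 0, 0
    for _ in range(max_steps):
        op, arg = program[pc]
        pc += 1
        if op == "LOAD":
            acc = memory[arg]
        elif op == "ADD":
            acc = (acc + memory[arg]) & 0xFF  # 8-bit wraparound
        elif op == "STORE":
            memory[arg] = acc
        elif op == "JNZ":
            if acc != 0:
                pc = arg
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode {op!r}")
    raise RuntimeError("program did not halt")

# Multiply by repeated addition: memory[2] += memory[1], memory[0] times.
# memory[3] holds 0xFF, which acts as -1 under 8-bit wraparound.
program = [
    ("LOAD", 2), ("ADD", 1), ("STORE", 2),   # sum += addend
    ("LOAD", 0), ("ADD", 3), ("STORE", 0),   # count -= 1
    ("JNZ", 0),                              # loop while count != 0
    ("HALT", 0),
]
result = run(program, [3, 4, 0, 0xFF])
print(result[2])
```

Running simple programs like this against the simulator is the cheap part; checking that RTL matches it is where the verification effort goes.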

But I also wouldn't say I'm "well versed in the topic". I can say things about what makes CPUs fast or slow, but I don't know them from my own experience; I'm mostly just repeating things I've heard from people I judge as credible on CPU design. But what is that credibility judgment based on? How would I know if I was just believing a smooth charlatan who doesn't really know any more than I do? And I think Rob is in the same situation as I am, just worse, because he has even less experience.


Indeed, I suspect most computer engineering students have done some level of CPU design in their coursework. I rather enjoy the design process, and have done many different designs over the years in an attempt to learn about different optimization and design decisions. I typically do something on paper first, then in some simulation, and sometimes in HDL and on an FPGA, or in some cases discrete logic.

I recently did a very simple 16 instruction/16 register RISC-like design (no microcode) built using just 74xx series logic which was successful at over 10MHz, and I then took that design and implemented it in CPLDs to see how it would compress. It really is an enjoyable process and a nice change from the daily software engineering tasks.

Napkin CPU design should be a table topic at your next dinner!


That's great! Presumably you mean 74HC or 74HCT or at least 74ALS, not really 74? How much trouble did you get with noise as you pushed it to 10 MHz with SSI chips? How many CPLDs did you end up using?

Thank you for comprehensively rebutting "To be fair, likely none of the readers here have designed a single CPU either :)"


Correct! 74HCs for everything, which are both easy to get and forgiving to use. The first build has 4 PCBs (ALU, register file, instruction logic, and memory/LEDs/switches). Each PCB is 12-16 74HCx ICs.

I then did a second design of the same thing using Atmel 1504/1508 CPLDs, and that compressed the design to 2 PCBs: one for the CPU itself and a second for system RAM, switches, LEDs, etc. That first board had 5 1504s and 1 1508, although it could have been done in a tad less if I used another 1508. The biggest consumer was the register file, since it was 16 8-bit registers, which consume 128 flip-flops.
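
The register-file arithmetic can be sketched as below. The macrocell counts (64 for a 1504-class part, 128 for a 1508-class part) and the one-flip-flop-per-macrocell rule are assumptions for this back-of-the-envelope estimate, so check the datasheets before relying on them.

```python
# Back-of-the-envelope budget for the CPLD board described above.
REGISTERS = 16
BITS_PER_REGISTER = 8
register_file_flip_flops = REGISTERS * BITS_PER_REGISTER  # 16 * 8 = 128

# Assumed macrocells per part, one flip-flop each (not taken from the thread).
MACROCELLS = {"1504": 64, "1508": 128}
board = {"1504": 5, "1508": 1}  # part counts from the comment above

total_macrocells = sum(MACROCELLS[part] * count for part, count in board.items())
fraction = register_file_flip_flops / total_macrocells
print(register_file_flip_flops, total_macrocells, f"{fraction:.0%}")
```

Under those assumptions the register file alone would fill the equivalent of one whole 1508 or two 1504s, which fits with it being the biggest consumer.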


That sounds great! Have you written it up anywhere publicly? It sounds better designed (e.g., using fewer chips) than most of the SSI-board CPUs I've seen.

I've been thinking it would be fun to see if I could get JLCPCB to build me such a CPU out of SSI with their PCB assembly service. In theory this is nice and simple: I design the processor at the chip/netlist level, debug the thing in a discrete-event or synchronous simulation with Logisim or something, import it into KiCAD, produce some board layouts, send it to them along with US$50 or so, get back a stack of fully populated SMD PCBs, plug them together, plug it in, start single-stepping the clock while watching what's going on, and then ramp the clock up to see how fast it can run reliably. What surprises am I likely to run into?

Unfortunately I don't think Potato Chips' product line covers things like 1 GHz 8-bit registers, but if it did, that would be pretty entertaining.



