How to improve the RISC-V specification

alastairreid.github.io

162 points by todsacerdoti 14 days ago


acuster - 14 days ago

You are quite right that the document that 'specifies' RISC-V remains a key weakness in the whole movement.

For expediency, the choice was made to not sweat it. So the document is actually called a 'Manual' but is linked as being the specification. Even so, the document needs a real editor to review it. For example, the preferred bit pattern which is to be processed by an implementation as doing nothing but incrementing the program counter ('no op') is called an 'instruction' in some sections but is clearly not in others, a dumb discrepancy. A review by a good technical editor would be a great first step in improving the document.

However, the greater tragedy is that a great 'specification' for RISC-V would be an invaluable educational document. This would be a very hard document to write. No document that I could find has ever tried to specify an instruction set independent of an actual implementation. So there is no roadmap towards writing a good spec for RISC-V. This is surely one of the reasons the effort has not yet been started.

After a couple of months trying to imagine how such an effort could be undertaken, how one could argue that the effort was worth trying, and how I might convince the community of the value and need for a good spec, I gave up. The work would require a team combining very fine technical knowledge with exceedingly accurate control of technical English. The work would be a multi-person-year effort, requiring concomitant funding. It is not clear to me how this work might begin.

Also you are entirely right to think about the test suite as a central concern. Specifications are strange documents. Some specifications make requirements which cannot be tested; this affects the very nature of what is being 'specified'. Others have tried to root every injunction in the test suite; that approach leads to its own difficulties. The specification will have to make its choice, and the authors would benefit from being very clear with themselves about what stance they are taking on the matter.

So thanks for your argument for a better specification; it would be a wonderful addition to the open instruction set. Hopefully, somehow, such an effort finds its wings.

gchadwick - 14 days ago

Another issue I take with the RISC-V spec is it relies on a common understanding of technical terms without actually defining them precisely anywhere.

To take one example, it never defines what an interrupt is and, more broadly, never defines terminology around exceptions. Contrast this with the Arm ISA, which precisely describes what it means by asynchronous vs synchronous, precise vs imprecise, etc. (see section D3-1 in https://developer.arm.com/documentation/ddi0487/latest/).

The original authors may see this as a virtue: the small size of the RISC-V ISA manuals vs Arm's was portrayed as a great benefit. But in part that size is because it's missing lots of stuff like this that I view as highly important for a specification.

sweetjuly - 14 days ago

This is one of the things I think is most sorely missing from RISC-V. ARM provides executable (but perfectly legible) pseudocode for every instruction. You don't have to rely on natural language to understand what an instruction does, which is really important when dealing with very complex ISA features which have many different (and sometimes contradictory) extensions. SAIL sort of fulfills this purpose if you squint but it doesn't feel like a specification like ARM pseudocode so much as a theorem proving language which happens to be the reference for the ISA.

timhh - 14 days ago

I've been doing a lot of work with Sail (not SAIL btw) and I'm not sure I agree with the points about it.

There's already a way to extract functions into asciidoc as the author noted. I've used it. It works well.

The liquid types do take some getting used to but they aren't actually used in most of the code; mostly for utility function definitions like `zero_extend`. If you look at the definition for simple instructions they can be very readable and practically pseudocode:

https://github.com/riscv/sail-riscv/blob/0aae5bc7f57df4ebedd...

A lot of instructions are more complex of course but that's what you get if you want to precisely define them.

Overall Sail is a really fantastic language and the liquid types really help avoid bugs.

The biggest actual problems are:

1. The RISC-V spec is chock full of undefined / implementation defined behaviour. How do you capture that in code, where basically everything is defined? The biggest example is probably WARL fields which can do basically anything. Another example is decomposing misaligned accesses. You can decompose them into any number of atomic memory operations and do them in any order. E.g. Spike decomposes them into single byte accesses. (This problem isn't really unique to Sail tbf).

2. The RISC-V Sail model doesn't do a good job of letting you configure it currently. E.g. you can't even set the spec version at the moment. This is just an engineering problem though. We're hoping to fix it one day using riscv-config which is a YAML file that's supposed to specify all the configurable behaviour about a RISC-V chip.
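
To make the misaligned-access point in (1) concrete, here is a minimal Python sketch of two decompositions of the same access. The function names are made up for this illustration and are not taken from Sail or Spike; the byte-wise variant only mirrors the strategy attributed to Spike above. Both results would be acceptable under a spec that leaves the decomposition open, which is exactly what makes it awkward to pin down in an executable model.

```python
def decompose_bytes(addr: int, size: int) -> list[tuple[int, int]]:
    """Split an access into single-byte (address, size) pieces."""
    return [(addr + i, 1) for i in range(size)]


def decompose_aligned(addr: int, size: int) -> list[tuple[int, int]]:
    """Split an access into the largest naturally aligned power-of-two pieces."""
    pieces = []
    end = addr + size
    while addr < end:
        chunk = 1
        # Grow the chunk while it stays naturally aligned and inside the access.
        while addr % (chunk * 2) == 0 and addr + chunk * 2 <= end:
            chunk *= 2
        pieces.append((addr, chunk))
        addr += chunk
    return pieces


# A 4-byte access at a misaligned address: two different, equally plausible splits.
print(decompose_bytes(0x1001, 4))
print(decompose_aligned(0x1001, 4))
```

A model has to pick one of these (or make the choice configurable), and the order in which the pieces are performed is a further free choice the sketch does not attempt to capture.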

I definitely agree about the often wooly language in the spec though. It doesn't even use RFC-style MUST/SHOULD/MAY terms.

zyedidia - 14 days ago

I am excitedly awaiting the full release of ASL1 from Arm. I wonder if anyone with more knowledge might be able to comment on how it compares with Sail and/or when we might expect to see a full Arm specification in ASL1 (as opposed to the current spec which is normal ASL and appears to be incompatible with the upcoming version). Perhaps in the future there might also be a RISC-V specification written in ASL1.

chrsw - 14 days ago

The part about needing to be fluent in programming language research in order to understand the SAIL specification is a great point. I think it speaks to the origins of RISC-V being by and for computer science or computer engineering academics and not for practicing digital designers for commercial systems.

artisanspam - 14 days ago

> The easiest way to improve this would be to capture as much of the architecture as possible in formats that are easy to read and manipulate. In particular, instruction encodings and control/status registers are easily described by simple JSON/YAML/XML/… formats.

This has been something I wish was available for ARM pseudocode. It’d be ideal to just generate an equivalent Python, SystemVerilog, etc. library from the ARM ARM instead of having to reimplement a subset of it yourself.
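
As a rough sketch of what the quoted suggestion could look like for instruction encodings, the Python below loads a tiny mask/match table from JSON and uses it to identify instructions. The JSON layout and the `decode` helper are assumptions made up for this example, not an existing RISC-V or Arm data format; only the three RV32I mask/match values are standard encodings.

```python
import json

# Hypothetical machine-readable encoding table (layout invented for this sketch).
ENCODINGS_JSON = """
{
  "addi": {"mask": "0x0000707f", "match": "0x00000013"},
  "add":  {"mask": "0xfe00707f", "match": "0x00000033"},
  "sub":  {"mask": "0xfe00707f", "match": "0x40000033"}
}
"""

# Convert the hex strings to integers once, up front.
ENCODINGS = {
    name: (int(entry["mask"], 16), int(entry["match"], 16))
    for name, entry in json.loads(ENCODINGS_JSON).items()
}


def decode(word: int):
    """Return the mnemonic whose fixed bits match the 32-bit word, or None."""
    for name, (mask, match) in ENCODINGS.items():
        if word & mask == match:
            return name
    return None


print(decode(0x00000013))  # the canonical NOP, i.e. addi x0, x0, 0
print(decode(0x00B50533))  # add a0, a0, a1
```

The same table could in principle drive a Python decoder, a SystemVerilog decoder, and the encoding diagrams in the PDF, which is the point of keeping it as data rather than prose.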

hlandau - 14 days ago

Yeah, the low quality and informal "tutorial" presentation of the RISC-V "specification" has always been very jarring to me given the popularity of RISC-V.

The authoritative description should be machine readable, but the PDF also needs to be authoritative. That means the PDF needs to be generated from the machine readable spec automatically.

I think the highest quality ISA spec in the industry at this point is ARM's ARMv8/9 ISA manual - it's a readable PDF with pseudocode in a well defined language. Every ISA rule is even given a unique identifier. But even better, it's all generated from XML source files which ARM releases, so you can parse the pseudocode out of those and run that code directly if you write an interpreter for it.

I was hoping to see ARM also release this XML for ARMv8-M but last I checked they haven't done so, sadly.

It's highly worthwhile to examine different ISA manuals to compare their relative approaches (e.g. Power, SPARC, ARMv8, Hexagon, s390x, Itanium, etc.). But I think ARMv8 is the best.

IAmLiterallyAB - 14 days ago

SAIL reminds me of SLEIGH, the language Ghidra uses to describe ISA semantics. Cool stuff

azubinski - 14 days ago

"Conclusion The RISC-V architecture was developed in classic startup/academic style: innovating quickly and avoiding too much investment in long-term engineering"

But this is not a conclusion in any sense. This is the premise of the entire RISC-V project. Because RISC-V is just an abstraction to teach and learn machine-level aspects of programming regardless of the specific processors available on the market. Donald Knuth has the same, MMIX. In addition it is difficult to say what innovations there may be in the RISC processor instruction system.

Well...

drycabinet - 14 days ago

Is RISC-V technically superior or is it just about the license?

ur-whale - 14 days ago

A slightly off-topic point but nevertheless worth mentioning (re-iterating in fact): much like ISAs, APIs of any kind should first and foremost be described in a machine-readable format.

They unfortunately very rarely are.

And in these days of LLMs and whatnot, the human-readable version should easily be automatically derived from the machine-readable one, not the other way round.

photonbucket - 14 days ago

I once thought it would be nice to write a toy RISC-V ISA simulator, but was also surprised and discouraged by the natural language spec.

atVelocet - 14 days ago

Would a format like SVD fit his description? And if not: Why?

Reason077 - 14 days ago

Perhaps it's time for a RISC-VI specification!

fwsgonzo - 14 days ago

I honestly think the very readable specification has been a boon for RISC-V and possibly part of the reason why people continue to find it easy to pick up. If you are unsure about something in the spec, there's also a multitude of RISC-V emulators out there, probably several in your favorite language already.