Skip to content

Conversation

AnimeshAgarwal28
Copy link

@AnimeshAgarwal28 AnimeshAgarwal28 commented Oct 2, 2025

This PR implements GPR definition in the Unified Database, addressing issue #1085.

Problem

The UDB currently lacks information about General Purpose Registers in YAML format.

Solution

This PR adds structured register file support to UDB, starting with RISC-V general purpose registers as a foundation for future register file additions.

Changes

New Files

  • spec/schemas/register_schema.json: JSON schema defining the structure for YAML register file.
  • spec/std/isa/register/gpr.yaml: Complete RISC-V general purpose register file with all 32 registers, proper ABI mnemonics, calling convention roles, and conditional support for RV32E (16 registers)
  • tools/ruby-gems/udb/lib/udb/obj/register_file.rb: RegisterFile class extending TopLevelDatabaseObject for programmatic access to register file data

Modified Files

  • spec/schemas/schema_defs.json: Added register-specific schema definitions
  • spec/std/isa/README.adoc: Updated architecture documentation to include register files alongside extensions/instructions/CSRs with usage examples
  • tools/ruby-gems/udb/lib/udb/obj/database_obj.rb: Added RegisterFile kind to DatabaseObject::Kind enum for type system integration
  • tools/ruby-gems/udb/lib/udb/architecture.rb: Added register file loading support to architecture system

Register Details

The GPR implementation includes:

  • All 32 general purpose registers (x0-x31) with standard names
  • Proper ABI mnemonics (zero, ra, sp, gp, tp, t0-t6, s0-s11, a0-a7)
  • Calling convention classifications (caller/callee saved, arguments, return values)
  • Special register roles (zero register, stack pointer, frame pointer, etc.)
  • Conditional presence for RV32E embedded profile

Benefits

  • Single Source of Truth: Eliminates need for hardcoded register mappings in downstream tools
  • Consistency: Ensures all tools use identical register information
  • Automation Ready: Structured format enables automatic code generation

Future Work

This establishes the foundation for adding other register files mentioned in issue #1085:

  • Floating Point Registers
  • Vector Registers

Closes #1085

Implements General Purpose Register (GPR) support in the Unified
Database as requested in issue riscv-software-src#1085.

Changes:
- spec/schemas/register_schema.json: New schema defining structure for
  register files
- spec/schemas/schema_defs.json: Add register-specific schema definition
- spec/std/isa/register/gpr.yaml: New register file implementing all 32
  RISC-V general purpose registers with proper ABI mnemonics, calling
  convention roles, and conditional support for RV32E
- spec/std/isa/README.adoc: Update architecture documentation to include
  register files
- tools/ruby-gems/udb/lib/udb/obj/register_file.rb: New RegisterFile
  class extending TopLevelDatabaseObject
- tools/ruby-gems/udb/lib/udb/obj/database_obj.rb: Add RegisterFile kind
  to DatabaseObject::Kind enum for type system integration
- tools/ruby-gems/udb/lib/udb/architecture.rb: Add register file loading
  support to architecture

The implementation follows RISC-V ABI specifications and provides a
single source of truth for GPR information.

Resolves Issue: riscv-software-src#1085
@AnimeshAgarwal28 AnimeshAgarwal28 changed the title Add GPR Information to UDB feat: Add GPR Information to UDB Oct 2, 2025
@AnimeshAgarwal28 AnimeshAgarwal28 changed the title feat: Add GPR Information to UDB feat: add GPR Information to UDB Oct 2, 2025
Copy link
Collaborator

@ThinkOpenly ThinkOpenly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is, frankly, pretty awesome! Great work, @AnimeshAgarwal28 !

There are a few comments that need discussion, but this is an impressive start!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since CSRs are also "registers", maybe we use a slightly more specific file name, like "register_file_schema.json" or "regfile_schema.json"?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

"description": "Schema for describing a register file",

"$defs": {
"count_variant": {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure the best way to do this. I'd prefer to put the burden of defining the dimension of the register file on whatever needs to change it from a default of 32, like the "E" extension in this case. I'd prefer to avoid the definition of "variants" here, allow the register files to specify a default value, and have a configuration parameter override this. You could even have the extension include a "params_contraints" to enforce the setting, but then the burden is on every configuration that includes support for "E" to define the dimension of the GPR regfile. I don't think this is ideal, either. Might need some discussion.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I think this is the trickiest part; namely, where to put the condition(s).

At least two "generic" options that would probably be future-proofed:

  1. Condition the entire "meat" of the definition, so you would basically define everything twice (or more) when variants are needed.
  2. Condition each register separately. I see you already have this ("when" in the register definition).

I could go either way.

"temporary"
]
},
"uniqueItems": true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't hurt. I don't think it will reject multiple register definitions with the same index if they have different names (or same name and different ABI mnemonics), right? We may want to validate that externally, OR, do we even need to have an index attribute, given that this is an (ordered) array?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right. Duplicate indices would need to be validated outside JSON schema, especially since it might be valid to have entries with the same index before conditions are applied.

Copy link
Author

@AnimeshAgarwal28 AnimeshAgarwal28 Oct 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now when I think of it, index is not needed even if it is not an "ordered" array because we always have the register name as the "unique key".


=== Instructions

=== Registers
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
=== Registers
=== Register Files

Comment on lines 50 to 52
"length": {
"$ref": "schema_defs.json#/$defs/bit_length_value"
},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we want to allow this. Having a register file where each entry can have its own length might make the code which uses them harder to write correctly. Maybe we remove this unless and until it's needed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any problem in removing this.

Comment on lines 173 to 174
<1> Register files are defined independently of the resolved XLEN; individual register entries
may override the default bit length when necessary.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that first phrase true?

I suggest we remove that second phrase and make all registers in each register file the same bit width.

Copy link
Author

@AnimeshAgarwal28 AnimeshAgarwal28 Oct 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a better way to convey the first phrase is to say: Register files use symbolic width values (XLEN) that are resolved during configuration.

As for the second phrase, yes you are right I shall drop it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Register files are defined independently of the resolved XLEN"

That's murky. I would just be more explicit: registers can have either a fixed architecture width or take their width from a parameter (e.g., MXLEN, VLEN).

Csr = new("csr")
CsrField = new("csr_field")
Extension = new("extension")
RegisterFile = new("register file")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
RegisterFile = new("register file")
RegisterFile = new("register_file")

?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an overlook on my side, I shall correct this.

Copy link
Collaborator

@dhower-qc dhower-qc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent first draft!

In addition to the inline comments, a few high-level thoughts:

To really get this integrated, we'll want to reflect it in a few places where the X registers are built in concepts. A few that come to mind: the IDL symbol table and the generated ISS.

In the symbol table, you'll find this:

@scopes = [{
"X" => Var.new(
"X",
Type.new(:array, sub_type: XregType.new(@mxlen.nil? ? 64 : @mxlen), width: 32, qualifiers: [:global])
),
"XReg" => XregType.new(@mxlen.nil? ? 64 : @mxlen),

That's where "X" gets added in global scope as an array of Bits. Instead of that, we'll want to construct (all of) the register files based on the YAML. Hopefully you can get some ideas of to do so looking around here where the symbol table is instantiated:

@symtab =
Idl::SymbolTable.new(
mxlen:,
possible_xlens:,
params:,
builtin_funcs: symtab_callbacks,
builtin_enums: [
Idl::SymbolTable::EnumDef.new(
name: "ExtensionName",
element_values: (1..extensions.size).to_a,
element_names: extensions.map(&:name)
),
Idl::SymbolTable::EnumDef.new(
name: "ExceptionCode",
element_values: exception_codes.map(&:num),
element_names: exception_codes.map(&:var)
),
Idl::SymbolTable::EnumDef.new(
name: "InterruptCode",
element_values: interrupt_codes.map(&:num),
element_names: interrupt_codes.map(&:var)
)
],
name: @name,
csrs:
)
overlay_path = config.info.overlay_path

In the ISS, the X registers are defined on a "hart" object, as you can see here:

uint64_t xreg(unsigned num) const override {
if (num >= 32) {
throw std::out_of_range("X register indices are 0 - 31, inclusive");
}
return _xreg(num).get();
}
PossiblyUnknownBits<MXLEN> _xreg(unsigned num) const {
return m_xregs[num];
}
template <template <unsigned, bool> class BitsClass, unsigned N, bool Signed>
requires (BitsType<BitsClass<N, Signed>>)
PossiblyUnknownBits<MXLEN> _xreg(const BitsClass<N, Signed>& num) const {
return m_xregs[num.get()];
}
// XRegister<MXLEN>& xregRef(unsigned num) { return m_xregs[num]; }
void set_xreg(unsigned num, uint64_t value) override {
if (num >= 32) {
throw std::out_of_range("X register indices are 0 - 31, inclusive");
}
_set_xreg(Bits<8>{num}, Bits<MXLEN>{value});
}
template <
template <unsigned, bool> class IdxType, unsigned IdxN, bool IdxSigned,
template<unsigned, bool> class ValueType, unsigned ValueN, bool ValueSigned
>
requires (BitsType<IdxType<IdxN, IdxSigned>> && BitsType<ValueType<ValueN, ValueSigned>>)
void _set_xreg(const IdxType<IdxN, IdxSigned>& num, const ValueType<ValueN, ValueSigned>& value) {
if (num != 0_b) {
m_xregs[static_cast<unsigned>(num.get())] = value;
}
}

<%- if cfg_arch.mxlen.nil? -%>
std::array<PossiblyUnknownRuntimeBits<64>, 32> m_xregs;
<%- else -%>
std::array<PossiblyUnknownBits<MXLEN>, 32> m_xregs;
<%- end -%>

We'll want to think this one through a bit more with @henrikg-qc

"description": "Schema for describing a register file",

"$defs": {
"count_variant": {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I think this is the trickiest part; namely, where to put the condition(s).

At least two "generic" options that would probably be future-proofed:

  1. Condition the entire "meat" of the definition, so you would basically define everything twice (or more) when variants are needed.
  2. Condition each register separately. I see you already have this ("when" in the register definition).

I could go either way.

},
"register_entry": {
"type": "object",
"required": ["name", "index"],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something to think about:

If we condition the entire "meat", the index key may not be needed since it could be inferred from the register array.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is true, name is already enough of a unique key.

"name": {
"$ref": "schema_defs.json#/$defs/register_name"
},
"abi_mnemonic": {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lenary, are you aware of any cases where a register might have different abi names under different conditions?

Comment on lines 44 to 48
"definedBy": {
"$ref": "schema_defs.json#/$defs/requires_entry"
},
"when": {
"$ref": "schema_defs.json#/$defs/requires_entry"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we'll need both definedBy and when; with #891, they would both be generic conditions

Comment on lines 66 to 67
"caller_saved",
"callee_saved",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that would be a new thing for us; we don't generally use defaults in the existing schemas.

Comment on lines +111 to +112
when:
not: { name: E }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After #891, this would be:

  when:
    not:
      extension:
        name: E

Comment on lines 173 to 174
<1> Register files are defined independently of the resolved XLEN; individual register entries
may override the default bit length when necessary.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Register files are defined independently of the resolved XLEN"

That's murky. I would just be more explicit: registers can have either a fixed architecture width or take their width from a parameter (e.g., MXLEN, VLEN).

Comment on lines 17 to 18
sig { returns(T::Hash[String, T.untyped]) }
attr_reader :raw
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, let's name this #data, which will correspond to the same thing you get from TopLevelDatabaseObject

Comment on lines 38 to 39
sig { returns(T::Array[String]) }
def roles = @raw.fetch("roles", [])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more work, but also better checked if you return a Sorbet enum here.

"const": "register_file"
},
"name": {
"$ref": "schema_defs.json#/$defs/register_name"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

register_file_name, which should be different than register_name.

So far, all register file names are a single uppercase letter.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far, all register file names are a single uppercase letter.

Is that true? We only have two so far, right? X and f. From fadd.s.yaml, for example:

f[fd] = f32_add(f[fs1], f[fs2], mode);

I guess we're expecting the vector register file to be V, although I don't know if that exists yet.

@dhower-qc dhower-qc requested a review from henrikg-qc October 3, 2025 14:42
Changes:
- rename schema from register_schema.json to register_file_schema.json,
  remove "$ref": "#/$defs/register_file" from bottom of the schema,
  and introduce register_file_name.
- drop per-register length and index fields, make abi mnemonics an
  array, and add 'caller_saved'/'callee_saved' booleans with default
  value of 'false'.
- remove the 'count' helper in favor of conditioning individual
  register entries directly and infer indices from position.
- fix the description for XLEN behavior and remove empty role arrays.
- update the register-file section of README.adoc so the example
  mirrors the new schema.
- register_file.rb: expose register data via '#data' and return
  Sorbet enum values for roles.

Signed-off-by: Animesh Agarwal <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add GPR Information to UDB

3 participants