
Backend implementation

Chris Frisz edited this page Mar 11, 2022 · 4 revisions

Each processor architecture supported by Chez Scheme has a corresponding backend file. For example, s/x86_64.ss implements the x86_64 backend used by machine types such as a6le, ta6le, a6osx, and a6nt. Other parts of Chez Scheme also contain architecture-specific code, but the backend files hold the vast majority of it. When adding support for a new architecture to Chez Scheme, implementing the corresponding backend file is the bulk of the work.

Chez Scheme backend files consist of three logical sections:

  1. Register definitions
  2. Instruction definitions
  3. Assembler definitions

Section 1: Register Definitions

The first section, register definitions, specifies which registers are available on a given architecture and how Chez Scheme can use them. It is also the shortest and easiest of the three to implement, since it consists solely of a single use of the define-registers macro (defined in cpnanopass.ss).

By way of example, here is the define-registers from arm32.ss as of Chez Scheme v9.5.6:

(define-registers
  (reserved
    [%tc  %r9                   #t  9]
    [%sfp %r10                  #t 10]
    [%ap  %r5                   #t  5]
    #;[%esp]
    #;[%eap]
    [%trap %r8                  #t  8])
  (allocable
    [%ac0 %r4                   #t  4]
    [%xp  %r6                   #t  6]
    [%ts  %ip                   #f 12]
    [%td  %r11                  #t 11]
    #;[%ret]
    [%cp  %r7                   #t  7]
    #;[%ac1]
    #;[%yp]
    [     %r0  %Carg1 %Cretval  #f  0]
    [     %r1  %Carg2           #f  1]
    [     %r2  %Carg3           #f  2]
    [     %r3  %Carg4           #f  3]
    [     %lr                   #f 14] ; %lr is trashed by 'c' calls including calls to hand-coded routines like get-room
  )
  (machine-dependent
    [%sp                        #t 13]
    [%pc                        #f 15]
    [%Cfparg1 %Cfpretval %d0  %s0   #f  0] ; < 32: low bit goes in D, N, or M bit, high bits go in Vd, Vn, Vm
    [%Cfparg1b                %s1   #f  1]
    [%Cfparg2            %d1  %s2   #f  2]
    [%Cfparg2b                %s3   #f  3]
    [%Cfparg3            %d2  %s4   #f  4]
    [%Cfparg3b                %s5   #f  5]
    [%Cfparg4            %d3  %s6   #f  6]
    [%Cfparg4b                %s7   #f  7]
    [%Cfparg5            %d4  %s8   #f  8]
    [%Cfparg5b                %s9   #f  9]
    [%Cfparg6            %d5  %s10  #f 10]
    [%Cfparg6b                %s11  #f 11]
    [%Cfparg7            %d6  %s12  #f 12]
    [%Cfparg7b                %s13  #f 13]
    [%Cfparg8            %d7  %s14  #f 14]
    [%Cfparg8b                %s15  #f 15]
    [%flreg1             %d8  %s16  #f 16]
    [%flreg2             %d9  %s18  #f 18]
    ; etc.
    #;[                  %d16       #f 32] ; >= 32: high bit goes in D, N, or M bit, low bits go in Vd, Vn, Vm
    #;[                  %d17       #f 33]
    ; etc.
    ))

Register types

The syntax of define-registers categorizes registers into three types: reserved, allocable, and machine-dependent.

reserved

The registers listed in the reserved section hold values that must stay in a fixed location at all times in the Chez Scheme runtime. That is, Chez Scheme must never allocate reserved registers as temporaries. The reserved list must always include %tc (the thread context) and %sfp (the stack frame pointer), and it can include additional registers as well. For instance, arm32.ss also reserves %ap (the allocation pointer) and %trap (the trap pointer).

The data to which %tc and %sfp point is fundamental to many operations in the Chez Scheme runtime, and both registers are needed to maintain a coherent state (particularly in a threaded version of the runtime). That is why those two registers must have a fixed location. Any additional reserved register should similarly be either referenced ubiquitously or necessary for maintaining state, potentially across multiple threads.

Reserve registers with care. Every reserved register is one fewer allocable register, which gives the register allocator fewer opportunities to keep a temporary in a register and may in turn lead to less efficient programs.

allocable

The registers defined in the allocable section (and only those registers) are used by the register allocator for finding homes for values used by instructions.

machine-dependent

The machine-dependent section of define-registers is used for any other hardware registers to which the Chez Scheme runtime may refer but that are neither reserved nor allocable. This section generally includes the floating point registers: the two that Chez Scheme uses for floating point instructions, plus any floating point registers used by the foreign function interface.

Register definitions

All three types of registers use the same syntax for defining a register. Consider the definition of one of the registers from above:

[ %r0 %Carg1 %Cretval #f 0]

The symbols %r0, %Carg1, and %Cretval are symbolic names used to refer to the register in instructions. Each definition must include at least one symbolic name, which may be followed by any number of aliases for the same register.
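The aliasing described above can be sketched in plain Scheme. This is only an illustrative model, not how cpnanopass.ss represents registers: example-definitions, definition-names, and find-definition are hypothetical names, and the sketch assumes that the last two fields of each entry are the boolean and the integer of the register definition.

```scheme
;; Illustrative model only; not the representation used by cpnanopass.ss.
;; Each entry mirrors one define-registers line:
;;   name alias ... boolean integer
(define example-definitions
  '((%ac0 %r4 #t 4)
    (%r0 %Carg1 %Cretval #f 0)
    (%lr #f 14)))

;; The names of a definition are all fields except the final two.
(define (definition-names def)
  (reverse (cddr (reverse def))))

;; Return the definition that lists `name` as its first name or as an
;; alias, or #f if no definition mentions it.
(define (find-definition name defs)
  (cond
    [(null? defs) #f]
    [(memq name (definition-names (car defs))) (car defs)]
    [else (find-definition name (cdr defs))]))

;; %Cretval and %r0 resolve to the same entry.
(find-definition '%Cretval example-definitions)
```

Under this model, looking up any of %r0, %Carg1, or %Cretval yields the same entry, which is the sense in which the extra names are aliases.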

The boolean in the second-to-last position of the definition declares whether the register is "callee-save": #t means that the register is callee-save, while #f means that it is caller-save. Note that this declaration applies only to the C calling conventions for the foreign function interface and has no bearing on the calling conventions used within the Chez Scheme runtime. A callee-save register must be preserved by the called function (the callee): if the callee uses the register, it saves the value first and restores it before returning, so the register is guaranteed to hold the same value when control returns to the calling function. Conversely, a caller-save register may be overwritten by the callee, so the calling function must save any value it still needs before executing the call. Whether a register is callee- or caller-save is typically specified by the platform's C calling convention (ABI), which is often published by the processor vendor alongside the programmer's manual.
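As a small illustration of what the flag distinguishes, the following sketch lists the caller-save registers in a handful of entries copied from the arm32.ss example. The helper names (example-definitions, callee-save?, caller-save-names) are hypothetical and the list representation is an assumption made for illustration; Chez Scheme itself does not expose the definitions this way.

```scheme
;; Illustrative model only; entries copied from the arm32.ss example.
(define example-definitions
  '((%tc %r9 #t 9)
    (%ac0 %r4 #t 4)
    (%ts %ip #f 12)
    (%r0 %Carg1 %Cretval #f 0)
    (%lr #f 14)))

;; The callee-save flag is the second-to-last field of a definition.
(define (callee-save? def)
  (cadr (reverse def)))

;; First names of the registers whose values a C call may overwrite,
;; i.e., those declared with #f.
(define (caller-save-names defs)
  (map car
       (filter (lambda (def) (not (callee-save? def))) defs)))

(caller-save-names example-definitions) ; => (%ts %r0 %lr)
```

In this model, a value that is live in %ts, %r0, or %lr across a call into C would have to be saved by the calling code, whereas a value in %tc or %ac0 would survive the call.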

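The integer in the last position of the register definition is called the "machine-dependent info." Whereas the instruction definitions and assembler definitions refer to registers by symbolic name, the assembler uses the machine-dependent info integer to encode the register operands of the instructions it emits. In the arm32.ss example above, for instance, %r0 has info 0 and %sp has info 13, matching the ARM register numbers r0 and r13.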
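To make the role of the integer concrete, here is a minimal sketch of how an assembler could place it into an instruction word. It is not taken from arm32.ss: register-info, reg-mdinfo, and encode-mov are hypothetical names, and the sketch assumes the standard A32 encoding of an unconditional register-to-register MOV, where the destination register number occupies bits 12 to 15 and the source register number occupies bits 0 to 3.

```scheme
;; Illustrative only; not the assembler code in arm32.ss.
;; Assumed table from symbolic name to machine-dependent info,
;; with values copied from the arm32.ss example.
(define register-info
  '((%ac0 . 4) (%xp . 6) (%r0 . 0) (%r1 . 1)))

;; Look up the integer for a symbolic register name.
(define (reg-mdinfo name)
  (cdr (assq name register-info)))

;; Encode "mov dest, src": the base opcode word #xE1A00000, with the
;; destination number in bits 12-15 and the source number in bits 0-3.
(define (encode-mov dest src)
  (bitwise-ior #xE1A00000
               (bitwise-arithmetic-shift-left (reg-mdinfo dest) 12)
               (reg-mdinfo src)))

;; %ac0 is register 4 and %xp is register 6, so this is "mov r4, r6".
(= (encode-mov '%ac0 '%xp) #xE1A04006) ; => #t
```

The symbolic names appear only at the lookup step; everything that reaches the instruction word is derived from the integers, which is why each register definition must supply one.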