Technical information on 26/32-bit RISC OS binary interfaces
Document version 0.2a, 10 January 2003
This document summarises information on the changes to the RISC OS binary
interfaces between 26 and 32-bit versions of RISC OS. Although details may be
subject to change, it is believed accurate, and changes will not be taken
lightly. Hence, it should provide a basis for developers to ensure that their
software is compatible with future systems.
Future systems may not contain IOMD/VIDC compatible peripheral sets.
This will introduce extra issues for device drivers - normal applications
should not be accessing hardware directly, so they should not be affected.
Information on support for different peripheral worlds will be made
available at a later date.
A lot of the information in this document is at a much lower level than the
majority of software developers require. Developers producing standard
Desktop applications written in C or BASIC will probably find the basic
"32-bit forward compatibility release" ReadMe file sufficient.
A version of RISC OS will either be 26-bit, or 32-bit, with no in-between
state.
When running 32-bit, RISC OS will make no use of 26-bit modes, and will call
all routines in a 32-bit mode. Although programs could use 26-bit modes (if
available on the processor), there will be no run-time selection of whether
to call program entry points in 26 or 32 bit mode, and the memory map may
preclude the use of 26-bit modes (eg the RMA being above 64M). However, hooks
will be available to permit a 26-bit emulator.
When running 26-bit, RISC OS will make no more use of 32-bit modes than it
does currently, and the same restrictions on 32-bit code will apply as now.
Any system running on an ARM 9/10/XScale etc will have to be 32-bit.
Software should be modified to work whether it is called in a 26-bit or a
32-bit mode. For pure C applications and modules, this can just be a
recompile; the compiled code will then run on both existing 26-bit systems,
back to RISC OS 3.1, and 32-bit systems. A new Shared C Library will be
required to support 32-bit programs on old systems.
Assembler modules should also run on either 26 or 32-bit systems. However, to
achieve this, MSR and MRS instructions will often be required to manipulate
the PSR. If this cannot be avoided, the module will become incompatible with
RISC OS 3.1. (RISC OS 3.1 ran on 26-bit only ARMv2 processors, which did not
support MRS/MSR.)
Most of the differences can be hidden inside macros: use macros instead of
using TEQP directly, and the 26 or 32-bit forms can be selected at build time
by build switches. You will then have either a module that uses TEQP and
works on 26-bit versions of RISC OS, or a module that uses MSR and works on
RISC OS 3.5 or later. With a bit more effort, run-time selection is possible
(some examples are shown below).
The 32-bit RISC OS API is largely unmodified - almost all binary APIs will
act the same in a 32-bit system as they do now, except that they will be
called in a 32-bit version of the documented processor mode.
One side-effect of this is that R14 on entry to routines will contain just
the return address, with no flags. Hence to preserve flags, the CPSR must be
stacked on entry, and restored on exit. This is cumbersome, but can be hidden
inside macros. Note that this behaviour is then slightly different - you are
preserving flags across the call, not restoring the flags passed in in R14.
Most of the time this doesn't matter, as the API is documented in terms of
preserving flags. There are exceptions to this rule, notably SWI handling.
On a 32-bit system, SWIs are no longer expected to preserve the N Z and C flags.
They may still set/clear them to return results. 32-bit code should not
assume that SWIs preserve flags. Requiring flag preservation would impose
an unacceptable burden on SWI dispatch. This effectively re-specifies
*all* SWIs by changing the default rule in PRM 1-29. Also, it becomes
impossible for SWIs outside the Kernel to depend on the NZCV flags on
entry. SWIs inside the Kernel, such as OS_CallAVector, can still manipulate
flags freely. This should not be an onerous restriction; it is impossible
to specify entry flags for SWIs in C or BASIC, for example.
Many existing APIs do not actually require flag preservation, such as service
call entries. In this case, simply changing MOVS PC... to MOV PC... and LDM
{}^ to LDM {} is sufficient to achieve 32-bit compatibility.
A new flag in the module header is used to indicate that the module is 32-bit
ready. It is therefore essential that all modules are updated to add this
flag, even if they are otherwise 32-bit clean.
To just set and clear NZCV flags you can use macros which do the right thing for
the different processor types. To actually preserve flags, you will probably
be forced to use MRS and MSR instructions. These are NOPs on pre-ARM 6 ARMs,
so you may be able to do clever stuff to keep ARM2 compatibility.
Some example macros are supplied here in Libraries.Hdr - set the logical
switches No26bitCode and No32bitCode as required, then GET Hdr:CPU.Generic26
and Hdr:CPU.Generic32. "No26bitCode" means don't rely on 26-bit instructions
(eg TEQP and LDM ^) - the code will work on 32-bit systems. "No32bitCode" means
don't rely on 32-bit instructions (eg MSR and MRS) - the code will work on
RISC OS 3.1. Setting both to {TRUE} is too much for the macros to cope with -
you will have to use run-time code as shown below.
The recommended general-purpose code to check whether you're in a 26-bit mode
is:
TEQ R0, R0 ; sets Z (can be omitted if not in User mode)
TEQ PC, PC ; EQ if in a 32-bit mode, NE if 26-bit
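The test relies on how R15 reads in a 26-bit mode: as the first operand of a
data-processing instruction it yields the address bits only, while as the
second operand it yields the address with the PSR bits included. The C model
below is only an illustrative sketch of that behaviour; the function and
macro names are invented for this example. It also shows why Z is set first:
USR26 with all flags clear and interrupts enabled has no PSR bits set, so the
comparison would otherwise look like a 32-bit mode.

```c
#include <stdbool.h>
#include <stdint.h>

/* PSR bits held in R15 in a 26-bit mode: NZCVIF in bits 31-26, mode in bits 1-0. */
#define PSR26_BITS 0xFC000003u
#define Z_BIT      0x40000000u

/* Hypothetical model: returns true if "TEQ PC, PC" would leave EQ set. */
bool teq_pc_pc_gives_eq(uint32_t pc, uint32_t psr, bool mode_is_26bit)
{
    uint32_t first  = pc;   /* first operand: address bits only        */
    uint32_t second = pc;   /* second operand: same in a 32-bit mode   */

    if (mode_is_26bit) {
        first  &= ~PSR26_BITS;
        second  = first | (psr & PSR26_BITS);  /* address plus PSR bits */
    }
    return (first ^ second) == 0;
}
```

With Z set beforehand, any 26-bit mode gives NE; with no PSR bits set at all,
the model gives EQ even in a 26-bit mode, which is the USR26 case the
preceding TEQ R0, R0 guards against.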
As another example, here is the case of calling a SWI from an IRQ routine:
TEQ PC, PC
MRSEQ R8, CPSR
MOVNE R8, PC
ORR R9, R8, #3 ; IRQ26->SVC26, IRQ32->SVC32
MSREQ CPSR_c, R9
TEQNEP R9, #0
NOP ; NOP to avoid problems on ARM2
STR R14, [R13, #-4]! ; faster than STMFD on some new processors
SWI XOS_AddCallBack
LDR R14, [R13], #4 ; ditto
TEQ PC, PC
MSREQ CPSR_c, R8
TEQNEP R8, #0
NOP
(Theoretically one could engineer for the TEQP to occur before the MSR and
hence have the MSR be the required NOP for the ARM 2, but this would in
turn hit the StrongARM bug detailed below.)
The complexity of the above example arises from the need to support pre-ARM
6 processors that don't have the MRS and MSR instructions (i.e. RISC OS
3.1 machines). If RISC OS 3.1 support is not required, it reduces to:
MRS R8, CPSR
ORR R9, R8, #3 ; IRQ26->SVC26, IRQ32->SVC32
MSR CPSR_c, R9
STR R14, [R13, #-4]! ; faster than STMFD on some new processors
SWI XOS_AddCallBack
LDR R14, [R13], #4 ; ditto
MSR CPSR_c, R8
This is possible because the MRS and MSR instructions are available on ARM6
and ARM7 processors even when running in 26-bit mode.
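The ORR with #3 works in both worlds because the 26-bit and 32-bit mode
numbers share their two low bits. A minimal C sketch of that arithmetic
follows; the enum and function names are made up for illustration, and the
mode numbers are the standard ARM encodings.

```c
#include <stdint.h>

/* Mode field values: the 32-bit modes are the 26-bit ones with bit 4 set. */
enum {
    MODE_USR26 = 0x00, MODE_FIQ26 = 0x01, MODE_IRQ26 = 0x02, MODE_SVC26 = 0x03,
    MODE_USR32 = 0x10, MODE_FIQ32 = 0x11, MODE_IRQ32 = 0x12, MODE_SVC32 = 0x13
};

/* Mirrors "ORR R9, R8, #3": force the two low mode bits on,
   leaving every other PSR bit unchanged. */
uint32_t psr_to_svc(uint32_t psr)
{
    return psr | 3u;
}
```

Note that the same operation applied to a User mode PSR value would also
produce the SVC mode number, so the trick is only meaningful as a mode
switch from a privileged mode such as IRQ.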
Sometimes you may be forced to play with the SPSR registers. Beware:
interrupt code will corrupt SPSR_svc if it calls a SWI. Existing interrupt
handlers know to preserve R14_svc before calling a SWI, but not SPSR_svc.
Hence you MUST disable interrupts around SPSR manipulations; the SPSR is not
suitable as a general mechanism for PSR restoration on function return.
The I-bit in the CPSR is at a different position to the I-bit in the PC in
26-bit mode.
To disable interrupts on a 26-bit system:
MOV R14,PC ; turn off interrupts
TST R14,#&08000000
TEQEQP R14,#&08000000
...
TEQP R14,#0 ; restore interrupt state
If RISC OS 3.1 support is not required, this can be replaced with:
MRS R14,CPSR
ORR R0, R14, #&80
MSR CPSR_c, R0 ; not conditional for StrongARM (see below)
...
MSR CPSR_c, R14 ; restore interrupt state
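The two masks used above can be summarised in C as follows. This is only a
sketch to make the differing bit positions explicit; the macro and function
names are assumptions for this example.

```c
#include <stdint.h>

#define I_BIT_26 0x08000000u  /* I flag within R15 in a 26-bit mode (bit 27) */
#define I_BIT_32 0x00000080u  /* I flag within the CPSR (bit 7)              */

/* Value to write back so that IRQs are disabled, other bits unchanged. */
uint32_t irqs_disabled_26(uint32_t r15)
{
    return r15 | I_BIT_26;
}

uint32_t irqs_disabled_32(uint32_t cpsr)
{
    return cpsr | I_BIT_32;
}
```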
The StrongARM has a bug in its implementation of MSR which should be borne in
mind, particularly when attempting to write 26/32-bit neutral code (cf. the
SWI call from IRQ mode example above).
The bug is triggered when:
- the processor is in a privileged mode,
- the last instruction was an MSR to the CPSR with the 'c' bit set,
- the MSR was not executed due to the condition test failing,
- and the MSR was not the last instruction in a cache line.
In this circumstance, the instruction following the MSR will be executed
twice. To avoid this problem, there are a number of approaches, depending
on how cautious you wish to be:
- follow all MSR CPSR_c[xsf] instructions with a NOP, much like TEQP;
- don't use conditional MSR CPSR_c[xsf] instructions;
- follow all conditional MSR CPSR_c[xsf] instructions with a NOP;
- ensure all instructions following conditional MSR CPSR_c[xsf] instructions
are idempotent, i.e. can be executed multiple times without ill effects.
This then excludes instructions like BL or SWI, instructions with the
same source and destination register, or loads and stores with base
register writeback.
This bug only affects the StrongARM processor, but is present in all
current revisions.
Most module entries are treated the same in the 32-bit world, except they
will be entered in a 32-bit mode, and hence R14 will be a return address
with no flags. This section also clarifies flag significance on 26-bit
systems.
Any given system will only be 26 or 32-bit, so it is possible to note the
system type in your initialisation routine by checking the processor mode,
rather than checking on every entry point.
Unchanged. Flag preservation not required - only V on exit is looked at.
Unchanged. Flags on exit ignored.
Unchanged. Flag preservation not required - only V on exit is looked at.
Unchanged. Flag preservation not required - V on exit is looked at in number
to text case.
On 26 bit systems, R14 is a return address (inside the Kernel) with the
user's NZCIF flags in it, V clear, mode set to SVC. The CPSR NZCV
flags on exit are then passed back to the SWI caller. Hence MOVS PC,R14
preserves the SWI caller's NZC flags and clears V. The NZ and C flags in the
current PSR on entry are undefined, and are NOT the caller's (but V is
clear). Thus you can simply read, modify and preserve the caller's flags.
On 32 bit systems, R14 is a return address only. There is no way of
determining the caller's flags, so you are not expected to preserve them. The
NZC and V flags you exit with will be returned to the caller.
If writing a new module, simply specify that all your SWIs corrupt flags,
then your SWI dispatchers can return with MOV PC,R14, regardless of whether
running on a 26 or 32 bit system.
If converting an existing module to run on 32-bit, it is highly recommended
that the same binary continue to work on 26-bit systems. You should therefore
take steps to preserve flags when running in a 26-bit mode, if the module did
before. When running on a 32-bit system, you needn't preserve flags. The
following wrapper around the original SWI entry (converted to be 32-bit safe)
achieves this, assuming you always want NZ preserved on a 26-bit system.
Push R14
BL Original_SWI_Code ; NZ(C) corrupted, (C)V set
Pull R14
TEQ PC,PC ; are we in a 32-bit mode?
MOVEQ PC,R14 ; 32-bit exit: NZ corrupted, CV passed back
[ PassBackC
BICCC R14,R14,#C_bit ; Extra guff to pass back C as well
ORRCS R14,R14,#C_bit
]
MOVVCS PC,R14 ; 26-bit exit: NZ(C) preserved, V clear
ORRVSS PC,R14,#V_bit ; 26-bit exit: NZ(C) preserved, V set
This is cumbersome, but it can be removed when backwards compatibility is no
longer desired. The alternative, which would be to pass in caller flags in
R14, would impose a permanent carbuncle on the 32-bit API.
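The effect of the 26-bit exit path in the wrapper can be modelled in C as
below. This is a hedged sketch rather than part of any supplied library: the
function name and parameters are hypothetical, and the bit positions are
those of the C and V flags in a 26-bit R14.

```c
#include <stdbool.h>
#include <stdint.h>

#define C_BIT 0x20000000u  /* bit 29 of a 26-bit R14 */
#define V_BIT 0x10000000u  /* bit 28 of a 26-bit R14 */

/* Hypothetical model of the 26-bit exit: r14 is the entry R14 (caller's
   NZC, V clear); c_out and v_out are the flags left by the original SWI
   code. Returns the value that ends up in PC plus PSR on exit. */
uint32_t swi_exit_r14_26(uint32_t r14, bool c_out, bool v_out, bool pass_back_c)
{
    if (pass_back_c) {
        r14 = c_out ? (r14 | C_BIT) : (r14 & ~C_BIT);  /* BICCC / ORRCS   */
    }
    return v_out ? (r14 | V_BIT) : r14;                /* ORRVSS / MOVVCS */
}
```

For example, with the caller's N and C set on entry and an error returned
without PassBackC, the result keeps N and C and gains V.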
This is a new module header entry at &30. It is an offset to the module
flags word(s). The first module flag word looks like:
Bit 0       Module is 32-bit compatible
Bits 1-31   Reserved (0)
Non 32-bit compatible modules will not be loaded by a 32-bit RISC OS.
If no flags word is present, the module is assumed to be 26-bit only (that
is, not 32-bit compatible).
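A tool that inspects module files might test the flag along the following
lines. This is a minimal sketch under stated assumptions: the function and
macro names are invented, the image is treated as little-endian bytes, and a
header too short to reach &30, a zero offset, or an offset outside the image
are all taken to mean "no flags word" (the document does not say how absence
is detected).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODULE_FLAGS_ENTRY   0x30u  /* header entry holding offset to flags */
#define MODULE_FLAG_32BIT_OK 0x1u   /* bit 0 of the first flags word        */

/* Read a little-endian 32-bit word from a byte buffer. */
static uint32_t read_word(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns true only if a flags word is found and has bit 0 set. */
bool module_is_32bit_ok(const uint8_t *image, size_t size)
{
    uint32_t offset;

    if (size < MODULE_FLAGS_ENTRY + 4)
        return false;                  /* assumed: header has no flags entry */
    offset = read_word(image + MODULE_FLAGS_ENTRY);
    if (offset == 0 || (offset & 3u) != 0 || offset > size - 4)
        return false;                  /* assumed: absent or invalid offset  */
    return (read_word(image + offset) & MODULE_FLAG_32BIT_OK) != 0;
}
```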
32 bit system: Now called in UND32 mode. No preveneer.
26 bit: as before
32 bit system: Now called in ABT32 mode. No preveneer.
26 bit: as before
On a 32-bit system, there will be an Abort mode stack.
32 bit system: USR32 mode. PC contains no PSR flags.
26 bit: as before - PC contains PSR flags, but may not be reliable.
32 bit system: register block must be 17 words long.
contains R0-R15,CPSR.
entered in SVC32 mode
26 bit: as before
Handlers can check format by looking at mode on entry to handler - the correct
26 or 32-bit version of the handler should be called as appropriate.
The following code is suitable to restore the user registers and return in
the 32-bit case:
ADR R14, saveblock ; get address of saved registers
LDR R0, [R14, #16*4] ; get user PSR
MRS R1, CPSR ; get current PSR
ORR R1, R1, #&80 ; disable interrupts to prevent
MSR CPSR_c, R1 ; SPSR_SVC corruption by IRQ code
MSR SPSR_cxsf, R0 ; put it into SPSR_SVC
LDMIA R14, {R0-R14}^ ; load user registers
MOV R0, R0 ; no-op after forcing user mode
LDR R14, [R14, #15*4] ; load user PC into R14_SVC
MOVS PC, R14 ; return to correct address and mode
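The offsets 15*4 and 16*4 used above follow from the 17-word block layout
(R0-R15, then the CPSR). For code that handles such a block from C, the
layout could be described as in this sketch; the type and function names are
assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C view of the 17-word block: R0-R15 followed by the CPSR. */
typedef struct {
    uint32_t r[16];   /* R0-R15; r[15] is the saved PC (offset 15*4) */
    uint32_t cpsr;    /* saved PSR (offset 16*4)                     */
} regblock32;

/* Accessors matching the two loads in the assembler above. */
uint32_t regblock_saved_pc(const regblock32 *block)
{
    return block->r[15];
}

uint32_t regblock_saved_psr(const regblock32 *block)
{
    return block->cpsr;
}
```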
32 bit system: as before, but called in SVC32
32 bit system: as before, but in IRQ32 or SVC32
32 bit system: as before, but in USR32
26 bit system: called in SVC26 mode.
R14 = a return address in the Kernel, with NZCVIF flags the
same as the caller's (except V clear).
PSR flags undefined (except I+F as caller)
32 bit system: called in SVC32 mode.
R14 = return address in the Kernel
No way to determine caller condition flags
PSR flags undefined (except I+F as caller)
32 bit system: as before, but SVC32 mode
32 bit system: register block must be 17 words long.
contains R0-R15,CPSR.
entered in SVC32 mode, IRQs disabled
26 bit: as before
Handlers can check format by looking at mode on entry to handler.
The following code is suitable to restore the user registers and return
in the 32-bit case:
ADR R14, saveblock ; get address of saved registers
LDR R0, [R14, #16*4] ; get user PSR
MSR SPSR_cxsf, R0 ; put it into SPSR_SVC/IRQ
LDMIA R14, {R0-R14}^ ; load user registers
MOV R0, R0 ; no-op after forcing user mode
LDR R14, [R14, #15*4] ; load user PC into R14_SVC/IRQ
MOVS PC, R14 ; return to correct address and mode
32 bit system: block must be 17 words long.
will contain R0-R15,PSR
Exception handlers can determine block format by looking at mode on entry
to handler.
Software vectors have a number of different properties. They can be called
under a variety of conditions, and the flags they exit with may or may not
be significant.
When called using OS_CallAVector, the caller's NZCV flags always used to be
passed in in R14, and the claimant's flags on exit would be passed back.
In a 32-bit system, the caller's flags are not passed in R14. Their C and V
flags are visible in the PSR though, just as in a 26-bit system. N and Z are
not visible. Again, exit flags are passed back.
Most vectors are not intended to be called with OS_CallAVector, and their
exit flags have never had significance, for example KeyV, EventV and TickerV.
Others are vectored SWIs, such as ByteV and ReadLineV. These pass back
C and V flags only.
A few vectors, like RemV, attach significance to entry flags. If not
claiming, you mustn't change those flags for the next callee. In 26-bit mode
this might have been achieved by:
CMP R1,#mybuffer
MOVNES PC,LR
In the 32-bit world, you could change the CMP to a TEQ to preserve C and V,
or you could use something like:
Push R14
MRS R14, CPSR
CMP R1, #maxbuffers
BLS handleit
MSR CPSR_f, R14
Pull PC
handleit
...
Expansion card headers may contain loaders. These must be 32-bit compatible
to work on a 32-bit system. 32-bit compatibility is indicated by the fifth
word of the loader header containing "32OK". Expansion card loader entry
points are always called with V in the PSR being clear, even on existing
systems, so MOV PC, R14 is an adequate non-error return in the simplest case,
rather than the BICS PC, R14, #V_bit shown in the PRM. Loaders that are not
32-bit compatible will be faulted with an error.
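A check for the marker might look like the following C sketch. It rests on
two assumptions that the text above does not settle: that the supplied
pointer addresses the start of the loader header, and that "32OK" is stored
as the ASCII bytes '3', '2', 'O', 'K' in ascending address order. The
function and macro names are invented for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define LOADER_32OK_OFFSET (4u * 4u)  /* fifth word: byte offset 16 */

/* Assumption: the marker is the four ASCII bytes "32OK" in memory order. */
bool loader_is_32bit_ok(const unsigned char *loader, size_t size)
{
    if (size < LOADER_32OK_OFFSET + 4)
        return false;                 /* too short to hold a fifth word */
    return memcmp(loader + LOADER_32OK_OFFSET, "32OK", 4) == 0;
}
```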
The memory map of a 32-bit system will be considerably different. In
particular, the RMA, screen memory, ROM and usually I/O space will all be
differently located. Application space will remain based at &8000. Use system
calls to find the address of locations.
Privileged mode stacks (SVC, IRQ, UND and ABT) will move, but will remain
based on a one megabyte boundary.
IOMD-compatible systems may optionally map I/O space at the traditional
locations (&03000000 and &88000000). However, this disrupts the memory map and
limits the maximum size of applications.
The behaviour of some SWIs has been changed for 32-bit systems. For example, the
entry parameters to OS_HeapSort, OS_SubstituteArgs and OS_ReadLine use a
register to hold a 26 bit address with flags in the spare bits. In a 32-bit
environment the register can only be used to hold the address and the flags
will move to another register. Details of which calls have changed will
be supplied in a separate document.
For backwards compatibility a new module will be developed for existing
versions of RISC OS. It will intercept calls to these SWIs and if it sees the
new format in use it will adjust the registers before calling the original
SWI. This module can be distributed with any software that uses the new form
of the SWIs. Alternatively, software may be written to adapt depending on the
version of the OS that is detected.
On a 32-bit system, if called in a 26-bit mode, takes you into SVC26, else
into SVC32.
v0.2 5 Sep 2002 (PS):
Added IRQ example code.
Added NOP after TEQP to example calling a SWI
from an IRQ routine.
Added details of OS_ReadLine and SWI changes.
© 2006 IYONIX Ltd