PSX Notes
This book is a collection of notes related to the PSX (PlayStation 1), geared towards emulator development. It is intended to be used as a complementary resource to the very good PSX-SPX.
note
While the information in this book is supposed to be correct, this is only ensured on a best-effort basis. If you find something wrong or missing, please open an issue or create a pull request.
CPU
Notes related to the CPU of the PSX.
Encoding of BLTZ/BGEZ/BLTZAL/BGEZAL
Both PSX-SPX and the R3000 manual specify these instructions as having the following encoding:
000001 | rs | 00000| <--immediate16bit--> | bltz
000001 | rs | 00001| <--immediate16bit--> | bgez
000001 | rs | 10000| <--immediate16bit--> | bltzal
000001 | rs | 10001| <--immediate16bit--> | bgezal
This is only partially correct. The true encoding is the following:
000001 | rs | xxxx0| <--immediate16bit--> | bltz
000001 | rs | xxxx1| <--immediate16bit--> | bgez
000001 | rs | 10000| <--immediate16bit--> | bltzal
000001 | rs | 10001| <--immediate16bit--> | bgezal
Where xxxx is any bit sequence other than 1000.
This behaviour is required for passing the b_0xXX_y tests in Amidog's CPU test.
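The decoding rule above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function name decode_bcondz and its return shape are hypothetical, chosen only for this example.

```python
from typing import Tuple


def decode_bcondz(instruction: int) -> Tuple[str, int, int]:
    """Decode an instruction with primary opcode 000001.

    Returns (mnemonic, rs, imm16). Hypothetical helper for illustration.
    """
    assert (instruction >> 26) == 0b000001, "not a BLTZ/BGEZ family instruction"
    rs = (instruction >> 21) & 0x1F
    selector = (instruction >> 16) & 0x1F  # the 5-bit field after rs
    imm16 = instruction & 0xFFFF

    # Lowest bit of the field picks the condition: 0 = BLTZ, 1 = BGEZ.
    is_bgez = (selector & 0b00001) != 0
    # The branch links only when the upper four bits are exactly 1000.
    links = (selector & 0b11110) == 0b10000

    mnemonic = ("bgez" if is_bgez else "bltz") + ("al" if links else "")
    return mnemonic, rs, imm16
```

With this rule, a field value such as 10010 decodes as a plain bltz, since its upper four bits are 1001 rather than 1000.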
Unaligned loads and stores
The unaligned load and store instructions (LWL, LWR, SWL and SWR) are surprisingly confusing, but become easier to grasp once you understand their use case: performing unaligned reads and writes from and to memory.
LWL
Description
LWL's goal is reading the "left" (i.e. higher order) part of an unaligned word in memory into the "left" part of a register.
After computing the address addr to operate on, LWL performs the following:
- Iterate through the bytes of rt, from highest order to lowest (i.e. big endian)
- At the same time, also iterate through bytes in memory starting at addr and decreasing (since the R3000 is little endian, this is also highest order to lowest)
- In each iteration, set the byte of rt to the byte in memory
- Stop once the byte address crosses a word boundary (in practice, this means you'll iterate exactly addr % 4 + 1 times)
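The steps above can be written as a literal byte-by-byte loop. This is a minimal sketch, assuming memory is a flat bytes object indexed from address 0 and registers are plain 32-bit integers; the function name and signature are made up for this illustration.

```python
def lwl(memory: bytes, rt: int, addr: int) -> int:
    """Return the new value of rt after `lwl rt, addr` (illustrative model)."""
    reg_byte = 3  # start at the highest-order byte of rt
    while True:
        shift = reg_byte * 8
        # Replace one byte of rt with the byte currently addressed in memory.
        rt = (rt & ~(0xFF << shift) & 0xFFFFFFFF) | (memory[addr] << shift)
        if addr % 4 == 0:  # the next step would cross the word boundary
            break
        addr -= 1
        reg_byte -= 1
    return rt
```

The loop body runs once for addr % 4 == 0 and four times for addr % 4 == 3, in which case the whole register is overwritten.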
Examples
// starting state
memory: [0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, ..]
> address: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ..]
> word_index: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, ..]
rt: 0xDEAD_BEEF
// operation: load left from address 6
lwl rt, 6($0)
// finishing state
memory: [0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, ..]
> address: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ..]
> word_index: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, ..]
rt: 0x6655_44EF
// starting state
memory: [0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, ..]
> address: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ..]
> word_index: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, ..]
rt: 0xDEAD_BEEF
// operation: load left from address 4
lwl rt, 4($0)
// finishing state
memory: [0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, ..]
> address: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ..]
> word_index: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, ..]
rt: 0x44AD_BEEF
LWR
Description
LWR's goal is reading the "right" (i.e. lower order) part of an unaligned word in memory into the "right" part of a register.
After computing the address addr to operate on, LWR performs the following:
- Iterate through the bytes of rt, from lowest order to highest (i.e. little endian)
- At the same time, also iterate through bytes in memory starting at addr and increasing (since the R3000 is little endian, this is also lowest order to highest)
- In each iteration, set the byte of rt to the byte in memory
- Stop once the byte address crosses a word boundary (in practice, this means you'll iterate exactly 4 - addr % 4 times)
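As with LWL, the steps above translate directly into a byte-by-byte loop. This is a minimal sketch under the same assumptions (flat bytes memory starting at address 0, 32-bit integer registers, hypothetical function name).

```python
def lwr(memory: bytes, rt: int, addr: int) -> int:
    """Return the new value of rt after `lwr rt, addr` (illustrative model)."""
    reg_byte = 0  # start at the lowest-order byte of rt
    while True:
        shift = reg_byte * 8
        # Replace one byte of rt with the byte currently addressed in memory.
        rt = (rt & ~(0xFF << shift) & 0xFFFFFFFF) | (memory[addr] << shift)
        if addr % 4 == 3:  # the next step would cross the word boundary
            break
        addr += 1
        reg_byte += 1
    return rt
```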
Examples
// starting state
memory: [0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, ..]
> address: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ..]
> word_index: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, ..]
rt: 0xDEAD_BEEF
// operation: load right from address 3
lwr rt, 3($0)
// finishing state
memory: [0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, ..]
> address: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ..]
> word_index: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, ..]
rt: 0xDEAD_BE33
// starting state
memory: [0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, ..]
> address: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ..]
> word_index: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, ..]
rt: 0xDEAD_BEEF
// operation: load right from address 1
lwr rt, 1($0)
// finishing state
memory: [0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, ..]
> address: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ..]
> word_index: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, ..]
rt: 0xDE33_2211
LWL and LWR example usage together
// starting state
memory: [0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, ..]
> address: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ..]
> word_index: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, ..]
rt: 0xDEAD_BEEF
// let's load the word at address 1
// operation: load right from address 1
lwr rt, 1($0)
// operation: load left from address 4
lwl rt, 4($0)
// finishing state
memory: [0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, ..]
> address: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ..]
> word_index: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, ..]
rt: 0x4433_2211
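Emulators often avoid a per-byte loop and compute the same result with one aligned word read plus a shift and a mask. The sketch below shows that closed form and the LWR/LWL pairing from the example above. All names here (read_word, load_unaligned_word) are hypothetical, and memory is assumed to be a flat bytes object starting at address 0.

```python
def read_word(memory: bytes, addr: int) -> int:
    """Read the aligned little-endian word that contains addr."""
    aligned = addr & ~3
    return int.from_bytes(memory[aligned:aligned + 4], "little")


def lwl(memory: bytes, rt: int, addr: int) -> int:
    shift = (3 - addr % 4) * 8
    keep = (1 << shift) - 1  # low-order bytes of rt that stay untouched
    return ((read_word(memory, addr) << shift) & 0xFFFFFFFF) | (rt & keep)


def lwr(memory: bytes, rt: int, addr: int) -> int:
    shift = (addr % 4) * 8
    keep = ~(0xFFFFFFFF >> shift) & 0xFFFFFFFF  # high-order bytes of rt that stay
    return (read_word(memory, addr) >> shift) | (rt & keep)


def load_unaligned_word(memory: bytes, addr: int) -> int:
    """Load the word at any address using the LWR + LWL pair."""
    rt = 0xDEADBEEF                  # old register contents, fully overwritten
    rt = lwr(memory, rt, addr)       # low-order part, up to the word boundary
    rt = lwl(memory, rt, addr + 3)   # high-order part, from the next word if needed
    return rt
```

Calling load_unaligned_word(memory, 1) on the memory shown above reproduces the 0x4433_2211 result of the example.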
note
Both LWL and LWR are special cased in the hardware to be able to bypass the load delay slot.
This means that performing any load to a register and then immediately using either LWL or LWR to load into the same register will have the second load use the value loaded by the first one, even though it is in the delay slot.
At the same time, however, a load cancel will happen, so the instruction right after the second load will not see the value loaded by the first instruction.
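One way an emulator might model this bypass is to pick the merge input for LWL/LWR from the in-flight load when it targets the same register. The sketch below is only an assumption about how such a model could look: the pending-load representation as a (register, value) pair and the function name are hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical model: the load issued by the previous instruction, if any.
PendingLoad = Optional[Tuple[int, int]]  # (target register index, loaded value)


def lwl_lwr_merge_source(regs: List[int], pending: PendingLoad, rt: int) -> int:
    """Pick the old value of rt that LWL/LWR merges new bytes into."""
    if pending is not None and pending[0] == rt:
        # Bypass: use the value still in flight from the previous load.
        return pending[1]
    return regs[rt]
```

In this model the LWL/LWR result then becomes the new pending load and replaces the old one, so the first load's value is never written to the register file on its own.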
SWL and SWR
These two instructions work exactly like their load counterparts, except that instead of loading the bytes from memory and putting them into the bytes of rt, they store the bytes of rt in memory.
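Mirroring the load loops, the stores can be sketched by swapping the direction of the byte copy. This is an illustrative model only, assuming memory is a mutable bytearray starting at address 0; the function names and signatures are hypothetical.

```python
def swl(memory: bytearray, rt: int, addr: int) -> None:
    """Store the high-order bytes of rt, walking down to the word boundary."""
    reg_byte = 3  # start at the highest-order byte of rt
    while True:
        memory[addr] = (rt >> (reg_byte * 8)) & 0xFF
        if addr % 4 == 0:  # the next step would cross the word boundary
            break
        addr -= 1
        reg_byte -= 1


def swr(memory: bytearray, rt: int, addr: int) -> None:
    """Store the low-order bytes of rt, walking up to the word boundary."""
    reg_byte = 0  # start at the lowest-order byte of rt
    while True:
        memory[addr] = (rt >> (reg_byte * 8)) & 0xFF
        if addr % 4 == 3:  # the next step would cross the word boundary
            break
        addr += 1
        reg_byte += 1
```

Running swr at address 1 followed by swl at address 4 with the same register writes a full word at the unaligned address 1, the store equivalent of the combined load example.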
Load Cancelling
Sequential loads to the same register exhibit "load cancelling": non-terminal (i.e. not the last in a sequence) loads are completely discarded and only the last load will take effect.
This behaviour is required for passing some of the CPU MEM DLY tests in Amidog's CPU test.
Example
Consider the following program:
lb t0, 0($a0) // load value at address in a0 to t0
lb t0, 0($a1) // load value at address in a1 to t0
nop
nop
Since both loads have t0 as their target register, load cancelling will happen and the first load will not be visible at any point in time. That is, up to and including the first nop, the value of t0 remains unchanged. It is only at the second nop that the value of t0 changes to the value loaded by the second load:
lb t0, 0($a0) // t0 unchanged
lb t0, 0($a1) // t0 unchanged
nop // t0 unchanged
nop // t0 changed to the value loaded by second instruction
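The timeline above can be reproduced with a very small model of the load delay slot. This is a simplified sketch and not a full CPU core: the Cpu class, its step method, and the (register, value) encoding of a load are all hypothetical, and only loads and nops are modelled.

```python
class Cpu:
    """Toy model that only tracks registers and one in-flight load."""

    def __init__(self):
        self.regs = [0] * 32
        self.pending = None  # load issued by the previous instruction: (reg, value)

    def step(self, load=None):
        """Execute one instruction: (reg, value) for a load, None for a nop."""
        landing = self.pending  # load issued one instruction ago
        self.pending = load
        # Load cancelling: the landing load is discarded when the current
        # instruction is itself a load to the same register.
        if landing is not None and (load is None or load[0] != landing[0]):
            reg, value = landing
            self.regs[reg] = value
```

Each instruction observes the register file as it is when step is called, so a loaded value first becomes visible two instructions after the load that produced it.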
Manual Errata
A list of corrections to the R3000 manual.
SLTIU
The manual describes SLTIU as putting the result into rt, but shows pseudocode that puts the result into rd. The pseudocode is wrong: rt is the correct register.
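A minimal sketch of SLTIU with the corrected destination register is shown below. The function signature and the list-based register file are assumptions made for this illustration. The sign extension of the immediate before the unsigned comparison is standard MIPS behaviour rather than something stated in this section.

```python
def sltiu(regs, rs: int, rt: int, imm16: int) -> None:
    """Set regs[rt] to 1 if regs[rs] < immediate (unsigned), else 0."""
    imm = imm16 & 0xFFFF
    if imm & 0x8000:
        imm |= 0xFFFF0000  # sign-extend the 16-bit immediate to 32 bits
    # The comparison itself is unsigned, and the result goes to rt, not rd.
    regs[rt] = 1 if (regs[rs] & 0xFFFFFFFF) < imm else 0
```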
References
This page is a collection of links that are commonly referenced throughout the book.
Resources
PSX-SPX
https://psx-spx.consoledev.net/
R3000 Manual
https://cgi.cse.unsw.edu.au/~cs3231/doc/R3000.pdf
Tests
psxtest_cpu
Also known as Amidog's CPU test.