
Posted By

Sandor
on 2024-03-30
01:14:36
 1 read 2 write system for scrollers

I have stumbled upon this thread on c64 forums:
https://www.lemon64.com/forum/viewtopic.php?t=58844&start=135

There, a poster describes a screen draw system on the c64 that does 1 read and 2 writes per character in ECM. It got me thinking about whether we could do something similar on the plus/4, and whether it could be adapted for multicolor as well, to get ultrafast scrolling code.

The idea is to have an uncompressed 128x128 playfield that we can display at any given coordinate in either MCM or ECM character mode. Since I wanted maximum speed, completely unrolling the screen draw routine seemed like the best idea, which means that double buffering would be very expensive in memory. To solve that, the easiest way I have found is to keep the playfield rotated by 90 degrees and then draw it line by line. That way we can start drawing the screen as soon as the first 2 badlines are over, and the copy will never catch up with the raster. Now, for a starting memory map I was thinking of something along the lines of:

$3d <--- x coordinate x2 (in characters)
$3e <--- set to $08
$3f <--- y coordinate (in characters)
$40-$8f <--- precalculated pointers to playfield Y=0, X=screen_X

$8000-$80ff <--- x,0 precalculated coordinate pointers
$8100-$c0ff <--- 128x128 level
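
To make the rotated layout concrete, here is a minimal Python sketch of how the pointer table at $8000 could be filled. The 128-byte stride per playfield column and the low-byte-first pointer order are assumptions read off the memory map, and the helper names are hypothetical.

```python
# Assumed layout, taken from the memory map above. The 128-byte stride
# per column is an assumption: with the playfield rotated by 90 degrees,
# all cells of one x coordinate sit next to each other in memory.
TABLE_BASE = 0x8000   # 128 two-byte pointers, low byte first
LEVEL_BASE = 0x8100   # 128x128 level, one byte per cell
SIZE = 128

def column_pointer(x):
    """Address of playfield cell (x, 0)."""
    return LEVEL_BASE + x * SIZE

def build_pointer_table():
    """Return the 256 bytes that would be stored at TABLE_BASE."""
    table = bytearray()
    for x in range(SIZE):
        addr = column_pointer(x)
        table += bytes((addr & 0xFF, addr >> 8))
    return table

def cell_address(x, y):
    """What lda (ptr),y reads when ptr holds column x and Y holds y."""
    return column_pointer(x) + y
```

With this layout the last cell, (127,127), lands on $c0ff, which matches the end of the level area in the map.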

Then we can prepare for the full screen copy with an unrolled loop like this:

ldy #$00 ; bytes: 2x1=2 cycles: 2x1=2
lda ($3d),y ; bytes: 2x40=80 cycles: 5x40=200
sta $40-$8e ; bytes: 2x40=80 cycles: 3x40=120
ldy #$01 ; bytes: 2x1=2 cycles: 2x1=2
lda ($3d),y ; bytes: 2x40=80 cycles: 5x40=200
sta $41-$8f ; bytes: 2x40=80 cycles: 3x40=120
; total bytes: 324
; total cycles: 644

Once that is done, the unrolled 1 read, 2 writes screen copy looks like this (this is the fastest I could make it):

ldy $3f ; bytes: 2x1=2 cycles: 3x1=3
lda ($40),y ; bytes: 2x1000=2000 cycles: 5x1000=5000
sta $chars ; bytes: 3x1000=3000 cycles: 4x1000=4000
sta $colors ; bytes: 3x1000=3000 cycles: 4x1000=4000
iny ; bytes: 1x24=24 cycles: 2x24=48
; total bytes: 8026 (8350 with initialization)
; total cycles: 13051 (13695 with initialization, 9363 cycles per frame left)
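
Nobody wants to type 3000 instructions by hand, so a small generator script is the usual way to produce a routine like this. The Python sketch below emits the unrolled copy as assembler text and tallies bytes and cycles using the same per-instruction costs as the listing. The screen addresses ($0c00 for characters, $0800 for colors) are assumed plus/4 defaults, and `emit_copy` is a hypothetical name.

```python
COLS, ROWS = 40, 25
CHARS = 0x0C00    # assumed character matrix address
COLORS = 0x0800   # assumed color (attribute) memory address
POINTERS = 0x40   # zero page pointers $40-$8f, one per screen column

def emit_copy():
    """Return (source lines, size in bytes, cycle count) of the unrolled copy."""
    lines = ["ldy $3f"]            # 2 bytes, 3 cycles
    size, cycles = 2, 3
    for row in range(ROWS):
        for col in range(COLS):
            offset = row * COLS + col
            lines.append(f"lda (${POINTERS + 2 * col:02x}),y")  # 2 bytes, 5 cycles
            lines.append(f"sta ${CHARS + offset:04x}")          # 3 bytes, 4 cycles
            lines.append(f"sta ${COLORS + offset:04x}")         # 3 bytes, 4 cycles
            size += 2 + 3 + 3
            cycles += 5 + 4 + 4
        if row < ROWS - 1:
            lines.append("iny")    # 1 byte, 2 cycles, next playfield row
            size += 1
            cycles += 2
    return lines, size, cycles

lines, size, cycles = emit_copy()
print(size, cycles)
```

The totals it reports agree with the listing: 8026 bytes and 13051 cycles.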

Now, there are a few issues here that I don't really like. Unlike the c64, where we only have 16 colors, we have 121 colors on the plus/4, so in ECM we would only have 1 unique character per color, and lum 0-3 would need to share the character definition with lum 4-7. On top of that, background colors 0,2 would use lum 0-3 and 1,3 would use lum 4-7. In MCM we would end up with 2 characters per unique color, with 128 hi-res and 128 multicolor characters. That is why I was thinking that masking the character codes to use a palette with fewer colors might be preferable on the plus/4. For example, if we AND the character code with $4f before writing it to color memory, we end up with just luma 0 and 4, so 32 colors total, which gives us 4 characters per color in ECM and 8 characters per color in MCM. This can actually be done for free, and the code would look like this:

ldx #$mask ; bytes: 2x1=2 cycles: 2x1=2
ldy $3f ; bytes: 2x1=2 cycles: 3x1=3
lda ($40),y ; bytes: 2x1000=2000 cycles: 5x1000=5000
sta $chars ; bytes: 3x1000=3000 cycles: 4x1000=4000
sax $colors ; bytes: 3x1000=3000 cycles: 4x1000=4000
iny ; bytes: 1x24=24 cycles: 2x24=48
; total bytes: 8028 (8352 with initialization)
; total cycles: 13053 (13697 with initialization, 9361 cycles per frame left)
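
The 32 / 4 / 8 figures can be checked with a few lines of Python. This is only a model of what `sax` stores (A AND X, with X holding the mask). It assumes that in ECM the character shape comes from the low 6 bits of the code, and the function name is hypothetical.

```python
MASK = 0x4F  # keeps color bits 0-3 and luma bit 6

def color_attribute(code, mask=MASK):
    """Model of `sax $colors`: the stored byte is A AND X."""
    return code & mask

# Group all 256 character codes by the color byte they end up with.
groups = {}
for code in range(256):
    groups.setdefault(color_attribute(code), []).append(code)

colors = len(groups)
mcm_chars_per_color = len(groups[0])
# Assumption: ECM picks the shape from bits 0-5, so codes that differ
# only in bit 7 share one character definition.
ecm_shapes_per_color = len({code & 0x3F for code in groups[0]})

print(colors, ecm_shapes_per_color, mcm_chars_per_color)
```

Changing `MASK` shows the trade-off directly: every bit cleared in the mask halves the palette and doubles the number of codes per color.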

That actually makes things a bit more dynamic since we can even further reduce the number of colors to get more characters per color if that is what we want.

However, what still annoyed me is that we can't force all characters to be multicolor in MCM, so I have added an ORA instruction to the unrolled loop. It makes the loop slightly slower, but now we can actually force all characters to be multicolor:

ldx #$mask ; bytes: 2x1=2 cycles: 2x1=2
ldy $3f ; bytes: 2x1=2 cycles: 3x1=3
lda ($40),y ; bytes: 2x1000=2000 cycles: 5x1000=5000
sta $chars ; bytes: 3x1000=3000 cycles: 4x1000=4000
ora #$setbits ; bytes: 2x1000=2000 cycles: 2x1000=2000
sax $colors ; bytes: 3x1000=3000 cycles: 4x1000=4000
iny ; bytes: 1x24=24 cycles: 2x24=48
; total bytes: 10028 (10352 with initialization)
; total cycles: 15053 (15697 with initialization, 7361 cycles per frame left)
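
As a rough model of the `ora` + `sax` pair: the character memory still gets the raw byte, while the color byte becomes (byte OR setbits) AND mask. The Python sketch below assumes that bit 3 of the color byte is the one that switches a character to multicolor in MCM. That bit value and the function name are assumptions, not something stated in the listing.

```python
MULTICOLOR_BIT = 0x08  # assumption: bit 3 of the color byte selects multicolor

def color_attribute(code, setbits, mask):
    """Model of `ora #setbits` followed by `sax $colors` (X holds mask)."""
    return (code | setbits) & mask

# Force multicolor on every cell while keeping the $4f palette mask.
attrs = {color_attribute(code, MULTICOLOR_BIT, 0x4F) for code in range(256)}
print(len(attrs), all(a & MULTICOLOR_BIT for a in attrs))
```

Because the AND happens after the OR, a bit in setbits only survives if the mask keeps it too. With the values above, the forced bit leaves 16 distinct color bytes.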

In that case we can actually select between many different palette configurations by playing with the setbits and the mask a bit. I am curious whether any of this can be optimized further, and whether anyone finds it useful.

I am personally very good at starting things and then never finishing them, so I figured it would be awesome if this inspired someone else as well.

The disadvantage of this is that you need to redefine a character for every color you want to use it with, which is a bit restrictive. The advantage is that you can scroll as fast as you want, since the entire screen is fully refreshed 50 times per second, with plenty of cycles left for sprites, music, logic and interrupts.

Posted By

Litwr
on 2024-03-30
02:01:55
 Re: 1 read 2 write system for scrollers

I know an example from 2021 of super nice scrolling on the good old Speccy, and the Speccy doesn't have any hardware scrolling or sprite support!
IMHO porting the pig game to the C+4 is very difficult because the pig and other moving objects are obviously hardware sprites, which the C+4 lacks.

Posted By

Sandor
on 2024-03-30
10:57:57
 Re: 1 read 2 write system for scrollers

I wasn't thinking about porting that game. I was just interested in the screen draw code, since it is much faster than anything I was able to write before. Also, I really like the idea of 1 byte per character+color, because it allows keeping a huge unpacked level in memory at half the memory cost. On top of that, since we are not using the screen data to scroll and then unpack the edges, we don't need to make backups of the areas where the sprites would be. I just thought the idea was interesting.





Copyright © Plus/4 World Team, 2001-2024