Previous Messages
Posted By

MMS
on 2018-09-28
12:53:33
 Re: Multicolor DFLI with different background colors per scanline plus x-shift

wow, 10 years passed since then...

Posted By

bubis
on 2018-09-26
08:39:24
 Re: Multicolor DFLI with different background colors per scanline plus x-shift

I think I have seen this inc $ff14 trick in very early FLI routines too.

And yes, István explained this earlier here.

Posted By

siz
on 2018-09-26
06:25:41
 Re: Multicolor DFLI with different background colors per scanline plus x-shift

I'm pretty sure that IstvanV's converter is doing the same. I seem to recall that he was the first one who wrote about this inc $ff14 trick. At least he was the first one writing about it that I've read.

Posted By

bubis
on 2018-09-26
06:27:14
 Re: Multicolor DFLI with different background colors per scanline plus x-shift

"This should work." :) You'd better test it properly first.

So, as Sandor already figured out, the INC $FF14 trick works because TED sets the unused bits to one, and the lowest 3 bits are unused in $FF14. For example, if you write $08 there you read back $0F, so INC $FF14 will write $10 but it will read back $17. When you INC again, the written value will be $18, etc. So INC $FF14 increments by 8 every time and not just by one. :) Also, INC will steal 2 cycles before the DMA, so in that rasterline we will use 24 CPU cycles.
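
To make the arithmetic concrete, here is a minimal Python sketch of the behaviour described above. It is only a model under the assumption stated in this post (the low 3 bits of $FF14 always read back as 1); the class name `FF14` and its methods are made up for illustration and are not part of any emulator API.

```python
# Simplified model of $FF14 as described in the post:
# the lowest 3 bits are unused and always read back as 1.
UNUSED_MASK = 0x07


class FF14:
    """Hypothetical stand-in for the TED register, for illustration only."""

    def __init__(self, value=0):
        self.latch = value & 0xFF

    def write(self, value):
        self.latch = value & 0xFF

    def read(self):
        # Unused bits are forced to 1 on every read.
        return self.latch | UNUSED_MASK

    def inc(self):
        # INC is a read-modify-write instruction: read, add 1, write back.
        self.write((self.read() + 1) & 0xFF)


reg = FF14()
reg.write(0x08)
reads = [reg.read()]
written = []
for _ in range(3):
    reg.inc()
    written.append(reg.latch)
    reads.append(reg.read())

print([hex(v) for v in written])  # values INC wrote to the register
print([hex(v) for v in reads])    # values read back after each step
```

In this model every INC moves the written value forward by 8, which is the step the trick relies on.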

The other trick is that you trigger the badline by resetting $FF1D to 2 every second line. If mem[0xFF06]&7 = 3, it will trigger a badline when mem[0xFF1D]&7 changes to 3.
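
A rough Python sketch of that trigger rule, assuming only what is stated above (a badline is triggered when the low 3 bits of $FF1D become equal to the low 3 bits of $FF06). The function name and the line-by-line stepping are simplifications of mine; the model ignores the exact cycle at which TED does the comparison.

```python
def badline_triggers(yscroll, lines, reset_value=2, reset_every=2):
    """Return the line indices where the trigger condition is met.

    Hypothetical model: the counter stands in for $FF1D, yscroll for $FF06.
    """
    counter = reset_value
    triggers = []
    for line in range(lines):
        counter = (counter + 1) & 0xFF      # TED advances the line counter
        if (counter & 7) == (yscroll & 7):  # low 3 bits match: trigger
            triggers.append(line)
        if line % reset_every == reset_every - 1:
            counter = reset_value           # the code writes 2 to $FF1D again
    return triggers


with_reset = badline_triggers(yscroll=3, lines=8)
# A huge reset interval means the counter is never rewritten within 8 lines.
without_reset = badline_triggers(yscroll=3, lines=8, reset_every=10**9)
print(with_reset, without_reset)
```

With the reset the condition is met on every second line, while a free-running counter meets it only once in the same 8 lines.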

I will leave it to Sandor now to integrate this into his converter and demonstrate it working with some pretty DFLI pics. :)

BTW, is this really new, or can IstvanV's converter do this as well?


Posted By

Sandor
on 2018-09-26
02:24:58
 Re: Multicolor DFLI with different background colors per scanline plus x-shift

Nicely done, bubis.

But I don't see the point of the 5 cycle STA which is pretty much only used to regain that same cycle from the STY $ff14.

btw: I didn't know that $ff14 low 3 bits are always set, that's why I wasn't thinking about the inc $ff14 trick. The $ff1d badline trigger on the other hand didn't even occur to me as a possibility. Really nice solutions right there.

The rest is pretty clear, all we need to do is to drop $ff1d to 202 at the end and we're done.


edit:

Here is my (simplified) implementation of bubis' solution:

Setup:
x register = #$02
y register = #$60 (pointer to first color buffer, since I keep graphics at $4000-$6000 and colors at $6000-$8000)


lda #$cl0 ; dram1, dram2
sta $ff15 ; dram3, dram4, dram5, dc01
lda #$cl1 ; dc02, dc03
sta $ff16 ; dc04, dc05, dc06, dc07
lda #$xsh ; dc08, dc09
sta $ff07 ; dc10, dc11, dc12, dc13
stx $ff1d ; dc14, dc15, dc16, sc

lda #$cl0 ; dram1, dram2
sta $ff15 ; dram3, dram4, dram5, dc01
lda #$cl1 ; dc02, dc03
sta $ff16 ; dc04, dc05, dc06, dc07
lda #$xsh ; dc08, dc09
sta $ff07 ; dc10, dc11, dc12, dc13
sty $ff14 ; dc14, dc15, dc16, sc

lda #$cl0 ; dram1, dram2
sta $ff15 ; dram3, dram4, dram5, dc01
lda #$cl1 ; dc02, dc03
sta $ff16 ; dc04, dc05, dc06, dc07
lda #$xsh ; dc08, dc09
sta $ff07 ; dc10, dc11, dc12, dc13
stx $ff1d ; dc14, dc15, dc16, sc

lda #$cl0 ; dram1, dram2
sta $ff15 ; dram3, dram4, dram5, dc01
lda #$cl1 ; dc02, dc03
sta $ff16 ; dc04, dc05, dc06, dc07
lda #$xsh ; dc08, dc09
sta $ff07 ; dc10, dc11, dc12, dc13
inc $ff14 ; dc14, dc15, dc16, sc, gr1, gr2

lda #$cl0 ; dram1, dram2
sta $ff15 ; dram3, dram4, dram5, dc01
lda #$cl1 ; dc02, dc03
sta $ff16 ; dc04, dc05, dc06, dc07
lda #$xsh ; dc08, dc09
sta $ff07 ; dc10, dc11, dc12, dc13
stx $ff1d ; dc14, dc15, dc16, sc

lda #$cl0 ; dram1, dram2
sta $ff15 ; dram3, dram4, dram5, dc01
lda #$cl1 ; dc02, dc03
sta $ff16 ; dc04, dc05, dc06, dc07
lda #$xsh ; dc08, dc09
sta $ff07 ; dc10, dc11, dc12, dc13
inc $ff14 ; dc14, dc15, dc16, sc, gr1, gr2

lda #$cl0 ; dram1, dram2
sta $ff15 ; dram3, dram4, dram5, dc01
lda #$cl1 ; dc02, dc03
sta $ff16 ; dc04, dc05, dc06, dc07
lda #$xsh ; dc08, dc09
sta $ff07 ; dc10, dc11, dc12, dc13
stx $ff1d ; dc14, dc15, dc16, sc

lda #$cl0 ; dram1, dram2
sta $ff15 ; dram3, dram4, dram5, dc01
lda #$cl1 ; dc02, dc03
sta $ff16 ; dc04, dc05, dc06, dc07
lda #$xsh ; dc08, dc09
sta $ff07 ; dc10, dc11, dc12, dc13
inc $ff14 ; dc14, dc15, dc16, sc, gr1, gr2


Unless I am missing something, this should work so I can add this mode to my converter. Thanks for the help (or to be more precise: thanks for solving the whole problem).
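
As a sanity check on the listing above, here is a small Python sketch that adds up the cycles of one unrolled scanline. The cycle counts are the standard 6502 ones for these addressing modes; the helper name `line_cycles` is just for illustration.

```python
# Standard 6502 cycle counts for the addressing modes used in the listing.
CYCLES = {
    ("lda", "imm"): 2,
    ("sta", "abs"): 4,
    ("stx", "abs"): 4,
    ("sty", "abs"): 4,
    ("inc", "abs"): 6,
}


def line_cycles(last_op):
    """Cycles for one scanline: three LDA #imm / STA abs pairs,
    followed by one final absolute-addressed instruction."""
    pair = CYCLES[("lda", "imm")] + CYCLES[("sta", "abs")]
    return 3 * pair + CYCLES[(last_op, "abs")]


budget = {op: line_cycles(op) for op in ("stx", "sty", "inc")}
print(budget)
```

The STX/STY lines come out at 22 cycles and the INC lines at 24, which matches the slot names in the comments (5 dram + 16 dc + sc, plus gr1 and gr2 for INC).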

Posted By

bubis
on 2018-09-25
21:03:15
 Re: Multicolor DFLI with different background colors per scanline plus x-shift

Guys,

I managed to do the 3 TED reg updates per rasterline + DFLI. The trick is that I use $ff1d to trigger the badlines and I use inc $ff14. I will create a demonstration prg and I will explain how I did it later this week.

The timing diagram is here.

Posted By

Murphy
on 2018-09-25
17:45:24
 Re: Multicolor DFLI with different background colors per scanline plus x-shift

Wow, really nice trick to reuse the value of the registers! :)

In any case, I do not think these optimizations will result in visible improvements on the picture.

Posted By

Sandor
on 2018-09-25
17:33:09
 Re: Multicolor DFLI with different background colors per scanline plus x-shift

Murphy, I have got it to more than that already.

With the following rules set:
- 1 of the color buffers is starting at $3800
- 1 of the color buffers is starting at $1800
- the color buffers are set to go in order 1-2-3-4-4-3-2-1

You can get to this:
lda #$ydel
sta $TED-YDEL
-- 38 cycles left -- (no need for TED-COLPTR write because the color pointer stays the same)

lda #$ydel
sta $TED-YDEL
sta $TED-COLPTR
-- 34 cycles left -- (color buffer starts from $3800, so we don't need a second LDA)

lda #$ydel
sta $TED-YDEL
sty $TED-COLPTR
-- 34 cycles left -- (color buffer starts from $1800, so we don't need a second LDA, Y is set from x-shift)

lda #$ydel
sta $TED-YDEL
lda #$colptr
sta $TED-COLPTR
-- 32 cycles left --

138 leftover cycles for additional writes
with 6 cycles per additional write we have 23 additional writes left
That's 2.875 additional writes per scanline, so 7 scanlines with 3 writes and 1 scanline with 2 writes

At this point I am 6 cycles per 8 scanlines short for the full effect.
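
The same arithmetic as a short Python sketch, using only the numbers from this post (the variable names are mine):

```python
from fractions import Fraction

CYCLES_PER_WRITE = 6            # lda #imm (2) + sta abs (4)
leftover = [38, 34, 34, 32]     # cycles left in the four cases above

total_leftover = sum(leftover)
extra_writes = total_leftover // CYCLES_PER_WRITE
writes_per_scanline = Fraction(extra_writes, 8)

needed_writes = 3 * 8           # 3 writes on each of 8 scanlines
shortfall_cycles = (needed_writes - extra_writes) * CYCLES_PER_WRITE

print(total_leftover, extra_writes, float(writes_per_scanline), shortfall_cycles)
```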

*edit: updated it from 22 to 23 writes code :)

Posted By

Murphy
on 2018-09-25
16:55:37
 Re: Multicolor DFLI with different background colors per scanline plus x-shift

Sandor: You can write 2 registers on every scanline, and the third one on every second scanline.

//Scanline 0 - ff15 / ff16
lda #$3b
sta $ff06
lda #colormem
sta $ff14
stx $ff15
sty $ff16
nop

//Scanline 1 - ff15 / ff16 / ff07
lda #xshift
sta $ff07
lda #col0_cur_scanline
sta $ff15
lda #col1_cur_scanline
sta $ff16
ldx #col0_next_scanline
ldy #col1_next_scanline

Posted By

Sandor
on 2018-09-25
13:46:45
 Re: Multicolor DFLI with different background colors per scanline plus x-shift

The best I can come up with is to have the color buffers like this:

1, 2, 3, 4, 4, 3, 2, 1

In this case we don't need to update the color buffer pointer every 4th double scanline, so we save 6 cycles, but even so the total needed to do the whole thing would be 186 cycles per 8 scanlines, which is 23.25 cycles per scanline. Since we only have 22 cycles per scanline, this is still not enough to make the whole thing work. Now if we do the STA timing perfectly, we can probably do the effect, because at least one of the numbers will likely be equal to the number from the previous scanline (be it an x-shift, or one of the 2 colors). And if no numbers are equal to the previous scanline, we can force one to become equal from the converter. If that happens, we need:
4 y-scroll register adjustments = 24 cycles
8x3-1 x-scroll register / color adjustments = 138 cycles
3 color pointer adjustments = 18 cycles
--------------------------------------------------------------
total: 180 cycles
per scanline: 22.5 cycles
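
A quick Python check of the totals above, assuming 6 cycles per register write (lda #imm + sta abs) as elsewhere in this thread; the variable names are only illustrative.

```python
CYCLES_PER_WRITE = 6

yscroll_cycles = 4 * CYCLES_PER_WRITE              # 4 y-scroll adjustments
color_xscroll_cycles = (8 * 3 - 1) * CYCLES_PER_WRITE  # 23 x-scroll / color writes
colptr_cycles = 3 * CYCLES_PER_WRITE               # 3 color pointer adjustments

total_cycles = yscroll_cycles + color_xscroll_cycles + colptr_cycles
per_scanline = total_cycles / 8
over_budget = total_cycles - 8 * 22                # against 22 cycles per scanline

print(total_cycles, per_scanline, over_budget)
```

So even with one forced duplicate value, this plan is still 4 cycles per 8 scanlines over a 22-cycle budget.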

Posted By

bubis
on 2018-09-26
08:41:17
 Re: Multicolor DFLI with different background colors per scanline plus x-shift

Hi Sandor,

Your measurements are not all correct.

In double clock mode this is what you have:
Every rasterline has 5 single clock cycles (DRAM refresh) followed by 16 double clock cycles (mostly during horizontal blanking). After that you either have 88 double clock cycles on the border or 44 single clock cycles on the window area (more precisely on lines 0-204) and it starts again with the 5 cycle DRAM refresh.
If you have DMA too (badline), the TED steals 40 to 43 of the 44 cycles from the CPU. There is a 3-cycle grace period before the DMA happens where the CPU can finish write operations but cannot do read operations. Some CPU instructions end with one (sta, pha, ...), two (inc, dec, ror, rol, ...) or three (brk and interrupts) write cycles; those can finish what they started.

So, to summarize:

* on the border: 5+16+88 = 109 cycles
* on the window area without DMA: 5+16+44 = 65 cycles
* on the window area with DMA: 5+16+44-43 (+ optional 1/2/3) = 22-25 cycles

As for the DMA in practice, you can have 23 cycles with the right STA/X/Y timing or 24 with INC/DEC, otherwise you only have 22 cycles per rasterline.
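
The summary above as a small Python sketch. The constants are the numbers from this post; the names and the `dma_budget` helper are assumptions of mine for illustration.

```python
DRAM_REFRESH = 5     # single clock cycles
HBLANK = 16          # double clock cycles
BORDER = 88          # double clock cycles on the border
WINDOW = 44          # single clock cycles on the window area
DMA_STOLEN = 43      # worst case stolen by TED on a badline

border_line = DRAM_REFRESH + HBLANK + BORDER
window_line = DRAM_REFRESH + HBLANK + WINDOW


def dma_budget(trailing_write_cycles):
    """CPU cycles on a DMA line when the last instruction before the DMA
    ends with 0..3 write cycles that can still complete."""
    if not 0 <= trailing_write_cycles <= 3:
        raise ValueError("an instruction ends with at most 3 write cycles")
    return window_line - DMA_STOLEN + trailing_write_cycles


budgets = {
    "other": dma_budget(0),    # no trailing write cycle
    "sta": dma_budget(1),      # STA/STX/STY, PHA
    "inc": dma_budget(2),      # INC, DEC, ROR, ROL
    "brk": dma_budget(3),      # BRK and interrupts
}
print(border_line, window_line, budgets)
```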

It is perfectly possible to have DMA on every rasterline and change two extra TED registers in each line (but I don't think you can do three *). There are some fli routines out there that do that, like my routine in DFLIConv or the routine in IstvanV's converter in Plus4Emu. I think one of the TED register writes can be timed to change $ff19 properly without any artifacts.


This is an example for stabilizing your raster interrupt routine if you set the interrupt in the 0-204 range and if that is not a badline:

stableIrq:
sta tempa
lda $ff1e
lsr
lsr
sta reljump
reljump = *+1
bpl *+2
cmp #$c9
cmp #$c9
cmp #$c9
cmp #$c9
cmp #$c9
cmp #$c5
nop
;stable
...
inc $ff09
tempa = *+1
lda #0
rti

Maybe some of those "cmp #$c9" lines can be skipped, I am not sure. It also depends on if you use 7 or 8 cycle instructions in your code.


Update:
* : You made me think about this again and 3 TED register updates per line with DMA might be possible. I will figure out and share the results when I have a bit more time.

Posted By

Sandor
on 2018-09-25
02:42:13
 Multicolor DFLI with different background colors per scanline plus x-shift

Hello,

I was wondering if this was even possible.

From my measurements, we get 109 cycles for scanlines where nothing is displayed, 64 cycles while the screen is displayed and 24 cycles during a badline. This doesn't make sense to me because I don't understand where the 45 cycles go during screen display. 40 cycles will go to fetch the actual pixel data, but where do the other 5 cycles go? Now, if the 64 cycles per scanline is correct, that means we are losing 40 more cycles per badline, which makes perfect sense, since TED needs to read an extra 40 bytes during those scanlines. However, I am not sure if this number is correct at all (or if any of those numbers are correct).

Another question is, how many of those 24 cycles are actually cycles during the horizontal "break"?

If we want to do x-shift and change both background colors during every scanline and do the standard dfli trickery for every 2 scanlines, we get to this:

lda #$col-ptr
sta $TED-COLPTR ; 6 cycles to set ted to next color buffer

lda #$backcol0
ldx #$backcol1
ldy #$xdel ; 6 cycles to load

sta $TED-BACK0
stx $TED-BACK1
sty $TED-XDEL ; 12 cycles to write (9 cycles must be outside of screen)

lda #$ydel
sta $TED-YDEL ; 6 cycles to set ted Y scroll value

lda #$backcol0
ldx #$backcol1
ldy #$xdel ; 6 cycles to load

sta $TED-BACK0
stx $TED-BACK1
sty $TED-XDEL ; 12 cycles to write (9 cycles must be outside of screen)

So, if we really do have 24 cycles per badline and if 9 out of those 24 cycles happen while no pixels are being drawn to the screen, we should be able to sync the code at the beginning of the screen and then simply keep changing the registers while the screen is being displayed.

So, I have 3 questions:

1) does anyone know how to sync the code to an exact position of the scanline?
2) do we have 24 cycles per badline?
3) do we have 9 cycles while the screen isn't being drawn per scanline?

Thanks in advance.


Copyright © Plus/4 World Team, 2001-2024.