| Posted By
Sandor on 2014-01-21 04:16:26
| DFLI game(s)
Hello,
I've decided to try and make a game (or several games) in multicolor DFLI graphic mode.
The idea is quite simple: fix the 2 background colors to black and light gray, and use a black outline for every sprite to differentiate it from the background. The light gray is a good choice because it can be used as a filler color whenever there aren't enough colors in a square (red+gray looks better than pure red if the source was red-green, for example).
For the background image I select a dominant color for each 8x2 square and then select a second color that gives the best PSNR (the least error) when the block is remapped to the 4 colors.
For sprites there are 3 modes for each square:
mode 1: the whole square is transparent - no modification to the screen.
mode 2: the square has some opaque and some transparent pixels - we remap the sprite square to 3 colors (black, gray and the color that gives the least error) and we keep the dominant color from the background square.
mode 3: the square is opaque - we pick the 2 colors that give the least error and remap the square to black/gray and the selected colors.
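One plausible reading of the background color selection described above, as a minimal sketch. The RGB tuples, the tiny palette and the function names are assumptions made for illustration, not the converter's actual code. Minimizing the summed squared error is equivalent to maximizing PSNR, so the sketch works with the error directly.

```python
from collections import Counter

def sq_err(a, b):
    """Squared distance between two RGB tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(pixel, colors):
    """Color from `colors` that is closest to `pixel`."""
    return min(colors, key=lambda c: sq_err(pixel, c))

def block_error(block, colors):
    """Total squared error after remapping each pixel to its nearest color."""
    return sum(sq_err(p, nearest(p, colors)) for p in block)

def pick_block_colors(block, palette, fixed):
    """Return (dominant, second) for one block.

    fixed: the two screen-wide colors (black and light gray).
    """
    # Dominant color: the palette entry that most pixels map to.
    dominant = Counter(nearest(p, palette) for p in block).most_common(1)[0][0]
    # Second color: the one that leaves the smallest error for the block.
    second = min(
        (c for c in palette if c != dominant),
        key=lambda c: block_error(block, list(fixed) + [dominant, c]),
    )
    return dominant, second

# Hypothetical RGB values, not the real TED palette.
BLACK, GRAY = (0, 0, 0), (170, 170, 170)
RED, GREEN, BLUE = (200, 0, 0), (0, 200, 0), (0, 0, 200)

block = [RED] * 10 + [GREEN] * 6   # the pixels of one square
dominant, second = pick_block_colors(block, [RED, GREEN, BLUE], (BLACK, GRAY))
print(dominant, second)
```

For a sprite square in mode 2 the same search would run with the background's dominant color held fixed, so only one free color is left to choose.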
The game screen would be 320x184, with an 8 scanline scoreboard on top in character mode, 8 scanline off mode and then 160 scanline game area.
The first game I started making is a fighting game (Mortal Kombat II / Street Fighter II style). I have enough memory to keep the 320x160 dfli background in memory (twice - one for backup to be able to erase the sprites and the second is the actual display), as well as for 36-48 frames per fighter sprite. I can display one 64x64 sprite (with up to 80 8x2 squares opaque and with horizontal flip effect off/on) per frame, so if I update sprite 1 on even frames and sprite 2 on odd frames, I can have them both at 25 fps animation.
I have compressed both the sprite data and the background data to 6-7k, so I have enough disk space for 8 selectable characters and one final boss, each with unique background image.
Here is the problem I am facing: I have no clue how to do the diskloader. So the question is, are there any public fastloaders that work with both 1551/1541 with asm source code that I could use? I tried to search, but couldn't find anything.
Also, I have a few ideas for future projects and have a few more questions:
1) is it possible to implement vertical scroll without actually copying pixels (by scanline manipulation)?
2) is it possible to implement horizontal scroll without actually copying pixels?
Thanks for the info, Sandor
|
|
Posted By
Litwr on 2014-01-21 04:52:17
| Re: DFLI game(s)
Krills_Loader
1) Values 0-7 in the y-offset bits of the TED register, for a 24-row screen. Changing the raster line number also makes it possible to move the whole screen, as in HNY2013 and several others.
2) Values 0-7 in the x-offset bits of the TED register, for a 38-column screen. I don't know of other ways.
|
|
Posted By
gerliczer on 2014-01-21 06:21:41
| Re: DFLI game(s)
Linecrunching and VSP can be done on the 264 machines somewhat similarly to the C=64. Try finding some demos like Zenith Of Puberty and The 2nd and examine them.
|
|
Posted By
Sandor on 2014-01-21 09:22:47
| Re: DFLI game(s)
Hello,
Thanks for the answers, I've looked up linecrunching and it seems fairly easy to do, not so sure about how to go about VSP yet.
I have also looked up Krill's loader, but it doesn't seem to support 1551 drives.
|
|
Posted By
Sandor on 2014-01-21 09:24:43
| Re: DFLI game(s)
Oh, and one more thing. I've read it is possible to speed up the CPU by activating NTSC mode on PAL machines. I did test this myself, but I didn't notice any difference in CPU speed. I just made a pretty long loop that runs for 10 seconds when the machine is running normally, then measured its speed in NTSC mode vs PAL mode and got the exact same speed.
|
|
Posted By
gerliczer on 2014-01-21 12:09:22
| Re: DFLI game(s)
You must have missed something. Setting NTSC mode on a PAL machine results in a 25% clock-speed increase, as it switches an internal divider from dividing by 10 to 8 (or 20 to 16?). It is impossible for this speed increase to go unnoticed. I think you should revise your measurement technique.
|
|
Posted By
Lavina on 2014-01-21 12:10:53
| Re: DFLI game(s)
Who is Sandor?
|
|
Posted By
Sandor on 2014-01-21 13:02:48
| Re: DFLI game(s)
Hmm... I will retest the NTSC thing, but I really didn't notice any increase. Perhaps I'll make it longer (1-2 min test) so that measurement errors don't show up that easily.
Also, umm I'm not really known to anyone. I used to have a plus/4 as a kid and then went over to Amiga, did a few demos and intros on Amiga (was coder of demo group Reason under alias gODjR), then did nothing for a while, then did a video player for the first PC ever and now I am trying to do something with the plus/4 since it is where I started and I get really nostalgic as I get old(er?).
Here are some of the demos/intros I did in case anyone cares: http://www.youtube.com/watch?v=_OVjskydFYA http://www.youtube.com/watch?v=od3OUxh7D-Q http://www.youtube.com/watch?v=T6u1Arl_mZM
And here is the video player for the IBM 5150 (I did the video => character/color converter in it): http://www.youtube.com/watch?v=H1p1im_2uf4
Note: I didn't upload any of the videos, and I am not the guy showing the IBM5150 (that's Jim Leonard the guy who did the player).
Anyhow, I am extremely out of shape when it comes to the plus/4 (didn't do anything on it for ~20 years), but trying to get into shape fast because I am having so much fun writing code for it.
|
|
Posted By
Litwr on 2014-01-21 14:29:21
| Re: DFLI game(s)
[ntsc] just run Basic FOR loop.
10 poke65287,a
20 for i=1 to 1000: x=sin(i): next i
30 poke65287,8
try this with a=$48 (NTSC) or a=8 (PAL) and don't use internal timer.
[pc player] It's very impressive. Is it possible for C+4? Maybe with slower framerate and 38x24 screen...
|
|
Posted By
Gaia on 2014-01-21 15:41:03
| Re: DFLI game(s)
I think BSZ's loader that he used in Questionmark supports both. It was called Titanic (?) but I'm not sure, and it was never released officially. Unfortunately the 1541 part of the loader is a bit sloooowwww... Then we had a few folks, namely Csory, Ceekay as well as Bionic, who used "multiplatform" disk loaders. The first two are old-fashioned though, not allowing effects during loading...
Krill was willing to port his loader to the 1551 but has never got around to doing it so far. Is there a reason why you insist on adding support for the 1551? I understand it from an emotional viewpoint but not from a rational one
BTW, I'm all excited about the DFLI game...
|
|
Posted By
MMS on 2014-01-21 18:15:30
| Re: DFLI game(s)
Yeah, I understand Sandor's motivation. I had the same issue when I started to write my adventure that loads a lot of gfx: I want the software to automatically load parts FAST, whether it meets a 1541 or a 1551 (automatically identify it, and use the proper loader part). Unfortunately right now the selection of the proper driver can be done only by "hand", and it is not "nice" from the point of view of some programmers (imagine if each time you start your PC game you had to select your videocard, CPU and memory size )
|
|
Posted By
Sandor on 2014-01-21 20:23:58
| Re: DFLI game(s)
@Litwr In the basic example you gave, I can see the speed increase clearly. The problem is probably with the code I've used. I have switched between PAL/NTSC within the same frame, and probably the system stayed in PAL the whole time. Will try to implement it as full NTSC but with setting the raster back to 204 once it reaches 254, forcing it back to 312 scanlines in NTSC mode. Will test the whole thing tomorrow. If I am right, I will be able to have PAL display with full 312 scanlines with screen on, but with NTSC clock. With my original plan of reducing the total screen area from 320x200 to 320x184, that should give me ~16742 cycles per frame instead of 13952 for sprite rendering.
As for a video player on the plus/4, I don't really think it is doable because you'd have to compress each frame to 200 bytes or less in order to have 30 seconds of video on a diskette. Even with delta frames it seems unlikely to make a converter that would be able to do it - unless you select a video source that is easy to compress. However if someone could actually compress the data for it, you'd have much better quality than we had on that PC because you can redefine the characters on +4 and you have a palette of 121 colors vs fixed character set and only 16 colors.
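The per-frame budget mentioned above can be checked with a little arithmetic. This is a rough sketch: the disk capacity uses the standard 1541 figures (664 free blocks with 254 data bytes each), while the 25 fps frame rate is an assumption, since the post does not name one.

```python
BLOCKS_FREE = 664        # free blocks on a freshly formatted 1541 disk
BYTES_PER_BLOCK = 254    # data bytes per block (2 of the 256 are the link)

def bytes_per_frame(seconds, fps, capacity=BLOCKS_FREE * BYTES_PER_BLOCK):
    """Average number of bytes available for each video frame."""
    return capacity // (seconds * fps)

# Assumed frame rate: 25 fps (not stated in the post).
budget = bytes_per_frame(30, 25)
print(budget)
```

With these assumptions the budget comes out a little above 200 bytes per frame, which is in line with the estimate in the post once some room is left for the player itself.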
@Gaia: Ok, so that's some homework. I'll try to see who did what, altho I'm not a fan of disassembling and stealing other people's code (which is why I asked if there is any code in really public domain). Also I don't really need any effects during loading, if I can display an image while it loads, I'll be happy. Also I 'insist' on supporting 1551 because 1) that is the real drive for the +4 and 2) it would suck not to support it.
@MMS: How did you solve your loader issues?
|
|
Posted By
Sandor on 2014-01-21 20:45:33
| Re: DFLI game(s)
@Litwr Well, couldn't sleep before I tried to have stable PAL image in NTSC mode. Still need to put it into an interrupt to see if there is an actual speed gain, but this test code finally did the trick in keeping the image stable (and in the right position too!):
    lda #$48
    sta $ff07
.lp0
    lda #$c8
    ldy #$e1
.lp1
    cpy $ff1d
    bne .lp1
    sta $ff1d
    lda #$e6
    ldy #$fc
.lp2
    cpy $ff1d
    bne .lp2
    sta $ff1d
    jmp .lp0
So it seems the NTSC trick is very very easy to implement after all. Thanks for the help.
|
|
Posted By
Sandor on 2014-01-21 22:39:14
| Re: DFLI game(s)
Ok, so this got me really excited and here I am at 4:30 AM still awake...
I did the test with the interrupts in YAPE and the results were too good. In fact it made me wonder if YAPE correctly emulates this situation or not. I can't test on my plus4 (because the ROM is burned out and I couldn't find a replacement yet), so if there is anyone willing to help and run the code on their own machine, here is the test code:
2000 sei
2001 lda #$00
2003 ldx #$00
2005 ldy #$00
2007 sta $ff19
200a clc
200b adc #$01
200d bcc $200b
200f clc
2010 inx
2011 bne $200b
2013 iny
2014 cpy #$80
2016 bne $200b
2018 lda #$71
201a sta $ff19
201d jmp $201d

4000 sei
4001 lda #$48
4003 sta $ff07
4006 ldx #$00
4008 ldy #$41
400a stx $fffe
400d sty $ffff
4010 lda #$e1
4012 sta $ff0b
4015 sta $ff3f
4018 cli
4019 jmp $2001

4100 pha
4101 lda #$c8
4103 sta $ff1d
4106 lda $ff09
4109 sta $ff09
410c lda #$fc
410e sta $ff0b
4111 lda #$42
4113 sta $ffff
4116 pla
4117 rti

4200 pha
4201 lda #$e6
4203 sta $ff1d
4206 lda $ff09
4209 sta $ff09
420c lda #$e1
420e sta $ff0b
4211 lda #$41
4213 sta $ffff
4216 pla
4217 rti
You can start it in PAL mode and in NTSC mode.
SYS 8192 - PAL
SYS 16384 - NTSC
The screen border will turn black and then after some time it will turn white. The goal of this test is to measure the actual difference in the time that passes before the screen border turns white.
In YAPE, I've measured this:
PAL: 37 seconds
NTSC: 25 seconds
That is an acceleration of 48%, which would put the clock at 2.34MHz. This doesn't seem right to me, since everywhere else people claim a somewhat lower CPU clock. So either YAPE is not emulating this correctly and the situation is different on the real machine, or there is something else going on here that I simply don't understand. Does anyone have an idea why the difference is so large?
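The 48% figure follows directly from the two timings, and it can be put next to the 25% that the divider change (10 to 8) mentioned earlier in the thread would give on its own. A minimal sketch; the function name is made up for illustration, and the comparison says nothing about where the extra gain comes from.

```python
def speedup(reference_seconds, faster_seconds):
    """How many times faster the second run finished the same work."""
    return reference_seconds / faster_seconds

measured = speedup(37, 25)   # PAL vs NTSC timings measured in YAPE
divider_only = 10 / 8        # clock divider going from 10 to 8

print(f"measured:     +{(measured - 1) * 100:.0f}%")
print(f"divider only: +{(divider_only - 1) * 100:.0f}%")
```

Hand timings of 37 and 25 seconds are only accurate to about a second, so the measured value should be read as roughly 42% to 54% rather than exactly 48%.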
ps: If there are people who didn't do the NTSC trick yet and want to do it, you can use the $4000-$4217 part for whatever you need. If the speed gain I've got is real and not some fluke on my part (or on the emulation's part), then it is certainly worth using for everything.
Also, sorry if I am reinventing the wheel here.
|
|
Posted By
Litwr on 2014-01-22 02:07:00
| Re: DFLI game(s)
@MMS IMHO it is easy to get drive type (1541, 1551, 1581, ...)...
[1551] this drive is perfectly emulated by Yape and Plus4emu and its theoretical maximum speed is greater than 1541. I hope to see one day 1551 turbo loader with 15-20 KB/s. This gives more chances for +4 multimedia...
@Sandor I am almost sure that your code can't work on the real hardware, because the h-sync frequency in NTSC mode is not the PAL one. I tried it with plus4emu (it is slightly more accurate than Yape) - sys16384 doesn't show any picture. BTW we still don't have any demo for this mode. We also don't have demos (only pictures) for the interlaced mode.
|
|
Posted By
Luca on 2014-01-22 03:08:16
| Re: DFLI game(s)
Awake in the night until the code's up and significantly working, that's the spirit! No one should lose it on the way of aging up, my appreciation to you for this!
As Litwr pointed out: yes, YAPE is the best Plus/4 emulator so far, combining several characteristics, including being user-friendly. It matches the real machine 99% perfectly; nonetheless the real iron has the last word in the end. And, from personal experience, never never NEVER trust YAPE, especially in cases like yours (taking the raster back, e.g. opening up/down borders): where YAPE runs flawlessly, the real iron creates video artifacts (argh!) or slows down your tune (ARGH!). plus4emu is not as friendly as YAPE, but it's better at emulating the right behavior with video stuff; use it intensively for this, but still let the real iron say... ok, you got it. Don't trust VICE +4, Minus4, Artifex, WinEMU or further minor ones.
This is how plus4emu displays the NTSC screen that follows your code:
Congrats on your efforts, IMO this forum gives its best when some serious coding threads get in, and your project is something that really has to be seen. Because of this, I'm gonna check your code on my real Plus/4 once at home this evening (have to work, then my terrible she-dentist awaits me ), and starting from now feel free to ask for further testing too: my Plus/4 is up'n'working 24/7, as several users know
Meanwhile, I encourage you in using one of the many crossassemblers around, in particular the Plus4IDE, a dedicated one: testing your code would be easier that way. This is your code quickly ported on AS65 in order to compile it on Plus4IDE:
; Sandor's video testing
; 2014-01-22

    org $1001-2
    dw $1001
    dw nextln,0 ; second word is line number
    db $9e
    if start > 9999
    db "0"+start/10000
    endif
    db "0"+(start/1000)%10,"0"+(start/100)%10,"0"+(start/10)%10,"0"+start%10,0
nextln db 0,0
start sei
    lda #$22 ; red ink
    sta $053b
clrscreen ; fill screen by ROM routine
    ldx #$04
    ldy #$08
    lda #$71
    jsr $c5a7
    ldx #$04
    ldy #$0c
    lda #$20
    jsr $c5a7

    jsr $ff4f ; print on screen by ROM routine
    db $93,"PRESS: [1] FOR PAL; [2] FOR NTSC."
    db 0

    sta $ff3f

kcheck ; keyboard check for keys '1' and '2'
    lda #$7f
    sta $fd30
    sta $ff08
    lda $ff08
    cmp #$fe
    bne *+5
    jmp $2000
    cmp #$f7
    bne kcheck
    jmp $4000

    org $2000
    sei
    lda #$00
    ldx #$00
    ldy #$00
    sta $ff19
    clc
    adc #$01
    bcc $200b
    clc
    inx
    bne $200b
    iny
    cpy #$80
    bne $200b
    lda #$71
    sta $ff19
    jmp $201d

    org $4000
    sei
    lda #$48
    sta $ff07
    ldx #nmi1 & 255
    ldy #nmi1 >> 8
    stx $fffe
    sty $ffff
    lda #$e1
    sta $ff0b
    sta $ff3f
    cli
    jmp $2001

    org $4100
nmi1 pha
    lda #$c8
    sta $ff1d
    lda $ff09
    sta $ff09
    lda #$fc
    sta $ff0b
    lda #nmi2 >> 8
    sta $ffff
    pla
    rti

    org $4200
nmi2 pha
    lda #$e6
    sta $ff1d
    lda $ff09
    sta $ff09
    lda #$e1
    sta $ff0b
    lda #nmi1 >> 8
    sta $ffff
    pla
    rti
|
|
Posted By
Litwr on 2014-01-22 05:19:15
| Re: DFLI game(s)
@Luca "No one should lose it on the way of aging up" - it is the naked truth for me. My emulator is still the third after Plus4emu and Yape in terms of accuracy of TED/CPU emulation. It is even the best for several tasks. So it can't be called one of "the further minor ones".
|
|
Posted By
Luca on 2014-01-22 05:56:34
| Re: DFLI game(s)
Ah oops my fault Litwr, you're right
|
|
Posted By
Sandor on 2014-01-22 08:28:30
| Re: DFLI game(s)
Eh, pity. It did sound too good to be true. And all my tests that tried to have the visible part of the screen in PAL and the border in NTSC didn't work at all. Who would have thought that a difference between 15.734 and 15.625 kHz would result in such nasty artifacts - I would expect a 2 pixel shift at worst (which could be corrected on a per-scanline basis), but the screenshot from plus4emu shows the visible 320 pixels of every rasterline compressed into 232. I am still curious about the real machine results though, but I won't be able to test it for at least 2 more weeks, because of my ROM issues. I am monitoring all the auction sites, so as soon as something shows up, I'll have a better test environment.
Well the NTSC trick isn't essential for my project anyway, but it would allow me to ditch the 2nd screen buffer and use that 16KB for additional sprite animation frames. And I also find it an interesting experiment.
@Litwr: I see you did code a scroller in NTSC mode. That one obviously uses border color modifications with the screen turned off, but I am still curious how come you didn't get those nasty horizontal artifacts in doing so? Did you do a horizontal raster modification to offset for the horizontal refresh frequency?
@Luca: I have actually started the project in the CBM prg Studio, because I really like the GUI of that thing. Also thanks for the testing.
|
|
Posted By
Litwr on 2014-01-22 11:02:13
| Re: DFLI game(s)
NTSC mode is not so easy - it sets h-sync 25% higher than normal. In this case you have to create every raster line in software (by changing $ff1e). My demo has about 1% lower h-sync - that is acceptable to a TV or monitor. The vertical scrolling is done by changing $ff1c/1d. BTW I used copy-paste to move your code to plus4emu.
|
|
Posted By
siz on 2014-01-22 14:58:45
| Re: DFLI game(s)
@Sandor: where do you live? I'm commuting between Tatabánya and Budapest. I can probably give you a KERNAL and/or a BASIC ROM for free if we can meet somewhere. Send me an email at <mynickname>@<mynickname>.hu if you are interested.
|
|
Posted By
MMS on 2014-01-22 15:45:24
| Re: DFLI game(s)
Sandor, I did not solve it
I made some trials with a 1581 turboloader released in a C64 hack online magazine, but after I realized that I could not integrate it into a BASIC prg that is sped up with the Austrospeed compiler, I said: no way. Soon after that I realized that SYS commands may also not work in the compiled PRG, and as I plan a real RS232 mouse driver too ( ) it was an easy decision: pure BASIC, no speed
What I decided:
In the D64 directory there will be, and you can easily choose:
"1551 turbo loader" PRG 4
"1541 turbo loader" PRG 3
Easy and straightforward
(but the PRG is still VERY VERY far from the end)
|
|
Posted By
Luca on 2014-01-22 19:06:43
| Re: DFLI game(s)
...and this is how the NTSC code runs on a real Plus/4
|
|
Posted By
Sandor on 2014-01-22 23:24:19
| Re: DFLI game(s)
Latest news in the PAL in NTSC arena:
http://s21.postimg.org/54s18969y/NTSC_cycles_PAL_image.jpg
If this works on the real machine, that means an extra ~2000 cycles per frame, at the cost of being able to display a black border only (to avoid artifacts).
@siz: Thanks for the offer, mail sent!
|
|
Posted By
Sandor on 2014-01-23 01:26:46
| Re: DFLI game(s)
And here we go... Just need someone to verify this on a real plus4:
http://www.datafilehost.com/d/406ea15a
It will take about a minute to run, but it is fully automatic now and you don't need to manually measure anything. Works both in YAPE and in plus4emu and I've even got consistent results in both.
It measures how many cycles I have while displaying a 3 character height score board on top and a 320x160 dfli image for the gameplay area. Since I can't use the CPU at all while the dfli image is shown, this benchmark sets a flag where the dfli image should start and another flag where it ends. It then does a loop of exactly 19947615 cycles and counts the number of frames that it takes to finish the loop. And then at the end we have 19947615/frames as the end result - which shows how many cycles we have for each frame. Oh, and here are the screenshots from the emulators:
http://s30.postimg.org/z97sr9pz4/NTSC_cycles_PAL_image_Benchmark.jpg
As you can see the NTSC trick gives us 2244 free cycles per frame, or 14.85% increase in available cycles (whichever you prefer.)
Now I just really need someone to confirm this on a real machine and we're good to go!!!
|
|
Posted By
Sandor on 2014-01-23 01:29:03
| Re: DFLI game(s)
PS: When/if you're downloading the .prg file, make sure to uncheck the "Use our download manager and get recommended downloads" thingie on the hosting site because if you don't it will give you some .exe file instead of my .prg which is 99.99999% something that you don't want to happen. (Sorry about the bad hosting, but it was the first thing that came up in google for "free file hosting")
|
|
Posted By
Luca on 2014-01-23 02:43:26
| Re: DFLI game(s)
@sandor: leave it to me for the real machine test, I'll run it and report the values straight from the screen After worktime, in some hours...
EDIT: here are the results from my real Plus/4:
PAL ONLY: 14986.9384
PAL/NTSC (the video goes nuts and I can't read the result! let's run/stop+reset then PRINT C/A): 17211.0569
emulators == real iron !
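The benchmark result is simply the fixed loop length divided by the number of frames it took. As a minimal sketch, the frame counts below (1331 and 1159) are not quoted anywhere in the thread; they are inferred here because they are the whole numbers that reproduce the two reported values.

```python
LOOP_CYCLES = 19947615   # length of the benchmark loop, as given in the thread

def cycles_per_frame(frames):
    """Free CPU cycles per frame, computed the way the benchmark does."""
    return LOOP_CYCLES / frames

# Inferred frame counts (an assumption, not read from the program output).
pal = cycles_per_frame(1331)
pal_ntsc = cycles_per_frame(1159)
gain = pal_ntsc - pal

print(f"PAL only: {pal:.4f}")
print(f"PAL/NTSC: {pal_ntsc:.4f}")
print(f"gain:     {gain:.0f} cycles per frame ({gain / pal * 100:.2f}%)")
```

With these counts the gain works out to about 2224 cycles per frame, or roughly 14.84%.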
|
|
Posted By
Sandor on 2014-01-23 11:46:01
| Re: DFLI game(s)
Did the screen go nuts during the speed test too, or was it ok during the test and went nuts at the end?
Are there any artifacts?
(I forgot to set it back to PAL once the test is over)
|
|
Posted By
Sandor on 2014-01-23 12:26:21
| Re: DFLI game(s)
Sorry for doublepost (again), but I can't edit my posts because I'm not registered yet.
@Luca: if you just do a sys 8192, does the screen get immediately corrupted and are there any visible artifacts below/above the screen area?
Btw if there are no artifacts on the real machine, I can make a more general code and post it, so that anyone can use it to get the extra 2k cycles per frame.
|
|
Posted By
Litwr on 2014-01-23 13:06:58
| Re: DFLI game(s)
This program pushes 100 raster lines out of h-sync -- that is a big stress for the tv/monitor. The black border can't completely hide the noisy image. Maybe some video hardware has some interpolation function to suppress this big noise at 30% of the raster...
|
|
Posted By
Sandor on 2014-01-23 13:20:48
| Re: DFLI game(s)
@Litwr: It is possible to change the code to work only during the VBlank and then there are no artifacts , but the gain is only 1000 cycles per frame in that case. Also I believe if we could do a better sync on the last NTSC rasterline, it would reduce the affected rasterlines even further.
|
|
Posted By
Luca on 2014-01-23 14:00:57
| Re: DFLI game(s)
During the PAL test, the screen is okay, with some pinky flashing as expected. When the PAL/NTSC test runs, my Commodore monitor displays a wrung screen. This Daewoo-manufactured 1802 model I have is very sensitive to the raster's shocks depending on the temperature it has reached, hence I show you a first low temp wrung screen, and a second high temp one. In any case, the noisy NTSC screen seen in the previous post's picture only comes at the end of the PAL/NTSC test.
NTSC test running (cold monitor)
NTSC test running (hot monitor)
|
|
Posted By
Sandor on 2014-01-24 21:06:42
| Re: DFLI game(s)
Ok, so further tuning is definitely needed.
edit: I did some further testing and mostly code reading.
Litwr's HNY2013 was very informative in this aspect. This is the data I have gathered (and please correct me if I am wrong):
When not displaying the screen:
PAL scanline is 109 cycles.
NTSC-PAL scanline is 143 cycles.
To completely avoid any artifacts, we need to sync every single scanline. Unless the code is specifically written to do this, we have to rely on interrupts. The raster interrupt is bad for this because it starts at the beginning of the scanline, we need an interrupt at the end of the scanline and we need it to trigger on PAL scanline interval. We can setup a timer interrupt using $ff00-$ff01 to trigger at the right moment. However we can't just sta a fixed value because the interrupt will start at a random 0-6 cycle delay depending on the code we are using. The fastest interrupt code I managed to write looks like this:
    sta $52
    lda $ff1e
    sbc #$20
    sta $ff1e
    lda $ff09
    sta $ff09
    lda $52
    rti
However, with the 7 cycles interrupt startup cost, that interrupt will cost us a total of 37 cycles. Therefore instead of gaining, we would lose 3 cycles / scanline.
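The 37-cycle total and the 3-cycle loss can be tallied as follows. This is a small sketch using the standard 6502 cycle counts for each addressing mode and the two scanline lengths quoted above; the table layout and variable names are only for illustration.

```python
IRQ_ENTRY = 7   # cycles the 6502 needs to enter an interrupt

# Instruction and standard 6502 cycle count for its addressing mode.
HANDLER = [
    ("sta $52",   3),   # zero page store
    ("lda $ff1e", 4),   # absolute load
    ("sbc #$20",  2),   # immediate
    ("sta $ff1e", 4),   # absolute store
    ("lda $ff09", 4),
    ("sta $ff09", 4),
    ("lda $52",   3),   # zero page load
    ("rti",       6),
]

PAL_LINE = 109        # cycles per scanline, screen off (from the post)
NTSC_PAL_LINE = 143   # cycles per scanline with the NTSC clock (from the post)

cost = IRQ_ENTRY + sum(cycles for _, cycles in HANDLER)
net = (NTSC_PAL_LINE - PAL_LINE) - cost

print(f"interrupt cost: {cost} cycles, net gain per scanline: {net}")
```

The 34 extra cycles per scanline are more than used up by the 37-cycle interrupt, which is why a per-scanline interrupt does not pay off.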
In conclusion: You can use the NTSC/PAL trick in cases where you are either copying from a fixed to a fixed location every frame, or using some table-based calculations where you never cross a page boundary and you interleave the sync with the code, the way Litwr did in his demo. You can switch to NTSC at the first empty scanline and switch back to PAL at the last empty scanline, in which case you will get 111 scanlines with an extra ~3000 cycles in total (with the sync at the start and the 4 cycles of sync at every scanline included). So for a demo or for a game that uses that kind of routine (for example drawing a tunnel, a rot zoomer or a plasma could easily work, scrollers could too), you would gain an extra 3000 cycles per frame. However, sadly for my project, this whole thing is useless because the sprite engine I am using is completely unpredictable when it comes to cycle counts. :(
@Litwr: I see you're moving $ff1e to $b1, but I am not quite sure where you are moving it from. Did you move it from $e2 or earlier?
|
|
Posted By
bubis on 2018-09-03 04:52:46
| Re: DFLI game(s)
As for your question regarding fastloaders: Bitfire is fast, it loads stuff and it supports both 1541/1551 drives.
https://github.com/dotscha/bitfire/releases
|
|