Can I turn this into a loop through some 16-Bit Magic?

Asked 5/2, 2014 at 4:18 Answered 7/5, 2014 at 18:43

I'm starting out with 6502 Assembly right now and have a problem wrapping my head around loops that need to deal with numbers bigger than 8 bits.

Specifically, I want to loop through some memory locations. In pseudo-c-code, I want to do this:

    // address is a pointer to a single byte in memory
    unsigned char *address = (unsigned char *)0x44AD;
    for (int x = 0; x < 20; x++) {
        // Set memory location to 0x01
        *address = 0x01;
        // Move pointer forward 40 bytes
        address += 0x28;
    }

So starting at address $44AD I want to write $01 into RAM, then jump forward $28, write $01 there, and keep stepping forward by $28 until I've written 20 locations (the last address to write is $47A5).

My current approach is loop unrolling which is tedious to write (even though I guess an Assembler can make that simpler):

ldy #$01
// Start from $44AD for the first row,
// then increase by $28 (40 dec) for the next 19
sty $44AD
sty $44D5
sty $44FD
    [...snipped..]
sty $477D
sty $47A5

I know about absolute indexed addressing (using the accumulator instead of the Y register: sta $44AD, x), but the index register only gives me an offset between 0 and 255. What I really think I want is something like this:

       lda #$01
       ldx #$14 // 20 Dec
loop:  sta $44AD, x * $28
       dex
       bne loop

Basically, start at the highest address, then loop down. Problem is that $14 * $28 = $320 or 800 dec, which is more than I can actually store in the 8-Bit X register.
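To make the overflow concrete, here is a small C sketch of the arithmetic. The helper names `row_address` and `offset_in_8_bits` are made up for illustration; they only model the numbers and are not part of any 6502 toolchain.

```c
#include <stdint.h>

/* Illustrative helper: address of a given row when rows are $28 bytes apart. */
uint16_t row_address(uint16_t base, uint8_t row)
{
    return (uint16_t)(base + (uint16_t)row * 0x28u);
}

/* Illustrative helper: what an 8-bit register would be left holding
   if it were asked to store row * $28 (only the low byte survives). */
uint8_t offset_in_8_bits(uint8_t row)
{
    return (uint8_t)((unsigned)row * 0x28u);
}
```

With these, `row_address(0x44AD, 19)` gives $47A5, while `offset_in_8_bits(20)` keeps only $20 of the full $320, which is why a single index register cannot span all the rows.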

Is there an elegant way to do this?

Biathlon answered 5/2, 2014 at 4:18 Comment(0)

The 6502 is an 8-bit processor, so you aren't going to be able to calculate 16-bit addresses entirely in registers. You will need to use indirect addressing through a 16-bit pointer stored in zero page.

      // set $00,$01 to $44AD + 19 * $28 = $47A5 (the last address to write)
      LDA #$A5
      STA $00
      LDA #$47
      STA $01

      LDX #20  // Loop 20 times
      LDY #0
loop: LDA #$01 // the value to store
      STA ($00),Y // store A to the address held in $00,$01
      // subtract $28 from $00,$01 (16-bit subtraction)
      SEC
      LDA $00
      SBC #$28
      STA $00
      LDA $01
      SBC #0
      STA $01
      // do it 19 more times
      DEX
      BNE loop
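The 16-bit subtraction in that loop works because the second SBC (with an operand of 0) subtracts only the borrow left over from the low byte. A rough C model of that one step, assuming the pointer is kept as two separate bytes as it is in zero page (the function name `sub16_bytes` is hypothetical):

```c
#include <stdint.h>

/* Hypothetical model of: SEC / LDA lo / SBC #step / STA lo /
   LDA hi / SBC #0 / STA hi. */
void sub16_bytes(uint8_t *lo, uint8_t *hi, uint8_t step)
{
    /* After the low-byte SBC the carry flag is clear if a borrow happened. */
    uint8_t borrow = (*lo < step) ? 1 : 0;

    *lo = (uint8_t)(*lo - step);    /* low byte wraps modulo 256 */
    *hi = (uint8_t)(*hi - borrow);  /* SBC #0 takes away only the borrow */
}
```

For example, starting from the bytes $05 (low) and $47 (high), one step of $28 leaves $DD and $46, which is $4705 - $28 = $46DD.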

Alternatively, you could use self-modifying code. This is a dubious technique in general, but common on embedded processors like the 6502 because they are so limited.

      // set the instruction at "patch" to "STA $47A5" (the last address to write)
      LDA #$A5
      STA patch+1
      LDA #$47
      STA patch+2

      LDX #20  // Loop 20 times
loop: LDA #$01 // the value to store
patch:STA $FFFF
      // subtract $28 from the address in "patch"
      SEC
      LDA patch+1
      SBC #$28
      STA patch+1
      LDA patch+2
      SBC #0
      STA patch+2
      // do it 19 more times
      DEX
      BNE loop

Neale answered 5/2, 2014 at 5:32 Comment(6)

Thank you so much! Reading up on the indirect mode, that eluded me. The self-modifying code is also interesting, I need to wrap my head around the whole "code = memory" idea that the old machine had. – Biathlon 5/2, 2014 at 6:6

STC?? SEC, surely. Incidentally, as a readability-aid in self-modifying code, I use $C0DE instead of $FFFF and have my syntax highlighter fluoresce it in yellow - makes it very easy to spot places where you're doing something gnarly. – Goatish 5/2, 2014 at 16:1

@EightBitGuru Sorry got my instruction sets confused. – Neale 5/2, 2014 at 16:34

Nitpick: The second STA in the first code snippet should be to $01. – Proximity 14/2, 2014 at 20:32

Since this question is tagged c64, you want to avoid using zeropage addresses $00 and $01, as they are used for I/O ports and ROM/RAM configuration by the 6510 CPU. If you want your code to be run safely from BASIC through a SYS command, your best bet is to use the $FB through $FE range. – Paynim 24/4, 2014 at 9:24

I used to use self-modifying code on the C64. Sure it's ugly, but when you need it to run on exactly one computer, it works. – Bash 16/9, 2014 at 18:24

A more efficient way to copy 1 KB of data (source and dest are placeholder labels for the two buffers):

    ldy #0
nextvalue:
    lda source, y
    sta dest, y

    lda source+$100, y
    sta dest+$100, y

    lda source+$200, y
    sta dest+$200, y

    lda source+$300, y
    sta dest+$300, y
    iny
    bne nextvalue

A few notes:

Faster, because the loop overhead is shared by four copies per iteration. It takes more space because there are more instructions.
If your assembler supports macros, you can easily make the number of blocks the code handles configurable.
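For readers more comfortable with C, the unrolled loop can be modelled roughly as follows. This is only a sketch of the idea: `copy_1k`, `dest` and `source` are illustrative names, and the 8-bit variable `y` stands in for the Y register wrapping from 255 back to 0.

```c
#include <stdint.h>

/* Illustrative model: copy 1024 bytes as four 256-byte pages per pass. */
void copy_1k(uint8_t *dest, const uint8_t *source)
{
    uint8_t y = 0;                       /* LDY #0 */
    do {
        dest[y]         = source[y];
        dest[0x100 + y] = source[0x100 + y];
        dest[0x200 + y] = source[0x200 + y];
        dest[0x300 + y] = source[0x300 + y];
        y++;                             /* INY: wraps 255 -> 0 */
    } while (y != 0);                    /* BNE: stop once Y is back at 0 */
}
```

The loop body runs 256 times, so one increment and one branch are paid per four bytes copied instead of per byte.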

Might not be 100% relevant to this, but here's another way to have longer-than-255 loops:

nextblock:
    ldy #0
nextvalue:
    lda address, y
    iny
    bne nextvalue

;Insert code to be executed between each block here:

    dec numblocks
    bpl nextblock

numblocks:
    .byte 3

A few notes:

As written, the code doesn't do anything meaningful; it just runs the inner loop numblocks + 1 times, because bpl still branches when the counter reaches zero. "Add your own code" :-) (I often use this together with some self-modifying code that, for example, increments the address of an sta address, y instruction.)
bpl can be dangerous if you don't know how it works: it branches while bit 7 of the result is clear, so it works well enough here but would fall through after the first block if numblocks held a value above $80.
If you need to execute the same code again, numblocks needs to be reset first.
The code can be made a little faster by putting numblocks in zero page.
If the X register isn't needed for something else (it often is), you can use it instead of a memory location.
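The behaviour of the dec/bpl pair is easy to misjudge, so here is a small C model of how many passes it makes for a given starting value. The function name `passes_with_bpl` is made up for this sketch, and it models only the counter, not the block being processed.

```c
#include <stdint.h>

/* Illustrative model: count the passes made by a loop that ends with
   "DEC numblocks / BPL nextblock". */
unsigned passes_with_bpl(uint8_t numblocks)
{
    unsigned passes = 0;
    do {
        passes++;                        /* one pass over a 256-byte block */
        numblocks--;                     /* DEC numblocks (wraps modulo 256) */
    } while ((numblocks & 0x80) == 0);   /* BPL: branch while bit 7 is clear */
    return passes;
}
```

Starting from 3 this gives four passes, since the branch is still taken when the counter hits zero, and starting from $81 it gives a single pass, since the first decrement already leaves bit 7 set.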

Exclamation answered 7/5, 2014 at 18:43 Comment(0)
