The Multiply16x16 subroutine - Revs on the BBC Micro

       Name: Multiply16x16                                           [Show more]
       Type: Subroutine
   Category: Maths (Arithmetic)
    Summary: Multiply a sign-magnitude 16-bit number and a signed 16-bit number
    Context: See this subroutine in context in the source code
 References: This subroutine is called as follows:
             * MultiplyCoords calls Multiply16x16


 This routine calculates:

   (A T) = (QQ PP) * (SS RR) / 256^2

 It uses the following algorithm:

  (QQ PP) * (SS RR) = (QQ << 8 + PP) * (SS << 8 + RR)
                    = (QQ << 8 * SS << 8) + (QQ << 8 * RR)
                                          + (PP * SS << 8)
                                          + (PP * RR)
                    = (QQ * SS) << 16 + (QQ * RR) << 8
                                      + (PP * SS) << 8
                                      + (PP * RR)

 Finally, it replaces the low byte multiplication in (PP * RR) with 128, as an
 estimate, as it's a pain to multiply the low bytes of a signed integer with a
 sign-magnitude number. So the final result that is returned in (A T) is as
 follows:

   (A T) = (QQ PP) * (SS RR) / 256^2
         = ((QQ * SS) << 16 + (QQ * RR) << 8 + (PP * SS) << 8 + 128) / 256^2

 which is the algorithm that is implemented in this routine.


 Arguments:

   (QQ PP)              16-bit signed integer

   (SS RR)              16-bit sign-magnitude integer with the sign bit in bit 0
                        of RR

   H                    The sign to apply to the result (in bit 7)


 Returns:

   (A  T)               (QQ PP) * (SS RR) * abs(H)


.Multiply16x16

 LDA QQ                 \ If (QQ PP) is positive, jump to muls1 to skip the
 BPL muls1              \ following

 LDA #0                 \ (QQ PP) is negative, so we now negate (QQ PP) so it's
 SEC                    \ positive, starting with the low bytes
 SBC PP
 STA PP

 LDA #0                 \ And then the high bytes
 SBC QQ                 \
 STA QQ                 \ So we now have (QQ PP) = |QQ PP|

 LDA H                  \ Flip bit 7 of H, so when we set the result to the sign
 EOR #%10000000         \ of H below, this ensures the result is the correct
 STA H                  \ sign

.muls1

 LDA RR                 \ If bit 0 of RR is clear, then (SS RR) is positive, so
 AND #1                 \ jump to muls2
 BEQ muls2

 LDA H                  \ Flip bit 7 of H, so when we set the result to the sign
 EOR #%10000000         \ of H below, this ensures the result is the correct
 STA H                  \ sign

.muls2

 LDA QQ                 \ Set U = QQ
 STA U

 LDA RR                 \ Set A = RR

 JSR Multiply8x8        \ Set (A T) = A * U
                        \           = RR * QQ

 STA W                  \ Set (W T) = (A T)
                        \           = RR * QQ

 LDA T                  \ Set (W V) = (A T) + 128
 CLC                    \           = RR * QQ + 128
 ADC #128               \
 STA V                  \ starting with the low bytes

 BCC muls3              \ And then the high byte
 INC W

                        \ So we now have (W V) = RR * QQ + 128

.muls3

 LDA SS                 \ Set A = SS

 JSR Multiply8x8        \ Set (A T) = A * U
                        \           = SS * QQ

 STA G                  \ Set (G T) = (A T)
                        \           = SS * QQ

 LDA T                  \ Set (G W V) = (G T 0) + (W V)
 CLC                    \
 ADC W                  \ starting with the middle bytes (as the low bytes are
 STA W                  \ simply V = 0 + V with no carry)

 BCC muls4              \ And then the high byte
 INC G

                        \ So now we have:
                        \
                        \   (G W V) = (G T 0) + (W V)
                        \           = (SS * QQ << 8) + RR * QQ + 128

.muls4

 LDA PP                 \ Set U = PP
 STA U

 LDA SS                 \ Set A = SS

 JSR Multiply8x8        \ Set (A T) = A * U
                        \           = SS * PP

 STA U                  \ Set (U T) = (A T)
                        \           = SS * PP

 LDA T                  \ Set (G T ?) = (G W V) + (U T)
 CLC                    \
 ADC V                  \ starting with the low bytes (which we throw away)

 LDA U                  \ And then the high bytes
 ADC W
 STA T

 BCC muls5              \ And then the high byte
 INC G

                        \ So now we have:
                        \
                        \   (G T ?) = (G W V) + (U T)
                        \           = (SS * QQ << 8) + RR * QQ + 128 + SS * PP
                        \           = (QQ * SS) << 8 + (QQ * RR) + (PP * SS)
                        \              + 128
                        \           = (QQ PP) * (SS RR) / 256
                        \
                        \ So:
                        \
                        \   (G T) = (G T ?) / 256
                        \         = (QQ PP) * (SS RR) / 256^2
                        \
                        \ which is the result that we want

.muls5

 LDA G                  \ Set (A T) = (G T)

 BIT H                  \ We are about to fall through into Absolute16Bit, so
                        \ this ensures we set the sign of (A T) to the sign in
                        \ H, so we get:
                        \
                        \   (A T) = (A T) * abs(H)

[X]

Variable G in workspace Zero page

Temporary storage, used in a number of places

[X]

Variable H in workspace Zero page

Temporary storage, used in a number of places

[X]

Subroutine Multiply8x8 (category: Maths (Arithmetic))

Calculate (A T) = T * U

[X]

Variable PP in workspace Zero page

Temporary storage, used in a number of places

[X]

Variable QQ in workspace Zero page

Temporary storage, used in a number of places

[X]

Variable RR in workspace Zero page

Temporary storage, used in a number of places

[X]

Variable SS in workspace Zero page

Temporary storage, used in a number of places

[X]

Variable T in workspace Zero page

Temporary storage, used in a number of places

[X]

Variable U in workspace Zero page

Temporary storage, used in a number of places

[X]

Variable V in workspace Zero page

Temporary storage, used in a number of places

[X]

Variable W in workspace Zero page

Temporary storage, used in a number of places

[X]

Label muls1 is local to this routine

[X]

Label muls2 is local to this routine

[X]

Label muls3 is local to this routine

[X]

Label muls4 is local to this routine

[X]

Label muls5 is local to this routine

Maths (Arithmetic): Multiply16x16