; *** Requirements ***
; lib/util
; *** Code ***

; Parse the hex char at A and extract its 0-15 numerical value. Put the result
; in A.
;
; On success, the carry flag is reset. On error, it is set.
parseHex:
	; First, let's see if we have an easy 0-9 case

	add a, 0xc6 ; maps '0'-'9' onto 0xf6-0xff
	sub 0xf6 ; maps to 0-9 and carries if not a digit
	ret nc

	and 0xdf ; converts lowercase to uppercase
	add a, 0xe9 ; map 0x11-0x16 onto 0xFA-0xFF
	sub 0xfa ; map onto 0-5
	ret c
	; we have an A-F digit
	add a, 10 ; C is clear, map back to 0xA-0xF
	ret
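	; Worked trace of the checks above, using ASCII codes. Added for
	; illustration only; values are computed by hand with 8-bit wraparound.
	;   '7' (0x37): 0x37+0xc6 = 0xfd; 0xfd-0xf6 = 0x07, no carry: return 7
	;   'c' (0x63): 0x63+0xc6 wraps to 0x29; 0x29-0xf6 = 0x33 with carry;
	;               0x33 and 0xdf = 0x13; 0x13+0xe9 = 0xfc; 0xfc-0xfa = 0x02;
	;               0x02+10 = 0x0c, carry clear: return 12
	;   'G' (0x47): 0x47+0xc6 wraps to 0x0d; 0x0d-0xf6 = 0x17 with carry;
	;               0x17 and 0xdf = 0x17; 0x17+0xe9 wraps to 0x00;
	;               0x00-0xfa borrows, carry set: error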

; Parse string at (HL) as a decimal value and return value in DE.
; Reads as many digits as it can and stops when:
; 1 - A non-digit character is read
; 2 - The number overflows 16 bits
; HL is advanced to the character following the last successfully read char.
; Error conditions are:
; 1 - There wasn't at least one character that could be read.
; 2 - Overflow.
; Sets Z on success, resets it on error.
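;
; Usage sketch based on the contract above. The caller is hypothetical: the
; labels `numStr` and `badNumber` are illustrative and not part of this file.
;	ld hl, numStr		; HL points to the text "123,"
;	call parseDecimal
;	jr nz, badNumber	; NZ: no digit could be read, or overflow
;	; on success, DE = 123 and HL points to the ','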
Decimal parse optimisations (#45)
* Optimised parsing functions and other minor optimisations
UnsetZ has been reduced by a byte, and between 17 and 28 cycles are saved depending on branching. Since branching depends on `a` being 0, it shouldn't have to branch very often, so 28 cycles should be saved most of the time. Including the initial call, the old version was 60 cycles, so this should be nearly twice as fast.
fmtHex has been reduced by 4 bytes and between 3 and 8 cycles based on branching.
fmtHexPair had a redundant "and" removed, saving two bytes and seven cycles.
parseHex has been reduced by 7 bytes. Due to so much branching, it's hard to say if it's faster, but it should be since it's fewer operations and now conditional returns are used which are a cycle faster than conditional jumps. I think there's more to improve here, but I haven't come up with anything yet.
* Major parsing optimisations
Totally reworked both parseDecimal and parseDecimalDigit
parseDecimalDigit no longer exists, as it could be replaced by an inline alternative in the 4 places it appeared. This saves one byte overall, as the inline version is 4 bytes, 1 byte more than a call, and removing the function saved 5 bytes. It has been reduced from between 52 and 35 cycles (35 on error, so we'd expect 52 cycles to be more common unless someone's really bad at programming) to 14 cycles, so 2-3 times faster.
parseDecimal has been reduced by a byte, and now the main loop is just about twice as fast, but with increased overhead. To put this into perspective, if we ignore error cases:
For decimals of length 1 it'll be 1.20x faster, for decimals of length 2, 1.41x faster, for length 3, 1.51x faster, for length 4, 1.57x faster, and for length 5 and above, at least 1.48x faster (even faster if there's leading zeroes or not the worst case scenario).
I believe there is still room for improvement, since the first iteration can be nearly replaced with "ld l, c" since 0*10=0, but when I tried this I could either add a zero check into the main loop, adding around 40 cycles and 10 bytes, or add 20 bytes to the overhead, and I don't think either of those options are worth it.
* Inlined parseDecimalDigit
See previous commit, and /lib/parse.asm, for details
* Fixed tabs and spacing
* Fixed tabs and spacing
* Better explanation and layout
* Corrected error in comments, and a new parseHex
5 bytes saved in parseHex. Again, it's hard to say what that does to speed: the shortest path is probably a little slower, but I think non-error cases should be around 9 cycles faster for decimal and 18 cycles faster for hex, as there are now only two conditional returns and no complement carries.
* Fixed the new parseHex
I accidentally did `add 0xe9` without specifying `a`
* Commented the use of daa
I made the comments surrounding my use of daa much clearer, so it isn't quite so mystical what's being done here.
* Removed skip leading zeroes, added skip first multiply
Now instead of skipping leading zeroes, the first digit is loaded directly into hl without first multiplying by 10. This means the first loop is skipped in the overhead, making the method 2-3 times faster overall, and is now faster for the more common fewer digit cases too. The number of bytes is exactly the same, and the inner loop is slightly faster too thanks to no longer needing to load a into c.
To be more precise about the speed increase over the current code, for decimals of length 1 it'll be 3.18x faster, for decimals of length 2, 2.50x faster, for length 3, 2.31x faster, for length 4, 2.22x faster, and for length 5 and above, at least 2.03x faster. In terms of cycles, this is around 100+(132*length) cycles saved per decimal.
* Fixed erroring out for all numbers >0x1999
I fixed the errors for numbers >0x1999, sadly it is now 6 bytes bigger, so 5 bytes larger than the original, but the speed increases should still hold.
* Fixed more errors, clearer choice of constants
* Clearer choice of constants
* Moved and indented comment about fmtHex's method
* Marked inlined parseDecimalDigit uses
* Renamed .error, removed trailing whitespace, more verbose comments.
2019-10-24 07:58:32 -04:00

parseDecimal:
	; First char is special: it has to succeed.
	ld a, (hl)
	; Parse the decimal char at A and extract its 0-9 numerical value. Put the
	; result in A.
	; On success, the carry flag is reset. On error, it is set.
	add a, 0xff-'9' ; maps '0'-'9' onto 0xf6-0xff
	sub 0xff-9 ; maps to 0-9 and carries if not a digit
	ret c ; Error. If it's C, it's also going to be NZ
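	; Worked trace of the digit check above. Added for illustration only;
	; 0xff-'9' = 0xc6 and 0xff-9 = 0xf6, with 8-bit wraparound.
	;   '5' (0x35): 0x35+0xc6 = 0xfb; 0xfb-0xf6 = 0x05, no carry: digit 5
	;   ':' (0x3a): 0x3a+0xc6 wraps to 0x00; 0x00-0xf6 = 0x0a with carry
	;   '/' (0x2f): 0x2f+0xc6 = 0xf5; 0xf5-0xf6 = 0xff with carry
	; Whenever the subtraction carries, the sum was below 0xf6, so the
	; result lies in 0x0a-0xff and cannot be zero: C implies NZ.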
	; During this routine, we switch between HL and its shadow. On one side,
	; we have HL the string pointer, and on the other side, we have HL the
	; numerical result. We also use EXX to preserve BC, saving us a push.
Reworked parseHexadecimal and parseDecimal, other minor tweaks (#85)
I've tweaked nearly every function in this file, so I'll go through them one by one.
parseDecimal has been reworked a little so that `a` can be used instead of `b` for checking for overflow. I had originally intended to redo it to work like the old parseDecimal, but I think the current method (once reworked a little) is cleaner and smaller, and should be just as fast. 7 bytes and 27 cycles saved.
parseHexadecimal has been changed to load hex digits into `b` `d` `c` `e` from the right (so all the digits move along to the left so the new digit can be inserted on the right), and then only at the end is any shifting done, using the faster `add a, a` to do left shifts. 9 bytes saved and 78 cycles saved inside the loop, and then 49 cycles added after the loop.
parseBinaryLiteral had a few instructions moved around, saving two bytes and 5 cycles inside the loop, and a further 15 cycles saved on error.
parseLiteral has been reworked slightly, the isDigit call has been replaced with an inline parseDecimalDigit, saving a byte and around 20-30 cycles, with around 16 more cycles saved if the number is a decimal. The .char routine has been reduced by a byte, and 6 cycles saved on success, but 5 cycles added on error.
isDigit has been reduced by 4 bytes and 10 cycles on success, with a few more cycles saved on fail (hard to estimate due to branching).
2020-01-08 16:12:40 -05:00
parseDecimalSkip: ; enter here to skip parsing the first digit
	exx ; HL as a result
	ld h, 0
	ld l, a ; load first digit in without multiplying

.loop:
	exx ; HL as a string pointer
	inc hl
	ld a, (hl)
	exx ; HL as a numerical result

	; same digit check as above
	add a, 0xff-'9'
	sub 0xff-9
	jr c, .end
2019-11-13 21:14:29 -05:00
|
|
|
|
Reworked parseHexadecimal and parseDecimal, other minor tweaks (#85)
I've tweaked nearly every function in this file, so I'll go through them one by one.
parseDecimal has been reworked a little so that `a` can be used instead of `b` for checking for overflow. I had originally intended to redo it to work like the old parseDecimal, but I think the current method (once reworked a little) is cleaner and smaller, and should be just as fast. 7 bytes and 27 cycles saved.
parseHexadecimal has been changed to load hex digits into `b` `d` `c` `e` from the right (so all the digits move along to the left so the new digit can be inserted on the right), and then only at the end is any shifting done, using the faster `add a, a` to do left shifts. 9 bytes saved and 78 cycles saved inside the loop, and then 49 cycles added after the loop.
parseBinaryLiteral had a few instructions moved around, saving two bytes and 5 cycles inside the loop, and a further 15 cycles saved on error.
parseLiteral has been reworked slightly, the isDigit call has been replaced with an inline parseDecimalDigit, saving a byte and around 20-30 cycles, with around 16 more cycles saved if the number is a decimal. The .char routine has been reduced by a byte, and 6 cycles saved on success, but 5 cycles added on error.
isDigit has been reduced by 4 bytes and 10 cycles on success, with a few more cycles saved on fail (hard to estimate due to branching).
2020-01-08 16:12:40 -05:00
|
|
|
ld b, a ; we can now use a for overflow checking
|
Decimal parse optimisations (#45)
* Optimised parsing functions and other minor optimisations
UnsetZ has been reduced by a byte, and between 17 and 28 cycles saved based on branching. Since branching is based on a being 0, it shouldn't have to branch very often and so be 28 cycles saved most the time. Including the initial call, the old version was 60 cycles, so this should be nearly twice as fast.
fmtHex has been reduced by 4 bytes and between 3 and 8 cycles based on branching.
fmtHexPair had a redundant "and" removed, saving two bytes and seven cycles.
parseHex has been reduced by 7 bytes. Due to so much branching, it's hard to say if it's faster, but it should be since it's fewer operations and now conditional returns are used which are a cycle faster than conditional jumps. I think there's more to improve here, but I haven't come up with anything yet.
* Major parsing optimisations
Totally reworked both parseDecimal and parseDecimalDigit
parseDecimalDigit no longer exists, as it could be replaced by an inline alternative in the 4 places it appeared. This saves one byte overall, as the inline version is 4 bytes, 1 byte more than a call, and removing the function saved 5 bytes. It has been reduced from between 52 and 35 cycles (35 on error, so we'd expect 52 cycles to be more common unless someone's really bad at programming) to 14 cycles, so 2-3 times faster.
parseDecimal has been reduced by a byte, and now the main loop is just about twice as fast, but with increased overhead. To put this into perspective, if we ignore error cases:
For decimals of length 1 it'll be 1.20x faster, for decimals of length 2, 1.41x faster, for length 3, 1.51x faster, for length 4, 1.57x faster, and for length 5 and above, at least 1.48x faster (even faster if there's leading zeroes or not the worst case scenario).
I believe there is still room for improvement, since the first iteration can be nearly replaced with "ld l, c" since 0*10=0, but when I tried this I could either add a zero check into the main loop, adding around 40 cycles and 10 bytes, or add 20 bytes to the overhead, and I don't think either of those options are worth it.
* Inlined parseDecimalDigit
See previous commit, and /lib/parse.asm, for details
* Fixed tabs and spacing
* Fixed tabs and spacing
* Better explanation and layout
* Corrected error in comments, and a new parseHex
5 bytes saved in parseHex, again hard to say what that does to speed, the shortest possible speed is probably a little slower but I think non-error cases should be around 9 cycles faster for decimal and 18 cycles faster for hex as there's now only two conditional returns and no compliment carries.
* Fixed the new parseHex
I accidentally did `add 0xe9` without specifying `a`
* Commented the use of daa
I made the comments surrounding my use of daa much clearer, so it isn't quite so mystical what's being done here.
* Removed skip leading zeroes, added skip first multiply
Now instead of skipping leading zeroes, the first digit is loaded directly into hl without first multiplying by 10. This means the first loop is skipped in the overhead, making the method 2-3 times faster overall, and is now faster for the more common fewer digit cases too. The number of bytes is exactly the same, and the inner loop is slightly faster too thanks to no longer needing to load a into c.
To be more precise about the speed increase over the current code, for decimals of length 1 it'll be 3.18x faster, for decimals of length 2, 2.50x faster, for length 3, 2.31x faster, for length 4, 2.22x faster, and for length 5 and above, at least 2.03x faster. In terms of cycles, this is around 100+(132*length) cycles saved per decimal.
* Fixed erroring out for all number >0x1999
I fixed the errors for numbers >0x1999, sadly it is now 6 bytes bigger, so 5 bytes larger than the original, but the speed increases should still hold.
* Fixed more errors, clearer choice of constants
* Clearer choice of constants
* Moved and indented comment about fmtHex's method
* Marked inlined parseDecimalDigit uses
* Renamed .error, removed trailing whitespace, more verbose comments.
2019-10-24 07:58:32 -04:00
add hl, hl ; x2
Reworked parseHexadecimal and parseDecimal, other minor tweaks (#85)
I've tweaked nearly every function in this file, so I'll go through them one by one.
parseDecimal has been reworked a little so that `a` can be used instead of `b` for checking for overflow. I had originally intended to redo it to work like the old parseDecimal, but I think the current method (once reworked a little) is cleaner and smaller, and should be just as fast. 7 bytes and 27 cycles saved.
parseHexadecimal has been changed to load hex digits into `b` `d` `c` `e` from the right (so all the digits move along to the left so the new digit can be inserted on the right), and then only at the end is any shifting done, using the faster `add a, a` to do left shifts. 9 bytes saved and 78 cycles saved inside the loop, and then 49 cycles added after the loop.
parseBinaryLiteral had a few instructions moved around, saving two bytes and 5 cycles inside the loop, and a further 15 cycles saved on error.
parseLiteral has been reworked slightly, the isDigit call has been replaced with an inline parseDecimalDigit, saving a byte and around 20-30 cycles, with around 16 more cycles saved if the number is a decimal. The .char routine has been reduced by a byte, and 6 cycles saved on success, but 5 cycles added on error.
isDigit has been reduced by 4 bytes and 10 cycles on success, with a few more cycles saved on fail (hard to estimate due to branching).
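The digit-shuffling scheme described above for parseHexadecimal can be modelled in Python as follows. This is a hedged sketch rather than the routine itself: `hex_digit` and `parse_hexadecimal` are hypothetical names, the variables `b`, `d`, `c`, `e` merely mimic the registers, and `None` stands in for the error (NZ) result.

```python
def hex_digit(ch):
    """Return 0-15 for a hex character (either case), or None."""
    value = "0123456789abcdef".find(ch.lower())
    return value if value >= 0 else None


def parse_hexadecimal(text):
    """Return (value, chars_read), or None on error (no digit, or overflow)."""
    if not text or hex_digit(text[0]) is None:
        return None                  # we need at least one char
    b = d = c = e = 0                # one hex digit per "register"
    pos = 0
    while pos < len(text):
        digit = hex_digit(text[pos])
        if digit is None:
            break                    # stop at the first non-hex character
        if b != 0:
            return None              # a fifth significant digit: overflow
        b, d, c, e = d, c, e, digit  # move digits along, newest lands in e
        pos += 1
    # Shifting happens only once, after the loop.
    high = (b << 4) | d
    low = (c << 4) | e
    return (high << 8) | low, pos
```

Because the overflow test looks at `b` before the digits move along, leading zeroes do not count against the four-digit limit in this model.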
2020-01-08 16:12:40 -05:00
sbc a, a ; a=0 if no overflow, a=0xFF otherwise
Decimal parse optimisations (#45)
2019-10-24 07:58:32 -04:00
ld d, h
ld e, l ; de is x2
add hl, hl ; x4
Reworked parseHexadecimal and parseDecimal, other minor tweaks (#85)
2020-01-08 16:12:40 -05:00
rla
Decimal parse optimisations (#45)
2019-10-24 07:58:32 -04:00
add hl, hl ; x8
Reworked parseHexadecimal and parseDecimal, other minor tweaks (#85)
2020-01-08 16:12:40 -05:00
rla
Decimal parse optimisations (#45)
2019-10-24 07:58:32 -04:00
add hl, de ; x10
Reworked parseHexadecimal and parseDecimal, other minor tweaks (#85)
2020-01-08 16:12:40 -05:00
rla
ld d, a ; a is zero unless there's an overflow
ld e, b
Decimal parse optimisations (#45)
2019-10-24 07:58:32 -04:00
add hl, de
Reworked parseHexadecimal and parseDecimal, other minor tweaks (#85)
2020-01-08 16:12:40 -05:00
adc a, a ; same as rla except affects Z
2019-12-30 10:13:55 -05:00
; Did we overflow?
jr z, .loop ; No? continue
; error, NZ already set
exx ; HL is now string pointer, restore BC
; HL points to the char following the last success.
ret
2019-07-13 11:53:30 -04:00

.end:
2019-12-30 10:13:55 -05:00
push hl ; --> lvl 1, result
exx ; HL as a string pointer, restore BC
pop de ; <-- lvl 1, result
cp a ; ensure Z
ret

; Call parseDecimal and then check that HL points to a whitespace or a null.
parseDecimalC:
call parseDecimal
ret nz
ld a, (hl)
or a
ret z ; null? we're happy
2019-11-20 20:58:26 -05:00
jp isWS
2019-11-18 15:17:56 -05:00

2019-12-29 19:47:19 -05:00
; Parse string at (HL) as a hexadecimal value without the "0x" prefix and
; return value in DE.
2019-12-29 21:39:51 -05:00
; HL is advanced to the character following the last successfully read char.
2019-12-29 19:47:19 -05:00
; Sets Z on success.
2019-11-18 15:17:56 -05:00
parseHexadecimal:
ld a, (hl)
Reworked parseHexadecimal and parseDecimal, other minor tweaks (#85)
2020-01-08 16:12:40 -05:00
call parseHex ; before "ret c" is "sub 0xfa" in parseHex
; so carry implies not zero
ret c ; we need at least one char
2019-12-29 21:39:51 -05:00
push bc
ld de, 0
Reworked parseHexadecimal and parseDecimal, other minor tweaks (#85)
2020-01-08 16:12:40 -05:00
ld b, d
ld c, d

; The idea here is that the 4 hex digits of the result can be represented "bdce",
; where each register holds a single digit. Then the result is simply
; e = (c << 4) | e, d = (b << 4) | d
; However, the actual string may be of any length, so when loading in the most
; significant digit, we don't know which digit of the result it actually represents.
; To solve this, after a digit is loaded into a (and is checked for validity),
; all digits are moved along, with e taking the latest digit.
2019-12-29 21:39:51 -05:00
.loop:
Reworked parseHexadecimal and parseDecimal, other minor tweaks (#85)
2020-01-08 16:12:40 -05:00
dec b
inc b ; b should be 0, else we've overflowed
jr nz, .end ; Z already unset if overflow
ld b, d
ld d, c
ld c, e
ld e, a
inc hl
ld a, (hl)
call parseHex
jr nc, .loop
ld a, b
add a, a \ add a, a \ add a, a \ add a, a
or d
ld d, a

ld a, c
add a, a \ add a, a \ add a, a \ add a, a
or e
ld e, a
xor a ; ensure z

.end:
2019-12-29 21:39:51 -05:00
pop bc
2019-11-18 15:17:56 -05:00
ret

Reworked parseHexadecimal and parseDecimal, other minor tweaks (#85)
2020-01-08 16:12:40 -05:00
|
|
|
|
2019-12-29 19:47:19 -05:00
|
|
|
; Parse string at (HL) as a binary value (010101) without the "0b" prefix and
|
|
|
|
; return value in E. D is always zero.
|
2019-12-30 13:05:21 -05:00
|
|
|
; HL is advanced to the character following the last successfully read char.
|
2019-11-18 15:17:56 -05:00
|
|
|
; Sets Z on success, unset if the value overflows 8 bits.
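;
; Usage sketch (assumes HL points to "101" followed by a non-binary char;
; .error is a hypothetical label, not part of this file):
;     call parseBinaryLiteral
;     jr nz, .error       ; Z unset: value did not fit in 8 bits
;     ; Z set: DE = 5, HL points at the char right after "101"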
|
|
|
|
parseBinaryLiteral:
|
2019-12-30 13:05:21 -05:00
|
|
|
ld de, 0
|
2019-11-18 15:17:56 -05:00
|
|
|
.loop:
|
|
|
|
ld a, (hl)
|
2019-12-30 13:05:21 -05:00
|
|
|
add a, 0xff-'1'
|
|
|
|
sub 0xff-1
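; Worked example of the two instructions above (0xff-'1' = 0xce, 0xff-1 = 0xfe):
;   '0' (0x30): 0x30+0xce = 0xfe, 0xfe-0xfe = 0, no carry
;   '1' (0x31): 0x31+0xce = 0xff, 0xff-0xfe = 1, no carry
;   '2' (0x32): 0x32+0xce wraps to 0x00, 0x00-0xfe borrows, carry set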
|
|
|
|
jr c, .end
|
2020-01-08 16:12:40 -05:00
|
|
|
rlc e ; sets carry if overflow, and affects Z
|
|
|
|
ret c ; Z unset if carry set, since bit 0 of e must be set
|
2019-12-30 13:05:21 -05:00
|
|
|
add a, e
|
|
|
|
ld e, a
|
2019-11-18 15:17:56 -05:00
|
|
|
inc hl
|
2019-12-30 13:05:21 -05:00
|
|
|
jr .loop
|
2019-11-18 15:17:56 -05:00
|
|
|
.end:
|
2019-12-30 13:05:21 -05:00
|
|
|
; HL is properly set
|
|
|
|
xor a ; ensure Z
|
2019-11-18 15:17:56 -05:00
|
|
|
ret
|
|
|
|
|
2019-12-29 19:47:19 -05:00
|
|
|
; Parses the string at (HL) and returns the 16-bit value in DE. The string
|
|
|
|
; can be a decimal literal (1234), a hexadecimal literal (0x1234), a binary
|
|
|
|
; literal (0b010101) or a char literal ('X').
|
2019-12-30 19:24:53 -05:00
|
|
|
; HL is advanced to the character following the last successfully read char.
|
2019-12-29 19:47:19 -05:00
|
|
|
;
|
|
|
|
; As soon as the number doesn't fit 16-bit any more, parsing stops and the
|
|
|
|
; number is invalid. If the number is valid, Z is set, otherwise, unset.
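;
; Usage sketch (assumes HL points to the 3-character string 'X' including both
; apostrophes; .error is a hypothetical label, not part of this file):
;     call parseLiteral
;     jr nz, .error       ; Z unset: not a valid literal
;     ; Z set: DE = 0x0058 ('X'), HL advanced past the closing apostrophe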
|
|
|
|
parseLiteral:
|
|
|
|
ld de, 0 ; pre-fill
|
2019-11-18 15:17:56 -05:00
|
|
|
ld a, (hl)
|
2019-12-29 19:47:19 -05:00
|
|
|
cp 0x27 ; apostrophe
|
|
|
|
jr z, .char
|
2020-01-08 16:12:40 -05:00
|
|
|
|
|
|
|
; inline parseDecimalDigit
|
|
|
|
add a, 0xc6 ; maps '0'-'9' onto 0xf6-0xff
|
|
|
|
sub 0xf6 ; maps to 0-9 and carries if not a digit
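; Worked example of the two instructions above:
;   '7' (0x37): 0x37+0xc6 = 0xfd, 0xfd-0xf6 = 7, no carry
;   '/' (0x2f): 0x2f+0xc6 = 0xf5, 0xf5-0xf6 borrows, carry set
;   'x' (0x78): 0x78+0xc6 wraps to 0x3e, 0x3e-0xf6 borrows, carry set
; Whenever carry is set the result is non-zero, so Z is unset on error.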
|
|
|
|
ret c
|
|
|
|
; A is already parsed, so skip the first few instructions of parseDecimal
|
|
|
|
jp nz, parseDecimalSkip ; first digit is 1-9: plain decimal
|
2019-12-30 19:24:53 -05:00
|
|
|
; first digit is '0': maybe hex (0x), maybe binary (0b)
|
|
|
|
inc hl
|
|
|
|
ld a, (hl)
|
|
|
|
inc hl ; point HL past the prefix, ready for hex or bin
|
|
|
|
cp 'x'
|
|
|
|
jr z, parseHexadecimal
|
|
|
|
cp 'b'
|
|
|
|
jr z, parseBinaryLiteral
|
|
|
|
; nope, just a regular decimal
|
|
|
|
dec hl \ dec hl
|
|
|
|
jp parseDecimal
|
2019-11-18 15:17:56 -05:00
|
|
|
|
|
|
|
; Parse string at (HL) and, if it is a char literal, set Z and return the
|
2019-12-29 17:37:04 -05:00
|
|
|
; corresponding value in E. D is always zero.
|
2019-12-30 19:24:53 -05:00
|
|
|
; HL is advanced to the character following the last successfully read char.
|
2019-11-18 15:17:56 -05:00
|
|
|
;
|
|
|
|
; A valid char literal starts with ', ends with ' and has one character in the
|
|
|
|
; middle. No escape sequences are accepted, but ''' will return the apostrophe
|
|
|
|
; character.
|
2019-12-29 19:47:19 -05:00
|
|
|
.char:
|
2019-11-18 15:17:56 -05:00
|
|
|
inc hl
|
2019-12-30 19:24:53 -05:00
|
|
|
ld e, (hl) ; our result
|
2019-11-18 15:17:56 -05:00
|
|
|
inc hl
|
|
|
|
cp (hl)
|
2020-01-08 16:12:40 -05:00
|
|
|
; advance HL and return if good char
|
2019-11-18 15:17:56 -05:00
|
|
|
inc hl
|
2020-01-08 16:12:40 -05:00
|
|
|
ret z
|
|
|
|
|
|
|
|
; Z unset and there's an error
|
|
|
|
; In all error conditions, HL is advanced by 3. Rewind.
|
|
|
|
dec hl \ dec hl \ dec hl
|
2019-12-30 19:24:53 -05:00
|
|
|
; NZ already set
|
2019-11-18 15:17:56 -05:00
|
|
|
ret
|
|
|
|
|
2019-12-30 19:24:53 -05:00
|
|
|
|
|
|
|
; Returns whether A is a literal prefix, that is, a digit or an apostrophe.
|
|
|
|
isLiteralPrefix:
|
|
|
|
cp 0x27 ; apostrophe
|
|
|
|
ret z
|
|
|
|
; continue to isDigit
|
|
|
|
|
|
|
|
; Returns whether A is a digit ('0'-'9'). Z is set if it is, unset otherwise.
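;
; Usage sketch (caller-side code, not part of this file):
;     ld a, '7'
;     call isDigit        ; returns with Z set
;     ld a, 'x'
;     call isDigit        ; returns with Z unset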
|
|
|
|
isDigit:
|
2020-01-08 16:12:40 -05:00
|
|
|
cp '0' ; carry implies not zero for cp
|
|
|
|
ret c
|
|
|
|
cp '9' ; zero unset for a > '9', but set for a='9'
|
|
|
|
ret nc
|
2019-12-30 19:24:53 -05:00
|
|
|
cp a ; ensure Z
|
|
|
|
ret
|