Gentoo's Bugzilla – Attachment 46814 Details for Bug 75585: ARM gcc's soft-float softvfp patch
Attachment 46814: gcc-3.3.4-3.3.5-softvfp.patch

  Description: gcc-3.3.4-3.3.5-softvfp.patch
  Filename:    gcc-3.3.4-3.3.5-softvfp.patch
  MIME Type:   text/plain
  Creator:     Yuri Vasilevski (RETIRED)
  Created:     2004-12-24 15:45:34 UTC
  Size:        71.85 KB
  Flags:       patch, obsolete
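The patch header below verifies each float option by inspecting the private
flags word of the resulting object files. A minimal way to reproduce those
checks yourself, assuming a cross toolchain with an arm-linux- prefix (the
file name and function here are illustrative, not part of the patch):

    /* float-abi-probe.c: a small probe for which FPU convention the
       toolchain emits.  Build it under each option and dump the private
       flags, e.g.:

         arm-linux-gcc -c float-abi-probe.c                # patched default
         arm-linux-gcc -msoft-float -c float-abi-probe.c   # legacy soft FPA
         arm-linux-gcc -mhard-float -c float-abi-probe.c   # hardware FPA
         arm-linux-objdump -p float-abi-probe.o            # "private flags = ..."

       With the patch applied, the default build should report
       private flags = 600: [APCS-32] [VFP float format] [software FP].  */

    double average(double a, double b)
    {
        /* On a soft-float target this becomes calls to libgcc's __adddf3
           and __divdf3, two of the routines this patch provides.  */
        return (a + b) / 2.0;
    }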
>#
># Submitted:
>#
># Robert Schwebel <r.schwebel@pengutronix.de>, 2004-01-28
>#
># Error:
>#
># gcc-3.3.2 doesn't seem to have soft float support, so linking against
># libgcc.a doesn't work when compiled with -msoft-float
>#
># Description:
>#
># Nicholas Pitre released this patch here:
># http://lists.arm.linux.org.uk/pipermail/linux-arm/2003-October/006436.html
>#
># The patch had to be extended by the first two hunks below, otherwise
># the compiler claimed a mixup of old and new style FPU options while
># compiling U-Boot.
>#
># State:
>#
># unknown
>#
>#
># Modifications:
>#
># Dimitry Andric <dimitry@andric.com>, 2004-04-30
>#
># Description:
>#
># The original patch doesn't distinguish between softfpa and softvfp modes
># in the way Nicholas Pitre probably meant. His description is:
>#
># "Default is to use APCS-32 mode with soft-vfp. The old Linux default for
># floats can be achieved with -mhard-float or with the configure
># --with-float=hard option. If -msoft-float or --with-float=soft is used then
># software float support will be used just like the default but with the legacy
># big endian word ordering for double float representation instead."
>#
># Which means the following:
>#
># * If you compile without -mhard-float or -msoft-float, you should get
># software floating point, using the VFP format. The produced object file
># should have these flags in its header:
>#
># private flags = 600: [APCS-32] [VFP float format] [software FP]
>#
># * If you compile with -mhard-float, you should get hardware floating point,
># which always uses the FPA format. Object file header flags should be:
>#
># private flags = 0: [APCS-32] [FPA float format]
>#
># * If you compile with -msoft-float, you should get software floating point,
># using the FPA format. This is done for compatibility reasons with many
># existing distributions. Object file header flags should be:
>#
># private flags = 200: [APCS-32] [FPA float format] [software FP]
>#
># The original patch from Nicholas Pitre contained the following constructs:
>#
># #define SUBTARGET_EXTRA_ASM_SPEC "%{!mcpu=*:-mcpu=xscale} \
># %{mhard-float:-mfpu=fpa} \
># %{!mhard-float: %{msoft-float:-mfpu=softfpa;:-mfpu=softvfp}}"
>#
># However, gcc doesn't accept this ";:" notation, used in the 3rd line. This
># is probably the reason Robert Schwebel modified it to:
>#
># #define SUBTARGET_EXTRA_ASM_SPEC "%{!mcpu=*:-mcpu=xscale} \
># %{mhard-float:-mfpu=fpa} \
># %{!mhard-float: %{msoft-float:-mfpu=softfpa -mfpu=softvfp}}"
>#
># But this causes the following behaviour:
>#
># * If you compile without -mhard-float or -msoft-float, the compiler generates
># software floating point instructions, but *nothing* is passed to the
># assembler, which results in an object file which has flags:
>#
># private flags = 0: [APCS-32] [FPA float format]
>#
># This is not correct!
>#
># * If you compile with -mhard-float, the compiler generates hardware floating
># point instructions, and passes "-mfpu=fpa" to the assembler, which results
># in an object file which has the same flags as in the previous item, but now
># those *are* correct.
>#
># * If you compile with -msoft-float, the compiler generates software floating
># point instructions, and passes "-mfpu=softfpa -mfpu=softvfp" (in that
># order) to the assembler, which results in an object file with flags:
>#
># private flags = 600: [APCS-32] [VFP float format] [software FP]
>#
># This is not correct, because the last "-mfpu=" option on the assembler
># command line determines the actual FPU convention used (which should be FPA
># in this case).
>#
># Therefore, I modified this patch to get the desired behaviour. Every
># instance of the notation:
>#
># %{msoft-float:-mfpu=softfpa -mfpu=softvfp}
>#
># was changed to:
>#
># %{msoft-float:-mfpu=softfpa} %{!msoft-float:-mfpu=softvfp}
>#
># I also did the following:
>#
># * Modified all TARGET_DEFAULT macros I could find to include ARM_FLAG_VFP, to
># be consistent with Nicholas' original patch.
># * Removed any "msoft-float" or "mhard-float" from all MULTILIB_DEFAULTS
># macros I could find. I think that if you compile without any options, you
># would like to get the defaults. :)
># * Removed the extra -lfloat option from LIBGCC_SPEC, since it isn't needed
># anymore. (The required functions are now in libgcc.)
>#
># More Modifications:
>#
># Yuri Vasilevski <yuri@ciencias.unam.mx>, 2004-12-24
>#
># Description:
>#
># Made this patch compatible with StrongARM processors.
># To achieve this, I needed to make the following changes:
>#
># * Removed all occurrences of %{!mcpu=*:-mcpu=xscale} except where it
># is specified in the original gcc sources
># (i.e. in gcc/config/arm/xscale-elf.h).
>#
># * Changed in gcc/config/arm/lib1funcs.asm the use of __ARM_ARCH_4T__
># to match the meaning gcc gives it (i.e. "can do Thumb, but won't unless
># explicitly requested").
>#
># So the line:
># "#elif (__ARM_ARCH__ > 4) || defined(__ARM_ARCH_4T__)"
># became:
># "#elif (__ARM_ARCH__ > 4) || defined(__thumb__) || defined(__THUMB_INTERWORK__)"
>#
># * Also changed the default soft-float from softfpa to softvfp: if the
># compiler defaults to soft-float, then explicitly specifying -msoft-float
># should give the same result as not specifying it at all.
>#
># Plus this was needed to successfully compile soft-float uclibc. :-D
>#
>
>diff -urNd gcc-3.3.3-orig/gcc/config/arm/coff.h gcc-3.3.3/gcc/config/arm/coff.h
>--- gcc-3.3.3-orig/gcc/config/arm/coff.h	2002-08-29 23:40:09.000000000 +0200
>+++ gcc-3.3.3/gcc/config/arm/coff.h	2004-04-30 23:51:01.350158400 +0200
>@@ -32,11 +32,15 @@
> #define TARGET_VERSION fputs (" (ARM/coff)", stderr)
> 
> #undef TARGET_DEFAULT
>-#define TARGET_DEFAULT (ARM_FLAG_SOFT_FLOAT | ARM_FLAG_APCS_32 | ARM_FLAG_APCS_FRAME)
>+#define TARGET_DEFAULT \
>+	( ARM_FLAG_SOFT_FLOAT \
>+	| ARM_FLAG_VFP \
>+	| ARM_FLAG_APCS_32 \
>+	| ARM_FLAG_APCS_FRAME )
> 
> #ifndef MULTILIB_DEFAULTS
> #define MULTILIB_DEFAULTS \
>-  { "marm", "mlittle-endian", "msoft-float", "mapcs-32", "mno-thumb-interwork" }
>+  { "marm", "mlittle-endian", "mapcs-32", "mno-thumb-interwork" }
> #endif
> 
> /* This is COFF, but prefer stabs. */
>diff -urNd gcc-3.3.3-orig/gcc/config/arm/conix-elf.h gcc-3.3.3/gcc/config/arm/conix-elf.h
>--- gcc-3.3.3-orig/gcc/config/arm/conix-elf.h	2002-05-14 19:35:48.000000000 +0200
>+++ gcc-3.3.3/gcc/config/arm/conix-elf.h	2004-04-30 23:51:01.350158400 +0200
>@@ -29,7 +29,10 @@
> 
> /* Default to using APCS-32 and software floating point.
*/ > #undef TARGET_DEFAULT >-#define TARGET_DEFAULT (ARM_FLAG_SOFT_FLOAT | ARM_FLAG_APCS_32) >+#define TARGET_DEFAULT \ >+ ( ARM_FLAG_SOFT_FLOAT \ >+ | ARM_FLAG_VFP \ >+ | ARM_FLAG_APCS_32 ) > > #ifndef CPP_APCS_PC_DEFAULT_SPEC > #define CPP_APCS_PC_DEFAULT_SPEC "-D__APCS_32__" >diff -urNd gcc-3.3.3-orig/gcc/config/arm/elf.h gcc-3.3.3/gcc/config/arm/elf.h >--- gcc-3.3.3-orig/gcc/config/arm/elf.h 2002-11-21 22:29:24.000000000 +0100 >+++ gcc-3.3.3/gcc/config/arm/elf.h 2004-04-30 23:51:01.350158400 +0200 >@@ -46,7 +46,9 @@ > > #ifndef SUBTARGET_ASM_FLOAT_SPEC > #define SUBTARGET_ASM_FLOAT_SPEC "\ >-%{mapcs-float:-mfloat} %{msoft-float:-mno-fpu}" >+%{mapcs-float:-mfloat} \ >+%{mhard-float:-mfpu=fpa} \ >+%{!mhard-float: %{msoft-float:-mfpu=softvfp} %{!msoft-float:-mfpu=softvfp}}" > #endif > > #ifndef ASM_SPEC >@@ -106,12 +108,16 @@ > #endif > > #ifndef TARGET_DEFAULT >-#define TARGET_DEFAULT (ARM_FLAG_SOFT_FLOAT | ARM_FLAG_APCS_32 | ARM_FLAG_APCS_FRAME) >+#define TARGET_DEFAULT \ >+ ( ARM_FLAG_SOFT_FLOAT \ >+ | ARM_FLAG_VFP \ >+ | ARM_FLAG_APCS_32 \ >+ | ARM_FLAG_APCS_FRAME ) > #endif > > #ifndef MULTILIB_DEFAULTS > #define MULTILIB_DEFAULTS \ >- { "marm", "mlittle-endian", "msoft-float", "mapcs-32", "mno-thumb-interwork", "fno-leading-underscore" } >+ { "marm", "mlittle-endian", "mapcs-32", "mno-thumb-interwork", "fno-leading-underscore" } > #endif > > >diff -urNd gcc-3.3.3-orig/gcc/config/arm/ieee754-df.S gcc-3.3.3/gcc/config/arm/ieee754-df.S >--- gcc-3.3.3-orig/gcc/config/arm/ieee754-df.S 1970-01-01 01:00:00.000000000 +0100 >+++ gcc-3.3.3/gcc/config/arm/ieee754-df.S 2004-04-30 23:41:18.522092800 +0200 >@@ -0,0 +1,1224 @@ >+/* ieee754-df.S double-precision floating point support for ARM >+ >+ Copyright (C) 2003 Free Software Foundation, Inc. >+ Contributed by Nicolas Pitre (nico@cam.org) >+ >+ This file is free software; you can redistribute it and/or modify it >+ under the terms of the GNU General Public License as published by the >+ Free Software Foundation; either version 2, or (at your option) any >+ later version. >+ >+ In addition to the permissions in the GNU General Public License, the >+ Free Software Foundation gives you unlimited permission to link the >+ compiled version of this file into combinations with other programs, >+ and to distribute those combinations without any restriction coming >+ from the use of this file. (The General Public License restrictions >+ do apply in other respects; for example, they cover modification of >+ the file, and distribution when not linked into a combine >+ executable.) >+ >+ This file is distributed in the hope that it will be useful, but >+ WITHOUT ANY WARRANTY; without even the implied warranty of >+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >+ General Public License for more details. >+ >+ You should have received a copy of the GNU General Public License >+ along with this program; see the file COPYING. If not, write to >+ the Free Software Foundation, 59 Temple Place - Suite 330, >+ Boston, MA 02111-1307, USA. */ >+ >+/* >+ * Notes: >+ * >+ * The goal of this code is to be as fast as possible. This is >+ * not meant to be easy to understand for the casual reader. >+ * For slightly simpler code please see the single precision version >+ * of this file. >+ * >+ * Only the default rounding mode is intended for best performances. >+ * Exceptions aren't supported yet, but that can be added quite easily >+ * if necessary without impacting performances. >+ */ >+ >+ >+@ For FPA, float words are always big-endian. 
>+@ For VFP, floats words follow the memory system mode. >+#if defined(__VFP_FP__) && !defined(__ARMEB__) >+#define xl r0 >+#define xh r1 >+#define yl r2 >+#define yh r3 >+#else >+#define xh r0 >+#define xl r1 >+#define yh r2 >+#define yl r3 >+#endif >+ >+ >+#ifdef L_negdf2 >+ >+ARM_FUNC_START negdf2 >+ @ flip sign bit >+ eor xh, xh, #0x80000000 >+ RET >+ >+ FUNC_END negdf2 >+ >+#endif >+ >+#ifdef L_addsubdf3 >+ >+ARM_FUNC_START subdf3 >+ @ flip sign bit of second arg >+ eor yh, yh, #0x80000000 >+#if defined(__thumb__) && !defined(__THUMB_INTERWORK__) >+ b 1f @ Skip Thumb-code prologue >+#endif >+ >+ARM_FUNC_START adddf3 >+ >+1: @ Compare both args, return zero if equal but the sign. >+ teq xl, yl >+ eoreq ip, xh, yh >+ teqeq ip, #0x80000000 >+ beq LSYM(Lad_z) >+ >+ @ If first arg is 0 or -0, return second arg. >+ @ If second arg is 0 or -0, return first arg. >+ orrs ip, xl, xh, lsl #1 >+ moveq xl, yl >+ moveq xh, yh >+ orrnes ip, yl, yh, lsl #1 >+ RETc(eq) >+ >+ stmfd sp!, {r4, r5, lr} >+ >+ @ Mask out exponents. >+ mov ip, #0x7f000000 >+ orr ip, ip, #0x00f00000 >+ and r4, xh, ip >+ and r5, yh, ip >+ >+ @ If either of them is 0x7ff, result will be INF or NAN >+ teq r4, ip >+ teqne r5, ip >+ beq LSYM(Lad_i) >+ >+ @ Compute exponent difference. Make largest exponent in r4, >+ @ corresponding arg in xh-xl, and positive exponent difference in r5. >+ subs r5, r5, r4 >+ rsblt r5, r5, #0 >+ ble 1f >+ add r4, r4, r5 >+ eor yl, xl, yl >+ eor yh, xh, yh >+ eor xl, yl, xl >+ eor xh, yh, xh >+ eor yl, xl, yl >+ eor yh, xh, yh >+1: >+ >+ @ If exponent difference is too large, return largest argument >+ @ already in xh-xl. We need up to 54 bit to handle proper rounding >+ @ of 0x1p54 - 1.1. >+ cmp r5, #(54 << 20) >+ RETLDM "r4, r5" hi >+ >+ @ Convert mantissa to signed integer. >+ tst xh, #0x80000000 >+ bic xh, xh, ip, lsl #1 >+ orr xh, xh, #0x00100000 >+ beq 1f >+ rsbs xl, xl, #0 >+ rsc xh, xh, #0 >+1: >+ tst yh, #0x80000000 >+ bic yh, yh, ip, lsl #1 >+ orr yh, yh, #0x00100000 >+ beq 1f >+ rsbs yl, yl, #0 >+ rsc yh, yh, #0 >+1: >+ @ If exponent == difference, one or both args were denormalized. >+ @ Since this is not common case, rescale them off line. >+ teq r4, r5 >+ beq LSYM(Lad_d) >+LSYM(Lad_x): >+ @ Scale down second arg with exponent difference. >+ @ Apply shift one bit left to first arg and the rest to second arg >+ @ to simplify things later, but only if exponent does not become 0. >+ mov ip, #0 >+ movs r5, r5, lsr #20 >+ beq 3f >+ teq r4, #(1 << 20) >+ beq 1f >+ movs xl, xl, lsl #1 >+ adc xh, ip, xh, lsl #1 >+ sub r4, r4, #(1 << 20) >+ subs r5, r5, #1 >+ beq 3f >+ >+ @ Shift yh-yl right per r5, keep leftover bits into ip. >+1: rsbs lr, r5, #32 >+ blt 2f >+ mov ip, yl, lsl lr >+ mov yl, yl, lsr r5 >+ orr yl, yl, yh, lsl lr >+ mov yh, yh, asr r5 >+ b 3f >+2: sub r5, r5, #32 >+ add lr, lr, #32 >+ cmp yl, #1 >+ adc ip, ip, yh, lsl lr >+ mov yl, yh, asr r5 >+ mov yh, yh, asr #32 >+3: >+ @ the actual addition >+ adds xl, xl, yl >+ adc xh, xh, yh >+ >+ @ We now have a result in xh-xl-ip. >+ @ Keep absolute value in xh-xl-ip, sign in r5. >+ ands r5, xh, #0x80000000 >+ bpl LSYM(Lad_p) >+ rsbs ip, ip, #0 >+ rscs xl, xl, #0 >+ rsc xh, xh, #0 >+ >+ @ Determine how to normalize the result. >+LSYM(Lad_p): >+ cmp xh, #0x00100000 >+ bcc LSYM(Lad_l) >+ cmp xh, #0x00200000 >+ bcc LSYM(Lad_r0) >+ cmp xh, #0x00400000 >+ bcc LSYM(Lad_r1) >+ >+ @ Result needs to be shifted right. 
>+ movs xh, xh, lsr #1 >+ movs xl, xl, rrx >+ movs ip, ip, rrx >+ orrcs ip, ip, #1 >+ add r4, r4, #(1 << 20) >+LSYM(Lad_r1): >+ movs xh, xh, lsr #1 >+ movs xl, xl, rrx >+ movs ip, ip, rrx >+ orrcs ip, ip, #1 >+ add r4, r4, #(1 << 20) >+ >+ @ Our result is now properly aligned into xh-xl, remaining bits in ip. >+ @ Round with MSB of ip. If halfway between two numbers, round towards >+ @ LSB of xl = 0. >+LSYM(Lad_r0): >+ adds xl, xl, ip, lsr #31 >+ adc xh, xh, #0 >+ teq ip, #0x80000000 >+ biceq xl, xl, #1 >+ >+ @ One extreme rounding case may add a new MSB. Adjust exponent. >+ @ That MSB will be cleared when exponent is merged below. >+ tst xh, #0x00200000 >+ addne r4, r4, #(1 << 20) >+ >+ @ Make sure we did not bust our exponent. >+ adds ip, r4, #(1 << 20) >+ bmi LSYM(Lad_o) >+ >+ @ Pack final result together. >+LSYM(Lad_e): >+ bic xh, xh, #0x00300000 >+ orr xh, xh, r4 >+ orr xh, xh, r5 >+ RETLDM "r4, r5" >+ >+LSYM(Lad_l): >+ @ Result must be shifted left and exponent adjusted. >+ @ No rounding necessary since ip will always be 0. >+#if __ARM_ARCH__ < 5 >+ >+ teq xh, #0 >+ movne r3, #-11 >+ moveq r3, #21 >+ moveq xh, xl >+ moveq xl, #0 >+ mov r2, xh >+ movs ip, xh, lsr #16 >+ moveq r2, r2, lsl #16 >+ addeq r3, r3, #16 >+ tst r2, #0xff000000 >+ moveq r2, r2, lsl #8 >+ addeq r3, r3, #8 >+ tst r2, #0xf0000000 >+ moveq r2, r2, lsl #4 >+ addeq r3, r3, #4 >+ tst r2, #0xc0000000 >+ moveq r2, r2, lsl #2 >+ addeq r3, r3, #2 >+ tst r2, #0x80000000 >+ addeq r3, r3, #1 >+ >+#else >+ >+ teq xh, #0 >+ moveq xh, xl >+ moveq xl, #0 >+ clz r3, xh >+ addeq r3, r3, #32 >+ sub r3, r3, #11 >+ >+#endif >+ >+ @ determine how to shift the value. >+ subs r2, r3, #32 >+ bge 2f >+ adds r2, r2, #12 >+ ble 1f >+ >+ @ shift value left 21 to 31 bits, or actually right 11 to 1 bits >+ @ since a register switch happened above. >+ add ip, r2, #20 >+ rsb r2, r2, #12 >+ mov xl, xh, lsl ip >+ mov xh, xh, lsr r2 >+ b 3f >+ >+ @ actually shift value left 1 to 20 bits, which might also represent >+ @ 32 to 52 bits if counting the register switch that happened earlier. >+1: add r2, r2, #20 >+2: rsble ip, r2, #32 >+ mov xh, xh, lsl r2 >+ orrle xh, xh, xl, lsr ip >+ movle xl, xl, lsl r2 >+ >+ @ adjust exponent accordingly. >+3: subs r4, r4, r3, lsl #20 >+ bgt LSYM(Lad_e) >+ >+ @ Exponent too small, denormalize result. >+ @ Find out proper shift value. >+ mvn r4, r4, asr #20 >+ subs r4, r4, #30 >+ bge 2f >+ adds r4, r4, #12 >+ bgt 1f >+ >+ @ shift result right of 1 to 20 bits, sign is in r5. >+ add r4, r4, #20 >+ rsb r2, r4, #32 >+ mov xl, xl, lsr r4 >+ orr xl, xl, xh, lsl r2 >+ orr xh, r5, xh, lsr r4 >+ RETLDM "r4, r5" >+ >+ @ shift result right of 21 to 31 bits, or left 11 to 1 bits after >+ @ a register switch from xh to xl. >+1: rsb r4, r4, #12 >+ rsb r2, r4, #32 >+ mov xl, xl, lsr r2 >+ orr xl, xl, xh, lsl r4 >+ mov xh, r5 >+ RETLDM "r4, r5" >+ >+ @ Shift value right of 32 to 64 bits, or 0 to 32 bits after a switch >+ @ from xh to xl. >+2: mov xl, xh, lsr r4 >+ mov xh, r5 >+ RETLDM "r4, r5" >+ >+ @ Adjust exponents for denormalized arguments. >+LSYM(Lad_d): >+ teq r4, #0 >+ eoreq xh, xh, #0x00100000 >+ addeq r4, r4, #(1 << 20) >+ eor yh, yh, #0x00100000 >+ subne r5, r5, #(1 << 20) >+ b LSYM(Lad_x) >+ >+ @ Result is x - x = 0, unless x = INF or NAN. >+LSYM(Lad_z): >+ sub ip, ip, #0x00100000 @ ip becomes 0x7ff00000 >+ and r2, xh, ip >+ teq r2, ip >+ orreq xh, ip, #0x00080000 >+ movne xh, #0 >+ mov xl, #0 >+ RET >+ >+ @ Overflow: return INF. 
>+LSYM(Lad_o): >+ orr xh, r5, #0x7f000000 >+ orr xh, xh, #0x00f00000 >+ mov xl, #0 >+ RETLDM "r4, r5" >+ >+ @ At least one of x or y is INF/NAN. >+ @ if xh-xl != INF/NAN: return yh-yl (which is INF/NAN) >+ @ if yh-yl != INF/NAN: return xh-xl (which is INF/NAN) >+ @ if either is NAN: return NAN >+ @ if opposite sign: return NAN >+ @ return xh-xl (which is INF or -INF) >+LSYM(Lad_i): >+ teq r4, ip >+ movne xh, yh >+ movne xl, yl >+ teqeq r5, ip >+ RETLDM "r4, r5" ne >+ >+ orrs r4, xl, xh, lsl #12 >+ orreqs r4, yl, yh, lsl #12 >+ teqeq xh, yh >+ orrne xh, r5, #0x00080000 >+ movne xl, #0 >+ RETLDM "r4, r5" >+ >+ FUNC_END subdf3 >+ FUNC_END adddf3 >+ >+ARM_FUNC_START floatunsidf >+ teq r0, #0 >+ moveq r1, #0 >+ RETc(eq) >+ stmfd sp!, {r4, r5, lr} >+ mov r4, #(0x400 << 20) @ initial exponent >+ add r4, r4, #((52-1) << 20) >+ mov r5, #0 @ sign bit is 0 >+ mov xl, r0 >+ mov xh, #0 >+ b LSYM(Lad_l) >+ >+ FUNC_END floatunsidf >+ >+ARM_FUNC_START floatsidf >+ teq r0, #0 >+ moveq r1, #0 >+ RETc(eq) >+ stmfd sp!, {r4, r5, lr} >+ mov r4, #(0x400 << 20) @ initial exponent >+ add r4, r4, #((52-1) << 20) >+ ands r5, r0, #0x80000000 @ sign bit in r5 >+ rsbmi r0, r0, #0 @ absolute value >+ mov xl, r0 >+ mov xh, #0 >+ b LSYM(Lad_l) >+ >+ FUNC_END floatsidf >+ >+ARM_FUNC_START extendsfdf2 >+ movs r2, r0, lsl #1 >+ beq 1f @ value is 0.0 or -0.0 >+ mov xh, r2, asr #3 @ stretch exponent >+ mov xh, xh, rrx @ retrieve sign bit >+ mov xl, r2, lsl #28 @ retrieve remaining bits >+ ands r2, r2, #0xff000000 @ isolate exponent >+ beq 2f @ exponent was 0 but not mantissa >+ teq r2, #0xff000000 @ check if INF or NAN >+ eorne xh, xh, #0x38000000 @ fixup exponent otherwise. >+ RET >+ >+1: mov xh, r0 >+ mov xl, #0 >+ RET >+ >+2: @ value was denormalized. We can normalize it now. >+ stmfd sp!, {r4, r5, lr} >+ mov r4, #(0x380 << 20) @ setup corresponding exponent >+ add r4, r4, #(1 << 20) >+ and r5, xh, #0x80000000 @ move sign bit in r5 >+ bic xh, xh, #0x80000000 >+ b LSYM(Lad_l) >+ >+ FUNC_END extendsfdf2 >+ >+#endif /* L_addsubdf3 */ >+ >+#ifdef L_muldivdf3 >+ >+ARM_FUNC_START muldf3 >+ >+ stmfd sp!, {r4, r5, r6, lr} >+ >+ @ Mask out exponents. >+ mov ip, #0x7f000000 >+ orr ip, ip, #0x00f00000 >+ and r4, xh, ip >+ and r5, yh, ip >+ >+ @ Trap any INF/NAN. >+ teq r4, ip >+ teqne r5, ip >+ beq LSYM(Lml_s) >+ >+ @ Trap any multiplication by 0. >+ orrs r6, xl, xh, lsl #1 >+ orrnes r6, yl, yh, lsl #1 >+ beq LSYM(Lml_z) >+ >+ @ Shift exponents right one bit to make room for overflow bit. >+ @ If either of them is 0, scale denormalized arguments off line. >+ @ Then add both exponents together. >+ movs r4, r4, lsr #1 >+ teqne r5, #0 >+ beq LSYM(Lml_d) >+LSYM(Lml_x): >+ add r4, r4, r5, asr #1 >+ >+ @ Preserve final sign in r4 along with exponent for now. >+ teq xh, yh >+ orrmi r4, r4, #0x8000 >+ >+ @ Convert mantissa to unsigned integer. >+ bic xh, xh, ip, lsl #1 >+ bic yh, yh, ip, lsl #1 >+ orr xh, xh, #0x00100000 >+ orr yh, yh, #0x00100000 >+ >+#if __ARM_ARCH__ < 4 >+ >+ @ Well, no way to make it shorter without the umull instruction. >+ @ We must perform that 53 x 53 bit multiplication by hand. 
>+ stmfd sp!, {r7, r8, r9, sl, fp} >+ mov r7, xl, lsr #16 >+ mov r8, yl, lsr #16 >+ mov r9, xh, lsr #16 >+ mov sl, yh, lsr #16 >+ bic xl, xl, r7, lsl #16 >+ bic yl, yl, r8, lsl #16 >+ bic xh, xh, r9, lsl #16 >+ bic yh, yh, sl, lsl #16 >+ mul ip, xl, yl >+ mul fp, xl, r8 >+ mov lr, #0 >+ adds ip, ip, fp, lsl #16 >+ adc lr, lr, fp, lsr #16 >+ mul fp, r7, yl >+ adds ip, ip, fp, lsl #16 >+ adc lr, lr, fp, lsr #16 >+ mul fp, xl, sl >+ mov r5, #0 >+ adds lr, lr, fp, lsl #16 >+ adc r5, r5, fp, lsr #16 >+ mul fp, r7, yh >+ adds lr, lr, fp, lsl #16 >+ adc r5, r5, fp, lsr #16 >+ mul fp, xh, r8 >+ adds lr, lr, fp, lsl #16 >+ adc r5, r5, fp, lsr #16 >+ mul fp, r9, yl >+ adds lr, lr, fp, lsl #16 >+ adc r5, r5, fp, lsr #16 >+ mul fp, xh, sl >+ mul r6, r9, sl >+ adds r5, r5, fp, lsl #16 >+ adc r6, r6, fp, lsr #16 >+ mul fp, r9, yh >+ adds r5, r5, fp, lsl #16 >+ adc r6, r6, fp, lsr #16 >+ mul fp, xl, yh >+ adds lr, lr, fp >+ mul fp, r7, sl >+ adcs r5, r5, fp >+ mul fp, xh, yl >+ adc r6, r6, #0 >+ adds lr, lr, fp >+ mul fp, r9, r8 >+ adcs r5, r5, fp >+ mul fp, r7, r8 >+ adc r6, r6, #0 >+ adds lr, lr, fp >+ mul fp, xh, yh >+ adcs r5, r5, fp >+ adc r6, r6, #0 >+ ldmfd sp!, {r7, r8, r9, sl, fp} >+ >+#else >+ >+ @ Here is the actual multiplication: 53 bits * 53 bits -> 106 bits. >+ umull ip, lr, xl, yl >+ mov r5, #0 >+ umlal lr, r5, xl, yh >+ umlal lr, r5, xh, yl >+ mov r6, #0 >+ umlal r5, r6, xh, yh >+ >+#endif >+ >+ @ The LSBs in ip are only significant for the final rounding. >+ @ Fold them into one bit of lr. >+ teq ip, #0 >+ orrne lr, lr, #1 >+ >+ @ Put final sign in xh. >+ mov xh, r4, lsl #16 >+ bic r4, r4, #0x8000 >+ >+ @ Adjust result if one extra MSB appeared (one of four times). >+ tst r6, #(1 << 9) >+ beq 1f >+ add r4, r4, #(1 << 19) >+ movs r6, r6, lsr #1 >+ movs r5, r5, rrx >+ movs lr, lr, rrx >+ orrcs lr, lr, #1 >+1: >+ @ Scale back to 53 bits. >+ @ xh contains sign bit already. >+ orr xh, xh, r6, lsl #12 >+ orr xh, xh, r5, lsr #20 >+ mov xl, r5, lsl #12 >+ orr xl, xl, lr, lsr #20 >+ >+ @ Apply exponent bias, check range for underflow. >+ sub r4, r4, #0x00f80000 >+ subs r4, r4, #0x1f000000 >+ ble LSYM(Lml_u) >+ >+ @ Round the result. >+ movs lr, lr, lsl #12 >+ bpl 1f >+ adds xl, xl, #1 >+ adc xh, xh, #0 >+ teq lr, #0x80000000 >+ biceq xl, xl, #1 >+ >+ @ Rounding may have produced an extra MSB here. >+ @ The extra bit is cleared before merging the exponent below. >+ tst xh, #0x00200000 >+ addne r4, r4, #(1 << 19) >+1: >+ @ Check exponent for overflow. >+ adds ip, r4, #(1 << 19) >+ tst ip, #(1 << 30) >+ bne LSYM(Lml_o) >+ >+ @ Add final exponent. >+ bic xh, xh, #0x00300000 >+ orr xh, xh, r4, lsl #1 >+ RETLDM "r4, r5, r6" >+ >+ @ Result is 0, but determine sign anyway. >+LSYM(Lml_z): >+ eor xh, xh, yh >+LSYM(Ldv_z): >+ bic xh, xh, #0x7fffffff >+ mov xl, #0 >+ RETLDM "r4, r5, r6" >+ >+ @ Check if denormalized result is possible, otherwise return signed 0. >+LSYM(Lml_u): >+ cmn r4, #(53 << 19) >+ movle xl, #0 >+ bicle xh, xh, #0x7fffffff >+ RETLDM "r4, r5, r6" le >+ >+ @ Find out proper shift value. >+LSYM(Lml_r): >+ mvn r4, r4, asr #19 >+ subs r4, r4, #30 >+ bge 2f >+ adds r4, r4, #12 >+ bgt 1f >+ >+ @ shift result right of 1 to 20 bits, preserve sign bit, round, etc. 
>+ add r4, r4, #20 >+ rsb r5, r4, #32 >+ mov r3, xl, lsl r5 >+ mov xl, xl, lsr r4 >+ orr xl, xl, xh, lsl r5 >+ movs xh, xh, lsl #1 >+ mov xh, xh, lsr r4 >+ mov xh, xh, rrx >+ adds xl, xl, r3, lsr #31 >+ adc xh, xh, #0 >+ teq lr, #0 >+ teqeq r3, #0x80000000 >+ biceq xl, xl, #1 >+ RETLDM "r4, r5, r6" >+ >+ @ shift result right of 21 to 31 bits, or left 11 to 1 bits after >+ @ a register switch from xh to xl. Then round. >+1: rsb r4, r4, #12 >+ rsb r5, r4, #32 >+ mov r3, xl, lsl r4 >+ mov xl, xl, lsr r5 >+ orr xl, xl, xh, lsl r4 >+ bic xh, xh, #0x7fffffff >+ adds xl, xl, r3, lsr #31 >+ adc xh, xh, #0 >+ teq lr, #0 >+ teqeq r3, #0x80000000 >+ biceq xl, xl, #1 >+ RETLDM "r4, r5, r6" >+ >+ @ Shift value right of 32 to 64 bits, or 0 to 32 bits after a switch >+ @ from xh to xl. Leftover bits are in r3-r6-lr for rounding. >+2: rsb r5, r4, #32 >+ mov r6, xl, lsl r5 >+ mov r3, xl, lsr r4 >+ orr r3, r3, xh, lsl r5 >+ mov xl, xh, lsr r4 >+ bic xh, xh, #0x7fffffff >+ adds xl, xl, r3, lsr #31 >+ adc xh, xh, #0 >+ orrs r6, r6, lr >+ teqeq r3, #0x80000000 >+ biceq xl, xl, #1 >+ RETLDM "r4, r5, r6" >+ >+ @ One or both arguments are denormalized. >+ @ Scale them leftwards and preserve sign bit. >+LSYM(Lml_d): >+ mov lr, #0 >+ teq r4, #0 >+ bne 2f >+ and r6, xh, #0x80000000 >+1: movs xl, xl, lsl #1 >+ adc xh, lr, xh, lsl #1 >+ tst xh, #0x00100000 >+ subeq r4, r4, #(1 << 19) >+ beq 1b >+ orr xh, xh, r6 >+ teq r5, #0 >+ bne LSYM(Lml_x) >+2: and r6, yh, #0x80000000 >+3: movs yl, yl, lsl #1 >+ adc yh, lr, yh, lsl #1 >+ tst yh, #0x00100000 >+ subeq r5, r5, #(1 << 20) >+ beq 3b >+ orr yh, yh, r6 >+ b LSYM(Lml_x) >+ >+ @ One or both args are INF or NAN. >+LSYM(Lml_s): >+ orrs r6, xl, xh, lsl #1 >+ orrnes r6, yl, yh, lsl #1 >+ beq LSYM(Lml_n) @ 0 * INF or INF * 0 -> NAN >+ teq r4, ip >+ bne 1f >+ orrs r6, xl, xh, lsl #12 >+ bne LSYM(Lml_n) @ NAN * <anything> -> NAN >+1: teq r5, ip >+ bne LSYM(Lml_i) >+ orrs r6, yl, yh, lsl #12 >+ bne LSYM(Lml_n) @ <anything> * NAN -> NAN >+ >+ @ Result is INF, but we need to determine its sign. >+LSYM(Lml_i): >+ eor xh, xh, yh >+ >+ @ Overflow: return INF (sign already in xh). >+LSYM(Lml_o): >+ and xh, xh, #0x80000000 >+ orr xh, xh, #0x7f000000 >+ orr xh, xh, #0x00f00000 >+ mov xl, #0 >+ RETLDM "r4, r5, r6" >+ >+ @ Return NAN. >+LSYM(Lml_n): >+ mov xh, #0x7f000000 >+ orr xh, xh, #0x00f80000 >+ RETLDM "r4, r5, r6" >+ >+ FUNC_END muldf3 >+ >+ARM_FUNC_START divdf3 >+ >+ stmfd sp!, {r4, r5, r6, lr} >+ >+ @ Mask out exponents. >+ mov ip, #0x7f000000 >+ orr ip, ip, #0x00f00000 >+ and r4, xh, ip >+ and r5, yh, ip >+ >+ @ Trap any INF/NAN or zeroes. >+ teq r4, ip >+ teqne r5, ip >+ orrnes r6, xl, xh, lsl #1 >+ orrnes r6, yl, yh, lsl #1 >+ beq LSYM(Ldv_s) >+ >+ @ Shift exponents right one bit to make room for overflow bit. >+ @ If either of them is 0, scale denormalized arguments off line. >+ @ Then substract divisor exponent from dividend''s. >+ movs r4, r4, lsr #1 >+ teqne r5, #0 >+ beq LSYM(Ldv_d) >+LSYM(Ldv_x): >+ sub r4, r4, r5, asr #1 >+ >+ @ Preserve final sign into lr. >+ eor lr, xh, yh >+ >+ @ Convert mantissa to unsigned integer. >+ @ Dividend -> r5-r6, divisor -> yh-yl. >+ mov r5, #0x10000000 >+ mov yh, yh, lsl #12 >+ orr yh, r5, yh, lsr #4 >+ orr yh, yh, yl, lsr #24 >+ movs yl, yl, lsl #8 >+ mov xh, xh, lsl #12 >+ teqeq yh, r5 >+ beq LSYM(Ldv_1) >+ orr r5, r5, xh, lsr #4 >+ orr r5, r5, xl, lsr #24 >+ mov r6, xl, lsl #8 >+ >+ @ Initialize xh with final sign bit. >+ and xh, lr, #0x80000000 >+ >+ @ Ensure result will land to known bit position. 
>+ cmp r5, yh >+ cmpeq r6, yl >+ bcs 1f >+ sub r4, r4, #(1 << 19) >+ movs yh, yh, lsr #1 >+ mov yl, yl, rrx >+1: >+ @ Apply exponent bias, check range for over/underflow. >+ add r4, r4, #0x1f000000 >+ add r4, r4, #0x00f80000 >+ cmn r4, #(53 << 19) >+ ble LSYM(Ldv_z) >+ cmp r4, ip, lsr #1 >+ bge LSYM(Lml_o) >+ >+ @ Perform first substraction to align result to a nibble. >+ subs r6, r6, yl >+ sbc r5, r5, yh >+ movs yh, yh, lsr #1 >+ mov yl, yl, rrx >+ mov xl, #0x00100000 >+ mov ip, #0x00080000 >+ >+ @ The actual division loop. >+1: subs lr, r6, yl >+ sbcs lr, r5, yh >+ subcs r6, r6, yl >+ movcs r5, lr >+ orrcs xl, xl, ip >+ movs yh, yh, lsr #1 >+ mov yl, yl, rrx >+ subs lr, r6, yl >+ sbcs lr, r5, yh >+ subcs r6, r6, yl >+ movcs r5, lr >+ orrcs xl, xl, ip, lsr #1 >+ movs yh, yh, lsr #1 >+ mov yl, yl, rrx >+ subs lr, r6, yl >+ sbcs lr, r5, yh >+ subcs r6, r6, yl >+ movcs r5, lr >+ orrcs xl, xl, ip, lsr #2 >+ movs yh, yh, lsr #1 >+ mov yl, yl, rrx >+ subs lr, r6, yl >+ sbcs lr, r5, yh >+ subcs r6, r6, yl >+ movcs r5, lr >+ orrcs xl, xl, ip, lsr #3 >+ >+ orrs lr, r5, r6 >+ beq 2f >+ mov r5, r5, lsl #4 >+ orr r5, r5, r6, lsr #28 >+ mov r6, r6, lsl #4 >+ mov yh, yh, lsl #3 >+ orr yh, yh, yl, lsr #29 >+ mov yl, yl, lsl #3 >+ movs ip, ip, lsr #4 >+ bne 1b >+ >+ @ We are done with a word of the result. >+ @ Loop again for the low word if this pass was for the high word. >+ tst xh, #0x00100000 >+ bne 3f >+ orr xh, xh, xl >+ mov xl, #0 >+ mov ip, #0x80000000 >+ b 1b >+2: >+ @ Be sure result starts in the high word. >+ tst xh, #0x00100000 >+ orreq xh, xh, xl >+ moveq xl, #0 >+3: >+ @ Check if denormalized result is needed. >+ cmp r4, #0 >+ ble LSYM(Ldv_u) >+ >+ @ Apply proper rounding. >+ subs ip, r5, yh >+ subeqs ip, r6, yl >+ adcs xl, xl, #0 >+ adc xh, xh, #0 >+ teq ip, #0 >+ biceq xl, xl, #1 >+ >+ @ Add exponent to result. >+ bic xh, xh, #0x00100000 >+ orr xh, xh, r4, lsl #1 >+ RETLDM "r4, r5, r6" >+ >+ @ Division by 0x1p*: shortcut a lot of code. >+LSYM(Ldv_1): >+ and lr, lr, #0x80000000 >+ orr xh, lr, xh, lsr #12 >+ add r4, r4, #0x1f000000 >+ add r4, r4, #0x00f80000 >+ cmp r4, ip, lsr #1 >+ bge LSYM(Lml_o) >+ cmp r4, #0 >+ orrgt xh, xh, r4, lsl #1 >+ RETLDM "r4, r5, r6" gt >+ >+ cmn r4, #(53 << 19) >+ ble LSYM(Ldv_z) >+ orr xh, xh, #0x00100000 >+ mov lr, #0 >+ b LSYM(Lml_r) >+ >+ @ Result must be denormalized: put remainder in lr for >+ @ rounding considerations. >+LSYM(Ldv_u): >+ orr lr, r5, r6 >+ b LSYM(Lml_r) >+ >+ @ One or both arguments are denormalized. >+ @ Scale them leftwards and preserve sign bit. >+LSYM(Ldv_d): >+ mov lr, #0 >+ teq r4, #0 >+ bne 2f >+ and r6, xh, #0x80000000 >+1: movs xl, xl, lsl #1 >+ adc xh, lr, xh, lsl #1 >+ tst xh, #0x00100000 >+ subeq r4, r4, #(1 << 19) >+ beq 1b >+ orr xh, xh, r6 >+ teq r5, #0 >+ bne LSYM(Ldv_x) >+2: and r6, yh, #0x80000000 >+3: movs yl, yl, lsl #1 >+ adc yh, lr, yh, lsl #1 >+ tst yh, #0x00100000 >+ subeq r5, r5, #(1 << 20) >+ beq 3b >+ orr yh, yh, r6 >+ b LSYM(Ldv_x) >+ >+ @ One or both arguments is either INF, NAN or zero. >+LSYM(Ldv_s): >+ teq r4, ip >+ teqeq r5, ip >+ beq LSYM(Lml_n) @ INF/NAN / INF/NAN -> NAN >+ teq r4, ip >+ bne 1f >+ orrs r4, xl, xh, lsl #12 >+ bne LSYM(Lml_n) @ NAN / <anything> -> NAN >+ b LSYM(Lml_i) @ INF / <anything> -> INF >+1: teq r5, ip >+ bne 2f >+ orrs r5, yl, yh, lsl #12 >+ bne LSYM(Lml_n) @ <anything> / NAN -> NAN >+ b LSYM(Lml_z) @ <anything> / INF -> 0 >+2: @ One or both arguments are 0. 
>+ orrs r4, xl, xh, lsl #1 >+ bne LSYM(Lml_i) @ <non_zero> / 0 -> INF >+ orrs r5, yl, yh, lsl #1 >+ bne LSYM(Lml_z) @ 0 / <non_zero> -> 0 >+ b LSYM(Lml_n) @ 0 / 0 -> NAN >+ >+ FUNC_END divdf3 >+ >+#endif /* L_muldivdf3 */ >+ >+#ifdef L_cmpdf2 >+ >+FUNC_START gedf2 >+ARM_FUNC_START gtdf2 >+ mov ip, #-1 >+ b 1f >+ >+FUNC_START ledf2 >+ARM_FUNC_START ltdf2 >+ mov ip, #1 >+ b 1f >+ >+FUNC_START nedf2 >+FUNC_START eqdf2 >+ARM_FUNC_START cmpdf2 >+ mov ip, #1 @ how should we specify unordered here? >+ >+1: stmfd sp!, {r4, r5, lr} >+ >+ @ Trap any INF/NAN first. >+ mov lr, #0x7f000000 >+ orr lr, lr, #0x00f00000 >+ and r4, xh, lr >+ and r5, yh, lr >+ teq r4, lr >+ teqne r5, lr >+ beq 3f >+ >+ @ Test for equality. >+ @ Note that 0.0 is equal to -0.0. >+2: orrs ip, xl, xh, lsl #1 @ if x == 0.0 or -0.0 >+ orreqs ip, yl, yh, lsl #1 @ and y == 0.0 or -0.0 >+ teqne xh, yh @ or xh == yh >+ teqeq xl, yl @ and xl == yl >+ moveq r0, #0 @ then equal. >+ RETLDM "r4, r5" eq >+ >+ @ Check for sign difference. >+ teq xh, yh >+ movmi r0, xh, asr #31 >+ orrmi r0, r0, #1 >+ RETLDM "r4, r5" mi >+ >+ @ Compare exponents. >+ cmp r4, r5 >+ >+ @ Compare mantissa if exponents are equal. >+ moveq xh, xh, lsl #12 >+ cmpeq xh, yh, lsl #12 >+ cmpeq xl, yl >+ movcs r0, yh, asr #31 >+ mvncc r0, yh, asr #31 >+ orr r0, r0, #1 >+ RETLDM "r4, r5" >+ >+ @ Look for a NAN. >+3: teq r4, lr >+ bne 4f >+ orrs xl, xl, xh, lsl #12 >+ bne 5f @ x is NAN >+4: teq r5, lr >+ bne 2b >+ orrs yl, yl, yh, lsl #12 >+ beq 2b @ y is not NAN >+5: mov r0, ip @ return unordered code from ip >+ RETLDM "r4, r5" >+ >+ FUNC_END gedf2 >+ FUNC_END gtdf2 >+ FUNC_END ledf2 >+ FUNC_END ltdf2 >+ FUNC_END nedf2 >+ FUNC_END eqdf2 >+ FUNC_END cmpdf2 >+ >+#endif /* L_cmpdf2 */ >+ >+#ifdef L_unorddf2 >+ >+ARM_FUNC_START unorddf2 >+ str lr, [sp, #-4]! >+ mov ip, #0x7f000000 >+ orr ip, ip, #0x00f00000 >+ and lr, xh, ip >+ teq lr, ip >+ bne 1f >+ orrs xl, xl, xh, lsl #12 >+ bne 3f @ x is NAN >+1: and lr, yh, ip >+ teq lr, ip >+ bne 2f >+ orrs yl, yl, yh, lsl #12 >+ bne 3f @ y is NAN >+2: mov r0, #0 @ arguments are ordered. >+ RETLDM >+ >+3: mov r0, #1 @ arguments are unordered. >+ RETLDM >+ >+ FUNC_END unorddf2 >+ >+#endif /* L_unorddf2 */ >+ >+#ifdef L_fixdfsi >+ >+ARM_FUNC_START fixdfsi >+ orrs ip, xl, xh, lsl #1 >+ beq 1f @ value is 0. >+ >+ mov r3, r3, rrx @ preserve C flag (the actual sign) >+ >+ @ check exponent range. >+ mov ip, #0x7f000000 >+ orr ip, ip, #0x00f00000 >+ and r2, xh, ip >+ teq r2, ip >+ beq 2f @ value is INF or NAN >+ bic ip, ip, #0x40000000 >+ cmp r2, ip >+ bcc 1f @ value is too small >+ add ip, ip, #(31 << 20) >+ cmp r2, ip >+ bcs 3f @ value is too large >+ >+ rsb r2, r2, ip >+ mov ip, xh, lsl #11 >+ orr ip, ip, #0x80000000 >+ orr ip, ip, xl, lsr #21 >+ mov r2, r2, lsr #20 >+ tst r3, #0x80000000 @ the sign bit >+ mov r0, ip, lsr r2 >+ rsbne r0, r0, #0 >+ RET >+ >+1: mov r0, #0 >+ RET >+ >+2: orrs xl, xl, xh, lsl #12 >+ bne 4f @ r0 is NAN. >+3: ands r0, r3, #0x80000000 @ the sign bit >+ moveq r0, #0x7fffffff @ maximum signed positive si >+ RET >+ >+4: mov r0, #0 @ How should we convert NAN? >+ RET >+ >+ FUNC_END fixdfsi >+ >+#endif /* L_fixdfsi */ >+ >+#ifdef L_fixunsdfsi >+ >+ARM_FUNC_START fixunsdfsi >+ orrs ip, xl, xh, lsl #1 >+ movcss r0, #0 @ value is negative >+ RETc(eq) @ or 0 (xl, xh overlap r0) >+ >+ @ check exponent range. 
>+ mov ip, #0x7f000000 >+ orr ip, ip, #0x00f00000 >+ and r2, xh, ip >+ teq r2, ip >+ beq 2f @ value is INF or NAN >+ bic ip, ip, #0x40000000 >+ cmp r2, ip >+ bcc 1f @ value is too small >+ add ip, ip, #(31 << 20) >+ cmp r2, ip >+ bhi 3f @ value is too large >+ >+ rsb r2, r2, ip >+ mov ip, xh, lsl #11 >+ orr ip, ip, #0x80000000 >+ orr ip, ip, xl, lsr #21 >+ mov r2, r2, lsr #20 >+ mov r0, ip, lsr r2 >+ RET >+ >+1: mov r0, #0 >+ RET >+ >+2: orrs xl, xl, xh, lsl #12 >+ bne 4f @ value is NAN. >+3: mov r0, #0xffffffff @ maximum unsigned si >+ RET >+ >+4: mov r0, #0 @ How should we convert NAN? >+ RET >+ >+ FUNC_END fixunsdfsi >+ >+#endif /* L_fixunsdfsi */ >+ >+#ifdef L_truncdfsf2 >+ >+ARM_FUNC_START truncdfsf2 >+ orrs r2, xl, xh, lsl #1 >+ moveq r0, r2, rrx >+ RETc(eq) @ value is 0.0 or -0.0 >+ >+ @ check exponent range. >+ mov ip, #0x7f000000 >+ orr ip, ip, #0x00f00000 >+ and r2, ip, xh >+ teq r2, ip >+ beq 2f @ value is INF or NAN >+ bic xh, xh, ip >+ cmp r2, #(0x380 << 20) >+ bls 4f @ value is too small >+ >+ @ shift and round mantissa >+1: movs r3, xl, lsr #29 >+ adc r3, r3, xh, lsl #3 >+ >+ @ if halfway between two numbers, round towards LSB = 0. >+ mov xl, xl, lsl #3 >+ teq xl, #0x80000000 >+ biceq r3, r3, #1 >+ >+ @ rounding might have created an extra MSB. If so adjust exponent. >+ tst r3, #0x00800000 >+ addne r2, r2, #(1 << 20) >+ bicne r3, r3, #0x00800000 >+ >+ @ check exponent for overflow >+ mov ip, #(0x400 << 20) >+ orr ip, ip, #(0x07f << 20) >+ cmp r2, ip >+ bcs 3f @ overflow >+ >+ @ adjust exponent, merge with sign bit and mantissa. >+ movs xh, xh, lsl #1 >+ mov r2, r2, lsl #4 >+ orr r0, r3, r2, rrx >+ eor r0, r0, #0x40000000 >+ RET >+ >+2: @ chech for NAN >+ orrs xl, xl, xh, lsl #12 >+ movne r0, #0x7f000000 >+ orrne r0, r0, #0x00c00000 >+ RETc(ne) @ return NAN >+ >+3: @ return INF with sign >+ and r0, xh, #0x80000000 >+ orr r0, r0, #0x7f000000 >+ orr r0, r0, #0x00800000 >+ RET >+ >+4: @ check if denormalized value is possible >+ subs r2, r2, #((0x380 - 24) << 20) >+ andle r0, xh, #0x80000000 @ too small, return signed 0. >+ RETc(le) >+ >+ @ denormalize value so we can resume with the code above afterwards. >+ orr xh, xh, #0x00100000 >+ mov r2, r2, lsr #20 >+ rsb r2, r2, #25 >+ cmp r2, #20 >+ bgt 6f >+ >+ rsb ip, r2, #32 >+ mov r3, xl, lsl ip >+ mov xl, xl, lsr r2 >+ orr xl, xl, xh, lsl ip >+ movs xh, xh, lsl #1 >+ mov xh, xh, lsr r2 >+ mov xh, xh, rrx >+5: teq r3, #0 @ fold r3 bits into the LSB >+ orrne xl, xl, #1 @ for rounding considerations. >+ mov r2, #(0x380 << 20) @ equivalent to the 0 float exponent >+ b 1b >+ >+6: rsb r2, r2, #(12 + 20) >+ rsb ip, r2, #32 >+ mov r3, xl, lsl r2 >+ mov xl, xl, lsr ip >+ orr xl, xl, xh, lsl r2 >+ and xh, xh, #0x80000000 >+ b 5b >+ >+ FUNC_END truncdfsf2 >+ >+#endif /* L_truncdfsf2 */ >diff -urNd gcc-3.3.3-orig/gcc/config/arm/ieee754-sf.S gcc-3.3.3/gcc/config/arm/ieee754-sf.S >--- gcc-3.3.3-orig/gcc/config/arm/ieee754-sf.S 1970-01-01 01:00:00.000000000 +0100 >+++ gcc-3.3.3/gcc/config/arm/ieee754-sf.S 2004-04-30 23:41:18.542121600 +0200 >@@ -0,0 +1,815 @@ >+/* ieee754-sf.S single-precision floating point support for ARM >+ >+ Copyright (C) 2003 Free Software Foundation, Inc. >+ Contributed by Nicolas Pitre (nico@cam.org) >+ >+ This file is free software; you can redistribute it and/or modify it >+ under the terms of the GNU General Public License as published by the >+ Free Software Foundation; either version 2, or (at your option) any >+ later version. 
>+ >+ In addition to the permissions in the GNU General Public License, the >+ Free Software Foundation gives you unlimited permission to link the >+ compiled version of this file into combinations with other programs, >+ and to distribute those combinations without any restriction coming >+ from the use of this file. (The General Public License restrictions >+ do apply in other respects; for example, they cover modification of >+ the file, and distribution when not linked into a combine >+ executable.) >+ >+ This file is distributed in the hope that it will be useful, but >+ WITHOUT ANY WARRANTY; without even the implied warranty of >+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >+ General Public License for more details. >+ >+ You should have received a copy of the GNU General Public License >+ along with this program; see the file COPYING. If not, write to >+ the Free Software Foundation, 59 Temple Place - Suite 330, >+ Boston, MA 02111-1307, USA. */ >+ >+/* >+ * Notes: >+ * >+ * The goal of this code is to be as fast as possible. This is >+ * not meant to be easy to understand for the casual reader. >+ * >+ * Only the default rounding mode is intended for best performances. >+ * Exceptions aren't supported yet, but that can be added quite easily >+ * if necessary without impacting performances. >+ */ >+ >+#ifdef L_negsf2 >+ >+ARM_FUNC_START negsf2 >+ eor r0, r0, #0x80000000 @ flip sign bit >+ RET >+ >+ FUNC_END negsf2 >+ >+#endif >+ >+#ifdef L_addsubsf3 >+ >+ARM_FUNC_START subsf3 >+ eor r1, r1, #0x80000000 @ flip sign bit of second arg >+#if defined(__thumb__) && !defined(__THUMB_INTERWORK__) >+ b 1f @ Skip Thumb-code prologue >+#endif >+ >+ARM_FUNC_START addsf3 >+ >+1: @ Compare both args, return zero if equal but the sign. >+ eor r2, r0, r1 >+ teq r2, #0x80000000 >+ beq LSYM(Lad_z) >+ >+ @ If first arg is 0 or -0, return second arg. >+ @ If second arg is 0 or -0, return first arg. >+ bics r2, r0, #0x80000000 >+ moveq r0, r1 >+ bicnes r2, r1, #0x80000000 >+ RETc(eq) >+ >+ @ Mask out exponents. >+ mov ip, #0xff000000 >+ and r2, r0, ip, lsr #1 >+ and r3, r1, ip, lsr #1 >+ >+ @ If either of them is 255, result will be INF or NAN >+ teq r2, ip, lsr #1 >+ teqne r3, ip, lsr #1 >+ beq LSYM(Lad_i) >+ >+ @ Compute exponent difference. Make largest exponent in r2, >+ @ corresponding arg in r0, and positive exponent difference in r3. >+ subs r3, r3, r2 >+ addgt r2, r2, r3 >+ eorgt r1, r0, r1 >+ eorgt r0, r1, r0 >+ eorgt r1, r0, r1 >+ rsblt r3, r3, #0 >+ >+ @ If exponent difference is too large, return largest argument >+ @ already in r0. We need up to 25 bit to handle proper rounding >+ @ of 0x1p25 - 1.1. >+ cmp r3, #(25 << 23) >+ RETc(hi) >+ >+ @ Convert mantissa to signed integer. >+ tst r0, #0x80000000 >+ orr r0, r0, #0x00800000 >+ bic r0, r0, #0xff000000 >+ rsbne r0, r0, #0 >+ tst r1, #0x80000000 >+ orr r1, r1, #0x00800000 >+ bic r1, r1, #0xff000000 >+ rsbne r1, r1, #0 >+ >+ @ If exponent == difference, one or both args were denormalized. >+ @ Since this is not common case, rescale them off line. >+ teq r2, r3 >+ beq LSYM(Lad_d) >+LSYM(Lad_x): >+ >+ @ Scale down second arg with exponent difference. >+ @ Apply shift one bit left to first arg and the rest to second arg >+ @ to simplify things later, but only if exponent does not become 0. >+ movs r3, r3, lsr #23 >+ teqne r2, #(1 << 23) >+ movne r0, r0, lsl #1 >+ subne r2, r2, #(1 << 23) >+ subne r3, r3, #1 >+ >+ @ Shift second arg into ip, keep leftover bits into r1. 
>+ mov ip, r1, asr r3 >+ rsb r3, r3, #32 >+ mov r1, r1, lsl r3 >+ >+ add r0, r0, ip @ the actual addition >+ >+ @ We now have a 64 bit result in r0-r1. >+ @ Keep absolute value in r0-r1, sign in r3. >+ ands r3, r0, #0x80000000 >+ bpl LSYM(Lad_p) >+ rsbs r1, r1, #0 >+ rsc r0, r0, #0 >+ >+ @ Determine how to normalize the result. >+LSYM(Lad_p): >+ cmp r0, #0x00800000 >+ bcc LSYM(Lad_l) >+ cmp r0, #0x01000000 >+ bcc LSYM(Lad_r0) >+ cmp r0, #0x02000000 >+ bcc LSYM(Lad_r1) >+ >+ @ Result needs to be shifted right. >+ movs r0, r0, lsr #1 >+ mov r1, r1, rrx >+ add r2, r2, #(1 << 23) >+LSYM(Lad_r1): >+ movs r0, r0, lsr #1 >+ mov r1, r1, rrx >+ add r2, r2, #(1 << 23) >+ >+ @ Our result is now properly aligned into r0, remaining bits in r1. >+ @ Round with MSB of r1. If halfway between two numbers, round towards >+ @ LSB of r0 = 0. >+LSYM(Lad_r0): >+ add r0, r0, r1, lsr #31 >+ teq r1, #0x80000000 >+ biceq r0, r0, #1 >+ >+ @ Rounding may have added a new MSB. Adjust exponent. >+ @ That MSB will be cleared when exponent is merged below. >+ tst r0, #0x01000000 >+ addne r2, r2, #(1 << 23) >+ >+ @ Make sure we did not bust our exponent. >+ cmp r2, #(254 << 23) >+ bhi LSYM(Lad_o) >+ >+ @ Pack final result together. >+LSYM(Lad_e): >+ bic r0, r0, #0x01800000 >+ orr r0, r0, r2 >+ orr r0, r0, r3 >+ RET >+ >+ @ Result must be shifted left. >+ @ No rounding necessary since r1 will always be 0. >+LSYM(Lad_l): >+ >+#if __ARM_ARCH__ < 5 >+ >+ movs ip, r0, lsr #12 >+ moveq r0, r0, lsl #12 >+ subeq r2, r2, #(12 << 23) >+ tst r0, #0x00ff0000 >+ moveq r0, r0, lsl #8 >+ subeq r2, r2, #(8 << 23) >+ tst r0, #0x00f00000 >+ moveq r0, r0, lsl #4 >+ subeq r2, r2, #(4 << 23) >+ tst r0, #0x00c00000 >+ moveq r0, r0, lsl #2 >+ subeq r2, r2, #(2 << 23) >+ tst r0, #0x00800000 >+ moveq r0, r0, lsl #1 >+ subeq r2, r2, #(1 << 23) >+ cmp r2, #0 >+ bgt LSYM(Lad_e) >+ >+#else >+ >+ clz ip, r0 >+ sub ip, ip, #8 >+ mov r0, r0, lsl ip >+ subs r2, r2, ip, lsl #23 >+ bgt LSYM(Lad_e) >+ >+#endif >+ >+ @ Exponent too small, denormalize result. >+ mvn r2, r2, asr #23 >+ add r2, r2, #2 >+ orr r0, r3, r0, lsr r2 >+ RET >+ >+ @ Fixup and adjust bit position for denormalized arguments. >+ @ Note that r2 must not remain equal to 0. >+LSYM(Lad_d): >+ teq r2, #0 >+ eoreq r0, r0, #0x00800000 >+ addeq r2, r2, #(1 << 23) >+ eor r1, r1, #0x00800000 >+ subne r3, r3, #(1 << 23) >+ b LSYM(Lad_x) >+ >+ @ Result is x - x = 0, unless x is INF or NAN. >+LSYM(Lad_z): >+ mov ip, #0xff000000 >+ and r2, r0, ip, lsr #1 >+ teq r2, ip, lsr #1 >+ moveq r0, ip, asr #2 >+ movne r0, #0 >+ RET >+ >+ @ Overflow: return INF. >+LSYM(Lad_o): >+ orr r0, r3, #0x7f000000 >+ orr r0, r0, #0x00800000 >+ RET >+ >+ @ At least one of r0/r1 is INF/NAN. >+ @ if r0 != INF/NAN: return r1 (which is INF/NAN) >+ @ if r1 != INF/NAN: return r0 (which is INF/NAN) >+ @ if r0 or r1 is NAN: return NAN >+ @ if opposite sign: return NAN >+ @ return r0 (which is INF or -INF) >+LSYM(Lad_i): >+ teq r2, ip, lsr #1 >+ movne r0, r1 >+ teqeq r3, ip, lsr #1 >+ RETc(ne) >+ movs r2, r0, lsl #9 >+ moveqs r2, r1, lsl #9 >+ teqeq r0, r1 >+ orrne r0, r3, #0x00400000 @ NAN >+ RET >+ >+ FUNC_END addsf3 >+ FUNC_END subsf3 >+ >+ARM_FUNC_START floatunsisf >+ mov r3, #0 >+ b 1f >+ >+ARM_FUNC_START floatsisf >+ ands r3, r0, #0x80000000 >+ rsbmi r0, r0, #0 >+ >+1: teq r0, #0 >+ RETc(eq) >+ >+ mov r1, #0 >+ mov r2, #((127 + 23) << 23) >+ tst r0, #0xfc000000 >+ beq LSYM(Lad_p) >+ >+ @ We need to scale the value a little before branching to code above. 
>+ tst r0, #0xf0000000 >+ movne r1, r0, lsl #28 >+ movne r0, r0, lsr #4 >+ addne r2, r2, #(4 << 23) >+ tst r0, #0x0c000000 >+ beq LSYM(Lad_p) >+ mov r1, r1, lsr #2 >+ orr r1, r1, r0, lsl #30 >+ mov r0, r0, lsr #2 >+ add r2, r2, #(2 << 23) >+ b LSYM(Lad_p) >+ >+ FUNC_END floatsisf >+ FUNC_END floatunsisf >+ >+#endif /* L_addsubsf3 */ >+ >+#ifdef L_muldivsf3 >+ >+ARM_FUNC_START mulsf3 >+ >+ @ Mask out exponents. >+ mov ip, #0xff000000 >+ and r2, r0, ip, lsr #1 >+ and r3, r1, ip, lsr #1 >+ >+ @ Trap any INF/NAN. >+ teq r2, ip, lsr #1 >+ teqne r3, ip, lsr #1 >+ beq LSYM(Lml_s) >+ >+ @ Trap any multiplication by 0. >+ bics ip, r0, #0x80000000 >+ bicnes ip, r1, #0x80000000 >+ beq LSYM(Lml_z) >+ >+ @ Shift exponents right one bit to make room for overflow bit. >+ @ If either of them is 0, scale denormalized arguments off line. >+ @ Then add both exponents together. >+ movs r2, r2, lsr #1 >+ teqne r3, #0 >+ beq LSYM(Lml_d) >+LSYM(Lml_x): >+ add r2, r2, r3, asr #1 >+ >+ @ Preserve final sign in r2 along with exponent for now. >+ teq r0, r1 >+ orrmi r2, r2, #0x8000 >+ >+ @ Convert mantissa to unsigned integer. >+ bic r0, r0, #0xff000000 >+ bic r1, r1, #0xff000000 >+ orr r0, r0, #0x00800000 >+ orr r1, r1, #0x00800000 >+ >+#if __ARM_ARCH__ < 4 >+ >+ @ Well, no way to make it shorter without the umull instruction. >+ @ We must perform that 24 x 24 -> 48 bit multiplication by hand. >+ stmfd sp!, {r4, r5} >+ mov r4, r0, lsr #16 >+ mov r5, r1, lsr #16 >+ bic r0, r0, #0x00ff0000 >+ bic r1, r1, #0x00ff0000 >+ mul ip, r4, r5 >+ mul r3, r0, r1 >+ mul r0, r5, r0 >+ mla r0, r4, r1, r0 >+ adds r3, r3, r0, lsl #16 >+ adc ip, ip, r0, lsr #16 >+ ldmfd sp!, {r4, r5} >+ >+#else >+ >+ umull r3, ip, r0, r1 @ The actual multiplication. >+ >+#endif >+ >+ @ Put final sign in r0. >+ mov r0, r2, lsl #16 >+ bic r2, r2, #0x8000 >+ >+ @ Adjust result if one extra MSB appeared. >+ @ The LSB may be lost but this never changes the result in this case. >+ tst ip, #(1 << 15) >+ addne r2, r2, #(1 << 22) >+ movnes ip, ip, lsr #1 >+ movne r3, r3, rrx >+ >+ @ Apply exponent bias, check range for underflow. >+ subs r2, r2, #(127 << 22) >+ ble LSYM(Lml_u) >+ >+ @ Scale back to 24 bits with rounding. >+ @ r0 contains sign bit already. >+ orrs r0, r0, r3, lsr #23 >+ adc r0, r0, ip, lsl #9 >+ >+ @ If halfway between two numbers, rounding should be towards LSB = 0. >+ mov r3, r3, lsl #9 >+ teq r3, #0x80000000 >+ biceq r0, r0, #1 >+ >+ @ Note: rounding may have produced an extra MSB here. >+ @ The extra bit is cleared before merging the exponent below. >+ tst r0, #0x01000000 >+ addne r2, r2, #(1 << 22) >+ >+ @ Check for exponent overflow >+ cmp r2, #(255 << 22) >+ bge LSYM(Lml_o) >+ >+ @ Add final exponent. >+ bic r0, r0, #0x01800000 >+ orr r0, r0, r2, lsl #1 >+ RET >+ >+ @ Result is 0, but determine sign anyway. >+LSYM(Lml_z): eor r0, r0, r1 >+ bic r0, r0, #0x7fffffff >+ RET >+ >+ @ Check if denormalized result is possible, otherwise return signed 0. >+LSYM(Lml_u): >+ cmn r2, #(24 << 22) >+ RETc(le) >+ >+ @ Find out proper shift value. >+ mvn r1, r2, asr #22 >+ subs r1, r1, #7 >+ bgt LSYM(Lml_ur) >+ >+ @ Shift value left, round, etc. >+ add r1, r1, #32 >+ orrs r0, r0, r3, lsr r1 >+ rsb r1, r1, #32 >+ adc r0, r0, ip, lsl r1 >+ mov ip, r3, lsl r1 >+ teq ip, #0x80000000 >+ biceq r0, r0, #1 >+ RET >+ >+ @ Shift value right, round, etc. >+ @ Note: r1 must not be 0 otherwise carry does not get set. 
>+LSYM(Lml_ur): >+ orrs r0, r0, ip, lsr r1 >+ adc r0, r0, #0 >+ rsb r1, r1, #32 >+ mov ip, ip, lsl r1 >+ teq r3, #0 >+ teqeq ip, #0x80000000 >+ biceq r0, r0, #1 >+ RET >+ >+ @ One or both arguments are denormalized. >+ @ Scale them leftwards and preserve sign bit. >+LSYM(Lml_d): >+ teq r2, #0 >+ and ip, r0, #0x80000000 >+1: moveq r0, r0, lsl #1 >+ tsteq r0, #0x00800000 >+ subeq r2, r2, #(1 << 22) >+ beq 1b >+ orr r0, r0, ip >+ teq r3, #0 >+ and ip, r1, #0x80000000 >+2: moveq r1, r1, lsl #1 >+ tsteq r1, #0x00800000 >+ subeq r3, r3, #(1 << 23) >+ beq 2b >+ orr r1, r1, ip >+ b LSYM(Lml_x) >+ >+ @ One or both args are INF or NAN. >+LSYM(Lml_s): >+ teq r0, #0x0 >+ teqne r1, #0x0 >+ teqne r0, #0x80000000 >+ teqne r1, #0x80000000 >+ beq LSYM(Lml_n) @ 0 * INF or INF * 0 -> NAN >+ teq r2, ip, lsr #1 >+ bne 1f >+ movs r2, r0, lsl #9 >+ bne LSYM(Lml_n) @ NAN * <anything> -> NAN >+1: teq r3, ip, lsr #1 >+ bne LSYM(Lml_i) >+ movs r3, r1, lsl #9 >+ bne LSYM(Lml_n) @ <anything> * NAN -> NAN >+ >+ @ Result is INF, but we need to determine its sign. >+LSYM(Lml_i): >+ eor r0, r0, r1 >+ >+ @ Overflow: return INF (sign already in r0). >+LSYM(Lml_o): >+ and r0, r0, #0x80000000 >+ orr r0, r0, #0x7f000000 >+ orr r0, r0, #0x00800000 >+ RET >+ >+ @ Return NAN. >+LSYM(Lml_n): >+ mov r0, #0x7f000000 >+ orr r0, r0, #0x00c00000 >+ RET >+ >+ FUNC_END mulsf3 >+ >+ARM_FUNC_START divsf3 >+ >+ @ Mask out exponents. >+ mov ip, #0xff000000 >+ and r2, r0, ip, lsr #1 >+ and r3, r1, ip, lsr #1 >+ >+ @ Trap any INF/NAN or zeroes. >+ teq r2, ip, lsr #1 >+ teqne r3, ip, lsr #1 >+ bicnes ip, r0, #0x80000000 >+ bicnes ip, r1, #0x80000000 >+ beq LSYM(Ldv_s) >+ >+ @ Shift exponents right one bit to make room for overflow bit. >+ @ If either of them is 0, scale denormalized arguments off line. >+ @ Then substract divisor exponent from dividend''s. >+ movs r2, r2, lsr #1 >+ teqne r3, #0 >+ beq LSYM(Ldv_d) >+LSYM(Ldv_x): >+ sub r2, r2, r3, asr #1 >+ >+ @ Preserve final sign into ip. >+ eor ip, r0, r1 >+ >+ @ Convert mantissa to unsigned integer. >+ @ Dividend -> r3, divisor -> r1. >+ mov r3, #0x10000000 >+ movs r1, r1, lsl #9 >+ mov r0, r0, lsl #9 >+ beq LSYM(Ldv_1) >+ orr r1, r3, r1, lsr #4 >+ orr r3, r3, r0, lsr #4 >+ >+ @ Initialize r0 (result) with final sign bit. >+ and r0, ip, #0x80000000 >+ >+ @ Ensure result will land to known bit position. >+ cmp r3, r1 >+ subcc r2, r2, #(1 << 22) >+ movcc r3, r3, lsl #1 >+ >+ @ Apply exponent bias, check range for over/underflow. >+ add r2, r2, #(127 << 22) >+ cmn r2, #(24 << 22) >+ RETc(le) >+ cmp r2, #(255 << 22) >+ bge LSYM(Lml_o) >+ >+ @ The actual division loop. >+ mov ip, #0x00800000 >+1: cmp r3, r1 >+ subcs r3, r3, r1 >+ orrcs r0, r0, ip >+ cmp r3, r1, lsr #1 >+ subcs r3, r3, r1, lsr #1 >+ orrcs r0, r0, ip, lsr #1 >+ cmp r3, r1, lsr #2 >+ subcs r3, r3, r1, lsr #2 >+ orrcs r0, r0, ip, lsr #2 >+ cmp r3, r1, lsr #3 >+ subcs r3, r3, r1, lsr #3 >+ orrcs r0, r0, ip, lsr #3 >+ movs r3, r3, lsl #4 >+ movnes ip, ip, lsr #4 >+ bne 1b >+ >+ @ Check if denormalized result is needed. >+ cmp r2, #0 >+ ble LSYM(Ldv_u) >+ >+ @ Apply proper rounding. >+ cmp r3, r1 >+ addcs r0, r0, #1 >+ biceq r0, r0, #1 >+ >+ @ Add exponent to result. >+ bic r0, r0, #0x00800000 >+ orr r0, r0, r2, lsl #1 >+ RET >+ >+ @ Division by 0x1p*: let''s shortcut a lot of code. 
>+LSYM(Ldv_1): >+ and ip, ip, #0x80000000 >+ orr r0, ip, r0, lsr #9 >+ add r2, r2, #(127 << 22) >+ cmp r2, #(255 << 22) >+ bge LSYM(Lml_o) >+ cmp r2, #0 >+ orrgt r0, r0, r2, lsl #1 >+ RETc(gt) >+ cmn r2, #(24 << 22) >+ movle r0, ip >+ RETc(le) >+ orr r0, r0, #0x00800000 >+ mov r3, #0 >+ >+ @ Result must be denormalized: prepare parameters to use code above. >+ @ r3 already contains remainder for rounding considerations. >+LSYM(Ldv_u): >+ bic ip, r0, #0x80000000 >+ and r0, r0, #0x80000000 >+ mvn r1, r2, asr #22 >+ add r1, r1, #2 >+ b LSYM(Lml_ur) >+ >+ @ One or both arguments are denormalized. >+ @ Scale them leftwards and preserve sign bit. >+LSYM(Ldv_d): >+ teq r2, #0 >+ and ip, r0, #0x80000000 >+1: moveq r0, r0, lsl #1 >+ tsteq r0, #0x00800000 >+ subeq r2, r2, #(1 << 22) >+ beq 1b >+ orr r0, r0, ip >+ teq r3, #0 >+ and ip, r1, #0x80000000 >+2: moveq r1, r1, lsl #1 >+ tsteq r1, #0x00800000 >+ subeq r3, r3, #(1 << 23) >+ beq 2b >+ orr r1, r1, ip >+ b LSYM(Ldv_x) >+ >+ @ One or both arguments is either INF, NAN or zero. >+LSYM(Ldv_s): >+ mov ip, #0xff000000 >+ teq r2, ip, lsr #1 >+ teqeq r3, ip, lsr #1 >+ beq LSYM(Lml_n) @ INF/NAN / INF/NAN -> NAN >+ teq r2, ip, lsr #1 >+ bne 1f >+ movs r2, r0, lsl #9 >+ bne LSYM(Lml_n) @ NAN / <anything> -> NAN >+ b LSYM(Lml_i) @ INF / <anything> -> INF >+1: teq r3, ip, lsr #1 >+ bne 2f >+ movs r3, r1, lsl #9 >+ bne LSYM(Lml_n) @ <anything> / NAN -> NAN >+ b LSYM(Lml_z) @ <anything> / INF -> 0 >+2: @ One or both arguments are 0. >+ bics r2, r0, #0x80000000 >+ bne LSYM(Lml_i) @ <non_zero> / 0 -> INF >+ bics r3, r1, #0x80000000 >+ bne LSYM(Lml_z) @ 0 / <non_zero> -> 0 >+ b LSYM(Lml_n) @ 0 / 0 -> NAN >+ >+ FUNC_END divsf3 >+ >+#endif /* L_muldivsf3 */ >+ >+#ifdef L_cmpsf2 >+ >+FUNC_START gesf2 >+ARM_FUNC_START gtsf2 >+ mov r3, #-1 >+ b 1f >+ >+FUNC_START lesf2 >+ARM_FUNC_START ltsf2 >+ mov r3, #1 >+ b 1f >+ >+FUNC_START nesf2 >+FUNC_START eqsf2 >+ARM_FUNC_START cmpsf2 >+ mov r3, #1 @ how should we specify unordered here? >+ >+1: @ Trap any INF/NAN first. >+ mov ip, #0xff000000 >+ and r2, r1, ip, lsr #1 >+ teq r2, ip, lsr #1 >+ and r2, r0, ip, lsr #1 >+ teqne r2, ip, lsr #1 >+ beq 3f >+ >+ @ Test for equality. >+ @ Note that 0.0 is equal to -0.0. >+2: orr r3, r0, r1 >+ bics r3, r3, #0x80000000 @ either 0.0 or -0.0 >+ teqne r0, r1 @ or both the same >+ moveq r0, #0 >+ RETc(eq) >+ >+ @ Check for sign difference. The N flag is set if it is the case. >+ @ If so, return sign of r0. >+ movmi r0, r0, asr #31 >+ orrmi r0, r0, #1 >+ RETc(mi) >+ >+ @ Compare exponents. >+ and r3, r1, ip, lsr #1 >+ cmp r2, r3 >+ >+ @ Compare mantissa if exponents are equal >+ moveq r0, r0, lsl #9 >+ cmpeq r0, r1, lsl #9 >+ movcs r0, r1, asr #31 >+ mvncc r0, r1, asr #31 >+ orr r0, r0, #1 >+ RET >+ >+ @ Look for a NAN. >+3: and r2, r1, ip, lsr #1 >+ teq r2, ip, lsr #1 >+ bne 4f >+ movs r2, r1, lsl #9 >+ bne 5f @ r1 is NAN >+4: and r2, r0, ip, lsr #1 >+ teq r2, ip, lsr #1 >+ bne 2b >+ movs ip, r0, lsl #9 >+ beq 2b @ r0 is not NAN >+5: mov r0, r3 @ return unordered code from r3. >+ RET >+ >+ FUNC_END gesf2 >+ FUNC_END gtsf2 >+ FUNC_END lesf2 >+ FUNC_END ltsf2 >+ FUNC_END nesf2 >+ FUNC_END eqsf2 >+ FUNC_END cmpsf2 >+ >+#endif /* L_cmpsf2 */ >+ >+#ifdef L_unordsf2 >+ >+ARM_FUNC_START unordsf2 >+ mov ip, #0xff000000 >+ and r2, r1, ip, lsr #1 >+ teq r2, ip, lsr #1 >+ bne 1f >+ movs r2, r1, lsl #9 >+ bne 3f @ r1 is NAN >+1: and r2, r0, ip, lsr #1 >+ teq r2, ip, lsr #1 >+ bne 2f >+ movs r2, r0, lsl #9 >+ bne 3f @ r0 is NAN >+2: mov r0, #0 @ arguments are ordered. 
>+ RET >+3: mov r0, #1 @ arguments are unordered. >+ RET >+ >+ FUNC_END unordsf2 >+ >+#endif /* L_unordsf2 */ >+ >+#ifdef L_fixsfsi >+ >+ARM_FUNC_START fixsfsi >+ movs r0, r0, lsl #1 >+ RETc(eq) @ value is 0. >+ >+ mov r1, r1, rrx @ preserve C flag (the actual sign) >+ >+ @ check exponent range. >+ and r2, r0, #0xff000000 >+ cmp r2, #(127 << 24) >+ movcc r0, #0 @ value is too small >+ RETc(cc) >+ cmp r2, #((127 + 31) << 24) >+ bcs 1f @ value is too large >+ >+ mov r0, r0, lsl #7 >+ orr r0, r0, #0x80000000 >+ mov r2, r2, lsr #24 >+ rsb r2, r2, #(127 + 31) >+ tst r1, #0x80000000 @ the sign bit >+ mov r0, r0, lsr r2 >+ rsbne r0, r0, #0 >+ RET >+ >+1: teq r2, #0xff000000 >+ bne 2f >+ movs r0, r0, lsl #8 >+ bne 3f @ r0 is NAN. >+2: ands r0, r1, #0x80000000 @ the sign bit >+ moveq r0, #0x7fffffff @ the maximum signed positive si >+ RET >+ >+3: mov r0, #0 @ What should we convert NAN to? >+ RET >+ >+ FUNC_END fixsfsi >+ >+#endif /* L_fixsfsi */ >+ >+#ifdef L_fixunssfsi >+ >+ARM_FUNC_START fixunssfsi >+ movs r0, r0, lsl #1 >+ movcss r0, #0 @ value is negative... >+ RETc(eq) @ ... or 0. >+ >+ >+ @ check exponent range. >+ and r2, r0, #0xff000000 >+ cmp r2, #(127 << 24) >+ movcc r0, #0 @ value is too small >+ RETc(cc) >+ cmp r2, #((127 + 32) << 24) >+ bcs 1f @ value is too large >+ >+ mov r0, r0, lsl #7 >+ orr r0, r0, #0x80000000 >+ mov r2, r2, lsr #24 >+ rsb r2, r2, #(127 + 31) >+ mov r0, r0, lsr r2 >+ RET >+ >+1: teq r2, #0xff000000 >+ bne 2f >+ movs r0, r0, lsl #8 >+ bne 3f @ r0 is NAN. >+2: mov r0, #0xffffffff @ maximum unsigned si >+ RET >+ >+3: mov r0, #0 @ What should we convert NAN to? >+ RET >+ >+ FUNC_END fixunssfsi >+ >+#endif /* L_fixunssfsi */ >diff -urNd gcc-3.3.3-orig/gcc/config/arm/lib1funcs.asm gcc-3.3.3/gcc/config/arm/lib1funcs.asm >--- gcc-3.3.3-orig/gcc/config/arm/lib1funcs.asm 2001-09-18 12:02:37.000000000 +0200 >+++ gcc-3.3.3/gcc/config/arm/lib1funcs.asm 2004-04-30 23:41:18.552136000 +0200 >@@ -51,74 +51,117 @@ > #endif > #define TYPE(x) .type SYM(x),function > #define SIZE(x) .size SYM(x), . - SYM(x) >+#define LSYM(x) .x > #else > #define __PLT__ > #define TYPE(x) > #define SIZE(x) >+#define LSYM(x) x > #endif > > /* Function end macros. Variants for 26 bit APCS and interworking. */ > >+@ This selects the minimum architecture level required. >+#define __ARM_ARCH__ 3 >+ >+#if defined(__ARM_ARCH_3M__) || defined(__ARM_ARCH_4__) \ >+ || defined(__ARM_ARCH_4T__) >+/* We use __ARM_ARCH__ set to 4 here, but in reality it's any processor with >+ long multiply instructions. That includes v3M. */ >+# undef __ARM_ARCH__ >+# define __ARM_ARCH__ 4 >+#endif >+ >+#if defined(__ARM_ARCH_5__) || defined(__ARM_ARCH_5T__) \ >+ || defined(__ARM_ARCH_5TE__) >+# undef __ARM_ARCH__ >+# define __ARM_ARCH__ 5 >+#endif >+ >+/* How to return from a function call depends on the architecture variant. */ >+ > #ifdef __APCS_26__ >+ > # define RET movs pc, lr > # define RETc(x) mov##x##s pc, lr >-# define RETCOND ^ >+ >+#elif (__ARM_ARCH__ > 4) || defined(__thumb__) || defined(__THUMB_INTERWORK__) >+ >+# define RET bx lr >+# define RETc(x) bx##x lr >+ >+# if (__ARM_ARCH__ == 4) \ >+ && (defined(__thumb__) || defined(__THUMB_INTERWORK__)) >+# define __INTERWORKING__ >+# endif >+ >+#else >+ >+# define RET mov pc, lr >+# define RETc(x) mov##x pc, lr >+ >+#endif >+ >+/* Don't pass dirn, it's there just to get token pasting right. 
*/ >+ >+.macro RETLDM regs=, cond=, dirn=ia >+#ifdef __APCS_26__ >+ .ifc "\regs","" >+ ldm\cond\dirn sp!, {pc}^ >+ .else >+ ldm\cond\dirn sp!, {\regs, pc}^ >+ .endif >+#elif defined (__INTERWORKING__) >+ .ifc "\regs","" >+ ldr\cond lr, [sp], #4 >+ .else >+ ldm\cond\dirn sp!, {\regs, lr} >+ .endif >+ bx\cond lr >+#else >+ .ifc "\regs","" >+ ldr\cond pc, [sp], #4 >+ .else >+ ldm\cond\dirn sp!, {\regs, pc} >+ .endif >+#endif >+.endm >+ >+ > .macro ARM_LDIV0 >-Ldiv0: >+LSYM(Ldiv0): > str lr, [sp, #-4]! > bl SYM (__div0) __PLT__ > mov r0, #0 @ About as wrong as it could be. >- ldmia sp!, {pc}^ >+ RETLDM > .endm >-#else >-# ifdef __THUMB_INTERWORK__ >-# define RET bx lr >-# define RETc(x) bx##x lr >+ >+ > .macro THUMB_LDIV0 >-Ldiv0: >+LSYM(Ldiv0): > push { lr } > bl SYM (__div0) > mov r0, #0 @ About as wrong as it could be. >+#if defined (__INTERWORKING__) > pop { r1 } > bx r1 >-.endm >-.macro ARM_LDIV0 >-Ldiv0: >- str lr, [sp, #-4]! >- bl SYM (__div0) __PLT__ >- mov r0, #0 @ About as wrong as it could be. >- ldr lr, [sp], #4 >- bx lr >-.endm >-# else >-# define RET mov pc, lr >-# define RETc(x) mov##x pc, lr >-.macro THUMB_LDIV0 >-Ldiv0: >- push { lr } >- bl SYM (__div0) >- mov r0, #0 @ About as wrong as it could be. >+#else > pop { pc } >-.endm >-.macro ARM_LDIV0 >-Ldiv0: >- str lr, [sp, #-4]! >- bl SYM (__div0) __PLT__ >- mov r0, #0 @ About as wrong as it could be. >- ldmia sp!, {pc} >-.endm >-# endif >-# define RETCOND > #endif >+.endm > > .macro FUNC_END name >-Ldiv0: >+ SIZE (__\name) >+.endm >+ >+.macro DIV_FUNC_END name >+LSYM(Ldiv0): > #ifdef __thumb__ > THUMB_LDIV0 > #else > ARM_LDIV0 > #endif >- SIZE (__\name) >+ FUNC_END \name > .endm > > .macro THUMB_FUNC_START name >@@ -147,7 +190,24 @@ > THUMB_FUNC > SYM (__\name): > .endm >- >+ >+/* Special function that will always be coded in ARM assembly, even if >+ in Thumb-only compilation. */ >+ >+#if defined(__thumb__) && !defined(__THUMB_INTERWORK__) >+.macro ARM_FUNC_START name >+ FUNC_START \name >+ bx pc >+ nop >+ .arm >+_L__\name: /* A hook to tell gdb that we've switched to ARM */ >+.endm >+#else >+.macro ARM_FUNC_START name >+ FUNC_START \name >+.endm >+#endif >+ > /* Register aliases. */ > > work .req r4 @ XXXX is this safe ? >@@ -156,16 +216,17 @@ > overdone .req r2 > result .req r2 > curbit .req r3 >+#if 0 > ip .req r12 > sp .req r13 > lr .req r14 > pc .req r15 >- >+#endif > /* ------------------------------------------------------------------------ */ >-/* Bodies of the divsion and modulo routines. */ >+/* Bodies of the division and modulo routines. */ > /* ------------------------------------------------------------------------ */ > .macro ARM_DIV_MOD_BODY modulo >-Loop1: >+LSYM(Loop1): > @ Unless the divisor is very big, shift it up in multiples of > @ four bits, since this is the amount of unwinding in the main > @ division loop. Continue shifting until the divisor is >@@ -174,18 +235,18 @@ > cmplo divisor, dividend > movlo divisor, divisor, lsl #4 > movlo curbit, curbit, lsl #4 >- blo Loop1 >+ blo LSYM(Loop1) > >-Lbignum: >+LSYM(Lbignum): > @ For very big divisors, we must shift it a bit at a time, or > @ we will be in danger of overflowing. > cmp divisor, #0x80000000 > cmplo divisor, dividend > movlo divisor, divisor, lsl #1 > movlo curbit, curbit, lsl #1 >- blo Lbignum >+ blo LSYM(Lbignum) > >-Loop3: >+LSYM(Loop3): > @ Test for possible subtractions. On the final pass, this may > @ subtract too much from the dividend ... > >@@ -226,10 +287,10 @@ > cmp dividend, #0 @ Early termination? 
> movnes curbit, curbit, lsr #4 @ No, any more bits to do? > movne divisor, divisor, lsr #4 >- bne Loop3 >+ bne LSYM(Loop3) > > .if \modulo >-Lfixup_dividend: >+LSYM(Lfixup_dividend): > @ Any subtractions that we should not have done will be recorded in > @ the top three bits of OVERDONE. Exactly which were not needed > @ are governed by the position of the bit, stored in IP. >@@ -241,7 +302,7 @@ > @ the bit in ip could be in the top two bits which might then match > @ with one of the smaller RORs. > tstne ip, #0x7 >- beq Lgot_result >+ beq LSYM(Lgot_result) > tst overdone, ip, ror #3 > addne dividend, dividend, divisor, lsr #3 > tst overdone, ip, ror #2 >@@ -250,39 +311,39 @@ > addne dividend, dividend, divisor, lsr #1 > .endif > >-Lgot_result: >+LSYM(Lgot_result): > .endm > /* ------------------------------------------------------------------------ */ > .macro THUMB_DIV_MOD_BODY modulo > @ Load the constant 0x10000000 into our work register. > mov work, #1 > lsl work, #28 >-Loop1: >+LSYM(Loop1): > @ Unless the divisor is very big, shift it up in multiples of > @ four bits, since this is the amount of unwinding in the main > @ division loop. Continue shifting until the divisor is > @ larger than the dividend. > cmp divisor, work >- bhs Lbignum >+ bhs LSYM(Lbignum) > cmp divisor, dividend >- bhs Lbignum >+ bhs LSYM(Lbignum) > lsl divisor, #4 > lsl curbit, #4 >- b Loop1 >-Lbignum: >+ b LSYM(Loop1) >+LSYM(Lbignum): > @ Set work to 0x80000000 > lsl work, #3 >-Loop2: >+LSYM(Loop2): > @ For very big divisors, we must shift it a bit at a time, or > @ we will be in danger of overflowing. > cmp divisor, work >- bhs Loop3 >+ bhs LSYM(Loop3) > cmp divisor, dividend >- bhs Loop3 >+ bhs LSYM(Loop3) > lsl divisor, #1 > lsl curbit, #1 >- b Loop2 >-Loop3: >+ b LSYM(Loop2) >+LSYM(Loop3): > @ Test for possible subtractions ... > .if \modulo > @ ... On the final pass, this may subtract too much from the dividend, >@@ -290,79 +351,79 @@ > @ afterwards. > mov overdone, #0 > cmp dividend, divisor >- blo Lover1 >+ blo LSYM(Lover1) > sub dividend, dividend, divisor >-Lover1: >+LSYM(Lover1): > lsr work, divisor, #1 > cmp dividend, work >- blo Lover2 >+ blo LSYM(Lover2) > sub dividend, dividend, work > mov ip, curbit > mov work, #1 > ror curbit, work > orr overdone, curbit > mov curbit, ip >-Lover2: >+LSYM(Lover2): > lsr work, divisor, #2 > cmp dividend, work >- blo Lover3 >+ blo LSYM(Lover3) > sub dividend, dividend, work > mov ip, curbit > mov work, #2 > ror curbit, work > orr overdone, curbit > mov curbit, ip >-Lover3: >+LSYM(Lover3): > lsr work, divisor, #3 > cmp dividend, work >- blo Lover4 >+ blo LSYM(Lover4) > sub dividend, dividend, work > mov ip, curbit > mov work, #3 > ror curbit, work > orr overdone, curbit > mov curbit, ip >-Lover4: >+LSYM(Lover4): > mov ip, curbit > .else > @ ... and note which bits are done in the result. On the final pass, > @ this may subtract too much from the dividend, but the result will be ok, > @ since the "bit" will have been shifted out at the bottom. 
> cmp dividend, divisor >- blo Lover1 >+ blo LSYM(Lover1) > sub dividend, dividend, divisor > orr result, result, curbit >-Lover1: >+LSYM(Lover1): > lsr work, divisor, #1 > cmp dividend, work >- blo Lover2 >+ blo LSYM(Lover2) > sub dividend, dividend, work > lsr work, curbit, #1 > orr result, work >-Lover2: >+LSYM(Lover2): > lsr work, divisor, #2 > cmp dividend, work >- blo Lover3 >+ blo LSYM(Lover3) > sub dividend, dividend, work > lsr work, curbit, #2 > orr result, work >-Lover3: >+LSYM(Lover3): > lsr work, divisor, #3 > cmp dividend, work >- blo Lover4 >+ blo LSYM(Lover4) > sub dividend, dividend, work > lsr work, curbit, #3 > orr result, work >-Lover4: >+LSYM(Lover4): > .endif > > cmp dividend, #0 @ Early termination? >- beq Lover5 >+ beq LSYM(Lover5) > lsr curbit, #4 @ No, any more bits to do? >- beq Lover5 >+ beq LSYM(Lover5) > lsr divisor, #4 >- b Loop3 >-Lover5: >+ b LSYM(Loop3) >+LSYM(Lover5): > .if \modulo > @ Any subtractions that we should not have done will be recorded in > @ the top three bits of "overdone". Exactly which were not needed >@@ -370,7 +431,7 @@ > mov work, #0xe > lsl work, #28 > and overdone, work >- beq Lgot_result >+ beq LSYM(Lgot_result) > > @ If we terminated early, because dividend became zero, then the > @ bit in ip will not be in the bottom nibble, and we should not >@@ -381,33 +442,33 @@ > mov curbit, ip > mov work, #0x7 > tst curbit, work >- beq Lgot_result >+ beq LSYM(Lgot_result) > > mov curbit, ip > mov work, #3 > ror curbit, work > tst overdone, curbit >- beq Lover6 >+ beq LSYM(Lover6) > lsr work, divisor, #3 > add dividend, work >-Lover6: >+LSYM(Lover6): > mov curbit, ip > mov work, #2 > ror curbit, work > tst overdone, curbit >- beq Lover7 >+ beq LSYM(Lover7) > lsr work, divisor, #2 > add dividend, work >-Lover7: >+LSYM(Lover7): > mov curbit, ip > mov work, #1 > ror curbit, work > tst overdone, curbit >- beq Lgot_result >+ beq LSYM(Lgot_result) > lsr work, divisor, #1 > add dividend, work > .endif >-Lgot_result: >+LSYM(Lgot_result): > .endm > /* ------------------------------------------------------------------------ */ > /* Start of the Real Functions */ >@@ -419,13 +480,13 @@ > #ifdef __thumb__ > > cmp divisor, #0 >- beq Ldiv0 >+ beq LSYM(Ldiv0) > mov curbit, #1 > mov result, #0 > > push { work } > cmp dividend, divisor >- blo Lgot_result >+ blo LSYM(Lgot_result) > > THUMB_DIV_MOD_BODY 0 > >@@ -436,11 +497,11 @@ > #else /* ARM version. */ > > cmp divisor, #0 >- beq Ldiv0 >+ beq LSYM(Ldiv0) > mov curbit, #1 > mov result, #0 > cmp dividend, divisor >- blo Lgot_result >+ blo LSYM(Lgot_result) > > ARM_DIV_MOD_BODY 0 > >@@ -449,7 +510,7 @@ > > #endif /* ARM version */ > >- FUNC_END udivsi3 >+ DIV_FUNC_END udivsi3 > > #endif /* L_udivsi3 */ > /* ------------------------------------------------------------------------ */ >@@ -460,13 +521,13 @@ > #ifdef __thumb__ > > cmp divisor, #0 >- beq Ldiv0 >+ beq LSYM(Ldiv0) > mov curbit, #1 > cmp dividend, divisor >- bhs Lover10 >+ bhs LSYM(Lover10) > RET > >-Lover10: >+LSYM(Lover10): > push { work } > > THUMB_DIV_MOD_BODY 1 >@@ -477,7 +538,7 @@ > #else /* ARM version. */ > > cmp divisor, #0 >- beq Ldiv0 >+ beq LSYM(Ldiv0) > cmp divisor, #1 > cmpne dividend, divisor > moveq dividend, #0 >@@ -490,7 +551,7 @@ > > #endif /* ARM version. 
*/ > >- FUNC_END umodsi3 >+ DIV_FUNC_END umodsi3 > > #endif /* L_umodsi3 */ > /* ------------------------------------------------------------------------ */ >@@ -500,7 +561,7 @@ > > #ifdef __thumb__ > cmp divisor, #0 >- beq Ldiv0 >+ beq LSYM(Ldiv0) > > push { work } > mov work, dividend >@@ -509,24 +570,24 @@ > mov curbit, #1 > mov result, #0 > cmp divisor, #0 >- bpl Lover10 >+ bpl LSYM(Lover10) > neg divisor, divisor @ Loops below use unsigned. >-Lover10: >+LSYM(Lover10): > cmp dividend, #0 >- bpl Lover11 >+ bpl LSYM(Lover11) > neg dividend, dividend >-Lover11: >+LSYM(Lover11): > cmp dividend, divisor >- blo Lgot_result >+ blo LSYM(Lgot_result) > > THUMB_DIV_MOD_BODY 0 > > mov r0, result > mov work, ip > cmp work, #0 >- bpl Lover12 >+ bpl LSYM(Lover12) > neg r0, r0 >-Lover12: >+LSYM(Lover12): > pop { work } > RET > >@@ -537,11 +598,11 @@ > mov result, #0 > cmp divisor, #0 > rsbmi divisor, divisor, #0 @ Loops below use unsigned. >- beq Ldiv0 >+ beq LSYM(Ldiv0) > cmp dividend, #0 > rsbmi dividend, dividend, #0 > cmp dividend, divisor >- blo Lgot_result >+ blo LSYM(Lgot_result) > > ARM_DIV_MOD_BODY 0 > >@@ -552,7 +613,7 @@ > > #endif /* ARM version */ > >- FUNC_END divsi3 >+ DIV_FUNC_END divsi3 > > #endif /* L_divsi3 */ > /* ------------------------------------------------------------------------ */ >@@ -564,29 +625,29 @@ > > mov curbit, #1 > cmp divisor, #0 >- beq Ldiv0 >- bpl Lover10 >+ beq LSYM(Ldiv0) >+ bpl LSYM(Lover10) > neg divisor, divisor @ Loops below use unsigned. >-Lover10: >+LSYM(Lover10): > push { work } > @ Need to save the sign of the dividend, unfortunately, we need > @ work later on. Must do this after saving the original value of > @ the work register, because we will pop this value off first. > push { dividend } > cmp dividend, #0 >- bpl Lover11 >+ bpl LSYM(Lover11) > neg dividend, dividend >-Lover11: >+LSYM(Lover11): > cmp dividend, divisor >- blo Lgot_result >+ blo LSYM(Lgot_result) > > THUMB_DIV_MOD_BODY 1 > > pop { work } > cmp work, #0 >- bpl Lover12 >+ bpl LSYM(Lover12) > neg dividend, dividend >-Lover12: >+LSYM(Lover12): > pop { work } > RET > >@@ -594,14 +655,14 @@ > > cmp divisor, #0 > rsbmi divisor, divisor, #0 @ Loops below use unsigned. >- beq Ldiv0 >+ beq LSYM(Ldiv0) > @ Need to save the sign of the dividend, unfortunately, we need > @ ip later on; this is faster than pushing lr and using that. > str dividend, [sp, #-4]! 
> cmp dividend, #0 @ Test dividend against zero > rsbmi dividend, dividend, #0 @ If negative make positive > cmp dividend, divisor @ else if zero return zero >- blo Lgot_result @ if smaller return dividend >+ blo LSYM(Lgot_result) @ if smaller return dividend > mov curbit, #1 > > ARM_DIV_MOD_BODY 1 >@@ -613,7 +674,7 @@ > > #endif /* ARM version */ > >- FUNC_END modsi3 >+ DIV_FUNC_END modsi3 > > #endif /* L_modsi3 */ > /* ------------------------------------------------------------------------ */ >@@ -623,7 +684,7 @@ > > RET > >- SIZE (__div0) >+ FUNC_END div0 > > #endif /* L_divmodsi_tools */ > /* ------------------------------------------------------------------------ */ >@@ -636,22 +697,18 @@ > #define __NR_getpid (__NR_SYSCALL_BASE+ 20) > #define __NR_kill (__NR_SYSCALL_BASE+ 37) > >+ .code 32 > FUNC_START div0 > > stmfd sp!, {r1, lr} > swi __NR_getpid > cmn r0, #1000 >- ldmhsfd sp!, {r1, pc}RETCOND @ not much we can do >+ RETLDM r1 hs > mov r1, #SIGFPE > swi __NR_kill >-#ifdef __THUMB_INTERWORK__ >- ldmfd sp!, {r1, lr} >- bx lr >-#else >- ldmfd sp!, {r1, pc}RETCOND >-#endif >+ RETLDM r1 > >- SIZE (__div0) >+ FUNC_END div0 > > #endif /* L_dvmd_lnx */ > /* ------------------------------------------------------------------------ */ >@@ -720,24 +777,23 @@ > > .code 32 > .globl _arm_return >-_arm_return: >- ldmia r13!, {r12} >- bx r12 >+_arm_return: >+ RETLDM > .code 16 > >-.macro interwork register >- .code 16 >+.macro interwork register >+ .code 16 > > THUMB_FUNC_START _interwork_call_via_\register > >- bx pc >+ bx pc > nop >- >- .code 32 >- .globl .Lchange_\register >-.Lchange_\register: >+ >+ .code 32 >+ .globl LSYM(Lchange_\register) >+LSYM(Lchange_\register): > tst \register, #1 >- stmeqdb r13!, {lr} >+ streq lr, [sp, #-4]! > adreq lr, _arm_return > bx \register > >@@ -779,3 +835,7 @@ > SIZE (_interwork_call_via_lr) > > #endif /* L_interwork_call_via_rX */ >+ >+#include "ieee754-df.S" >+#include "ieee754-sf.S" >+ >diff -urNd gcc-3.3.3-orig/gcc/config/arm/linux-elf.h gcc-3.3.3/gcc/config/arm/linux-elf.h >--- gcc-3.3.3-orig/gcc/config/arm/linux-elf.h 2003-09-16 17:39:23.000000000 +0200 >+++ gcc-3.3.3/gcc/config/arm/linux-elf.h 2004-04-30 23:51:01.350158400 +0200 >@@ -30,9 +30,26 @@ > /* Do not assume anything about header files. */ > #define NO_IMPLICIT_EXTERN_C > >-/* Default is to use APCS-32 mode. */ >+/* >+ * Default is to use APCS-32 mode with soft-vfp. >+ * The old Linux default for floats can be achieved with -mhard-float >+ * or with the configure --with-float=hard option. >+ * If -msoft-float or --with-float=soft is used then software float >+ * support will be used just like the default but with the legacy >+ * big endian word ordering for double float representation instead. 
>+ */ >+ > #undef TARGET_DEFAULT >-#define TARGET_DEFAULT (ARM_FLAG_APCS_32 | ARM_FLAG_MMU_TRAPS) >+#define TARGET_DEFAULT \ >+ ( ARM_FLAG_APCS_32 \ >+ | ARM_FLAG_SOFT_FLOAT \ >+ | ARM_FLAG_VFP \ >+ | ARM_FLAG_MMU_TRAPS ) >+ >+#undef SUBTARGET_EXTRA_ASM_SPEC >+#define SUBTARGET_EXTRA_ASM_SPEC "\ >+%{mhard-float:-mfpu=fpa} \ >+%{!mhard-float: %{msoft-float:-mfpu=softvfp} %{!msoft-float:-mfpu=softvfp}}" > > #define SUBTARGET_CPU_DEFAULT TARGET_CPU_arm6 > >@@ -40,7 +40,7 @@ > > #undef MULTILIB_DEFAULTS > #define MULTILIB_DEFAULTS \ >- { "marm", "mlittle-endian", "mhard-float", "mapcs-32", "mno-thumb-interwork" } >+ { "marm", "mlittle-endian", "mapcs-32", "mno-thumb-interwork" } > > #define CPP_APCS_PC_DEFAULT_SPEC "-D__APCS_32__" > >@@ -54,7 +72,7 @@ > %{shared:-lc} \ > %{!shared:%{profile:-lc_p}%{!profile:-lc}}" > >-#define LIBGCC_SPEC "%{msoft-float:-lfloat} -lgcc" >+#define LIBGCC_SPEC "-lgcc" > > /* Provide a STARTFILE_SPEC appropriate for GNU/Linux. Here we add > the GNU/Linux magical crtbegin.o file (see crtstuff.c) which >diff -urNd gcc-3.3.3-orig/gcc/config/arm/t-linux gcc-3.3.3/gcc/config/arm/t-linux >--- gcc-3.3.3-orig/gcc/config/arm/t-linux 2001-05-17 05:15:49.000000000 +0200 >+++ gcc-3.3.3/gcc/config/arm/t-linux 2004-05-01 01:00:37.364972800 +0200 >@@ -7,7 +7,10 @@ > ENQUIRE= > > LIB1ASMSRC = arm/lib1funcs.asm >-LIB1ASMFUNCS = _udivsi3 _divsi3 _umodsi3 _modsi3 _dvmd_lnx >+LIB1ASMFUNCS = _udivsi3 _divsi3 _umodsi3 _modsi3 _dvmd_lnx \ >+ _negdf2 _addsubdf3 _muldivdf3 _cmpdf2 _unorddf2 _fixdfsi _fixunsdfsi \ >+ _truncdfsf2 _negsf2 _addsubsf3 _muldivsf3 _cmpsf2 _unordsf2 \ >+ _fixsfsi _fixunssfsi > > # MULTILIB_OPTIONS = mhard-float/msoft-float > # MULTILIB_DIRNAMES = hard-float soft-float >diff -urNd gcc-3.3.3-orig/gcc/config/arm/unknown-elf.h gcc-3.3.3/gcc/config/arm/unknown-elf.h >--- gcc-3.3.3-orig/gcc/config/arm/unknown-elf.h 2002-09-23 17:14:14.000000000 +0200 >+++ gcc-3.3.3/gcc/config/arm/unknown-elf.h 2004-04-30 23:51:01.350158400 +0200 >@@ -29,7 +29,11 @@ > > /* Default to using APCS-32 and software floating point. */ > #ifndef TARGET_DEFAULT >-#define TARGET_DEFAULT (ARM_FLAG_SOFT_FLOAT | ARM_FLAG_APCS_32 | ARM_FLAG_APCS_FRAME) >+#define TARGET_DEFAULT \ >+ ( ARM_FLAG_SOFT_FLOAT \ >+ | ARM_FLAG_VFP \ >+ | ARM_FLAG_APCS_32 \ >+ | ARM_FLAG_APCS_FRAME ) > #endif > > /* Now we define the strings used to build the spec file. */ >diff -urNd gcc-3.3.3-orig/gcc/config/arm/xscale-elf.h gcc-3.3.3/gcc/config/arm/xscale-elf.h >--- gcc-3.3.3-orig/gcc/config/arm/xscale-elf.h 2002-05-20 19:07:04.000000000 +0200 >+++ gcc-3.3.3/gcc/config/arm/xscale-elf.h 2004-05-01 19:45:56.870952000 +0200 >@@ -28,9 +28,12 @@ > #define SUBTARGET_CPU_DEFAULT TARGET_CPU_xscale > #endif > >-#define SUBTARGET_EXTRA_ASM_SPEC "%{!mcpu=*:-mcpu=xscale} %{!mhard-float:-mno-fpu}" >+#define SUBTARGET_EXTRA_ASM_SPEC "\ >+%{!mcpu=*:-mcpu=xscale} \ >+%{mhard-float:-mfpu=fpa} \ >+%{!mhard-float: %{msoft-float:-mfpu=softvfp} %{!msoft-float:-mfpu=softvfp}}" > > #ifndef MULTILIB_DEFAULTS > #define MULTILIB_DEFAULTS \ >- { "mlittle-endian", "mno-thumb-interwork", "marm", "msoft-float" } >+ { "mlittle-endian", "mno-thumb-interwork", "marm" } > #endif
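For reference, the single-float entry points defined above (divsf3, the cmpsf2 family, fixsfsi, fixunssfsi) are exactly the routines gcc emits library calls to when compiling ordinary float expressions for a software floating point target. A minimal sketch of C code that exercises them, assuming a cross compiler built with this patch (the file name and the arm-linux- driver name below are illustrative, not part of the patch):

    /* softfloat-check.c: hypothetical test file; each function should
       compile to a library call into one of the lib1funcs sections above. */
    float f_div(float a, float b)
    {
        return a / b;        /* call to __divsf3     (L_muldivsf3)  */
    }

    int f_lt(float a, float b)
    {
        return a < b;        /* call to __ltsf2      (L_cmpsf2)     */
    }

    int f_toint(float a)
    {
        return (int)a;       /* call to __fixsfsi    (L_fixsfsi)    */
    }

    unsigned f_touint(float a)
    {
        return (unsigned)a;  /* call to __fixunssfsi (L_fixunssfsi) */
    }

Compiling with something like "arm-linux-gcc -S softfloat-check.c" and searching the output for "bl __" should show only these calls; they are satisfied by the new LIB1ASMFUNCS entries in t-linux, which is why LIBGCC_SPEC no longer needs the separate -lfloat library. The SUBTARGET_EXTRA_ASM_SPEC above additionally passes -mfpu=softvfp (or -mfpu=fpa under -mhard-float) to the assembler, so the produced object files are tagged with the matching float format.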