Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Segmentation fault with clang + optimizations (-O2) #18

Closed
daniloegea opened this issue Sep 18, 2015 · 16 comments
Closed

Segmentation fault with clang + optimizations (-O2) #18

daniloegea opened this issue Sep 18, 2015 · 16 comments

Comments

@daniloegea
Copy link

Hello,

I'm building solarus (https://github.com/christopho/solarus/) with openal-soft and I'm getting a segfault when I build it with clang. Actually, the program crashes when I build openal-soft with optimizations enabled. If I build openal with -O0 it works fine. With GCC it just works. I'm using openal-soft 1.16.0 and clang 3.6.1.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 808416400 (LWP 100215)]
Load_ALshort (srcstep=, samples=, dst=, src=) at /usr/ports/audio/openal-soft/work/openal-soft-1.16.0/Alc/mixer.c:119
119 DECL_TEMPLATE(ALshort)

The backtrace:

bt

0 Load_ALshort (srcstep=, samples=, dst=, src=) at /usr/ports/audio/openal-soft/work/openal-soft-1.16.0/Alc/mixer.c:119

1 LoadSamples (src=, srcstep=, srctype=, samples=, dst=) at /usr/ports/audio/openal-soft/work/openal-soft-1.16.0/Alc/mixer.c:132

2 MixSource (src=0x80d78a000, Device=0x808538000, SamplesToDo=512) at /usr/ports/audio/openal-soft/work/openal-soft-1.16.0/Alc/mixer.c:393

3 0x0000000805655d3f in aluMixData (device=0x808538000, buffer=, size=) at /usr/ports/audio/openal-soft/work/openal-soft-1.16.0/Alc/ALu.c:1196

4 0x0000000805670c76 in ALCplaybackOSS_mixerProc (ptr=0x8084a4b80) at /usr/ports/audio/openal-soft/work/openal-soft-1.16.0/Alc/backends/oss.c:119

5 0x0000000805671af6 in althrd_starter (arg=) at /usr/ports/audio/openal-soft/work/openal-soft-1.16.0/common/threads.c:526

6 0x0000000806d5c854 in thread_start (curthread=0x808416400) at /usr/src/lib/libthr/thread/thr_create.c:288

7 0x0000000000000000 in ?? ()

@kcat
Copy link
Owner

kcat commented Sep 18, 2015

This is unfortunate. I've heard other reports of a broken build with Clang 3.6 too, and I remember having build problems with earlier versions of Clang (3.2, I believe). It's been fine with 3.3, 3.4, and 3.5, though, so I assume it has to be a recently-resurfaced (or a new) problem in Clang.

@daniloegea
Copy link
Author

Hmm, that's interesting. We are using clang by default on FreeBSD to build thousands of packages and this kind of problem is uncommon. BTW, it really works with clang 3.4! =/ You can update your list, it fails with clang 3.8 as well.

@kcat
Copy link
Owner

kcat commented Sep 19, 2015

Is there any more information you can provide about what the problem might be? OpenAL Soft uses quite a few C99 features, and even some C11 features when available (threading stuff and atomics mostly), along with some GNU extensions to pick up some slack, so it's possible it's hitting some obscure bugs caused by certain (combinations of) features. Or perhaps OpenAL Soft is using some feature or function incorrectly that just happens to work on GCC and pre-3.6 Clang.

Unfortunately, Debian doesn't have 3.6 in the stable or unstable repos, so it's not that easy for me to test myself.

@daniloegea
Copy link
Author

The clang's AddressSanitizer caught a buffer overflow in openal. I've copied it here -> http://pastebin.com/kndRZnj2

@kcat
Copy link
Owner

kcat commented Sep 20, 2015

Are you able to test using the latest Git version, just to ensure it's still present?

The log doesn't make much sense. It says there's a 1-byte allocation, but it's using 16-bit samples, so the allocation must be at least a multiple of 2. I can only guess that's because it allocated a 0-byte buffer (from 0 length input) but the system gave it a minimum of 1 byte. In that case, the sample length would be 0, so it shouldn't try loading from the buffer. Are you able to get any more information, such as the offending ALbuffer's member values?

@daniloegea
Copy link
Author

There is a realloc call in LoadData() function receiving the second parameter as zero. The specification of realloc says that if the size is zero and the pointer is not null it'll be freed. The PoC patch avoids the crash, but no sound though.

--- OpenAL32/alBuffer.c.orig 2015-09-20 19:31:23 UTC
+++ OpenAL32/alBuffer.c
@@ -951,6 +951,9 @@ ALenum LoadData(ALbuffer *ALBuf, ALuint
if(newsize > INT_MAX)
return AL_OUT_OF_MEMORY;

  • if(newsize == 0)
  •    return AL_INVALID_OPERATION;
    
    WriteLock(&ALBuf->lock);
    if(ReadRef(&ALBuf->ref) != 0)
    {

I'll try the latest code from git.

@daniloegea
Copy link
Author

With the latest version it works.

@kcat
Copy link
Owner

kcat commented Sep 20, 2015

The specification of realloc says that if the size is zero and the pointer is not null it'll be freed.

Yeah, realloc is kinda funny. It also says that if the pointer is null, it'll act like malloc, and malloc(0) is allowed to return a unique pointer that needs to be freed. So it's possible for realloc(ptr, 0) to free the pointer and return NULL, while realloc(NULL, 0) can allocate and return a non-NULL pointer. Either way, though, you get something that can be passed to free.

Good to hear it works in the latest git, though I can't help but feel the issue is being masked. I don't remember fixing anything in this regard. Does the sound work too, or is it silent like with 1.16 and your patch? If ithe sound works, that's particularly curious since the newsize calculation is the same...

@daniloegea
Copy link
Author

Yes, the sound just works.
On Sep 20, 2015 5:11 PM, "kcat" notifications@github.com wrote:

The specification of realloc says that if the size is zero and the pointer
is not null it'll be freed.

Yeah, realloc is kinda funny. It also says that if the pointer is null,
it'll act like malloc, and malloc(0) is allowed to return a unique
pointer that needs to be freed. So it's possible for realloc(ptr, 0) to
free the pointer and return NULL, while realloc(NULL, 0) can allocate and
return a non-NULL pointer. Either way, though, you get something that can
be passed to free.

Good to hear it works in the latest git, though I can't help but feel the
issue is being masked. I don't remember fixing anything in this regard.
Does the sound work too, or is it silent like with 1.16 and your patch? If
ithe sound works, that's particularly curious since the newsize
calculation is the same...


Reply to this email directly or view it on GitHub
#18 (comment).

@daniloegea
Copy link
Author

This patch fixes the problem -> http://pastebin.com/YYzwdd9m
Now, I'm trying to figure out "WHY"... haha
For some reason, the optimizations clang is applying are breaking this function. It is putting 0 in "chans" and "type" memory locations.

uqs pushed a commit to freebsd/freebsd-ports that referenced this issue Oct 17, 2015
For some reason clang is breaking the code when openal-soft is built with
optimizations. If the file alBuffer.c is built with -O1 the problem don't
happens. See kcat/openal-soft#18
The problem seems to happen just on CURRENT due the clang version.

PR:		199488, 203818
Tested by:	ohartman@zedat.fu-berlin.de
Approved by:	mva
MFH:		2015Q4


git-svn-id: svn+ssh://svn.freebsd.org/ports/head@399540 35697150-7ecd-e111-bb59-0022644237b5
uqs pushed a commit to freebsd/freebsd-ports that referenced this issue Oct 17, 2015
For some reason clang is breaking the code when openal-soft is built with
optimizations. If the file alBuffer.c is built with -O1 the problem don't
happens. See kcat/openal-soft#18
The problem seems to happen just on CURRENT due the clang version.

PR:		199488, 203818
Tested by:	ohartman@zedat.fu-berlin.de
Approved by:	mva
MFH:		2015Q4
uqs pushed a commit to freebsd/freebsd-ports that referenced this issue Oct 19, 2015
- Add a workaround for a problem caused by clang

For some reason clang is breaking the code when openal-soft is built with
optimizations. If the file alBuffer.c is built with -O1 the problem don't
happens. See kcat/openal-soft#18
The problem seems to happen just on CURRENT due the clang version.

PR:		199488, 203818
Tested by:	ohartman@zedat.fu-berlin.de
Approved by:	mva
Approved by:	ports-secteam (feld)
@MrSapps
Copy link

MrSapps commented Aug 19, 2016

Is it cased by loop vectorization?

@kcat
Copy link
Owner

kcat commented Sep 6, 2016

I think I may have "fixed" the problem, assuming it's the same thing that was affecting OSX/XCode (which uses Clang too). Can you check to see if it's working properly now?

If it is fixed, it seems to have been an issue with certain functions being marked as __attribute__((const)), and the optimizer making assumptions about certain local variables that caused a bad allocation size.

@MrSapps
Copy link

MrSapps commented Sep 7, 2016

I think you should try with:

http://clang.llvm.org/docs/AddressSanitizer.html

And:

http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html

Clang devs say if you run with these and fix all issues and optimization breaks the code, then raise a compiler bug. But likely there is memory/ub error some where and these sanitizers will find them at runtime.

@kcat
Copy link
Owner

kcat commented Sep 7, 2016

Seems it was because I marked a function as __attribute__((const)) that I shouldn't have. Specifically, it "returned" a couple values through pointer parameters, and Clang's optimizer assumed the variables given for the call wouldn't change for the following size calculation.

__attribute__((const)) static ALboolean DecomposeFormat(ALenum format, enum FmtChannels *chans, enum FmtType *type)
{
    static const struct {
        ALenum format;
        enum FmtChannels channels;
        enum FmtType type;
    } list[] = {
        ...
    };
    ALuint i;

    for(i = 0;i < COUNTOF(list);i++)
    {
        if(list[i].format == format)
        {
            *chans = list[i].channels;
            *type  = list[i].type;
            return AL_TRUE;
        }
    }

    return AL_FALSE;
}

The GCC docs say that const functions must not read global memory (do function-local static const variables count as global memory in this context?) and have no effects except for their return values (the return statement, and the two output parameters). It also says that such functions mustn't read from any pointer parameters (it doesn't, it only writes to them). I can only guess I had a more loose definition of "return value" -- results given back by the function, rather than strictly what the call's expression is evaluated as.

The Undefined Behavior Sanitizer didn't report any problems, and there was no warning by the compiler that I was doing something bad in a const function (using -Wall -Wextra).

@MrSapps
Copy link

MrSapps commented Sep 7, 2016

Hmm hard for me to say if this a bug in the tooling or not, I guess it can't tell if you are passing in pointers to "global" memory or not.

@kcat
Copy link
Owner

kcat commented Sep 14, 2016

I'm going to assume this was the same problem as issue #43. As such, it was fixed by commit a758cc8. If there's still a problem, please let me know.

@kcat kcat closed this as completed Sep 14, 2016
uqs pushed a commit to freebsd/freebsd-ports that referenced this issue Dec 2, 2016
- Add a workaround for a problem caused by clang

For some reason clang is breaking the code when openal-soft is built with
optimizations. If the file alBuffer.c is built with -O1 the problem don't
happens. See kcat/openal-soft#18
The problem seems to happen just on CURRENT due the clang version.

PR:		199488, 203818
Tested by:	ohartman@zedat.fu-berlin.de
Approved by:	mva
Approved by:	ports-secteam (feld)
uqs pushed a commit to freebsd/freebsd-ports that referenced this issue Apr 1, 2021
- Add a workaround for a problem caused by clang

For some reason clang is breaking the code when openal-soft is built with
optimizations. If the file alBuffer.c is built with -O1 the problem don't
happens. See kcat/openal-soft#18
The problem seems to happen just on CURRENT due the clang version.

PR:		199488, 203818
Tested by:	ohartman@zedat.fu-berlin.de
Approved by:	mva
Approved by:	ports-secteam (feld)
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants