Build Box2D and make Cool Demos #22
Quick compilation of Box2D + HelloWorld, unoptimized build: http://syntensity.com/static/box2d.ll Partially optimized: http://syntensity.com/static/box2d.opt.js (To see one of the .js files work, download it and then run it in a console JS engine.) I don't know Box2D much myself, but looks like it works, output is the same as HelloWorld compiled natively. Main question is how to move forward. The raw C++ functions are exposed (see _main, that is main() from HelloWorld), which are not JS-friendly. We should probably wrap them in a JS API somehow? |
Hey, this was fast! Great! Regarding the wrapper, is this something that needs to be done manually? Are there known strategies? And/or can it be set up in a way that future releases of Box2D can easily be wrapped again? Another aspect to look into is size, I guess. Closure compiles it from 2+ MB to 640 KB, which is still heavy for something that is potentially being used on the net. Btw, the compiler reports many (600+) "unreachable codes" which could be removed. |
Well, Emscripten exposes the raw C++ functions. We don't really have a good idea of how to automate the creation of nice JS-friendly APIs. Ideas are welcome! If the functions were C and not C++, this would be easier. You would have things like box2d_new_vector2() and so forth, with clear parameters. With C++, things are messier. But, even with C the functions are still not very JS-friendly. The current tools Emscripten has here are something that generates the unmangled C++ function names, so you can at least identify them in the compiled code. But getting from that to a proper API isn't clear to me. A manual API could be written. This would need some modifications for new versions of Box2D (how much depending on how much changed). About size: First, closure advanced yields something of similar size to the native binary, in my experience. Second, when you have a final program, you can remove unused functions and dead code. The "unreachable codes" are a bug in the compiler. Closure fixes it, so I haven't been motivated to fix the issue. |
Ok, thanks! Looking at the generated code and the Box2D API, I think manually writing a full wrapper would be pretty much a whole project of its own. To get to a fancy working demo, one strategy could then be to write the Box2D code in C++ with ways to create shapes (addBox, addCircle, addPoly) and add a rendering service (drawBox, drawCircle, drawPoly) from outside. These ins and outs could then be replaced on the JS side by real implementations. Something like a "tiny simple API". Looking at the C++ code, I have a hard time grasping how it needs to be wrapped so that it can be called from outside. Consider: // write a wrapper to create a world What would that look like using the generated code? How would one go about handling all that stackBase and such? Thanks! |
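To make the "tiny simple API" idea above a bit more concrete, here is a rough sketch of what the JS side of such a rendering service might look like. Everything in it is an assumption for illustration: `renderService`, `stepAndDraw` and the recorded `drawn` list are hypothetical names, and `stepAndDraw` is a plain-JS stand-in for compiled C++ code that would call out to drawBox/drawCircle; it is not Emscripten output.

```javascript
// Hypothetical rendering service supplied from the JS side. A real demo would
// draw to a canvas; here each call is just recorded so the sketch is runnable
// in a console JS engine.
var drawn = [];
var renderService = {
  drawBox: function (x, y, w, h) {
    drawn.push({ kind: "box", x: x, y: y, w: w, h: h });
  },
  drawCircle: function (x, y, r) {
    drawn.push({ kind: "circle", x: x, y: y, r: r });
  }
};

// Stand-in for the compiled side (an assumption, not generated code): after a
// simulation step it reports each shape through the service it was given.
function stepAndDraw(service) {
  service.drawBox(0, -10, 100, 20); // e.g. a ground box
  service.drawCircle(0, 4, 1);      // e.g. a falling ball
}

stepAndDraw(renderService);
```

The point of the design is that the compiled code only needs a handful of plain-number entry points, so no C++ objects have to cross the boundary.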
You shouldn't need to do anything with stackBase (which is the current position of the stack). In general, you might do stuff like this, to create a wrapper for a single function: function myWrapper(arg1, arg2) { This will leak memory though. Managing allocations is hard here. You can also wrap bigger stuff. For example converting the HelloWorld.cpp could start with var gravity = Box2D.b2Vec2.__new__1(0.0, -10.0); // This uses an autogenerated wrapper for the constructor (from Emscripten's namespacer tool). It is a parallel to new b2Vec2(0, -10); var world = Box2D.b2World.new(gravity, true); var groundBodyDef = ...; // call a wrapper, or allocate memory and call constructor yourself. Sadly the namespacer tool didn't succeed in creating an automatic wrapper here... I am starting to think we need a compiler plugin here, so we can scan all the classes and functions as we compile them, and generate wrappers automatically. Not sure how hard that is though. |
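As an illustration of the single-function wrapper pattern described above (allocate memory, write the arguments into the heap, call the raw function), here is a minimal self-contained sketch. It does not use real Emscripten output: `HEAPF32`, `_malloc` and `_raw_vec2_length` are hypothetical stand-ins written in plain JS so the example runs on its own, and `vec2Length` is an invented wrapper name.

```javascript
// Hypothetical stand-ins for a compiled module: a typed-array heap and a
// trivial bump allocator. Real generated code has its own equivalents.
var HEAPF32 = new Float32Array(1024);
var nextFree = 8; // byte offset; offset 0 is kept free to act as a null pointer

function _malloc(bytes) {
  var ptr = nextFree;
  nextFree += (bytes + 7) & ~7; // round up so allocations stay 8-byte aligned
  return ptr;
}

// Stand-in for a raw compiled function that takes a pointer to two floats.
function _raw_vec2_length(ptr) {
  var x = HEAPF32[ptr >> 2];
  var y = HEAPF32[(ptr >> 2) + 1];
  return Math.sqrt(x * x + y * y);
}

// JS-friendly wrapper: takes plain numbers, hides the pointer handling.
function vec2Length(x, y) {
  var ptr = _malloc(8);          // room for two 32-bit floats
  HEAPF32[ptr >> 2] = x;
  HEAPF32[(ptr >> 2) + 1] = y;
  return _raw_vec2_length(ptr);  // ptr is never freed, so this leaks
}
```

As noted in the comment above, the allocation is never released; a real wrapper would need some policy for freeing or reusing that memory.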
Aha! That looks much better, i think i can work with that ... Thanks! |
Here is for example what the namespacer generates for the scriptaclass test (python tests/runner.py clang_0_0.test_scriptaclass
That is generated from
There is some documentation inside the tools (tools/namespacer.py, etc.). Meanwhile, I am thinking perhaps using SWIG would make sense here. It's used for making bindings in general, maybe we can make a js/emscripten one. |
@kripken : I've been trying to get Box2D working through emscripten, and have been hitting something of a brick wall. The code compiles fine, but no matter what flags I pick, the resulting code seems to produce bad output (even for Box2D's HelloWorld, all the values come out "0"). I'm sure I'm doing something boneheaded, but can't seem to pin down what it is, so any help (even if it's just the Makefile you used to compile "box2d.opt.js" above) would be greatly appreciated. If you care to have a look, the code (with nasty hacky makefiles) I'm using is here: https://github.com/joelgwebber/bench2d (look in the "c" subdirectory -- Bench2d.c currently contains a simple benchmark #if 0'd out, with a copy of HelloWorld buried in the #else case). |
jgw: I took a look at the makefile there. Some possible issues:
|
@kripken: Thanks for the quick reply. I'm compiling on OSX, and I've tried building clang/llvm from trunk, as well as the 2.9 and 3.0 releases, to no avail. I've got an Ubuntu machine at work, but for various reasons (specific to my company's particular Ubuntu install) have been having trouble getting a reasonably modern version of clang/llvm working. I'll keep banging on that to see if I can get it going, but for now I don't know if that will fix it or not. I've removed all the flags and pushed a new version of the makefile (thanks for the heads-up on -std-compile-opts -- that was purely cargo-culting on my part). I'm still getting bad results -- Hello trips an assertion in b2PolygonShape now, and just prints "nan 0.00 nan" repeatedly if I disable assertions. Again, thanks for taking the time to look at this. I'll keep banging on it to see if I can get Clang working on my Linux box at work. Hopefully that will turn out to be the problem. In the meantime, if you happen to notice anything else I'm doing that's screwy, please feel free to chime in. And if you happen to have a working emscripten makefile for box2d's hello world lying about, I'd be glad to give that a shot over here :) |
My makefile from back then is too old, even if I could find it (emscripten changed a lot since then). But I am sure we can get this to work, at least on Linux (fixing everything for Windows and OS X might take a bit more work, I honestly don't know how much). I'll take another look at your makefile later today or tomorrow. |
With this diff: https://gist.github.com/1470674 it builds for me without any optimizations. The build commands are in the diff. Sorry for the hackishness, but it's late ;) Running it in node, I get
I hope that's good? :) |
@kripken Thanks again for your help. I finally managed to get a build of LLVM on my Ubuntu box, and oddly enough it works fine there (both with my original makefile and your modified one). I can't explain that, but the stack of clang, llvm, headers, and so forth is so complex that it could be almost anything :P I confirmed that both the Box2D HelloWorld and my benchmark produce sane output now, and the performance numbers aren't all over the map like they were with builds I did on the Mac. Now I feel comfortable moving forward with getting real benchmark numbers. To that end, I re-ran the build with all the emscripten flags in the makefile re-enabled, and confirmed that the output is a bit faster, and still generates sane values. But not being deeply familiar with emscripten, I don't have a good sense for whether those flags actually make sense. Given that none of them breaks the code, do you have any suggestions on whether I should change them in any way to get the best performance? And have you found that running emscripten output through the Closure compiler makes any difference in performance (other than for startup)? I would like to make sure I get an accurate representation of the best code emscripten can generate. |
I am actually right now working on a new compiler frontend (emcc) to make optimizing code much much easier. It should be usable in a few days. Until then, optimizing is not very convenient, but overall see the docs at https://github.com/kripken/emscripten/wiki/Optimizing-Code The crucial points are
You can see all of these in action in the emscripten benchmarks (python tests/runner.py benchmark). But again, the easiest thing might be to wait a few days for emcc; if you are not in a rush, I would recommend that (you can also try emcc now, there is some --help in it, but expect breakage until it's stable). |
emcc is now usable for optimizing, see docs at https://github.com/kripken/emscripten/wiki/Optimizing-Code Basically, compile the final bitcode bc file, then do |
jgw, I just noticed this benchmark is hit by the problem mentioned at the bottom of issue 132. Until that is fixed (a few days, I hope), the JS code generated here will be much slower than it should be. Sorry about this. |
@kripken : Thanks for the heads-up. I've got it compiling with emcc, but am seeing 200-300ms/frame, which I presume is in line with what you'd expect given issue 132. I'll make a note in my benchmarks that the emscripten output is known to be suboptimal and on its way to being fixed. |
jgw, I see you checked in some emscripten-generated code into bench2d, is that the code you benchmarked with? It's only partially optimized (no closure compiler, for example, which emcc will run automatically for you). I fixed most of the slowdown bug in the emccbydefault branch in emscripten. It's not ready to be pushed to master yet. But I'm testing it in a fork of bench2d here, to get it to optimize as much as possible with the new emcc https://github.com/kripken/bench2d Edit: Note that emcc in master, while not as fast as in that branch, would be significantly faster than the generated code in the repo (due to closure, eliminator, and js optimizer passes), even with -O2. |
Last comment for today: -O3 gives the same output as -O2, and is about 10% faster. Both seem to be much faster than the code in the repo, which for some reason maxes out the memory on my machine (?) on both v8 and sm. I pushed an optimized build to my fork. I still have some more optimizations to test (memory compression, etc.), and still need to finish fixing the slowdown bug, but most of the performance should be present in that build. If you can compare it to yours, I'm curious what the results are. |
Really last comment ;) Memory compression gives another 10% speedup, and the code doesn't seem to have broken. Pushed that to my fork. |
@kripken Thanks for all the help. I pulled the latest emscripten head and recompiled using emcc and your flags. The numbers are now much more in line with the other implementations (mean=90ms, stddev=11ms). I've pushed the updated makefile and a copy of the compiled output (it's in c/bench2d.js) to save others the trouble of reproducing it. I didn't run the closure compiler on the output because I don't currently have it set up locally, and minification made no noticeable performance difference on the mandreel output. If your results are different, let me know and I'll take a moment to compile the output. You can see my updated numbers off to one side in the spreadsheet. I'm writing up a couple of followup edits, and will update all the graphs once I'm done with that. |
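For readers wondering how figures like "mean=90ms, stddev=11ms" are derived from per-frame timings, here is a small sketch. It is not the code bench2d uses; `frameStats` is a hypothetical helper, and it assumes the population standard deviation (dividing by n), which may differ from what the benchmark actually reports.

```javascript
// Hypothetical helper: mean and population standard deviation of a list of
// per-frame times in milliseconds.
function frameStats(frameTimesMs) {
  var n = frameTimesMs.length;
  var sum = 0;
  for (var i = 0; i < n; i++) {
    sum += frameTimesMs[i];
  }
  var mean = sum / n;

  var squaredDiffs = 0;
  for (var j = 0; j < n; j++) {
    var d = frameTimesMs[j] - mean;
    squaredDiffs += d * d;
  }
  // Dividing by n gives the population variance; divide by (n - 1) instead
  // for the sample variance.
  return { mean: mean, stddev: Math.sqrt(squaredDiffs / n) };
}
```

Calling `frameStats` on an array of measured frame times returns both numbers in one object, which makes it easy to compare runs across engines.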
Closure compiler doesn't just minify: In advanced mode, it can greatly speed up the code in many cases by coalescing variables and inlining. emcc will run it in advanced mode by default, the emscripten compilation strategy relies on closure compiler - so not running it means the code is not fully optimized for speed. (Mandreel code is known to break on closure advanced, I spoke to them about that, and non-advanced just minifies as you said but has no effect on speed.) What spreadsheet do you refer to? |
Another question, even aside from closure compiler the code in your repo is unoptimized (it doesn't have the variable eliminator run on it, for example, which should happen with -O1 or above). But your makefile says it is running with -O3. Also, that command will not work at all if closure is not installed, so the emscripten-generated code in your repo seems to not be created by that makefile. Unless I am missing something? edit: Also, the makefile will create bench2d.opt.js, not bench2d.js as in the repo, further confusing me |
The spreadsheet I'm referring to is here: https://docs.google.com/spreadsheet/ccc?key=0Ag3_0ZPxr2HrdEdoUy1RVDQtX2k3a0ZISnRiZVZBaEE (it's the one the graphs on my writeup are derived from). The output that you see is what came from running: emcc -O3 -s USE_TYPED_ARRAYS=1 -s QUANTUM_SIZE=1 -s TOTAL_MEMORY=150000000 bench2d.bc -o bench2d.js I didn't realize emscripten code would actually survive Closure advanced optimizations intact. In that case, I can definitely see how it would make a difference. I've gone ahead and spun a new build with the closure compiler enabled, and the results are still better than the previous run. I've pushed that copy of bench2d.js, and updated the spreadsheet above (see the column "Emscripten Test" off to the right). |
I just pushed most of the emcc enhancements to master just now - it's worth pulling. |
I should have mentioned in the docs that we use closure advanced, sorry about that. I added a note to emcc --help now. I tried to run the other benchmarks to get some numbers on my machine, however the mandreel one seems broken in chrome and firefox, 'startApp is not defined'. |
Ok, just updated the compiled output and numbers with a new pull of emscripten, and it's yet again better than before. Now the mean is a bit over 70ms. I also updated the mandreel-compiled output -- it has a slightly odd loading mechanism, so you'll need to load it out of the /c/mandreel directory directly. It's now compiled with full optimizations, though the performance isn't noticeably different from before. |
Can you please elaborate? I can't figure out how to run it, I tried both using a local httpserver and as a file:// url. Same error in both cases. |
Probably wasn't entirely clear -- what I meant was that I had to move the mandreel html file directly into the /c/mandreel subdirectory for it to load properly (I pushed this sometime yesterday). You can load it from a file URL with no problem: file:///path/to/bench2d/c/mandreel/bench2d_mandreel.html Just pulled onto my home laptop and verified this works properly. |
Thanks. It's still broken though, [11:10:09.784] uncaught exception: [Exception... "A parameter or an operation is not supported by the underlying object" code: "15" nsresult: "0x8053000f (NS_ERROR_DOM_INVALID_ACCESS_ERR)" location: "http://localhost:8888/mandreel/mandreel.js Line: 963"] I've seen this before with mandreel generated code, we discussed it with them on the FF bug tracker, it doesn't look like they test much on dev versions of browsers (I'm on FF11). |
Ah, I see that now in Firefox (v8) as well. I had been testing on Chrome. Not sure why it's screwed up on FF, but I was using Chrome as the baseline for my tests of different libraries, so it at least works out ok for the benchmarks. I'll ping the guy I've been talking to at Mandreel to see if they have a fix for this. |
Note that in Chrome perf results will differ a lot from FF9+ (FF9 will be stable in 1 week). Mandreel is tuned for Chrome and is significantly slower on FF, I can find the bug number where the details are discussed if you want. Emscripten is more balanced, at least in my benchmarks. So would be interesting to see results on a browser other than Chrome. |
@joelgwebber I recently finished some additional optimizations in emscripten and ran your box2d benchmark on it. Comparing the printed final averages, I get
(Note that I compare clang and not gcc as in your tests; I prefer clang because then I have the same compiler - or at least frontend - in both native code and JS.) So the fastest JS engine, at least on my machine (2 GHz Core 2 Duo laptop, Linux), is just 6.3x slower than native code, which is about twice as fast as in your earlier tests, I think? Code is in my fork, https://github.com/kripken/bench2d |
Great work, Alon. Does your fork include a new version of the emscripten-compiled output? If so, I'd like to go ahead and pull it, then re-run numbers on my machine as well. I also have a Flash version I keep meaning to finish up. When I do so, I'll post updated numbers. Side note: The only reason I used gcc rather than clang is that, for some reason, the clang output on my mac (using XCode's clang) produced non-functional output, and I just never had time to track down the issue. I doubt one will produce wildly better results than the other, though. |
The code is in my fork (the 'inline' file), but please wait on benchmarking as I want to finish a few last things (probably take a few days). I am very curious about Flash results. Any preliminary numbers there? About gcc and clang, yeah, I don't see a big difference in practice, my main reason is more theoretical to keep the comparison as close as possible. |
On Sun, Feb 19, 2012 at 12:41 AM, Alon Zakai <
No worries, just let me know when it's ready. I'm in no hurry. I am very curious about Flash results. Any preliminary numbers there? At a rough glance, I'm getting something on the order of 20-25 ms/frame, so |
Interesting. So Java was 2.5x slower than C, I think I remember? So Flash is 5x slower? Making it faster than JS at 6.3x slower, but not by a huge amount. Flash does have some potential advantages, aside from types and native matrix/vector classes it also has the alchemy stuff which lets it emulate memory very efficiently. That might not be relevant in this handwritten code (I think that is what Flash Box2D is?), but it is relevant in that JS engines running compiled code are slowed down by memory emulation quite a bit. However, Chrome and Firefox devs are working on this so it'll be interesting to see how much of a speedup that will give. |
On Tue, Feb 21, 2012 at 1:13 PM, Alon Zakai <
Yes, that's correct. However, the 6.3x number is for emscripten output, Flash does have some potential advantages, aside from types and native
To my knowledge, Alchemy is just a compiler; the VM is the same. I believe I'm no expert in this space, so take this with a grain of salt. But I |
Alchemy aside from being a compiler has also led to some VM additions (or perhaps the VM additions came first and I got that wrong?), that make it easier to run compiled code, stuff like special arrays that are accessed very quickly (more than typed arrays in JS). That is motivating some optimizations in JS engines to get similar performance. |
Ok, I just wanted to check some stuff but everything seems fine. bench2d.js in my fork is benchmarkable. Thanks for doing these benchmarks, btw :) Btw, two feature requests for your benchmarks: Code size (after gzip), and benchmarks of raw JS engines (not just in browsers). Code size is interesting for obvious reasons I think, benchmarks of raw JS engines are useful because sometimes people do run outside of browsers, say in node.js. Returning to the original topic of this bug, I finished porting Box2D using the emscripten bindings generator, so it's easy to use from normal JS (C++ classes get JS wrappers), https://github.com/kripken/box2d.js It isn't optimized yet (closure compiler breaks it, need to find out why) but a demo is up at http://syntensity.com/static/box2d.html |
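On the "code size (after gzip)" request above, one way to measure it outside a browser is with node.js's built-in zlib module. This is only a sketch: `gzippedSize` is a hypothetical helper name, and default gzip settings are assumed, which may not match what a given web server uses.

```javascript
var zlib = require("zlib");

// Hypothetical helper: size in bytes of a JS source string after gzip with
// zlib's default compression settings.
function gzippedSize(source) {
  return zlib.gzipSync(Buffer.from(source, "utf8")).length;
}

// Usage, assuming a compiled file exists at this path:
//   var fs = require("fs");
//   console.log(gzippedSize(fs.readFileSync("bench2d.js", "utf8")));
```

Comparing this number across implementations is only fair if each one has been minified as far as it safely can be first.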
Fixed the closure compiler bug, box2d.js is now closured too. |
Sorry for the slow response. Nathan Hammond submitted your additions as a pull request (joelgwebber/bench2d#4), which I just merged. Feel free to send any further changes, tweaks, optimizations, etc. As for code-size, I definitely agree that's worth measuring. It's a bit of work, because it would only be fair to measure on raw Javascript that's been as optimized as possible -- but it's tricky to pin down which Closure optimizations will actually work on it. Still worth doing, though. When you have a moment, please look at the code that's checked in under /emscripten to make sure nothing's broken. I'm getting numbers on my MacBook Pro (same as I used for the others) on the order of 90ms/frame. I seem to recall that you were getting better numbers than this. Can you try out the code in the repo to see if perhaps I'm missing something? |
The pull request uses box2d, which is a library version, and might perform a little differently than the raw compiled version of the benchmark, since for example the main benchmark code is in JS and not compiled C. Also, it's much larger. It would be fairer to compare the raw compiled benchmark like the previous benchmarks did, I think. I'll submit a pull request. |
Submitted joelgwebber/bench2d#5 I get 47.5ms on my 2009 laptop running Firefox nightly with that, and 114.5ms with Chrome dev. What browser are you using to test, that's probably the biggest factor here? |
[thanks, merged] Ah, I see. My numbers were with Chrome, as it was coming out ahead on the emscripten output in the past. On FF10, I'm getting just under 30ms pretty reliably. That's pretty impressive -- I can now definitively state that emscripten+ff has broken the order-of-magnitude barrier w.r.t. C++. A quick update on the AS3 tests -- I've confirmed that they aren't "cheating" by using built-in native vector/matrix classes (just the ones from the original Box2D, transliterated into AS3), and I'm getting a reliable 15ms/frame. I'm still surprised by this, to be honest, but so far I can't poke any holes in it. |
Great! About AS3, that's an impressive result for Flash. Part of the difference with JS is probably because the compiled JS uses patterns that JS engines haven't really optimized for, while the AS3 implementation uses patterns the runtime has been optimized for. I think that is changing though, at least in V8 and SpiderMonkey, so I hope to see big speedups later this year in Chrome and Firefox. |
Closing as resolved: Box2D demo at http://kripken.github.io/box2d.js/webgl_demo/box2d.html , and since Box2D, I think we pretty much have built the best demos in town! ;) |