<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-5604096</id><updated>2011-07-30T08:48:52.333-07:00</updated><category term='runtime checking'/><category term='SSE'/><category term='compilers'/><category term='Miguel de Icaza'/><category term='CPUID'/><category term='benchmark'/><category term='assembler'/><category term='GNU'/><category term='local search'/><category term='Richard Stallman'/><category term='evolution'/><category term='OpenMP'/><category term='C++'/><category term='cdk molecular-modeling cheminformatics Java force-field'/><category term='GCC'/><category term='pthread'/><category term='Biology'/><category term='Win32'/><category term='automatic vectorization'/><category term='Mono'/><category term='performance'/><category term='vectorization'/><category term='Mono 2.4'/><category term='platform-indepent'/><category term='Advanced Vector Extension'/><category term='cpu'/><category term='proteases'/><category term='AVX'/><category term='catalytic triad'/><category term='java'/><category term='pKa'/><category term='FSF'/><category term='programming'/><category term='H++'/><category term='OCW'/><category term='multicore'/><category term='HPC'/><category term='open courseware'/><category term='SIMD'/><category term='MIT'/><category term='C#'/><category term='MinGW'/><category term='protein'/><category term='pKa predictor'/><category term='genetic programming'/><category term='Linux'/><category term='optimization'/><category term='PROPKA'/><category term='GNUC'/><category term='benchmarking'/><category term='global optimization'/><category term='alternative splicing'/><category term='Intel'/><category 
term='recursion'/><title type='text'>A Scientific Computing Blog with BZoli</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://bzoli.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://bzoli.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>- bzoli -</name><uri>http://www.blogger.com/profile/02528424583321902497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='21' src='http://4.bp.blogspot.com/_5bwkvHH7ku8/SYYDHrlZl4I/AAAAAAAAAGU/jxX0YoQn2sc/S220/bzoli2.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>16</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-5604096.post-1550319230986395092</id><published>2009-09-29T21:34:00.000-07:00</published><updated>2009-09-29T21:38:35.208-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='FSF'/><category scheme='http://www.blogger.com/atom/ns#' term='GNU'/><category scheme='http://www.blogger.com/atom/ns#' term='Mono'/><category scheme='http://www.blogger.com/atom/ns#' term='Miguel de Icaza'/><category scheme='http://www.blogger.com/atom/ns#' term='Richard Stallman'/><title type='text'>The Saint and the Farmers</title><content type='html'>Imagine a land where people eat software. The farmers of the land make a decent living by sowing and reaping compilers, editors, browsers, and so on.&lt;br /&gt;&lt;br /&gt;One day a Saint arrives to this land. He sees that some people are starving because they don't get their daily software to eat. He starts to preach the New Era where software is free. 
He sets up a Free Foodware Foundation that distributes tasty new software at no cost to everyone.&lt;br /&gt;&lt;br /&gt;People rush to his Store and stock up on delicious, crispy, fresh software. Everybody is celebrating... except the farmers. They suddenly cannot earn their living any more, as no one needs their product. A few of them manage to switch to growing some obscure, 'embedded' software that is not offered by the Foundation, but most of them go bankrupt.&lt;br /&gt;&lt;br /&gt;Yet some farmers decide to pick up the gauntlet and start distributing their own software for free, while still making some revenue from offering support, selling ad space, etc.&lt;br /&gt;&lt;br /&gt;When the Saint learns this, he becomes furious. What 'traitors' these farmers are! They are worse than those who sold their software for money: they do this only to keep their pathetic products on the market, standing in the way of Free Software! He warns all his followers not to eat this software, because it is evil and might be poisonous!&lt;br /&gt;&lt;br /&gt;Well, this story came to mind while I was reading blog posts and comments about the recent Software Freedom Day in Boston, 19th Sept 2009. As a great fan of Mono, C# and .NET, I was left with bitter feelings by &lt;a href="http://www.osnews.com/story/22225/RMS_De_Icaza_Traitor_to_Free_Software_Community"&gt;Richard Stallman's harsh words&lt;/a&gt; about Miguel de Icaza.&lt;br /&gt;&lt;br /&gt;I use a lot of free software, both GNU and other. But we should not think that without free software the world would come to a sudden stop. What would happen if we did not have GCC, Linux, OpenOffice, GNOME, Firefox, Apache, and so on?&lt;br /&gt;&lt;br /&gt;We would simply buy software, as we buy books, music, PC games, or even food.&lt;br /&gt;&lt;br /&gt;We can afford it. 
And the reason we can afford it is that we have jobs, thanks to the fact that there are still products and services for which people are willing to pay.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5604096-1550319230986395092?l=bzoli.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bzoli.blogspot.com/feeds/1550319230986395092/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5604096&amp;postID=1550319230986395092' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/1550319230986395092'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/1550319230986395092'/><link rel='alternate' type='text/html' href='http://bzoli.blogspot.com/2009/09/saint-and-farmers.html' title='The Saint and the Farmers'/><author><name>- bzoli -</name><uri>http://www.blogger.com/profile/02528424583321902497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='21' src='http://4.bp.blogspot.com/_5bwkvHH7ku8/SYYDHrlZl4I/AAAAAAAAAGU/jxX0YoQn2sc/S220/bzoli2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5604096.post-3051934412709388884</id><published>2009-09-12T12:31:00.000-07:00</published><updated>2009-09-12T13:37:07.927-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Mono 2.4'/><category scheme='http://www.blogger.com/atom/ns#' term='benchmarking'/><category scheme='http://www.blogger.com/atom/ns#' term='SSE'/><category scheme='http://www.blogger.com/atom/ns#' term='global optimization'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='Mono'/><category 
scheme='http://www.blogger.com/atom/ns#' term='C++'/><title type='text'>Why Mono 2.4 is slow?</title><content type='html'>Let me start by saying that I am a big fan of .NET and &lt;a href="http://www.mono-project.com"&gt;Mono&lt;/a&gt;.  I believe that C# as a language is superior to Java and C++, and if I have the choice, I always work in C#. So I would like to believe that C# is as portable as Java, thanks to Novell and the Mono project.&lt;br /&gt;&lt;br /&gt;Recently, in connection with a &lt;a href="http://www.qath.net"&gt;molecular docking software&lt;/a&gt;, I wrote a sort of benchmark program, which takes a large list of points in 3D space, calculates the distances between each pair of points, and finds the largest distance.  The core of the program is a double loop which looks like this:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;    for (int i = 0; i &lt; N3; i++)&lt;br /&gt;    {&lt;br /&gt;        P = points[i];&lt;br /&gt;&lt;br /&gt;        for (int j=i+1; j &lt; N; j++)&lt;br /&gt;        {&lt;br /&gt;            Q = points[j];&lt;br /&gt;            if ( dist(P, Q) &gt; max_dist)&lt;br /&gt;               max_dist = dist(P,Q);&lt;br /&gt;        }&lt;br /&gt;    }&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;... where points is a System.Collections.Generic.List with 27000 items. I have rewritten this program also in C++, using STL vectors, and compiled with GCC for my Intel Core 2 processor, using &lt;code&gt;-mtune=native&lt;/code&gt;.&lt;br /&gt;&lt;br /&gt;If I compare the running time on Windows between the C++ and the C# versions, they are roughly equal. However, on Linux, if I use Mono 2.4 as the CLR, the GNU C++ version is much faster: it takes around 2.5 sec, while running with Mono takes more than 7 seconds.&lt;br /&gt;&lt;br /&gt;So why is running with Mono slow? 
The Mono profiler shows that each access to the &lt;code&gt;points&lt;/code&gt; list entails an 'array bounds check' for the i and j indices:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;########################&lt;br /&gt; 58070.103 364540502    0.000   System.Collections.Generic.List`1::get_Count()&lt;br /&gt;  Callers (with count) that contribute at least for 1%:&lt;br /&gt;    364540502   5 % Benchmarks.MainClass::Main(string[])&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;... even though allegedly array bounds checks are eliminated from for loops! Indeed, they are, but here I am using List and not Array. Let's switch to Array, and the running time decreases from 7.3 sec to 5.4 sec. This is still twice as much as the time taken by the C++ version, so let's run &lt;code&gt;mono --profile&lt;/code&gt; again: &lt;br /&gt;&lt;pre&gt;&lt;br /&gt;########################&lt;br /&gt; 59853.928 364486500    0.000   Benchmarks.MainClass::dist(Point,Point)&lt;br /&gt;  Callers (with count) that contribute at least for 1%:&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;... we still have about 364 million calls to the dist(Point,Point) function. Why is it not inlined? Checking the GNU Compiler's output for the C++ version, I can see that dist() is inlined. But for mono, it is probably too big (let me add here that I always use the &lt;code&gt;--optimize=all&lt;/code&gt; option for mono, which includes &lt;code&gt;inline&lt;/code&gt;.)&lt;br /&gt;Let's inline the dist(...) function manually:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;    for (int i = 0; i &lt; N3; i++)&lt;br /&gt;    {&lt;br /&gt;        P = points[i];&lt;br /&gt;&lt;br /&gt;        for (int j=i+1; j &lt; N; j++)&lt;br /&gt;        {&lt;br /&gt;            double d = Math.Sqrt( .... 
);&lt;br /&gt;&lt;br /&gt;            if ( d &gt; max_dist)&lt;br /&gt;               max_dist = d;&lt;br /&gt;        }&lt;br /&gt;    }&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;This speeds up the C# / Mono version to 3.9 sec, still far from the 2.5 s of the C++ version. &lt;br /&gt;&lt;br /&gt;Why is it that the C# version on Windows / Microsoft CLR is as fast as the C++ version, without any manual optimization (using generic List, no manual inlining...)?&lt;br /&gt;&lt;br /&gt;Comparing the GCC-generated assembly code with the code generated by &lt;code&gt;mono --aot -O=all&lt;/code&gt;, we can see that the GCC-emitted code uses SSE-based arithmetic, while the Mono JIT code uses x87 floating point instructions:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;    10ae:       dd 45 e8                fldl   0xffffffe8(%ebp)&lt;br /&gt;    10b1:       dd 45 e8                fldl   0xffffffe8(%ebp)&lt;br /&gt;    10b4:       de c9                   fmulp  %st,%st(1)&lt;br /&gt;    10b6:       de c1                   faddp  %st,%st(1)&lt;br /&gt;    10b8:       d9 fa                   fsqrt&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Moreover, for the &lt;code&gt;if ( d &gt; max_dist) max_dist = d;&lt;/code&gt; statement, the GCC code uses the &lt;code&gt;maxsd&lt;/code&gt; instruction, while the Mono-generated code branches with a jump...&lt;br /&gt;&lt;br /&gt;So &lt;a href="http://tirania.org/blog/index.html"&gt;Miguel de Icaza&lt;/a&gt; and his team still have a lot to do... 
I look forward to &lt;a href="http://www.mono-project.com/Roadmap"&gt;Mono 2.6&lt;/a&gt;, which perhaps will amend some of these issues.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5604096-3051934412709388884?l=bzoli.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bzoli.blogspot.com/feeds/3051934412709388884/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5604096&amp;postID=3051934412709388884' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/3051934412709388884'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/3051934412709388884'/><link rel='alternate' type='text/html' href='http://bzoli.blogspot.com/2009/09/why-mono-24-is-slow.html' title='Why Mono 2.4 is slow?'/><author><name>- bzoli -</name><uri>http://www.blogger.com/profile/02528424583321902497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='21' src='http://4.bp.blogspot.com/_5bwkvHH7ku8/SYYDHrlZl4I/AAAAAAAAAGU/jxX0YoQn2sc/S220/bzoli2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5604096.post-469278473503539522</id><published>2009-02-21T05:04:00.000-08:00</published><updated>2009-02-21T05:11:02.997-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='global optimization'/><category scheme='http://www.blogger.com/atom/ns#' term='alternative splicing'/><category scheme='http://www.blogger.com/atom/ns#' term='local search'/><category scheme='http://www.blogger.com/atom/ns#' term='evolution'/><category scheme='http://www.blogger.com/atom/ns#' term='genetic programming'/><title type='text'>Evolution and speciation as global optimization</title><content 
type='html'>The current model of phylogenetic evolution is basically a random search to optimize a fitness function. Genetic programming works by the same principles and it is able to find good approximations to global optima.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://en.wikipedia.org/wiki/Genetic_programming"&gt;Genetic programming&lt;/a&gt; is an optimization technique where we maintain a set of candidate solutions, the 'genomes', and modify them randomly by 1) point mutations, or 2) crossovers: creating a new solution by combining halves from existing 'genomes'.&lt;br /&gt;&lt;br /&gt;Crossovers are very important; without crossover, genetic programming would be only a local search, capable of finding only local optima.&lt;br /&gt;&lt;br /&gt;Seemingly, evolution has the same twofold mechanism: small genetic changes that 'implement' a local search on the fitness landscape; and large, abrupt changes like whole-genome duplication, etc. causing jumps that land on the slopes of faraway peaks.&lt;br /&gt;&lt;br /&gt;However, this is only an illusion. Even chromosome or whole-genome duplication is not radical enough for a truly global search for an optimum, as no new genes are introduced. Lateral gene transfer works only for bacteria and viruses.&lt;br /&gt;&lt;br /&gt;There is one more problem with abrupt, large-scale genetic changes in multicellular, diploid life forms: the offspring of the new phenotype would be isolated and could not find a mate. 
While this is not a problem for slow &lt;a href="http://en.wikipedia.org/wiki/Genetic_drift"&gt;genetic drift&lt;/a&gt;, where offspring are similar enough to their parents and relatives, it poses a problem for the 'jumps' that are so successful in genetic programming.&lt;br /&gt;&lt;br /&gt;One solution could be that we state that evolution is a local search, and the fitness landscape is smooth enough to make every peak attainable on a local search path.&lt;br /&gt;&lt;br /&gt;I don't know if the fossil record supports this hypothesis or not; I feel that there are large 'gaps' between groups (taxa), which would suggest that from time to time abrupt changes are necessary, and evolution cannot be explained as a smooth, simple local search.&lt;br /&gt;&lt;br /&gt;In the following I share my own '&lt;a href="http://math.ucr.edu/home/baez/crackpot.html"&gt;crackpot&lt;/a&gt;' theory on evolution; it is not the result of any kind of study or investigation, just my thoughts on the topic; it was not reviewed or approved by any expert in the field (and as for me, I am an absolute novice or amateur...). So quit reading or take the following cautiously, as it could be the dumbest thing you've ever heard.&lt;br /&gt;&lt;br /&gt;I think there are two distinct mechanisms for evolution: one is the well-known Darwinian process: small random mutations accumulate in exons and are selected or eliminated by natural selection: this is the local search.&lt;br /&gt;&lt;br /&gt;And I think there is another mechanism that is behind the abrupt, jump-like changes in the genome: it is much less known or discussed in mainstream science. 
These changes accumulate silently, in non-expressed parts of the genome, like introns or 'junk' DNA, and they are passed on to offspring so that they are widespread in the genome before the Big Change.&lt;br /&gt;&lt;br /&gt;Then there comes a time when the genome becomes bistable: a single point-mutation can turn lots of the exons into introns or junk DNA, while the latent, hidden new genes suddenly become expressed.&lt;br /&gt;&lt;br /&gt;As the latent new genome is already widespread at this point, this 'flip over' can happen simultaneously in many offspring, so they will be able to find mates and create their own population.&lt;br /&gt;&lt;br /&gt;The crucial notion here is the bistability of the genome: a state where it contains really two genomes, one of which is suppressed, the other expressed, but a single mutation can turn it around, so that the latent genome becomes expressed. Of course there could be thousands of such mutations, so this flipover event is not at all improbable.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://en.wikipedia.org/wiki/Alternative_splicing"&gt;Alternative splicing&lt;/a&gt; in some genes is a model for this flip-over in small scale.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5604096-469278473503539522?l=bzoli.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bzoli.blogspot.com/feeds/469278473503539522/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5604096&amp;postID=469278473503539522' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/469278473503539522'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/469278473503539522'/><link rel='alternate' type='text/html' 
href='http://bzoli.blogspot.com/2009/02/evolution-and-speciation-as-global.html' title='Evolution and speciation as global optimization'/><author><name>- bzoli -</name><uri>http://www.blogger.com/profile/02528424583321902497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='21' src='http://4.bp.blogspot.com/_5bwkvHH7ku8/SYYDHrlZl4I/AAAAAAAAAGU/jxX0YoQn2sc/S220/bzoli2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5604096.post-5204808409128719066</id><published>2009-02-12T12:59:00.000-08:00</published><updated>2009-02-12T13:15:22.972-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='protein'/><category scheme='http://www.blogger.com/atom/ns#' term='PROPKA'/><category scheme='http://www.blogger.com/atom/ns#' term='proteases'/><category scheme='http://www.blogger.com/atom/ns#' term='H++'/><category scheme='http://www.blogger.com/atom/ns#' term='pKa predictor'/><category scheme='http://www.blogger.com/atom/ns#' term='pKa'/><category scheme='http://www.blogger.com/atom/ns#' term='catalytic triad'/><title type='text'>PROPKA vs H++: pKa predictors for protein residues</title><content type='html'>Recently I decided to try out some free, online servers for predicting &lt;a href="http://en.wikipedia.org/wiki/Protein_pKa_calculations"&gt;pKa values on protein&lt;/a&gt; residues: &lt;a href="http://propka.ki.ku.dk/"&gt;PROPKA&lt;/a&gt;, &lt;a href="http://biophysics.cs.vt.edu/H++/index.php"&gt;H++&lt;/a&gt;, and &lt;a href="http://agknapp.chemie.fu-berlin.de/karlsberg/"&gt;Karlsberg+&lt;/a&gt;. 
The latter promises to send the results by email, but I never got anything back, so I guess it is dropped from the competition.&lt;br /&gt;&lt;br /&gt;First, I tried a famous example of an abnormal pKa value: the &lt;a href="http://en.wikipedia.org/wiki/Chymotrypsin"&gt;bovine chymotrypsin&lt;/a&gt;, which was one of the first proteases whose catalytic mechanism was unveiled. All textbooks on biochemistry describe this: we have a 'catalytic triad', which consists of Asp102, His57, and Ser195. Asp102 forms a hydrogen bond with His57, making its other aromatic nitrogen a strong base, which deprotonates Ser195. Thus the Serine becomes a nucleophile and attacks the carbon in the peptide bond.&lt;br /&gt;&lt;br /&gt;So I expected the pKa predictors to find that the Histidine becomes a strong base, and that the Serine is deprotonated.&lt;br /&gt;&lt;br /&gt;First, I tried the PROPKA server, giving it &lt;a href="http://www.pdb.org/pdb/explore/explore.do?structureId=1AB9"&gt;1ab9&lt;/a&gt; (BOVINE GAMMA-CHYMOTRYPSIN):&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;            pKa   pKmodel&lt;br /&gt;ASP 102B   0.40      3.80&lt;br /&gt;HIS  57B   6.19      6.50&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;This is not what I expected. The Histidine is actually getting a bit more acidic, while the Aspartate is very acidic. Trying on H++:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;HIS57: 11.3&lt;br /&gt;ASP102: -5.1&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;I am satisfied with the prediction on the Histidine: it became a strong base; but -5.1 for the Asp is a bit off the mark...&lt;br /&gt;&lt;br /&gt;Next, I tried Caspase 3, a cysteine protease, with PDB structure &lt;a href="http://www.pdb.org/pdb/explore/explore.do?structureId=1RHK"&gt;1rhk&lt;/a&gt;. H++ did not accept it, complaining about missing residues. The catalytic residue is Cys165A, for which PROPKA gives pKa = 11.14, instead of the standard 9.0. 
So the cysteine indeed became a strong base and nucleophile.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5604096-5204808409128719066?l=bzoli.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bzoli.blogspot.com/feeds/5204808409128719066/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5604096&amp;postID=5204808409128719066' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/5204808409128719066'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/5204808409128719066'/><link rel='alternate' type='text/html' href='http://bzoli.blogspot.com/2009/02/propka-vs-h-pka-predictors-for-protein.html' title='PROPKA vs H++: pKa predictors for protein residues'/><author><name>- bzoli -</name><uri>http://www.blogger.com/profile/02528424583321902497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='21' src='http://4.bp.blogspot.com/_5bwkvHH7ku8/SYYDHrlZl4I/AAAAAAAAAGU/jxX0YoQn2sc/S220/bzoli2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5604096.post-8537428950367341941</id><published>2009-02-01T04:58:00.000-08:00</published><updated>2009-02-01T05:05:10.807-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='OCW'/><category scheme='http://www.blogger.com/atom/ns#' term='MIT'/><category scheme='http://www.blogger.com/atom/ns#' term='open courseware'/><category scheme='http://www.blogger.com/atom/ns#' term='Biology'/><title type='text'>MIT OpenCourseWare: there is no such thing as a free lunch</title><content type='html'>The idea is great. 
Collect all the course materials: handouts, ppt slides, lecture notes, problem sets &amp;amp; answers, 'further reading lists', and so on from the brilliant professors of the MIT, and put them on the web, so that anyone can access them freely.&lt;br /&gt;&lt;br /&gt;A generous gesture for all the students and young people in poor countries, who are eager to learn, but can't afford it. Obviously MIT has nothing to lose with this move, but can gain a lot by its PR value.&lt;br /&gt;&lt;br /&gt;I come back to the MIT OCW website from time to time, look around, and leave disappointed. There are appealing titles that I would love to go through: like "&lt;a href="http://ocw.mit.edu/OcwWeb/Biology/7-343Spring-2008/CourseHome/index.htm"&gt;7.343 Sophisticated Survival Skills of Simple Microorganisms&lt;/a&gt;", or  "&lt;a href="http://ocw.mit.edu/OcwWeb/Biology/7-343Spring-2007/CourseHome/index.htm"&gt;7.343 Neuron-glial Cell Interactions in Biology and Disease&lt;/a&gt;" or "&lt;a href="http://ocw.mit.edu/OcwWeb/Biology/7-343Fall-2004/CourseHome/index.htm"&gt;7.343 Protein Folding, Misfolding and Human Disease&lt;/a&gt;", - just to mention a few (I am mostly browsing graduate courses in biology); - but when I want to  download the 'courseware', it turns out that this includes nothing more than a meagre syllabus, a short abstract of each or some of the lectures, and a list of references to papers in scientific journals that are behind paywalls.&lt;br /&gt;&lt;br /&gt;Especially this list of references is the most common 'courseware' that is offered, although I think that someone who goes to MIT OCW for knowledge does not have a ScienceDirect account. Obviously all these courses are taught using a PPT slideshow - why cannot they just put them online? (I dare not think of copyright issues...)&lt;br /&gt;&lt;br /&gt;Working at a huge multinational company, comparable in size with MIT, I think I know the answer. 
Top managers love to be creative; they are enthusiastic about novel ideas and about being a 'leader' in an area like OCW (indisputably initiated by MIT), but they are too sluggish and lazy to follow up on these ideas and see that they are implemented correctly and completely.&lt;br /&gt;&lt;br /&gt;Most probably these top officials of MIT never actually go to the OCW site and click through to see any of the course materials they are so proud of. If they did, they would probably feel cheated and angry, just as any third-world student who finds nothing but a teaser in place of the knowledge they seek.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5604096-8537428950367341941?l=bzoli.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bzoli.blogspot.com/feeds/8537428950367341941/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5604096&amp;postID=8537428950367341941' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/8537428950367341941'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/8537428950367341941'/><link rel='alternate' type='text/html' href='http://bzoli.blogspot.com/2009/02/mit-opencourseware-there-is-no-such.html' title='MIT OpenCourseWare: there is no such thing as a free lunch'/><author><name>- bzoli -</name><uri>http://www.blogger.com/profile/02528424583321902497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='21' 
src='http://4.bp.blogspot.com/_5bwkvHH7ku8/SYYDHrlZl4I/AAAAAAAAAGU/jxX0YoQn2sc/S220/bzoli2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5604096.post-5691304419413340518</id><published>2009-01-16T02:16:00.000-08:00</published><updated>2009-01-16T02:42:58.574-08:00</updated><title type='text'>GNU make: target-specific settings</title><content type='html'>I am a great fan of Make, and specifically of GNU Make: I use it as my primary scripting and build tool. Although I tried other tools like ANT, I soon became disappointed and fled back to Make.&lt;br /&gt;&lt;br /&gt;An important feature of GNU make is the use of &lt;a href="http://www.gnu.org/software/make/manual/html_node/Target_002dspecific.html"&gt;target-specific variable settings&lt;/a&gt;. This is &lt;i&gt;the&lt;/i&gt; feature you need to create Makefiles that work for several platforms and compilers. E.g., you could write (just an oversimplified illustration):&lt;br /&gt;&lt;code&gt;&lt;br /&gt;gcc: CC = gcc&lt;br /&gt;gcc: CCFLAGS =  -g -I.. -std=c99 -pedantic -O3&lt;br /&gt;gcc: CCFLAGS += -Wall -Wwrite-strings  -Wmissing-format-attribute -Wstrict-aliasing&lt;br /&gt;gcc: all&lt;br /&gt;&lt;br /&gt;icc: CC = icc&lt;br /&gt;icc: CCFLAGS =  -g -I.. -std=c99 -fast -fbuiltin -finline -check-uninit&lt;br /&gt;icc: all&lt;br /&gt;&lt;br /&gt;all: ... ( the actual targets)&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;This way you can compile your code with either the GNU or the Intel compiler, depending on the main target (gcc or icc).&lt;br /&gt;&lt;br /&gt;I use this technique all the time. But recently I found that these Makefiles are not really portable, since they don't work well on GNU make version 3.79 or earlier. 
And as the current version of GNU make is 3.81, we cannot say that 3.79 is a terribly old and obsolete release; in fact, lots of computers at my workplace have this version.&lt;br /&gt;&lt;br /&gt;Actually, the feature of target-specific variable settings does exist in version 3.79; the problem is that it is buggy and crashes at slight changes in the Makefile. For example, consider the following code:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;.PHONY: test&lt;br /&gt;&lt;br /&gt;MYVAR = Error&lt;br /&gt;&lt;br /&gt;test: MYVAR = Hello,&lt;br /&gt;test: MYVAR += world!&lt;br /&gt;test:&lt;br /&gt; @echo $(MYVAR)&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;which works fine with make 3.81, printing "Hello, world!", as intended. However, on 3.79, the output is "Error world!". The following Makefile (adding only the &lt;b&gt;export&lt;/b&gt; keyword)&lt;br /&gt;&lt;code&gt;&lt;br /&gt;.PHONY: test&lt;br /&gt;&lt;br /&gt;export MYVAR = Error&lt;br /&gt;&lt;br /&gt;test: MYVAR = Hello,&lt;br /&gt;test: MYVAR += world!&lt;br /&gt;test:&lt;br /&gt; @echo $(MYVAR)&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;works fine with 3.81, but it crashes with 3.79 with the error&lt;br /&gt;&lt;code&gt;&lt;br /&gt;../3.79.1/expand.c:489: failed assertion `current_variable_set_list-&gt;next != 0'&lt;br /&gt;     0 [sig] make-3.79 7772 open_stackdumpfile: Dumping stack trace to make-3.79.exe.stackdump&lt;br /&gt;Aborted (core dumped)&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;It is unfortunate that this bug exists in a widely distributed and used version of GNU Make, since open source projects rely on the GNU make installed at the user's site.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5604096-5691304419413340518?l=bzoli.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bzoli.blogspot.com/feeds/5691304419413340518/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://www.blogger.com/comment.g?blogID=5604096&amp;postID=5691304419413340518' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/5691304419413340518'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/5691304419413340518'/><link rel='alternate' type='text/html' href='http://bzoli.blogspot.com/2009/01/gnu-make-target-specific-settings.html' title='GNU make: target-specific settings'/><author><name>- bzoli -</name><uri>http://www.blogger.com/profile/02528424583321902497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='21' src='http://4.bp.blogspot.com/_5bwkvHH7ku8/SYYDHrlZl4I/AAAAAAAAAGU/jxX0YoQn2sc/S220/bzoli2.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5604096.post-8329719829277687658</id><published>2009-01-14T12:40:00.000-08:00</published><updated>2009-01-14T13:38:03.270-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cpu'/><category scheme='http://www.blogger.com/atom/ns#' term='SSE'/><category scheme='http://www.blogger.com/atom/ns#' term='GCC'/><category scheme='http://www.blogger.com/atom/ns#' term='optimization'/><category scheme='http://www.blogger.com/atom/ns#' term='SIMD'/><category scheme='http://www.blogger.com/atom/ns#' term='automatic vectorization'/><category scheme='http://www.blogger.com/atom/ns#' term='Intel'/><title type='text'>Automatic Vectorization - Part 2</title><content type='html'>I wonder if automatic vectorization is also possible without the &lt;code&gt;__attribute__((vector_size(16)))&lt;/code&gt; annotation, which is cheating in a way, since we give a quite explicit hint about how to pack data in vectors... 
The simplest example of truly automatic vectorization is this:&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/47064.js"&gt;&lt;/script&gt;&lt;br /&gt;&lt;br /&gt;When compiling with &lt;code&gt;gcc -O3 -msse3 -mfpmath=sse -std=c99 -S&lt;/code&gt;, we get this assembly output:&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/47065.js"&gt;&lt;/script&gt;&lt;br /&gt;&lt;br /&gt;We can see that four float values are added in every iteration, using the &lt;code&gt;addps&lt;/code&gt; instruction, which adds four packed single-precision floating-point numbers. If the length of the arrays is not a multiple of 4, we still obtain this vectorization: with 1023 elements, for example, addps is used up to element 1020, and the last 3 elements are added using addss.&lt;br /&gt;&lt;br /&gt;Now if we try to compile the following, very similar C function:&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/47075.js"&gt;&lt;/script&gt;&lt;br /&gt;&lt;br /&gt;the output is much more complicated; see it &lt;a href="http://gist.github.com/47092"&gt;here&lt;/a&gt;. There are two loops: the L7 loop uses scalar instructions (addss), while the L5 loop uses addps (parallel addition of 4 floats). Which one runs depends on the alignment of the input data, which is checked by the&lt;br /&gt;&lt;code&gt;testb $15, %bl&lt;/code&gt;&lt;br /&gt;instruction at the beginning.&lt;br /&gt;&lt;br /&gt;I was not able to modify my C code so that the function arguments are aligned on 16-byte boundaries. For example, we could try&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/47096.js"&gt;&lt;/script&gt;&lt;br /&gt;&lt;br /&gt;... 
but it gives the same assembly code as before, checking the alignment and deciding whether it is worth vectorizing the loop.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5604096-8329719829277687658?l=bzoli.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bzoli.blogspot.com/feeds/8329719829277687658/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5604096&amp;postID=8329719829277687658' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/8329719829277687658'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/8329719829277687658'/><link rel='alternate' type='text/html' href='http://bzoli.blogspot.com/2009/01/automatic-vectorization-part-2.html' title='Automatic Vectorization - Part 2'/><author><name>- bzoli -</name><uri>http://www.blogger.com/profile/02528424583321902497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='21' src='http://4.bp.blogspot.com/_5bwkvHH7ku8/SYYDHrlZl4I/AAAAAAAAAGU/jxX0YoQn2sc/S220/bzoli2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5604096.post-5284517365587216793</id><published>2008-12-28T04:13:00.000-08:00</published><updated>2008-12-28T04:56:45.564-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cpu'/><category scheme='http://www.blogger.com/atom/ns#' term='SSE'/><category scheme='http://www.blogger.com/atom/ns#' term='GNUC'/><category scheme='http://www.blogger.com/atom/ns#' term='HPC'/><category scheme='http://www.blogger.com/atom/ns#' term='GCC'/><category scheme='http://www.blogger.com/atom/ns#' term='vectorization'/><category scheme='http://www.blogger.com/atom/ns#' 
term='SIMD'/><category scheme='http://www.blogger.com/atom/ns#' term='Intel'/><title type='text'>Automatic Vectorization - does it work?</title><content type='html'>In my previous post I showed an example of how smart GCC can be. But the most incredible feature of GCC is automatic vectorization. Normally, game developers and others use 'intrinsics', that is, small predefined inline functions that translate directly into SIMD instructions, to exploit the vector processing capabilities of the CPU. All major compilers (GNU, Intel, and Microsoft) support intrinsics. However, GNU is the only one (?) that tries to &lt;i&gt;vectorize&lt;/i&gt; the code without human intervention, as an optimization step. Of course, we need to give some hints on how to group data into vectors, using the &lt;code&gt;vector_size&lt;/code&gt; attribute. For example, the following C function:&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/40458.js"&gt;&lt;/script&gt;&lt;br /&gt;will produce this output:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;_addv4:&lt;br /&gt; addps %xmm1, %xmm0&lt;br /&gt; ret&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Impressive! The arguments are passed in XMM registers, so there is not even a read from or write to memory! &lt;br /&gt;&lt;br /&gt;Should I worry about my job? Will compilers finally outsmart humans in producing better assembly output? Not yet... If we take another, very similar example:&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/40494.js"&gt;&lt;/script&gt;&lt;br /&gt;- we just want vectors of 3 floats instead of 4, which is natural enough when living in 3D space - we get this error message from GCC:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;error: number of components of the vector not a power of two&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Oops! Good that I don't get this message from malloc() when I want to reserve, let's say, 95 bytes. 
Next, look at the following simple function:&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/40511.js"&gt;&lt;/script&gt;&lt;br /&gt;Compiling with &lt;code&gt;gcc -S -O3 -msse3 -fomit-frame-pointer -foptimize-register-move&lt;/code&gt; gives&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/40516.js"&gt;&lt;/script&gt;&lt;br /&gt;What disturbs me is that the register moves are not optimized. Basically, all the &lt;code&gt;movaps&lt;/code&gt; instructions could be eliminated if we packed the floats into the right registers, something like this:&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/40529.js"&gt;&lt;/script&gt;&lt;br /&gt;It seems that for performance-critical applications or code segments we had better do it manually, in assembly or using intrinsics.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5604096-5284517365587216793?l=bzoli.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bzoli.blogspot.com/feeds/5284517365587216793/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5604096&amp;postID=5284517365587216793' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/5284517365587216793'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/5284517365587216793'/><link rel='alternate' type='text/html' href='http://bzoli.blogspot.com/2008/12/automatic-vectorization-does-it-work.html' title='Automatic Vectorization - does it work?'/><author><name>- bzoli -</name><uri>http://www.blogger.com/profile/02528424583321902497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='21' 
src='http://4.bp.blogspot.com/_5bwkvHH7ku8/SYYDHrlZl4I/AAAAAAAAAGU/jxX0YoQn2sc/S220/bzoli2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5604096.post-5440545104764174602</id><published>2008-12-20T12:37:00.000-08:00</published><updated>2008-12-20T13:04:44.007-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='compilers'/><category scheme='http://www.blogger.com/atom/ns#' term='assembler'/><category scheme='http://www.blogger.com/atom/ns#' term='runtime checking'/><category scheme='http://www.blogger.com/atom/ns#' term='recursion'/><category scheme='http://www.blogger.com/atom/ns#' term='GCC'/><category scheme='http://www.blogger.com/atom/ns#' term='optimization'/><title type='text'>Optimized code from GCC</title><content type='html'>Some time ago I got into the habit of writing small, simple functions in C, compiling them, and then checking the assembly output of the compiler. For the GNU C compiler, you can get the assembly listing by using the -S command-line switch. In most cases, the output is more or less what I expected. In some cases, however, the compiler exceeds my expectations. Here is an example:&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/38395.js"&gt;&lt;/script&gt;&lt;br /&gt;&lt;br /&gt;This is a stupid, useless function that decrements its argument, and calls itself recursively. The result is obviously zero, but this result is obtained in a tedious and inefficient way. For a big input, e.g. 1000000, the program obviously overflows the stack. Compiling this function without optimization will result in this code:&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/38418.js"&gt;&lt;/script&gt;&lt;br /&gt;&lt;br /&gt;We can see that _f is indeed calling _f, that is, itself. 
However, if we compile this function with the -O3 command line option, meaning level-3 optimization, we get very different object code:&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/38419.js"&gt;&lt;/script&gt;&lt;br /&gt;&lt;br /&gt;We can see that the recursion is optimized away entirely. Instead, the function argument is removed from the stack, the EAX register is loaded with zero, and the function returns. &lt;br /&gt;&lt;br /&gt;I am really, totally impressed. How could the compiler find out that this function will always return zero? OK, but let us modify the function f a little bit:&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/38421.js"&gt;&lt;/script&gt;&lt;br /&gt;&lt;br /&gt;Did you spot the difference? Now the input value x can be negative. In that case, the function never returns: it exhausts the stack and crashes. Let's see what GCC makes out of it:&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/38419.js"&gt;&lt;/script&gt;&lt;br /&gt;&lt;br /&gt;- Exactly the same. Did it realize that there is no point in implementing the case where x is negative? Or is this simply a bug in GCC? 
- If you have an opinion or know the answer please leave me a comment.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5604096-5440545104764174602?l=bzoli.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bzoli.blogspot.com/feeds/5440545104764174602/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5604096&amp;postID=5440545104764174602' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/5440545104764174602'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/5440545104764174602'/><link rel='alternate' type='text/html' href='http://bzoli.blogspot.com/2008/12/optimized-code-from-gcc.html' title='Optimized code from GCC'/><author><name>- bzoli -</name><uri>http://www.blogger.com/profile/02528424583321902497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='21' src='http://4.bp.blogspot.com/_5bwkvHH7ku8/SYYDHrlZl4I/AAAAAAAAAGU/jxX0YoQn2sc/S220/bzoli2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5604096.post-6703221229050696488</id><published>2008-12-20T12:13:00.000-08:00</published><updated>2008-12-20T12:25:09.888-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='assembler'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='C++'/><category scheme='http://www.blogger.com/atom/ns#' term='Intel'/><category scheme='http://www.blogger.com/atom/ns#' term='CPUID'/><title type='text'>CPUID: getting the brand name of the CPU</title><content type='html'>Last time I gave an example how to check &lt;a href="http://en.wikipedia.org/wiki/SIMD"&gt;SIMD 
&lt;/a&gt;support using the &lt;a href="http://en.wikipedia.org/wiki/CPUID"&gt;CPUID &lt;/a&gt;assembly instruction. We can also use CPUID to obtain the exact brand name of the CPU. For example, running this C program on my desktop I get&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;CPU brand:               Intel(R) Pentium(R) 4 CPU 2.40GHz&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/38389.js"&gt;&lt;/script&gt;&lt;br /&gt;&lt;br /&gt;We need to invoke the CPUID instruction three times with different integer constants in EAX (0x80000002, 0x80000003 and 0x80000004), each time copying the 16 bytes returned in EAX, EBX, ECX and EDX into a char buffer. So the brand name can be up to 48 bytes long (3 x 16); in my case, it is padded with spaces.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5604096-6703221229050696488?l=bzoli.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bzoli.blogspot.com/feeds/6703221229050696488/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5604096&amp;postID=6703221229050696488' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/6703221229050696488'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/6703221229050696488'/><link rel='alternate' type='text/html' href='http://bzoli.blogspot.com/2008/12/cpuid-getting-brand-name-of-cpu.html' title='CPUID: getting the brand name of the CPU'/><author><name>- bzoli -</name><uri>http://www.blogger.com/profile/02528424583321902497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='21' 
src='http://4.bp.blogspot.com/_5bwkvHH7ku8/SYYDHrlZl4I/AAAAAAAAAGU/jxX0YoQn2sc/S220/bzoli2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5604096.post-4118079691275529619</id><published>2008-12-19T12:54:00.000-08:00</published><updated>2008-12-19T13:10:34.182-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='GCC'/><category scheme='http://www.blogger.com/atom/ns#' term='Advanced Vector Extension'/><category scheme='http://www.blogger.com/atom/ns#' term='SIMD'/><category scheme='http://www.blogger.com/atom/ns#' term='AVX'/><category scheme='http://www.blogger.com/atom/ns#' term='Intel'/><category scheme='http://www.blogger.com/atom/ns#' term='CPUID'/><title type='text'>Checking for SIMD extension with CPUID</title><content type='html'>The last time I pondered how to check the number of CPU cores in the system from software. It might be equally useful to check if various SIMD extensions (SSE ... SSE4) are present. The good old CPUID instruction on Intel and AMD can do the trick.&lt;br /&gt;&lt;br /&gt;This code snippet must be compiled with GCC, since it uses the &lt;a href="http://gcc.gnu.org/onlinedocs/gcc-4.3.2/gcc/Machine-Constraints.html"&gt;inline assembler&lt;/a&gt; feature:&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/38115.js"&gt;&lt;/script&gt;&lt;br /&gt;&lt;br /&gt;I got the description of the feature bits from &lt;a href="http://www.intel.com/assets/pdf/appnote/241618.pdf"&gt;this Intel document&lt;/a&gt;; except that it does not describe ECX:bit 28 as the flag for AVX (&lt;a href="http://software.intel.com/sites/avx/"&gt;Advanced Vector Extensions&lt;/a&gt; to be released in 2010). 
This info is from &lt;a href="http://www.sandpile.org/ia32/cpuid.htm"&gt;sandpile.org&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5604096-4118079691275529619?l=bzoli.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bzoli.blogspot.com/feeds/4118079691275529619/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5604096&amp;postID=4118079691275529619' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/4118079691275529619'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/4118079691275529619'/><link rel='alternate' type='text/html' href='http://bzoli.blogspot.com/2008/12/checking-for-simd-extension-with-cpuid.html' title='Checking for SIMD extension with CPUID'/><author><name>- bzoli -</name><uri>http://www.blogger.com/profile/02528424583321902497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='21' src='http://4.bp.blogspot.com/_5bwkvHH7ku8/SYYDHrlZl4I/AAAAAAAAAGU/jxX0YoQn2sc/S220/bzoli2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5604096.post-168498032215859872</id><published>2008-12-16T02:00:00.000-08:00</published><updated>2008-12-16T03:04:02.211-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='multicore'/><category scheme='http://www.blogger.com/atom/ns#' term='pthread'/><category scheme='http://www.blogger.com/atom/ns#' term='OpenMP'/><category scheme='http://www.blogger.com/atom/ns#' term='Win32'/><category scheme='http://www.blogger.com/atom/ns#' term='MinGW'/><category scheme='http://www.blogger.com/atom/ns#' term='Linux'/><category scheme='http://www.blogger.com/atom/ns#' 
term='platform-indepent'/><title type='text'>Getting the number of CPU cores in a platform-independent way</title><content type='html'>I like &lt;a href="http://mingw.org"&gt;MinGW&lt;/a&gt; very much: it makes it possible to write C/C++ code that is somewhat platform-independent, that is, code that can be built for both Linux and Windows. Unfortunately, I always bump into features that are missing.&lt;br /&gt;&lt;br /&gt;For example, sys/sysinfo.h should declare a function called get_nprocs(), which returns the number of processors/CPU cores in the system. This is needed to scale a numerically-intensive application for multicore architectures. However, it seems that MinGW is lacking this header file - even though I downloaded the most recent (4.3.0) GCC version for MinGW.&lt;br /&gt;&lt;br /&gt;So what can we do? Since &lt;a href="http://gcc.gnu.org/gcc-4.3/"&gt;GCC 4.3&lt;/a&gt; has full &lt;a href="http://openmp.org/wp/"&gt;OpenMP &lt;/a&gt;support, my first attempt was to call omp_get_num_procs():&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/36404.js"&gt;&lt;/script&gt;&lt;br /&gt;&lt;br /&gt;... and compile it with gcc -fopenmp, but I got lots of linking errors... It turns out that OpenMP depends on pthreads, which is not part of MinGW.&lt;br /&gt;&lt;br /&gt;So I downloaded "&lt;a href="http://sourceware.org/pthreads-win32/"&gt;Pthreads for Win32&lt;/a&gt;", and now I was able to build my executable with&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;gcc -fopenmp -o test.exe main.c /path/to/pthreadGC2.dll&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Invoking test.exe gives the correct result of 2. But I am still not happy, since I detest adding a new 3rd-party library each time I need something less trivial than printf. Moreover, pthreads-for-windows is released under the LGPL, which makes static linking awkward: users must still be able to relink the application against a modified version of the library.&lt;br /&gt;&lt;br /&gt;So we have to resort to the good old #ifdef ... 
style of programming: call the Win32 API on Windows, and use sys/sysinfo.h on Linux:&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/36406.js"&gt;&lt;/script&gt;&lt;br /&gt;&lt;br /&gt;This code snippet also shows how to get the process affinity mask on Windows, which is again a headache with MinGW, since pthreads-for-win32 does not implement the cpu_set_t stuff (although it has a sched.h header).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5604096-168498032215859872?l=bzoli.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bzoli.blogspot.com/feeds/168498032215859872/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5604096&amp;postID=168498032215859872' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/168498032215859872'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/168498032215859872'/><link rel='alternate' type='text/html' href='http://bzoli.blogspot.com/2008/12/getting-number-of-cpu-cores-in-platform.html' title='Getting the number of CPU cores in a platform-independent way'/><author><name>- bzoli -</name><uri>http://www.blogger.com/profile/02528424583321902497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='21' src='http://4.bp.blogspot.com/_5bwkvHH7ku8/SYYDHrlZl4I/AAAAAAAAAGU/jxX0YoQn2sc/S220/bzoli2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5604096.post-4477673665381876241</id><published>2008-12-09T07:51:00.000-08:00</published><updated>2008-12-09T08:04:41.314-08:00</updated><title type='text'>Adding explicit Hydrogens with CDK 1.1.2.</title><content type='html'>In my previous post, I created the 3D model 
of decalin, an organic molecule which is the building block of the steroid framework, using the Chemistry Development Kit (CDK). You might notice that the final structure shows only the carbon atoms: all hydrogens are 'implicit', that is, they are taken into account by the ModelBuilder3D, but not written out into the PDB file.&lt;br /&gt;&lt;br /&gt;I had a hard time until I finally managed to add explicit hydrogens. It seems that in earlier versions of CDK, the HydrogenAdder class had a method 'addExplicitHydrogens()', but this class is now replaced with CDKHydrogenAdder, which handles only implicit hydrogens. After lots of experimenting, I wrote a function that basically copies a molecule and turns all implicit hydrogens into explicit ones:&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/33939.js"&gt;&lt;/script&gt;&lt;br /&gt;&lt;br /&gt;Then I used the ModelBuilder3D class, as before, to create a 3D model:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_5bwkvHH7ku8/ST6VyAdXvXI/AAAAAAAAAFk/qfFuklYpGvU/s1600-h/decalin_with_hydrogens.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 319px;" src="http://4.bp.blogspot.com/_5bwkvHH7ku8/ST6VyAdXvXI/AAAAAAAAAFk/qfFuklYpGvU/s320/decalin_with_hydrogens.jpg" border="0" alt="Decalin with explicit hydrogens - 3D model" id="BLOGGER_PHOTO_ID_5277820499673988466" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I have no idea why the feature to add explicit hydrogens was removed from CDK. It is necessary, e.g., for proper docking, detecting hydrogen bonds, and so on. 
Maybe in later releases they will add it back; if not, you can use my snippet above as a substitute.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5604096-4477673665381876241?l=bzoli.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bzoli.blogspot.com/feeds/4477673665381876241/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5604096&amp;postID=4477673665381876241' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/4477673665381876241'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/4477673665381876241'/><link rel='alternate' type='text/html' href='http://bzoli.blogspot.com/2008/12/adding-explicit-hydrogens-with-cdk-112.html' title='Adding explicit Hydrogens with CDK 1.1.2.'/><author><name>- bzoli -</name><uri>http://www.blogger.com/profile/02528424583321902497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='21' src='http://4.bp.blogspot.com/_5bwkvHH7ku8/SYYDHrlZl4I/AAAAAAAAAGU/jxX0YoQn2sc/S220/bzoli2.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_5bwkvHH7ku8/ST6VyAdXvXI/AAAAAAAAAFk/qfFuklYpGvU/s72-c/decalin_with_hydrogens.jpg' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5604096.post-5386676827452703276</id><published>2008-12-01T22:57:00.000-08:00</published><updated>2008-12-09T07:47:09.620-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cdk molecular-modeling cheminformatics Java force-field'/><title type='text'>3D geometry modeling in CDK 1.1.2</title><content type='html'>Recently I have been experimenting with the &lt;a 
href="http://sourceforge.net/projects/cdk/"&gt;Chemistry Development Kit (CDK)&lt;/a&gt;, a free, open-source  (&lt;a href="http://www.gnu.org/copyleft/lesser.html"&gt;GNU LGPL&lt;/a&gt;) Java toolkit for cheminformatics. An outstanding feature of CDK is its 3D modeller: you provide a 2D (planar) model, and it can generate a true three-dimensional optimized geometry, using a chosen molecular force field.  For example, I downloaded a 2-dimensional .mol file of &lt;a href="http://en.wikipedia.org/wiki/Decalin"&gt;decalin &lt;/a&gt;from &lt;a href="http://www.ebi.ac.uk/chebi/"&gt;ChEBI&lt;/a&gt;, although the real molecule is obviously not planar. Then I used CDK to generate the 3D geometry. Here is a code snippet showing how I did it:&lt;br /&gt;&lt;br /&gt;&lt;script src="http://gist.github.com/31032.js"&gt;&lt;/script&gt;&lt;br /&gt;&lt;br /&gt;Here is the 2D image of decalin that corresponds to the .mol file from ChEBI:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_5bwkvHH7ku8/STTwB3gCFpI/AAAAAAAAAFI/PsACXJZH7RI/s1600-h/decalin2d.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 200px; height: 200px;" src="http://4.bp.blogspot.com/_5bwkvHH7ku8/STTwB3gCFpI/AAAAAAAAAFI/PsACXJZH7RI/s320/decalin2d.png" alt="" id="BLOGGER_PHOTO_ID_5275104978426533522" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Here is a picture of the model I generated:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_5bwkvHH7ku8/STTwX8EcR9I/AAAAAAAAAFQ/2QyK0L7XtO8/s1600-h/decalin3d.jpg"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 246px;" src="http://2.bp.blogspot.com/_5bwkvHH7ku8/STTwX8EcR9I/AAAAAAAAAFQ/2QyK0L7XtO8/s320/decalin3d.jpg" alt="" id="BLOGGER_PHOTO_ID_5275105357610108882" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br 
/&gt;Decalin has two stereoisomers. This is the &lt;i&gt;trans&lt;/i&gt; form, which is energetically more favourable than the &lt;i&gt;cis&lt;/i&gt; form, so CDK has found the globally optimal geometry.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5604096-5386676827452703276?l=bzoli.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bzoli.blogspot.com/feeds/5386676827452703276/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5604096&amp;postID=5386676827452703276' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/5386676827452703276'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/5386676827452703276'/><link rel='alternate' type='text/html' href='http://bzoli.blogspot.com/2008/12/3d-geometry-modeling-in-cdk-112.html' title='3D geometry modeling in CDK 1.1.2'/><author><name>- bzoli -</name><uri>http://www.blogger.com/profile/02528424583321902497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='21' src='http://4.bp.blogspot.com/_5bwkvHH7ku8/SYYDHrlZl4I/AAAAAAAAAGU/jxX0YoQn2sc/S220/bzoli2.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_5bwkvHH7ku8/STTwB3gCFpI/AAAAAAAAAFI/PsACXJZH7RI/s72-c/decalin2d.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5604096.post-2391228742060158890</id><published>2008-11-26T07:22:00.000-08:00</published><updated>2008-11-26T07:50:12.547-08:00</updated><title type='text'>Ten reasons I hate C++</title><content type='html'>&lt;span style="font-weight: bold;"&gt;1. 
Header files.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;They contain no information that cannot be automatically extracted from the cpp files. Forcing all symbols to be declared ahead of usage is convenient for compiler developers, but a big pain for application developers. Other languages (Java, D, C#, etc.) can do without header files, so why do we need them in C++?&lt;br /&gt;&lt;br /&gt;Another problem is that the 'standard' header files differ between platforms in subtle ways. So a program that compiles on one Unix system may fail on another because some obscure symbol with lots of underscores in it is not defined.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;2. Preprocessor&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;At the company where I work, people still use preprocessor macros as an alternative to writing a subroutine. I simply cannot abstain from using macros, as my unscrupulous coworkers force them on me.&lt;br /&gt;&lt;br /&gt;Even conditional compilation is possible without a preprocessor, e.g. in Java by using static final boolean flags.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;3. Deployment and platform dependency&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;C/C++ has a reputation for producing very fast code. True, but only when the compiler (e.g. g++) is supplied with the right command-line switches, like -O3 -march=core2 -fomit-frame-pointer -finline-functions ...&lt;br /&gt;&lt;br /&gt;Also, unless you compile on the same computer where the code will be run, you must specify quite precisely the type of CPU and instruction set, like 'SSE2' and so on.&lt;br /&gt;&lt;br /&gt;If I want to deploy an application as executable code, how am I supposed to guess the CPU type of my future customers? Or should I offer a version for each possible CPU type?&lt;br /&gt;&lt;br /&gt;Languages with JIT compilation don't have this problem. 
They detect the CPU type at runtime and produce correctly optimized code.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;4. Linking&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;As the last step of compilation, you link your binary with a number of libraries. This 'number' is growing rapidly. In a recent project, we had well over one hundred -lxxxx options, some 90% of which were totally unknown to me.&lt;br /&gt;&lt;br /&gt;When compilation fails I am usually pretty confident that I'll find and fix the problem easily. However, when linking fails I panic. How am I supposed to know which of the 100 obscure libraries should contain that equally obscure symbol that 'collect2' is looking for?&lt;br /&gt;&lt;br /&gt;Also, we cannot statically link all libs with our code, both because of size limitations and because of that infamous GPL license, whose terms would then extend to our own code. But this complicates deployment beyond all measure. You must check at install time that the target computer has all the necessary libraries in the correct versions; and if not, you must inconvenience your customer by asking him to obtain those 'free' libraries and try the install again.&lt;br /&gt;&lt;br /&gt;Of course, this problem is not specific to C/C++ (though it is quite typical of it)...&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;5. No garbage collection&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;For a huge application, written by many people and integrating third-party software, it is almost impossible to find precisely the right place to free a certain chunk of memory, that is, to delete objects on the heap.&lt;br /&gt;&lt;br /&gt;There is no good strategy for that. Some say that the class that allocates an object should delete it as well. But this rule is impossible to enforce, unless you implement awkward schemes solely for this purpose.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;6. Multithreading&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;It is a headache in C++. 
There is Pthreads, but it lacks the language-level monitors and barrier classes that make multithreading in Java so easy and powerful.&lt;br /&gt;&lt;br /&gt;The most popular framework for scientific calculations on multicore CPUs is OpenMP. It is really great, but I would say it is an extension to the C++ language, not part of it.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;7. Complexity&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Recently the company where I work asked me to check and evaluate the C++ tests some new recruits wrote at their job interviews. Although I have worked with C++ for more than a decade, I had to write small programs and run them to find the right answer to some questions. (Admittedly, they were very tricky questions.)&lt;br /&gt;&lt;br /&gt;If you have ever tried to read the ANSI/ISO C++ specification, you'll know what I mean. C++ is extremely complicated, and no one can remember all the fine details that are written in the spec.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;8. STL Iterators&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;We are still waiting for C++ to adopt the foreach concept.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;9. String and regexp handling, date &amp;amp; time&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;No comment. You can use &lt;a href="http://www.boost.org/"&gt;boost&lt;/a&gt; if you dare to import that mess of templates into your codebase.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;10. Lack of free, standard GUI / Widget toolkit&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;There are many alternatives, but each has its flaws. 
FLTK is ugly, Qt and GTK are not entirely free to use in commercial applications, Qt needs its 'meta object compiler' because it extends the syntax of C++, and so on.&lt;br /&gt;&lt;br /&gt;Java and C# come with their own widget toolkits, so you don't have to learn a new one for each project...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5604096-2391228742060158890?l=bzoli.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bzoli.blogspot.com/feeds/2391228742060158890/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5604096&amp;postID=2391228742060158890' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/2391228742060158890'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/2391228742060158890'/><link rel='alternate' type='text/html' href='http://bzoli.blogspot.com/2008/11/ten-reasons-i-hate-c.html' title='Ten reasons I hate C++'/><author><name>- bzoli -</name><uri>http://www.blogger.com/profile/02528424583321902497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='21' src='http://4.bp.blogspot.com/_5bwkvHH7ku8/SYYDHrlZl4I/AAAAAAAAAGU/jxX0YoQn2sc/S220/bzoli2.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5604096.post-250287023235429482</id><published>2008-11-25T06:21:00.000-08:00</published><updated>2008-11-25T06:49:17.897-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='multicore'/><category scheme='http://www.blogger.com/atom/ns#' term='java'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category 
scheme='http://www.blogger.com/atom/ns#' term='benchmark'/><title type='text'>Java 6 at the top of the Computer Language Shootout  benchmark</title><content type='html'>I often spend time on &lt;a href="http://shootout.alioth.debian.org/"&gt;this wonderful website&lt;/a&gt;, which runs benchmark implementations in many mainstream and obscure programming languages and displays the results... It is instructive to study the submitted code (especially when adapted to a multicore CPU), and interesting to see how some languages get to the top, and some months or years later fall back and give way to new compilers and new programming environments.&lt;br /&gt;&lt;br /&gt;For a long time, C/C++ with GCC (the GNU compiler) was the absolute winner. Some other, similar languages, e.g. the D language from &lt;a href="http://digitalmars.com/d/index.html"&gt;digitalmars&lt;/a&gt;, challenged their first place, but could never displace them. (I was a great fan of the D language for some time...) So who would have believed that one day Java would take over and, at least in some benchmarks, beat C and C++?&lt;br /&gt;&lt;br /&gt;This 'one day' has come. 
Recently I checked the &lt;a href="http://shootout.alioth.debian.org/u32q/benchmark.php?test=nbody&amp;amp;lang=all"&gt;results for the n-body benchmark&lt;/a&gt;, and here is a screenshot of the result (click to see it in full size):&lt;br /&gt;&lt;br /&gt;&lt;center&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_5bwkvHH7ku8/SSwMqiJhuEI/AAAAAAAAAEM/C6JkpeivwIM/s1600-h/screenshot.jpg"&gt;&lt;img style="cursor: pointer; width: 300px; height: 320px;" src="http://2.bp.blogspot.com/_5bwkvHH7ku8/SSwMqiJhuEI/AAAAAAAAAEM/C6JkpeivwIM/s320/screenshot.jpg" alt="" id="BLOGGER_PHOTO_ID_5272603188604811330" border="0" /&gt;&lt;/a&gt;&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;Note that while Fortran is in first place, it is only 1 millisecond faster than Java, and the Java running time also includes a much longer startup time and the JIT compilation. C is one whole second slower than Java.&lt;br /&gt;&lt;br /&gt;I know this is not a definitive verdict on C/C++; in a few weeks I will probably see completely different results. Still, it is very impressive. 
It also confirms my own results with &lt;a href="http://math.nist.gov/scimark2/"&gt;SciMark 2.0&lt;/a&gt; on my dual-core laptop (Windows), which consistently gives better scores for Java than for C++.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5604096-250287023235429482?l=bzoli.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bzoli.blogspot.com/feeds/250287023235429482/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5604096&amp;postID=250287023235429482' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/250287023235429482'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5604096/posts/default/250287023235429482'/><link rel='alternate' type='text/html' href='http://bzoli.blogspot.com/2008/11/java-6-at-top-of-computer-language.html' title='Java 6 at the top of the Computer Language Shootout  benchmark'/><author><name>- bzoli -</name><uri>http://www.blogger.com/profile/02528424583321902497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='21' src='http://4.bp.blogspot.com/_5bwkvHH7ku8/SYYDHrlZl4I/AAAAAAAAAGU/jxX0YoQn2sc/S220/bzoli2.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_5bwkvHH7ku8/SSwMqiJhuEI/AAAAAAAAAEM/C6JkpeivwIM/s72-c/screenshot.jpg' height='72' width='72'/><thr:total>2</thr:total></entry></feed>
