Modern C++, iterators and loops compared to C#

It has been a while since I looked closely at modern C++. I started intentionally simple, exploring some of the ways a simple list can be iterated. I hate to say it, but I remember the days when getting the Standard Template Library working on Windows with a Microsoft C++ compiler was anything but simple, safe and reliable. My, how times have changed for the better!

While the overall syntax of C++ lacks the refined, designed elegance of C#, it has come a very long way!

I’ve thrown together a simple sample. The code creates a list of strings, pushes (not adds, arrgh!) strings onto the list, and then iterates through the list in three similar but syntactically different ways, shown below:

image

OK, there are a few awesome things here that make C++ much more approachable these days:

  1. A decent string class! No more hunting for a library or header file that has most of the functionality you need. (There’s a wstring as well for wchar_t needs.)
  2. All the common data structure types you might need, like list, map and vector, without writing them yourself.
  3. With using, the code doesn’t need to qualify types with their namespaces (much like C#). So, instead of std::list<std::string>, it’s just list<string>, which I consider far more readable at a glance.
  4. No weird casting or worrying about strings when values are passed to the push_back function. Again, it just works.
  5. auto – the var of C# is called auto in C++. If I hadn’t used auto in the first iterator, the code would have declared the type as list<string>::iterator. While not terrible, auto reads well, as it’s likely that I’ll not need the detailed declaration to understand what’s being iterated upon.
  6. The second option is just syntactic sugar, as internally it uses a standard for loop like the one in the first example. But, you’ll note there are anonymous functions/lambdas! The square brackets [] are the capture clause of the lambda expression. Unlike C#, where the compiler automatically determines which outer-scoped variables the lambda uses, C++ requires the code to state what is captured (either by naming variables or by opting in to a default capture with [=] or [&]). While this is a bit annoying, the extra declaration may make programmers think twice about which variables the lambda expression really needs. In this instance, no outer variables are needed, so the capture clause is empty.
  7. The last example is the most concise, and maybe a little less friendly at first, mostly due to the heritage of C++ and its focus on efficient code. It’s called a range-based for statement. The code uses auto again, and here it deduces string, the element type of the list. Remember that the & symbol declares a reference and const indicates that the value will not be changed by the inner block. Together, these mean each value is observed in place, without any need for a copy.
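
As a rough sketch of the three styles described in the list above (this is not the original listing): the sample strings, the function names, and the choice of std::for_each for the second loop are all my assumptions. Each function builds a string instead of printing, which is why the lambda here captures out by reference rather than using an empty capture clause.

```cpp
#include <algorithm>  // std::for_each
#include <cassert>
#include <list>
#include <string>

using namespace std;  // allows list<string> instead of std::list<std::string>

// Hypothetical sample data; push_back plays the role of C#'s Add.
list<string> make_names() {
    list<string> names;
    names.push_back("Alpha");
    names.push_back("Beta");
    names.push_back("Gamma");
    return names;
}

// 1. Explicit iterator loop. Because names is const, auto deduces
//    list<string>::const_iterator here.
string join_with_iterator(const list<string>& names) {
    string out;
    for (auto it = names.begin(); it != names.end(); ++it) {
        out += *it;
        out += '\n';
    }
    return out;
}

// 2. std::for_each with a lambda. The capture clause [&out] names the one
//    outer variable the lambda uses; a lambda that only wrote to std::cout
//    could leave the capture clause empty.
string join_with_for_each(const list<string>& names) {
    string out;
    for_each(names.begin(), names.end(), [&out](const string& name) {
        out += name;
        out += '\n';
    });
    return out;
}

// 3. Range-based for. const auto& observes each element in place, no copy.
string join_with_range_for(const list<string>& names) {
    string out;
    for (const auto& name : names) {
        out += name;
        out += '\n';
    }
    return out;
}

// Sanity check: all three styles visit the same elements in the same order.
void check_styles_agree(const list<string>& names) {
    assert(join_with_iterator(names) == join_with_for_each(names));
    assert(join_with_for_each(names) == join_with_range_for(names));
}
```

Calling check_styles_agree(make_names()) from main should pass silently, since the three loops differ only in syntax.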

While the range-based for statement isn’t quite as readable as C#:

image

I think you might agree that the C++ version is more than passable, and nearly friendly!

image

(As an aside, C++11 also has an initializer-list syntax that works with standard containers such as list<string>. With a compiler that supports it, the calls to push_back could have been replaced by a single braced list of values.)
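
To illustrate the initializer-list aside, here is a small sketch assuming a compiler with C++11 initializer-list support. The function names and sample strings are hypothetical, chosen only for the comparison.

```cpp
#include <cassert>
#include <list>
#include <string>

// Builds the list one push_back call at a time.
std::list<std::string> build_with_push_back() {
    std::list<std::string> names;
    names.push_back("Alpha");
    names.push_back("Beta");
    names.push_back("Gamma");
    return names;
}

// Builds the same list from a braced initializer list (C++11).
std::list<std::string> build_with_initializer_list() {
    return {"Alpha", "Beta", "Gamma"};
}

// Both approaches should produce lists with identical contents.
void check_same_contents() {
    assert(build_with_push_back() == build_with_initializer_list());
}
```

The braced form reads much closer to a C# collection initializer, which is probably why it feels friendlier.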

Using Windows CSCRIPT to compile a Handlebars.js template

I was looking for an alternative to Node.js for a JavaScript build process today on a Windows machine. I wanted something that relied as much as possible on components natively installed on a modern Windows PC, so that the build process would be portable.

So, I broke out my rusty Windows Script Host skills.

First, I created a file called compile.wsf with the following contents:

image

When using cscript.exe, you can execute a more complex combination of scripts and include other script files by using a Windows Script File. The content of the file is an XML definition of jobs. A job represents a unit of work. If you only have one job in a file, its id won’t matter, as the script engine will select it by default. If you do have more than one job you’d like to store in a single WSF file, you can use the //Job:{id} parameter of cscript.exe to run a single job.

Using the WSF file, you can include other script files using a script element (much like the script tag in HTML). In the example above, I’ve referenced a local copy of handlebars.js and a custom script called compile.js.

You can also inline script, as shown above. After a basic check on the number of arguments passed to the script, the inline code calls compile, which is defined in the referenced compile.js.

The compile.js script is simple:

image

Using an instance of the FileSystemObject, first the input file is verified to exist. Next, both the input and output files are opened. The Handlebars object is available globally by including it in the WSF definition and is used to precompile the contents of the input file’s template definition. Once compiled, it’s written to the output file and both files are closed.

I put the three files in a folder called lib and created a simple batch file called handlebars.bat. It calls the cscript executable with the Windows Script File shown above as the first parameter, then passes along the values of the other parameters:

image

While this solution only works on Windows, it doesn’t hurt to keep the Windows Script Host in mind when throwing together general repeatable tasks that you:

  • consider too complex for a batch file
  • consider too simple for a full .NET application
  • require usage of existing JavaScript libraries, like Handlebars.js for some work

Finding duplicates in MongoDB via the shell

I thought this was an interesting question to answer on Stack Overflow (summarized here):

I’m trying to create an index, but an error is returned that duplicates exist for the field I want to index. What should I do?

I answered with one possibility.

The summary is that you can use the power of MongoDB’s aggregation framework to search and return the duplicates. It’s really quite slick.

For example, in the question, Wall documents had a field called event_time. Here’s one approach:

db.Wall.aggregate([
       {$group : { _id: "$event_time" ,  count : { $sum: 1}}},
       {$match : { count : { $gt : 1 } }} ])

The trick is to use the $group pipeline operator to select and count each unique event_time. Then, the $match stage keeps only those groups that contain more than one document.

While it’s not necessarily as readable as the equivalent SQL statement, it’s still easy to read. The only really odd thing is the mapping of the event_time into the _id. As all documents pass through the pipeline, the event_time is used as the new aggregate document key. The $ sign is used as the field reference to a property of the document in the pipeline (a Wall document). Remember that the _id field of a MongoDB document must be unique (and this is how the $group pipeline operator does its magic).

So, if the following event_times were in the documents:

event_time
4:00am
5:00am
4:00am
6:00pm
7:00am

It would result in the following set of aggregate documents from the $group stage:

_id count
4:00am 2
5:00am 1
6:00pm 1
7:00am 1

Notice how the _id is the event_time. Once the $match stage filters out the groups with a count of 1, the aggregate results would look like this:

{
        "result" : [
                {
                        "_id" : "4:00am",
                        "count" : 2
                }
        ],
        "ok" : 1
}

How to debug an underscore.js template

Given a simple template like this:

<div class="solution" data-question-id="<%= model.get('Id') %>">    
    <div class="title"><%= model.get('Name') %></div>
    <div class="company"><%= model.get('Company') %></div>
    <div class="version"><%= model.get('Version') %></div>
    <div class="detail"><%= model.get('Summary') %></div>
    <div class="actions">
        <a href="/#solution/<%= Id %>/<%= model.get('Id') %>">Detail</a>
    </div>
</div>

And using your favorite JavaScript interactive debugger (Visual Studio 2012 is my favorite when I’m doing MVC 4 Razor development), just add a debugger statement within your template temporarily:

<div class="solution" data-question-id="<%= model.get('Id') %>">
    <% debugger; %>
    <div class="title"><%= model.get('Name') %></div>

Assuming debugging is enabled, this will break (in Visual Studio for example) on the debugger line whenever your code template is executed.

The emitted template code thankfully has line feeds embedded so it’s readable:

function anonymous(obj,_) {
var __t,__p='',__j=Array.prototype.join,print=function(){__p+=__j.call(arguments,'');};
with(obj||{}){
__p+='<div class="solution" data-question-id="'+
((__t=( model.get('Id') ))==null?'':__t)+
'">\r\n    ';
 debugger; 
__p+='\r\n    <div class="title">'+
((__t=( model.get('Name') ))==null?'':__t)+
'</div>\r\n    <div class="company">'+
((__t=( model.get('Company') ))==null?'':__t)+
'</div>\r\n    <div class="version">'+
((__t=( model.get('Version') ))==null?'':__t)+
'</div>\r\n    <div class="detail">'+
((__t=( model.get('Summary') ))==null?'':__t)+
'</div>\r\n    <div class="actions">\r\n        <a href="/#solution/'+
((__t=( Id ))==null?'':__t)+
'/'+
((__t=( model.get('Id') ))==null?'':__t)+
'">Detail</a>\r\n    </div>\r\n</div>\r\n';
}
return __p;

}

You’ll see the debugger statement emitted in-line. It’s very handy for inspecting the values of variables, objects, etc. In the example above, I wanted to confirm that the model being passed was in the proper format.

I’ve used it many times to help debug a template that wasn’t working the way I’d expected.

This also works well in Chrome’s debugger.