The Mystique of CodeGen

For a few years now, there has been a lot of buzz about “CodeGen,” as if it were some magical technology that is going to completely transform the software creation process. The boring lines of code that monkeys could probably write will now be magically created for us, thanks to quantum leaps in technology.

The truth is, CodeGen is not only a grossly overused term, it is also a commonly misrepresented one. So let’s take a minute and look at what the term really means:

  • Code: text that is compiled into machine instructions
  • Gen: “Generated” by something other than human fingers

OK, so that all still sounds pretty good, right? But what is it that drives the Gen part? I’ve been around computers for about 25 years, and the one thing that has remained the same is that they will do exactly what we tell them. Nothing more, nothing less. So we must be telling them how to generate this time-saving code, right?

This is where I think it is important to make a distinction between different types of CodeGen.

  • Object Relational Mapping (ORM)
  • Meta Data / Configuration
  • Environmental

Given the ambiguous nature of these terms, I feel justified in taking a paragraph or two to describe why I see the CodeGen world split into these three categories.

Object Relational Mapping (ORM)

[Figure: ORM diagram]

ORM has gained quite a bit of notoriety due to the dramatic productivity increases that it can create. This technology essentially takes your database objects (tables, procedures, etc.) and maps them into classes in code. There are even sub-categories of ORM, depending on whether you have an existing database or plan to generate your database. However, for the purpose of this discussion, we’ll classify anything that directly maps code objects to relational database entities as ORM.
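The table-to-class mapping described above can be sketched in a few lines. This is only a minimal illustration, not how any particular ORM product works: it assumes an in-memory SQLite database, a made-up customer table, and a hypothetical SQL_TO_PY type mapping, and it reads the table definition to emit the source text of a matching class.

```python
import sqlite3

# Hypothetical mapping from SQLite column types to Python type names (an assumption).
SQL_TO_PY = {"INTEGER": "int", "TEXT": "str", "REAL": "float"}


def generate_class_source(conn, table):
    """Return Python source for a dataclass that mirrors the columns of `table`."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
    class_name = table.title().replace("_", "")
    lines = [
        "from dataclasses import dataclass",
        "",
        "",
        "@dataclass",
        f"class {class_name}:",
    ]
    for _cid, name, sql_type, *_rest in columns:
        # Unknown column types fall back to the generic `object` annotation.
        py_type = SQL_TO_PY.get(sql_type.upper(), "object")
        lines.append(f"    {name}: {py_type}")
    return "\n".join(lines) + "\n"


# Illustrative schema only; a real generator would point at an existing database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, balance REAL)")
source = generate_class_source(conn, "customer")
print(source)
```

The generated text is ordinary source code: it could be written to a file and checked in or compiled like anything typed by hand, which is the point of the exercise.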

Meta Data / Configuration

A less discussed method of generating code is to use configuration files, meta data, or other guidance to create the code. To me, the most interesting facet of this type of CodeGen is that most people use it and don’t even realize it. Have you ever used Visual Studio? If you have done any GUI development, you have probably seen the standard InitializeComponent(); method. The contents of this method are generated by Visual Studio from the form you lay out in the designer, along with the resources stored in your .resx file. And is this resulting text anything other than bona fide code?

Another commonly used form of this type of CodeGen is to use custom configuration data to generate code classes. Whether this source data is stored in an XML file, a database table, or any other type of data store, the purpose is the same. The same code can be regenerated consistently from the configuration data before each build, which blurs the line between application code and configuration.
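As a rough sketch of the idea, the snippet below turns a small block of configuration data into the source of a class. Everything in it is an assumption made for illustration: the JSON layout, the AppSettings class name, and the constant names are invented, and JSON stands in for whatever XML file or database table would hold the data in practice.

```python
import json

# Hypothetical configuration; in practice this could be an XML file or a database table.
CONFIG = json.loads("""
{
  "class_name": "AppSettings",
  "constants": {"MAX_RETRIES": 3, "SERVICE_NAME": "billing", "DEBUG": false}
}
""")


def generate_settings_source(config):
    """Render a Python class whose attributes come from the configuration data."""
    lines = [
        "# Generated file: edit the configuration, not this code.",
        f"class {config['class_name']}:",
    ]
    # Sorting keeps the output stable, so regenerating before each build is repeatable.
    for name, value in sorted(config["constants"].items()):
        # repr() produces a valid Python literal for ints, strings, and booleans.
        lines.append(f"    {name} = {value!r}")
    return "\n".join(lines) + "\n"


settings_source = generate_settings_source(CONFIG)
print(settings_source)
```

Running a generator like this as a pre-build step means a change to the configuration shows up as a change to the code on the next build, without anyone editing the class by hand.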


Environmental

I do feel justified in noting one additional category of code generation techniques. Environmental CodeGen refers to the process of gathering environmental data and using it to create code. In this context, environmental data can be server information, file properties, security information, or anything else that is dynamic and describes the state of a given environment. The distinction between environmental CodeGen inputs and configuration CodeGen inputs is that the designer or user may not know what the inputs will be for a given execution of the generation cycle.

My company uses this type of code generation to produce HTML that describes the versions of applications in all of the different technical environments. Although HTML walks the line of actually qualifying as code (since it is interpreted, not compiled), the concept holds true.
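A minimal sketch of environmental generation might look like the following. It is not the process my company runs: the collected properties (Python version, operating system, byte order) are stand-ins chosen only because they are available everywhere, and the function names are hypothetical. What it does show is that the inputs are discovered at generation time rather than written down beforehand.

```python
import html
import platform
import sys


def collect_environment():
    """Gather facts that are only known when the generator runs."""
    return {
        "python_version": platform.python_version(),
        "operating_system": platform.system(),
        "byte_order": sys.byteorder,
    }


def generate_version_page(env):
    """Render the collected environment data as a small HTML table."""
    rows = "\n".join(
        # Escape values so unexpected characters cannot break the markup.
        f"    <tr><td>{html.escape(key)}</td><td>{html.escape(str(value))}</td></tr>"
        for key, value in sorted(env.items())
    )
    return (
        "<table>\n"
        "    <tr><th>Property</th><th>Value</th></tr>\n"
        f"{rows}\n"
        "</table>\n"
    )


page = generate_version_page(collect_environment())
print(page)
```

The output differs from machine to machine, which is exactly what separates this category from configuration-driven generation.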

The last note about the environmental CodeGen approach reiterates an important distinction. A process that writes out a report should not be thought of as CodeGen. The results of the CodeGen operation should always be inputs into some compilation or other code utilization process. CodeGen is not simply scripting, but scripting can be CodeGen.

To summarize, I do believe that code generation is an amazing concept that everyone should be considering in today’s world of software development. It is important to spend time first understanding your problem, then knowing which approaches and tools will best suit your needs.