C# and VB.NET Line Count Utility – Version 1

This program has now been upgraded to version 2, which additionally deals with C++ .NET solutions. This can be downloaded from my later article.

Overview

The attached program is a line count utility written in C#. This:

  • Counts the number of lines in a .NET solution, project or individual code file.
  • Works with both C# and VB.NET.
  • Works with both Visual Studio 2003/.NET 1.1 and Visual Studio 2005/.NET 2.0 (but needs .NET 2.0 to run).
  • Provides a sortable grid of results so you can easily find your biggest projects or biggest code files.
  • Caches results at all levels and provides views onto them. For a solution you can see a view of all projects and their sizes, all code files and their sizes, or a view that combines both projects and code files.
  • Shows the number of blank lines at every level.
  • Shows the number of lines auto-generated by code-generators at every level (e.g. layout code on forms, or typed DataSet code).
  • Shows the number of lines of comments at each level.
  • Allows the grid to be copied into the clipboard in a format that can be pasted into Excel.
  • Comes in a fetching green and white colour scheme.

Background

I was recently asked how many lines of code there are in our current C# project, and how that compared with another similar project. The ‘other’ project is much bigger in terms of resources (numbers of developers), although it’s been running for slightly less time than our project. Our project has had two or three developers working on it for about a year.

I looked around for a line count utility on the internet, but couldn’t really find anything I liked the look of. So I upgraded an old VB6 line count utility I wrote several years ago. I used the VB6 to VB.NET upgrade wizard initially. It still amazes me that the upgrade wizard works at all, but in this case I got a VB.NET project (with VB6-style code) that compiled immediately. With a little work I got it counting code in individual C# projects.

This program told me we had about 180,000 lines of code in our entire C# solution. If you do the maths on that (two or three developers working for about a year) it comes out at about 1,500 lines of code per developer per week, or about 300 lines per day.

300 lines per day per developer of production code seemed very high, so I decided I needed a tool that could analyze the data in a little more detail. This program is the result of that. Below I will discuss why our developers (myself included) are nowhere near as productive as the initial analysis suggests.

Design – Code Containers

This is not a highly complex application, and there isn’t all that much to say about the design. However, it was clear early on that I would want classes that represented the three possible types of entity that the program can be run on. These are solutions, projects and individual code files. In addition I would want some polymorphic behaviour from these three classes. That is, I would want them to implement the same methods for things like counting lines and getting results.

I have implemented this using an abstract base class for the three classes, with abstract methods CountLines and PopulateResults. I call the solutions, projects and individual code files ‘code containers’, and hence the abstract base class is called ‘CodeContainer’. Note that semantically the code containers are not just the individual solution file, project file or code file (although they contain a reference to that file), but represent the associated code structures and the line counts on them.

In particular each code container contains a list of the other code containers that sit directly within it: a Solution object will contain a list of Project objects, each of which will in turn contain a list of CodeFile objects. These objects are then used to cache the line count results at the appropriate level: after calculation each CodeFile object will contain its own line count in its numberLines member variable, and the Project object will similarly contain the overall total number of lines in all its CodeFiles in its numberLines member variable. So in some ways this is a simple composite pattern, although it can only have three levels with specific types at each level.
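To make the caching structure concrete, here is a minimal sketch of how such a hierarchy could look. It is not the code from the download: only the names CodeContainer, Solution, Project, CodeFile, CountLines, PopulateResults and numberLines come from the description above. The method signatures, the helper methods and the in-memory array of lines (standing in for a file on disk) are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Base class: caches the count and holds the directly contained containers.
public abstract class CodeContainer
{
    protected readonly List<CodeContainer> children = new List<CodeContainer>();
    protected int numberLines; // cached line count for this level

    protected CodeContainer(string name) { Name = name; }

    public string Name { get; private set; }
    public int NumberLines { get { return numberLines; } }

    public abstract void CountLines();
    public abstract void PopulateResults(IList<string> results); // assumed signature

    // Hypothetical helper: counts every child and sums the cached totals.
    protected int CountChildren()
    {
        int total = 0;
        foreach (CodeContainer child in children)
        {
            child.CountLines();
            total += child.NumberLines;
        }
        return total;
    }

    // Hypothetical helper: one result row for this level, then its children.
    protected void AppendResults(IList<string> results)
    {
        results.Add(Name + ": " + numberLines);
        foreach (CodeContainer child in children)
        {
            child.PopulateResults(results);
        }
    }
}

public sealed class CodeFile : CodeContainer
{
    private readonly string[] lines; // in-memory stand-in for the file contents

    public CodeFile(string name, string[] lines) : base(name) { this.lines = lines; }

    public override void CountLines() { numberLines = lines.Length; }
    public override void PopulateResults(IList<string> results) { AppendResults(results); }
}

public sealed class Project : CodeContainer
{
    public Project(string name) : base(name) { }

    public void Add(CodeFile file) { children.Add(file); } // projects hold code files only

    public override void CountLines() { numberLines = CountChildren(); }
    public override void PopulateResults(IList<string> results) { AppendResults(results); }
}

public sealed class Solution : CodeContainer
{
    public Solution(string name) : base(name) { }

    public void Add(Project project) { children.Add(project); } // solutions hold projects only

    public override void CountLines() { numberLines = CountChildren(); }
    public override void PopulateResults(IList<string> results) { AppendResults(results); }
}

public static class Demo
{
    // Builds a tiny solution: one project containing two code files.
    public static Solution BuildSample()
    {
        var project = new Project("App.csproj");
        project.Add(new CodeFile("A.cs", new[] { "class A", "{", "}" }));
        project.Add(new CodeFile("B.cs", new[] { "class B {}", "" }));

        var solution = new Solution("App.sln");
        solution.Add(project);
        return solution;
    }
}
```

Calling CountLines on the object returned by Demo.BuildSample fills in the cached totals at all three levels, and PopulateResults then produces one row per container without recounting anything. The typed Add methods are one way of enforcing the ‘specific types at each level’ restriction.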

The CodeContainer abstract base class lets me cache the actual line counts in member variables in the base (since all the code containers need to store these), and provide a ToResultString method that just outputs these numbers with a bit of blurb. Finally a factory idiom (CodeContainerFactory) allows the correct code container to be instantiated when necessary, based on the extension of the file name.

All this means that the client code doesn’t need to know which type of code container it is dealing with: it instantiates the correct one by calling the CodeContainerFactory and then just calls the abstract methods on the base class when it needs to do something.
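As an illustration of the factory idea, the sketch below picks a container type from the file extension. The class names match the ones described above, but the Create method name, the Kind property and the exact list of extensions (.sln, .csproj, .vbproj, .cs, .vb) are assumptions on my part rather than a copy of the real CodeContainerFactory.

```csharp
using System;
using System.IO;

// Stripped-down containers: just enough to show what the factory returns.
public abstract class CodeContainer
{
    protected CodeContainer(string filePath) { FilePath = filePath; }

    public string FilePath { get; private set; }
    public abstract string Kind { get; } // hypothetical, for the demo only
}

public sealed class Solution : CodeContainer
{
    public Solution(string filePath) : base(filePath) { }
    public override string Kind { get { return "solution"; } }
}

public sealed class Project : CodeContainer
{
    public Project(string filePath) : base(filePath) { }
    public override string Kind { get { return "project"; } }
}

public sealed class CodeFile : CodeContainer
{
    public CodeFile(string filePath) : base(filePath) { }
    public override string Kind { get { return "code file"; } }
}

public static class CodeContainerFactory
{
    // Assumed method name; chooses the container from the file extension.
    public static CodeContainer Create(string filePath)
    {
        if (filePath == null) throw new ArgumentNullException("filePath");

        string extension = Path.GetExtension(filePath).ToLowerInvariant();
        switch (extension)
        {
            case ".sln":
                return new Solution(filePath);
            case ".csproj":
            case ".vbproj":
                return new Project(filePath);
            case ".cs":
            case ".vb":
                return new CodeFile(filePath);
            default:
                throw new ArgumentException("Unsupported file type: " + extension, "filePath");
        }
    }
}
```

Client code then only ever holds a CodeContainer reference, for example `CodeContainer container = CodeContainerFactory.Create("MyApp.sln");`, and never needs to name the concrete type.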

Usage

At start up the application opens a dialog to allow the user to select the solution, project or code file (.vb, .cs) that the program will run on initially. Once a file is selected the application calculates the line counts for that item and displays the results as below. Here a solution file has been selected and both project files and individual code files are being shown in the resulting grid:

[Screenshot: Line Count main window]

The grid can as usual be sorted by clicking the column headers. Here it has been sorted by the number of lines in individual code files.

Additional functionality is available on both the traditional menus and a context menu. These can be used to hide the code files and show only project files, with one line in the grid per project file (by clearing the check mark alongside ‘Show Code Files’):

[Screenshot: Line Count showing project files only]

For a simpler view at code file level, the application can also be used to show code files only (by checking ‘Show Code Files’ and clearing ‘Show Project Files’). The breakdown columns (numbers of blank lines, code designer lines and comments) can also be hidden using the ‘Show Breakdown’ menu option:

[Screenshot: Line Count showing code files only]

The other functionality on the menus is pretty self-explanatory.

If you want to copy the grid into Excel you can simply select the entire grid (Ctrl-A), copy to the clipboard (Ctrl-C) and then launch Excel and paste (Ctrl-V). In a later version of the application I will add a menu option to do all this.
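Excel accepts tab-separated text on paste, with one line per row. A future ‘copy to Excel’ menu option could therefore build a string along the following lines and place it on the clipboard. This is only a sketch: the GridExport class and its method are hypothetical names, and the rows are passed in as plain string arrays rather than read from the real grid.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class GridExport
{
    // Joins cells with tabs and rows with CR/LF, which Excel splits into
    // columns and rows when the text is pasted.
    public static string ToTabDelimited(IEnumerable<string[]> rows)
    {
        var builder = new StringBuilder();
        foreach (string[] row in rows)
        {
            for (int i = 0; i < row.Length; i++)
            {
                if (i > 0) builder.Append('\t');
                builder.Append(Clean(row[i]));
            }
            builder.Append("\r\n");
        }
        return builder.ToString();
    }

    // Tabs and line breaks inside a cell would break the layout, so they
    // are replaced with spaces; null cells become empty strings.
    private static string Clean(string cell)
    {
        return (cell ?? "").Replace('\t', ' ').Replace('\r', ' ').Replace('\n', ' ');
    }
}
```

In a Windows Forms application the resulting string could then be handed to Clipboard.SetText.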

Issues

There are some issues around the counting of auto-generated code with this application, particularly with Visual Studio 2003 projects. In Visual Studio 2005 we have auto-generated code neatly split into partial ‘designer’ files, which makes it much easier to identify and count. For Visual Studio 2003 I have tried to identify the auto-generated code regions, but have been forced to do this by looking for the #Region or #region strings that precede these regions. This probably isn’t the most accurate method of identifying this code. See method ‘SetCodeBlockFlags’ in CodeFile.cs.
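The region-based approach can be sketched roughly as follows. This is not the real SetCodeBlockFlags method, just an illustration of the idea under some stated assumptions: a region counts as auto-generated if its #region/#Region line contains the text ‘generated code’ (as in the Visual Studio 2003 ‘Windows Form Designer generated code’ region), lines starting with // or ' are treated as comments whatever the language, block comments are ignored, and blank lines inside a generated region are counted as generated rather than blank. The class and member names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public sealed class LineCounts
{
    public int Total;
    public int Blank;
    public int Comment;
    public int Generated;
}

public static class LineClassifier
{
    public static LineCounts Count(IEnumerable<string> lines)
    {
        var counts = new LineCounts();
        int regionDepth = 0; // greater than zero while inside a generated region

        foreach (string rawLine in lines)
        {
            string line = rawLine.Trim();
            counts.Total++;

            if (regionDepth > 0)
            {
                // Everything up to the matching end-of-region line is generated.
                counts.Generated++;
                if (IsRegionStart(line)) regionDepth++;
                else if (IsRegionEnd(line)) regionDepth--;
            }
            else if (IsRegionStart(line) &&
                     line.IndexOf("generated code", StringComparison.OrdinalIgnoreCase) >= 0)
            {
                counts.Generated++;
                regionDepth = 1;
            }
            else if (line.Length == 0)
            {
                counts.Blank++;
            }
            else if (line.StartsWith("//", StringComparison.Ordinal) ||
                     line.StartsWith("'", StringComparison.Ordinal))
            {
                counts.Comment++;
            }
        }
        return counts;
    }

    // Matches both the C# "#region" and the VB.NET "#Region" spelling.
    private static bool IsRegionStart(string line)
    {
        return line.StartsWith("#region", StringComparison.OrdinalIgnoreCase);
    }

    // Matches the C# "#endregion" and the VB.NET "#End Region" spelling.
    private static bool IsRegionEnd(string line)
    {
        return line.StartsWith("#endregion", StringComparison.OrdinalIgnoreCase)
            || line.StartsWith("#End Region", StringComparison.OrdinalIgnoreCase);
    }
}
```

The depth counter is there so that a region nested inside the generated block does not end the generated section early. It also shows why the method is fragile: a hand-written region whose name happens to contain ‘generated code’ would be miscounted.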

A further problem arises if your project references a web service. The proxy code for this is generated by Visual Studio in a file called ‘Reference.cs’. At the moment this is being identified by name and by the fact that it will have the text ‘Web References’ in its file path. Again, this isn’t a great solution.
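The check described here amounts to something like the sketch below. The class and method names are hypothetical; the two conditions (file name ‘Reference.cs’ and ‘Web References’ somewhere in the path) are the ones stated above.

```csharp
using System;

public static class GeneratedFileDetector
{
    // Returns true when the path looks like a Visual Studio web service proxy.
    public static bool IsWebReferenceProxy(string filePath)
    {
        if (string.IsNullOrEmpty(filePath)) return false;

        // Take the part after the last separator, accepting either slash style.
        int lastSeparator = filePath.LastIndexOfAny(new[] { '\\', '/' });
        string fileName = filePath.Substring(lastSeparator + 1);

        return string.Equals(fileName, "Reference.cs", StringComparison.OrdinalIgnoreCase)
            && filePath.IndexOf("Web References", StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

A hand-written file that happens to be called Reference.cs and sits under a folder named ‘Web References’ would be misclassified, which is the weakness mentioned above.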

Note that in any case only auto-generated code in .cs or .vb files is counted.

Analysis

The Line Count program showed us that whilst our project does have 180,000 lines of code, 100,000 of them are auto-generated by Microsoft’s code generators.

Of the 100,000 auto-generated lines 73,000 are in our data access component. Our application is a low-volume but reasonably complex product, and for ease of development we have extensively used typed DataSets to get our data out of our database. Those 73,000 lines of code are mainly in these typed DataSets. In addition 22,000 auto-generated lines of code (out of the 100,000) are in our presentation layer. As you’d expect these are mainly auto-generated layout code for our forms and user controls.

So we’re down to 80,000 lines of code written by developers. Of this, a further 10,000 lines are blank, and another 10,000 are comments. Even this exaggerates the size of the actual application code as we can see that our unit test project has 16,000 lines of code.

I expect these numbers are not untypical of enterprise .NET applications. I’d be interested in some statistics from other projects.

As for the ‘other’ project I mentioned above, that has 50,000 lines of code, 10,000 auto-generated, 5,000 blank, 6,000 comments (and no unit tests).

Conclusion

In the end all this goes to back up something that all developers know instinctively: using lines of code as a metric for the ‘size’ of an application really doesn’t make much sense. Maybe that’s why I couldn’t find a decent line count program in the first place.

However counting lines can provide some interesting analysis. We can see at a glance which our biggest classes are, and these are clearly candidates for refactoring. Also, if you look closely at the screenshots you can see that we probably have too much logic in our presentation layer compared to our model layer (middle tier business layer). We knew that already, but the line count statistics bring it home.

Downloads

Executable download.

Source code download.

48 thoughts on “C# and VB.NET Line Count Utility – Version 1”

  1. Rich,

    Thanks for writing this utility. I needed a line count util today to have a number to quote to a customer – they are impressed by how many lines you are supporting for them, even if we all know the metric is suspect.

    Anyway I needed to get the data out to Excel and as I hadn’t really read the article I wrote an additional menu item that does the export to Excel – uses COM interop. Drop me an email and I’ll send you the source back. The Excel export probably isn’t perfect but it is ‘good enough’.

    BTW, my customer’s project was 29403 lines, 12 classes (1381), 40 forms (21341 inc designer code), 34 crystal reports (150-250 lines each all auto-generated). Plus some other files. I haven’t done a complete analysis of it to see how much white space, comments etc.

    Thanks

    John

  2. I grabbed your utility which works fine on some of my smaller, simple solutions. But, when I point it to 2 of my larger Visual Studio 2003 solution files, it does not do anything. It shows 0 counts. These have 15 or more projects in them. It seems to work if I point at the individual projects, but would be nice if it could read the entire solution.

  3. Thanks for a truly stellar utility!

    I’ve been curious about how many lines of code our project has for quite some time and yours was the perfect application to provide that result.

    After looking at this, it’s time to get refactoring!
    Over 422,000 lines – only 85,000 designer-generated 🙂

  4. Sir,
    Can I use the code you wrote to count the number of lines in a C program?
    How do you come up with this kind of logic? I am trying to write a program in C# that counts the number of lines in a C program.

    mail me
    chowdary.ravipati@gmail.com

  5. Thanks for the code, it really helped me.

    But could I have code that counts folders and their subfolders,
    and the number of HTML and TXT pages separately?

  6. Thanks for this utility. I was using Code Metrics in VS2008 but that does not tell me the number of classes in my projects. I changed your source code to include # of classes in a project. EMail me if you are interested in it.

  7. Hi, I am using Visual Studio 2005, and loaded the .sln file. It shows it as blank. Do I need to load each file in my project individually?

  8. Thanks for the program! A great utility.

    And my project with two developers has the following statistics :
    Number of lines = 77,082, number of code files = 99, number of code-generated lines = 63,533, number of user-entered blank lines = 2,587, number of user-entered comments = 2,047

    I see that my framework has really come out good reducing the number of lines! thanks!

  9. I was looking for a way to count lines in a large web application that I developed at work, and found this. It didn’t quite work for me out of the box (.NET Web Projects don’t have project files so this utility only found and counted my Helpers code, which is compiled in its own proj file to a DLL). However, with the great documentation and design of the application I was able to modify the project in about an hour to do exactly what I wanted.

    Thanks for the great utility, and for making it open-source so I could make it work for me too!

    PS: If you want the code to make it work with Web Projects shoot me an email, I’d be happy to send it over. I did it with very minor changes to the CodeContainer and Factory classes and adding some items to the interface, all of which were needed to allow the utility to count based on Directory and added a class (called WebProject) which inherits and implements CodeContainer.

    1. Hi Ryan,
      Please send me the code how you implemented for web projects. I am developing web site using Visual Studio 2008.

  10. Ha, very nice work. I really like that I’m able to see a blank line and comment count. Not that line counting is brain surgery, but you delivered an app that cuts straight to the point and works exactly as I assumed it would. I didn’t even read your blog post, I just scrolled to the bottom, downloaded it, and ran it.

    All software should be this straightforward.

    thanks.

  11. Thanks so much. It really is the best utility I’ve used in a long time. The search for ‘asp dot net line count’ brought it up quickly and it worked just like you said it would. Loved the sorting columns. Cheers from Pakistan.

  12. Very useful. And a count with blank lines removed is great (it seems I have about 1 blank line for every 2 lines of code).

    +1 for you 🙂

  13. Thanks for the source. I added rough xaml support. Yay. If you want me to give you the update I can. It wasn’t hard. Just a new xml iterator for Page-es rather than Compile tags.

  14. I was just wondering how many lines of code were in my little project so thought I would give this a go. It is brilliant. I could not believe it when it told me 15000. You have done a great job here.
