Wednesday, September 24, 2008

Exception handling - Patterns, Classification, Best Practices

Exception handling is a programming language construct or computer hardware mechanism designed to handle the occurrence of some condition that changes the normal flow of execution.

Exception safety

A piece of code is said to be exception-safe if run-time failures within the code will not produce ill effects, such as memory leaks, garbled stored data or invalid output. Exception-safe code must satisfy invariants placed on the code even if exceptions occur.

There are several levels of exception safety:
Failure transparency, also known as the no-throw guarantee: Operations are guaranteed to succeed and satisfy all requirements even in the presence of exceptional situations. If an exception occurs internally, it is handled and is not propagated further up. (Best level of exception safety)

Commit or rollback semantics, also known as strong exception safety or no-change guarantee: Operations can fail, but failed operations are guaranteed to have no side effects so all data retain original values.

Basic exception safety: Partial execution of failed operations can cause side effects, but invariants on the state are preserved. Any stored data will contain valid values even if data has different values now from before the exception.

Minimal exception safety also known as no-leak guarantee: Partial execution of failed operations may store invalid data but will not cause a crash, and no resources get leaked.

No exception safety: No guarantees are made. (Worst level of exception safety)
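As an illustration of the commit-or-rollback level, here is a minimal C# sketch (using hypothetical types and names): all work is done on a copy, and the object's state is replaced only after every step has succeeded, so a failure leaves the original data untouched.

using System;
using System.Collections.Generic;

// Hypothetical example: strong exception safety via "do the work on a copy, then commit".
public class CustomerList
{
    private List<string> _names = new List<string>();

    public void AddRange(IEnumerable<string> newNames)
    {
        // Work on a copy so a failure cannot corrupt the current state.
        List<string> working = new List<string>(_names);

        foreach (string name in newNames)
        {
            if (string.IsNullOrEmpty(name))
                throw new ArgumentException("Names must not be empty.");
            working.Add(name.Trim());
        }

        // Commit: a single reference assignment, which cannot fail halfway.
        _names = working;
    }
}

If the loop throws, _names still refers to the old list, so callers observe either the complete change or no change at all.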

Exception handling Patterns

Here is the list of patterns and the questions defining the problems they address (for the complete exposition, visit here).

Error Object
What characterizes an error? How to structure and administer error information?

Exception Hierarchy

How to structure error types? What role does inheritance play in the structuring of errors?

Error Traps
What indicators are useful to detect erroneous situations and where to install the traps in the application code?

Assertion Checking Object

How to implement Error Traps in an object oriented language without using a generative approach?

Backtrace

How to collect and trace information that is useful to the system developers or the maintenance team, so that it supports them in analyzing the error situation, especially if we have no or limited access to the stack administered by the system itself?

Centralized Error Logging

How do you organize exception reporting so that you can offer your maintenance personnel good enough information for analyzing the branch offices' problems?

Error Handler
Where and how do you handle errors?

Default Error Handling
How do you ensure that you handle every possible exception correctly (no unhandled exception and limited damage)?

Error Dialog

How to signal errors to an application user?

Resource Preallocation
How to ensure error processing even when resources are scarce?

Checkpoint Restart
How do you avoid a complete rerun of a batch as a result of an error?

Exception Abstraction
How do you generate reasonable error messages without violating abstraction levels?

Exception Wrapper

How do you integrate a ready-to-use library into your exception handling system?
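As a small illustration of the Exception Wrapper idea, here is a hedged C# sketch (the ThirdPartyClient and MessagingException types are hypothetical): calls into the library are wrapped so that only exceptions from your own hierarchy cross the integration boundary, with the original exception preserved as InnerException.

using System;

// Hypothetical third-party class standing in for a ready-to-use library.
public class ThirdPartyClient
{
    public void Send(string message) { /* library call omitted in this sketch */ }
}

// Application-specific exception type that wraps library failures.
public class MessagingException : Exception
{
    public MessagingException(string message, Exception inner)
        : base(message, inner) { }
}

public class MessagingGateway
{
    private readonly ThirdPartyClient _client = new ThirdPartyClient();

    public void Send(string message)
    {
        try
        {
            _client.Send(message);
        }
        catch (Exception ex)   // in practice, catch the library's specific exception types
        {
            // Translate into the application's own exception hierarchy.
            throw new MessagingException("Sending the message failed.", ex);
        }
    }
}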

Multithread Exception Handling

How to schedule exceptions in a multithread environment?

Eric Lippert's classification


Writing good error handling code is hard in any language, whether you have exception handling or not. Eric Lippert classifies every exception into one of four buckets, which he labels fatal, boneheaded, vexing and exogenous. You can read the entire article here.

Fatal exceptions are not your fault, you cannot prevent them, and you cannot sensibly clean up from them.

Boneheaded exceptions are your own darn fault, you could have prevented them and therefore they are bugs in your code. These are all problems that you could have prevented very easily in the first place, so prevent the mess in the first place rather than trying to clean it up.

Vexing exceptions are the result of unfortunate design decisions. Vexing exceptions are thrown in a completely non-exceptional circumstance, and therefore must be caught and handled all the time.
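A typical example is parsing user input: Int32.Parse throws a FormatException for perfectly ordinary bad input, forcing a try/catch, whereas Int32.TryParse reports the same condition through its return value. A short sketch (the InputParsing class and its methods are hypothetical names for illustration):

using System;

public static class InputParsing
{
    // Vexing: int.Parse throws in a completely non-exceptional situation (bad user input).
    public static int ParseQuantityVexing(string userInput)
    {
        try
        {
            return int.Parse(userInput);
        }
        catch (FormatException)
        {
            return 0;
        }
    }

    // Better: the Try* pattern reports the same condition without an exception.
    public static int ParseQuantity(string userInput)
    {
        int quantity;
        return int.TryParse(userInput, out quantity) ? quantity : 0;
    }
}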

And finally, exogenous exceptions appear to be somewhat like vexing exceptions except that they are not the result of unfortunate design choices. Rather, they are the result of untidy external realities impinging upon your beautiful, crisp program logic.

Best practices

Here is the gist of what Daniel Turini discusses in his article Exception Handling Best Practices in .NET.


Plan for the worst

-Check it early
-Don't trust external data
-The only reliable devices are: the video, the mouse and keyboard.
-Writes can fail, too

Code Safely

-Don't throw new Exception()
-Don't put important exception information in the Message field
-Put a single catch (Exception ex) per thread
-Generic Exceptions caught should be published
-Log Exception.ToString(); never log only Exception.Message!
-Don't catch (Exception) more than once per thread
-Don't ever swallow exceptions
-Cleanup code should be put in finally blocks
-Use "using" everywhere
-Don't return special values on error conditions
-Don't use exceptions to indicate absence of a resource
-Don't use exception handling as means of returning information from a method
-Use exceptions for errors that should not be ignored
-Don't clear the stack trace when re-throwing an exception
-Avoid changing exceptions without adding semantic value
-Exceptions should be marked [Serializable]
-When in doubt, don't Assert, throw an Exception
-Each exception class should have at least the three original constructors
-Be careful when using the AppDomain.UnhandledException event
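The sketch below (with hypothetical type and method names) ties a few of these guidelines together: a [Serializable] exception type with the three standard constructors, cleanup via "using", logging Exception.ToString() rather than only Message, and re-throwing with a bare throw; so the stack trace is preserved.

using System;
using System.IO;

[Serializable]
public class OrderImportException : Exception
{
    public OrderImportException() { }
    public OrderImportException(string message) : base(message) { }
    public OrderImportException(string message, Exception inner) : base(message, inner) { }
}

public static class OrderImporter
{
    public static void Import(string path)
    {
        try
        {
            // "using" guarantees the reader is disposed even if an exception occurs.
            using (StreamReader reader = new StreamReader(path))
            {
                ProcessOrders(reader);
            }
        }
        catch (IOException ex)
        {
            Log(ex.ToString());          // log ToString(), never only Message
            throw;                       // bare "throw;" keeps the original stack trace
        }
    }

    // Hypothetical helpers, included only to keep the sketch compilable.
    private static void ProcessOrders(TextReader reader) { }
    private static void Log(string text) { Console.Error.WriteLine(text); }
}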

Bulk inserts and updates

A common development task is transferring data between disparate data sources. In this post I have outlined a few approaches that can help you do bulk inserts and updates.

1. Bulk inserts and updates by using the OpenXML method.
The OpenXML method is one approach to doing bulk inserts and updates with the different Microsoft .NET data providers.

Following are two good articles on this
http://support.microsoft.com/kb/315968
http://dotnet.org.za/ncode/archive/2007/05/14/Bulk-Update-and-Insert-of-Object-State-Stored-by-SQL-Server-2000-Using-.NET-C_2300_.aspx

2. SqlBulkCopy to streamline data transfers
The .NET Framework 2.0's SqlBulkCopy class allows you to easily move data programmatically from any data source to a SQL Server table.

Here are some articles that throw light on this
http://video.techrepublic.com.com/5100-10878_11-6187181.html
http://www.codeproject.com/KB/database/TransferUsingSQLBulkCopy.aspx
http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx
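A minimal sketch of the SqlBulkCopy approach (the connection string, the source DataTable and the destination table name dbo.Customers are assumptions for illustration):

using System.Data;
using System.Data.SqlClient;

public static class BulkLoader
{
    public static void Load(DataTable source, string connectionString)
    {
        using (SqlConnection connection = new SqlConnection(connectionString))
        {
            connection.Open();
            using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = "dbo.Customers"; // hypothetical target table
                bulkCopy.BatchSize = 1000;                       // commit in batches of 1000 rows
                bulkCopy.WriteToServer(source);                  // stream all rows to the server
            }
        }
    }
}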

3. Bulk copy large files using bcp
The bcp (bulk copy) command allows you to quickly bulk copy large files into SQL Server tables or views. With .NET Framework 1.1, you can utilize bcp via a SqlCommand object.

Following is an article that talks of bcp:

10 things you should know about Microsoft's SharePoint Services

SharePoint Services is touted as a document management system, and there's a built-in problem with that concept, because we all have a pretty fixed and mundane idea of what a document management system is. SharePoint's Web-centric orientation, however, gives it some unexpected punch, and may change your thinking. Here are some points to consider.

1. SharePoint extends Exchange Server
If you're using Exchange Server to handle your email traffic, SharePoint can greatly simplify distribution.

2. SharePoint collaboration solutions are scalable
Creating sites for team interaction, sharing and management of project-specific documents and files, testing, and other collaborative functions are a natural application of SharePoint. A less hyped aspect of SharePoint is that this collaborative utility is highly scalable.

3. SharePoint sites are highly customizable
SharePoint Services comes fully integrated with FrontPage 2003, so all of FrontPage's WYSIWYG Web editing tools are available for use in crafting SharePoint sites.

4. SharePoint extends InfoPath
InfoPath 2003 is Microsoft's desktop application technology for integrated forms management and data transport. Specifically, you’ll find it useful to publish InfoPath forms directly to a SharePoint library.

5. Metadata can be used to create dynamically parsed storage systems

Metadata is critical to the SharePoint Server concept, and comes in several flavors. With metadata you can effectively create customized search arguments that permit you to organize information dynamically, and to use search criteria from one document library to retrieve information from another.

6. SharePoint can be a data transport mechanism
Depending on what your organization's sites contain, content-wise, and the role(s) the sites are playing in your system, you can actually distribute data from server to server by means of SharePoint's site-moving utilities (see #10).
For instance, if you have SharePoint sites deployed internally to represent data in different workflow stages, the SharePoint content databases of those sites can be rotated in a de facto batch process using these utilities (which are Command Line programs and therefore scriptable).

7. Use the Task Pane to turn Word libraries into collaborative systems with built-in administration
You have a Task Pane that ties documents to libraries, and within it lie a number of important features that take you from the simple management of documents to real collaboration and administration. Through the Task Pane, you can:
track status and versioning of documents
define and track who has site/document access
do task monitoring
create alerts
You can, of course, save from all Office applications—not just Word—to SharePoint.

8. SharePoint can pull data from external databases and other data sources
Data View Web Parts allow you to add views to your sites from a variety of data sources. You can create views specific to your SharePoint sites and link views together. Data sources can be databases, Web services, or any XML source (InfoPath documents, etc.).

9. Leverage Excel for data management
Exporting data to Excel is well-supported in SharePoint and makes graphing and printing convenient (via the Print with Excel and Chart with Excel options). But it's also possible (and may often be desirable) to export data to Excel just for the sake of manageability. The Excel Export function creates an Excel Web query linking to the original data. In this way, you can create spreadsheets that will accept data, and then push that data to SharePoint.

10. Sites and entire site collections can be backed up in a single operation
The ability to move a site, lock-stock-and-barrel (and even more so a site collection, which includes primary site, sub-sites and all their contents), should not be under-appreciated. Anyone who's migrated sites the hard way knows it can be maddeningly frustrating. SharePoint Services includes two utilities that will greatly reduce the frustration: STSADM and SMIGRATE.

SMIGRATE is for backup/restore and for moving sites wholesale. It's a command line utility, so it's tailor-made for scripting, and can simplify the process of moving a site and its contents to the point that it can conceivably be a content distribution tool in some scenarios.

This is a cut down version of the very good article by Scott Robinson and can be read here.

Wednesday, September 17, 2008

Reporting Services 2008

Reporting Services provides companies with the ability to address a variety of reporting scenarios.

Managed Reporting

Also often referred to as enterprise reporting, managed reporting supports the creation of reports that span all aspects of the business and delivers them across the enterprise, giving every employee real-time access to information relevant to their business area and enabling better decision making.

Ad-Hoc Reporting

Enables users to create their own reports on an ad-hoc basis and gives them the flexibility to quickly get the information they need, in the format they need it, without submitting a request and waiting for a report developer to create the report for them.

Embedded Reporting

Enables organizations to embed reports directly into business applications and web portals, so users can consume reports within the context of their business process. Deep integration with Microsoft Office SharePoint Server 2007 also enables organizations to deliver reports through a central report library, or to use new web parts for thin rendering of reports directly within SharePoint, enabling easy creation of dashboards. In this way organizations can bring all business-critical data, structured as well as unstructured, from across the company together in one central location, providing one common experience for information access so that users can see key business performance information at a glance.

Authoring Reports
Report authoring is a major activity in many organizations. Executives, business analysts, managers, and increasingly information workers throughout the enterprise rely on timely and accurate information from easy to understand reports to perform their job effectively. SQL Server 2008 Reporting Services includes comprehensive report authoring tools, and a range of report format innovations that make it easy to create reports that bring data to life and provide the information that employees need in whatever format is most effective for your organization.

Using Report Development Tools

In most organizations, there are two distinct groups of people who create reports; experienced business intelligence solution developers who are used to working in a comprehensive development environment, and business users who are unfamiliar with database schema designs and need an intuitive report design environment that abstracts the underlying technical complexities.

SQL Server 2008 meets both of these needs by providing distinct report development tools specifically designed to meet the needs for these two audiences. This enables developers to create sophisticated reporting solutions for the entire enterprise, while making it easy for business users to focus on the specific data relevant for their business area.

Dundas visualizations platform

With the arrival of SSRS 2008, users gain out-of-the-box access to the Dundas visualizations platform. SSRS 2008 (as of the February 2008 CTP build) contains both the Dundas Gauge and Dundas Chart products. In addition, the previously mentioned Dundas press release states that Dundas Calendar will also be included in SSRS 2008.

Microsoft's fairly recent purchase of the Dundas source code for integration into SSRS 2008 is a great move, as the Dundas suite of SSRS add-ons has become the premier choice for such advanced visualization needs. By including the Dundas technologies in SSRS 2008, Reporting Services customers will not only gain access to a much improved Report Server architecture (without requiring IIS) but also an enhanced visualization platform. There are about three times as many chart types in SSRS 2008 as compared to SSRS 2005. Some of the brand new chart types include the Funnel, Range, Pyramid, and Polar. In addition to the added chart types, customers will also gain access to the Dundas Gauge capabilities via a new Gauge Data Region. Finally, we get a few other ‘goodies’ with the inclusion of the Dundas suite, including:
Secondary Axes
Runtime Calculated Series
WYSIWYG Chart Editor (design-time)

GRASP Patterns

GRASP stands for General Responsibility Assignment Software Patterns (or sometimes Principles). It is used in object oriented design, and gives guidelines for assigning responsibility to classes and objects...

Larman claims that GRASP can be used as a methodical approach to learning basic object design. These are patterns of assigning responsibilities. He also says that there are two types of responsibilities:

Knowing responsibilities include knowing about private encapsulated data, about related objects, and about things the object can derive or calculate.

Doing responsibilities include doing something itself (such as creating another object or performing a calculation), initiating action in other objects, and controlling and coordinating activities in other objects.

The full set of GRASP patterns is:
Information Expert
Creator
Controller
Low Coupling
High Cohesion
Polymorphism
Pure Fabrication
Indirection
Protected Variations

Information Expert

The Information Expert pattern provides the general principles associated with the assignment of responsibilities to objects. The information expert pattern states that responsibility should be assigned to the information expert—the class that has all the essential information. Systems that appropriately utilize the information expert pattern are easier to understand, maintain and expand, and they increase the possibility that an element can be reused in future development.

Related patterns are
Low Coupling/High Cohesion: The Expert pattern promotes low coupling by putting methods in the classes that have the information that the methods need. Classes whose methods only need the class’ own information have less need to rely on other classes. A set of methods that all operate on the same information tends to be cohesive.
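A small sketch in the spirit of Larman's classic example (the Sale and SalesLineItem classes are hypothetical): the total is computed by the classes that hold the information needed to compute it.

using System.Collections.Generic;

public class SalesLineItem
{
    public decimal UnitPrice { get; set; }
    public int Quantity { get; set; }

    // The line item knows its own price and quantity, so it computes its subtotal.
    public decimal Subtotal()
    {
        return UnitPrice * Quantity;
    }
}

public class Sale
{
    private readonly List<SalesLineItem> _lineItems = new List<SalesLineItem>();

    public void AddLineItem(SalesLineItem item)
    {
        _lineItems.Add(item);
    }

    // The sale knows its line items, so it is the expert for the grand total.
    public decimal Total()
    {
        decimal total = 0m;
        foreach (SalesLineItem item in _lineItems)
            total += item.Subtotal();
        return total;
    }
}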

Creator

The Creator pattern solves the problem of who should be responsible for the creation of a new instance of a class. The creator pattern is important because creation of objects is one of the most ubiquitous activities in an object-oriented system. A system that effectively utilizes the creator pattern can also support low coupling, increased understandability, encapsulation and the likelihood that the object in question will be capable of sustaining reuse. Given two classes, class B and class A, class B should be responsible for the creation of A if class B contains or compositely aggregates, records, closely uses, or holds the initializing information for class A. It can then be stated that B is a natural object to be a creator of A objects.

The Factory pattern is a common alternative to Creator when there are special considerations, such as complex creation logic. This is achieved by creating a Pure Fabrication object (see below), called Factory that handles the creation.
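A brief sketch of the Creator guideline with hypothetical Order and OrderLine classes: because Order aggregates its lines and holds their initializing data, Order is the natural creator of OrderLine instances.

using System.Collections.Generic;

public class OrderLine
{
    public string Product { get; private set; }
    public int Quantity { get; private set; }

    public OrderLine(string product, int quantity)
    {
        Product = product;
        Quantity = quantity;
    }
}

public class Order
{
    private readonly List<OrderLine> _lines = new List<OrderLine>();

    // Order contains OrderLines and has the information needed to initialize them,
    // so the Creator pattern assigns it the responsibility of creating them.
    public OrderLine AddLine(string product, int quantity)
    {
        OrderLine line = new OrderLine(product, quantity);
        _lines.Add(line);
        return line;
    }
}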

Controller

The Controller pattern assigns the responsibility of dealing with system events to a non-UI class that represents the overall system or a use case scenario. A use case controller should be used to deal with all system events of a use case, and may be used for more than one use case (for instance, for use cases Create User and Delete User, one can have one UserController, instead of two separate use case controllers). It is defined as the first object beyond the UI layer that receives and coordinates ("controls") a system operation. The controller should delegate to other objects the work that needs to be done; it coordinates or controls the activity. It should not do much work itself. The GRASP Controller can be thought of as being a part of the Application/Service layer (assuming that the application has made an explicit distinction between the App/Service layer and the Domain layer) in an object-oriented system with common layers.

Related patterns are
Pure Fabrication: The Controller pattern is a specialized form of the Pure Fabrication pattern.
Mediator: The Mediator pattern is used to coordinate events from a GUI. Like controller objects, a highly coupled and incohesive mediator object may involve less overall complexity than an arrangement that distributes the same responsibilities over more objects.

Low Coupling

Low Coupling is an evaluative pattern, which dictates how to assign responsibilities so as to support:
low dependency between classes;
low impact on a class from changes in other classes;
high reuse potential.

Related patterns are
Interface: One form of coupling between classes is the coupling between a subclass and its superclass. It is often possible to avoid subclassing by using the Interface pattern.
Mediator: It is not necessary or even always desirable for all of the classes in a design to have low coupling and high cohesion. Sometimes the overall complexity of a class can be reduced by concentrating complexity in one class. The Mediator pattern provides an example of that.
Composed Method: It is possible for methods to be uncohesive and difficult to work with. Some common causes are excessive length or too many execution paths within a method. The Composed Method pattern provides guidance on breaking up such methods into smaller, simpler and more cohesive methods.

High Cohesion

High Cohesion is an evaluative pattern that attempts to keep objects appropriately focused, manageable and understandable. High cohesion is generally used in support of Low Coupling. High cohesion means that the responsibilities of a given element are strongly related and highly focused. Breaking programs into classes and subsystems is an example of activities that increase the cohesive properties of a system. Alternatively, low cohesion is a situation in which a given element has too many unrelated responsibilities. Elements with low cohesion often suffer from being hard to comprehend, hard to reuse, hard to maintain and averse to change.

The related patterns are the same as those for Low Coupling.

Polymorphism

According to the Polymorphism pattern, responsibility of defining the variation of behaviors based on type is assigned to the types for which this variation happens. This is achieved using polymorphic operations.

Related patterns are
Dynamic Linkage: You can implement plug-ins or pluggable software components using a combination of polymorphism and the Dynamic Linkage pattern.
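To make the principle concrete, here is a compact sketch with hypothetical payment types: the behavior that varies by type (authorization) lives in the types themselves, so callers never switch on a type code.

public abstract class Payment
{
    // Each payment type supplies its own variation of this behavior.
    public abstract bool Authorize(decimal amount);
}

public class CreditPayment : Payment
{
    public override bool Authorize(decimal amount)
    {
        // contact the card processor (omitted in this sketch)
        return amount > 0;
    }
}

public class CheckPayment : Payment
{
    public override bool Authorize(decimal amount)
    {
        // verify the check (omitted in this sketch)
        return amount > 0;
    }
}

public class Register
{
    // The caller never inspects a payment type code; the runtime type decides.
    public bool TakePayment(Payment payment, decimal amount)
    {
        return payment.Authorize(amount);
    }
}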

Pure Fabrication

A pure fabrication is a class that does not represent a concept in the problem domain, specially made up to achieve low coupling, high cohesion, and the reuse potential thereby derived (used when a solution presented by the Information Expert pattern does not achieve these goals). This kind of class is called a "Service" in Domain-driven design.

Related patterns are
Low Coupling/High Cohesion: The point of the Pure Fabrication pattern is to maintain the low coupling and high cohesion of the classes in an object oriented design.

Indirection

The Indirection pattern supports low coupling (and reuse potential) between two elements by assigning the responsibility of mediation between them to an intermediate object. An example of this is the introduction of a controller component for mediation between data (model) and its representation (view) in the Model-view-controller pattern.

Related patterns are
Low Coupling/High Cohesion: The fundamental motivation for the Don’t Talk to Strangers pattern is to maintain low coupling.
Pure Fabrication: There are sometimes good reasons for calls made to classes added to a design using the Pure Fabrication pattern to violate the guidelines of the Don’t Talk to Strangers pattern.
Mediator: The Mediator pattern provides an example of a class created through pure fabrication that receives direct method calls from classes unrelated to it with a benefit that outweighs the disadvantages of the direct calls.

Protected Variations
The Protected Variations pattern protects elements from the variations on other elements (objects, systems, subsystems) by wrapping the focus of instability with an interface and using polymorphism to create various implementations of this interface.
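A short sketch of the idea, using a hypothetical tax-calculation variation point: the unstable element is hidden behind an interface, and polymorphism supplies the concrete variations without affecting the clients.

public interface ITaxCalculator
{
    decimal TaxFor(decimal amount);
}

public class UsTaxCalculator : ITaxCalculator
{
    public decimal TaxFor(decimal amount) { return amount * 0.07m; }
}

public class UkTaxCalculator : ITaxCalculator
{
    public decimal TaxFor(decimal amount) { return amount * 0.20m; }
}

public class InvoiceService
{
    private readonly ITaxCalculator _taxCalculator;

    // Clients depend only on the stable interface; tax rules can vary freely behind it.
    public InvoiceService(ITaxCalculator taxCalculator)
    {
        _taxCalculator = taxCalculator;
    }

    public decimal TotalWithTax(decimal amount)
    {
        return amount + _taxCalculator.TaxFor(amount);
    }
}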

"The critical design tool for software development is a mind well educated in design principles. It is not the UML or any other technology" (Larman, Craig. Applying UML and Patterns - Third Edition). Thus, GRASP is really a mental toolset, a learning aid to help in the design of object oriented software.

Gang of Four (GoF) patterns

The Gang of Four (GoF) patterns are generally considered the harbinger of the whole software patterns movement. They are categorized in three groups: Creational, Structural, and Behavioral.

· Creational patterns create objects for you rather than having you instantiate objects directly. This gives your program more flexibility in deciding which objects need to be created for a given case.

· Structural patterns help you compose groups of objects into larger structures, such as complex user interfaces or accounting data.

· Behavioral patterns help you define the communication between objects in your system and how the flow is controlled in a complex program.

Creational patterns

Abstract factory
Provide an interface for creating families of related or dependent objects without specifying their concrete classes.
Factory method
Define an interface for creating an object, but let subclasses decide which class to instantiate. Factory Method lets a class defer instantiation to subclasses.
Builder
Separate the construction of a complex object from its representation so that the same construction process can create different representations.
Prototype
Specify the kinds of objects to create using a prototypical instance, and create new objects by copying this prototype.
Singleton
Ensure a class only has one instance, and provide a global point of access to it.
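As a quick illustration of one of these, here is a common C# form of the Singleton pattern (a minimal sketch, not the only possible implementation, and the Configuration class name is hypothetical): the single instance is created by a static initializer and the private constructor prevents any other instantiation.

public sealed class Configuration
{
    // The CLR runs this static initializer only once, in a thread-safe way.
    private static readonly Configuration instance = new Configuration();

    public static Configuration Instance
    {
        get { return instance; }
    }

    // Private constructor: no other code can create an instance.
    private Configuration() { }
}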

Structural patterns

Adapter
Convert the interface of a class into another interface clients expect. Adapter lets classes work together that couldn't otherwise because of incompatible interfaces.
Bridge
Decouple an abstraction from its implementation so that the two can vary independently.
Composite
Compose objects into tree structures to represent part-whole hierarchies. Composite lets clients treat individual objects and compositions of objects uniformly.
Decorator
Attach additional responsibilities to an object dynamically. Decorators provide a flexible alternative to subclassing for extending functionality.
Facade
Provide a unified interface to a set of interfaces in a subsystem. Facade defines a higher-level interface that makes the subsystem easier to use.
Flyweight
Use sharing to support large numbers of fine-grained objects efficiently.
Proxy
Provide a surrogate or placeholder for another object to control access to it.

Behavioral patterns

Chain of responsibility
Avoid coupling the sender of a request to its receiver by giving more than one object a chance to handle the request. Chain the receiving objects and pass the request along the chain until an object handles it.
Command
Encapsulate a request as an object, thereby letting you parameterize clients with different requests, queue or log requests, and support undoable operations.
Interpreter
Given a language, define a representation for its grammar along with an interpreter that uses the representation to interpret sentences in the language.
Iterator
Provide a way to access the elements of an aggregate object sequentially without exposing its underlying representation.
Mediator
Define an object that encapsulates how a set of objects interact. Mediator promotes loose coupling by keeping objects from referring to each other explicitly, and it lets you vary their interaction independently.
Memento
Without violating encapsulation, capture and externalize an object's internal state so that the object can be restored to this state later.
Observer
Define a one-to-many dependency between objects so that when one object changes state, all its dependents are notified and updated automatically.
State
Allow an object to alter its behavior when its internal state changes. The object will appear to change its class.
Strategy
Define a family of algorithms, encapsulate each one, and make them interchangeable. Strategy lets the algorithm vary independently from clients that use it.
Template method
Define the skeleton of an algorithm in an operation, deferring some steps to subclasses. Template Method lets subclasses redefine certain steps of an algorithm without changing the algorithm's structure.
Visitor
Represent an operation to be performed on the elements of an object structure. Visitor lets you define a new operation without changing the classes of the elements on which it operates.

Tuesday, September 16, 2008

Design patterns

The origin of design patterns lies in work done by an architect named Christopher Alexander during the late 1970s. He began by writing two books, A Pattern Language and A Timeless Way of Building, which, in addition to giving examples, described his rationale for documenting patterns. The pattern movement then remained fairly quiet until 1987, when patterns appeared again at an OOPSLA conference.

Design patterns gained popularity in computer science after the book Design Patterns: Elements of Reusable Object-Oriented Software was published in 1994 (Gamma et al.). The Gang of Four (GoF) patterns are generally considered the foundation for all other patterns. They are categorized in three groups: Creational, Structural, and Behavioral.

What are Design Patterns?

In software engineering, a design pattern is a general reusable solution to a commonly occurring problem in software design. A design pattern is not a finished design that can be transformed directly into code. It is a description or template for how to solve a problem that can be used in many different situations. Object-oriented design patterns typically show relationships and interactions between classes or objects, without specifying the final application classes or objects that are involved. Algorithms are not thought of as design patterns, since they solve computational problems rather than design problems.

Why Study Design Patterns?

Now that we have an idea of what design patterns are, let us ask, "Why study them?" There are several reasons, some obvious and some not so obvious.

The most commonly stated reasons for studying patterns are because patterns allow us to:

Reuse solutions— By reusing already established designs, we get a head start on our problems and avoid gotchas. We get the benefit of learning from the experience of others. We do not have to reinvent solutions for commonly recurring problems.

Establish common terminology— Communication and teamwork require a common base of vocabulary and a common viewpoint of the problem. Design patterns provide a common point of reference during the analysis and design phase of a project.

However, there is a third reason to study design patterns:

Patterns give a higher-level perspective on the problem and on the process of design and object orientation. This frees us from the tyranny of dealing with the details too early.
Design patterns can speed up the development process by providing tested, proven development paradigms. Effective software design requires considering issues that may not become visible until later in the implementation. Reusing design patterns helps to prevent subtle issues that can cause major problems, and it also improves code readability for coders and architects who are familiar with the patterns.

Notable books in the design pattern genre include:

Fowler, Martin (2002). Patterns of Enterprise Application Architecture. Addison-Wesley. ISBN 978-0321127426.

Gamma, Erich; Richard Helm, Ralph Johnson, and John Vlissides (1995). Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley. ISBN 0-201-63361-2.

Hohpe, Gregor; Bobby Woolf (2003). Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions. Addison-Wesley. ISBN 0-321-20068-3.

Reporting Solution Alternatives

The following section discusses some common reporting solution alternatives. The alternatives usually represent an evolution in a company’s reporting sophistication. Generally, organizations start with some main reports from an OLTP (Online Transaction Processing) system. Once they meet the limitations of the OLTP system, they evolve their reporting into data warehouses. Eventually, even more complex reports and interactivity are required. This usually leads to the implementation of an OLAP system. We will take a look at each of these alternatives and their relative advantages.

Reporting with Relational Data (OLTP)

Transactional databases are designed to capture and manage real data as it is generated, for example, as products are purchased and as services are rendered. Relational databases are designed according to the rules of normal form and typically have many tables, each containing fragments of data rather than comprehensive information or business facts. This helps preserve the integrity and accuracy of data at the detail level, but it presents challenges for deriving useful information from a large volume of transactional data. In order to obtain information with meaningful context, tables must be joined and values must be aggregated.

For simple report requests, this usually is not an issue. Take the example of an invoice. An invoice is a simple report. It displays customer information along with detail for a small number of transactions. For this type of report, querying an OLTP system is not very costly and the query should be relatively straightforward. However, users will eventually move past these simple reports as they start to look for information for an entire year or product line. Developing these types of reports will eventually consume considerable resources on an OLTP system as well as require increasingly difficult queries. Although relational database systems may support complex queries, reporting against these queries routinely could prove to be slow and inefficient.

Relational Data Warehouses

Many organizations evolve away from reporting on their OLTP data. Usually their first step is to create a carbon copy of the OLTP system on another server. This alleviates the resource constraints on the original system, but it does not solve the issues around increasingly difficult queries. OLTP systems simply are not organized in a logical reporting structure.

To deal with increasing reporting needs, an entire industry has evolved to simply handle reporting. From this industry, individuals such as Ralph Kimball have refined standard patterns and methodologies for developing data warehouses. A common misconception is that a data warehouse is simply a denormalized transactional system. In reality, a data warehouse is another form of relational database that is organized into a reporting-friendly schema. Data is centered around what is known as a “fact” table. A fact table relates to business processes such as orders or enrollments. Radiating out from the fact table are dimensional tables. Dimensional tables contain attributes that further define the facts. These attributes could contain product names, geographic sales locations, or time and date information.

Relational data warehouses can significantly improve query performance on large data sets. However, they too have related drawbacks. These drawbacks generally relate to the fact that data is still stored in a relational format. Relational databases require joins to combine information. They also require aggregate functions to calculate summary-level detail. Both joins and aggregate functions can slow queries on very large sets of data. Relational databases also do not understand inherent associations in the data. Take the example of a product table. Each product has a related subcategory and each subcategory has a related category. If you need to create a report of product sales along with each product's percentage makeup of its related subcategory, you have to understand the relationship and write it into your query. The same holds true for time relationships. If you need to create a report that contains year-to-date information, you need to understand what the current date is as well as all the related periods in the same year. These things are possible in SQL queries but take additional effort and require more maintenance. That moves us into our next type of reporting alternative: OLAP.

Reporting with Multidimensional Data (OLAP)

Multidimensional databases take a much different approach to data retrieval and storage than relational databases. Multidimensional databases are organized into objects called cubes. Cubes act as a semantic layer above your underlying database. These databases can contain numerous different relationships and very large sets of aggregate data.

In a multidimensional database, information can be aggregated across many dimensions. This data is preprocessed into the multidimensional structure. Because it is preprocessed, query times are significantly reduced for large additive data sets. Multidimensional databases also have the advantage of understanding relationships between and across dimensions. This opens the door to creating calculations and reports that would be extremely difficult in a relational database.

Imagine that a user asks you to create a report that displays the top five customers with their top three products by this year’s sales amount and compared to last year’s sales amount. Writing a SQL query to return the top five customers is fairly straightforward. However, returning each one’s top three products would require additional subqueries because the relational database does not understand the association between products and customers. The final part of the request can prove even more burdensome. Returning a single year’s data is easy, but nesting that data next to last year’s data can prove almost impossible. The SQL query for the above scenario would most likely contain a number of nested queries as well as some creative use of temporary tables. Besides being a terribly complex query, it probably would not perform that well. On the other hand, Multidimensional Expressions (MDX), the language used to query multidimensional databases, can handle this in a few simple calls—not because MDX is a more advanced language, but simply because the underlying database understands the associations in the data and has stored this information for quick retrieval.
References: Professional SQL Server™ 2005 Reporting Services

Monday, September 8, 2008

Domain-Driven Design

The most complicated aspect of large software projects is not the implementation, it is the real world domain that the software serves. Over the last decade or two, a Domain-Driven Design philosophy has developed as an undercurrent in the object community.

What Is Domain-Driven Design?

Domain-driven design (DDD) is an approach to the design of software, based on the two premises that complex domain designs should be based on a model, and that, for most software projects, the primary focus should be on the domain and domain logic (as opposed to the particular technology used to implement the system). The term was coined by Eric Evans in his book of the same title.

Domain-driven design is not a technology or a methodology. It is a vision and approach for dealing with highly complex domains that is based on making the domain itself the main focus of the project, and maintaining a software model that reflects a deep understanding of the domain. It is a way of thinking and a set of priorities, aimed at accelerating software projects that have to deal with complicated domains. To accomplish that goal, teams need an extensive set of design practices, techniques and principles.

In DDD a number of specific software design patterns are useful, such as:

Entities (a.k.a. Reference Objects): An object in the domain model that is not defined by its attributes, but rather by a thread of continuity and identity.

Value Objects: An object that has no conceptual identity. These objects describe a characteristic of a thing.

Repository: methods for retrieving domain objects should delegate to a specialised 'repository' object such that alternative implementations may be easily interchanged.

Factory: methods for creating domain objects should delegate to a specialised 'factory' object such that alternative implementations may be easily interchanged.

Service: When an operation does not conceptually belong to any object. Following the natural contours of the problem, you can implement these operations in services. The Service concept is called "Pure Fabrication" in GRASP.
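The sketch below (with hypothetical domain types) shows how a few of these building blocks typically look in code: an entity identified by its Id, a value object defined purely by its attributes, and a repository interface whose implementation can be swapped.

using System;
using System.Collections.Generic;

// Entity: identity matters, attributes may change over time.
public class Customer
{
    public Guid Id { get; private set; }
    public string Name { get; set; }

    public Customer(Guid id, string name)
    {
        Id = id;
        Name = name;
    }
}

// Value object: no identity of its own, defined entirely by its values.
public class Money
{
    public decimal Amount { get; private set; }
    public string Currency { get; private set; }

    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }
}

// Repository: retrieval is delegated to a specialised object behind an interface,
// so alternative implementations (in-memory, database) can be interchanged.
public interface ICustomerRepository
{
    Customer FindById(Guid id);
    void Add(Customer customer);
    IList<Customer> FindByName(string name);
}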

Some of the very good books for reference are :

Domain-Driven Design: Tackling Complexity in the Heart of Software

In the book Domain-Driven Design a number of high-level concepts and practices are articulated, such as ubiquitous language, meaning that the domain model should form a common language for describing system requirements that works equally well for the business users or sponsors and for the software developers. The book is very focused on describing the Domain layer, which is one of the common layers in an object-oriented system with a multilayered architecture.

With this book in hand, object-oriented developers, system analysts, and designers will have the guidance they need to organize and focus their work, create rich and useful domain models, and leverage those models into quality, long-lasting software implementations.

Applying Domain-Driven Design and Patterns: With Examples in C# and .NET

From the Back Cover
“[This] is a book about design in the .NET world, driven in an agile manner and infused with the products of the enterprise patterns community. [It] shows you how to begin applying such things as TDD, object relational mapping, and DDD to .NET projects...techniques that many developers think are the key to future software development.... As the technology gets more capable and sophisticated, it becomes more important to understand how to use it well. This book is a valuable step toward advancing that understanding.”
– Martin Fowler, author of Refactoring and Patterns of Enterprise Application Architecture

Patterns, Domain-Driven Design (DDD), and Test-Driven Development (TDD) enable architects and developers to create systems that are powerful, robust, and maintainable.

Drawing on seminal work by Martin Fowler (Patterns of Enterprise Application Architecture) and Eric Evans (Domain-Driven Design), Jimmy Nilsson shows how to create real-world architectures for any .NET application. Nilsson illuminates each principle with clear, well-annotated code examples based on C# 1.1 and 2.0. His examples and discussions will be valuable both to C# developers and to those working with other .NET languages and any database, and even to those on other platforms, such as J2EE.

WPF Globalization and Localization

When you limit your product's availability to only one language, you limit your potential customer base to a fraction of our world’s 6.5 billion population. If you want your applications to reach a global audience, cost-effective localization of your product is one of the best and most economical ways to reach more customers.

Globalization is the design and development of applications that perform in multiple locations. For example, globalization supports localized user interfaces and regional data for users in different cultures. WPF provides globalized design features, including automatic layout, satellite assemblies, and localized attributes and commenting.

Localization is the translation of application resources into localized versions for the specific cultures that the application supports.

There are different approaches to implementing globalization and localization in WPF. Here is a list of a few (refer to the XAML localization blog by Robert):

Localization via resx files and a custom code generator for the public resources class (the generator is needed because of a WPF binding engine restriction: you can bind data only to public properties/fields in public classes). Pros: supports design time; easy to use (just create an ObjectDataProvider, put your resources class in it and bind your strings to dependency properties of objects in XAML). Cons: binding can be applied only to the dependency properties of DependencyObject descendants, and an additional Visual Studio installation (the custom resource generator tool) is required.

Localization via LocBaml (simplified variant: it uses a ResourceDictionary with localizable strings that are merged into Application.Resources.MergedDictionaries and referenced by means of the DynamicResource extension). It is still not straightforward enough, because LocBaml forces us to compile the binaries twice.

Localization via LocBaml (official MS tutorial)
The main difference here is that the article's author writes the wrappers for the resource classes manually. With almost 50 projects in a solution, writing that bunch of classes by hand is unappealing (Paul Stovell's idea of custom type descriptors could be useful here, or the wrapping could become less complicated via a file reference in Visual Studio). The main advantage of the described solution is the ability to switch languages at runtime.

Localization via xml files, XmlDataProvider and XPath queries in Binding: This is the most elegant solution among all listed here. Pros: design-time support and runtime language switching. The only disadvantage is the complexity of maintaining the files.

Other good references:

Claims-based Security

Traditional security models for intranet and Internet applications use some form of username and password to authenticate users. Client-server applications deployed to a common domain often rely on Windows credentials (NTLM or Kerberos), while services exposed to the Internet often require a username and password to be passed in an interoperable format (WS-Security) to be authenticated against a custom credential store. These scenarios are frequently accompanied by role-based security checks that authorize access to functionality. Although popular, role-based security is often too coarse an approach since custom roles are often necessary to represent different combinations of permissions or rights. Thus, applications are usually better off authorizing calls based on permissions granted, using roles to gather the appropriate permission set. This type of permission-based security model will provide a more fine-grained result over role-based security – the downside is that .NET doesn’t inherently support it so it requires more work to implement.

WCF introduces a claims-based approach to security at service boundaries, improving on role-based and permission-based security models. Claims can represent many different types of information including identity, roles, permissions or rights and even general information about the caller that may be useful to the application. A set of claims is also vouched for by an issuer such as a security token service, adding credibility to the information described by each claim – something not present in role-based or permission-based models. An additional benefit of using a claims-based security model is that it supports federated and single sign-on scenarios.
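As a rough sketch of what this looks like inside a WCF service operation (the "PlaceOrder" claim type URI is a hypothetical value; the real one depends on your issuer), the service inspects the caller's claim sets rather than checking a role name:

using System;
using System.IdentityModel.Claims;
using System.ServiceModel;

public static class OrderServiceSecurity
{
    // Hypothetical claim type URI, shown only for illustration.
    private const string PlaceOrderClaim = "http://example.com/claims/placeorder";

    public static bool CallerMayPlaceOrders()
    {
        // Walk the claim sets vouched for by the issuer(s) for the current caller.
        foreach (ClaimSet claimSet in
                 ServiceSecurityContext.Current.AuthorizationContext.ClaimSets)
        {
            foreach (Claim claim in claimSet)
            {
                if (claim.ClaimType == PlaceOrderClaim)
                    return true;
            }
        }
        return false;
    }
}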

Michele Leroux Bustamante has written a very good article on Claims-based Security Model (Part 1 Part 2 ). This two-part article will explain how claims-based security is supported by WCF, and show you how to implement a claims-based security model for your services.

Microsoft Code Name “Zermatt”

The Federated Identity team has offered a public beta of Microsoft Code Name "Zermatt". Zermatt is a framework for implementing claims-based identity in your applications. By using it, you’ll more easily reap the benefits of the claims-based identity model described in this paper. For more information, see Zermatt White Paper for Developers.

Friday, September 5, 2008

Remote Debugging

Today we faced an interesting challenge. We had a service running remotely and we needed to debug it. In the end we took the easiest option: VS2008 was installed on the QA machine, so we debugged locally there. Still, my search for remote debugging led me to some very interesting articles.

This howto focuses mainly on setting up and configuring Visual Studio 2008 (should work with Visual Studio 2005 as well) remote debugging where the client machine that runs the IDE is not on a domain or is not on the same domain as the target machine. You may as well find this howto useful if you connect via a virtual private network and want to develop or debug your server applications from there or you have problems connecting to the remote debugger service.

http://support.microsoft.com/kb/318041
This step-by-step article describes how to set up and use remote debugging in Microsoft Visual Studio .NET or in Microsoft Visual Studio 2005.

Remote debugging is considered one of the toughest topics in ASP.NET, but it's a really cool feature and is really helpful when we cannot have a local Web server or when we have to store the applications at a centralized location. This column covers how to set up and use remote debugging in Visual Studio 2005.

Thursday, September 4, 2008

Exception handling in Sql Server

Exception handling is a programming language construct or computer hardware mechanism designed to handle the occurrence of some condition that changes the normal flow of execution.
Error handling plays a vital role when writing stored procedures or scripts. SQL Server has matured over the years, and here is how error handling is addressed.

SQL Server Versions before 2005

SQL Server versions before 2005 offered only one simple way to work with exceptions: the @@ERROR function. This function can be used to determine if an error occurred in the last statement that was executed before evaluating @@ERROR.
For example:

SELECT 1/0
SELECT @@ERROR
-----------
Msg 8134, Level 16, State 1, Line 1
Divide by zero error encountered.
-----------
(1 row(s) affected)

In this case @@ERROR returns 8134, which is the error number for a divide-by-zero error.
Using @@ERROR, you can detect errors and control them to some degree. However, proper use of this function requires that you check it after every statement; otherwise it will reset, as shown in the following example:

SELECT 1/0
IF @@ERROR <> 0
BEGIN
SELECT @@ERROR
END
-----------
Msg 8134, Level 16, State 1, Line 1
Divide by zero error encountered.
-----------
0

(1 row(s) affected)

Trying to catch the error in this case actually ends up resetting it; the @@ERROR in the SELECT returns 0 rather than 8134 because the IF statement did not throw an exception.

In addition to the fact that the exception resets after each statement, @@ERROR does not actually handle the exception -- it only reports it. The exception is still sent back to the caller, meaning that even if you do something to fix the exception in your T-SQL code, the application layer will still receive a report that it occurred. This can mean additional complexity when creating application code because you need to handle exceptions that may needlessly bubble up from stored procedures.

SQL Server 2005

In SQL Server 2005, exceptions can now be handled with a new T-SQL feature: TRY/CATCH blocks. This feature emulates the exception handling paradigm that exists in many languages derived from the C family, including C/C++, C#, Java and JavaScript. Code that may throw an exception is put into a try block. Should an exception occur anywhere in the code within the try block, code execution will immediately switch to the catch block, where the exception can be handled.

The term "catch" is of special importance here. When TRY/CATCH is used, the exception is not returned to the client. It is "caught" within the scope of the T-SQL that caused it to be thrown.
For an example of TRY/CATCH, consider a divide-by-zero error:

BEGIN TRY
SELECT 1/0
END TRY
BEGIN CATCH
SELECT 'Error Caught'
END CATCH
-----------
(0 row(s) affected)
------------
Error Caught
(1 row(s) affected)

When this batch is run, no exception is reported. Instead, the message "Error Caught" is selected back. Of course, your T-SQL code does not have to send back any kind of specific message in the CATCH block. Any valid T-SQL can be used, so you can log the exception or take action to remedy the situation programmatically, all without reporting it back to the caller.

While merely being able to catch an exception is a great enhancement, T-SQL is also enhanced with new informational functions that can be used within the CATCH block. These functions are: ERROR_MESSAGE(), ERROR_NUMBER(), ERROR_LINE(), ERROR_SEVERITY(), ERROR_STATE() and ERROR_PROCEDURE(). Unlike @@ERROR, the values returned by these functions will not reset after each statement and, as a result, the functions will return consistent values over the entire time a CATCH block is executed.
For instance:
BEGIN TRY
SELECT 1/0
END TRY
BEGIN CATCH
SELECT 'Error Caught'
SELECT ERROR_MESSAGE(), ERROR_NUMBER()
END CATCH
-----------
(0 row(s) affected)
------------
Error Caught
(1 row(s) affected)
-------------------------------------------- ---------------
Divide by zero error encountered. 8134
(1 row(s) affected)
In this example, the ERROR_MESSAGE() and ERROR_NUMBER() functions return the correct values, even though a SELECT occurred between the exception and evaluation of the functions -- quite an improvement over @@ERROR!
References

Exception handling best practices in SQL Server 2005

Coding and Design Guidelines

A coding standard is a set of guidelines, rules and regulations on how to write code. Usually a coding standard includes guidelines on how to name variables, how to indent the code, how to place parentheses and keywords, etc. The idea is to be consistent in programming so that, when multiple people work on the same code, it becomes easier for one to understand what others have done. Even for individual programmers, and especially for beginners, it is very important to adhere to a standard when writing code: when we look at our own code after some time, if we have followed a coding standard, it takes less time to understand or remember what we meant when we wrote some piece of code.
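As a tiny illustration (these particular choices are just one common C# convention, not a universal rule), a standard typically pins down decisions such as the following:

// PascalCase for types and public members, camelCase for locals and parameters,
// an underscore prefix for private fields, and braces on their own lines.
public class InvoiceCalculator
{
    private readonly decimal _taxRate;

    public InvoiceCalculator(decimal taxRate)
    {
        _taxRate = taxRate;
    }

    public decimal CalculateTotal(decimal netAmount)
    {
        decimal tax = netAmount * _taxRate;
        return netAmount + tax;
    }
}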

C# Coding Guidelines

Here are a number of articles that can serve as good references.

http://www.csharpfriends.com/Articles/getArticle.aspx?articleID=336
http://blogs.msdn.com/ericgu/archive/2004/01/19/60315.aspx
http://www.tiobe.com/standards/gemrcsharpcs.pdf

Design Guidelines for Class Library Developers

The .NET Framework's managed environment allows developers to improve their programming model to support a wide range of functionality. The goal of the .NET Framework design guidelines is to encourage consistency and predictability in public APIs while enabling Web and cross-language integration. It is strongly recommended that you follow these design guidelines when developing classes and components that extend the .NET Framework. Inconsistent design adversely affects developer productivity. Development tools and add-ins can turn some of these guidelines into de facto prescriptive rules, and reduce the value of nonconforming components. Nonconforming components will function, but not to their full potential.


These guidelines are intended to help class library designers understand the trade-offs between different solutions. There might be situations where good library design requires that you violate these design guidelines. Such cases should be rare, and it is important that you provide a solid justification for your decision. The section provides naming and usage guidelines for types in the .NET Framework as well as guidelines for implementing common design patterns.

Refer to MSDN for the design guidelines above at http://msdn.microsoft.com/en-us/library/czefa0ke.aspx

Source Code Comments

In computer programming, a comment is a programming language construct used to embed information in the source code of a computer program.

According to Jeffrey Kotula,"Source code documentation is a fundamental engineering practice critical to efficient software development. Regardless of the intent of its author, all source code is eventually reused, either directly, or just through the basic need to understand it. In either case, the source code documentation acts as a specification of behavior for other engineers. Without documentation, they are forced to get the information they need by making dangerous assumptions, scrutinizing the implementation, or interrogating the author. These alternatives are unacceptable. Although some developers believe that source code "self-documents", there is a great deal of information about code behavior that simply cannot be expressed in source code, but requires the power and flexibility of natural language to state. Consequently, source code documentation is an irreplaceable necessity, as well as an important discipline to increase development efficiency and quality."(For more quotes click here).


C# offers several XML tags that can be placed directly within your source files to document the code, and Microsoft documents these tags very nicely in Visual Studio .NET help files. Once the developer has documented her source using the XML tags, she can use Sandcastle to produce integrated .chm files that contain the source documentation.
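For instance, a documented method typically looks like the sketch below (the class and method names are hypothetical); Sandcastle, or the compiler's /doc option, picks these tags up and turns them into help topics:

using System;

public class InvoiceMath
{
    /// <summary>
    /// Calculates the gross amount for an invoice line.
    /// </summary>
    /// <param name="netAmount">The net amount before tax.</param>
    /// <param name="taxRate">The tax rate as a fraction, for example 0.07.</param>
    /// <returns>The net amount plus tax.</returns>
    /// <exception cref="ArgumentOutOfRangeException">
    /// Thrown when <paramref name="taxRate"/> is negative.
    /// </exception>
    public decimal CalculateGross(decimal netAmount, decimal taxRate)
    {
        if (taxRate < 0)
            throw new ArgumentOutOfRangeException("taxRate");
        return netAmount * (1 + taxRate);
    }
}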


Although the description above and in the article may use the term ‘comment’, the XML tags are used to produce external source documentation. This differs from traditional code comments in that the realized documentation can be distributed as API (Application Programmer’s Interface) documentation. (For more details refer here.)
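For illustration, a method documented with the standard XML tags might look like the following (the method and its behavior are invented for the example):

/// <summary>
/// Calculates the outstanding amount for the given customer.
/// </summary>
/// <param name="customerId">The unique identifier of the customer.</param>
/// <returns>The total outstanding amount.</returns>
/// <exception cref="System.ArgumentException">
/// Thrown when <paramref name="customerId"/> is not a valid identifier.
/// </exception>
public double GetOutstandingAmount(int customerId)
{
    // ... implementation elided ...
    return 0.0;
}

The compiler writes these comments to an XML file (enabled on the project's Build property page), which Sandcastle then merges with the reflection data to produce the help files.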


The following outlines the programs you need to install so you can create proper inline API (MSDN-style) documentation.

Sandcastle

Sandcastle produces accurate, MSDN-style, comprehensive documentation by reflecting over the source assemblies and optionally integrating XML documentation comments. Sandcastle has the following key features:

  • Works with or without authored comments
  • Supports Generics and .NET Framework 1.1, 2.0, 3.0, 3.5 and other released versions of the Framework
  • Has two main components (MrefBuilder and Build Assembler)
  • MrefBuilder generates a reflection XML file for Build Assembler
  • Build Assembler handles syntax generation, transformation, etc.
  • Sandcastle is used to build the Microsoft Visual Studio and .NET Framework documentation

You can download it from CodePlex http://www.codeplex.com/Sandcastle

SHFB (Sandcastle Help File Builder)

SHFB provides a GUI that looks almost identical to the NDoc interface, so anyone familiar with NDoc should be quite comfortable using it. It uses the underlying Sandcastle tools to generate an HTML Help 1.x (.chm) file, an MS Help 2.x (.HxS) file, and/or a web site.

You can download SHFB from CodePlex http://www.codeplex.com/SHFB

DocProject for 2008

DocProject drives the Sandcastle help generation tools using the power of Visual Studio 2005/2008 and MSBuild. Choose from various project templates that build compiled help version 1.x or 2.x for all project references. DocProject facilitates the administration and development of project documentation with Sandcastle, allowing you to use the integrated tools of Visual Studio to customize Sandcastle's output.

You can download DocProject from CodePlex http://www.codeplex.com/DocProject

Some of my older articles

A few years back I wrote a number of articles; here are the details.

CREATE A SITE SEARCH ENGINE IN ASP.NET

My article “CREATE A SITE SEARCH ENGINE IN ASP.NET” was published on http://www.developerfusion.com/show/4389/ and was also printed in Developers Magazine. It was written with best practices in mind and exploited the object-oriented features of .NET, the DataGrid features, and the features of ADO.NET.

THREE TIER CODE GENERATORS

The other article worth mentioning is "THREE TIER CODE GENERATORS" at http://www.codeproject.com/aspnet/ThreeTierCodeGenerator.asp. It generates all the code required for a three-tier architecture, ready to be used directly in Visual Studio .NET, and makes use of .NET patterns and best practices.

Wednesday, September 3, 2008

Refactoring




    1. Introduction

Refactoring is the process of restructuring code using a disciplined technique.




    2. Benefits

The benefits of refactoring include:



  • more maintainable code

  • easier to read and understand

  • easier to modify and add new features

Refactoring plays an important role in Extreme Programming (XP) where aggressive refactoring is encouraged.




    3. Unit Testing

Unit testing is a crucial component of refactoring.


1) Make sure the code being refactored has sufficient unit tests.


2) Make sure the code passes the unit tests.


3) Refactor.


4) Re-run the unit tests.
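As a minimal sketch of steps 1, 2 and 4 (assuming an NUnit-style test framework and an invented PriceCalculator class as the code being refactored):

using System;
using NUnit.Framework;

[TestFixture]
public class PriceCalculatorTests
{
    [Test]
    public void GetTotal_UsesWinterRate_ForWinterDates()
    {
        // Hypothetical class whose GetTotal method is about to be refactored.
        PriceCalculator calculator = new PriceCalculator();

        double total = calculator.GetTotal(10, new DateTime(2008, 1, 15));

        // The expected value must be the same before and after the refactoring.
        Assert.AreEqual(125.0, total);
    }
}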




    4. When to Refactor

Guidelines indicating when to refactor include:



  • When doing a code review.

  • When adding new functionality.

  • When bug-fixing.

  • Whenever you have trouble understanding code.



    5. Smelly Code

Martin Fowler uses "code smells" to describe indicators of when to refactor. Some of these include:


1) Duplicated code. Extract out the common code into a method.


2) Long methods. Break up some of the code into separate methods.


3) Shotgun surgery - changes in one class require making changes in several others. Try to move all related pieces of code into a single, cohesive class.


4) Poor names - Rename data and methods to something more meaningful.




    6. Rules


  • Extract Method

You have a code fragment that can be grouped together. Turn the fragment into a method whose name explains the purpose of the method.



    // Before
    void PrintOwing()
    {
        PrintBanner();

        // print details
        System.Console.WriteLine("name: " + _name);
        System.Console.WriteLine("amount: " + getOutstanding());
    }

    // After
    void PrintOwing()
    {
        PrintBanner();
        PrintDetails(getOutstanding());
    }

    void PrintDetails(double outstanding)
    {
        System.Console.WriteLine("name: " + _name);
        System.Console.WriteLine("amount: " + outstanding);
    }



  • Pull Up Method


    You have methods with identical results on subclasses. Move them to the superclass.
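A minimal sketch (the Employee, Salesman and Engineer classes are invented for the example):

    // Before: both subclasses define an identical method.
    class Salesman : Employee
    {
        public string GetName() { return FirstName + " " + LastName; }
    }

    class Engineer : Employee
    {
        public string GetName() { return FirstName + " " + LastName; }
    }

    // After: the common method is pulled up into the superclass.
    class Employee
    {
        protected string FirstName;
        protected string LastName;

        public string GetName() { return FirstName + " " + LastName; }
    }

    class Salesman : Employee { }

    class Engineer : Employee { }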





  • Rename Method


    The name of a method does not reveal its purpose. Change the name of the method.
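For example (an invented method):

    // Before: the name does not say what the method does.
    public double Calc(double amount) { return amount * 0.9; }

    // After: the name reveals the intent.
    public double ApplyTenPercentDiscount(double amount) { return amount * 0.9; }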






  • Decompose Conditional


    You have a complicated conditional (if-then-else) statement. Extract methods from the condition, the "then" part, and the "else" part.



    // Before
    if (date.before(SUMMER_START) || date.after(SUMMER_END))
        charge = quantity * _winterRate + _winterServiceCharge;
    else
        charge = quantity * _summerRate;

    // After
    if (notSummer(date))
        charge = winterCharge(quantity);
    else
        charge = summerCharge(quantity);



  • Remove Double Negative


    You have a double negative conditional. Make it a single positive conditional



    if ( !item.isNotFound() )






    if ( item.isFound() )



    Motivation



    Double negatives are often frowned on by mavens of natural language. Often this frowning is inappropriate - certainly in English the double negative has its uses.



    But that is not true in programming. There double negatives are just plain confusing. So kill them on sight.



    Mechanics




    • If you don't have a method with the opposite sense, create one. The body can just call the original (you can fix that later).

    • Compile

    • Replace each double negative with a call to the new function

    • Compile and test after each replace

    • If the negative form of the method isn't used elsewhere, use Inline Method to inline it into the positive form



  • Extract Interface


    Several clients use the same subset of a class's interface, or two classes have part of their interfaces in common. Extract the subset into an interface.
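A small sketch (the Employee class, the IBillable interface and the TimeSheet client are invented names): two clients only need the rate and special-skill information, so that subset is extracted into an interface:

    // The subset of Employee that several clients actually use.
    public interface IBillable
    {
        double GetRate();
        bool HasSpecialSkill();
    }

    public class Employee : IBillable
    {
        private double _rate;
        private bool _hasSpecialSkill;

        public double GetRate() { return _rate; }
        public bool HasSpecialSkill() { return _hasSpecialSkill; }

        // ... other members that IBillable clients never touch ...
    }

    public class TimeSheet
    {
        // Clients now depend only on the narrow interface.
        public double Charge(IBillable employee, int days)
        {
            return employee.GetRate() * days * (employee.HasSpecialSkill() ? 1.5 : 1.0);
        }
    }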






    7. More Rules


  • Unit Testing: Don't Even Start Without It. You won't know your refactoring didn't break the code unless you have unit test coverage of the affected code. If you don't have that coverage already, add new tests.


  • Keep Each Refactoring Narrowly-Focused. Don't combine unrelated changes; refactor each one separately.


  • Keep Your Unit Tests Fine-Grained. If a refactoring involves multiple functions, you're best off having unit tests for each function. If you depend on a single unit test for a function that then calls the refactored functions, a test failure leaves you not knowing which refactored function failed.


  • Use Many Small Refactorings, Even When It Seems Inefficient. I prefer making a small refactoring even when I know that a subsequent refactoring will change that same code yet again. This is critical for refactoring tangled, poorly-written code.


  • Unit Test Each Refactoring. Don't wait.


  • Refactor Separate Functionality Into Separate Functions. If I encounter code that combines too much distinct functionality into one function, I try to break it apart into separate, individually testable functions, with unit tests for each new function.



    8. Refactor Mercilessly (Extreme Programming approach)


We computer programmers hold onto our software designs long after they have become unwieldy. We continue to use and reuse code that is no longer maintainable because it still works in some way and we are afraid to modify it. But is it really cost effective to do so? Extreme Programming (XP) takes the stance that it is not. When we remove redundancy, eliminate unused functionality, and rejuvenate obsolete designs we are refactoring. Refactoring throughout the entire project life cycle saves time and increases quality.


Refactor mercilessly to keep the design simple as you go and to avoid needless clutter and complexity. Keep your code clean and concise so it is easier to understand, modify, and extend. Make sure everything is expressed once and only once. In the end it takes less time to produce a system that is well groomed.


There is a certain amount of Zen to refactoring. It is hard at first because you must be able to let go of that perfect design you have envisioned and accept the design that was serendipitously discovered for you by refactoring. You must realize that the design you envisioned was a good guide post, but is now obsolete.


A caterpillar is perfectly designed to eat vast amounts of foliage, but it can't find a mate; it must refactor itself into a butterfly before it can search the sky for others of its own kind. Let go of your notions of what the system should or should not be and try to see the new design as it emerges before you.


Measure your Application Performance with Performance Counters



A counter is the mechanism by which performance data is collected. In hardware terms it is the part of a modern microprocessor that measures and gathers performance-relevant events without affecting the performance of the running program. In Windows, the registry stores the names of all the counters, each of which is related to a specific area of system functionality. Examples include a processor's busy time, memory usage, or the number of bytes received over a network connection.

Windows uses performance counters to collect and present performance data from running processes. Windows itself provides hundreds of performance counters, each monitoring a specific aspect of your system, from CPU utilization to network traffic. In addition, other applications such as SQL Server or Exchange publish their own custom performance counters that integrate into the Windows performance monitoring system.

Windows performance counters allow your applications and components to publish, capture, and analyze the performance data that applications, services, and drivers provide. You can use this information to determine system bottlenecks, and fine-tune system and application performance.

But developers will always find the need to have counters specific to their applications. For example, an accounting application can have counters like Accounts created, Transactions processed, total running balance, number of users logged on to the application and so on. You might also use a performance counter to track the number of orders processed per second or the number of users currently connected to the system.

Categories

The counter information must include the category, or performance object, that the counter measures data for. A computer's categories include physical components, such as processors, disks, and memory. There are also system categories, such as processes and threads. Each category is related to a functional element within the computer and has a set of standard counters assigned to it.
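As a rough sketch (the output will differ from machine to machine), the registered categories and the counters of a standard category can be listed with the PerformanceCounterCategory class:

using System;
using System.Diagnostics;

class CategoryLister
{
    static void Main()
    {
        // Every performance counter category registered on the local machine.
        foreach (PerformanceCounterCategory category in PerformanceCounterCategory.GetCategories())
        {
            Console.WriteLine(category.CategoryName);
        }

        // The standard counters of one well-known, single-instance category.
        PerformanceCounterCategory memory = new PerformanceCounterCategory("Memory");
        foreach (PerformanceCounter counter in memory.GetCounters())
        {
            Console.WriteLine("Memory\\" + counter.CounterName);
        }
    }
}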

Performance Counter Types

Different performance counter types are available, covering different performance interests. They range from simple counts to types that calculate averages. Some performance counter types are for special situations only, but the following list contains the most common types you will normally use:




  • NumberOfItems32: Maintain a simple count of items, operations, and so on. Example: track the number of transactions executed; stored as a 32-bit number.

  • NumberOfItems64: Maintain a simple count with a higher capacity. Example: track the number of transactions executed for a site that experiences very high volume; stored as a 64-bit number.

  • RateOfCountsPerSecond32: Track the rate per second of an item or operation. Example: track the number of transactions executed per second on a retail site; stored as a 32-bit number.

  • RateOfCountsPerSecond64: Track the rate per second with a higher capacity. Example: track the number of transactions executed per second for a site that experiences very high volume; stored as a 64-bit number.

  • AverageTimer32: Calculate the average time to perform a process or to process an item. Example: calculate the average time a transaction takes to be processed; stored as a 32-bit number.


Creation and Setup of Performance Counters

There are two ways by which we can create Performance Counters: through VS.NET Server Explorer or programmatically. Creating performance counters using the Server-Explorer is much easier than doing it by code, but on production machines, you might not be able to install Visual Studio .NET to take advantage of this feature.

1) Using Server-Explorer

The simplest way to set up performance categories and counters is by using the Server Explorer integrated within Visual Studio .NET (it ships with the Enterprise Architect, Enterprise Developer, and Professional editions).

Normally you will find it docked on the left side, where you also have your Toolbox. If you don't see it, press Ctrl-Alt-S to make it active, or go to the View menu and select the Server Explorer option.

Under the Servers node, locate and expand the tree for the computer that will host the performance counters. Expand the Performance Counters node. Figure 1 shows the performance counters configured on a machine.




Figure 1: The Visual Studio .NET Server Explorer will display the list of all performance counters registered on a particular machine. Server Explorer makes it easy to add custom performance counters to the current list of counters.

To add your own performance counter, you will first create a new category. A category is typically the name of the application that will publish the performance data, although you may wish to use multiple categories for larger systems. Right click on the Performance Counters node and select Create New Category. This will launch the Performance Counter Builder dialog box shown in Figure 2.







Figure 2: The Performance Counter Builder will allow you to create new performance counters or edit existing ones.

Now create a category, say ‘SoftChamps’, and add a new counter, ‘Total number of operations executed’, as shown below. Add more counters by clicking the ‘New’ button.



Figure 3: SoftChamps category is created and counters are added

Click OK and you've created the performance counters and registered them on your machine.

You can also right-click on a category and select "Show category" to add more counters to an existing category.

In short, to create counters:

  • Open the Server Explorer

  • Select Servers > your computer name > Performance Counters

  • Right click Performance Counters and select "Create New Category..."

  • In the dialog box, enter the name of the Category and any description you would like in the Category description text box.

  • In the counter list builder frame click "New" to add a new Performance Counter.

  • Enter the name of the performance counter and select the Type, and any description you would like as the Counter description.

  • Click OK.



2) Creating Programmatically

You can instrument your code with custom performance counters. The .NET Framework classes provide a rich set of APIs that help developers create custom counters with relative ease.

The System.Diagnostics namespace provides access to the performance counter libraries.

  • PerformanceCounterCategory - Represents a performance object, which defines a category of performance counters. It can be used to perform operations on a performance category (Create, Delete, Exists, etc.).

  • CounterCreationDataCollection - Provides a strongly typed collection of CounterCreationData objects, used to create the counters for a category.

  • CounterCreationData - Defines the counter type, name, and Help string for a custom counter.

There are a few things to consider regarding custom performance counters.

  • Each counter is uniquely identified through its name and its location. In the same way that a file path includes a drive, a directory, one or more subdirectories, and a file name, counter information consists of four elements: the computer, the category, the category instance, and the counter name.

  • You cannot create custom categories and counters on remote machines.

  • Your interaction with custom counters and categories is restricted to read-only mode unless you explicitly specify otherwise. By default the counters created under Server Explorer are read-only, and in read-only mode you cannot update the value of a counter.

  • When creating performance counters and/or categories by code, you must ensure that the user running the code has the proper administrative rights. This can be a problem when using performance counters in web applications because the ASP.NET user does not have them; hence you should create your custom performance counters outside ASP.NET, by using either a console application or the Microsoft Visual Studio® .NET Server Explorer.

  • You cannot create new counters within existing custom categories. If you need to add counters to a category that already exists, the only way to do so is to delete the category and recreate it with all of its contents, including the new counters you want to add.

  • If you try to create a counter that already exists, an error is thrown. You should check for the existence of a counter before you create one.

Creating a Single Performance Counter Using PerformanceCounterCategory

If you only need to create a single counter, you can use PerformanceCounterCategory.Create to do so.

public static PerformanceCounterCategory Create (
    string categoryName,
    string categoryHelp,
    PerformanceCounterCategoryType categoryType,
    string counterName,
    string counterHelp
)

Parameters

categoryName

The name of the custom performance counter category to create and register with the system.

categoryHelp

A description of the custom category.

categoryType

One of the PerformanceCounterCategoryType values specifying whether the category is MultiInstance, SingleInstance, or Unknown.

counterName

The name of a new counter to create as part of the new category.

counterHelp

A description of the counter that is associated with the new custom category.

Return Value

A PerformanceCounterCategory that is associated with the new system category, or performance object.

Example



if (!PerformanceCounterCategory.Exists("SoftChamps"))
    PerformanceCounterCategory.Create("SoftChamps", "SoftChamps Operations",
        PerformanceCounterCategoryType.SingleInstance,
        "Number of operations", "Total number of operations executed");

Remarks

The categoryType parameter specifies whether the performance counter category is single-instance or multi-instance. By default, a category is single-instance when it is created and becomes multi-instance when another instance is added. Categories are created when an application is set up, and instances are added at runtime. In .NET Framework 1.0 and 1.1 it is not necessary to know if a performance counter category is multi-instance or single-instance. In the Microsoft .NET Framework version 2.0 the PerformanceCounterCategoryType enumeration is used to indicate whether a performance counter can have multiple instances.



Use the Create method of the PerformanceCounterCategory class to create a performance counter category and a single counter at the same time.



Creating Multiple Performance Counters Using CounterCreationDataCollection

If you need to create multiple counters, you can use a CounterCreationDataCollection to programmatically create the custom counter(s) and category. This technique enables you to create the category and multiple counters at the same time.


  • To create a new performance category, first create a System.Diagnostics.CounterCreationDataCollection to hold the information about the counters. Example:

    CounterCreationDataCollection counters = new CounterCreationDataCollection();

  • Then, for each counter, create one System.Diagnostics.CounterCreationData instance and add it to the collection.

Example

CounterCreationData totalTransactions = new CounterCreationData();

totalTransactions.CounterName = "no of transactions executed";

totalTransactions.CounterHelp = "Total number of transactions executed";

totalTransactions.CounterType = PerformanceCounterType.NumberOfItems64;

counters.Add(totalTransactions);

  • Use the System.Diagnostics.PerformanceCounterCategory.Create method to create the category and all related counters stored in the collection.

public static PerformanceCounterCategory Create (
    string categoryName,
    string categoryHelp,
    PerformanceCounterCategoryType categoryType,
    CounterCreationDataCollection counterData
)

Parameters

categoryName


    The name of the custom performance counter category to create and register with the system.

categoryHelp

    A description of the custom category.

categoryType

    One of the PerformanceCounterCategoryType values specifying whether the category is MultiInstance, SingleInstance, or Unknown.

counterData

    A CounterCreationDataCollection that holds the counters to create as part of the new category.

Return Value

    A PerformanceCounterCategory that is associated with the new custom category, or performance object.

    Example

    PerformanceCounterCategory.Create("SCCategory", "SoftChamps Sample",
        PerformanceCounterCategoryType.SingleInstance, counters);

Remarks

The remarks for the categoryType parameter are the same as for the single-counter overload described above.



A complete sample

if (!PerformanceCounterCategory.Exists("SCCategory"))
{
    CounterCreationDataCollection counters = new CounterCreationDataCollection();

    CounterCreationData totalTransactions = new CounterCreationData();
    totalTransactions.CounterName = "no of transactions executed";
    totalTransactions.CounterHelp = "Total number of transactions executed";
    totalTransactions.CounterType = PerformanceCounterType.NumberOfItems64;
    counters.Add(totalTransactions);

    CounterCreationData transactionsPerSecond = new CounterCreationData();
    transactionsPerSecond.CounterName = "no of transactions executed per sec";
    transactionsPerSecond.CounterHelp = "Number of transactions executed per second";
    transactionsPerSecond.CounterType = PerformanceCounterType.RateOfCountsPerSecond64;
    counters.Add(transactionsPerSecond);

    PerformanceCounterCategory.Create("SCCategory", "SoftChamps Sample",
        PerformanceCounterCategoryType.SingleInstance, counters);
}



This is how the counters that are added look in the Server explorer.





Figure 4: Custom Counters added in server explorer

Finally to create a new category and add some performance counters to it, you must:


  • See if the category already exists (PerformanceCounterCategory.Exists()).

  • Create a CounterCreationDataCollection and add some CounterCreationData to it.

  • Create the category (PerformanceCounterCategory.Create()).

Writing to Performance Counters



After installing performance counters, you usually want to write performance data to them. For that you use the System.Diagnostics.PerformanceCounter class. The difference between System.Diagnostics.CounterCreationData and System.Diagnostics.PerformanceCounter is that CounterCreationData physically adds a performance counter to a category on your local machine, while PerformanceCounter is used to create an instance of a performance counter and to change its values.

Instantiate Performance Counters

To write to a performance counter, create an instance of the PerformanceCounter class, set the CategoryName, CounterName and, optionally, InstanceName or MachineName properties, and then call the IncrementBy, Increment, or Decrement methods.

The properties that need to be set are given below:

  • CategoryName - Gets or sets the name of the performance counter category for this performance counter.

  • CounterName - Gets or sets the name of the performance counter that is associated with this PerformanceCounter instance.

  • CounterHelp - Gets the description for this performance counter.

  • MachineName - Gets or sets the computer name for this performance counter ("." is the local machine).

  • InstanceName - Gets or sets an instance name for this performance counter.

  • ReadOnly - Gets or sets a value indicating whether this PerformanceCounter instance is in read-only mode; as we want to write performance data, we set it to false.

Example

_TotalTransactions = new PerformanceCounter();

_TotalTransactions.CategoryName = "SCCategory";

_TotalTransactions.CounterName = "no of transactions executed";

_TotalTransactions.MachineName = ".";

_TotalTransactions.ReadOnly = false;

_TotalTransactions.RawValue = 0;



_TransactionsPerSecond = new PerformanceCounter();

_TransactionsPerSecond.CategoryName = "SCCategory";

_TransactionsPerSecond.CounterName = "no of transactions executed per sec";

_TransactionsPerSecond.MachineName = ".";

_TransactionsPerSecond.ReadOnly = false;

_TransactionsPerSecond.RawValue = 0;



Changing performance Counter Values



You can set a counter's value either by incrementing it with the PerformanceCounter.Increment method or by assigning a specific value to the PerformanceCounter.RawValue property.

The following are the methods on a System.Diagnostics.PerformanceCounter that will help to
write to performance counters:



  • Increment - Increments the associated performance counter by one through an efficient atomic operation.

  • IncrementBy - Increments or decrements the value of the associated performance counter by a specified amount through an efficient atomic operation.

  • Decrement - Decrements the associated performance counter by one through an efficient atomic operation.

  • RawValue - Gets or sets the raw, or uncalculated, value of this counter.

Example

_TotalTransactions.Increment();

_TransactionsPerSecond.Increment();
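
For completeness, a short sketch of the other write operations on the counters created above:

_TotalTransactions.IncrementBy(5);   // add a batch of five transactions in one atomic step

_TotalTransactions.Decrement();      // subtract one

_TotalTransactions.RawValue = 0;     // reset the raw value directly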

We have seen how to install and instantiate counters. Now we will look at the counter that
will measure the average.

Performance Counters for measuring Average

An average counter measures the time it takes, on average, to complete a process or operation. Counters of this type display a ratio of the total elapsed time of the sample interval to the number of processes or operations completed during that time. This counter type measures time in ticks of the system clock.

Associated with each average counter is a base counter that tracks the number of samples involved. It is important to know that the associated base counter must immediately follow the average counter it belongs to in the counter collection.

Example

CounterCreationData averageTime = new CounterCreationData();

averageTime.CounterName = "average time per transaction";

averageTime.CounterHelp = "Average duration per transaction execution";

averageTime.CounterType = PerformanceCounterType.AverageTimer32;

counters.Add(averageTime);

//Base counter for above counter

CounterCreationData averageTimeBase = new CounterCreationData();

averageTimeBase.CounterName = "average time per transaction base";

averageTimeBase.CounterHelp = "Average duration per transaction execution base";

averageTimeBase.CounterType = PerformanceCounterType.AverageBase;

counters.Add(averageTimeBase);

Write to average counter




  • As usual, the first step is to instantiate the counters.

Example


    _AverageTime = new PerformanceCounter();

    _AverageTime.CategoryName = "SCCategory";

    _AverageTime.CounterName = "average time per transaction";

    _AverageTime.MachineName = ".";

    _AverageTime.ReadOnly = false;

    _AverageTime.RawValue = 0;

    _AverageTimeBase = new PerformanceCounter();

    _AverageTimeBase.CategoryName = "SCCategory";

    _AverageTimeBase.CounterName = "average time per transaction base";

    _AverageTimeBase.MachineName = ".";

    _AverageTimeBase.ReadOnly = false;

    _AverageTimeBase.RawValue = 0;

  • The next step is to increment the counters.

Example

    _AverageTime.IncrementBy(ticks); // increment the timer by the time cost of the transaction

    _AverageTimeBase.Increment(); // increment the base counter by 1 only

While the counter of type PerformanceCounterType.AverageTimer32 is incremented by the time elapsed between two calls, the base counter - of type PerformanceCounterType.AverageBase - is incremented by 1 for each operation taken.

According to the .NET documentation for PerformanceCounterType.AverageTimer32, the formula is (N1 - N0) / (B1 - B0), where N1 and N0 are performance counter readings, and B1 and B0 are their corresponding AverageBase values. Thus, the numerator represents the time elapsed during the sample interval, and the denominator represents the number of operations completed during the sample interval.
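
A rough worked example with made-up numbers: if two successive readings differ by N1 - N0 = 5,000,000 ticks and the base counter advanced by B1 - B0 = 10 operations over the same interval, the counter reports 500,000 ticks per operation; the performance monitor then divides by the tick frequency (for example 10,000,000 ticks per second for DateTime ticks) to display 0.05 seconds per operation.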

To measure N1 and N0, we can use System.DateTime.Now.Ticks

Example:



long startTime = DateTime.Now.Ticks;

System.Threading.Thread.Sleep(interval.Next(500)); // interval is a System.Random instance; the sleep simulates the operation being timed

long ticks = DateTime.Now.Ticks - startTime;

Use ticks to increment the PerformanceCounterType.AverageTimer32 counter.

For more accurate results, however, it is better to use the QueryPerformanceCounter() API via interop.

Example:

/// <summary>
/// Imports the <code>QueryPerformanceCounter</code> method into the class.
/// The method is used to measure the current tick count of the system.
/// </summary>
/// <param name="ticks">current tick count</param>
[DllImport("Kernel32.dll")]
public static extern void QueryPerformanceCounter(ref long ticks);

The extern modifier is used to declare a method that is implemented externally. A common use of the extern modifier is with the DllImport attribute when using Interop services to call into unmanaged code; in this case, the method must also be declared as static, as shown above.

Further, the code given below measures the ticks that will be used to increment the average counter.

long startTime = 0;

long endTime = 0;

// measure starting time
QueryPerformanceCounter(ref startTime);

System.Threading.Thread.Sleep(rand.Next(400)); // rand is a System.Random instance; the sleep simulates the operation being timed

// measure ending time

QueryPerformanceCounter(ref endTime);

long ticks = endTime - startTime;

Use ticks to increment the PerformanceCounterType.AverageTimer32 counter.

Performance Monitor

The performance monitor, or system monitor, is a utility used to track a range of processes and give a real time graphical display of the results, on a Windows system. (I am presently using Windows 2003) This tool can be used to assist you with the planning of upgrades, tracking of processes that need to be optimized, monitoring results of tuning and configuration scenarios, and the understanding of a workload and its effect on resource usage to identify bottlenecks.

It can be opened from the Performance icon in the Administrative Tools folder of the Control Panel (reachable from the Start menu), or by typing perfmon.msc in the Run box.

After you have launched Perfmon, you must select a target system, or computer, to monitor. This system can be the localhost from which you launched Perfmon or another Windows system on the local network. Because of the overhead associated with running Perfmon, it is recommended that Perfmon run on a remote computer while scrutinizing production servers for performance issues across a network. To select a system, click on the plus sign icon in the Perfmon toolbar. This action invokes a network browser showing localhost as the default monitoring target.

After you have selected the system you wish to monitor, choose an object, or subsystem, to monitor. These include such components as system, memory, network interfaces, or disk I/O subsystems. Next, choose the counters you wish to monitor.



Adding a counter

Right click anywhere on the graph and choose Add Counter.

The Add Counter box consists of the following options:


  • Computer: The source system of the object. You can choose to select the local computer or another computer on your network - type \\computer_name in the appropriate box.
  • Object: The subsystem of interest. This refers to the virtual part of the computer that you want to monitor. Memory, Processor or Network Interface, for example.
  • Counter: The aspect of performance of interest. This refers to what parts of the object you want to monitor - they differ depending on the object.
  • Instance: The specific object to be measured when multiple objects of the same type exist on a single system. For example, if you go to the Process performance object, the instances list will display all the active processes on the specified computer.




Figure 5: The Add Counters window.

From the Performance Object dropdown select SCCategory, click ‘Add’ to add the counters.





 

Figure 6: Add the custom counters to the monitor

Conclusion

Custom performance counters are a great value addition that a development team can make to a production environment, and one of the easiest ways to monitor the health and performance of an application. The .NET Framework and VS.NET make the implementation of counters a relatively simple task.