Sep 02 2008

Thoughts on Software Engineering: Calendar Woes and Date Interchange Formats

For a laugh, I decided to create a calendar event in Gmail for something that happened a very long time ago. In fact, I was aiming for a date in the region of 2 BC.

When adding an event to a Gmail message, you enter the year/month/date of the start of the event into one box, and of the end of the event into another box. The boxes are prepopulated with today’s date. Entering “2BC” as the year gets changed automatically to 2002 when you click outside the box, as does “0002”. Years down to “100” do not get changed, but entering “10” gets mapped to “2010”. Entering “49” gets mapped to “2049” but “50” gets mapped to “1950”. In summary, one- or two-digit years 0..49 map to 2000..2049, and 50..99 map to 1950..1999.

On to iCal. iCal only offers a two-digit year field. :-( Years 0..50 map to 2000..2050, and 51..99 map to 1951..1999.
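Both behaviours look like the same "pivot year" rule with a different cut-off. Here is a minimal Python sketch of that rule, assuming the mappings are exactly as observed above. The function name expand_year and the two pivot constants are made up for illustration and are not taken from either product.

```python
def expand_year(year: int, pivot: int) -> int:
    """Expand a one- or two-digit year using a pivot (windowing) rule.

    Years 0..pivot map to 2000..2000+pivot, and years pivot+1..99 map
    into the 1900s. Years of 100 or more are returned unchanged.
    """
    if year < 0:
        raise ValueError("year must be non-negative")
    if year >= 100:
        return year
    return 2000 + year if year <= pivot else 1900 + year


# Assumed pivots, inferred only from the behaviour described in the text.
GMAIL_PIVOT = 49
ICAL_PIVOT = 50

print(expand_year(2, GMAIL_PIVOT))   # 2002
print(expand_year(50, GMAIL_PIVOT))  # 1950
print(expand_year(50, ICAL_PIVOT))   # 2050
print(expand_year(51, ICAL_PIVOT))   # 1951
```

Note that the two pivots disagree on the year "50", so the same two-digit input lands a century apart depending on the calendar.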

On to Emacs. Surely Emacs gets it right. Well, nearly. Emacs calendar mode allows you to create an event for any year > 0.

So Emacs wins with 1 AD, followed closely by Gmail with 100 AD and iCal a distant third at 1951.

While this exercise was fairly frivolous, the consequences of the results could be nontrivial: what happens if you want to exchange dates and events between calendars? What happens in 1951? What happens if you want to add historical dates to your calendar?
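One way to sidestep the guessing when exchanging dates is to always pass the full year in an unambiguous text format. The Python sketch below uses ISO 8601 strings via the standard datetime module. The helper names export_date and import_date are hypothetical, and this is only an illustration of the idea, not a description of how any of the calendars above exchange data. Like Emacs, Python's date type starts at year 1, so 2 BC is still out of reach.

```python
from datetime import MINYEAR, date


def export_date(d: date) -> str:
    """Serialise a date as ISO 8601 (YYYY-MM-DD) with a zero-padded year."""
    return d.isoformat()


def import_date(text: str) -> date:
    """Parse an ISO 8601 date string; no two-digit-year guessing involved."""
    return date.fromisoformat(text)


old_event = date(100, 1, 1)
wire = export_date(old_event)
print(wire)                            # 0100-01-01
print(import_date(wire) == old_event)  # True

# The earliest representable year is 1, so year 0 (and anything BC) is rejected.
print(MINYEAR)                         # 1
try:
    date(0, 1, 1)
except ValueError:
    print("year 0 is not representable")
```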

PS Yes, I remember Y2K!


Jul 28 2008

Thoughts on Software Engineering: Interfaces

Interfaces Contribute to Software Risk

When I was at NASA, I did some research on software risk. The most common identifiable cause of software failures, apart from “functional defect” (i.e. the code was perfectly valid, but just did not implement the right thing) is “interface defect”, i.e. a failure in the interfaces between components.

Interface Problems are Worse in Large Multilanguage Systems

The reason why interface defects are particularly common is probably two-fold. First, compilers and other tools do not provide good support for preventing this kind of cross-component defect. Second, components tend to correspond to the boundaries between the work of different people, so interface defects often arise from a lack of coordination between those people.

A large system – especially one providing a consumer web interface – is often composed from many components, written in several different languages, some of them running on a single machine and others broken into client and server parts. The system and its components may be evolving quite rapidly and different parts are usually written by different people. The checks provided by compilers are only of benefit on a component-by-component basis and so are not very helpful in this situation.

Check Types + Properties At Runtime

The kinds of type-checking available in programming languages are generally limited by the need to check types statically, which imposes a severe restriction on the notion of “type”. Many types in practice would be well represented by specifying an underlying data type plus properties which must hold of valid elements, e.g. lists (underlying type) whose elements are pairs (property), arrays of strings (underlying type) which are sorted (property). Such types are known as dependent types (strictly speaking, an underlying type restricted by a property is usually called a refinement type, which is a restricted form of dependent type). Dependent types are not supported by most programming languages, because in general their use sacrifices fully automatic static type inference and checking.

Although dependent types cannot in general be checked automatically at compile time, and cannot be used explicitly in most programming languages, a lot of their benefit can be realized by writing functions which check whether given data have the right underlying data type and satisfy the required properties. Instead of statically checking types, which is mathematically hard or impossible, we just check data input to or output from functions to ensure they satisfy the properties, i.e. have the right (dependent) types.
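As a rough illustration, the two example types mentioned earlier (lists whose elements are pairs, and sorted arrays of strings) could be checked at runtime with functions like these. This is a minimal Python sketch. The function names are invented for the example, and "pair" is assumed here to mean a two-element tuple.

```python
from typing import Any


def is_list_of_pairs(value: Any) -> bool:
    """Underlying type: list. Property: every element is a 2-tuple."""
    return isinstance(value, list) and all(
        isinstance(item, tuple) and len(item) == 2 for item in value
    )


def is_sorted_string_array(value: Any) -> bool:
    """Underlying type: list of str. Property: non-decreasing order."""
    if not isinstance(value, list):
        return False
    if not all(isinstance(item, str) for item in value):
        return False
    # Compare each element with its successor.
    return all(a <= b for a, b in zip(value, value[1:]))


print(is_list_of_pairs([(1, "a"), (2, "b")]))    # True
print(is_list_of_pairs([(1, "a", "extra")]))     # False
print(is_sorted_string_array(["ant", "bee"]))    # True
print(is_sorted_string_array(["bee", "ant"]))    # False
```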

Represent Interface Contracts As Types + Properties

These runtime checkable properties (dependent types) provide a good way to tackle part of the interface problem, by representing the interfaces in a machine-checkable way and building mechanisms to check the consistency of the component interfaces both at build time and at runtime. Speed is often important, so components should distinguish between a “development” environment, in which the interface checks are performed, and a “production” environment in which they are skipped.
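The development/production split could be sketched in Python as a decorator that wraps a function with checks in one environment and returns it untouched in the other. Everything named here is an assumption made for the example: the APP_ENV environment variable, the contract decorator and its pre/post/enabled parameters are hypothetical, not an existing library.

```python
import functools
import os

# Hypothetical switch: checks run unless APP_ENV is set to "production".
CHECKS_ENABLED = os.environ.get("APP_ENV", "development") != "production"


def contract(pre=None, post=None, enabled=None):
    """Attach an input check (pre) and an output check (post) to a function.

    When checks are disabled the original function is returned as-is,
    so production calls pay no checking cost.
    """
    def decorate(func):
        @functools.wraps(func)
        def checked(*args, **kwargs):
            if pre is not None and not pre(*args, **kwargs):
                raise TypeError(f"{func.__name__}: input violates contract")
            result = func(*args, **kwargs)
            if post is not None and not post(result):
                raise TypeError(f"{func.__name__}: output violates contract")
            return result

        active = CHECKS_ENABLED if enabled is None else enabled
        return checked if active else func

    return decorate


def is_sorted(items):
    """Property: items are in non-decreasing order."""
    return all(a <= b for a, b in zip(items, items[1:]))


@contract(pre=lambda words: all(isinstance(w, str) for w in words),
          post=is_sorted)
def sort_words(words):
    return sorted(words)


print(sort_words(["pear", "apple"]))  # ['apple', 'pear']
```

Deciding once, at decoration time, keeps the production path free of even a per-call flag test. The cost is that the environment cannot be switched while the program is running.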

My proposal is, therefore, to define APIs which specify the classes/functions/methods provided by the components in a system. Those APIs should specify at an appropriate level of detail the signatures, types, and properties of those classes/functions/methods. “Appropriate level of detail” here means exercising judgement about when to use heavy-weight, detailed representations, and when simpler abstractions are good enough. The functions which check the required properties of data can be used both at runtime in the development environment, to check much or all of the data passing between components, and at build time, to check aspects of the correctness of test results. One way to view these functions is as test oracles.
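To make the "test oracle" view concrete, here is a small Python sketch in which an API entry is described by a name, an input check and an output check, and a build-time test runs an implementation against that description. The Spec class, the check_against helper and the sort_words API are all hypothetical names invented for this example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class Spec:
    """Machine-checkable description of one function in a component API."""
    name: str
    accepts: Callable[[Any], bool]       # property of valid inputs
    returns: Callable[[Any, Any], bool]  # property relating input and output


def is_string_list(value: Any) -> bool:
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


def is_sorted_rearrangement(given: Any, result: Any) -> bool:
    """Oracle: result is in order and has exactly the same elements as given."""
    in_order = all(a <= b for a, b in zip(result, result[1:]))
    return is_string_list(result) and in_order and Counter(result) == Counter(given)


SORT_WORDS = Spec("sort_words", is_string_list, is_sorted_rearrangement)


def check_against(spec: Spec, func: Callable, inputs: List[Any]) -> List[str]:
    """Run func on each input and report every violation of the spec."""
    failures = []
    for value in inputs:
        if not spec.accepts(value):
            failures.append(f"{spec.name}: invalid test input {value!r}")
            continue
        result = func(value)
        if not spec.returns(value, result):
            failures.append(f"{spec.name}: bad result {result!r} for {value!r}")
    return failures


cases = [[], ["b", "a"], ["b", "a", "a"]]
print(check_against(SORT_WORDS, sorted, cases))  # []
# A buggy implementation that silently drops duplicates is caught by the oracle.
print(len(check_against(SORT_WORDS, lambda ws: sorted(set(ws)), cases)))  # 1
```

The oracle never needs to know the expected output for each case. It only checks that whatever comes back has the required properties, which is why the same function can serve both the tests and the runtime checks.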


Jul 27 2008

Thoughts on: Software Engineering

Tag: software engineering | admin @ 8:54 am

Today I am starting a new “Thoughts on…” series, to distill some of my thoughts on work-related matters. These will be published as I write them, and later edited together into a more compact collection.