Software doctor's 999 service

Dr Connie Smith performs lifesaving surgery on software programming disasters, writes Karlin Lillington

THINK OF Dr Connie Smith as the 999 service for software. When big programming projects fail, she's the first person some of the world's biggest organisations ring.

She's the pioneer of software performance engineering, an approach to creating software increasingly taught at university level that emphasises careful planning before you start coding, rather than making it up as you go along.

In other words, constructing systems to meet the performance requirements of the organisation, so that an ATM dispenses cash quickly, an online shop doesn't freeze up because too many sales come in at once, or a bank can quickly retrieve a person's account records.

But surely, programmers always take the time to figure out the best way to achieve their goal before they start writing code? "Ideally," she says. "In reality, what happens is that people don't do it and have disasters." Look close to home, or internationally, for some high-profile software disasters in government agencies and large companies, and it's easy to see this must be so (see panel).

She points to a comparison between a 1979 study of federal software spending in the US done by the General Accounting Office and a 1995 study of almost exactly the same subject by the US Department of Defence, which demonstrates no improvement in some appalling software performance statistics over 16 years.

In 1979, 47 per cent of federal software was delivered but never used due to performance problems, 29 per cent was paid for but never delivered, 19 per cent was used but extensively reworked, 3 per cent was used after changes, and only a minuscule 2 per cent was used as delivered.

In 1995, in the same categories, the statistics were 46 per cent, 29 per cent, 20 per cent, 3 per cent and 2 per cent.

She also cites a recent survey of 430 IT executives, in which 67 per cent revealed they were not generally aware of software system problems in their organisations until users called the help desk.

"That's a pretty sorry state of affairs, especially when we have the software to solve it," she says.

She isn't sure why programmers often do not take the time to do advance planning and preparation. By contrast, "engineers do this all the time. They wouldn't think of building a bridge without modelling it first. But software programmers do the equivalent all the time."

Smith and others have come up with a variety of software tools that will model systems, enabling programmers to see just how long an ATM transaction will take, or at what point too many users will cause an e-commerce website to crash.
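To give a flavour of what such a model can answer, here is a minimal sketch in Python. It is not taken from Smith's tools; it assumes the simplest textbook case, a single server with random arrivals (an M/M/1 queue), and the function names and figures are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: a textbook M/M/1 queue, not any specific
# performance engineering tool. All names and numbers are hypothetical.

def mm1_response_time(arrival_rate, service_time):
    """Mean time (seconds) a request spends waiting and being served.

    arrival_rate: average requests arriving per second
    service_time: average seconds of work each request needs
    Returns infinity once the server is saturated.
    """
    utilisation = arrival_rate * service_time
    if utilisation >= 1.0:
        return float("inf")  # requests arrive faster than they can be served
    return service_time / (1.0 - utilisation)


def max_arrival_rate(service_time, target_response_time):
    """Highest arrival rate that keeps mean response time within the target."""
    if target_response_time <= service_time:
        return 0.0  # the target cannot be met even with no queueing at all
    return (1.0 / service_time) - (1.0 / target_response_time)


if __name__ == "__main__":
    # Assumed figures: each transaction needs 0.1 seconds of server time.
    for rate in (2, 5, 8, 9.5, 20):
        print(rate, "requests/sec ->", mm1_response_time(rate, 0.1), "seconds")
    print("rate limit for a 0.5 second target:", max_arrival_rate(0.1, 0.5))
```

Even this crude model shows the pattern the tools are built to expose before any code is written: response time climbs slowly at first, then very steeply as the load approaches what the server can handle.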

Her theories and models are the basis of an entire lab at University College Dublin - the Performance Engineering Lab run by four academics, including brothers and computer science lecturers Dr Liam and Dr John Murphy.

"We follow her book for our courses and our lab is based on performance engineering," says John Murphy. Smith was at UCD last week to give a talk to students, researchers and programmers. She thinks the state of practice is better in Europe than in the US, as performance engineering is taught more widely here than there.

She taught at university level herself, but eventually moved back to working as a consultant. Teaching "was too far from the fire-fighting". Her client list now - companies she advises on how to produce better software - is a who's who of technology companies, multinationals and government agencies: the US Internal Revenue Service, IBM, HP, Motorola, Apple, AT&T, the US military branches, Boeing, Cray Research.

She's also a prolific author, regularly churning out academic papers. She enjoys writing and finds it improves her programming practice: "It solidifies the ideas you have if you have to express them."

She does a lot of collaborative work with academics in Spain and Italy through her company, Software Performance Engineering (www.spe-ed.com), and hints that Ireland may also be on the horizon, given the connection to the UCD lab.

Shifts in technology, and new models and platforms for how software processes data and works with other software, such as SOA (service-oriented architecture, where software functions as sets of services that can talk to each other), create fresh challenges for performance engineering.

"With SOA, there are so many diverse things that have to integrate and no one has the big picture," Smith says. "If you do it right, your organisation is in a great position to move forward. If you do it wrong, it's a big mess."

Can she name some notable software successes? "It's actually a hard question to answer, because people call me after they have failed. But lots of people are doing things right and under the radar," she says.

"In this business, success is invisible, but failure is highly visible." She adds: "I also think we wouldn't be building software if we weren't optimists."

Software catastrophes

UK Census: went online in January 2002, crashed on the first day - unable to cope with the number of visits. "The scaling problems ran deeper than the site's multiple contractors appreciated (incorrect architectural design), and it did not go live again until September 2002," says Dr Liam Murphy, UCD Performance Engineering Lab

Nectar: the Nectar retail loyalty card website collapsed under the strain of people applying online in September 2002, despite having increased its internet server capacity sixfold, two days before the launch

Egg (online UK bank): Egg suffered embarrassment as their new online bank crashed repeatedly under customer demand in January 2001. "An upgrade hardware crash triggered an outage - because they didn't thoroughly test after changes were made," says Murphy

Amazon.co.uk: incorrect prices in March 2003 led to the site freezing and crashing, and it had to be taken down. "The site design didn't consider such a sudden increase in load," Murphy says

London Stock Exchange: a project to create a new automated stock settlement system was scrapped after seven years in 1993, at a cost of $600 million, due to poor design and management

Sainsbury: in 2004, a faulty software system designed to manage shipment of store inventory was written off at the cost of $526 million and 3,000 extra staff were hired to stock shelves manually