Ted Codd, Chris Date

There are a lot of false heroes in high tech. The late Dr. Edgar (Ted) Codd was the real deal.

Codd came up with the notion of relational databases, by applying the beauty of math and predicate logic to the problem of managing, finding and sorting data. His work, much of which he did while employed by IBM, forms the basis for what is now a multibillion-dollar-a-year business for Oracle, Microsoft and IBM itself. In 2003 alone, more than $7 billion worth of new RDBMS licenses were sold worldwide, according to Gartner.

Codd may not be a household name, but among those who know databases, he remains an icon a year and a half after his death. “Codd is a god,” says one longtime database expert matter-of-factly.

Codd’s biggest single contribution to IT was the idea, described in published papers starting in 1969, that there must be a divorce between the physical and logical layers of the data system. Simply put, that means someone who wants to find a certain piece of information shouldn’t need to know the niceties of how fact “X” came to reside here or there. He or she must simply know what to ask for. Admittedly, that in itself is not always easy, but it is certainly simpler than tracking the spaghetti code that snakes through prerelational hierarchical databases.

With mathematically elegant underpinnings—the basic foundation of row-and-column tables—the system will hold true no matter what, says Codd’s widow, Sharon Codd. “Having that knowledge is great power. Most of the other systems were developed ad hoc, and when they needed more function, people pasted new capabilities on top of what were really faulty foundations,” says Codd, who is no slouch herself when it comes to databases and math.

According to Sharon Codd and Chris Date, Ted’s longtime collaborator, Codd was irked that it took IBM so long to glom onto what he had done. IBM’s delay, some say, paved the way for Larry Ellison and his startup Oracle to gain huge market share and vast riches.

Sharon Codd says her husband had a cordial relationship with Ellison and was invited to speak at an Oracle function some years back. “Larry respected what Ted did. … After all, his fortune is based on it,” she says.

Adds Date, who helped popularize and evangelize Codd’s work in lectures, books and articles: “There was a general feeling at the time that research belonged to everyone. It was public. People at IBM didn’t like it, but all the papers were available. Ellison felt it was obvious that Codd was onto something, so he built a clone of what Ted outlined.”

IBM had assigned Codd to a group examining approaches to the database problem, but since the computing giant had already come up with the hierarchical Information Management System (IMS), the group’s work was restricted to hierarchical approaches, Sharon Codd recalls. Ted Codd saw things very differently. When IBM tried to stop him, he demanded to be taken off the team and was reassigned to a new group that pretty much let him do his own thing, she says. By 1969, Codd had published his first take on the relational model of data in an IBM research report. A revised version published by the Association for Computing Machinery a year later is what most people point to when they talk about the birth of the relational model.

Sharon Codd, who worked at IBM when she met Ted, continues to promote his work and explains it patiently to reporters. “Consider a whole lot of tables as your database. If you’re interested in ‘Employee A,’ you know to go to the employee table. If you’re interested in ‘Employee 12345,’ you go to that specific part of the table. Then, if you want to know eye color, that’s a characteristic of the employee [and] will be in that row or column intersection. You have a unique addressing scheme right there. You don’t need to know about indexes, about storage, about pointers that relate things to each other,” she says.
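The addressing scheme Sharon Codd describes can be sketched in a few lines. This is only an illustration using Python's built-in sqlite3 module; the table name, column names and sample rows are assumptions made up for the example, not anything from Codd's papers.

```python
import sqlite3

# Hypothetical employee table; names and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " emp_no INTEGER PRIMARY KEY,"   # the unique key
    " name TEXT NOT NULL,"
    " eye_color TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(12345, "Employee A", "brown"), (67890, "Employee B", "green")],
)

def lookup_eye_color(conn, emp_no):
    """Address one value by table, key and column; no pointers or storage details."""
    row = conn.execute(
        "SELECT eye_color FROM employee WHERE emp_no = ?", (emp_no,)
    ).fetchone()
    return row[0] if row else None

eye_color = lookup_eye_color(conn, 12345)
print(eye_color)
```

Table name, key value and column name together pick out exactly one value, which is the "unique addressing scheme" in the quote.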

The beauty is that this table structure is typically understood by nontechies. “From a layman’s perspective, everyone understands tables and the notion of a unique key—a unique identifier, whether it’s an employee number or a department number or an automobile policy number,” Sharon Codd says.

But that notion of a “primary key” or identifier to point out the relevant part of a table was not easily grasped by experts at the time. The elegance of the concept was immortalized by one of Codd’s followers, Bill Kent. Speaking of the relational model, Kent once said, “The relation is based on the primary key, the whole key and nothing but the key, so help me Codd.”

Database solution providers wax poetic about Codd’s contributions. His research “changed the whole issue of how to access data from procedural to nonprocedural,” says Frank Cullen, principal at Blackstone & Cullen, a VAR in Atlanta. “In the old days, we had to program data access and know all about file managers like VSAM and index files and had to go up and down data structures, parent to child, programming for all that. Relational systems told you what results you wanted in a non-procedural fashion.”

By getting rid of this procedural morass, Codd’s model eliminated a lot of room for error. “Programmers screw up most consistently in procedures. And relational databases took the procedures out of the programming requirement,” Cullen says.
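A rough way to see the difference Cullen describes is to put the two styles side by side. The sketch below is hypothetical: a nested Python dictionary stands in for a parent-to-child data structure, and the staff table, names and salaries are invented for the example.

```python
import sqlite3

# Hypothetical parent-to-child layout: each department owns its employees.
hierarchy = {
    "Sales": [{"name": "Ann", "salary": 70000}, {"name": "Bob", "salary": 48000}],
    "Research": [{"name": "Cy", "salary": 91000}],
}

def high_earners_procedural(tree, floor):
    """Procedural style: the program walks the structure, parent to child."""
    found = []
    for dept, staff in tree.items():
        for person in staff:
            if person["salary"] > floor:
                found.append((person["name"], dept))
    return sorted(found)

def high_earners_declarative(conn, floor):
    """Nonprocedural style: state the result wanted; the system finds the path."""
    return conn.execute(
        "SELECT name, dept FROM staff WHERE salary > ? ORDER BY name", (floor,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT PRIMARY KEY, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [(p["name"], dept, p["salary"]) for dept, staff in hierarchy.items() for p in staff],
)

procedural = high_earners_procedural(hierarchy, 60000)
declarative = high_earners_declarative(conn, 60000)
print(procedural == declarative)
```

Both functions return the same rows, but only the first one breaks if the nesting of the data changes.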

George Brown, president of Database Solutions, Cherry Hill, N.J., is even more effusive. “Codd’s work in databases was a turning point in IT. It was probably as big a deal as the invention of the microprocessor,” he says.

Codd’s model, because it leached much of the hardwiring out of the system, also provided greater flexibility, solution providers say.

“With hierarchical queries, you needed to record the layout [of the data], and if anything in that layout changed, including the type of data, you had to know that. I suspect Codd may have been the first to figure out a way to deal with and accept change. Things needn’t be set in stone, the database could accommodate change in the information and the relationship of one piece of information to another without breaking the system,” says Richard Warren, an independent technology consultant in Winchester, Va.

Sharon Codd agrees enthusiastically: “By having this separation, you inured the user from all the changes occurring underneath. If you really stuck to it, to what Ted prescribed, you could rip out the system underneath and do a better one without changing the interface at all,” she says.
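A small, hedged illustration of that separation: in the sketch below, an index is added underneath a table and the query text does not change. It uses Python's sqlite3 module, and the policy table and its rows are assumptions invented for the example.

```python
import sqlite3

# Hypothetical insurance-policy table; contents are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE policy (policy_no INTEGER PRIMARY KEY, holder TEXT, premium INTEGER)"
)
conn.executemany(
    "INSERT INTO policy VALUES (?, ?, ?)",
    [(101, "Lee", 900), (102, "Ortiz", 650), (103, "Lee", 400)],
)

# The logical request: it names tables and columns, never storage structures.
QUERY = "SELECT policy_no FROM policy WHERE holder = ? ORDER BY policy_no"

before = conn.execute(QUERY, ("Lee",)).fetchall()

# Change the physical layer only: add an index on the column being searched.
conn.execute("CREATE INDEX policy_by_holder ON policy (holder)")

after = conn.execute(QUERY, ("Lee",)).fetchall()
print(before == after)
```

The application's query is identical before and after the change; only the way the engine may locate the rows differs.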

While the RDBMS market took off starting with Oracle’s work in the 1980s and picked up steam in the ’90s, Codd was none too thrilled about the commercial implementations of his work, according to his widow and Date. He found Structured Query Language, for example, to be less than optimal. “SQL was a disaster. Ted thought so, too,” Sharon Codd says, decrying SQL’s use of pointers in particular. “Talk about trashing the foundation! The general impression was that [the SQL inventors] either did not understand or did not fully appreciate what Ted was asking for,” she says.

Another source of frustration for Ted was the perennial assertion that a relational database cannot handle nonstructured data.

“That is a misconception,” Date says. “All data has structure, it’s just that it is sometimes unclear. Generically saying things like ‘We now have to deal with XML documents so we can’t use relational databases’ is just plain wrong.”

He likens that contention to the claim, hotly debated more than a decade ago, that relational databases could not handle “objects.” At the height of that debate there were a handful of ODBMS companies all plying their trade. They are now all gone.

“ODBMSes never had a solid foundation,” Date says. “The relational model rigidly stays at the logical level, not the physical level so there are many degrees of freedom. The object guys and other contenders, including the XML guys, mix up the logical and physical levels. They don’t understand. I like to say they’re closer to the metal and as such give you less freedom.”

Sharon Codd and Date continue to build on Ted Codd’s vision of the relational model of data. During the 2004 Industry Hall of Fame induction ceremony last month, Sharon Codd said she is working on what she calls a relational enterprise management system, dubbed the Delta model, to transform business knowledge into working applications. “This [product is] more abstracted so someone who knows the business rather than someone who knows programming can describe it to the system, which then programs it,” she says.

It’s just one more example of how Ted Codd’s work continues to touch the database world.

Published for the Week Of December 13, 2004

You could say that Chris Date played James Boswell to Ted Codd’s Samuel Johnson.

It was Date, a veteran IBMer, who spotted what Codd was onto with his idea for relational databases and helped spread the word about Codd and his work to the masses. “Ted was the guy who invented this stuff, I was the one to explain it,” Date says.

Date’s background in mathematics convinced him Codd was right that the fundamental underpinnings of a database should be based on math and predicate logic. Date started out programming in 1962 but found that he loved to teach. And the revolutionary relational model of data Codd devised proved fertile ground for a teacher who could explain those ideas well.

Explaining the beauty and logic of the relational model, which divorces the logical layer from the physical layer of the data system, is not necessarily easy—even though Date is quick to say that the model’s use of tables as its basis makes it accessible to non-techies. Ironically, that accessibility may have been one reason Codd’s idea had a hard time gaining traction within IBM. The company’s techie eggheads apparently had preconceived notions of what a database was and should always be. To them, a database should be hierarchical. That way IBM’s existing Information Management System (IMS) database was protected from competition.

So, to build a following, Codd and Date took their act on the road.

“We evolved a strategy more or less consciously that it would be hard to persuade IBM to build a relational product, so we started talking to customers about what it would be like in the hope that the customers would pressure IBM to build it. And that’s what happened,” Date says.

Date remembers meeting an old friend for a beer in England. The friend, a linguist, wanted to hear about Codd and Date’s work. Describing the idea, Date illustrated the database as a vessel for keeping cricket scores and stats over time. The linguist latched onto the concept. “He thought about it for a minute and said, ‘Wouldn’t you just keep a lot of tables and start looking things up?’ He sort of invented the idea of the RDBMS in 60 seconds over a beer,” Date says. “The basic idea is very simple, an abstraction of what most people do anyway, [which is] find things in one table and look [them] up in another. Codd formalized those simple ideas and applied predicate logic to do it.”
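The linguist's "keep a lot of tables and start looking things up" can be sketched with two small tables. This is a hypothetical example using Python's sqlite3 module; the player and innings tables and all the scores are invented, not taken from Date's account.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical cricket tables; every name and score is made up.
    CREATE TABLE player (
        player_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE innings (
        match_no  INTEGER,
        player_id INTEGER REFERENCES player(player_id),
        runs      INTEGER,
        PRIMARY KEY (match_no, player_id)
    );
    INSERT INTO player VALUES (1, 'Player One'), (2, 'Player Two');
    INSERT INTO innings VALUES (1, 1, 54), (1, 2, 12), (2, 1, 30);
    """
)

# Find rows in one table (innings) and look the names up in another (player).
totals = conn.execute(
    """
    SELECT p.name, SUM(i.runs)
    FROM innings AS i
    JOIN player AS p ON p.player_id = i.player_id
    GROUP BY p.player_id, p.name
    ORDER BY p.name
    """
).fetchall()
print(totals)
```

The join on player_id is the "find things in one table and look them up in another" step that Date describes.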

George Brown, president of Database Solutions, Cherry Hill, N.J., credits Date with making the abstract practical. “Date came out with rules for what constituted an RDBMS, because at that time, it was very unclear. The initial message was ‘it slices, it dices, it does everything.’ And the truth is, it did do a lot, but you had to explain and show examples of what it could do,” he says.

Database administrators and programmers of all kinds laud what Date has wrought. Peter Eddy, a software engineer at ATG, Cambridge, Mass., says he has schlepped a dog-eared copy of Date’s “A Guide to the SQL Standard” from job to job for more than a decade. It is the only book he has carried with him. “It’s probably out of print by now, but it still works for me,” Eddy says. And that says it all.