Michael J. Swart

January 26, 2011

Ridiculously Unnormalized Database Schemas – Part Two

Filed under: Data Cartoons,SQLServerPedia Syndication,Technical Articles — Tags: , , , — Michael J. Swart @ 12:00 pm

Ridiculously Unnormalized Database Schemas

Second Normal Form (2NF)

I’ve seen 2NF defined this way:

“A table is considered 2NF if it is 1NF and if its non-prime attributes depends on the whole of every candidate key not just part of it.

It means that for tables that have multicolumn primary keys (or could), the other columns depend on all the columns in the key, not just a subset. If you follow this rule you’ll see that a field in a record will contain data that is a fact only about the record it belongs to, not others. From another point of view, I’ve seen normalization defined as removing redundant data

Get rid of redundant data (Star Trek's Data gets fired)

Things to Remember

  • For 2NF, you only need to look at the “non-prime attributes” or in other words the attributes that aren’t part of the primary key.
  • Look at these columns and ask whether they depend on the whole primary key.
  • Tables with a single-column primary keys are automatically in 2NF
  • BUT it’s not fair to make identity columns your primary keys on every table and call your job done (The definition of 2NF closes this loophole by mentioning candidate keys).

The Example

Take a look at the following table. It tracks reviews from the talent search t.v. show “American Idol”. The primary key is defined on the columns (Date, Singer, Song, Reviewer). Look at each other column and ask whether it depends on the whole key, or a subset of the key.

American Idol Reviews

Date

Singer

Song

Reviewer

Age

Show Order

Show Theme

Result

Review

Aug-21

Justin Guarini

Before Your Love

Paula Abdul

23

1

Finale

Runner-Up

Beautiful

Aug-21

Justin Guarini

Before Your Love

Randy Jackson

23

1

Finale

Runner-Up

A little pitchy dawg

Aug-21

Justin Guarini

Before Your Love

Simon Cowell

23

1

Finale

Runner-Up

Rubbish

Aug-21

Kelly Clarkson

A Moment Like This

Paula Abdul

23

2

Finale

Winner

Beautiful

Aug-21

Kelly Clarkson

A Moment Like This

Randy Jackson

23

2

Finale

Winner

A little pitchy dawg

Aug-21

Kelly Clarkson

A Moment Like This

Simon Cowell

23

2

Finale

Winner

Rubbish

Aug-21

Justin Guarini

Get Here

Paula Abdul

20

3

Finale

Runner-Up

Beautiful

Aug-21

Justin Guarini

Get Here

Randy Jackson

20

3

Finale

Runner-Up

A little pitchy dawg

Aug-21

Justin Guarini

Get Here

Simon Cowell

20

3

Finale

Runner-Up

Rubbish

Aug-21

Kelly Clarkson

Respect

Paula Abdul

23

4

Finale

Winner

Beautiful

Aug-21

Kelly Clarkson

Respect

Randy Jackson

23

4

Finale

Winner

A little pitchy dawg

Aug-21

Kelly Clarkson

Respect

Simon Cowell

23

4

Finale

Winner

Rubbish

Aug-21

Justin Guarini

A Moment Like This

Paula Abdul

23

5

Finale

Runner-Up

Beautiful

Aug-21

Justin Guarini

A Moment Like This

Randy Jackson

23

5

Finale

Runner-Up

A little pitchy dawg

Aug-21

Justin Guarini

A Moment Like This

Simon Cowell

23

5

Finale

Runner-Up

Rubbish

Aug-21

Kelly Clarkson

Before Your Love

Paula Abdul

23

6

Finale

Winner

Beautiful

Aug-21

Kelly Clarkson

Before Your Love

Randy Jackson

23

6

Finale

Winner

A little pitchy dawg

Aug-21

Kelly Clarkson

Before Your Love

Simon Cowell

23

6

Finale

Winner

Rubbish

You can see that with the exception of the column Review that all the columns in tables are not dependent on the whole key. We can pull these columns into separate tables:

Normalized:

ShowContestants

Singer

Date

Age

Result

Justin Guarini

Aug-21

23

Runner-Up

Kelly Clarkson

Aug-21

20

Winner

Performances

Date

Singer

Song

Show Order

Aug-21

Justin Guarini

Before Your Love

1

Aug-21

Kelly Clarkson

A Moment Like This

2

Aug-21

Justin Guarini

Get Here

3

Aug-21

Kelly Clarkson

Respect

4

Aug-21

Justin Guarini

A Moment Like This

5

Aug-21

Kelly Clarkson

Before Your Love

6

Shows

Date

Show Theme

Aug-21

Finale

We still have a table to hold all the reviews by the judges. Defined as:

    Reviews(Date, Singer, Song, Reviewer, Review)

But it’s still a point of debate whether or not the reviews depend on the whole primary key or a subset of the key (especially based on the example).

8 Comments »

  1. By the way, if you laughed at the cartoon here, check out my other Data cartoons: here and here.

    If you groaned at the Data cartoon, then I apologize, That will be the last one for a while.

    Comment by Michael J. Swart — January 26, 2011 @ 12:07 pm

  2. Noooooooooooo ……. more data more data more data !!!!!! πŸ™‚

    Great series Michael …. as always ……

    Comment by David Nelles — January 27, 2011 @ 8:43 am

  3. “But it’s still a point of debate whether or not the reviews depend on the whole primary key or a subset of the key (especially based on the example).”

    LOL… too true.

    And keep the Data cartoons coming… The “Data(Bass)” one is classic.

    Comment by Brad Schulz — January 27, 2011 @ 1:19 pm

  4. 2NF (in common with the other normal forms) is all about dependencies on the candidate keys, not just the primary key – something which you mention in passing but which you haven’t made explicit. An important point I think. “Non-prime” means not part of any candidate key rather than simply not part of the “primary” key. It is possible for a relation with single attribute keys to be in violation of 2NF but it’s just very unusual.

    Comment by David — February 1, 2011 @ 4:20 am

  5. Hi David,

    Now we’re talking about style! I see that I did put the word “candidate” in italics if that helps πŸ™‚

    But anyway, thanks for stopping by David, I hope you’ll subscribe to my rss and weigh in. Next week I’m doing 3NF!

    Comment by Michael J. Swart — February 1, 2011 @ 8:28 am

  6. Age does not seem dependent on Date… Shouldn’t it be separated as well together with it’s key Singer…

    Comment by Jan Moons — January 8, 2012 @ 2:51 pm

  7. Hi Jan, I thought about that while writing the article. And in this case it does matter a little bit. I think one of the contestants had a birthday during the taping of the season and so the age does depend on the date. I chose not to bring it up because it seemed distracting to the article.

    But it’s a good question and it shows that you know this stuff.

    Comment by Michael J Swart — January 8, 2012 @ 3:29 pm

  8. […] https://michaeljswart.com/2011/01/ridiculously-unnormalized-database-schemas-part-two/ […]

    Pingback by Database Design: Second Normal Form – Piffany's Musings — August 30, 2018 @ 2:42 pm

RSS feed for comments on this post. TrackBack URL

Leave a comment

Powered by WordPress