MaxiCompu Fagaad: Data, Databases, and the Software Engineering Process

1 Data, Databases, and the Software Engineering Process

1.1 INTRODUCTION

In this chapter, we introduce some concepts and ideas that are fundamental

to our presentation of the design of a database. We define data,

describe the notion of a database, and explore a process of how to design

a database.

1.2 DATA

Data, as we use the term, are facts about something or someone. For

example, a person has a name, an address, and a gender. Some data

(facts) about a specific person might be “Mary Smith,” “123 4th St.,”

“female.” If we had a list of several people’s names, addresses, and genders,

we would have a set of facts about several people. A database is a

collection of related data. For this “set of facts about several people” to

be a database, we would expect that the people in the database had something

in common—that they were “related” in some way. Here related

does not imply a familial relationship, but rather something more like

“people who play golf,” “people who have dogs,” or “people I interviewed

on the street today.” In a “database of people,” one expects the people to

have some common characteristic that ties them together. A “set of facts

about some people” is not a database until the common characteristic is

also defined. To put it another way: Why are these people’s names and

addresses being kept in one list?
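The distinction between a bare set of facts and a database can be sketched in a few lines of Python. This is only an illustration: the class names `Person` and `PeopleDatabase` and the field `common_characteristic` are hypothetical and are not part of the text's methodology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """Facts (data) about one person, as in the example above."""
    name: str
    address: str
    gender: str


@dataclass
class PeopleDatabase:
    """A set of facts plus the characteristic that relates them."""
    common_characteristic: str  # why these people are kept in one list
    people: list


mary = Person("Mary Smith", "123 4th St.", "female")

# A bare list of facts: nothing says why these people belong together.
facts_only = [mary]

# The same facts become a (tiny) database once the relating
# characteristic is stated explicitly.
golfers = PeopleDatabase(
    common_characteristic="people who play golf",
    people=[mary],
)

print(golfers.common_characteristic, "->", [p.name for p in golfers.people])
```

The only difference between `facts_only` and `golfers` is that the second records an answer to the question of why these names are being kept together.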

2 • Database Design Using Entity-Relationship Diagrams

CHECKPOINT 1.1

1. A tree is classified as a “large oak tree about 100 years old.” What are

three facts about this tree?

2. Another tree has the following characteristics: pine, small, 15 years

old. If I write about the two trees and their facts on a piece of paper,

what do I have?

3. Why is the piece of paper not a database of trees?

1.3 BUILDING A DATABASE

How do we construct a database? Suppose you were asked to put together a

database of items one keeps in a pantry. How would you go about doing this?

You might grab a piece of paper and begin listing items that you see. When you

are done, you would have a database of items in the pantry. Simple enough,

but is it a good database or a poor one? Was your approach to database construction

a good methodology or a not-so-good one? The answer to

these questions would depend on why you constructed the list—who will use

the list and for what. If you are more methodical, you might first ask yourself

how best to construct this database before you grab the paper and begin a list

of items. A bit of forethought might save time in the long run because you

would first consider how the list is to be used and by whom.
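As a rough illustration of how the intended use shapes the list, the sketch below contrasts a quickly written list with one built after asking who will use it. The assumption here is that the user is a shopper who wants to know what to buy; the fields `quantity` and `minimum` and the function `shopping_list` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PantryItem:
    name: str
    quantity: int  # hypothetical field: how many are on the shelf
    minimum: int   # hypothetical field: restock when below this level


# Approach 1: grab a piece of paper and list what you see.
quick_list = ["rice", "beans", "flour"]

# Approach 2: first ask who uses the list and for what. We assume the
# user is a shopper, so each item records what the shopper needs to know.
pantry = [
    PantryItem("rice", quantity=2, minimum=1),
    PantryItem("beans", quantity=0, minimum=2),
    PantryItem("flour", quantity=1, minimum=1),
]


def shopping_list(items):
    """Return the names of items whose quantity is below the minimum."""
    return [item.name for item in items if item.quantity < item.minimum]


print(shopping_list(pantry))
```

The quick list cannot answer the shopper's question at all, which is one way of seeing why the purpose should be settled before the list is written.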

When dealing with software and computer-related activity like databases,

we have a science of “how to” called software engineering (SE). SE is a process

of specifying systems and writing software. To design a good database, we will

use ideas from SE. By being aware of SE and respecting its known systematic

approach, we can see why we handle database design the way we do. In this

chapter, we present a brief outline of SE. After this overview,

we explore database models, in particular the relational

database model, in subsequent chapters. While there are many kinds of database

models, most of the databases in use today are relational. Our focus in

this book is to put forward a methodology based on SE to design a sound

relational database (as opposed to other database models).

CHECKPOINT 1.2

You have a set of books on bookshelves in your house. Your mother asks you

to create a list of all the books she has.

1. Who is going to use this list?

2. When the list is completed, is it a database?

3. What questions should be asked before you begin?

4. What is the question-and-answer procedure in question 3 going to

accomplish?

1.4 WHAT IS THE SOFTWARE ENGINEERING PROCESS?

The term software engineering refers to a process of specifying, designing,

writing, delivering, maintaining, and finally retiring software. Software

engineers often refer to the “life cycle” of software; software has a beginning

and an ending. There are many excellent references on the topic of

SE (Schach, 2011). Some authors use the term software engineering synonymously

with “systems analysis and design,” but the underlying point

is that any information system requires some process to develop it correctly.

SE spans a wide range of information system tasks. The task we are

primarily interested in here is that of specifying and designing a database.

“Specifying a database” means that we will decide on and document what

the database is supposed to contain and how we will go about the overall

task itself.

A basic idea in SE is that, to build software correctly, a series of steps or phases

is required to progress through a life cycle. These steps ensure that a process

of thinking precedes action—thinking through “what is needed”

precedes “what software is written.” Further, the “thinking before action”

necessitates that all parties involved in software development understand

and communicate with one another. One common way of presenting

the thinking-before-acting scenario is the “waterfall” model

(Schach, 2011), in which the software development process is supposed to flow in

one direction without retracing its steps.

Generally, the first step in the SE process involves formally specifying

what is to be done. We break this first step down into two

steps: requirements elucidation and the actual writing of the specification

document. The waterfall model implies that once the specification of the

software is written and accepted by a user, it is not changed, but rather

it is used as a basis for design. One may liken the overall SE exercise to

building a house. The specification is the phase of “what you want in

your house.” Once agreed on, the next step is to design the house to the

specification. As the house is designed and the blueprint is drawn, it is

not acceptable to revisit the specification except for minor alterations.

There has to be a “meeting of the minds” at the end of the specification

phase to move along with the design (the blueprint) of the house to be

constructed. So it is with software and database development. Software

production is a life-cycle process—software (a database) is created, used,

and eventually retired.

The “players” in the software development life cycle may be placed into

two camps, often referred to as the user and the analyst. Software is designed

by the analyst for the user according to the user’s specification. In our presentation,

we will think of ourselves as the analyst trying to enunciate what

the users think they want. Recall the example in this chapter in which your

mother asked you to draw up a list of items in a home library. Here, the

mother is the user; the person drawing up the list of objects is the analyst.

There is no general agreement among software engineers regarding the

exact number of steps or phases in the waterfall-type software development

model. Models vary depending on the interest of the SE researcher

in one part or another in the process. A very brief description of the software

process goes like this (software in the following may be taken to mean

a database):

Step 1 (or Phase 1): Requirements. Find out what the user wants/needs.

The “finding-out procedure” is often called “elucidation.”

Step 2: Specification. Write out the user’s wants/needs as precisely as

possible. In this step, the user and analyst document not only what

is desired but also how much it will cost and how long it will take. A

credo of SE is to generate software on time and on budget.

Step 2a: Feed back the specification to the user. A formal review of the

specification document is performed to see if the analyst (you) has

it right.

Step 2b: Redo the specification as necessary and return to step 2a until

the analyst and the user both understand one another and agree to

move on.

Step 3: Design—software is designed to meet the specification from

step 2. As in house building, now that the analyst knows what is

required, the plan for the software is formalized—the blueprint is

drawn up.

Step 3a: Software design is independently checked against the specification.

If it is necessary, the design is repaired or redone until the

analyst has clearly met the specification. Note the sense of agreement

in step 2 and the use of step 2 as a basis for further action. When step

3 begins, going back up the waterfall is difficult; it is supposed to be

that way. Perhaps minor specification details might be revisited, but

the idea is to move on once each step is finished. Once step 3a is completed,

both the user and the analyst know what is to be done. In the

building-a-house analogy, the blueprint is now drawn up.

One final point here: In the specification, a budget and timeline

are proposed by the analyst and accepted by the user. In the design,

this budgetary part of the overall design is sometimes refined. All SE

takes money and time. It is vital not only to produce a given product

correctly; the ancillary items of time and money must also

be clear to all parties.

Step 4: Development. Software is written; a database is created.

Step 4a: In the development phase, software, as written, is checked

against the design until the analyst has clearly met the design. Note

that the specification in step 2 is long past, and only minor modifications

of the design would be tolerated here. The point of step 4 is to

build the software according to the design (the blueprint, if you will)

from step 3. In our case, the database is actually created and populated

in this phase.

Step 5: Implementation. Software is turned over to the user to be used

in the application.

Step 5a: The user tests the software and either accepts it or rejects it. Rejected

software is corrected until it meets the specification and design.

In our case, the database is queried, data are added or deleted, and the

user uses what was created. A person may think that this is the end of

the software life cycle, but there are two more important steps.

Step 6: Maintenance. Maintenance is performed on the software until

it is retired. No matter how well specified, designed, and written,

some parts of the software may fail. Some parts may need to be modified

over time to suit the user. Times change; demands and needs

change. Maintenance is a very time-consuming and expensive part

of the software process—particularly if the SE process has not been

done well. Maintenance involves correcting hidden software faults

as well as enhancing the functionality of the software.

In databases, new data are often required; some old data may no

longer be needed. Hardware changes. Operating systems change.

The database engine itself, which is software, is often upgraded—

new versions are imposed on the market. The data in the database

must conform to change, and a system of changing the data in the

database has to be in place.

Step 7: Retirement. Eventually, whatever software is written becomes

outdated. Database engines, computers, and technology in general

are all evolving. Think of the old software package you used on some

old personal computer. It does not work any longer because the

operating system has been updated, the computer is obsolete, and

the old software has to be retired. Basically, the SE process has to

start all over with new specifications. The same is true with databases

and designed systems. At times, the most cost-effective thing

to do is to start anew.
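The one-directional flow of the steps above can be sketched as a small state machine. This is a simplified model under stated assumptions: the names `Phase` and `next_phase` are hypothetical, every phase is treated as having a single pass/fail review (the text describes explicit reviews only in steps 2a, 3a, 4a, and 5a), and a failed review simply repeats the current phase.

```python
from enum import IntEnum


class Phase(IntEnum):
    REQUIREMENTS = 1
    SPECIFICATION = 2
    DESIGN = 3
    DEVELOPMENT = 4
    IMPLEMENTATION = 5
    MAINTENANCE = 6
    RETIREMENT = 7


def next_phase(current: Phase, review_passed: bool) -> Phase:
    """Advance only when the review for the current phase passes.

    A failed review repeats the same phase; in this simplified model
    the process never flows back up to an earlier phase.
    """
    if current is Phase.RETIREMENT:
        raise ValueError("retired software starts a new life cycle")
    if not review_passed:
        return current
    return Phase(current + 1)


phase = Phase.SPECIFICATION
phase = next_phase(phase, review_passed=False)  # user rejects the specification
phase = next_phase(phase, review_passed=True)   # user accepts it
print(phase.name)
```

Because `next_phase` can only return the current phase or the following one, the sketch has no path from design back to specification, which mirrors the idea that going back up the waterfall is meant to be difficult.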

CHECKPOINT 1.3

1. In what phase is the database actually created?

2. Which person tests the database?

3. Where does the user say what is wanted in the database?

1.5 ENTITY RELATIONSHIP DIAGRAMS AND THE SOFTWARE ENGINEERING LIFE CYCLE

This text concentrates on steps 1 through 3 of the software life cycle for databases.

A database is a collection of related data. The concept of related data

means that a database stores information about one enterprise: a business,

an organization, a grouping of related people or processes. For example, a

database might contain data about Acme Plumbing and involve customers

and service calls. A different database might be about the members and

activities of the Over 55 Club in town. It would be inappropriate to have data

about the Over 55 Club and Acme Plumbing in the same database because

the two organizations are not related. Again, a database is a collection of

related data. To keep a database about each of the above entities is fine, but

not in the same database.
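To make the "one enterprise per database" idea concrete, here is a minimal sketch using Python's standard `sqlite3` module. The table and column names are assumptions chosen for illustration; the text names only the enterprises and, for Acme Plumbing, customers and service calls.

```python
import sqlite3

# One database per enterprise; ":memory:" keeps the sketch self-contained.
acme = sqlite3.connect(":memory:")
club = sqlite3.connect(":memory:")

# Acme Plumbing: customers and service calls (hypothetical columns).
acme.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE service_call (
        call_id     INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        description TEXT
    );
""")

# Over 55 Club: members and activities (hypothetical columns).
club.executescript("""
    CREATE TABLE member (
        member_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE activity (
        activity_id INTEGER PRIMARY KEY,
        title       TEXT NOT NULL
    );
""")


def table_names(conn):
    """List the tables defined in one database, in alphabetical order."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


print(table_names(acme))
print(table_names(club))
```

Each connection holds only the tables for its own enterprise, so nothing about the club's members appears in the plumbing database and vice versa.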

Database systems are often modeled using an entity relationship (ER)

diagram as the blueprint from which the actual database is built; the blueprint

is the output of the design phase. The ER diagram is an analyst’s tool

to diagram the data to be stored in a database system. Phase 1, the requirements

phase, can be quite frustrating as the analyst has to elicit needs and

wants from the user. The user may or may not be computer sophisticated

and may or may not know the capabilities of a software system. The analyst

often has a difficult time deciphering a user’s needs and wants in order to create

a specification that (a) makes sense to both parties (user and analyst) and

(b) allows the analyst to design efficiently.

In the real world, the user and the analyst may each be committees of

professionals, but the idea is that users (or user groups) must convey their

ideas to an analyst (or team of analysts)—users have to express what they

want and what they think they need; analysts have to elicit these desires,

document them, and create a plan to realize the user’s desires.

User descriptions may seem vague and unstructured. Typically, users

are successful at a business. They know the business; they understand

the business model. The analyst is typically unfamiliar with the

business but understands the computer end of the problem. The

user’s description of the business is as

new to the analyst as the computer jargon is to the user. We present

a methodology that is designed to make the analyst’s language precise

enough so that the user is comfortable with the to-be-designed

database and still provide the analyst with a tool that can be mapped

directly into a database.

In brief, we next review the early steps in the SE life cycle as it applies to

database design.

1.5.1 Phase 1: Get the Requirements for the Database

In phase 1, we listen and ask questions about what facts (data) the user

wants to organize into a database retrieval system. This step often involves

letting users describe how they intend to use the data. You, the analyst,

will eventually provide a process for loading data into and retrieving data

from a database. The analyst often faces a “learning curve”

as the user explains a system he or she knows well to a person who

may be unfamiliar with that specific business.

1.5.2 Phase 2: Specify the Database

Phase 2 involves grammatical descriptions and diagrams of what the

analyst thinks the user wants. Database design is usually accomplished

with an ER diagram that functions as the blueprint for the to-be-designed

database. Since most users are unfamiliar with the notion of an ER diagram,

our methodology will supplement the ER diagram with grammatical

descriptions of what the database is supposed to contain and how the

parts of the database relate to one another. The technical description of a

database can be dry and uninteresting to a user; however, when the analysts

put what they think they heard into English statements, the users and

the analysts have a better meeting of the minds. For example, if the analyst

makes statements like, “All employees must generate invoices,” the user

may then affirm, deny, or modify the declaration to fit what is actually the

case. To continue the example, it makes a big difference in the database if

“all employees must generate invoices” versus “some employees may generate

invoices.”
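The difference between the two statements can be illustrated with a small `sqlite3` sketch. The schema, the sample rows, and the function `employees_without_invoices` are hypothetical; the point is only that the same data are acceptable under one statement and a violation under the other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE invoice (
        invoice_id  INTEGER PRIMARY KEY,
        employee_id INTEGER NOT NULL REFERENCES employee(employee_id)
    );
    INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO invoice VALUES (100, 1);
""")


def employees_without_invoices(conn):
    """Return the names of employees who have generated no invoice."""
    rows = conn.execute("""
        SELECT e.name
        FROM employee AS e
        LEFT JOIN invoice AS i ON i.employee_id = e.employee_id
        WHERE i.invoice_id IS NULL
        ORDER BY e.name
    """)
    return [row[0] for row in rows]


missing = employees_without_invoices(conn)

# "Some employees may generate invoices": Bob's row is perfectly valid.
# "All employees must generate invoices": Bob's row violates the rule.
print(missing)
```

Under the "may" reading the query result is merely informative; under the "must" reading any name it returns marks data the design would have to prevent or flag, which is why the user's confirmation of the statement matters before design begins.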

1.5.3 Phase 3: Design the Database

Once the database has been diagrammed and agreed to, the ER diagram

becomes the finalized blueprint for construction of the database in phase

3. Moving from the ER diagram to the actual database is akin to asking a

builder of houses to take a blueprint and commence construction.

As we have seen, there are more steps in the SE process, but, as stated earlier,

this book is about design; hence, the remaining steps of the waterfall

model are not emphasized.

CHECKPOINT 1.4

1. Briefly describe the major steps of the SE life cycle as it applies to

databases.

2. Who are the two main players in the software development life

cycle?

3. Why is written communication between the parties in the design

process important?

1.6 CHAPTER SUMMARY

This chapter serves as a background chapter. The chapter briefly describes

data, databases, and the SE process. The SE process is presented as it applies

to ER diagrams—the database design blueprint.

CHAPTER 1 EXERCISES

Fred Jones operates a golf shop. He has golf equipment and customers,

and his primary business is selling retail to customers. Fred has so

many customers that he wants to keep track of them on a computer. He

approaches Sally Smith, who is knowledgeable about computers, and asks

her what to do.

1. In our context, Fred is a __________; Sally is a ______________.

2. When Fred explains to Sally what he wants, Sally begins writing

what?

3. When Fred says, “Sally, this specification is all wrong,” what happens

next?

4. If Fred says, “Sally, this specification is acceptable,” what happens

next?

5. If, during the design, Sally finds out that Fred forgot to tell her about

something he wants, what is Sally to do?

6. How does Sally get Fred’s specifications in the first place?

7. Step 3a says: “Software design is independently checked against the

specification.” What does this mean?

BIBLIOGRAPHY

Schach, S. R. 2011. Object-Oriented and Classical Software Engineering. New York:

McGraw-Hill.

MaxiCompu Fagaad

Monday, August 7, 2017
