
Monday, August 7, 2017

Data, Databases, and the Software Engineering Process

1  Data, Databases, and the Software Engineering Process

1.1  INTRODUCTION

In this chapter, we introduce some concepts and ideas that are fundamental
to our presentation of the design of a database. We define data,
describe the notion of a database, and explore a process of how to design
a database.

1.2  DATA

Data, as we use the term, are facts about something or someone. For
example, a person has a name, an address, and a gender. Some data
(facts) about a specific person might be “Mary Smith,” “123 4th St.,”
“female.” If we had a list of several people’s names, addresses, and genders,
we would have a set of facts about several people. A database is a
collection of related data. For this “set of facts about several people” to
be a database, we would expect that the people in the database had something
in common—that they were “related” in some way. Here related
does not imply a familial relationship, but rather something more like
“people who play golf,” “people who have dogs,” or “people I interviewed
on the street today.” In a “database of people,” one expects the people to
have some common characteristic that ties them together. A “set of facts
about some people” is not a database until the common characteristic is
also defined. To put it another way: Why are these people’s names and
addresses being kept in one list?
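
As a minimal sketch of this distinction in Python, a bare list of facts can be contrasted with a collection that also records why its members belong together. The names `Person` and `PeopleDatabase`, and the second sample person, are hypothetical and introduced only for illustration; they do not come from the text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Person:
    """Facts (data) about one person."""
    name: str
    address: str
    gender: str


@dataclass
class PeopleDatabase:
    """A collection of people plus the characteristic that relates them."""
    common_characteristic: str
    people: list[Person] = field(default_factory=list)

    def add(self, person: Person) -> None:
        self.people.append(person)


# A set of facts about several people. Nothing here says why these
# people are listed together, so it is not yet a database in our sense.
facts = [
    Person("Mary Smith", "123 4th St.", "female"),
    Person("John Doe", "45 Elm Ave.", "male"),  # hypothetical sample person
]

# Stating the common characteristic answers the question
# "Why are these people kept in one list?"
golfers = PeopleDatabase(common_characteristic="people who play golf")
for person in facts:
    golfers.add(person)

print(golfers.common_characteristic, len(golfers.people))
```

The only difference between `facts` and `golfers` is the stated common characteristic, which is the point of the paragraph above.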

2      • Database Design Using Entity-Relationship Diagrams

CHECKPOINT 1.1

1. A tree is classified as a “large oak tree about 100 years old.” What are
three facts about this tree?
2. Another tree has the following characteristics: pine, small, 15 years
old. If I write about the two trees and their facts on a piece of paper,
what do I have?
3. Why is the piece of paper not a database of trees?

1.3 BUILDING A DATABASE

How do we construct a database? Suppose you were asked to put together a
database of items one keeps in a pantry. How would you go about doing this?
You might grab a piece of paper and begin listing items that you see. When you
are done, you would have a database of items in the pantry. Simple enough,
but is it a good database or a poor one? Was your approach to database construction
a good methodology or a not-so-good one? The answer to
these questions would depend on why you constructed the list—who will use
the list and for what. If you are more methodical, you might first ask yourself
how best to construct this database before you grab the paper and begin a list
of items. A bit of prethinking might save time in the long run because you
might think about how the list was to be used and by whom.
When dealing with software and computer-related activity like databases,
we have a science of “how to” called software engineering (SE). SE is a process
of specifying systems and writing software. To design a good database, we will
use ideas from SE. By being aware of SE and respecting its known systematic
approach, we can see why we handle database design the way we do. In this
chapter, we present a brief outline of SE. In subsequent chapters, we explore
database models, in particular the relational database model. While there are
many kinds of database
models, most of the databases in use today are relational. Our focus in
this book is to put forward a methodology based on SE to design a sound
relational database (as opposed to other database models).

CHECKPOINT 1.2

You have a set of books on bookshelves in your house. Your mother asks you
to create a list of all the books she has.
1. Who is going to use this list?
2. When the list is completed, is it a database?
3. What questions should be asked before you begin?
4. What is the question-and-answer procedure in question 3 going to
accomplish?

1.4 WHAT IS THE SOFTWARE ENGINEERING PROCESS?

The term software engineering refers to a process of specifying, designing,
writing, delivering, maintaining, and finally retiring software. Software
engineers often refer to the “life cycle” of software; software has a beginning
and an ending. There are many excellent references on the topic of
SE (Schach, 2011). Some authors use the term software engineering synonymously
with “systems analysis and design,” but the underlying point
is that any information system requires some process to develop it correctly.
SE spans a wide range of information system tasks. The task we are
primarily interested in here is that of specifying and designing a database.
“Specifying a database” means that we will decide on and document what
the database is supposed to contain and how we will go about the overall
task itself.
A basic idea in SE is that, to build software correctly, a series of steps or phases
is required to progress through a life cycle. These steps ensure that a process
of thinking precedes action—thinking through “what is needed”
precedes “what software is written.” Further, the “thinking before action”
necessitates that all parties involved in software development understand
and communicate with one another. One common way of presenting
this thinking-before-acting approach is the “waterfall” model
(Schach, 2011): the software development process is supposed to flow in
one direction without retracing its steps.
Generally, the first step in the SE process involves formally specifying
what is to be done. We break this first step down into two
steps: requirements elucidation and the actual writing of the specification
document. The waterfall model implies that once the specification of the
software is written and accepted by a user, it is not changed, but rather
it is used as a basis for design. One may liken the overall SE exercise to
building a house. The specification is the phase of “what you want in
your house.” Once agreed on, the next step is to design the house to the
specification. As the house is designed and the blueprint is drawn, it is
not acceptable to revisit the specification except for minor alterations.
There has to be a “meeting of the minds” at the end of the specification
phase to move along with the design (the blueprint) of the house to be
constructed. So it is with software and database development. Software
production is a life-cycle process—software (a database) is created, used,
and eventually retired.

The “players” in the software development life cycle may be placed into
two camps, often referred to as the user and the analyst. Software is designed
by the analyst for the user according to the user’s specification. In our presentation,
we will think of ourselves as the analyst trying to enunciate what
the users think they want. Recall the example in this chapter in which your
mother asked you to draw up a list of items in a home library. Here, the
mother is the user; the person drawing up the list of objects is the analyst.
There is no general agreement among software engineers regarding the
exact number of steps or phases in the waterfall-type software development
model. Models vary depending on the interest of the SE researcher
in one part or another in the process. A very brief description of the software
process goes like this (software in the following may be taken to mean
a database):

Step 1 (or Phase 1): Requirements. Find out what the user wants/needs.
The “finding-out procedure” is often called “elucidation.”

Step 2: Specification. Write out the user’s wants/needs as precisely as
possible. In this step, the user and analyst document not only what
is desired but also how much it will cost and how long it will take. A
credo of SE is to generate software on time and on budget.

Step 2a: Feed back the specification to the user. A formal review of the
specification document is performed to see if the analyst (you) has
it right.

Step 2b: Redo the specification as necessary and return to step 2a until
the analyst and the user both understand one another and agree to
move on.

Step 3: Design. Software is designed to meet the specification from
step 2. As in house building, now that the analyst knows what is
required, the plan for the software is formalized: the blueprint is
drawn up.

Step 3a: Software design is independently checked against the specification.
If it is necessary, the design is repaired or redone until the
analyst has clearly met the specification. Note the sense of agreement
in step 2 and the use of step 2 as a basis for further action. When step
3 begins, going back up the waterfall is difficult; it is supposed to be
that way. Perhaps minor specification details might be revisited, but
the idea is to move on once each step is finished. Once step 3a is completed,
both the user and the analyst know what is to be done. In the
building-a-house analogy, the blueprint is now drawn up.
One final point here: In the specification, a budget and timeline
are proposed by the analyst and accepted by the user. In the design,
this budgetary part of the overall design is sometimes refined. All SE
takes money and time. Not only is it vital to produce a given product
correctly, but the ancillary items of time and money must also
be clear to all parties.

Step 4: Development. Software is written; a database is created.

Step 4a: In the development phase, software, as written, is checked
against the design until the analyst has clearly met the design. Note
that the specification in step 2 is long past, and only minor modifications
of the design would be tolerated here. The point of step 4 is to
build the software according to the design (the blueprint, if you will)
from step 3. In our case, the database is actually created and populated
in this phase.

Step 5: Implementation. Software is turned over to the user to be used
in the application.

Step 5a: The user tests the software and accepts or rejects it. Rejected
software is reworked until it is written correctly (that is, until it meets
the specification and design).
In our case, the database is queried, data are added or deleted, and the
user uses what was created. A person may think that this is the end of
the software life cycle, but there are two more important steps.

Step 6: Maintenance. Maintenance is performed on the software until
it is retired. No matter how well specified, designed, and written,
some parts of the software may fail. Some parts may need to be modified
over time to suit the user. Times change; demands and needs
change. Maintenance is a very time-consuming and expensive part
of the software process—particularly if the SE process has not been
done well. Maintenance involves correcting hidden software faults
as well as enhancing the functionality of the software.
In databases, new data are often required; some old data may no
longer be needed. Hardware changes. Operating systems change.
The database engine itself, which is software, is often upgraded—
new versions are imposed on the market. The data in the database
must conform to change, and a system of changing the data in the
database has to be in place.

Step 7: Retirement. Eventually, whatever software is written becomes
outdated. Database engines, computers, and technology in general
are all evolving. Think of the old software package you used on some
old personal computer. It does not work any longer because the
operating system has been updated, the computer is obsolete, and
the old software has to be retired. Basically, the SE process has to
start all over with new specifications. The same is true with databases
and designed systems. At times, the most cost-effective thing
to do is to start anew.
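
The one-directional flow of the seven steps above can be sketched in Python as follows. This is an illustrative simplification under stated assumptions, not a formal definition of the waterfall model: the `Phase` enum, the `next_phase` function, and the list of review outcomes are all hypothetical names and values. A failed review (as in steps 2a/2b, 3a, 4a, and 5a) repeats the current phase, and the sketch never moves back up to an earlier phase.

```python
from enum import IntEnum


class Phase(IntEnum):
    REQUIREMENTS = 1
    SPECIFICATION = 2
    DESIGN = 3
    DEVELOPMENT = 4
    IMPLEMENTATION = 5
    MAINTENANCE = 6
    RETIREMENT = 7


def next_phase(current: Phase, review_passed: bool) -> Phase:
    """Advance one step down the waterfall, or stay put to rework.

    A failed review repeats the current phase; retirement is final.
    The function never returns a phase earlier than `current`.
    """
    if not review_passed or current is Phase.RETIREMENT:
        return current
    return Phase(current + 1)


# Hypothetical review outcomes: the user rejects the specification once
# (step 2b) before agreeing to move on.
reviews = [True, False, True, True, True, True, True]

phase = Phase.REQUIREMENTS
history = [phase]
for passed in reviews:
    phase = next_phase(phase, passed)
    history.append(phase)

print([p.name for p in history])
```

In this run the specification phase appears twice in `history`, mirroring the loop between steps 2a and 2b, and every later phase appears once.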

CHECKPOINT 1.3

1. In what phase is the database actually created?
2. Which person tests the database?
3. Where does the user say what is wanted in the database?

1.5 ENTITY RELATIONSHIP DIAGRAMS AND THE
SOFTWARE ENGINEERING LIFE CYCLE

This text concentrates on steps 1 through 3 of the software life cycle for databases.
A database is a collection of related data. The concept of related data
means that a database stores information about one enterprise: a business,
an organization, a grouping of related people or processes. For example, a
database might contain data about Acme Plumbing and involve customers
and service calls. A different database might be about the members and
activities of the Over 55 Club in town. It would be inappropriate to have data
about the Over 55 Club and Acme Plumbing in the same database because
the two organizations are not related. Again, a database is a collection of
related data. To keep a database about each of the above entities is fine, but
not in the same database.
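
As a rough illustration of keeping unrelated enterprises apart, the following Python sketch uses the standard `sqlite3` module to create two separate in-memory databases. The table and column names (`customer`, `service_call`, `member`, `activity`) and the helper `table_names` are assumptions made for this example; the text names only the enterprises and the kinds of things they track.

```python
import sqlite3

# Two separate in-memory databases, one per enterprise.
acme = sqlite3.connect(":memory:")  # Acme Plumbing
club = sqlite3.connect(":memory:")  # Over 55 Club

# Hypothetical tables for Acme Plumbing: customers and service calls.
acme.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
acme.execute(
    "CREATE TABLE service_call ("
    "call_id INTEGER PRIMARY KEY, customer_id INTEGER, description TEXT)"
)

# Hypothetical tables for the Over 55 Club: members and activities.
club.execute("CREATE TABLE member (member_id INTEGER PRIMARY KEY, name TEXT)")
club.execute("CREATE TABLE activity (activity_id INTEGER PRIMARY KEY, title TEXT)")


def table_names(conn: sqlite3.Connection) -> list[str]:
    """Return the names of the tables stored in one database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


acme_tables = table_names(acme)
club_tables = table_names(club)
print(acme_tables)
print(club_tables)
```

Each connection sees only its own tables, so data about the two organizations never share a database.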
Database systems are often modeled using an entity relationship (ER)
diagram as the blueprint from which the actual database is built; the blueprint
is the output of the design phase. The ER diagram is an analyst’s tool
to diagram the data to be stored in a database system. Phase 1, the requirements
phase, can be quite frustrating as the analyst has to elicit needs and
wants from the user. The user may or may not be computer sophisticated
and may or may not know the capabilities of a software system. The analyst
often has a difficult time deciphering a user’s needs and wants in order to
create a specification that (a) makes sense to both parties (user and analyst) and
(b) allows the analyst to design efficiently.
In the real world, the user and the analyst may each be committees of
professionals, but the idea is that users (or user groups) must convey their
ideas to an analyst (or team of analysts)—users have to express what they
want and what they think they need; analysts have to elicit these desires,
document them, and create a plan to realize the user’s desires.
User descriptions may seem vague and unstructured. Typically, users
are successful at a business. They know the business; they understand
the business model. The computer person is typically ignorant of the
business but understands the computer end of the problem. The user’s
description of the business is as new to the analyst as the computer
jargon is to the user. We present
a methodology that is designed to make the analyst’s language precise
enough so that the user is comfortable with the to-be-designed
database and still provide the analyst with a tool that can be mapped
directly into a database.
In brief, we next review the early steps in the SE life cycle as it applies to
database design.

1.5.1 Phase 1: Get the Requirements for the Database

In phase 1, we listen and ask questions about what facts (data) the user
wants to organize into a database retrieval system. This step often involves
letting users describe how they intend to use the data. You, the analyst,
will eventually provide a process for loading data into and retrieving data
from a database. There is often a “learning curve” for the analyst
as users explain the system they know so well to a person who
may be unfamiliar with their specific business.

1.5.2 Phase 2: Specify the Database

Phase 2 involves grammatical descriptions and diagrams of what the
analyst thinks the user wants. Database design is usually accomplished
with an ER diagram that functions as the blueprint for the to-be-designed
database. Since most users are unfamiliar with the notion of an ER diagram,
our methodology will supplement the ER diagram with grammatical
descriptions of what the database is supposed to contain and how the
parts of the database relate to one another. The technical description of a
database can be dry and uninteresting to a user; however, when the analysts
put what they think they heard into English statements, the users and
the analysts have a better meeting of the minds. For example, if the analyst
makes statements like, “All employees must generate invoices,” the user
may then affirm, deny, or modify the declaration to fit what is actually the
case. To continue the example, it makes a big difference in the database whether
“all employees must generate invoices” or “some employees may generate
invoices.”
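
To see why the wording matters, consider a small Python sketch that checks the same data against each statement. The `Invoice` class, the function names, and the sample employee and invoice numbers are hypothetical, introduced only to illustrate the difference between “must” and “may.”

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    invoice_id: int
    employee_id: int  # the employee who generated this invoice


def employees_without_invoices(employee_ids: set[int], invoices: list[Invoice]) -> set[int]:
    """Return the employees who have generated no invoice at all."""
    generating = {inv.employee_id for inv in invoices}
    return employee_ids - generating


def satisfies_rule(employee_ids: set[int], invoices: list[Invoice], mandatory: bool) -> bool:
    """Check the data against one of the two English statements.

    mandatory=True  models "all employees must generate invoices".
    mandatory=False models "some employees may generate invoices".
    """
    if not mandatory:
        return True  # "may" places no requirement on any employee
    return not employees_without_invoices(employee_ids, invoices)


# Hypothetical sample data: employee 3 has generated no invoice.
employees = {1, 2, 3}
invoices = [Invoice(100, 1), Invoice(101, 2)]

must_rule_ok = satisfies_rule(employees, invoices, mandatory=True)
may_rule_ok = satisfies_rule(employees, invoices, mandatory=False)
print(must_rule_ok, may_rule_ok)
```

The same data violate the “must” statement and satisfy the “may” statement, which is why the user needs to confirm which one describes the business.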

1.5.3 Phase 3: Design the Database

Once the database has been diagrammed and agreed to, the ER diagram
becomes the finalized blueprint for construction of the database in phase
3. Moving from the ER diagram to the actual database is akin to asking a
builder of houses to take a blueprint and commence construction.
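
As a hedged sketch of what “commencing construction” might look like, the following Python code uses the standard `sqlite3` module to build two tables from an assumed fragment of an ER diagram for the Acme Plumbing example: a customer entity, a service call entity, and a relationship in which each service call belongs to one customer. The entities' attributes, the table and column names, and the sample rows are assumptions made for illustration, not a mapping prescribed by this chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Each entity becomes a table; the relationship becomes a foreign key.
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE service_call (
        call_id     INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
        description TEXT NOT NULL
    );
    """
)

# Populate the database with hypothetical sample rows.
conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Pat Lee')")
conn.execute(
    "INSERT INTO service_call (customer_id, description) VALUES (1, 'Leaking faucet')"
)

# A service call for a customer that does not exist violates the blueprint.
try:
    conn.execute(
        "INSERT INTO service_call (customer_id, description) VALUES (99, 'No such customer')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

rows = conn.execute(
    "SELECT c.name, s.description "
    "FROM customer AS c JOIN service_call AS s ON s.customer_id = c.customer_id"
).fetchall()
print(rows, orphan_rejected)
```

The foreign key plays the role of the relationship line in the diagram: the database itself refuses a service call that does not belong to a known customer.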
As we have seen, there are more steps in the SE process, but also as stated,
this book is about design and hence the remaining steps of the waterfall
model are not emphasized.

CHECKPOINT 1.4

1. Briefly describe the major steps of the SE life cycle as it applies to
databases.
2. Who are the two main players in the software development life
cycle?
3. Why is written communication between the parties in the design
process important?

1.6 CHAPTER SUMMARY

This chapter serves as a background chapter. The chapter briefly describes
data, databases, and the SE process. The SE process is presented as it applies
to ER diagrams—the database design blueprint.

CHAPTER 1 EXERCISES

Fred Jones operates a golf shop. He has golf equipment and customers,
and his primary business is selling retail to customers. Fred has so
many customers that he wants to keep track of them on a computer. He
approaches Sally Smith, who is knowledgeable about computers, and asks
her what to do.

1. In our context, Fred is a __________; Sally is a ______________.
2. When Fred explains to Sally what he wants, Sally begins writing
what?
3. When Fred says, “Sally, this specification is all wrong,” what happens
next?
4. If Fred says, “Sally, this specification is acceptable,” what happens
next?
5. If, during the design, Sally finds out that Fred forgot to tell her about
something he wants, what is Sally to do?
6. How does Sally get Fred’s specifications in the first place?
7. Step 3a says: “Software design is independently checked against the
specification.” What does this mean?

BIBLIOGRAPHY

Schach, S. R. 2011. Object-Oriented and Classical Software Engineering. New York:
McGraw-Hill.


