How does normalization affect performance

If data that exists in more than one place must be changed, the data must be changed in exactly the same way in all locations. A customer address change is much easier to implement if that data is stored only in the Customers table and nowhere else in the database.

What is an "inconsistent dependency"? While it is intuitive for a user to look in the Customers table for the address of a particular customer, it may not make sense to look there for the salary of the employee who calls on that customer.

The employee's salary is related to, or dependent on, the employee and thus should be moved to the Employees table. Inconsistent dependencies can make data difficult to access because the path to find the data may be missing or broken.

There are a few rules for database normalization. Each rule is called a "normal form." As with many formal rules and specifications, real-world scenarios do not always allow for perfect compliance.

In general, normalization requires additional tables and some customers find this cumbersome. If you decide to violate one of the first three rules of normalization, make sure that your application anticipates any problems that could occur, such as redundant data and inconsistent dependencies.

Do not use multiple fields in a single table to store similar data. For example, to track an inventory item that may come from two possible sources, an inventory record may contain fields for Vendor Code 1 and Vendor Code 2. What happens when you add a third vendor? Adding a field is not the answer; it requires program and table modifications and does not smoothly accommodate a dynamic number of vendors. Instead, place all vendor information in a separate table called Vendors, then link inventory to vendors with an item number key, or vendors to inventory with a vendor code key.
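As a minimal sketch of the vendor example above, the following uses Python's built-in `sqlite3` module. The table and column names (`Inventory`, `Vendors`, `ItemVendors`, `item_no`, `vendor_code`) and the sample rows are assumptions made for illustration. The link table is only one possible way to connect items to vendors.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative only.
SCHEMA = """
CREATE TABLE Inventory (
    item_no     INTEGER PRIMARY KEY,
    description TEXT NOT NULL
);
CREATE TABLE Vendors (
    vendor_code TEXT PRIMARY KEY,
    name        TEXT NOT NULL
);
-- One row per (item, vendor) pair replaces the
-- "Vendor Code 1" / "Vendor Code 2" fields.
CREATE TABLE ItemVendors (
    item_no     INTEGER NOT NULL REFERENCES Inventory(item_no),
    vendor_code TEXT    NOT NULL REFERENCES Vendors(vendor_code),
    PRIMARY KEY (item_no, vendor_code)
);
"""


def build_inventory_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_vendor_for_item(conn, item_no, vendor_code, vendor_name):
    """Register a vendor (if it is new) and link it to an item."""
    conn.execute(
        "INSERT OR IGNORE INTO Vendors VALUES (?, ?)",
        (vendor_code, vendor_name),
    )
    conn.execute(
        "INSERT INTO ItemVendors VALUES (?, ?)",
        (item_no, vendor_code),
    )


def vendors_for_item(conn, item_no):
    """Return the vendor codes linked to an item, in sorted order."""
    rows = conn.execute(
        "SELECT vendor_code FROM ItemVendors "
        "WHERE item_no = ? ORDER BY vendor_code",
        (item_no,),
    )
    return [code for (code,) in rows]


conn = build_inventory_db()
conn.execute("INSERT INTO Inventory VALUES (1, 'Widget')")
# A third vendor is just another row; no table change is needed.
for code, name in [("V1", "First Supply"), ("V2", "Second Supply"),
                   ("V3", "Third Supply")]:
    add_vendor_for_item(conn, 1, code, name)
print(vendors_for_item(conn, 1))
```

With this layout, adding a third or fourth vendor requires only an insert, not a change to the table definition or the program.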

Records should not depend on anything other than a table's primary key (a compound key, if necessary). For example, consider a customer's address in an accounting system.

Instead of storing the customer's address as a separate entry in every table that needs it, store it in one place, either in the Customers table or in a separate Addresses table.

There are some goals in mind when undertaking the data normalization process.
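The effect of keeping the address in one place can be sketched with Python's `sqlite3` module. The `Orders` table, the column names, and the sample data below are assumptions chosen for illustration. The point is only that a single `UPDATE` on `Customers` is seen by every query that joins to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical tables; the address lives only in Customers.
CREATE TABLE Customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    address     TEXT NOT NULL
);
CREATE TABLE Orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES Customers(customer_id)
);
INSERT INTO Customers VALUES (1, 'Acme', '1 Old Road');
INSERT INTO Orders VALUES (100, 1);
INSERT INTO Orders VALUES (101, 1);
""")


def change_address(conn, customer_id, new_address):
    """Update the single stored copy of a customer's address."""
    conn.execute(
        "UPDATE Customers SET address = ? WHERE customer_id = ?",
        (new_address, customer_id),
    )


def order_addresses(conn):
    """Look up the address for each order through the customer key."""
    rows = conn.execute(
        "SELECT o.order_id, c.address "
        "FROM Orders AS o JOIN Customers AS c "
        "ON c.customer_id = o.customer_id "
        "ORDER BY o.order_id"
    )
    return rows.fetchall()


change_address(conn, 1, "2 New Street")
print(order_addresses(conn))
```

Because no order row stores its own copy of the address, there is nothing that can fall out of step after the change.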

The first goal is to get rid of any duplicate data that might appear within the data set. The process goes through the database and eliminates any redundancies it finds. Removing them cleans up the data and makes it easier to analyze. The other goal is to group data logically. Data items that relate to each other should be stored together, and this is the case in a database that has undergone data normalization.

If data items depend on each other, they should be in close proximity within the data set. While the process can vary depending on the type of database you have and the type of information you collect, it usually involves several steps. One such step is eliminating duplicate data, as discussed above. Another step is resolving conflicting data. Sometimes a dataset contains entries that contradict each other, so data normalization is meant to resolve these conflicts before continuing.

A third step is formatting the data. This takes data and converts it into a format that allows further processing and analysis to be done.

Finally, data normalization consolidates data, combining it into a much more organized structure. Consider the state of big data today and how much of it consists of unstructured data. Organizing it and turning it into a structured form is needed now more than ever, and data normalization helps with that effort. Put in simple terms, a properly designed and well-functioning database should undergo data normalization in order to be used successfully.
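The steps described above (formatting, removing duplicates, resolving conflicts, consolidating) can be sketched in a few lines of Python. The record fields (`email`, `name`), the choice of `email` as the matching key, and the rule that a later record wins a conflict are all assumptions made for this illustration. Real data sets need rules chosen for the data at hand.

```python
def format_record(record):
    """Convert one raw record into a consistent format."""
    return {
        # Assumed key field: trimmed and lower-cased for matching.
        "email": record["email"].strip().lower(),
        # Collapse repeated whitespace and use title case for names.
        "name": " ".join(record["name"].split()).title(),
    }


def normalize_records(records):
    """Format records, drop duplicates, and resolve conflicts.

    Assumption: two records with the same email describe the same
    person, and the later record wins when they disagree.
    """
    by_email = {}
    for record in records:
        clean = format_record(record)
        by_email[clean["email"]] = clean
    # Consolidated result: one record per key.
    return list(by_email.values())


raw = [
    {"email": " Ann@Example.com ", "name": "ann  lee"},
    {"email": "ann@example.com", "name": "Ann Lee"},
    {"email": "bob@example.com", "name": "bob ray"},
]
cleaned = normalize_records(raw)
print(cleaned)
```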

Data normalization gets rid of a number of anomalies that can make analysis of the data more complicated. Tables are related through matching fields. In most cases, these matching fields are the primary key from one table and a foreign key in the other table. The kind of relationship that the system creates depends on how the related fields are defined.

When you physically join two tables by connecting fields with related information, you create a relationship that is recognized by the system.

The specified relationship is important. It tells the system how to find and display information from fields in two or more tables. The program needs to know whether to look for only one record in a table or to look for several records on the basis of the relationship.

A database is an organised, integrated collection of data items. The integration is important; data items relate to other data items, and groups of related data items, called entities, relate to other entities.

The relationships between entities can be one of three types: one-to-one (1:1), one-to-many (1:M), and many-to-many (M:N). The relationships are usually represented through a technique called entity relationship modeling (ERM). Entities can be of two types: noun-type entities (people, places, and things) and verb-type entities (actions and interactions between the noun-type entities).

Entity relationship modeling is a way to graphically represent the structural database design and to model the informational requirements. The result of the entity modeling efforts is an entity-relationship diagram (ERD). Entities in an entity-relationship diagram eventually become tables in a database. NOTE: Although there are many different entity-modeling methodologies, there are two commonly accepted rules.

A primary key (PK) is an attribute (data column) in a table that serves a special purpose.

The data items that make up this attribute are unique; no two values in this data column can be the same. The primary key value serves to uniquely identify each row in the table. Each table must have an explicitly designated primary key. Each table has only one primary key; however, it can include more than one attribute (called a composite or concatenated primary key).
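A composite primary key can be sketched with Python's `sqlite3` module as follows. The `Enrollments` table and its columns are hypothetical names chosen for this example. The database rejects a second row with the same key combination by raising `sqlite3.IntegrityError`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the pair (student_id, course_code) is the
# composite primary key, so each pair may appear only once.
conn.execute("""
    CREATE TABLE Enrollments (
        student_id  INTEGER NOT NULL,
        course_code TEXT    NOT NULL,
        grade       TEXT,
        PRIMARY KEY (student_id, course_code)
    )
""")


def try_insert(conn, row):
    """Insert a row; return False if it duplicates an existing key."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO Enrollments VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert(conn, (1, "DB101", "A"))      # new key pair
second = try_insert(conn, (1, "DB102", "B"))     # same student, other course
duplicate = try_insert(conn, (1, "DB101", "C"))  # key pair already used
print(first, second, duplicate)
```

Neither column is unique on its own here; only the combination identifies a row.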

A foreign key (FK) is an attribute (data field) that forms an implied link between two tables that are in a 1:M relationship. The foreign key, which is a column in the table of the many, is usually a primary key in the table of the one.

Foreign keys represent a type of controlled redundancy. This is known as a parent-child relationship between tables. Also, a new record cannot be added to the related table if there is no associated record in the primary table. In addition to specifying relationships between two tables in a database, you also set up referential integrity rules that help in maintaining a degree of accuracy between tables.

This type of problem could be catastrophic for any system. The referential integrity rules keep the relationships between tables intact and unbroken in a relational database management system. Referential integrity prohibits you from changing existing data in ways that invalidate and harm the links between tables. The system checks each time a key field, whether primary or foreign, is added, changed, or deleted.

If any of these listed actions creates an invalid relationship between two tables, it is said to violate referential integrity. Referential integrity is a system of rules that Microsoft Access uses to ensure that relationships between records in related tables are valid, and that you don't accidentally delete or incorrectly change related data.

Normalization is part of successful database design. Without normalization, database systems can be inaccurate, slow, and inefficient, and they might not produce the data you expect.
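The text above describes referential integrity in Microsoft Access. As a rough illustration of the same idea, the sketch below uses SQLite through Python's `sqlite3` module instead, where foreign key checks must be switched on with `PRAGMA foreign_keys = ON`. The `Customers` and `Orders` tables and the helper names are assumptions made for this example.

```python
import sqlite3


def open_db():
    """Create a small parent/child schema with enforcement enabled."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE Customers (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE Orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL
                        REFERENCES Customers(customer_id)
        );
        INSERT INTO Customers VALUES (1, 'Acme');
        INSERT INTO Customers VALUES (2, 'Globex');
    """)
    return conn


def add_order(conn, order_id, customer_id):
    """Add a child row; fails if the parent customer does not exist."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Orders VALUES (?, ?)", (order_id, customer_id)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def delete_customer(conn, customer_id):
    """Delete a parent row; fails while orders still reference it."""
    try:
        with conn:
            conn.execute(
                "DELETE FROM Customers WHERE customer_id = ?", (customer_id,)
            )
        return True
    except sqlite3.IntegrityError:
        return False


db = open_db()
print(add_order(db, 10, 1))    # parent exists
print(add_order(db, 11, 99))   # no such customer
print(delete_customer(db, 1))  # still referenced by order 10
print(delete_customer(db, 2))  # no orders reference this customer
```

Both rejected operations would have left an order pointing at a customer that does not exist, which is the kind of invalid relationship the rules are meant to prevent.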

SUMMARY: First normal form (1NF) is the "basic" level of normalization and generally corresponds to the definition of any database, namely: it contains two-dimensional tables with rows and columns; each column corresponds to a sub-object or an attribute of the object represented by the entire table; each row represents a unique instance of that sub-object or attribute and must be different in some way from any other row (that is, no duplicate rows are possible); and all entries in any column must be of the same kind. For example, in the column labeled "Student Names" only student names are permitted.

SUMMARY: Second normal form (2NF): at this level of normalization, each column in a table that is not a determiner of the contents of another column must itself be a function of the other columns in the table.


