Data Warehousing
Data Warehouses
MGT 327
April 13th, 2004
In the past decade, we have witnessed a computer revolution that was once unimaginable. Ten to fifteen years ago, few people would have imagined what computers would do for business. Furthermore, the Internet and the ability to conduct electronic commerce have changed the way we behave as consumers. One of the emerging concepts of the computer revolution in the past ten years has been data warehousing. In the following pages, we will examine this concept in the broadest sense. First, we will look at a brief history of how databases and data warehouses have developed. Then we will look at data warehousing itself: what it is and how it is defined. Next, we will focus on how it ties in with the Internet and intranets and how this is affecting business today. Finally, the discussion will conclude with what the future might hold for how information is stored and the role the Internet will play in it. Before examining the development of data warehousing and how databases are emerging in business, let's first review what was done with data before data warehouses to better understand this issue.
In the 1970s, virtually all business system development was done on IBM mainframes using tools like COBOL and IMS. The 1980s brought about mini-computer platforms such as AS/400 and VAX/VMS. The late 1980s and early 1990s made UNIX a popular platform with the introduction of client/server architecture. During the past decade, the sharply increasing popularity of the personal computer on business desktops has introduced many new options and opportunities for business analysis. The gap between the programmer and the end user has started to close, as analysts now have at their fingertips many of the tools required to become proficient with spreadsheets and databases. The most important factor in the evolution of data warehousing has been the sharply increasing power of computer hardware. As that power has increased, hardware prices have fallen just as sharply. This has played a key role in business today. High costs and huge mainframes are no longer dominant factors in our ability to do business. The wide array of choices available on the PC has allowed databases to evolve quickly, both commercially and on the information superhighway.
So what is a data warehouse? A data warehouse is much like a storage or distribution warehouse. It is simply a place where data can be gathered, organized in an orderly fashion, and made available for easy access when required. A physical warehouse makes it easy to locate the required goods when an order needs to be picked for delivery to the customer; a data warehouse does the same when data is required by the end user. Data warehousing assembles and organizes data from enterprise operations such as transaction systems (registers, online order systems, etc.) and stores the data in a format that business or technical people can analyze. The data warehouse is then made accessible through different means to those individuals in need of detailed information. The following diagram depicts the role a data warehouse plays in an order process system.
There are many benefits to using a data warehouse in this way. First, the information is non-volatile. This means that once the data is in the data warehouse, it is not normally modified. For example, the order status doesn't change, the inventory snapshot doesn't change, and the marketing data doesn't change. It is important to realize that once data is brought into the data warehouse, it should be modified only on rare occasions. It is very difficult, if not impossible, to maintain dynamic data in the data warehouse. Many data warehousing projects have failed miserably when they attempted to synchronize volatile data between the operational and data warehousing systems.
Second, data put into a warehouse can be combined from several applications, making a vast amount of information available to the end user. Data warehousing systems are most successful when data can be combined from more than one operational system. When the data needs to be brought together from more than one source application, it is only natural to integrate the information in a place entirely separate from the source applications. The data warehouse may very effectively combine data from multiple source applications such as sales, marketing, finance, and production. Many large data warehouses allow for the source applications to be integrated incrementally. The primary reason for combining data from multiple sources is the ability to cross-reference data from these applications.
Third, nearly all data in a typical warehouse is built around time. Time is the primary criterion for filtering the information going into and out of the data warehouse. For example, an analyst may generate queries for a given week, month, quarter, or year. If designed properly, the data warehouse can allow for a year-to-year analysis even though a base operational application has changed.
Finally, data put into a warehouse can be stored over a long period of time.
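The properties described above (data that is loaded but not updated, data combined from more than one source application, and time as the main filter) can be illustrated with a small sketch. This is only an illustration under stated assumptions, not a description of any particular product: the table names, column names, and sample figures are all hypothetical, and Python's built-in sqlite3 module stands in for a real warehouse database.

```python
import sqlite3

# Hypothetical extracts from two operational systems (illustrative data only).
sales_rows = [
    ("2004-01-15", "widget", 120.0),
    ("2004-02-20", "widget", 80.0),
    ("2004-04-05", "gadget", 200.0),
]
marketing_rows = [
    ("2004-01-10", "widget", 30.0),
    ("2004-04-01", "gadget", 50.0),
]


def build_warehouse():
    """Load both source extracts into one in-memory warehouse."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales_fact (sale_date TEXT, product TEXT, revenue REAL)"
    )
    conn.execute(
        "CREATE TABLE marketing_fact (spend_date TEXT, product TEXT, spend REAL)"
    )
    # Rows are only inserted, never updated: the warehouse holds snapshots.
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", sales_rows)
    conn.executemany("INSERT INTO marketing_fact VALUES (?, ?, ?)", marketing_rows)
    conn.commit()
    return conn


def revenue_and_spend(conn, start, end):
    """Cross-reference revenue and marketing spend per product.

    The period is half-open: start <= date < end. ISO date strings
    (YYYY-MM-DD) sort correctly as plain text, so no date parsing is needed.
    """
    query = """
        SELECT s.product, s.revenue, COALESCE(m.spend, 0.0)
        FROM (SELECT product, SUM(revenue) AS revenue
              FROM sales_fact
              WHERE sale_date >= ? AND sale_date < ?
              GROUP BY product) AS s
        LEFT JOIN (SELECT product, SUM(spend) AS spend
                   FROM marketing_fact
                   WHERE spend_date >= ? AND spend_date < ?
                   GROUP BY product) AS m
          ON s.product = m.product
        ORDER BY s.product
    """
    return conn.execute(query, (start, end, start, end)).fetchall()


conn = build_warehouse()
first_quarter = revenue_and_spend(conn, "2004-01-01", "2004-04-01")
second_quarter = revenue_and_spend(conn, "2004-04-01", "2004-07-01")
print(first_quarter)
print(second_quarter)
```

Neither source system needs to know about the other; the cross-reference happens only inside the warehouse, and the same query serves any week, month, quarter, or year simply by changing the two dates.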
Data from most operational systems is archived after it becomes inactive. For example, an order may become inactive a set period after it has been fulfilled, or a bank account may become inactive after an extended period without activity. The primary reason for archiving inactive data has been the performance of the operational system. Large amounts of inactive data mixed with live operational data can significantly degrade the performance of a transaction that is only processing live data. Keeping this archived data in a warehouse helps when a business has information that has already been processed but might be needed at a later date.
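As a rough sketch of the archiving rule described above, the following separates live orders from inactive ones. The 90-day threshold, the field names, and the sample orders are assumptions made purely for illustration; a real system would set its own definition of "inactive".

```python
from datetime import date, timedelta


def archive_inactive(orders, today, inactive_after_days=90):
    """Split orders into (live, archived).

    An order counts as inactive once it was fulfilled at least
    `inactive_after_days` days before `today` (a hypothetical rule).
    Unfulfilled orders always stay live.
    """
    cutoff = today - timedelta(days=inactive_after_days)
    live, archived = [], []
    for order in orders:
        fulfilled_on = order["fulfilled_on"]
        if fulfilled_on is not None and fulfilled_on <= cutoff:
            archived.append(order)  # would be moved out of the operational system
        else:
            live.append(order)  # stays in the operational system
    return live, archived


# Illustrative orders only.
orders = [
    {"id": 1, "fulfilled_on": date(2003, 11, 1)},
    {"id": 2, "fulfilled_on": date(2004, 3, 30)},
    {"id": 3, "fulfilled_on": None},
]
live, archived = archive_inactive(orders, today=date(2004, 4, 13))
print([o["id"] for o in live], [o["id"] for o in archived])
```

The operational system then works only with the live list, while the archived orders remain available for later analysis.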
As one can see, this process is much like a product warehouse. Instead of goods, the commodity being stored is information. However, just as with product storage warehouses, there are well-designed data warehouses and poorly designed, inflexible ones. Therefore, certain requirements must be met for a data warehouse installation to be successful and usable. There are two requirements of computer data in any business. The first is an operational requirement: to facilitate the processing of business transactions. The second is an informational requirement: to analyze what these business transactions deliver, or could deliver if they were better understood and utilized. In other words, there is an operational use and an informational use of data.
It is important to recognize that data warehousing is still an evolving science. As with any evolving technology, particular care must be taken