Data Warehouse Delivery Process
This section discusses the delivery process of the data warehouse. The main steps of the data warehouse delivery process are as follows:
IT Strategy: A data warehouse (DWH) project must include an IT strategy for procuring and retaining funding.
Business Case Analysis: After the IT strategy has been designed, the next step is the business case. It is essential to understand the level of investment that can be justified and to recognize the projected business benefits which should be derived from using the data warehouse.
Education & Prototyping: The company experiments with the ideas of data analysis and educates itself on the value of the data warehouse. This is valuable and should be required if this is the company's first exposure to the benefits of decision support (DS) data. Prototyping can advance this education and is preferable to building full working models. Prototyping requires an understanding of the business requirements, the technical blueprint, and the structures.
Business Requirements: This stage defines items such as:
The logical model for data within the data warehouse.
The source systems that provide this data (mapping rules).
The business rules to be applied to the information.
The query profiles for the immediate requirement.
Technical Blueprint: This stage lays out the overall architecture of the warehouse. It produces an architecture plan that satisfies long-term requirements, covering the server and data mart architecture and the essential components of the database design.
Building the Vision: This is the phase in which the first production deliverable is produced. It will probably create significant infrastructure elements for extracting and loading information, but limit them to the extraction and load of information sources.
History Load: In the next step, the remainder of the required history is loaded into the data warehouse. This means that no new entities are added to the data warehouse, but additional physical tables would probably be created to store the increased data volumes.
Ad-Hoc Query: In this step, we configure an ad-hoc query tool to operate against the data warehouse.
These end-user access tools can automatically generate the database query that answers a question posed by the user.
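As a rough illustration of how such a tool might turn a user's selections into a database query, the sketch below builds a SQL string from the chosen dimensions, a measure, and optional filters. The `build_query` helper and the `sales_fact` table with its `region`, `year`, and `amount` columns are hypothetical and do not come from any particular product; it also assumes at least one dimension is selected. Real tools additionally handle joins, metadata lookup, and security.

```python
import sqlite3

def build_query(table, dimensions, measure, agg="SUM", filters=None):
    """Build a parameterized GROUP BY query from a user's selections.

    Identifiers are assumed to come from a trusted metadata catalog;
    only the filter values are passed as bound parameters.
    """
    filters = filters or {}
    dims = ", ".join(dimensions)
    sql = f"SELECT {dims}, {agg}({measure}) AS {agg.lower()}_{measure} FROM {table}"
    if filters:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in filters)
    sql += f" GROUP BY {dims} ORDER BY {dims}"
    return sql, list(filters.values())

# Hypothetical fact table, used only for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fact (region TEXT, year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [("East", 2023, 100.0), ("East", 2023, 50.0),
     ("West", 2023, 70.0), ("West", 2022, 30.0)],
)

# "Total sales per region in 2023" expressed as selections, not SQL.
sql, params = build_query("sales_fact", ["region"], "amount", filters={"year": 2023})
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

The user only picks a dimension, a measure, and a filter; the SQL text is produced by the tool.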
Automation: The automation phase is where many of the operational management processes are fully automated within the DWH. These would include:
Extracting & loading the data from a variety of source systems
Transforming the information into a form suitable for analysis
Backing up, restoring & archiving data
Generating aggregations from predefined definitions within the Data Warehouse.
Monitoring query profiles & determining the appropriate aggregates to maintain system performance.
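The operations listed above can be sketched as a small automated pipeline. The following is a minimal illustration, assuming two hypothetical in-memory sources (`SOURCE_A`, `SOURCE_B`), a `sales` table, and one predefined aggregate definition; backup, archiving, and query-profile monitoring are omitted for brevity.

```python
import sqlite3

# Hypothetical source extracts; a real warehouse would read operational systems.
SOURCE_A = [{"store": "S1", "amount": "10.50"}, {"store": "S2", "amount": "4.00"}]
SOURCE_B = [{"store": "s1", "amount": "2.50"}]

# Predefined aggregation definitions: aggregate table name -> SELECT that builds it.
AGGREGATES = {
    "agg_sales_by_store": "SELECT store, SUM(amount) AS total FROM sales GROUP BY store",
}

def extract():
    """Gather rows from every source system."""
    return SOURCE_A + SOURCE_B

def transform(rows):
    """Normalize store codes and convert amounts to numbers."""
    return [(r["store"].upper(), float(r["amount"])) for r in rows]

def load(conn, rows):
    """Insert the transformed rows into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (store TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

def build_aggregates(conn):
    """Rebuild each aggregate table from its predefined definition."""
    for name, select_sql in AGGREGATES.items():
        conn.execute(f"DROP TABLE IF EXISTS {name}")
        conn.execute(f"CREATE TABLE {name} AS {select_sql}")

def run_pipeline(conn):
    """One unattended run: extract, transform, load, then refresh aggregates."""
    load(conn, transform(extract()))
    build_aggregates(conn)
    conn.commit()

conn = sqlite3.connect(":memory:")
run_pipeline(conn)
totals = conn.execute(
    "SELECT store, total FROM agg_sales_by_store ORDER BY store"
).fetchall()
print(totals)
```

In practice a scheduler would call something like `run_pipeline` on a fixed cycle, which is what makes the process automated rather than manual.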
Extending Scope: In this phase, the scope of the DWH is extended to address a new set of business requirements. This involves loading additional data sources into the DWH or introducing new data marts.
Requirement Evolution: This is the last step of the delivery process of a data warehouse. Requirements are not static; they evolve continuously. As the business requirements change, the changes must be reflected in the system.
A concept hierarchy is a directed acyclic graph of concepts, where each concept is identified by a unique name.
An arc from concept a to concept b denotes that a is a more general concept than b. Text can be tagged with these concepts.
Each text report is tagged with a set of concepts that correspond to its content.
Tagging a report with a concept implicitly entails tagging it with all the ancestors of that concept in the hierarchy. It is therefore desirable to tag a report with the lowest (most specific) concept possible.
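To make the implicit tagging concrete, here is a minimal sketch that stores a concept hierarchy as a mapping from each concept to its more general parents and expands a tag into all of its ancestors. The concept names and the helpers `ancestors` and `implied_tags` are hypothetical, chosen only for illustration.

```python
# Hypothetical concept hierarchy: each concept maps to its more general parents.
# A concept may have several parents, so the structure is a DAG, not a tree.
PARENTS = {
    "thing": [],
    "animal": ["thing"],
    "pet": ["thing"],
    "dog": ["animal", "pet"],
    "poodle": ["dog"],
}

def ancestors(concept, parents=PARENTS):
    """Return every concept that is more general than `concept`."""
    seen = set()
    stack = list(parents[concept])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen

def implied_tags(explicit_tags, parents=PARENTS):
    """Expand explicit tags with all of their ancestors."""
    tags = set(explicit_tags)
    for tag in explicit_tags:
        tags |= ancestors(tag, parents)
    return tags

print(sorted(implied_tags({"poodle"})))
```

Tagging with `poodle` alone already implies `dog`, `animal`, `pet`, and `thing`, which is why the most specific tag carries the most information.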
Reports are tagged to the hierarchy automatically using a top-down approach. An evaluation function determines whether a report currently tagged to a node can also be tagged to any of its child nodes.
If so, the tag moves down the hierarchy until it cannot be pushed any further.
The outcome of this step is a hierarchy of reports in which each node holds the set of reports that share the concept associated with that node.
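The top-down procedure can be sketched as follows. The source does not specify the evaluation function, so this example assumes a simple keyword-overlap test (`fits`); the hierarchy, keyword sets, and sample reports are likewise hypothetical.

```python
from collections import defaultdict

# Hypothetical hierarchy (parent -> children) and keyword sets per concept.
CHILDREN = {
    "news": ["sports", "finance"],
    "sports": ["football", "tennis"],
    "finance": [],
    "football": [],
    "tennis": [],
}
KEYWORDS = {
    "sports": {"match", "team", "player"},
    "finance": {"stock", "market", "bank"},
    "football": {"goal", "striker"},
    "tennis": {"serve", "racket"},
}

def fits(report, concept):
    """Assumed evaluation function: the report mentions a keyword of the concept."""
    return bool(set(report.lower().split()) & KEYWORDS.get(concept, set()))

def tag_top_down(report, root="news"):
    """Push the tag from the root toward the leaves as far as `fits` allows."""
    final_tags = set()
    frontier = [root]
    while frontier:
        node = frontier.pop()
        matching = [child for child in CHILDREN[node] if fits(report, child)]
        if matching:
            frontier.extend(matching)  # the tag moves down to the children
        else:
            final_tags.add(node)       # it cannot be pushed any further
    return final_tags

reports = [
    "the striker scored a late goal for the team",
    "the stock market fell sharply",
    "weather will be mild tomorrow",
]

# Resulting hierarchy of reports: concept -> reports tagged at that node.
hierarchy_of_reports = defaultdict(list)
for text in reports:
    for concept in tag_top_down(text):
        hierarchy_of_reports[concept].append(text)

for concept, texts in hierarchy_of_reports.items():
    print(concept, "->", texts)
```

A report that matches no child stays at the node it has reached, so the third report remains at the root.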
The hierarchy of reports resulting from the tagging step is useful for many text mining processes.
It is assumed that the hierarchy of concepts is known a priori. A hierarchy of documents can also be obtained without a concept hierarchy by applying any hierarchical clustering algorithm, which produces such a hierarchy directly.
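As one possible illustration of the clustering alternative, the sketch below runs a small single-linkage agglomerative clustering over word sets using Jaccard similarity. The choice of similarity measure and linkage is an assumption made for this example, as are the sample documents; it also assumes at least one document and that no document is empty.

```python
from itertools import combinations

def jaccard(a, b):
    """Similarity between two non-empty word sets."""
    return len(a & b) / len(a | b)

def agglomerate(docs):
    """Single-linkage agglomerative clustering.

    Returns a nested tuple of document indices describing the hierarchy.
    """
    words = [set(d.lower().split()) for d in docs]
    # Each cluster is (tree, member indices); start with one cluster per document.
    clusters = [(i, [i]) for i in range(len(docs))]
    while len(clusters) > 1:
        # Pick the pair of clusters whose closest members are most similar.
        best = max(
            combinations(range(len(clusters)), 2),
            key=lambda pair: max(
                jaccard(words[i], words[j])
                for i in clusters[pair[0]][1]
                for j in clusters[pair[1]][1]
            ),
        )
        left, right = clusters[best[0]], clusters[best[1]]
        merged = ((left[0], right[0]), left[1] + right[1])
        clusters = [c for k, c in enumerate(clusters) if k not in best] + [merged]
    return clusters[0][0]

docs = [
    "stock market rises",
    "stock market falls",
    "team wins match",
    "team loses match",
]
tree = agglomerate(docs)
print(tree)
```

The nested tuple groups the two finance-like documents and the two sports-like documents before joining them at the root, without any predefined concepts.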
A concept hierarchy defines a sequence of mappings from a set of specific, low-level concepts to more general, higher-level concepts.
In a data warehouse, it is usually used to express different levels of granularity of an attribute from one of the dimension tables.
Concept hierarchies are crucial for the formulation of useful OLAP queries. The hierarchies allow the user to summarize the data at various levels.
For example, using the location hierarchy, the user can retrieve data which summarizes sales for each location, for all the areas in a given state, or even a given country without the necessity of reorganizing the data.
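The location example can be sketched in a few lines. The cities, sales figures, and the `roll_up` helper below are hypothetical; the point is only that the same city-level facts can be summarized at any level of the hierarchy without restructuring them.

```python
from collections import defaultdict

# Hypothetical location dimension: city -> (state, country).
LOCATION = {
    "San Jose": ("California", "USA"),
    "Los Angeles": ("California", "USA"),
    "Austin": ("Texas", "USA"),
    "Toronto": ("Ontario", "Canada"),
}

# Hypothetical sales facts recorded at the lowest level (city).
SALES = [
    ("San Jose", 120),
    ("Los Angeles", 80),
    ("Austin", 50),
    ("Toronto", 70),
    ("San Jose", 30),
]

LEVELS = ("city", "state", "country")

def roll_up(sales, level):
    """Summarize sales at the requested level of the location hierarchy."""
    idx = LEVELS.index(level)
    totals = defaultdict(int)
    for city, amount in sales:
        path = (city,) + LOCATION[city]  # (city, state, country)
        totals[path[idx]] += amount
    return dict(totals)

print(roll_up(SALES, "city"))
print(roll_up(SALES, "state"))
print(roll_up(SALES, "country"))
```

The facts in `SALES` never change; only the level passed to `roll_up` does, which mirrors how an OLAP roll-up moves from city to state to country.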