What is Business Intelligence?
As information technology evolved over the years, enterprises automated more and more of their operations. A great deal of very valuable data resided underutilized in these systems. Data found in sales, accounting, production, human resources, and many other systems could yield significant information to provide historical, current, and predictive views of business operations.

Business intelligence is the use of an organization’s disparate data to provide meaningful information and analyses to employees, customers, suppliers, and partners for more efficient and effective decision-making. It transforms information into actionable strategies and tactics to improve the efficiency of the enterprise, to reduce costs, to attract and retain customers, to improve sales, and to provide many other significant benefits.
For example, typical uses of business intelligence include:

- Retail: sales patterns, integrated customer view, campaign management, customer valuation, analytical CRM
- Telecom: call-behavior analysis, fraud detection, number portability, service-usage analysis, promotion effectiveness
- Manufacturing: order life cycle, inventory analysis, quality assurance, supplier compliance, distribution analysis
- Financial: credit risk, monetary risk, asset management, liability management, fraud detection
- Government: national security, crime analysis, health, welfare, fraud detection
- All industries: P&L analysis, profitability, performance analysis, value-chain analysis, profiling
The History of Business Intelligence
The Early Days of Computing
Typically, early business applications had their own databases that supported their functions. These databases became “islands of information” in that no other systems had access to them. These islands of information proliferated as more and more departments were automated. Mergers and acquisitions compounded the problem since the companies integrated totally different systems, many of which were doing the same job.
However, businesses soon recognized the analytical value of the data that they had available in their many islands of information. In fact, as businesses automated more systems, more data became available. However, collecting this data for analysis was a challenge because of the incompatibilities among systems. There was no simple way (and often no way) for these systems to interact. An infrastructure was needed for data exchange, collection, and analysis that could provide a unified view of an enterprise’s data. The data warehouse evolved to fill this need.
The Data Warehouse
The concept of the data warehouse (Figure 1) is a single system that is the repository of all of an organization’s data in a form that can be effectively analyzed so that meaningful reports can be prepared for management and other knowledge workers.
However, meeting this goal presents several very significant challenges:
- Data must be acquired from a variety of incompatible systems.
- The same item of information might reside in the databases of different systems in different forms. A particular data item might not only be represented in different formats, but the values of this data item might be different in different databases. Which value is the correct one?
- Data is continually changing. How often should the data warehouse be updated to reflect a reasonably current view?
- The amount of data is massive. How is it analyzed and presented simply so that it is useful?
To meet these needs, a broad range of powerful tools was developed and productized over the years. They included:
- Extract, Transform, and Load (ETL) utilities for moving data from the various data sources to the common data warehouse.
- Data-mining engines for complex predetermined analyses and ad hoc queries of the enterprise data stored in the data warehouse.
- Reporting tools to provide management and knowledge workers with the results of the analysis in easy-to-absorb formats.
Offline Extract, Transform, and Load (ETL)
Early on, the one common interface that was provided between the disparate systems in an organization was magnetic tape. Tape formats were standardized, and any system could write tapes that could be read by other systems. Therefore, the first data warehouses were fed by magnetic tapes prepared by the various systems within the organization. However, that left the problem of data disparity. The data written by the different systems reflected their native data organizations. The data written to tape by one system often bore little relation to similar data written by another system.
Even more important, the data warehouse’s database was designed to support the analytical functions required for business intelligence. It was typically a highly structured database with complex indices to support online analytical processing (OLAP). Databases configured for OLAP allowed complex analytical and ad hoc queries with rapid execution times. The data fed to the data warehouse from the enterprise systems was converted to a format meaningful to the data warehouse.
To solve the problem of initially loading this data into a data warehouse, keeping it updated, and resolving discrepancies, Extract, Transform, and Load (ETL) utilities were developed. As their name implies, these utilities extract data from source databases, transform them into the common data warehouse format, and load them into the data warehouse, as shown in Figure 2.
The transform function is the key to the success of this approach. Its job is to apply a series of rules to the extracted data so that it is properly formatted for loading into the data warehouse. Examples of transformation rules include the following (a brief sketch after the list illustrates a few of them):
- the selection of data to load.
- the translation of encoded items (for instance, 1 for male, 2 for female to M, F).
- encoding and standardizing free-form values (New Jersey, N. Jersey, N. J. to NJ).
- deriving new calculated values (sale price = price - discount).
- merging data from multiple sources.
- summarizing (aggregating) certain rows and columns.
- splitting a column into multiple columns (for instance, a comma-separated list).
- resolving discrepancies between similar data items.
- validating the data.
- ensuring data consistency.
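To make a few of these rules concrete, here is a minimal Python sketch of a transform step. It assumes hypothetical source records and code mappings (the field names and the GENDER_CODES and STATE_ALIASES tables are invented for illustration), not the schema of any particular system:

```python
# A minimal sketch of the "transform" step in ETL. The field names and
# mapping tables below are hypothetical, invented for illustration.

# Rule tables: translating encoded items and standardizing free-form values.
GENDER_CODES = {"1": "M", "2": "F"}
STATE_ALIASES = {"new jersey": "NJ", "n. jersey": "NJ", "n. j.": "NJ"}

def transform(record):
    """Apply transformation rules to one extracted record.

    Returns a row formatted for the warehouse, or None if the record
    fails validation and should be set aside for review.
    """
    # Translation of encoded items (1 for male, 2 for female -> M, F).
    gender = GENDER_CODES.get(record.get("gender_code"), "U")

    # Standardizing free-form values (New Jersey, N. Jersey, N. J. -> NJ).
    state = STATE_ALIASES.get(record.get("state", "").strip().lower())

    # Deriving a new calculated value (sale price = price - discount).
    try:
        sale_price = float(record["price"]) - float(record.get("discount", 0))
    except (KeyError, ValueError):
        return None  # validation: unusable numeric fields

    # Data consistency: a sale price can never be negative.
    if state is None or sale_price < 0:
        return None

    return {"gender": gender, "state": state, "sale_price": round(sale_price, 2)}

# Extract -> transform -> load: only rows that pass all rules are loaded.
extracted = [
    {"gender_code": "1", "state": "N. Jersey", "price": "19.99", "discount": "2.00"},
    {"gender_code": "2", "state": "New Jersey", "price": "oops"},
]
warehouse_rows = [row for row in map(transform, extracted) if row is not None]
print(warehouse_rows)  # [{'gender': 'M', 'state': 'NJ', 'sale_price': 17.99}]
```

A production ETL utility would, of course, layer merging, aggregation, and discrepancy resolution on top of such per-record rules.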
The ETL function allows the consolidation of multiple data sources into a well-structured database for use in complex analyses. The ETL process is executed periodically, such as daily, weekly, or monthly, depending upon the business needs. This process is called offline ETL because the target database is not continuously updated. It is updated on a periodic batch basis. Though offline ETL serves its purpose well, it has some serious drawbacks:
- The data in the data warehouse is stale. It could be weeks old. Therefore, it is useful for strategic functions but is not particularly adaptable to tactical uses.
- The source database typically must be quiesced during the extract process; otherwise, the target database is left in an inconsistent state following the load. As a result, the applications must be shut down, often for hours.
Offline ETL technology has served businesses for decades. The intelligence derived from this data informs long-term, reactive strategic decision making. However, short-term operational and proactive tactical decision making continues to rely on intuition.
Data-Mining Engines
The ETL utilities make data collection from many diverse systems practical. However, the captured data needs to be converted into information and knowledge in order to be useful.
To expound on this concept:
- Data are simply facts, numbers, and text that can be processed by a computer. For instance, a transaction at a retail point-of-sale is data.
- Information embodies the understanding of a relationship of some sort between data. For example, analysis of point-of-sale transactions yields information on consumer buying behavior.
- Knowledge represents a pattern that connects information and generally provides a high level of predictability as to what is described or what will happen next. An example of knowledge is the prediction of the effect of promotional efforts on the sales of particular items, based on consumers’ buying behavior.
Powerful data-mining engines were developed to support complex analyses and ad hoc queries on a data warehouse’s database. Data mining looks for patterns among hundreds of seemingly unrelated fields in a large database, patterns that identify previously unknown trends. These trends play a critical role in strategic decision making because they reveal areas for process improvement. Examples of data-mining engines are those from SPSS and Oracle. Facilities such as these are the foundation for OLAP systems.
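As a toy illustration of pattern discovery, the sketch below runs a tiny market-basket analysis (a classic data-mining technique) over a handful of invented point-of-sale transactions; real engines apply far richer statistical and machine-learning methods at vastly larger scale:

```python
# A toy market-basket analysis: finding item pairs that are frequently
# bought together. The transactions are invented for illustration.
from collections import Counter
from itertools import combinations

transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in transactions:
    pair_counts.update(combinations(sorted(basket), 2))

# "Knowledge": pairs appearing in at least half of all baskets are
# candidates for joint promotions or shelf placement.
threshold = len(transactions) / 2
for pair, count in pair_counts.most_common():
    if count >= threshold:
        print(pair, count)  # ('bread', 'butter') 3
```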
Reporting Tools
The knowledge created by a data-mining engine is not very useful unless it is presented easily, clearly, and succinctly to those who need it. Many formats for reporting information and knowledge results have been created. A common technique for displaying information is the digital dashboard. A digital dashboard (as shown by the examples of Figures 3 and 4) provides a business manager with the input necessary to “drive” the business. It is focused on providing rapid insight into key performance indicators, and as such is highly graphical with colored lights, alerts, drill-downs, graphics, and gauges.
A digital dashboard gives the user a graphical high-level view of business processes. The user then drills down at will to see more detail on a particular business process. This level of detail is often buried deep in the enterprise’s data, making it otherwise unnoticeable to a business manager. For instance, with the digital dashboard shown in Figure 4, a knowledge worker clicking on an object will see the detailed statistics for that object.
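As a minimal sketch of this idea, the Python fragment below models hypothetical KPIs with traffic-light statuses and a drill-down detail level; the indicator names, targets, and thresholds are invented for illustration:

```python
# A minimal model of a digital dashboard: top-level traffic-light status
# per KPI, with detail available on drill-down. All names, targets, and
# figures here are hypothetical.

KPIS = {
    "daily_sales": {
        "value": 84_000, "target": 100_000,
        "detail": {"north_region": 30_000, "south_region": 54_000},
    },
    "on_time_orders": {
        "value": 0.97, "target": 0.95,
        "detail": {"warehouse_a": 0.99, "warehouse_b": 0.94},
    },
}

def status(kpi):
    """Traffic light: green at/above target, yellow within 10% of it, else red."""
    ratio = kpi["value"] / kpi["target"]
    return "green" if ratio >= 1.0 else "yellow" if ratio >= 0.9 else "red"

# High-level view: one colored light per key performance indicator.
for name, kpi in KPIS.items():
    print(f"{name}: {status(kpi)} ({kpi['value']})")

# Drill-down: clicking an alarming indicator reveals the detail behind it.
print("daily_sales detail:", KPIS["daily_sales"]["detail"])
```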
Today, many versions of digital dashboards are available from a variety of software vendors. Driven by information discovered by a data-mining engine, they give the business manager the information required to:
- immediately see key performance measures.
- identify and correct negative trends.
- measure efficiencies and inefficiencies.
- generate detailed reports showing new trends.
- increase revenues.
- decrease costs.
- make more informed decisions.
- align strategies and organizational goals.
Data Marts
As corporate-wide data warehouses came into use, it became clear in many cases that a full-blown data warehouse was overkill for many applications. Data marts evolved to solve this problem. A data mart is a specialized version of a data warehouse. Whereas a data warehouse is a single organizational repository of enterprise-wide data across all subject areas, a data mart is a subject-oriented repository of data designed to answer specific questions for a specific set of users. A data mart holds just a subset of the data that a data warehouse holds.
A data mart includes all of the elements of a data warehouse – ETL, data mining, and reporting. However, since a data mart deals with only a subset of the data, it is much smaller and more cost-effective than a full-scale data warehouse. In addition, because its database holds only subject-oriented data rather than all of an enterprise’s data, it is much more responsive to ad hoc queries and other analyses.
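The relationship can be sketched in a few lines of Python: the mart is just a subject-oriented, pre-aggregated slice of the warehouse. The subject tags, regions, and amounts below are hypothetical:

```python
# A sketch of the warehouse/mart relationship: the mart keeps only one
# subject area and pre-aggregates it for its users' typical questions.
from collections import defaultdict

warehouse = [
    {"subject": "sales", "region": "north", "amount": 120.0},
    {"subject": "sales", "region": "south", "amount": 80.0},
    {"subject": "hr",    "region": "north", "amount": 1.0},
    {"subject": "sales", "region": "north", "amount": 40.0},
]

# Subject-oriented subset: the sales department's mart ignores the rest.
sales_mart = [row for row in warehouse if row["subject"] == "sales"]

# Pre-aggregated summary keeps the department's ad hoc queries responsive.
sales_by_region = defaultdict(float)
for row in sales_mart:
    sales_by_region[row["region"]] += row["amount"]

print(dict(sales_by_region))  # {'north': 160.0, 'south': 80.0}
```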
Data marts have become popular not only because they are less costly and more responsive but also because they are under the control of a department or division within the enterprise. Managers have their own local sources for information and knowledge rather than having to depend on a remote organization controlling the enterprise data warehouse.