
What is Business Intelligence?

As information technology evolved over the years, enterprises automated more and more of their operations. A great deal of very valuable data resided underutilized in these systems. Data found in sales, accounting, production, human resources, and many other systems could yield significant information to provide historical, current, and predictive views of business operations.

Business intelligence is the use of an organization’s disparate data to provide meaningful information and analyses to employees, customers, suppliers, and partners for more efficient and effective decision-making. It transforms information into actionable strategies and tactics to improve the efficiency of the enterprise, to reduce costs, to attract and retain customers, to improve sales, and to provide many other significant benefits.

Typical applications of business intelligence include:

  • Sales Patterns
  • Integrated Customer View
  • Campaign Management
  • Customer Valuation
  • Analytical CRM
  • Call-Behavior Analysis
  • Fraud Detection
  • Number Portability
  • Service-Usage Analysis
  • Promotion Effectiveness
  • Order Life Cycle
  • Inventory Analysis
  • Quality Assurance
  • Supplier Compliance
  • Distribution Analysis
  • Credit Risk
  • Monetary Risk
  • Asset Management
  • Liability Management
  • National Security
  • Crime Analysis
And, across all industries:
  • P & L Analysis
  • Performance Analysis
  • Value-Chain Analysis

The History of Business Intelligence

The Early Days of Computing

Typically, early business applications had their own databases that supported their functions. These databases became “islands of information” in that no other systems had access to them. These islands of information proliferated as more and more departments were automated. Mergers and acquisitions compounded the problem since the companies integrated totally different systems, many of which were doing the same job.

However, businesses soon recognized the analytical value of the data that they had available in their many islands of information. Indeed, as businesses automated more systems, more data became available. Yet collecting this data for analysis was a challenge because of the incompatibilities among systems. There was no simple way (and often no way) for these systems to interact. An infrastructure was needed for data exchange, collection, and analysis that could provide a unified view of an enterprise’s data. The data warehouse evolved to fill this need.

The Data Warehouse

The concept of the data warehouse (Figure 1) is a single system that serves as the repository of all of an organization’s data in a form that can be effectively analyzed, so that meaningful reports can be prepared for management and other knowledge workers.

However, meeting this goal presents several very significant challenges:
  • Data must be acquired from a variety of incompatible systems.
  • The same item of information might reside in the databases of different systems in different forms. A particular data item might not only be represented in different formats, but the values of this data item might be different in different databases. Which value is the correct one?
  • Data is continually changing. How often should the data warehouse be updated to reflect a reasonably current view?
  • The amount of data is massive. How is it analyzed and presented simply so that it is useful?
To meet these needs, a broad range of powerful tools was developed over the years and productized. They included:
  • Extract, Transform, and Load (ETL) utilities for moving data from the various data sources to the common data warehouse.
  • Data-mining engines for complex predetermined analyses and ad hoc queries of the enterprise data stored in the data warehouse.
  • Reporting tools to provide management and knowledge workers with the results of the analysis in easy-to-absorb formats.
Figure 1: The Data Warehouse
Offline Extract, Transform, and Load (ETL)
Early on, the one common interface provided between the disparate systems in an organization was magnetic tape. Tape formats were standardized, and any system could write tapes that could be read by other systems. Therefore, the first data warehouses were fed by magnetic tapes prepared by the various systems within the organization. However, that left the problem of data disparity. The data written by the different systems reflected their native data organizations. The data written to tape by one system often bore little relation to similar data written by another system.

Even more important was that the data warehouse’s database was designed to support the analytical functions required for the business intelligence function. This database design was typically a highly structured database with complex indices to support online analytical processing (OLAP). Databases configured for OLAP allowed complex analytical and ad hoc queries with rapid execution time. The data fed to the data warehouse from the enterprise systems was converted to a format meaningful to the data warehouse.

To solve the problem of initially loading this data into a data warehouse, keeping it updated, and resolving discrepancies, Extract, Transform, and Load (ETL) utilities were developed. As their name implies, these utilities extract data from source databases, transform it into the common data warehouse format, and load it into the data warehouse, as shown in Figure 2.

The transform function is the key to the success of this approach. Its job is to apply a series of rules to extracted data so that it is properly formatted for loading into the data warehouse. Examples of transformation rules include:
  • the selection of data to load.
  • the translation of encoded items (for instance, 1 for male, 2 for female to M, F).
  • encoding and standardizing free-form values (New Jersey, N. Jersey, N. J. to NJ).
  • deriving new calculated values (sale price = price - discount).
  • merging data from multiple sources.
  • summarizing (aggregating) certain rows and columns.
  • splitting a column into multiple columns (for instance, a comma-separated list).
  • resolving discrepancies between similar data items.
  • validating the data.
  • ensuring data consistency.
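As a sketch only (the field names, code mappings, and rules below are hypothetical, not taken from any particular ETL product), a few of these rules could be expressed in Python as a per-record transform function:

```python
# Minimal sketch of an ETL "transform" step; all field names are hypothetical.

GENDER_CODES = {"1": "M", "2": "F"}  # translation of encoded items
STATE_FORMS = {  # standardizing free-form values
    "new jersey": "NJ", "n. jersey": "NJ", "n. j.": "NJ", "nj": "NJ",
}

def transform(record):
    """Apply transformation rules to one extracted record (a dict)."""
    out = {}
    # Selection: load only the fields the warehouse needs.
    for field in ("customer_id", "gender", "state", "price", "discount"):
        out[field] = record.get(field)
    # Translate encoded items (1 for male, 2 for female -> M, F).
    out["gender"] = GENDER_CODES.get(str(out["gender"]), "U")
    # Standardize free-form values (New Jersey, N. Jersey, N. J. -> NJ).
    out["state"] = STATE_FORMS.get(str(out["state"]).strip().lower(), out["state"])
    # Derive new calculated values (sale price = price - discount).
    out["sale_price"] = float(out["price"]) - float(out["discount"])
    # Validate: reject records that fail a basic consistency check.
    if out["sale_price"] < 0:
        raise ValueError(f"negative sale price for customer {out['customer_id']}")
    return out

row = {"customer_id": 17, "gender": "2", "state": "N. Jersey",
       "price": "100.00", "discount": "15.00"}
print(transform(row))  # gender "F", state "NJ", sale_price 85.0
```

In a production ETL utility, rules like these would be driven by configurable mappings and metadata rather than hard-coded dictionaries, but the per-record flow is the same.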
The ETL function allows the consolidation of multiple data sources into a well-structured database for use in complex analyses. The ETL process is executed periodically, such as daily, weekly, or monthly, depending upon the business needs. This process is called offline ETL because the target database is not continuously updated. It is updated on a periodic batch basis. Though offline ETL serves its purpose well, it has some serious drawbacks:
  • The data in the data warehouse is stale. It could be weeks old. Therefore, it is useful for strategic functions but is not particularly adaptable to tactical uses.
  • The source database typically must be quiesced during the extract process; otherwise, the target database is in an inconsistent state following the load. As a result, the applications must be shut down, often for hours.
In order to support real-time business intelligence, the ETL function must become continuous and noninvasive; this is called online ETL and is described later. In contrast to offline ETL, which provides stale but consistent responses to queries, online ETL provides current but varying responses to successive queries, since the data it uses is continually updated to reflect the current state of the enterprise.

Offline ETL technology has served businesses for decades. The intelligence that is derived from this data informs long-term reactive strategic decision making. However, short-term operational and proactive tactical decision making continues to rely on intuition.

Data-Mining Engines

The ETL utilities make data collection from many diverse systems practical. However, the captured data needs to be converted into information and knowledge in order to be useful.

To expand on this distinction:
  • Data are simply facts, numbers, and text that can be processed by a computer. For instance, a transaction at a retail point of sale is data.
  • Information embodies the understanding of some relationship between data. For example, analysis of point-of-sale transactions yields information on consumer buying behavior.
  • Knowledge represents a pattern that connects information and generally provides a high level of predictability as to what is described or what will happen next. An example of knowledge is the prediction of the effect of promotional efforts on sales of particular items, based on consumers’ buying behavior.
Powerful data-mining engines were developed to support complex analyses and ad hoc queries on a data warehouse’s database. Data mining looks for patterns among hundreds of seemingly unrelated fields in a large database, patterns that identify previously unknown trends. These trends play a critical role in strategic decision making because they reveal areas for process improvement. Examples of data-mining engines are those from SPSS and Oracle. Facilities such as these are the foundation for OLAP systems.
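As a toy illustration of the idea (not how a commercial engine such as those from SPSS or Oracle works internally), the following Python sketch counts item pairs across hypothetical point-of-sale baskets to surface products that are frequently bought together, the kind of previously unknown pattern data mining is meant to reveal:

```python
# Conceptual sketch of pattern discovery over point-of-sale data.
# The baskets are hypothetical illustrative data.
from collections import Counter
from itertools import combinations

transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# "Knowledge": pairs that co-occur often enough to act on (e.g., promotions).
for pair, count in pair_counts.most_common(3):
    support = count / len(transactions)
    print(pair, f"support={support:.2f}")
```

Here ("bread", "butter") appears in three of the four baskets, a pattern no single transaction reveals on its own; real data-mining engines apply the same principle at vastly larger scale with far more sophisticated algorithms.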
Reporting Tools
Figure 3: Digital Dashboard

The knowledge created by a data-mining engine is not very useful unless it is presented easily, clearly, and succinctly to those who need it. Many formats for reporting information and knowledge results have been created. A common technique for displaying information is the digital dashboard. A digital dashboard (as shown by the examples of Figures 3 and 4) provides a business manager with the input necessary to “drive” the business. It is focused on providing rapid insight into key performance indicators, and as such is highly graphical with colored lights, alerts, drill-downs, graphics, and gauges.

A digital dashboard gives the user a graphical high-level view of business processes. The user then drills down at will to see more detail on a particular business process. This level of detail is often buried deep in the enterprise’s data, making it otherwise unnoticeable to a business manager. For instance, with the digital dashboard shown in Figure 4, a knowledge worker clicking on an object will see the detailed statistics for that object.
Figure 4: Digital Dashboard
Today, many versions of digital dashboards are available from a variety of software vendors. Driven by information discovered by a data-mining engine, they give the business manager the information required to:
  • immediately see key performance measures.
  • identify and correct negative trends.
  • measure efficiencies and inefficiencies.
  • generate detailed reports showing new trends.
  • increase revenues.
  • decrease costs.
  • make more informed decisions.
  • align strategies and organizational goals.
These dashboards and other sophisticated reporting tools are the collective product of business intelligence systems.

Data Marts
As corporate-wide data warehouses came into use, it became clear in many cases that a full-blown data warehouse was overkill for many applications. Data marts evolved to solve this problem. A data mart is a specialized version of a data warehouse. Whereas a data warehouse is a single organizational repository of enterprise-wide data across all subject areas, a data mart is a subject-oriented repository of data designed to answer specific questions for a specific set of users. A data mart holds just a subset of the data that a data warehouse holds.

A data mart includes all of the elements of a data warehouse – ETL, data mining, and reporting. However, since a data mart deals with only a subset of the enterprise’s data, it is much smaller and more cost-effective than a full-scale data warehouse. In addition, because its database holds only subject-oriented data rather than all of an enterprise’s data, it is much more responsive to ad hoc queries and other analyses.
Data marts have become popular not only because they are less costly and more responsive but also because they are under the control of a department or division within the enterprise. Managers have their own local sources of information and knowledge rather than having to depend on a remote organization controlling the enterprise data warehouse.
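The subset relationship can be sketched simply (the table and field names below are hypothetical): a sales data mart keeps only the rows and fields its subject area needs from the warehouse.

```python
# Sketch: deriving a subject-oriented data mart from warehouse rows.
# Rows and field names are hypothetical illustrative data.
warehouse = [
    {"dept": "sales", "region": "east", "revenue": 120, "headcount": 4},
    {"dept": "sales", "region": "west", "revenue": 200, "headcount": 6},
    {"dept": "hr",    "region": "east", "revenue": 0,   "headcount": 2},
]

# The sales data mart keeps only sales rows and only the fields its users need.
sales_mart = [
    {"region": row["region"], "revenue": row["revenue"]}
    for row in warehouse
    if row["dept"] == "sales"
]
print(sales_mart)
```

Because the mart carries a fraction of the warehouse's rows and columns, queries against it touch far less data, which is the source of the responsiveness and cost advantages described above.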
