geordee


Real-time Data Warehouse & Mixed Grains

Increasingly, the trends point to real-time data warehouses. There is a lot of discussion about what real-time means and what purpose a real-time data warehouse serves. Whatever the answer, the very fact that analysts, and hence enterprises, are increasingly interested in real-time data warehouses requires us to think about how to architect such a data warehouse.

There are two distinct problems in real-time data warehousing: real-time data integration and real-time decision making. Change data capture (CDC) tools and trickle-feed techniques address the real-time data integration problem. Increasingly, application integration techniques are aligning more closely with ETL...


Implementing Master Data Management

Software applications were process-oriented for a long time. With new technologies and architectures arriving at a rapid pace and bringing tangible benefits, the lifespan of software applications has shrunk considerably. Mergers and acquisitions, tool and technology rationalization efforts, and competitive business practices have all made shorter application lifespans more likely. However, one thing remains fairly constant: data. Organizations now consider their data an enterprise asset, especially master data, which is the information about an organization’s physical and realizable assets.

Master data refers to information about non-transactional entities in an organization, such as customers, products and accounts. Master...


Yes to NoSQL!

I have been thinking about Workday’s architecture, following Curt Monash’s post The Workday architecture — a new kind of OLTP software stack. What struck me most is the approach to data storage. They boldly decided to forgo the conventional approach and have been quite successful in that. Not that there were no comparable approaches in the past, but Curt’s coverage emphasizes the recent trend of applications stepping into the new territory commonly known as NoSQL.

The attribute-value (or key-value) store is closer in spirit to the other growing trend: MapReduce implementations for data processing. Then there are document stores such as...


A Few Thoughts on Mobile Business Intelligence

As traditional business intelligence tools and techniques mature, innovation takes place, first to retain competitive advantage and then to raise the bar, bringing better capabilities and usefulness. Offshoots of business intelligence in the past few years include business analytics, operational business intelligence, pervasive business intelligence, interactive dashboards and, as the title implies, mobile business intelligence.

Mobile devices are widely regarded as the future of computing. Today’s mobile devices, armed with powerful processors and generous amounts of both volatile and persistent memory, can definitely compete with other channels of interaction with computing infrastructure....


Data Integration Patterns

Gregor Hohpe and Bobby Woolf wrote about enterprise integration patterns for messaging solutions many years ago. It is interesting that there is hardly any comparably authoritative work on patterns in the data integration world. There are already communities working on SOA patterns.

One probable reason why patterns did not become a design technique in the data integration world is the early arrival of tools and the almost complete reliance on them thereafter. Tools implemented the patterns, and everyone used them without explicitly thinking about them. At this point, patterns may not add much value in the form of reusable components or solution...


Dimensional Modeling and Semantic Relationships

Modeling data warehouses is sometimes a contentious issue. When it comes to data warehouse design there are a few schools of thought and many sub-schools! One of the often-heard discussions is about the difference between dimensional modeling and entity-relationship modeling. It is interesting to hear how dimensional modeling and de-normalization are different from modeling transactional systems.

It should be recognized that the concepts of dimensions and facts bring a lot of clarity to understanding data analytics, and that loop-free stars and snowflakes help analytical and reporting tools generate queries easily. However, deviations from the entity-relationship model may not be so...


Actionable Information in Real Time

In the past decade enterprises saw widespread adoption of business intelligence and data warehousing solutions to understand the patterns hidden in data fragmented across various source systems. Most of these efforts built business intelligence solutions on structured transactional data generated by business operations. However, information is also exchanged in various unstructured and semi-structured formats such as emails, instant messages, documents, spreadsheets, presentations, news and blogs. Such information is not captured by traditional business intelligence and data warehousing solutions, partly because BI/DW tools and techniques were conceived to deal with structured data.

Today we see enterprises taking a...


A Case for Evolution of Data Warehousing

Data Warehousing has come a long way. The evolution of technology – especially in storage and processing – causes the technical solutions to evolve as well. It is time to revisit the traditional techniques of data warehousing, right from the data modeling aspects.

Most data warehouse implementations, in my experience, focused on quantitative aspects such as managing large volumes of data. An interviewer once almost mocked me when I said I had not worked with terabytes of data. What we missed were the qualitative aspects of data, until they became imperative. A large portion is then...


Trapped in Star Schema

Fact-less facts – an oxymoron. We had a requirement to track assignments of different individuals to different programs. At any point in time we should be able to retrieve the counts of assignments by various characteristics including the assignment status. Fact-less facts, obviously.
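To make the requirement concrete, here is a minimal sketch in Python of a point-in-time count over assignment history. It is only an illustration, not the design discussed in this post: the Assignment record, its field names, the sample rows and the counts_by_status helper are all hypothetical, and the validity periods are assumed to be recorded at day grain.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class Assignment:
    """One row of assignment history (hypothetical layout, no measures)."""
    person: str
    program: str
    status: str
    valid_from: date            # inclusive start of this row's validity
    valid_to: Optional[date]    # exclusive end; None means still current


def counts_by_status(rows: Iterable[Assignment], as_of: date) -> Counter:
    """Count the assignments in effect on `as_of`, grouped by status."""
    counts: Counter = Counter()
    for row in rows:
        started = row.valid_from <= as_of
        not_ended = row.valid_to is None or as_of < row.valid_to
        if started and not_ended:
            counts[row.status] += 1
    return counts


# Made-up sample history: every status change closes one row and opens another.
HISTORY = [
    Assignment("alice", "program-a", "pending", date(2010, 1, 4), date(2010, 1, 6)),
    Assignment("alice", "program-a", "active", date(2010, 1, 6), None),
    Assignment("bob", "program-a", "active", date(2010, 1, 5), date(2010, 1, 20)),
    Assignment("bob", "program-b", "pending", date(2010, 1, 20), None),
]

if __name__ == "__main__":
    print(counts_by_status(HISTORY, date(2010, 1, 10)))
```

Because the sketch stores dates rather than timestamps, several changes to the same assignment within one day cannot be told apart, which is exactly the granularity question raised in the conversation that follows.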

Here is a typical conversation.

“Let’s then design a fact-less fact table.”

“But wait, what’s the granularity for time dimension? The assignments can change any time, any number of times.”

“Probably we should use daily. That’s when the client reconciles the systems. Probably weekly would be fine, that’s when the reports are used. Or, monthly? That’s when...


Meta Data Warehousing

Imagine an ideal enterprise where the data is clean and consistent across systems. This means you can relate the data from one system to another without any transformations or lookups. This is probably the ultimate goal of master data management, where the master data across the enterprise is managed for consistency, integrity and accuracy. Imagine ideal systems in that enterprise with data and functional services exposed so that they can be discovered, invoked and orchestrated. This is what service-oriented architecture is moving towards. Imagine these systems having unified authentication mechanisms and access controls. This is probably what enterprise directory services, single...