The concept of embedded metadata has gained a fair amount of traction recently, thanks to Bill Inmon and other leading thinkers. I'm going to take a contrarian view, however...
Embedded metadata basically says, "Do away with the repository." And it's easy to see why this is attractive. Centralized repositories are a lot of work and have historically struggled to demonstrate business value. On the other hand, if the metadata is simply pervasive throughout the infrastructure, why replicate it?
This is a seductive argument, and one that I have played with in some of my writing and speaking. But I am increasingly skeptical that it will work, for two reasons:
1. The need to map technical metadata to logical abstractions for analysis and understandability
2. The politics of production system security
Take for example an RDBMS catalog, probably the best known type of metadata. A metadata "portal" infrastructure could in theory be built that would federate and publish all the database schemas, right where they sit. Why replicate them? With the proper modifications to your data modeling CASE tool, even the business definitions could be embedded in the metadata (via Oracle & DB2's COMMENT feature, and SQL Server's Extended Properties).
However, what about the logical model that drove that schema? Even higher, what about abstractions like data subject areas? Do we push them down into the production system as well? This could be very cumbersome, and you still need to model them somewhere...
This is analogous to dimensions in data warehousing, which also can be seen as abstractions. Take the item/price/location/date fact so well known from retail. Usually, locations roll up into districts, regions, and so forth. It is also true that this rollup is not evident in the raw data collected from point of sale. It is maintained within the data warehouse, and only when the raw transactional data is combined with such dimensions does the real business value emerge. The same is true for metadata. An exhaustive list of columns may be useful to a support team, but it's worthless to enterprise architects -- they are interested in higher level subject areas.
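To make the rollup idea concrete, here is a minimal sketch, not a real repository design. It reads a raw catalog and rolls it up into subject areas using a mapping that is maintained outside the schema. The table names, the `SUBJECT_AREAS` mapping, and the use of an in-memory SQLite database as a stand-in for a production RDBMS catalog are all assumptions made for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical mapping of physical tables to subject areas. Note that it
# lives outside the schema: nothing in the catalog itself carries it.
SUBJECT_AREAS = {
    "pos_sale": "Sales",
    "pos_sale_item": "Sales",
    "store": "Location",
    "district": "Location",
}


def build_demo_db():
    """Create a toy schema standing in for a production database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE store (store_id INTEGER PRIMARY KEY, district_id INTEGER, name TEXT);
        CREATE TABLE district (district_id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE pos_sale (sale_id INTEGER PRIMARY KEY, store_id INTEGER, sale_date TEXT);
        CREATE TABLE pos_sale_item (sale_id INTEGER, item_id INTEGER, price REAL);
        """
    )
    return conn


def read_catalog(conn):
    """Return {table_name: [column_name, ...]} from the SQLite catalog."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    catalog = {}
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        catalog[table] = [
            row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')
        ]
    return catalog


def roll_up(catalog, subject_areas):
    """Summarize table and column counts per subject area."""
    summary = defaultdict(lambda: {"tables": 0, "columns": 0})
    for table, columns in catalog.items():
        area = subject_areas.get(table, "Unclassified")
        summary[area]["tables"] += 1
        summary[area]["columns"] += len(columns)
    return dict(summary)


if __name__ == "__main__":
    catalog = read_catalog(build_demo_db())
    for area, counts in sorted(roll_up(catalog, SUBJECT_AREAS).items()):
        print(area, counts)
```

The catalog alone yields only the flat list of tables and columns. The subject-area view appears only once the separately maintained mapping is joined in, much as location rollups are joined to point-of-sale facts in the warehouse.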
Similarly, another major emerging area of metadata is the concept of the systems inventory, or application portfolio. When used to drive integration metadata (what system is talking to what) a real, robust picture of the enterprise starts to emerge. But again, the lowest technical level -- the level at which one would have to "embed" the metadata -- must be rolled up into understandable abstractions. The alternative is to be lost in a bewildering forest of components, libraries, queues, messages, batch jobs, and so forth. And, while some metadata embedding in all deployed components may be possible, this will be a nontrivial technical challenge, and there is still the need to tie it all back to higher level abstractions such as system, capability, business process, and so forth.
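The same point can be sketched for integration metadata. The example below is illustrative only: the component names, the `COMPONENT_SYSTEM` assignments, and the dependency list are all hypothetical. It collapses component-level dependencies into the system-to-system picture that an architect would actually want to see.

```python
# Hypothetical assignment of deployed components to named systems.
# This is the abstraction that has to be modeled somewhere.
COMPONENT_SYSTEM = {
    "order-web.war": "Order Entry",
    "order.queue": "Order Entry",
    "pricing.dll": "Pricing",
    "nightly_price_load.job": "Pricing",
    "gl_post.job": "General Ledger",
}

# Hypothetical low-level dependencies as (caller, callee) pairs, the kind
# of detail that could in principle be embedded in or scanned from components.
COMPONENT_LINKS = [
    ("order-web.war", "order.queue"),
    ("order-web.war", "pricing.dll"),
    ("nightly_price_load.job", "pricing.dll"),
    ("order.queue", "gl_post.job"),
]


def system_links(component_links, component_system):
    """Collapse component-level links into distinct system-level links."""
    links = set()
    for caller, callee in component_links:
        source = component_system[caller]
        target = component_system[callee]
        if source != target:  # links inside one system are not integration
            links.add((source, target))
    return sorted(links)


if __name__ == "__main__":
    for source, target in system_links(COMPONENT_LINKS, COMPONENT_SYSTEM):
        print(f"{source} -> {target}")
```

Four component-level links reduce to two system-level ones here. The reduction depends entirely on the component-to-system mapping, which none of the components carries on its own.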
Associating a sale with its rollup dimensions of course cannot be done in real time in environments past a certain size, for performance reasons. One of the things the embedded metadata advocates are relying on is that metadata is small, and therefore (perhaps) not susceptible to such challenges. This brings us to the second challenge for embedded metadata: access to the production systems. It can be difficult to gain access to a critical production RDBMS even for a tightly controlled weekly batch scan into a repository. I guarantee it will be impossible to say, "Give my web portal ad-hoc access to the system catalog so anyone can introspect into it at any time." Ain't gonna happen, folks, especially in this security-conscious environment.
So, for better or worse, I think we are stuck with the centralized or at least federated repository model for the time being. It's unfortunate, because the concept of embedded metadata is provocative at least. Maybe I'll come around to it again.
What is to be done in the meantime? Well, as I have written in other places, it is high time for metadata administrators to seek out and make common cause with their counterparts in configuration management. Not just the configuration management of the software development lifecycle, but true configuration management of the production infrastructure as advocated by IT Service Management. Both metadata and configuration management are ambitious, sophisticated infrastructure management concepts all too vulnerable to funding vagaries. In unity there is strength...
