Wednesday, November 4, 2015

Big Data & Analytics - Predictions about the Present...



“On two occasions I have been asked,… if you put the wrong figures into the machine, will the right answers come out? I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.”

Charles Babbage. 1864. Passages from the Life of a Philosopher (chap. 5).



About 9 months ago (~4½ generations in data-science evolution), IDC released a report from their FutureScape Program titled Big Data & Analytics 2015 Predictions. It’s still 2015, & I have been doing a good amount of work with healthcare organizations, deploying contemporary (Hadoop-based) analytics & readying these groups to use them in their strategic decision making, so… I thought I’d look back at these predictions & react to them from my current experience.

4½ generations old, you say… The good news about this is that actual field deployment & use of “big data” analytics has not been growing as fast as the amount of material written in general (& the hype) about this topic[1]. First, deployment & use in any productive way requires some amount of organizational change & alignment, & that slows down deployment. Second, as I have written about previously, acquisition & deployment of a technology is not the same as its adoption[2]. This is especially true in this case as adoption requires: 1) the deployment of additional & different infrastructure & user-facing technologies, & 2) some (not small) amount of change in the way an organization thinks about data, the use of information & decision-making. Finally, most organizations are conservative when it comes to this level of change, so adoption & productive use requires management commitment & a senior champion to keep things moving until leverage can be demonstrated. Given all this, I think we are 2-4 years from big data analytics being in general, productive use in well-resourced organizations & 5-8 years for everyone else.

The first thing to do if I’m going to comment on IDC’s predictions is to define what big data is. There are many, many attempts to do this on the net, but suffice it to say that there is no consensus (at least none known to me) as to what the term means. Quite a number of definitions are couched in terms of volume of data, & most of these use 1PB as the boundary for big data. I think it is more nuanced than that; in fact, I think that at least the following considerations are relevant:
·      Volume - 1PB is as good a number as any, but much smaller amounts of data can be productively analyzed, see material below on data volume & data quality
·      Storage technology – Currently, relational-based data warehousing is the primary enterprise method for the storage of large amounts of data. This technology has definite limitations in terms of data volume & performance that will be covered later. More contemporary & effective storage technologies include: NoSQL databases, graph databases & massively parallel distributed file systems, e.g. the Hadoop Distributed File System (HDFS), or sparse-array technologies, e.g. Google Cloud Bigtable. These last two are most associated with big data storage.
·      Variety – The variety of the data stored in a system is also an issue. Data warehouses require a standardized data model that all data is normalized against. Data types that cannot be normalized are not stored in native form. Normalization & transformation are complicated & very time-consuming processes, as is updating the model if/when data requirements change. HDFS & Bigtable-like systems do not require a standardized data model or this type of normalization. An almost unlimited number of types of data can be stored & utilized in these systems, making them much more aligned with real-world needs.
·      Analysis technology – Many systems, regardless of the volume of data, provide SQL-based query as well as conventional BI as the primary analysis technology. Even using Turing-complete versions of SQL (e.g. T-SQL) limits the types of queries & analysis that can be done, though all HDFS-based systems allow some form of SQL-based query. There certainly are many more SQL programmers in the world than data scientists who model & query in other systems/languages. YARN (MapReduce 2) is often used for analysis with HDFS & Bigtable-based systems, although the most effective analysis is provided by model development in a language such as R, Pig, Python, etc. operating against HDFS/YARN (a minimal sketch of such a job follows this list).
·      Search - Storage of substantial amounts of information requires a search function that is integrated & well aligned with storage & analysis functions. Conventional search techniques used in relational-based data warehouses tend to lower performance as data volume increases. Applications such as Apache Solr, elasticsearch & the search associated with HBase, Hive etc. are designed to work with massive data storage facilities.
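To make the analysis point above concrete, here is a minimal sketch of the kind of job this implies: a Hadoop Streaming map/reduce pair written in Python that counts events per category in tab-delimited, HDFS-resident records. The file layout, field position & job invocation are illustrative assumptions, not a description of any specific deployment.

```python
# streaming_count.py - a minimal Hadoop Streaming sketch (illustrative paths & fields).
# Submit with something like:
#   hadoop jar hadoop-streaming.jar \
#     -input /data/events -output /data/event_counts \
#     -mapper "python streaming_count.py map" -reducer "python streaming_count.py reduce"
import sys

def mapper():
    # Emit (category, 1) for each tab-delimited record; assume category is the 3rd field.
    for line in sys.stdin:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 2:
            print("%s\t1" % fields[2])

def reducer():
    # Streaming sorts mapper output by key, so counts can be summed in a single pass.
    current_key, count = None, 0
    for line in sys.stdin:
        key, value = line.rstrip("\n").split("\t")
        if key != current_key and current_key is not None:
            print("%s\t%d" % (current_key, count))
            count = 0
        current_key = key
        count += int(value)
    if current_key is not None:
        print("%s\t%d" % (current_key, count))

if __name__ == "__main__":
    mapper() if sys.argv[1:] == ["map"] else reducer()
```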

Prediction is hard, we are told. The great Yogi Berra is reported to have opined that, “Prediction is hard, especially when it’s about the future”. One of the best ways to ensure that your predictions are accurate is to make them about the present. As a technology futurist, I’ve been guilty of this myself. Many of the IDC predictions are just this. They are not wrong… they may be accurate & prescient statements about the state of the technology – they just are not predictions.

So what are the IDC predictions? I’ll list them here with associated comments:


1.     Visual data discovery tools will be growing 2.5x faster than the rest of the BI market. By 2018, investing in this enabler of end-user self-service will become a requirement for all enterprises.
I think this is essentially correct & already happening faster than this prediction. Many organizations already use visual front-ends for their business intelligence (BI) analysis & so their expectations are set by current practice. Unfortunately, this often means that it is difficult to look beyond current practice to find unique & productive ways of using tools that are already being used in another context. IDC points out that the adoption of visualization tools is driven by a demand for “self-service” & BYOT (bring your own tools) in analysis. While this might be true, a more compelling motivation is that often the results of complex modeling, or even complex statistical analysis, are difficult to interpret if one is not a data scientist. Tools such as Qlik & Tableau allow for quick summarization of certain types of results in an easily understandable form. One of the downsides of these tools is that results are often more complex or nuanced than can be expressed in a visualization. I have found that these tools are best used to summarize results for executive review.
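As a small illustration of that last point, here is a minimal sketch of summarizing a model’s output into a single executive-level chart with Python/matplotlib; the segments & risk figures are made-up placeholders rather than real results.

```python
# Minimal sketch: reduce complex model output to one executive-level chart.
# Segment names & readmission-risk values below are hypothetical placeholders.
import matplotlib.pyplot as plt

segments = ["Cardiology", "Oncology", "Orthopedics", "Primary care"]
predicted_risk = [0.31, 0.24, 0.12, 0.08]   # hypothetical modeled 30-day readmission risk
positions = list(range(len(segments)))

fig, ax = plt.subplots(figsize=(6, 3))
ax.barh(positions, predicted_risk)
ax.set_yticks(positions)
ax.set_yticklabels(segments)
ax.set_xlabel("Predicted 30-day readmission risk")
ax.set_title("Model output summarized for executive review")
fig.tight_layout()
fig.savefig("readmission_summary.png")      # or plt.show() for an interactive review
```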
2.     Over the next 5 years spending on cloud-based BDA (big data & analytics) solutions will grow 3x faster than spending for on-premise solutions. Hybrid on/off premise deployments will become a requirement.
Also essentially correct, but what it says about big data & analytics is that the majority of systems used for this effort will be cloud-hosted in some way. This both reduces cost, as opposed to on-premise hosting, & reduces risk as cloud services take on the maintenance & upgrade efforts needed to make the analytic effort work.
There is a different way of looking at this, though, one that I have to deal with currently on several data projects. The problem that this solution tries to address is the provision of (perceived) adequate levels of data privacy & security. A number of companies that I’m working with have specified that no data is to leave their security perimeter & similarly, no processing (analysis) is to be done off-site. This was fine when companies had 1 PB (10^15 bytes) of data but gets difficult to impossible at, say, 500 PBs. At least one of these companies has a respectable amount of data (35-40 PBs). This company has chosen a commercial private cloud deployment for BDA (from Rackspace) that allows for self-service provisioning of servers, elastic scalability, multitenancy & many of the advantages of cloud-based deployments in what is essentially a private environment. Luckily, they have a well-resourced & capable IT group, as this is not an easy deployment, even with a commercial vendor. The most popular current private cloud deployment is an open source package from OpenStack. There are many (many) horror stories related to these deployments (not just OpenStack, but any of the open source vendors). The technical & deployment issues with open source private clouds have been written about at least since mid-2013[3]. Of course, as Mr. Asay & many others have pointed out, the real problem is the contradiction of developing & deploying a cloud infrastructure for private use. I actually think there is a place for this specific solution, but I agree that it is much more likely to be used as a hybrid solution where data & proprietary results are stored privately but processing may be done publicly. Of course, the company I’m working with that has 40 PBs of data wanted everything on premise, although they did the deployment at a remote data center where they lease space, so everything is relative…
3.     Shortage of skilled staff will persist. In the U.S. alone there will be 181K deep analytics roles in 2018 and 5x that many positions requiring related skills in data management and interpretation.
I don’t know how IDC derived these numbers, but they are probably as good as any. The thing that I think has to be addressed is how we promote & advance analytics without having to wait for 181,000 doctoral-level data scientists to be trained. It’s Q4 2015; can this many people be trained by 2018? Probably not… This means that we have to “make do” with what we have. In part we are already doing this by using SQL as a means of querying HDFS data, but we need to go beyond this. SQL is (kinda) OK as a database query language[4], but it was never intended, despite the provision for stored procedures[5], as a modeling & functional execution language. Data analysts need to begin to learn additional languages to model & query the data in massive stores.
As of today, the primary languages in use for this are: R, Python, Java, Scala, Pig Latin & various additions to the Hadoop stack such as Hive. R is the most used analytic language at present. It is excellent for complex statistical models & analysis & has a very rich ecosystem of added functions & libraries. It is probably just on the cusp of being superseded, primarily because it has some size limitations with respect to how much data it can deal with. Python is easier to learn & use than R, but it still has some size limitations & is not highly performant at scale. Java, of course, has the advantage of a very large programmer base, but it is not very good at statistical analysis (complex functions have to be written rather than called from libraries). It does have good options for the display of statistical results. Scala runs on the Java virtual machine & is mainly used today to build machine-learning algorithms. Pig Latin runs on the Pig platform (developed at Yahoo! Research & now an Apache project) & is a high-level dataflow notation that compiles into MapReduce jobs for analysis.
Probably the easiest path here is to start with what you know: SQL, moving to Java, moving to something like Pig, to be able to write MapReduce jobs directly. Also, keeping up with what is being put into open source in this area is important, as many languages & visual palettes that allow direct MapReduce programming will be developed & released over the next 2-3 years.
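To illustrate the “start with what you know” path, here is a minimal sketch of the same aggregation expressed first as the SQL an analyst already writes, & then in Python/pandas; the table, file & column names are assumptions made for the example, not a real schema.

```python
# The SQL an analyst already knows:
#   SELECT diagnosis, COUNT(*) AS encounters, AVG(cost) AS avg_cost
#   FROM encounters
#   GROUP BY diagnosis
#   ORDER BY avg_cost DESC;
#
# The same analysis in Python/pandas (file & column names are illustrative):
import pandas as pd

encounters = pd.read_csv("encounters.csv")    # assumed columns: patient_id, diagnosis, cost
summary = (encounters
           .groupby("diagnosis")["cost"]
           .agg(["count", "mean"])
           .rename(columns={"count": "encounters", "mean": "avg_cost"})
           .sort_values("avg_cost", ascending=False))
print(summary.head(10))
```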
4.     By 2017 unified data platform architecture will become the foundation of BDA strategy. The unification will occur across information management, analysis, and search technology.
This has already happened, as evidenced by the offerings of Cloudera, Hortonworks, etc. I expect that offerings of this type will continue to be more & more integrated, so that ultra-large-scale information management, analysis & search will seem almost seamless. This will also apply to offerings based on other (non-Hadoop) platforms such as Spark.
5.     Growth in applications incorporating advanced and predictive analytics, including machine learning, will accelerate in 2015. These apps will grow 65% faster than apps without predictive functionality.
This is also essentially already happening. Very few people actually understand the internals of machine learning, but more & more businesses are basing business models & products on it. I think that there are two major paths here. The first is that products in the form of applications & add-on modules will become very important so that machine learning in some form can be integrated into many different types of business processes. Second, these machine-learning modules will also be integrated into the analytic stacks mentioned above… Currently there are several such modules associated with both Hadoop (Apache Mahout, Vowpal Wabbit, Cloudera Oryx, 0xdata H2O, MLlib) & Spark (MLlib, Cloudera Oryx). Most of these provide trend analysis &/or predictive algorithms. Actual machine learning (supervised or unsupervised) integrated with analytics stacks is still a little way off (2-3 years).
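For a sense of how small the “predictive functionality” embedded in an application can be, here is a minimal sketch using scikit-learn; the features, labels & thresholds are synthetic placeholders, not a real clinical or business data set.

```python
# Minimal predictive-analytics sketch with scikit-learn (synthetic data, illustrative only).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.rand(1000, 4)                        # e.g., age, prior visits, cost, lab value (scaled)
y = (X[:, 0] + 0.5 * X[:, 2] + 0.1 * rng.randn(1000) > 0.8).astype(int)  # synthetic outcome

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

print("holdout accuracy: %.2f" % model.score(X_test, y_test))
print("predicted risk for one new case: %.2f" % model.predict_proba(X_test[:1])[0, 1])
```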
6.     70% of large organizations already purchase external data and 100% will do so by 2019. In parallel more organizations will begin to monetize their data by selling them or providing Value Added Content.
Again, this is not a real prediction. If 70% of organizations do this today, it will be considerably before 2019 that close to 100% of them do it. The real (IMHO) question is what kinds of data are being used & for what purpose(s)? Most of the BDA projects that I have seen fall into two categories: 1) optimization of operational processes & decisions, & 2) optimization of specific knowledge-based processes & decisions such as medical diagnosis or developing predictive trends in interest rate movements. In the first case mostly internal data is used except for operational benchmarking. This requires external data for comparison & trend development. In the second case, substantial amounts of data exist outside of a typical organization that will enhance the optimization of knowledge-based processes. In healthcare, for instance, this might include clinical data from partners or State & Federal sources, population health data from partners or State & Federal sources, registry data on immunization, best practice data from public & private sources & many more. Most of this will require acquiring data from external sources, & in turn, as IDC has pointed out, this will provide substantial opportunities for organizations to monetize their data & for intermediaries to aggregate & “sell” data.
7.     Adoption of technology to continuously analyze streams of events will accelerate in 2015 as it is applied to IoT analytics – which is expected to grow at a 5-year CAGR of 30%.
The main driver for this, at least commercially, will be the Internet of Things (IoT), so depending on how quickly you believe the IoT will be developed, deployed & adopted at scale, this rate is either wildly too low or wildly too high. I currently do not see a huge push commercially for the IoT, so I think this rate is too high. In 5 years, I think we’ll still be looking at early adoption, especially in at-home use, except in some specific areas (a minimal sketch of this kind of continuous stream analysis appears after the list below). These include (but, as we futurists say, are not limited to):
·      Healthcare – remote monitoring both in facility & at home will increase the number of sensors that are reporting on an individual’s health status.
·      Transportation – There are two separate areas here: autonomous driving will increase the data flow as information is analyzed both in real time & asynchronously. This will also be true for larger-scale data streams for things like traffic control.
·      Process Manufacturing – This has already happened to a large extent. Industries such as chemical production have used continuous monitoring & analysis of data for years.
·      Other Manufacturing – More & more discrete manufacturing processes are designed with this type of sensing & monitoring.
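Here is the minimal sketch mentioned above: a sliding-window check over a continuous stream of sensor readings in Python. The reading generator, window size & threshold are made-up placeholders for illustration only.

```python
# Minimal sketch of continuous stream analysis: a sliding-window check on a sensor feed.
# The reading generator, window size & threshold are hypothetical placeholders.
import random
from collections import deque

def sensor_readings(n=200):
    """Stand-in for a live device feed (e.g., a heart-rate monitor)."""
    for _ in range(n):
        yield 72 + random.gauss(0, 4) + (20 if random.random() < 0.02 else 0)

window = deque(maxlen=30)                 # last 30 readings act as the analysis window
for i, reading in enumerate(sensor_readings()):
    window.append(reading)
    baseline = sum(window) / len(window)
    if len(window) == window.maxlen and reading > baseline * 1.25:
        print("event at t=%d: reading %.1f vs. rolling baseline %.1f" % (i, reading, baseline))
```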

8.     Decision management platforms will expand at a CAGR of 60% through 2019 in response to the need for greater consistency in decision making and decision making process knowledge retention.
This one I am not so sure about. As above, I think this will vary wildly in different segments. For generalized business & strategy decisions, I think this will continue to be a hard sell. The technology exists now to greatly facilitate & enhance planning & decision-making processes, but there are real social & cultural impediments to adoption. In my experience these are of two types: 1) unfamiliarity with the concepts & possibilities of big data analysis, & 2) organizational & individual homeostasis that translates into resistance to change, even in the face of real need for change. I started this essay by stating that prediction is hard… I believe that change is even harder. People can be educated about the concepts & function of new technology & process; they may even come to understand the technology & its possible uses & advantages, but if they are socially &/or culturally biased against change, even if covertly, adoption of the new technology is very difficult. I believe that general adoption of decision management platforms, or any other manifestation of big data & analytics, will require some early adopters to have unqualified successes that are obvious & evident. This may happen in specific segments over the next 3 or so years, at which point many more individuals & organizations, in that segment, will be more amenable to adoption. This pattern of adoption by segment (healthcare, financial services etc.) will ensure that general adoption beyond early adopters & risk takers will be delayed, probably in the 5-8 year range.
9.     Rich media (video, audio, image) analytics will at least triple in 2015 and emerge as the key driver for BDA technology investment.
I’m also not so sure about this one. I cannot see this tripling by the end of 2015, as the investment made in general in BDA does not support any segment of it tripling this year. I also cannot see this being the primary driver for BDA investment anytime soon. Most companies that are investing at this point are not primarily in the business of media analytics, at least not in the segments I have a view into. This may be true in the media industry, but I do not yet see a business problem &/or business model that would lead to this rate of adoption.
The other issue here is that analysis of this type of data is still primarily of its metadata. Analysis & more detailed analytics of the internals (actual content) of media data types are still research topics. It will take another 3-5 years before we have reliable & accurate algorithms to do this type of analysis. Of course, I could be wrong,… but…
10. By 2018 half of all consumers will interact with services based on cognitive computing on a regular basis.

Well, this depends on what your definition of “cognitive computing” is. TechTarget defines it as “simulation of human thought processes in a computerized model. Cognitive computing involves self-learning systems that use data mining, pattern recognition and natural language processing to mimic the way the human brain works.”[6]  The definition goes on to say that the main function in these systems is machine learning. OK, I’ve been working in this area since the late 1970s & I am not convinced that such systems exist today or that they will, in fact, exist in 2018. Systems that use many of these capabilities certainly exist today. Some of those systems are even productive & interesting in specific ways. If this prediction is meant to mean that a large number of people will interact with systems that have used a supervised random tree algorithm to produce an optimized prospect list for an insurance product, then that is true today. If this prediction is meant to mean that a large number of people will interact with a system that has a general capability for interacting with & responding to a person with some underspecified request, I think that it is very unlikely in this timeframe. I have written much more extensively on this in a previous blog post.[7]

So where are we? I think many of IDC’s predictions were, in fact, peridictions[8]. The two most interesting ones (IMHO) were number 2, regarding cloud-based BDA solutions, & number 3, regarding the lack of knowledgeable people to do analysis. There are many pluses & minuses to cloud deployments, especially with respect to security & privacy requirements. The potential impediments in these deployments will be dealt with over the next few years by cloud vendors & the use of hybrid clouds will, I believe, be the dominant deployment model 5 years from now. Skilled people are another matter. If there really is a need for close to 200K data scientists of various levels in the next 5 years, that need cannot be met. We’ll have to proceed using what we already know & what can be learned during this period, which includes: use of integrated platforms that unify data storage, analytic design, execution of analysis & visualization of results. We’ll also have to start building our analytic models with what expertise is available… SQL moving to more suitable languages & modeling capabilities. The one thing that we do know about big data & analytics is that it will become increasingly important – initially more as a way of determining what questions to ask & eventually (within 5-8 years) as a way of developing & testing strategy[9]. Another thing we know is that this type of analytics, or any type, will not provide actual answers for our strategic questions. That we’ll still have to do ourselves, for the foreseeable future, until we really develop & can interact with cognitive computing.


[1] Big Data does not even appear on the Gartner 2015 Hype Cycle, although it was entering the “Trough of Disillusionment” on the 2014 chart.
[2] The Coevolution of Organizations & Technology. 1994. MIT/LFM Working Papers; A Framework & Model for Technology Adoption in Healthcare Organizations. 2006. PostTechnical Strategist.
[3] c.f. Matt Asay writing for TechRepublic, http://www.techrepublic.com/article/private-clouds-very-public-failure/
[4] The author was a member, representing the Digital Equipment Corporation, of the ANSI X3H2 Standards Committee that standardized SQL the first time in 1986.
[5] That I opposed during the initial standardization & am still not a fan of today… Stored procedures are a great way to make the function of an analysis opaque & not amenable to modification or debugging.
[6] TechTarget definition of “cognitive computing”.
[7] Turing Tests, Search & Current AI. http://posttechnical.blogspot.com/2015/09/turing-tests-search-current-ai.html
[8] statements about the present
[9] see my blog post: Design Thinking as Work Process & Strategy.  http://posttechnical.blogspot.com/2015/09/design-thinking-as-work-process-strategy.html

Monday, September 28, 2015

Design Thinking as Work Process & Strategy


 
Design is the method of putting form and content together. Design, just as art, has multiple definitions; there is no single definition. Design can be art. Design can be aesthetics. Design is so simple; that’s why it is so complicated.

Paul Rand[1]







I undertook a project in the spring of 1994. I had been thinking about the future of work; lots of people had, including many people that I was interacting with at MIT. I started a series of discussions with Bill Mitchell[3], Tom Malone[4], Joel Moses[5] & Peter Rowe[6] that led me to begin thinking about a new model for knowledge work[7] (this is a lot of footnotes already, even for me…). The model I discussed with Bill Mitchell & others cast knowledge work more as a design process & less as an industrial (?) process. This seemed self-evident to me, even if it didn’t to some other people. What did I (& still do) mean by this?

Studies of the evolution of work[8] suggest that the future (knowledge) work paradigm will have several major characteristics: a) work will primarily consist of the identification, acquisition and manipulation of information and knowledge, b) the convergence of information and communication technologies will provide the basis for this work, & c) design process will provide the framework for both doing work as well as planning & strategizing about it. This knowledge work will be different than the work people now do, even current "knowledge work[9]". In order to determine what types of systems and devices need to be developed to facilitate this future knowledge work paradigm, we must first produce and validate, to the extent possible, a description of this work.

First, though, more on evolution. I believe that the next ten years will be partitioned into three broad categories of work evolution. These are shown in this table.

| Current – now | Near-future – 3-5 years | Post-convergence – 8-12 years |
| --- | --- | --- |
| Mainly content based | Content & process based | Knowledge & model based |
| Focus is: problem solving | Convergence, i.e. process & teaming | Design model, i.e. episodic, contradictive,… |
| Basis is: content-based deduction, individual contribution | Process innovation, group decision making, mixed & modal logics | Personal & group intuition, induction |
| Content driven | Group & technology driven | Intuition & interconnection driven |

The categories can be described as follows:
  • Content based (current)
o   Creation, management & use of content drives this phase
o   Work is mainly done by individuals & small (relatively isolated) teams
o   Focus is individual & small teams working to solve business problems
o   Feature needs revolve around integration of content management & collaboration with organization of projects/programs, budgets etc. also important
o   Technology is primarily program based, i.e. based on algorithms & deductive (formal) reasoning
o   Primary impediments & push-back focus on inadequate technology for collaboration (information & process) & inadequate organization for collaboration as well as strong commitment to co-location
o   Early adopters  have already moved on to the next phase
  • Process based (3-5 years)
o   Deep integration of Content & Process drives this phase
o   Work is done by distributed teams
o   Focus is on collaborative problem solving in the work process context
o   Will require cross-organizational workflows
o   Process innovation & group decision making become important
o   Technology enables work process & collaboration, algorithms become less formal, more based on people’s work process
o   Advances made in both how people work with (“Big C”) content & collaboration as well as context-based problem solving enable a move to the next phase
o   Primary impediments & push-back focus on attribution for work & position of individual contribution
  • Knowledge & model based (8-12 years)
o   Large-scale use of knowledge & models drive this phase
o   Work, i.e. what people do, takes on characteristics of the design work model
o   Work is done by a range of individuals, collaborative teams & collaborations among teams, location not an impediment
o   Focus is on knowledge & context-based problem solving to address business issues
o   Technology provides means of creating & maintaining context (organizational memory), large-scale knowledge modeling capability, problem solving assistance & ability to deal with very large-scale (Big C) content, use of ultra-large data sets for predictive analysis
o   Models of work & business problems create context for problem solving & business decision making
o   Primary impediments & push-back focus on intuition & contradiction based reasoning rather than standard deduction & induction

Before I move on to a description of a design work model, I’d like to focus on the impediments to this evolution for a minute. In the content-based phase, impediments to this evolution are mainly centered on the opinion that technology is inadequate to support the level of collaboration necessary. This is already inaccurate as technology today more than adequately supports most forms of collaboration. The more serious impediments, that are relevant to the next phase as well, are that organizational forms are not adequate to support collaboration & that a preference for individual contribution still exists in many organizational cultures. Many organizations, even those in technology companies, have not made the changes in composition &/or leadership needed to facilitate larger-scale collaboration that may have essential elements that are not co-located. These changes include:
  • Distributed & decentralized management & decision making
  • Formation of semi-autonomous to independent project & program teams
  • Nonhierarchical communication structures
  •  Nonhierarchical reward structures

Most organizations still adhere to either a push hierarchy (standard command & control structure) or a pull hierarchy[10]  (centralized management & decision-making with management pulling opinions, strategy & tactics from lower levels in the hierarchy), & these types of organizations have considerable difficulty in making the changes required for real collaboration.

In addition, there continues to be a bias towards individual contribution, especially in highly technical areas. The assumption is that a single person (or sometimes a single small group) is the only one with the knowledge or experience to create a design, architecture or program that will be effective & productive. The problem with this bias is that it is sometimes true &, at least in my experience, often much more productive than current egalitarian collaborative models.

Another issue that becomes more relevant in the process-based phase is that of attribution. Often group & individual reward structures, including the opportunity to continue to work on strategic & interesting problems, are tied to attribution. This is an issue if multiple groups can legitimately claim responsibility for the success of a project. Often the solution is to reward “everyone”. This can be less appropriate & much less appreciated by the people that actually affected the success of the project. There are organizational structures that may ameliorate this issue, but the cultural issues are harder to address.

Finally, in the knowledge & model-based phase, the impediments are largely due to the real differences in how work is accomplished. People, except for some types of scientists & engineers, are not (yet) accustomed to working from models or from deep knowledge bases. In addition, during this phase, the decision-making norm will also be quite different – more biased towards guided intuition & non-deductive reasoning. The use of extremely large amounts of data for predictive analysis is also not yet well understood or accepted except in some specific & limited segments. It will take time for these norms, & for the design work model, to take hold, hence the 8-12 year timeframe.

There are a number of examples of knowledge work extant that provide a good start for the description of work & work process in the knowledge & model phase. I believe the best example of this is the general design process. Design is a knowledge-based component of much work, but in the future, most knowledge work will have the characteristics of design work. Rowe (1987), Norman (1990, 1992) and others have described design work that can be summarized as follows:

  • episodic - context is preserved across a number of temporally distinct work periods during which different viewpoints or potential resolutions may be explored
  • knowledge-based - deep and broad historical, technical and contextual knowledge is necessary
  • eclectic – uses a wide range of problem solving techniques
  • contradictions - fueled and/or motivated by the maintenance and resolution of tensions or contradictions including:
    • ill defined/well defined task definition
    • underconstrained/overconstrained
    • intuitive/nonintuitive
    • textual/nontextual
    • "logical/nonlogical"
    • theoretical/pragmatic
    • content/context
    • structure/function (organization/technology)
    • time/completeness
    • complexity/simplicity
    • strategic/tactical.

The recognition, maintenance and resolution of these contradictions are the major characteristics of this new work paradigm. A contradiction occurs when some statement or fact is asserted to be both true and false. In the sense used here, a contradiction is a true work condition(s) or fact(s) whose assertion mutually excludes another true work condition or fact. The presence of unresolved contradictions creates tensions in the work. These tensions are present, to some extent, in our work today. The focus of our current work, however, is technical content. This is true regardless of what the "technology" is that we are working with (VLSI, accounting practice, manufacturing process, etc.). Our near-future work will be increasingly focused on process and teaming. Once the technology exists to support diverse process and teaming models in real time, this will become less of a focus. This should take 3-5 years for early adopters and perhaps as much as 10 years to come into general use. Future knowledge work (5-10 years out) will encompass these process and teaming models, but will more and more come to be structured by the tensions and contradictions listed above.

      What does it mean to say that knowledge work will be based on our ability to recognize, maintain and resolve contradictions and tensions? Recognition is defined as the awareness that something perceived has been perceived before. Maintenance is defined as the act of retaining or preserving some item or idea and resolution is the act of providing a solution to a problem[11] often through the resolution of one of the contradictions. These three processes applied to the list of tensions represent the essence of future knowledge work. This work will be characterized by: 1) the need to be aware that the tensions and contradictions in both the context and content of the work are necessary and that these contradictions provide the dynamic that drives the work, 2) that part of the work may require maintaining these contradictions and exploring the consequences of all potential possibilities, and 3) that eventually a solution or set of solutions must be derived from this tension.

      The formal presentation of the recognition, maintenance and resolution of logical contradiction helps us understand the role of these tensions in the description of a work paradigm. This formal presentation is outlined in Appendix A.

Let’s look at a near-future example of knowledge work in light of this work process description. Natalia is a 28-year-old employee of a large public hospital. She has been tasked with determining the most prevalent disease comorbidities in the hospital’s patient population, determining the statistical characterization of the cost of treating those comorbidities & making recommendations for reducing the cost of that treatment. This is an issue in all healthcare organizations today, & one that is only just at the edge of our ability to analyze & for which we can make anything other than anecdotal recommendations. Here is a “design work process” description of this task:
  • Episodic – This is not Natalia’s only work task, so she has to balance making progress on it with work on a set of other tasks. This means that she must take it up at intervals (in episodes) & that at the start of each episode, she must recover both the context & the content in the state that she left them & in a manner so that she can begin working on this task again with a minimum of redundant effort.
  • Knowledge-based – Natalia will need a broad range of knowledge including: general medical knowledge, hospital clinical procedure, hospital financial procedure, operation of hospital clinical (electronic health record, practice management) software & financial management (cost accounting, billing) software, reporting software, analysis & business intelligence software, development, utilization & interpretation of financial models & other areas of knowledge just for this one task.
  • Eclectic – Natalia will need to be able to access a variety of clinical & financial databases as well as create a financial model for determining revenue & cost associated with the prevalent comorbidities. She will need to be able to interpret the results of this modeling & translate those results (cost per comorbidity, revenue per comorbidity) into recommendations as well as be able to create a way to accurately & understandably present her results & recommendations.
  • Structured & motivated by contradictions – Some of the contradictions that affect this task are:
o   The task definition is ill-defined. This causes Natalia to have to interpret the intention of the task & how its results & recommendations will be used. Her interpretation may not be in line with management expectations, & this tension requires Natalia to create a specific definition of the task she is working to solve.
o   The task is under-constrained. Again, this causes Natalia to focus on developing a description of the boundaries of both the task & the solution; i.e. “what to leave in & what to leave out”.
o   Interpretation of the results & development of recommendations is somewhat intuitive rather than empirical. This is related to theoretical/pragmatic. Natalia must make recommendations that are feasible & that in turn will lead to results in finite time. The tension here is extreme as there are already many recommendations reported in the literature & the public press in this area as well as many expectations of what the “right” solution(s) may be. Natalia must be able to focus on interpretations & recommendations that are relevant & feasible for her organization & not based on general expectations or ideas in the public sphere.
o   The solution(s) may require structural (organizational) changes as well as functional (partly technological) changes. There is a tension in the balance between these that must be addressed.
o   Her manager has given Natalia two weeks (elapsed) to accomplish this task. Natalia has already told her manager that this is not enough time for anything other than a set of guesses, but there is a big strategy meeting coming up &,… There is usually this tension between time & completeness (that is often a proxy for accuracy/correctness).
o   As with most work tasks today, something that seems simple can actually be quite complex. Just count all the patients that have hypertension & figure out what it costs to treat them… simple. Not… First of all, what is a patient? What definition do you use for hypertension (or any clinical condition)? What do you mean by cost? Does the cost accounting system even have patients in it? The more you peel this back, the more complex it becomes. This means Natalia, once again, has to be explicit about definitions for every important term as well as for each calculation (a minimal sketch of this kind of definition-driven calculation follows this list). This means she can at least say that, for this definition of these conditions & comorbidities & for this definition of the calculation of costs, this is the median cost in 2014 of treating this comorbidity for the 2,146 patients recorded with it.
o   This task can be interpreted at a tactical level: what are the relatively small changes that can be made to our clinical workflows & financial processes that will enable us to reduce costs? It can also be interpreted at the strategic level: what are the organizational & large-scale technology changes that can be made that would allow us to systemically reduce costs? As with all of these contradictions, there is a balance to be had here as well.
o   Finally, in healthcare there is a purely strategic & almost unmentionable contradiction that is very well illustrated by this task – that is, cost reduction versus improvement in patient outcomes. These are not necessarily mutually exclusive, but the trade-offs made, especially when resource constrained (think dollars), can make it seem as if they are. Does reducing the cost of treatment on a per-patient or per-condition basis improve or worsen patient outcomes? Certainly we do not want to have worse outcomes at the “cost” of cost reduction, but it often seems as if this is the choice. There is enough fat & waste in the healthcare system that these do not have to be mutually exclusive, but this kind of general statement does not work on a per-patient basis. The tension here can be extreme.
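Here is the minimal sketch referred to above: the kind of explicit, definition-driven calculation Natalia would have to pin down, written in Python/pandas. The file name, column names & the diagnosis-code sets are assumptions made for illustration, not the hospital’s actual data model.

```python
# Minimal sketch: median 2014 treatment cost for one explicitly defined comorbidity.
# File name, column names & the code sets below are illustrative assumptions.
import pandas as pd

costs = pd.read_csv("patient_costs_2014.csv")   # assumed columns: patient_id, diagnosis_code, cost

# State the definitions up front (hypothetical diagnosis code sets):
HYPERTENSION = {"I10", "I11", "I12"}
DIABETES = {"E10", "E11"}

total_cost = costs.groupby("patient_id")["cost"].sum()
codes_seen = costs.groupby("patient_id")["diagnosis_code"].agg(set)

has_comorbidity = codes_seen.apply(
    lambda c: bool(c & HYPERTENSION) and bool(c & DIABETES))
cohort = total_cost[has_comorbidity]

print("patients meeting this comorbidity definition:", len(cohort))
print("median 2014 treatment cost: $%.2f" % cohort.median())
```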

Natalia’s work on this task does appear to be very aligned with our description of design work process. Can we learn something from how designers work to make her work more effective, or even more efficient? Also, I hear you say, “what about strategy, I thought this was about strategy.” Right, lessons learned first…

Several relatively recent sources have looked at these lessons from different perspectives; these include a 2007 report on how global brands use the design process to create leverage & competitive advantage[12] & several Harvard Business Review articles[13].

A primary lesson described in these sources is that design must be tightly tied to & aligned with the end-user experience – not the technology used, general design trends or management expectations. This requires both knowing who the end users will be & creating an experience that is compelling & useful for them in both functional & emotional ways. This is the case whether you are designing new communication devices or, as Natalia is, designing a solution & recommendations for strategic decision-making.

A second lesson is that models & prototypes are essential in the design process, & that they do different things. Models allow you to describe & examine the problem space, while prototypes allow you to explore the solution. Models & prototypes should be readily available to the people who are using or evaluating the work, & feedback loops must be part of the process so that criticism & suggestions from feedback can be incorporated.

Design processes also tend to be optimal as collaborative processes. Teams may be co-located or distributed, but cross-functional collaboration improves the results & outcomes of projects. In Natalia’s case, she is working as an individual contributor but may, in fact, collaborate in areas that she is less experienced in, such as the development of financial models.

Finally, several challenges have also been incorporated as lessons:
  • Larger amounts of ambiguity must be accepted by both end users & peers of the designer
  • The same is true of risk
  • Expectations must be set appropriately – design process is effective at reducing complexity, facilitating innovation & identifying future trends. It can also be effective at producing solutions to complex problems, as in Natalia’s case. It is not so good at actually operating a business. Even design companies like frog[14] & ideo[15] have business processes for program management, financial management etc.

OK, you say, but you promised strategy… There are two aspects to this. The first is that a strategy is a designed object & needs to be developed through a design process. The second is that design process itself is a strategy that can be adopted by organizations. I’ll deal with both of these.

As a designed object, an organization’s strategy needs to have certain characteristics. The primary one of these is that it is focused on a user end-state. What does the organization want to accomplish? What experience is it trying to create for the users of its product or service? Often organizational strategies, especially those of for-profit companies, are specified with respect to how much growth they need to achieve, what industry segments they’ll penetrate & by how much, how much revenue & profit they’ll generate. These are not a strategy but rather the results of a strategy. Strategy can be designed by collaboratively iterating descriptions & tactics (prototyping) & continuously modifying the strategy according to criticism & feedback. As we have seen, such a process must incorporate the acceptance of higher levels of ambiguity & risk in order to succeed. Once such a strategy is developed & decided upon, tactics & business processes must be developed to initiate & evolve the strategy.

What we are talking about is using the design work process to develop the organization’s strategy, so that the strategy is a designed object. We can also see, through this effort, that the design work process itself can be a strategy that is applied to many of the efforts & work tasks that an organization undertakes. This was shown in Natalia’s work example above. Using the design work process as a framework:
  • Episodic & iterative approach so that multiple tasks are worked on at once
o   Use of models for problem description & prototyping for solution exploration
  •  Classification & utilization of deep & large volumes of knowledge
o   Requires ways of searching, storing, analyzing & utilizing such knowledge
  • Use of a broad variety of modeling, prototyping & problem-solving methods
o   Provides the best possibility of interpreting & presenting actionable results
  • Allowing for the recognition, maintenance & resolution of the contradictions inherent in work efforts (as described above)
o   Recognition of contradictions allows for important aspects of a work task or problem to be identified & described
o   Maintenance of such contradictions allows for appropriate levels of ambiguity & risk to be maintained so that the solution space is not narrowed & collapsed prematurely, &
o   Resolution of such contradictions allows solutions to be derived

These general characteristics outline a framework for addressing work tasks, up to & including strategy development, that organizations can productively use. By looking at both the development of strategy & the accomplishment of work tasks as design problems, people (working in organizations or independently) can use a highly flexible model of work process to provide leverage in design, development, modification & measurement of work & work process. When I say “organizations will need to…”, I actually am talking about the people in those organizations, who will need to be more tolerant of ambiguity & risk, understand that their work will be accomplished iteratively & in episodes rather than in a single effort, be able to work with large amounts of information & knowledge & apply a broad set of problem-solving tools, applications & ways of thinking to creating solutions. Organizations that encourage & facilitate this approach to work & strategy, & that are made up of people that can utilize the design work model, will be very well positioned to compete successfully as how we do our work, & even what our work is, evolves over the next 5, 8, 10 years.



[1] Professor of Graphic Design, Yale University, 1956-1969, 1974-1985.
[2] M.C. Escher. Drawing Hands, 1948
[3] 1944-2010, Former Dean of the School of Architecture & Planning, MIT
[4] currently Patrick J. McGovern Professor of Management, Sloan School of Management, MIT
[5] Former Provost & Dean of Engineering, currently Institute Professor, MIT
[6] formerly Dean of the Graduate School of Design & currently Harvard University Distinguished Professor, Harvard University
[7] Knowledge work can be differentiated from other forms of work by its emphasis on "non-routine" problem solving that requires a combination of convergent, divergent, and creative thinking. https://en.wikipedia.org/wiki/Knowledge_worker
[8] Amazon.com lists 16,547 books under this topic ranging from The Critique of Pure Reason. Immanuel Kant. 1781, to The Future of Work – Human Value in a Digital World. Marcus Clarke. February, 2015. Accessed 18 August 2015.
[9] Knowledge work can be differentiated from other forms of work by its emphasis on "non-routine" problem solving that requires a combination of convergent, divergent, and creative thinking. https://en.wikipedia.org/wiki/Knowledge_worker. Accessed 15 August 2015.
[10] D. Lavoy. 2014. Is collaboration limited by organizational structure. http://www.cmswire.com/cms/social-business/is-collaboration-limited-by-organizational-structure-024450.php CMS Newswire, March 2014. accessed 10 September 2015.
[11] all definitions are from The American Heritage Dictionary of the English Language, 3rd Edition. 1992
[12] Eleven Lessons – Managing the Design Process in Global Brands: A study of the Design Process. Design Council. 11/2007.
[13] Design for Action. T. Brown & R. Martin. Harvard Business Review. pp. 56-65. 9/2015.
     Design Thinking Comes of Age. Jon Kolko. Harvard Business Review. pp. 66-71. 9/2015.
[14] www.frogdesign.com/
[15] www.ideo.com/