Tuesday, July 05, 2016

Orchestration Sets The Beat For Agile IT

Agile IT has been widely heralded (and equally widely decried) as a way to align the pace of change in IT with the pace of change in the broader business.

At its core, Agile IT makes the very basic point that if in-house IT cannot keep pace with the business, there are a rapidly increasing number of cloud and SaaS providers for whom that is not a problem.

In short, IT must innovate around time to market or die a death of a thousand credit card cuts as individual developers outsource the IT they need to Amazon and other public cloud providers.

But what is the core activity that drives the shift to agility? Most often this shift is characterized as a vast migration to the cloud. This is true, but it misses the key driver of IT agility: orchestration.

How Orchestration Drives The Cloud
IT automation deals with performing a particular task, such as setting up a single compute node. IT orchestration manages the execution of multiple, interdependent tasks. For example, an orchestration workflow manages dependencies such as the need to install a database before installing an app.
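
The difference between automating a task and orchestrating a workflow can be sketched in a few lines of Python. This is only an illustration: the task names are hypothetical, and a real orchestrator would also handle retries, rollback and parallel execution. It uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+) to derive an execution order from declared dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks; each one maps to the set of tasks it depends on.
dependencies = {
    "provision_node": set(),
    "install_database": {"provision_node"},
    "install_app": {"install_database"},
    "configure_load_balancer": {"install_app"},
}


def plan(deps):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


order = plan(dependencies)
for step, task in enumerate(order, start=1):
    print(step, task)
```

Single-node automation corresponds to running any one of these tasks; orchestration is the part that knows the database has to exist before the app is installed.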

A cloud without effective orchestration is more like a demo - if it works at all, it is likely to break the first time the need arises to update the apps, data or infrastructure supporting that cloud. Key areas of focus for cloud orchestration include:
  • Reference architecture - any orchestration solution must start with a clear understanding of how the pieces fit together, particularly to guarantee reliability at a specific scale and set of workloads. 
  • OS patching - far from being a settled capability, OS patching remains more of a dark art than a science. While patching itself is straightforward, for every 100 servers patched, several will not reboot properly, causing cascading faults across the cloud. Next-gen computing companies like CoreOS are offering some innovative approaches to solving the OS patching problem.
  • Infrastructure lifecycle management - OpenStack upgrades are notoriously challenging, which is one of the reasons companies like Mirantis have achieved such success in helping customers build and manage large OpenStack installations.
  • Application lifecycle management - a cloud is only as valuable as the applications running on it. Orchestration and DevOps are needed at the application, infrastructure and OS levels.
How to Get To Agile IT
Here are a few hints to simplify the task of building an orchestration-first cloud:
  1. Usage drives design - Understand intended usage (workloads and scale) before you design the architecture. There is no such thing as a “one size fits all” cloud - the cloud architecture is dictated by its intended use.
  2. Don’t skimp on designing and testing the reference architecture - understand what happens to network, storage and compute at scale running realistic workloads. Work through failure scenarios (think Chaos Monkey) and ensure that HA and DR work under real-world conditions.
  3. Don’t just automate, orchestrate - while small, pilot clouds can be managed with manual processes and single-node automation, large, production-quality clouds require a significant investment in multi-node orchestration and change management.
  4. Address organizational and business process disruptions up front. Understand the impact of cloud on individual IT roles/responsibilities, career paths and opportunities for advancement.

Monday, June 20, 2016

How Not To Build A Cloud

Thomas Bittman of Gartner wrote a blog post last year about why OpenStack projects fail. In that post, he outlined three particular factors which together cause 60% of OpenStack projects to fall short of expectations:
  • Wrong people (31% of failures): a successful cloud needs commitment from both the operations team and “anchor” tenants.
  • Wrong processes (19% of failures): a successful cloud automates across silos in the software development lifecycle, not just within silos.
  • Wrong metrics (10% of failures): a successful cloud focuses on top line transformation by accelerating delivery of innovative applications and services, not merely on squeezing bottom line costs. 

Wrong people
"Agile clouds need agile processes — and people are your biggest supporters, 
or your biggest roadblocks." - Thomas Bittman
Many OpenStack projects start as technology pilots with part-time technical staff. If there is not a single champion responsible for the success of an OpenStack cloud initiative as their full-time job, the chance of failure is high. There are two critical roles that govern cloud success:
  • Cloud operations champion: this champion is not just responsible for building and operating the cloud (supplying cloud capacity), they are equally responsible for on-boarding developers and workloads onto the cloud (building cloud demand). Their job is to work closely with developer tenants to make sure that the developer on-boarding process is smooth and that key developer tools are available in the cloud application catalog.
  • Cloud anchor tenant: developers are overwhelmingly the most important early adopters of private cloud. Accelerating the software development lifecycle through DevOps automation is by far the highest value of private cloud. Therefore the most important validation for a private cloud is to on-board a key set of developers and show the impact of accelerating the development and go live process for their applications. Having an anchor tenant committed to using the cloud is a key prerequisite for achieving success.

Wrong processes
"Is this really cloud? Or just virtualization? And what about 
the stuff running inside the VMs?" - Thomas Bittman
Many OpenStack projects start with very limited goals around provisioning generic VMs or delivering relatively limited development services. This effectively automates just a silo within the software development lifecycle. Business value comes from being able to automate not just within but also across the silos of the software development lifecycle.
  • Beware of automating silos: for many IT organizations, the tragedy of virtualization has been that developers can provision a VM within 20 minutes, but getting a fully configured development environment takes over 6 weeks.
  • Aim to automate entire Go Live process: The ultimate goal of a private cloud should be to accelerate the delivery of applications and features by automating the entire process from code check in to go live. This level of automation is also the only way a traditional enterprise can compete with “born in the cloud” SaaS businesses.

Wrong metrics
“Not putting the right metrics in place - usually, this is focusing 
on cost-savings, not agility.” - Thomas Bittman

Private cloud has often been sold as a natural extension of virtualization - as such, customers often justified their OpenStack investments based on IT cost savings. While cost savings are one value of a successful cloud, enabling business agility is the core value delivered by OpenStack.

OpenStack projects should measure business value not just for the cloud overall but for each tenant. In particular, they should focus on two tenant metrics:
  • Uptime dashboard: public clouds have long delivered detailed uptime metrics. Private clouds must do the same if they are to build trust with tenants and create a business case to justify additional cloud investments.
  • Value dashboard: private cloud value is primarily driven by its ability to accelerate the software development lifecycle. McKinsey has documented that DevOps automation can accelerate the go live process by 80%, which in turn can deliver top line revenue growth, for example by enabling greater innovation in customer facing apps. Tracking continuous integration deployments is a proxy for the overall acceleration enabled by private cloud.

Planning for OpenStack Success
The antidote for OpenStack project failure is to build a business case for private cloud that addresses people, process and metric issues. This business case should lay out a phased approach for rolling out the private cloud.

The starting point is identifying a full time cloud champion and teaming them with an anchor tenant who will use the cloud and provide input on how to deliver value by accelerating delivery of new applications and features. The next step is to define a phased set of investments, each with clear success metrics that govern timing for subsequent investment:
  • Phase 1: stand up cloud and on-board anchor tenant. Success metric: 99% uptime, 1.5X software development acceleration. Once these metrics are achieved, the company should invest in phase 2 of their rollout.
  • Phase 2: on-board additional tenants. Success metric: 99.9% uptime, 2.0X software development acceleration.
  • Phase 3: automate go live process from code check in to production. Success metric: 99.99% uptime, 4.0X software development acceleration.
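
To make the phase gates concrete, here is a minimal Python sketch that converts each uptime target into the downtime it allows per year and checks whether measured results clear a phase. The targets come from the list above; the function names and the 365-day year are my own assumptions for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years

# (phase name, uptime target in percent, development acceleration target)
PHASES = [
    ("Phase 1", 99.0, 1.5),
    ("Phase 2", 99.9, 2.0),
    ("Phase 3", 99.99, 4.0),
]


def allowed_downtime_hours(uptime_percent):
    """Hours of downtime per year permitted by an uptime target."""
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)


def phase_met(phase, measured_uptime, measured_acceleration):
    """True when both success metrics for the phase are achieved."""
    _, uptime_target, acceleration_target = phase
    return (measured_uptime >= uptime_target
            and measured_acceleration >= acceleration_target)


for name, uptime, acceleration in PHASES:
    hours = allowed_downtime_hours(uptime)
    print(f"{name}: {uptime}% uptime allows {hours:.2f} hours of downtime per year")
```

Each extra "nine" cuts the downtime budget by a factor of ten: roughly 87.6 hours a year at 99%, under 9 hours at 99.9%, and under an hour at 99.99%. That is why the later phases need far more investment in orchestration than the first.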

An ideal approach for a company looking to make a strategic investment in private cloud is to conduct a short pilot in an OpenStack lab that allows them to validate the business case. This kind of a pilot can also allow the cloud champion and “anchor” tenant to work together on clarifying requirements for successfully on-boarding an initial application to the private cloud.

Wednesday, September 23, 2015

For Private Cloud, No Pain Means Big Gains

When virtualization came to the data center, it offered huge cost savings for IT ops and zero migration pain for developers. Coming at a time when IT was being pressed by the business for savings, vSphere took the data center by storm.

This example is instructive when trying to consider why private cloud has had a slower adoption. The short answer is that cloud offers fuzzier benefits for IT ops while forcing a lot of pain on developers.

The lack of a smooth migration path for existing workloads to the cloud goes a long way to explain the relatively bumpy growth of the private cloud market itself.

For example, the latest craze for “cloud native” apps seems like an explicit acknowledgement that vendors are giving up on minimizing cloud migration pain. Rather than focusing on simplicity, the cloud native initiative seems to make a virtue of needing to rebuild existing apps for the cloud.

Of course, for greenfield apps, cloud native and 12-factor apps make great sense. But for the enterprise, greenfield is a small part of what they do (likely less than 10%). There is still a big white space in the market for a vendor who can provide cloud benefits for existing workloads.

This may be the reason for the buzz behind next generation cloud technologies like Apcera, Mesosphere and Google's Kubernetes, which offer ways to support existing Windows and vSphere workloads. The idea of getting improved automation and security while having a migration path to new technologies like Docker gives the enterprise the best of both worlds.

Of course it is early days for these new cloud technologies, but my bet is on whichever vendor can duplicate the original VMware offer of max cloud ops gain for minimum dev pain.

Wednesday, September 16, 2015

Entrepreneur’s Note to Self - Don’t Die

“Note to self: don't change for anyone / Note to self: don't die / Note to self: don't change for anyone / Don't change, just lie” - Ryan Adams
 If Woody Allen is right that 80% of life is just showing up, then the bulk of an entrepreneur’s job is keeping the company alive long enough to succeed. That in turn means constantly scanning the horizon for what is most likely to kill you next.

It turns out, this is how NASA trains their astronauts to stay alive in the unforgiving environment of space. The singing astronaut Chris Hadfield gave an interview where he described this approach:
“Half of the risk of a six-month flight is in the first nine minutes, so as a crew, how do you stay focused? How do you not get paralyzed by the fear of it? The way we do it is to break down: What are the risks? And a nice way to keep reminding yourself is: What's the next thing that's going to kill me?”
In the startup world, death comes most quickly through failing to grow rapidly. That means the two most critical tasks are keeping current customers happy and getting new ones. Growth is the bait that attracts capital and capital is oxygen for a startup.

Once a startup company is funded, there is an immediate desire to draw a huge sigh of relief and think about how to fix everything that is wrong with the product, starting with a ground up redesign. However, startups are fragile creatures. Customers pay for solutions - for them, elegance is secondary.

A company can easily die while it is “fixing” its product. A better approach is to prioritize resources to guarantee growth and approach product redesign in a modular fashion that still enables a steady stream of customer-facing enhancements.

Wednesday, August 26, 2015

People Only Buy To Get Promoted - The Key To Enterprise Sales

I have been fortunate to have many good sales mentors in my career but the best hands down was Joe Roebuck. Joe headed sales at Sun Microsystems for 17 years and was on my board at Persistence Software for 5 years.

Joe also gave me the most important insight about how to sell enterprise software:

People only buy to get promoted.

The enterprise software version of this pithy statement would be something like: enterprise buyers will only buy your shiny object if they see it leading directly to recognition, acclaim and promotion or at least a raise.

There is a lifetime of sales knowledge encapsulated in that quote. Here is how I interpret it:

  • Status quo is easy: enterprise software is a business in which innovative upstarts try to unseat incumbents. The easy purchase decision in enterprise software is always to go with the incumbent. 
  • Shiny objects are risky: enterprise buyers always have a choice between safe status quo vendors and an array of risky but alluring new vendors.
  • Career advancement is why buyers take risks: if a buyer does not get a personal benefit - attention, a raise, a promotion - the risk quite literally does not outweigh the reward.
  • Advancing customer careers is how companies win: most sales people think only through the customer signature and maybe the initial implementation. Making a customer successful is a longer-term venture and extends at least to the buyer’s next HR review cycle.

There is no more passionate evangelist than a successful buyer and it only takes a few really happy buyers for the market herd instinct to kick in.  For example, VMworld pulls in 10,000 attendees a year, all of whom believe that VMware products are advancing their career.

Buyers know that product features don't guarantee success. Just because a product is objectively better doesn’t mean it will be successfully implemented, integrated and maintained by the vendor. A key to success in sales is to structure a deal in such a way that the company has incentive to stay focused on the success of the deal over time.

It is interesting that people always say of incumbents like IBM, “nobody gets fired for buying IBM.” The flip side is that the only reason a buyer would make a riskier choice would be for the opportunity to be promoted, aka the opposite of being fired.

Wednesday, August 19, 2015

When Will Cloud Come to PaaS?

One of the perennial cloud predictions has been that 200x would be the year of the Platform as a Service (PaaS) cloud. The logic goes that if an automated data center in the sky is good, an automated development platform in the sky must be even better.

“Normal” clouds like Amazon AWS give the developer a virtual computer to load their OS and App onto. PaaS gives the developer a virtual computer with the OS, database and middleware “pre-loaded,” thereby simplifying the deployment.

Yet so far, PaaS adoption has been anemic and Gartner puts PaaS at 1% of the overall cloud market. At the same time, new technologies like Docker and containers have attracted far more attention from the developer community.

PaaS Lacks “Write Once, Run Anywhere” Simplicity

Developers love the simplicity of “write once, run anywhere.” This is what gave Java its initial allure and it is at the core of Docker’s recent ascendance to the top of the shiny tech object heap. PaaS has traditionally been more of a “write differently for each place” kind of solution.  Issues include:
  1. PaaS lock-in – there is no example in the industry of PaaS portability – each PaaS has its own unique services and configuration. While IaaS also suffers from similar lock-in issues, the effort required to port from one IaaS cloud to another is much lower.
  2. Anemic ecosystem - real applications use many different services, such as database, file storage, security and messaging. In order to deploy an application in a PaaS, the PaaS must support every service that app needs.
  3. Public/private inflexibility – many PaaS offerings are cloud only (Heroku) or on premise (OpenShift). Even for PaaS offerings that can run on or off premise, replicating the exact service ecosystem in each environment is challenging.

PaaS For SaaS Is a Winner

A no-brainer use of PaaS is to extend existing SaaS applications. In this case, the write once run anywhere problem goes away because there is only one place to build and run the application. 

The big winner in PaaS to date has been Salesforce. Their Force.com platform makes it easy for companies to extend their CRM applications or build entirely new applications. With this platform, Salesforce has created huge competitive differentiation in the CRM space while also building a PaaS revenue stream approaching $1B a year, dwarfing any other PaaS offering.

Cloud Native PaaS Could Go Mainstream

Google recently released their cloud native platform, called Kubernetes (which means pilot in Greek). Kubernetes is a cloud operating system for containers that runs anywhere. A number of PaaS vendors are banding together to define the requirements for cloud native computing.

The promise is to simplify still further the process of provisioning services to cloud containers, regardless of where they are running. It will be exciting to see how existing PaaS vendors like CloudFoundry incorporate these new technologies into their offerings.

Monday, August 10, 2015

Enterprises Need A Panic Button for Security Breaches

Most home security systems have a panic button - if you hear something go bump in the night you can push a panic button to start the sirens wailing, call the cops and hopefully send the bad guys scurrying. As useful as this is for homeowners, enterprises need a security panic button even more.

Security spending is heavily weighted towards keeping bad guys out. Media coverage has demonstrated how often they get in anyway. According to the CyberEdge Group, 71% of large enterprises reported at least 1 successful hacking attack in 2014.

While there is extensive advice around the manual steps to take to respond to a malicious attack, there is little in the way of an automated response to an attack. This is an important area in which to extend enterprise automation.

What might a Panic Button for automated response to security incidents look like? Essentially this would be an automated workflow that would implement a set of tasks to eliminate the current attack, identify existing losses and minimize future damage. An example workflow could include:

  1. Identify compromised systems from intrusion detection tools and disconnect compromised systems from network
  2. Search for unauthorized processes or applications currently running or set to run on startup and remediate
  3. Run file integrity checks and restore files to last known good state
  4. Examine authentication system for unauthorized entries/changes and roll back suspect changes
  5. Make backup copies of breached systems for forensic analysis
  6. Identify information stolen from OS and database logs
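
As a rough sketch of how such a workflow could be wired together, the Python below runs the six steps in order and records what each one does. Everything here is hypothetical: the step functions are placeholders that only log the action a real integration (intrusion detection, backup, directory service) would perform, and the choice to continue after a failed step is my assumption rather than a recommendation.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Incident:
    """State shared by every step of the workflow."""
    compromised_hosts: List[str]
    log: List[str] = field(default_factory=list)


# Placeholder steps: each records the action instead of performing it.
def isolate_hosts(incident: Incident) -> None:
    for host in incident.compromised_hosts:
        incident.log.append(f"isolate {host}")


def remediate_processes(incident: Incident) -> None:
    incident.log.append("remove unauthorized processes and startup entries")


def restore_files(incident: Incident) -> None:
    incident.log.append("restore files to last known good state")


def roll_back_auth_changes(incident: Incident) -> None:
    incident.log.append("roll back suspect authentication changes")


def snapshot_for_forensics(incident: Incident) -> None:
    incident.log.append("back up breached systems for forensic analysis")


def identify_stolen_data(incident: Incident) -> None:
    incident.log.append("review OS and database logs for stolen information")


WORKFLOW: List[Callable[[Incident], None]] = [
    isolate_hosts,
    remediate_processes,
    restore_files,
    roll_back_auth_changes,
    snapshot_for_forensics,
    identify_stolen_data,
]


def press_panic_button(incident: Incident, steps=WORKFLOW) -> List[str]:
    """Run every step in order; record a failure but keep going."""
    for step in steps:
        try:
            step(incident)
        except Exception as exc:
            incident.log.append(f"FAILED {step.__name__}: {exc}")
    return incident.log


for entry in press_panic_button(Incident(["web-01", "db-02"])):
    print(entry)
```

The log doubles as an audit trail, which matters when the response has to be explained to customers or regulators afterwards.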

By creating automated “Panic Button” workflows that respond to security incidents, enterprises can reduce the damage of an attack. This automated approach can also show customers that an enterprise is taking full precautions to protect their personal information from falling into the wrong hands.

Wednesday, May 13, 2015

Entrepreneurial Management – The Loose-Tight Loop

For the last 20 years, I have been leading teams both small (2 partners and a turtle) and large (over 850 employees). During that time I have had big successes (IPO on Nasdaq, sale to VMware) and crushing failures (remember the Y2K bubble?) Sitting on numerous boards also gave me a ring-side seat to observe different management styles.

Through this experience I have evolved a management style to drive rapid business transformation and growth. I call this style the “loose-tight loop” (a mash-up of ideas from the Tom Peters book “In Search of Excellence” and OODA loops).

In the very dynamic startup world, it is often hard to strike the right balance between “if I do it myself I know it will get done right” and letting chaos rule. Because the market is evolving at the same time as the company, assumptions about customers, competitors and technology change rapidly as well.

I see the job of the CEO as aligning the team on a set of audacious goals and orchestrating the achievement of those goals through three activities:
  • Tight on what to do – align the team on goals and priorities
  • Loose on how to do it – trust the team to reach those goals efficiently and creatively
  • Loop to learn – communicate regularly to learn what is working and not working (aka trust but verify)

Over time, I have adopted a number of agile process ideas to put the loose-tight loop into practice:
  • Daily standup – 15 min call to communicate actions and identify issues 
  • Weekly top 5s – on Monday, each exec lists their 5 priorities for that week, summarizes status for the top 5 priorities for last week and updates MBOs
  • Weekly check in – 1 hr one on one meeting to collaborate and coach
  • 6 week sprint – 2 hr meeting to go deep on 1-2 issues, review MBOs for last sprint and set MBOs for next sprint
  • Annual plan – 2 day planning session to rebuild business plan for next year

Management By Objectives (MBOs) are critical as they are the explicit link between team objectives and executive priorities. Linking MBOs too closely to compensation can reduce their value. MBOs should represent challenging tasks – 100% achievement is not expected and is likely a sign that the goals were too easy. These MBOs become calls to action for the team to support each other in accomplishing tough tasks.  

In the loose-tight loop, the CEO’s job is to get everyone onto the same map and working together to reach the same destination. The executives’ job is to execute in alignment with the plan and ask for help if it turns out our assumptions are wrong.

In fact, the biggest execution risk is that execs are too slow in asking for help when they run into trouble. More experienced execs have the confidence to ask for help when they need it. Less experienced execs try to bluff their way through the problem. This is dangerous to the whole team because often execution challenges mask underlying mistaken assumptions.

Saturday, March 08, 2014

Location, location, location - Why I Joined BMC

The enterprise software market is not that different than the real estate market - where you are positioned in the market is everything.

In the nerdier-than-thou Bay Area, moving from VMware to BMC is not the most obvious move, so here are some of my thoughts on my decision.

At this point, I have started 2 companies (Persistence, Medaid), gone public once (PRSW - never again!), sold 3 companies (Persistence, WaveMaker, Reportive) and led one spinout (Pivotal).

Figuring out what to do next was a challenge.

I had always felt that in evaluating a job, team comes first and opportunity comes second (or in Jim Collins-speak, first who, then what).

When I was first introduced to BMC, I spoke to Eric Yau and was impressed by his vision about transforming BMC - I felt it was very similar to the transformation project I had worked on at VMware. As I met with other BMC executives, I was struck by the overall quality of the executives and their commitment to make BMC the leader in cloud and automation management.

I believe that BMC has a unique position in the cloud space because they are not tied to a particular cloud platform. The other key players in the space - VMware, Amazon, Microsoft - all have a dog in the fight. They *care* which underlying platform their cloud automation manages.

In short, the other production-class cloud managers are focused on building a purebred cloud backed by their OS or hypervisor - only BMC has a singular focus on hybrid cloud.

If a key reason to move to cloud is greater customer choice, those same customers will be looking for the “Switzerland of cloud managers” to preserve their choice.

Time will tell, but so far I am thrilled with both the market opportunity in front of BMC and the collaborative culture within BMC.

Thursday, September 12, 2013

Engineering Management - Shaolin Style

A friend of mine just got a well-deserved promotion from code horse to manager. Here are my quick thoughts on making that transition.

The basic idea is that when you are given a little more responsibility, your words and actions carry more weight. For that reason, it is important to be careful about throwing that weight around.

Your job is no longer to optimize your output, but to optimize the output of your group. Don't be the genius with a thousand helpers!

In particular, here is some advice to ease into a new engineering manager role:

  • Listen more. There is an expression about argumentative people - "they don't listen, they just reload." Since your words carry more weight, make sure you really understand other people's point of view before you offer your own. Once you wade in with guns blazing, other engineers will be less likely to confront you.
  • Code less. The tradeoff for more human communication is less computer communication. The time you spend helping make other people effective comes directly out of your average daily KLOC. Remember, you are making the team's total output better at the expense of your own output - this will smart a bit at first!
  • Start team building.
  • Stop architecting. If your vote counts for more than other engineers by dint of your hierarchical position, you can win architecture arguments just by yelling louder. To build a real engineering team, you have to separate the team leadership position from the tech leadership position. If you are the team leader, you just can't be the tech leader as well.

The net of it all is to use more influence, less telling; more carrot, less stick; you get the picture!