Saturday, September 28, 2024

The Power of "I Don't Know"

I recently read one of those influencer posts on LinkedIn. You know the kind: "Don't do this... Do this instead..." This one was about acing the interview, and it said, "Never say 'I don't know' in an interview."

This really surprised me, and I started to wonder how many people are actually taking that advice to heart.

Before I get into this, let me tell you a story.

Many moons ago, I worked for Microsoft in the professional services group (Microsoft Consulting Services or MCS). In those days, Microsoft was growing massively and MCS more than most orgs. This meant we were constantly hiring and everyone senior in MCS was doing technical interviews all the time. As a member of the Exchange team, it was my job to interview candidates who claimed to know Exchange. I had a standard list of questions I would ask. There was one and only one hard and fast rule: the interview could not conclude until the candidate said "I don't know" at least once.
 
There was a very practical reason for this. The goal was to accurately assess the candidate’s technical knowledge. Until you reach the EDGE of that knowledge, you don't really know how much they know. So, you keep asking more and more detailed questions until you get beyond their ability. Then you stop because you found their limit. It also had the added benefit of finding out what happened when they didn’t know something. Did they just make up an answer? Not a good habit for a consultant.
 
When this strategy was explained to me, I was perfectly fine with it. It made logical sense. However, many of the candidates I interviewed were distressed by it. A typical conversation:

Candidate: "Gosh, I guess I failed the interview!"
Me: "Oh, why do you say that? You actually did pretty well."
Candidate: "Oh, well I didn't know all the answers."
Me: "Oh, don't worry. We would have kept going until you didn't know the answer."
Candidate: "WHAT! Why would you do that?"
Me: "So I know what you DON'T know, which is way more important than what you do know."
Candidate: "I have never failed an interview question like this in my life."
Me: "Welcome to Microsoft?"

I really didn't know what to say. I had literally never done a technical interview in my life until I worked at Microsoft (I was only 27 when I joined). It seemed to make sense to me but apparently wasn't common. I had one candidate drop out of the process because of this. TBH, if that kind of thing rattled them, they would not have done well at MCS anyway (at least in those days) but I did feel bad that the candidate had such a bad experience.
 
Here’s the thing though: the people we hired didn’t all perform well. We exhaustively tested their knowledge and were extremely diligent in the process. Yet we routinely brought in candidates who knew everything about Exchange and then utterly failed as consultants.
 
It turns out that there was almost zero correlation between a candidate’s technical knowledge at time of hire and their ability to succeed in MCS. What made people successful was having the right skill set, abilities, and drive to get the job done. The ability to learn was vital; knowledge of the current version of the product was just a side note and not a strong indicator of success. After all, the software changes with the next release, so technical product knowledge has a very short shelf life. We eventually changed the interviewing process from “Tell me how feature X works” to “Tell me about a time when you didn’t know the answer and how you were able to figure it out.”
 
Perhaps this early experience broke me, but for my entire career I have been completely uninterested in what a candidate knows. It seems irrelevant to their ability to do a given job. I mean, we can all use Google, right? Now, if a candidate makes a crazy claim on their resume like "I invented SMTP" or something, we are going to talk about SMTP quite a bit just to be sure it's true (it's usually not). However, the vast majority of the time we are going to talk about your ABILITIES and the way you solve problems. You see, knowledge is transient. I know all kinds of things. Most of them are totally useless to my current job. Wanna know how to do IPv4 subnetting? I'm your guy. Or you could Google it. Actually, just go ahead and Google it. We'll all save time.
 
On the other hand, if you are really struggling with interpreting contradictory customer feedback from your target persona interview session, you should probably call me. You are not going to Google that one. You see, I have experience and a skill set in this area. This is my core job function and I've been doing it for quite a while. You don't get that from Google.

Thus, when you interview someone, the amount of core knowledge in their head is interesting but not the most important thing. I ask factual questions mostly to make sure candidates have actually worked on the things they claim to have worked on. Once that's settled, I stop asking them.

What the heck does that have to do with "I don't know"?

Well, if you are being interviewed and the interviewer starts asking factual questions, they are probably just trying to figure out where you sit on the experience spectrum. You either know or you don't. Assuming you're not claiming experience you don't have (don’t do that, we will find out), whether you happen to know one specific fact shouldn't really matter. When you don't know, you just say, "Sorry, I don't know that detail, I didn't work on that part of the system. I can look it up for you if you like. What I actually worked on was...."

And off you go. You show that you have related knowledge, even though you don't know the exact fact they're asking for, and show willingness to look things up. If you want to go the extra mile, go ahead and Google it after the interview and mention it in your follow-up email. "Hey, thanks for the interview. By the way, I Googled that detail we were talking about and the answer is X. Your question made me wonder about the answer so I went ahead and looked it up."

Boom. Done. You showed that you were paying attention, you cared enough to follow up, and you were able to come up with the correct answer quickly. I’d hire someone who could do that.

The bottom line: If the interviewer turns you down because you didn’t know fact X, they actually did you a favor. Yeah, you don’t want to work there. Organizations that hire primarily on the ability to memorize facts are not going to push you to learn and grow. They will focus on the “right” answer and you’ll be expected to repeat the same facts each day. Personally, I don’t want to work in an organization like that, sorry.
 
My ideal job is one where I can solve complex problems and do things that require me to stretch my brain. This is why I love product management. There is no “RIGHT” answer in PM land and I love that. That’s just what I like, but I think most people like to do challenging work.

If you find yourself bored and doing repetitive work, you may want to seek out a role where the interview is about your abilities, not about the facts you know.

Similarly, if you are hiring, think about what the candidate is going to be doing for your organization three or five years from now. Does it matter what facts they know today? Probably not. Focus on things that are evergreen like their abilities, their inquisitiveness, and their grit. Not on short-term things like factual information.

Saturday, September 7, 2024

Layers of Abstraction

As I mentioned in a previous blog post, my team and I have been working on GenAI lately.  You may have seen Outshift’s announcement about the AI Assistant in Panoptica:

https://docs.panoptica.app/docs/cisco-ai-assistant-for-panoptica


Building the AI Assistant for Panoptica was an interesting learning experience.  It helped us understand the limits of current LLM-based agents and allowed us to start work on what the next generation will look like.


This has also caused us to start thinking about how applications will change over time due to LLMs.  There is a very interesting debate raging at the moment that ranges from “nobody will write code in five years” to “everyone will write code in five years” and pretty much every other variation.  It’s exciting because we are right on the cusp of a major paradigm shift and nobody really knows where this is going.  I’ve been around long enough that I’ve seen these paradigm shifts happen a few times in my career so I do have some expectations about how this will go.


The first thing we should remember is that this trend toward higher layers of abstraction is not new at all.  If we look back at the history of computer science, we see a steady progression to higher and higher levels of abstraction.




For extremely old computers like the Altair above, you had to manually enter the boot code via front-panel switches before the machine would start (yes, they had ROMs, but you had to program them yourself).  Then boot loaders were introduced.  For a long time, shops wrote their own operating systems; older mainframe shops didn’t buy their OS, they built it custom.  Later, operating systems became standard commercial products.  Then came the GUI revolution.  Later, we moved to distributed computing and the Web.  Then came virtualization.  Containerization, microservices, PaaS and cloud followed.


At each stage of this evolution, higher-order services were offered.  When I was in college learning to program, I had to talk to storage devices directly and write to disk myself.  Today, with something called S3, I can create an object, persist information to that object and then refer back to that object whenever I want from wherever I want.  Yes, there are still disks sitting somewhere in a data center, but I don’t really need to know how S3 works and I don’t really care.
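To make the contrast concrete, here is a minimal sketch of that higher-level interface in Python using boto3. The bucket name and key are hypothetical, and it assumes AWS credentials are already configured; the point is that nothing in it ever mentions a disk, a volume, or a filesystem.

```python
# Minimal sketch: persisting and retrieving an object via S3 with boto3.
# Assumes AWS credentials are configured and that a bucket named "my-bucket"
# already exists (both hypothetical here).
import boto3

s3 = boto3.client("s3")

# Persist some bytes under a key.
s3.put_object(Bucket="my-bucket", Key="notes/example.txt", Body=b"hello, S3")

# Read them back later, from anywhere that has access to the bucket.
response = s3.get_object(Bucket="my-bucket", Key="notes/example.txt")
print(response["Body"].read())  # b'hello, S3'
```

No device drivers, no partitions, no RAID settings.  The lower layers are still there, but someone else owns them.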


Does that mean that I don’t write software any more?  Well, as a Product Manager, I don’t write software at all, but you know what I mean.  We still write software.  We just don’t write low-level stuff any more and we don’t use low-level constructs.  Of course, this evolution wasn’t even or orderly.  Does anybody remember 4GL?  Fourth Generation Languages were supposed to take over and make things like C irrelevant.  That didn’t happen.  I used one called R:Base for a while but it was just too limited to write serious applications.


Just like the rest of the stack, programming languages have evolved, but the low-level stuff is mostly still there.  In school, I learned machine language, assembler, Ada and C.  All of those languages are still around in some form (even Ada survives in a few niches), though most software teams would never touch assembler any more.  However, if you work on an operating system project like ESXi from VMware, you are still writing pretty low-level code.  You’re probably writing in C, but you’re still down at the hardware.  So, the lower levels don’t go away.


What happens is really more about leverage.  Instead of EVERYONE writing an operating system, we buy that operating system from someone who only does that.  Then we use it without worrying about the down-in-the-weeds details.  It is similar with cloud.  Yes, I probably COULD write my own object store or set up a Ceph server, but I choose not to most of the time because renting S3 buckets from Amazon is super convenient.  There is still a team in Amazon deploying servers, disks and all that stuff but relatively few people in the industry are dealing with that detail because most of us just rent it by the hour from AWS, Azure, GCP or another cloud provider.


What does this say about GenAI and the future of programming?


Well, it’s pretty clear that writing source code is going to get cheaper.  We know that LLMs can already do this and are getting better every day.  Tools like Copilot mean that a developer can produce far more code far more quickly, and thus the code costs less to produce in terms of human hours invested.


Andy Jassy recently claimed that AWS saved 4,500 developer-years of work thanks to GenAI-based programming tools.  That is an insane number.  Even if the real number is only half that, it means their cost per line of code is WAY WAY less than it would have been if that work had been done manually.  You can read his quote here:


https://x.com/ajassy/status/1826608791741493281?lang=en


However, it’s extremely unlikely that traditional algorithmic programming is going anywhere anytime soon.


Much more likely are “hybrid” systems where traditional code sits alongside GenAI-based agents and the two work in concert to solve problems.  This means that even if you are not writing your code by hand, you are still writing and maintaining software.  All the labor and effort that goes into running a SaaS-based application is still there, just hidden behind a new layer of abstraction.
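As a rough illustration of what such a hybrid could look like, here is a small Python sketch.  Everything in it is hypothetical: `LatencyReport`, `compute_report`, and `call_llm` are made-up names, and `call_llm` just stands in for whichever LLM SDK or agent framework you plug in.

```python
# Sketch of a hybrid service: deterministic code computes the facts,
# and a GenAI agent is used only to explain them in natural language.
from dataclasses import dataclass


@dataclass
class LatencyReport:
    service: str
    p99_ms: float
    error_rate: float


def compute_report(samples: list[float], errors: int, total: int, service: str) -> LatencyReport:
    """Traditional, testable, deterministic code path."""
    ordered = sorted(samples)
    p99 = ordered[max(int(len(ordered) * 0.99) - 1, 0)]
    return LatencyReport(service, p99, errors / total)


def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around an LLM API; replace with your provider's SDK."""
    raise NotImplementedError


def summarize(report: LatencyReport) -> str:
    """GenAI code path: turn the structured facts into a human-readable summary."""
    prompt = (
        f"Summarize for an on-call engineer: service={report.service}, "
        f"p99={report.p99_ms}ms, error_rate={report.error_rate:.2%}."
    )
    return call_llm(prompt)
```

The deterministic half is still ordinary software that has to be tested, secured, observed and debugged; the agent half is just another component bolted onto it.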


Thus, GenAI is likely to change HOW we do things but probably won’t change WHAT we are doing that much.  We will still have security issues, we will still need observability, we will still need to diagnose and fix bugs just like we do today.  The tool chain will change and we will probably be more efficient, but we’re still building and running software.  


So, the question for all of us is how to build and optimize this hybrid world we will likely wind up in.  How do we automate away the toil?  How do we move faster?  How do we operate more efficiently?  Not surprisingly, these are the same questions I was asking ten years ago when I first started working on cloud and SaaS-based products.


Thursday, August 29, 2024

UX As We Know It Is Dead

Apologies in advance for yet another ChatGPT blog.

However, I think the most interesting aspect of ChatGPT and related natural language technologies isn't being discussed enough: the profound effect they will have on the way we design UX going forward.

Looking back, it seems extremely similar to the GUI transition.  There was a hard line between the "haves" and the "have nots" that made a huge difference in the market.


Released in 1993, WordPerfect 6 was the end of the line for non-GUI word processors.  At the time, WordPerfect had 60% of the market.

A couple of years later, Word for Windows 95 was released:



While not "modern" compared to current word processors, it's very familiar, and pretty much any user of GDocs today could figure out this version of Word in a few minutes.  The big change was from text (WordPerfect) to GUI (MS Word).  While GUIs had been around for a long time, there came a point when you just didn't release non-GUI software any more.  That point was about 1993-94.

Similarly, natural language systems have been around for some time, with things like Siri pioneering the mass market, and we have had chatbots for many years.  However, they haven't really dominated the market.  They've had their uses, but non-NL interfaces remain the norm.

The question, then, is whether natural language will grow to the point where it eliminates other types of UI.

I believe it will.

Not to the extent that we no longer have a GUI.  That wouldn't make any sense.  You still need some way to interact with the system.  However, software designed after the natural language shift will look and feel very different from software designed before it.

Perhaps a better example from the word processing world would be the "ribbon bar" from Word 2007:



While you might not find this UI very exciting, it was EXTREMELY controversial at the time.  The idea that you didn't have menus any more completely flipped people out.  I was working at Microsoft at the time and had several customers complain bitterly about this change.  It got to the point where we actually had an all-hands at MSFT where the lead designer of the ribbon did a whole presentation defending the decision.  The short version: as things like Word got more complicated, the menu system got deeper and deeper, so it required more and more clicks to do anything.  Even if you knew EXACTLY what you were doing, your productivity was slowing down.  Thus, the ribbon.  Each tab has a focus area; you click on that part of the UX and the ribbon changes.

I think GenAI and specifically LLMs will be similar.

This is not to say that Chatbots are the ultimate expression of this, because I don't think they are.  The current state of the art is what I call a "20 Questions" experience.  You have to interrogate the system and use what is called "prompt engineering" to get it to do what you want.  My normal rule is that any design I have to adapt myself to is a bad design.  Good design is when I just pick up the feature and use it.  We're not there yet.

So, when that "How can I help you?" pop-up appears in the lower right of the screen, I usually just ignore it.  I know that if I click there, I'm in for a series of questions and sorta-helpful answers.  Not my favorite UX.

Instead, I think the future is hyper-customization.  What I mean by that is that the user experience will be extremely different depending on who uses the system.  We already see the beginnings of this with products like Jira, where you have so many customization options that you can get seriously lost in the system.  The difference is that the system will auto-configure itself depending on who you are and what you're doing, as in the sketch below.  Learning a user that deeply is a perfect fit for GenAI and extremely difficult to do with traditional algorithmic programming.
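Here is a rough Python sketch of that idea.  The profile fields, the layout keys, and `call_llm()` are all hypothetical; `call_llm()` just stands in for a real LLM SDK.

```python
# Rough sketch of hyper-customization: ask an LLM to pick a UI configuration
# based on who the user is and what they have been doing lately.
import json


def call_llm(prompt: str) -> str:
    """Hypothetical LLM wrapper; replace with your provider's SDK."""
    raise NotImplementedError


def configure_ui(profile: dict) -> dict:
    prompt = (
        "Given this user profile and recent activity, choose a UI layout as JSON "
        'with keys "default_view", "pinned_panels", and "hidden_features": '
        + json.dumps(profile)
    )
    return json.loads(call_llm(prompt))


# A security analyst who spends all day triaging findings would get a very
# different layout than an executive who only looks at trend dashboards.
analyst = {"role": "security_analyst", "recent_actions": ["triage", "query", "export"]}
# layout = configure_ui(analyst)
```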

The first place I think we will see this is products with a high cognitive load.  By "high cognitive load" I mean that I need to spend an immense amount of time learning before I can start getting value from the product: think data tooling that requires me to write queries.  Today, you have to spend time either in the product or in a training class learning the query language.  Going forward, I don't see how tools like that survive.  If I cannot get meaningful results in five minutes or less, why would I use your product?
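One obvious way to remove that load is to let the user ask in plain language and have an LLM generate the query behind the scenes, while ordinary code still validates and executes it.  A hedged sketch, again with a hypothetical schema, `ask()` helper and `call_llm()` stand-in:

```python
# Sketch of lowering the "learn our query language first" barrier: natural
# language in, an LLM-generated query out, executed by deterministic code.
import sqlite3


def call_llm(prompt: str) -> str:
    """Hypothetical LLM wrapper; swap in your provider's client."""
    raise NotImplementedError


def ask(question: str, conn: sqlite3.Connection) -> list[tuple]:
    schema = "orders(id, customer, region, total_usd, created_at)"
    sql = call_llm(
        "Translate this question into a single SQLite SELECT statement "
        f"against the schema {schema}. Question: {question}"
    )
    # Traditional code still owns validation and execution.
    if not sql.strip().lower().startswith("select"):
        raise ValueError("Refusing to run a non-SELECT statement")
    return conn.execute(sql).fetchall()


# Usage: ask("What were total sales by region last month?", conn)
```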




Sunday, February 25, 2024

Failure is an Orphan

 "Success has many fathers, failure is an orphan."

-English Proverb


As a seven-year VMW veteran, I have been reading Chris Romano's VMW history posts with great interest. Most of the events he describes happened before I joined, but I have met many of the people who appear in those posts.


I thought I would share here my personal recollection of a seminal event that shaped my VMW experience: the creation of VMware Cloud on AWS (or just VMC, as we called it).


At the time, I had been working for Dan Wendlandt, who was running product for what we called "Cloud Native Applications," or CNA. CNA was the precursor to what is now called Tanzu; this was before the Heptio acquisition. The project I had been working on was cancelled (that's a different blog post) and I was looking for a new gig. I was pointed to Narayan Bharadwaj, who was forming a team to build something called "Skyscraper." I had no idea what that was, but I was told it was cloud related.


I set up a meeting with Narayan and he pulled me into a small conference room. This was on the Promontory campus in Palo Alto (Prom D, I think? We later moved to Prom E). As an aside, I have had the pleasure of working in all five of the Prom buildings that have offices (Prom F is the gym). In the Prom buildings, the conference rooms all have glass walls, which gives them a nice, airy feel. This one, however, had large flip-chart pages stuck up all over the glass so that the room was completely hidden from outside view. We went into the room and Narayan closed the door. I was asked to sign an NDA. Keep in mind I was already a PM working on VMW products, so this was VERY strange. I had been read into many pre-release products over the years but had never signed an additional NDA. I signed the document and joked that this was a nice-looking murder room but they forgot to put the plastic on the floor. Narayan didn't laugh.


Narayan proceeded to brief me on a project to run vSphere natively on public clouds and sell it as a service (i.e., a SaaS product). While selling vSphere as a service wasn't a new idea (we had recently spun out vCloud Air), the idea of running on public clouds WAS new. The plan was to run Skyscraper on multiple public clouds, and the various clouds all had code names. I was shown a prototype running on "Sears Tower" and there was another one called "Empire State." The problem was the trade-off: we could have the rapid provisioning velocity of clouds that offered automated VM provisioning, or we could have bare metal that was provisioned essentially by hand. At the time, bare metal provisioning on the clouds that supported it was basically manual. You had an API, but it just created a ticket and a human provisioned your server. It could take hours. To make Skyscraper work, we needed cloud velocity and the ability to provision servers in minutes, not hours. Thus, we needed to run on VMs.


As a former vSphere PM, I had worked on the ESXi kernel (I shipped a kernel feature called IO Filters) and I knew the storage side (I worked on vVols). I thought I could help the team and was very excited to join. I pretty much got the job on the spot and was working for Narayan the next day. When I joined, it was a VERY small team. I think I was the third or fourth hire.


It soon became clear that Skyscraper wasn't going to work. ESXi was built to run on bare metal. Yes, you could run "V on V," and ESXi and vSphere DID run on the VMs we could get from the public cloud providers (did you know that ESXi and vSphere are totally different things?), but the storage was slow as hell and the overall system just didn't deliver the performance we needed. It didn't look good.


Then along came something called "Petronas." Petronas was different. It ran on something called a "Metal" instance that was provisioned natively, just like a VM. That meant we used the public cloud API to provision a bare metal server and then ran ESXi natively on that server. This took a few minutes (our target was under 20 minutes), but it was WAY WAY WAY faster than the alternatives. To put it another way, we simply remote-booted a server running in the cloud into an ESXi image that we provided.


BOOM! Skyscraper suddenly made sense.




Yes, "Petronas" was AWS EC2.Metal. Did you ever wonder why early VMC hosts were called "i3p" instead of "i3.Metal," which was their official name? Ya, the "p" stood for Petronas. The Petronas Towers had been the tallest buildings in the world, so we gave AWS that code name to hide the fact that we were building a new product on AWS. VMC was the launch customer for AWS Metal. Today, anyone can get a .Metal instance, but at the time you had to have special permissions, and we were the only ones who could provision them. Hilariously, everyone thought we were saying "Patronus," like from Harry Potter, but it was actually Petronas, because Skyscraper code names were all buildings.


In some ways, VMC was a very simple project. We took vSphere as it was, ran it on a VM, booted a bunch of i3p instances to run ESXi, used the local SSD storage to host vSAN, and away we went. We had to write drivers for the AWS gear, but that was something we knew how to do. IIRC, the AWS NICs were pretty special and took some joint engineering between AWS and VMW, but in the end it was just a driver, and it worked. In some ways, AWS just became another OEM like Dell or HP: they were the hardware "manufacturer" and we did the driver work. An i3p instance running ESXi was a full ESXi experience, and when we added the rest of vSphere, you had a "Software Defined Datacenter" (SDDC) that could run any workload a regular on-prem deployment could run. Kaboom!


In other ways, VMC was a complete overhaul of everything VMware was and did. VMC meant that we were running a live service that we sold by the hour. VMC meant that we could deploy new features to customers in days, not months. VMC meant that we could fix critical security bugs in hours. It soon became clear that VMC was a SERVICE, not a PRODUCT. We didn't "SHIP SOFTWARE" at VMC, we "PUSHED TO PROD" and then used feature flags to expose new features. We eventually got to the point where a push became a non-event. We pushed to prod all the time; it got boring. Feature doesn't work right? Turn off the feature flag. This meant we could run much faster than the rest of VMware and take greater risks. Why bother running a three-month beta? Just ship the feature and test in prod.


This caused conflicts. We were different, we were special. We were a customer of vSphere but not a direct part of the vSphere team. We were part of the same business unit, but we were in many ways the little brother of vSphere, with all that entails. I was personally briefing PatG on our progress, which was nuts for a PM Director. Attempts were made to kill us or shut us up. It was a grind. We all worked crazy hours. It really felt like a startup, despite the fact that we lived inside a very large multi-billion-dollar software company. We had a mission and we had something to prove. I personally led the deployment of the first ten "lighthouse" customers. PM was extremely hands-on and we were directly involved in every aspect of the business. Within six months it was obvious that VMC was a winner and we went into overdrive. New features, new regions, new team members. The team exploded and the product was generating millions of dollars in revenue.


For me, this was peak VMW. We were allowed to take a risk, to do something that was a little crazy and was certainly out of our comfort zone. To be honest, we had no business running an enterprise SaaS business. We had no idea what we were doing at first. We had to figure out how to accept credit cards, we had to figure out what a feature flag was, we didn't have a way to pre-release features without a beta. But we learned. We worked hard, we worked as a team and we solved the problem. In the end, VMC is a great service and probably at this point the crown jewel of the vSphere portfolio. I hope that Broadcom continues to invest in it.