(Comic courtesy of XKCD, via Creative Commons License)
When a user has to contact Product Support for help, he or she already has a problem. Dealing with Support should never add aggravation on top of that. Slogging through scripted steps that have nothing to do with the real problem or wading through tiers of call center workers to find someone who actually understands the situation is a frustrating waste of the user’s time, and it isn’t any fun for the Support staff either.
At Palantir, we work hard to make sure Product Support is a solution, not a chore. Our users have already said “shibboleet” just by contacting us. We don’t have tiers and we don’t have scripts. Every member of the team is a full-fledged engineer and product expert. It’s our mission to use our expertise to help our users succeed.
Palantir’s data fusion platforms are powerful and have applications across a broad range of use cases. They are used in the intelligence, defense, and law enforcement communities as well as in financial institutions and health organizations. Our software is applied in each of these domains to solve complex and important problems, so if a customer hits a snag it’s likely to be complex too. These aren’t the sort of issues that can be fixed by following a script—they require genuine problem-solving expertise. So what do we do when confronted with a truly perplexing problem to solve? This post explains how we handle these situations at Palantir.
How to Solve Any Problem Like a Palantir Product Support Engineer
Our approach is a four-step process:
- Gather Information
- Isolate the Cause
- Provide a Solution
- Record for Posterity
You will become a detective, a scientist, an inventor, and an archivist in turn. I’ll illustrate with a hypothetical problem in which a user is trying to import a spreadsheet.
Step One: Gather Information
During the first step, you are a detective, interviewing witnesses and collecting evidence with the goal of building a complete and accurate picture of what happened. And as a detective, you are interested in just the facts. Don’t be led astray by speculation or make unfounded assumptions. Testimony is valuable, but evidence is far superior. Ask the user what happened, but also check the logs. It’s not that users are malicious or dishonest—far from it. They’ve come to you for help, so they are motivated to find a solution. But they don’t have your deep understanding of the product. It’s easy for them to have inaccurate mental models of the software, or use the wrong terminology in describing their experience. Trust your users, but verify their claims.
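Here is a minimal sketch of the "check the logs" half of that advice. The log format, the sample lines, and the `find_errors` helper are all made up for illustration; real product logs will look different, so treat this as one possible starting point rather than a recipe.

```python
import re

# Assumed (hypothetical) log format: "<timestamp> <LEVEL> <message>".
LOG_LINE = re.compile(r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$")


def find_errors(log_text, keyword):
    """Return (timestamp, message) pairs for ERROR lines that mention keyword."""
    matches = []
    for line in log_text.splitlines():
        parsed = LOG_LINE.match(line)
        if parsed is None:
            continue  # skip lines that do not fit the assumed format
        is_error = parsed["level"] == "ERROR"
        mentions_keyword = keyword.lower() in parsed["message"].lower()
        if is_error and mentions_keyword:
            matches.append((parsed["timestamp"], parsed["message"]))
    return matches


# Invented sample data, standing in for a real log file.
sample_log = """\
2024-01-05T10:02:11 INFO User opened import dialog
2024-01-05T10:02:15 ERROR Import failed: unsupported column type in row 3
2024-01-05T10:02:16 INFO User closed import dialog
"""

errors = find_errors(sample_log, "import")
for timestamp, message in errors:
    print(timestamp, message)
```

The point is not the parsing itself but the habit: the user's description tells you where to look, and the log tells you what actually happened.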
Suppose you receive an email from a user that says the following:
“Dear Product Support: I’m trying to import a spreadsheet, and it isn’t working. I’m just getting an error message that says the file can’t be imported. Can you help me?”
The first thing you need to do is understand what you do and do not know, keeping in mind that the latter category will usually be much larger than the former. Here’s what you do know in this case:
- The user is failing to import a spreadsheet.
- The user is receiving an error message.
And here’s a sampling of relevant information that you don’t know:
- The exact contents of the error message.
- Whether any errors are being written to log files.
- Whether the user can import other files.
- Whether the spreadsheet file is malformed or corrupt.
- How large the spreadsheet is.
- How the spreadsheet’s contents are structured.
- Whether the user has previously been able to import spreadsheets like this one.
- How the user is attempting to import the spreadsheet.
- What version of your software the user is running.
- Whether the user’s administrators have made any policy changes that would prevent file imports.
- Whether there have been any recent changes in the user’s environment.
Once you’ve identified what you don’t know, it’s time to figure out where you can get that information. There are a lot of potential sources of knowledge. Here are some to start with:
- You can ask the user reporting the issue questions like, “What is the exact text of the error message you are seeing?”
- You can check the product documentation for answers to questions like, “What are the methods for importing a spreadsheet, and how are they expected to behave?”
- Your bugtracker can yield answers to questions like, “What problems have caused spreadsheet imports to fail in the past?”
- It’s often useful to check with your fellow engineers by asking questions like, “Hey, anybody have experience with failed spreadsheet imports?”
- Developers can answer questions like, “What assumptions does our import code make about spreadsheets?”
- Domain experts can be helpful for questions like, “What tricky things can be done with spreadsheets that prevent them from being recognized correctly in other programs?”
- Finally, some information can be found with some quick internet research, such as “What popular formats exist for spreadsheets?”
It is vital for your communications to be clear and specific during this process. Vague questions will receive vague answers. You should also require clarity and specificity from the user. If there is any ambiguity in his or her answers, ask for more information; don’t assume you know what he or she means. It’s easy to make incorrect assumptions that lead you to attempt to solve a problem that doesn’t even exist. Pay extra attention to anything that seems unusual or out of place. These are often good jumping-off points for further investigation, as they frequently indicate a hidden assumption that needs to be overturned to reveal the truth.
Step Two: Isolate the Cause
You’ve gathered all the information that is known about the problem. Now you must be a scientist, formulating theories and designing experiments to test them, with the goal of understanding exactly what part of the user’s process is breaking down and why. And as a scientist, you must not become too attached to any particular theory. If an approach isn’t panning out, abandon it and try another one. It’s important that you not blind yourself to the actual cause of the problem by fixating on something that turns out to be irrelevant.
The first thing you need to do in this step is reduce the problem space by eliminating potential sources of the problem until you drill down to the exact cause. For every possible cause, find a test whose outcome will differ depending on whether that cause is the true one. This will require some creative thought, but here are a few examples:
- The spreadsheet file might be corrupted. Try opening the spreadsheet in other programs.
- The user might be insufficiently privileged to import files. Have the user try to import a known good file.
- The high number of images in the spreadsheet might be preventing it from being recognized properly. Save a copy without the images and try importing that.
Ultimately, you are looking for the exact set of steps that are necessary and sufficient to reproduce the problem, so it’s important that your experiments provide you with new information. Try to produce evidence that contradicts your theory instead of seeking out data that supports your theory. It’s easy to find data that is consistent with multiple explanations of a problem, and thereby seems to support even incorrect theories. This is why you should use another program to try to open a spreadsheet you think might be corrupted, instead of merely trying to import a different spreadsheet. Whatever happens with the second spreadsheet proves nothing about the first, but if the first opens successfully in the other program, it must not be corrupted. Similarly, if you want to test the theory that the spreadsheet is too large a file, you may be tempted to have the user try to import a smaller spreadsheet—but whether this succeeds or fails, it doesn’t tell you if the first spreadsheet was too large. Try to import a larger spreadsheet. If that succeeds, then you know the first spreadsheet is not too large after all.
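As one sketch of a test that can actually disprove the "corrupted file" theory, suppose the spreadsheet is an .xlsx file. That format is a ZIP container, so Python's standard `zipfile` module can act as the "other program". The helper name `xlsx_container_problem` is my own invention, and the check is deliberately weaker than opening the file in a real spreadsheet application: a failure shows the file is damaged, while a pass only rules out damage at the container level.

```python
import zipfile


def xlsx_container_problem(path):
    """Return None if the ZIP container looks intact, else a short description."""
    if not zipfile.is_zipfile(path):
        return "not a ZIP container, so not a valid .xlsx file"
    with zipfile.ZipFile(path) as archive:
        # testzip() returns the name of the first member that fails its
        # integrity check, or None if every member reads back cleanly.
        bad_member = archive.testzip()
        if bad_member is not None:
            return f"corrupt member: {bad_member}"
        # Office Open XML packages carry this part at the archive root.
        if "[Content_Types].xml" not in archive.namelist():
            return "missing [Content_Types].xml"
    return None
```

Calling `xlsx_container_problem("report.xlsx")` on the user's file (the filename here is hypothetical) gives you evidence about that specific file, which is exactly what importing a different spreadsheet cannot do.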
At every turn, try as aggressively as possible to prove yourself wrong. Attack every theory in such a way that only the truth can survive, and that’s exactly what you’ll be left with.
Step Three: Provide a Solution
Once you know what is blocking the user’s workflow, it’s time to become an inventor and devise a solution. You must apply your skills, knowledge, and creativity to find a way for the user to reach his or her goal. Be resourceful and flexible; adapt to the issue and work around it with any of the tools at your disposal. Sometimes this step will involve sending the user a plugin that expedites the desired workflow. In extreme cases it may involve shipping a patch. Most of the time, however, the user just needs a backup plan. You’ll need to craft one that matches the user’s specific problem, but here are a few illustrative examples:
- The spreadsheet is too large. Break it up into multiple spreadsheets and import it piecemeal.
- The spreadsheet’s structure is confusing the file importer. Reorganize it to a more standard structure.
- Normal GUI import methods don’t work. Use a different method, such as importing from the command line.
- The user’s current software version cannot import the file. Upgrade to a more recent version that can perform the import.
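To make the first of these workarounds concrete, here is a rough sketch of breaking a large spreadsheet into smaller pieces. It assumes the data has already been saved as CSV with a single header row; the `split_csv` and `_render` helpers and the sample data are illustrative only, not part of any product.

```python
import csv
import io


def _render(header, rows):
    """Serialize a header plus data rows back into CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def split_csv(source, rows_per_chunk):
    """Yield CSV text chunks, each repeating the header row and holding
    at most rows_per_chunk data rows."""
    if rows_per_chunk < 1:
        raise ValueError("rows_per_chunk must be at least 1")
    reader = csv.reader(source)
    header = next(reader, None)
    if header is None:
        return  # empty input: nothing to split
    chunk = []
    for row in reader:
        chunk.append(row)
        if len(chunk) == rows_per_chunk:
            yield _render(header, chunk)
            chunk = []
    if chunk:  # leftover rows that did not fill a whole chunk
        yield _render(header, chunk)


# Invented sample data standing in for the user's spreadsheet.
sample = io.StringIO("name,amount\nalice,10\nbob,20\ncarol,30\n")
chunks = list(split_csv(sample, rows_per_chunk=2))
```

Repeating the header in every chunk keeps each piece importable on its own, which matters if the importer reads column names from the first row.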
In articulating this plan to the user, you must ensure that you provide the necessary context. Anticipate the “Why?” questions the user may ask, and answer them preemptively. It’s also essential to keep perspective and not to get hung up on irrelevant details or confuse exactly which problem you’re trying to solve. If the user just needs to get the information from this spreadsheet into the system, that’s your goal—not importing a file of a particular format. If the user can save the spreadsheet as a CSV and get rolling that way, that is a perfectly valid solution.
Step Four: Record for Posterity
With the user satisfied, it is now time to become an archivist. The hard work is over; all that’s left to do is make sure this work is never hard again. The problem has been solved, so no one else should have to come in blind. The goal is to make sure the information you wish you’d had when you started investigating is easily available for the next person who runs into the same problem. Gather all of the following information and anything else that would be helpful:
- The version of the software on which the problem occurs.
- The minimal list of steps required to trigger the problem.
- Any relevant program settings or other environmental considerations.
- Any data or external files involved (such as a sample spreadsheet).
- Any workarounds or solutions found.
Record all of this data in the bugtracker. If it is relevant, get it added to the product documentation or public knowledge base as well. It’s easy for this to seem unimportant in the moment, but it’s essential for the team going forward. Write down enough that someone who has only your record could solve the problem. Your teammates, and indeed your future self, will be grateful.
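One way to keep yourself honest about that checklist is to give the record a fixed shape. The sketch below is only an illustration: the `IssueRecord` class, its field names, and the sample values are assumptions of mine, and a real bugtracker will have its own schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class IssueRecord:
    """Hypothetical container for the items in the checklist above."""
    software_version: str
    steps_to_reproduce: list
    environment: dict = field(default_factory=dict)
    attachments: list = field(default_factory=list)
    workarounds: list = field(default_factory=list)

    def missing_fields(self):
        """Names of essential fields that were left empty."""
        required = ("software_version", "steps_to_reproduce", "workarounds")
        return [name for name in required if not getattr(self, name)]


# Invented values for the spreadsheet example.
record = IssueRecord(
    software_version="1.2.3",  # hypothetical version number
    steps_to_reproduce=[
        "Open the import dialog",
        "Select a spreadsheet with many embedded images",
        "Click Import",
    ],
    environment={"import_method": "GUI"},
    attachments=["sample_spreadsheet.xlsx"],
    workarounds=["Save a copy without the images and import that"],
)

# JSON is just one convenient format for pasting into a ticket.
payload = json.dumps(asdict(record), indent=2)
```

A check like `record.missing_fields()` is a cheap reminder to fill in the version, the reproduction steps, and the workaround before you close the ticket.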
If this process uncovered a product bug, this is also the time to file it with the product development team. Make sure to supply the same information here as well—the developers shouldn’t have to repeat any of your work. The less they have to do to find the bug, the faster they will repair it, and the sooner you can stop receiving emails and calls from users about it.
And there you have it! The Palantir Product Support approach to problem solving. It’s a lot more satisfying than running someone through a script and then putting them on hold!
While these steps are universal, applying them requires creativity and flexibility. There’s no surefire checklist that fixes everything. Most of the issues we face are far more complex and interesting than a stubborn spreadsheet, and every new issue means defining new diagnostic tests and finding new resolutions—and often, learning new skills. Being an engineer on Palantir’s Product Support team means continually broadening your horizons to help real people solve important problems.
We’re always on the lookout for more talent. If this sounds interesting to you, head on over to our careers page—maybe we’ll be working with you soon!