Recently, an artificial intelligence (AI) program learned to make puns and bad jokes. How bad? When the program tried to come up with a pun, it told researchers, “the greyhound stopped to get a hare cut.”
Clearly, the researchers have some work to do. Still, if your goal is to build AI with a sense of humor, you know it’s working when it makes people laugh. Determining whether most other AI or machine learning programs are working takes subtler clues. As AI is asked to do all sorts of new and esoteric tasks, what are the objective tests that show whether your AI project is a success?
The Gordon Flesch Company has its own AI program called AskGordy, a natural language, query-based interface built on IBM’s Watson AI. AskGordy’s goal is to use natural language processing and AI to find deeper and more relevant information in a collection of documents than an old-fashioned keyword search can. One of our challenges has been to create metrics and benchmarks for determining relative success and return on investment (ROI).
Here are the key metrics we consider:
1. What’s the Problem You’re Solving?
To determine the success of an AI program, users need to weigh big-picture factors, including corporate objectives, business process optimization and user satisfaction, as well as the technical functions and features. These predetermined measures form the basis for determining both project completion and ROI.
In a practical sense, that means taking a survey of existing conditions and user satisfaction in order to find your baseline. A survey of the same users can then be conducted after implementation to help document their opinions of the project. These should then be combined with Project Administration feedback to determine the project’s overall success.
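As a rough illustration of the before-and-after survey comparison, here is a minimal Python sketch. It assumes each user rates satisfaction on a 1 to 5 scale and that responses are keyed by a user ID; the function name, the scale, and the sample data are all hypothetical and are not taken from AskGordy.

```python
from statistics import mean


def satisfaction_change(baseline, follow_up):
    """Return the mean per-user rating change for users in both surveys."""
    # Only users who answered both surveys can be compared directly.
    shared = baseline.keys() & follow_up.keys()
    if not shared:
        raise ValueError("no users answered both surveys")
    return mean(follow_up[user] - baseline[user] for user in shared)


# Hypothetical 1-5 satisfaction ratings keyed by user ID.
baseline = {"ana": 2, "ben": 3, "cy": 4}
follow_up = {"ana": 4, "ben": 3, "cy": 5, "dee": 5}

change = satisfaction_change(baseline, follow_up)
print(f"Mean change in satisfaction: {change:+}")
```

Restricting the comparison to users who answered both surveys keeps new respondents (here, "dee") from skewing the result.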
2. Defining Obtainable Objectives
How can AI support your organization? What is AI “good at” (or not) and what is a realistic outcome? Which corporate business, innovation, AI or analytics objectives or policies does the application advance? Or is this project a preliminary opportunity to help develop such objectives? At the Gordon Flesch Company, we aim to identify which lines of business will support the application, whether it will increase efficiency, customer or user satisfaction, and quality of research or response, and whether it will offer a competitive advantage.
3. Building a Better Business
When a business adopts AI to manage business processes, it obviously wants to see speed or efficiency gains. Do you have a clear understanding of your business processes and how the new AI solution will fit in with your existing business? For a document management system like AskGordy, this includes measures such as the time to respond, number of accurate hits on the first query, percent of questions answered in a first support call, or the total number of unique additional hits over and above normal search functions.
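Measures like these can be computed from a simple query log. The sketch below is one possible approach, assuming each logged query records its response time and whether the first query returned an accurate hit; the `QueryRecord` type, its field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class QueryRecord:
    response_seconds: float        # time taken to return an answer
    answered_on_first_query: bool  # True if the first query was an accurate hit


def summarize(records):
    """Summarize response time and first-query accuracy for a query log."""
    if not records:
        raise ValueError("no records to summarize")
    first_hits = sum(r.answered_on_first_query for r in records)
    return {
        "median_response_seconds": median(r.response_seconds for r in records),
        "first_query_hit_rate": first_hits / len(records),
    }


# Hypothetical log entries for illustration only.
records = [
    QueryRecord(2.0, True),
    QueryRecord(4.0, False),
    QueryRecord(3.0, True),
    QueryRecord(9.0, True),
]
summary = summarize(records)
print(summary)
```

The median is used instead of the mean so that a single slow query does not dominate the response-time figure.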
Any evaluation of your AI project should list the criteria that, if met, define a completed and successful project. In the case of a document management system like ours, factors like the breadth of documents, the intensity of the natural language training, and other variables will influence how quickly AskGordy can master the task it is given.
4. Are Your People Satisfied?
An AI project cannot be successful if users have a bad experience. Telling bad jokes is one thing; making it harder to get your job done is unforgivable. First, define the use case, outlining what the human operators or users would like the AI agent to deliver. The project then needs a comprehensive understanding of the current user experience and level of satisfaction, so that both can be compared with the post-installation results.
To make these subjective and qualitative measures meaningful, it is essential to capture a picture of the current state of user satisfaction.
5. Don’t Forget the Technical Considerations!
Perhaps the most important part of a successful AI implementation is tracking all of the technical functions or features that were not available in the previous system. For example, our system is designed to provide a natural language interface and a mobile access app where none existed. The project scope should anticipate and detail meaningful measures of success and compare them against the pre-project status quo, such as:
- Documented search comparisons (pre-trial vs. post-trial)
- Dashboard reporting sessions and query information
- Documented machine-learning model performance
- Natural language query capability
- Mobile iOS and Android app capability
- Measured difference in access, discovery and decision support
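One way to document a search comparison is to record which documents the new system surfaces that the previous keyword search missed. The following Python sketch assumes both systems' results are available as lists of document IDs per query; the function name, the queries, and the document IDs are hypothetical.

```python
def additional_hits(keyword_results, ai_results):
    """Map each query to the document IDs that only the AI search returned."""
    return {
        query: set(ai_results.get(query, [])) - set(keyword_results.get(query, []))
        for query in keyword_results.keys() | ai_results.keys()
    }


# Hypothetical results from the old keyword search and the new AI search.
keyword_results = {"lease terms": ["doc1", "doc2"], "toner recall": []}
ai_results = {"lease terms": ["doc2", "doc3", "doc4"], "toner recall": ["doc9"]}

extra = additional_hits(keyword_results, ai_results)
total_extra = sum(len(ids) for ids in extra.values())
print(f"Unique additional hits across all queries: {total_extra}")
```

A count like this shows only that more documents were found. Whether the extra hits are relevant still needs a human review or a labeled test set.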
The Gordon Flesch Company is a big believer in the future of AI in the office technology industry, but we believe it is important to think critically and deeply about how AI can fit with your business and help people do their work better and more efficiently, not replace them. AI adopters need to remember that an AI-based solution may help organizations dig deeper and build more intelligent business processes, but a non-AI analytical solution may sometimes provide what is required.
The challenge is to adapt and find new ways to do the same work hand in hand with AI or machine learning tools. To learn more, visit the AskGordy website or download our free white paper below that explains it all.