We’ve all witnessed the hype around artificial intelligence (AI), and the financial services industry has certainly seen its share. However, while AI does indeed have great potential for the financial services industry, delivering AI at scale poses many challenges. We are talking about supercomputer levels of performance, and standard legacy IT is simply not up to the task.
Success requires a solid, high-level understanding of AI, along with a clear picture of the technical requirements that the planned AI project will impose on the infrastructure. Here are the top lessons learned from our experiences to date:
1. Don’t underestimate the data challenge.
Training AI algorithms requires huge volumes of data, and invariably one of the key challenges for organizations is getting access to the data needed to train their algorithms. As a result, we consistently see internal data sourcing take much longer than expected. We recommend starting this process as early as possible.
2. Don’t be rigid about tools.
It’s easy to get sucked into proprietary tools and systems, but adopting open standards and avoiding vendor lock-in should be key architectural principles. Likewise, you want to be able to take advantage of current innovation happening in the open source analytics and AI domains.
3. Determine whether your initiative is strategic or tactical.
If it’s tactical, recognize that shifting to a strategic solution later could mean discarding the setup and starting over. If strategic, then look to build on, and invest in, proven foundations that enable scale and performance.
4. AI is hard to prove out with a proof of concept (POC).
At small scale (single graphics processing unit [GPU]), AI experiments can run on any hardware, but this changes quickly with scale. Challenges include:
- Compute – Lab environments often lack the resources to effectively simulate and test large AI workloads.
- Data – Accessing meaningful datasets is problematic and time-consuming. Organizations often pivot to smaller datasets, but this creates significant risk. If your real-world requirement is a 1PB dataset, don’t test on a 1TB dataset. Use real-world data and use cases when possible.
- Synthetic testing – Organizations are often not specific about AI workload requirements. Test a wide range of data types and sizes. Some infrastructures handle large files well but struggle with many small files, which is problematic. Look for flexible solutions that deliver, regardless of changing business requirements.
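To make the file-size point concrete, here is a minimal sketch of a synthetic I/O harness that compares many small files against a few large ones at the same total volume. The sizes and counts are illustrative assumptions, not recommendations, and a real evaluation would use your actual storage target rather than a temporary directory:

```python
import os
import tempfile
import time

def benchmark_file_size(size_bytes: int, count: int, dirpath: str) -> float:
    """Write `count` files of `size_bytes` each, then time reading them back.

    Returns approximate read throughput in MB/s.
    """
    paths = []
    for i in range(count):
        path = os.path.join(dirpath, f"test_{size_bytes}_{i}.bin")
        with open(path, "wb") as f:
            f.write(os.urandom(size_bytes))
        paths.append(path)

    start = time.perf_counter()
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            total += len(f.read())
    elapsed = time.perf_counter() - start
    return (total / 1e6) / elapsed if elapsed > 0 else float("inf")

with tempfile.TemporaryDirectory() as tmp:
    # Same total data volume, very different I/O patterns -- exactly the
    # large-file vs. small-file gap the synthetic-testing point warns about.
    for size, count in [(4 * 1024, 256), (1024 * 1024, 1)]:
        mbps = benchmark_file_size(size, count, tmp)
        print(f"{count} file(s) x {size} B: ~{mbps:.0f} MB/s read")
```

Running the same total volume through different file-size mixes quickly exposes whether a candidate platform degrades on small-file workloads.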
5. Identify the critical capabilities up front.
- Capacity – What’s your day one requirement? What size training dataset do you anticipate in 1, 3 or 5 years?
- GPUs – How many GPUs will you start with? How many will you scale to?
- AI as a service – Do you have a single defined training dataset? Or are you building an infrastructure that can support multiple undefined AI workloads – AI as a service?
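The capacity question above can be framed as a simple compound-growth projection. The starting size and growth rate below are purely hypothetical placeholders for illustration:

```python
def project_capacity(day_one_tb: float, annual_growth: float, years: int) -> float:
    """Compound a day-one dataset size forward by a constant annual growth rate."""
    return day_one_tb * (1 + annual_growth) ** years

# Hypothetical example: 50 TB of training data on day one, growing 60% per year.
for horizon in (1, 3, 5):
    print(f"Year {horizon}: ~{project_capacity(50, 0.60, horizon):.0f} TB")
```

Even a rough projection like this makes clear whether a day-one purchase will still fit the three- and five-year requirement, or whether the architecture must scale out.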
6. Preparation is key.
AI involves about 80 percent data preparation and 20 percent training – yet most organizations focus on training. As a result, the preparation steps are neglected. Data is deployed, copied and duplicated on whatever infrastructure may be available. Seize the opportunity to build a robust, efficient, scalable, consolidated infrastructure, or a “data hub.”
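The 80/20 split is visible even in a toy example. The pipeline below is an illustrative sketch of typical preparation steps (cleaning, normalizing, splitting), not a prescription for any particular workload:

```python
import random

def prepare(rows):
    """Toy data-preparation pipeline: clean, normalize, then split.

    Most of the code here runs *before* any model training starts.
    """
    # 1. Clean: discard rows with missing values.
    clean = [r for r in rows if all(v is not None for v in r)]
    # 2. Normalize each column to the [0, 1] range.
    cols = list(zip(*clean))
    mins = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    normed = [[(v - m) / s for v, m, s in zip(r, mins, spans)] for r in clean]
    # 3. Split 80/20 into training and validation sets.
    random.seed(0)
    random.shuffle(normed)
    cut = int(0.8 * len(normed))
    return normed[:cut], normed[cut:]

rows = [[1.0, 10.0], [2.0, None], [3.0, 30.0],
        [4.0, 40.0], [5.0, 50.0], [2.5, 20.0]]
train, val = prepare(rows)
print(len(train), len(val))
```

At production scale, every one of these steps multiplies across copies of the data unless it runs on a consolidated platform, which is the argument for the “data hub.”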
7. Recognize that few people understand the end-to-end solution.
Delivering an integrated system capable of supercomputer levels of performance is not standard IT. Data scientists understand the tools and frameworks but seldom understand the impact on the infrastructure. Likewise, infrastructure experts have knowledge within a specific technology domain. Few individuals have the deep knowledge required across compute, storage and network, and those who combine that infrastructure expertise with a good understanding of AI and data science are a rare breed!
Financial services organizations have two options for creating a high-performance infrastructure to support AI initiatives. One option is to build it yourself. While this do-it-yourself approach can seem daunting, incorporating scale-out storage solutions as a foundation can reduce some of the complexity. The other option is to leverage a purpose-built platform, thereby accelerating your AI initiatives while also reducing risk.
Whichever option you choose, first consider the bigger picture — beyond the math, models and data science — to help minimize the challenges of building AI at scale.