Extending AI Accessibility Across Cloud to Edge and Unleashing the Power of Gen AI

Sonya_Wach · 03-04-2024

Over the past few years, AI usage has exploded everywhere, from content creation to personal assistants, autonomous vehicles, healthcare, and more. Lisa Spelman, Intel’s Corporate VP and GM of Intel® Xeon® Products and Solutions, delivered the Great Minds keynote at CES 2024 about how enterprises are leveraging cutting-edge technologies to unlock the power of AI from cloud to edge. With decades of research and investment, along with the cloudification and scale of infrastructure, AI workloads have progressed from Machine Learning to more advanced Deep Learning and Generative AI (Gen AI). The increasing intelligence of models and the autonomy of machines will empower developers to create even more solutions. Intel plays a role in ensuring everyone has access to AI through compute ubiquity and flexibility, from AI PCs to powerful data center AI accelerators and from the edge to the cloud. Intel is dedicated to open ecosystems that ease development and deployment for all developers, and to responsible AI.

Model Improvements with Iterate AI*

With the help of Intel’s AI hardware and software solutions, developers can easily utilize AI and push the limits of their solutions. Iterate AI helps customers develop and implement AI solutions with a low-code AI platform and ready-made applications that allow enterprises to build AI solutions more easily and quickly. Iterate AI’s computer vision application is scaled across 4,000 edge instances for use cases including license plate recognition, and it utilizes the OpenVINO™ toolkit to increase CPU efficiency and load management by 300%. Iterate AI utilizes Intel® Xeon® CPUs because of Intel’s rapid speed to market, compute availability, familiarity, and cost-effectiveness. Iterate AI’s applications are further optimized on Intel® Xeon® CPUs through software optimizations like the Intel® Extension for DeepSpeed*.

Iterate AI works with companies like Ulta Beauty to develop generative AI solutions that improve the guest experience. Ulta Beauty Quazi*, a beauty personalization platform, is built on the Iterate Interplay platform. It offers personalized recommendations and relevant searches, making it easier to access powerful data and apply AI solutions at scale. With Interplay running on efficient Intel® Xeon® CPUs and software optimizations, Quazi performs with 75% less compute and 20x less memory, resulting in smaller infrastructure and cost savings.
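To make those two ratios concrete, the short sketch below applies them to a made-up baseline. Only the 75% and 20x figures come from the text above; the baseline of 400 vCPUs and 2,000 GB of memory and the function name reduced_footprint are illustrative assumptions, not numbers or code from Ulta Beauty or Iterate AI.

```python
from typing import Tuple


def reduced_footprint(baseline_vcpus: float,
                      baseline_mem_gb: float,
                      compute_reduction: float = 0.75,
                      memory_factor: float = 20.0) -> Tuple[float, float]:
    """Apply a fractional compute reduction and a memory division factor.

    "75% less compute" keeps 25% of the baseline compute.
    "20x less memory" divides the baseline memory by 20.
    """
    vcpus = baseline_vcpus * (1.0 - compute_reduction)
    mem_gb = baseline_mem_gb / memory_factor
    return vcpus, mem_gb


# Hypothetical baseline deployment (assumed values, for illustration only).
vcpus, mem_gb = reduced_footprint(400, 2000)
print(f"{vcpus:.0f} vCPUs, {mem_gb:.0f} GB of memory")
```

Note that the two figures are different kinds of ratio: a percentage reduction subtracts from the baseline, while an "x less" factor divides it.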

With the growing popularity of Gen AI and high-performance computing, it is necessary to increase compute power while maintaining power efficiency. Matching the right compute to solution requirements and model sizes delivers the highest ROI. Intel® Xeon® CPUs include specialized hardware and software tuning to improve performance on generative AI workloads by up to 5x. In addition, Intel® Gaudi® 2 accelerators are built to handle the most intensive training and largest data sets, and Intel is setting the foundation with the AI PC. These end-to-end solutions allow infrastructure to be optimized for the needs of varying AI solutions. On a Llama v2 70B parameter model, Databricks* found that Intel® Gaudi® 2 offered the best model and inference performance per dollar.
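Performance per dollar is simply throughput divided by price. The sketch below shows one way to compare options on that basis. The option names, throughput values, and hourly prices are hypothetical placeholders, not Databricks or Intel measurements, and tokens per dollar is an assumed choice of metric.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Option:
    name: str
    tokens_per_second: float  # measured throughput (hypothetical here)
    dollars_per_hour: float   # on-demand price (hypothetical here)

    def tokens_per_dollar(self) -> float:
        # Tokens produced in one hour, divided by the cost of that hour.
        return self.tokens_per_second * 3600 / self.dollars_per_hour


def best_value(options: Iterable[Option]) -> Option:
    """Return the option with the highest tokens-per-dollar figure."""
    return max(options, key=lambda o: o.tokens_per_dollar())


# Placeholder numbers chosen only to show that the fastest option
# is not automatically the best value.
options = [
    Option("accelerator-a", tokens_per_second=1000.0, dollars_per_hour=10.0),
    Option("accelerator-b", tokens_per_second=700.0, dollars_per_hour=5.0),
]
best = best_value(options)
print(best.name, best.tokens_per_dollar())
```

In this made-up comparison the slower option wins because its price is proportionally lower, which is the kind of trade-off the "right compute for the model size" argument relies on.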

Intel has continuously focused on the open-source community and built an open foundation by contributing to popular tools and frameworks such as PyTorch*, TensorFlow*, ONNX*, DeepSpeed*, Hugging Face*, and more. The commitment to openness, as well as the commitment to getting the best performance efficiency through software optimizations, is a part of the foundation of Intel. Making it simple for developers to create AI solutions is crucial in bringing AI everywhere.

Gen AI Code Generation Demo

Code generation assistance is becoming more popular as a tool to help developers increase productivity, with the potential to grow even further. Wei Li, Intel’s VP and GM of Artificial Intelligence and Analytics, presented a code generation demo that shows how to use a copilot to create a specialized chatbot through a few prompts. AI accelerators and Intel Copilot were used to create a simple chatbot as a starting point and then add a title feature with a code modification. Intel Copilot is built on top of Intel® Extension for Transformers* (ITREX). It can run on your PC for copilot chat, run on a server for code generation, and includes a smart model switch. The chatbot demo utilized Intel® Gaudi® 2 accelerators to run a large language model and generate the code, an AI PC to generate the code improvements, and Intel® Xeon® CPUs for the chatbot itself.
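The keynote does not describe how the smart model switch decides where a request runs. As a purely hypothetical illustration of the idea, the sketch below routes a prompt either to a local chat model or to a server-side code-generation model based on simple keyword hints. The names Target, CODE_HINTS, and choose_target are invented for this example and are not part of Intel® Extension for Transformers.

```python
from enum import Enum


class Target(Enum):
    LOCAL_PC = "local chat model on the PC"
    SERVER = "larger code-generation model on a server"


# Hypothetical hints that a prompt is asking for code to be generated.
CODE_HINTS = ("write", "generate", "implement", "function", "class")


def choose_target(prompt: str) -> Target:
    """Route code-generation requests to the server, everything else locally."""
    words = prompt.lower().split()
    if any(hint in words for hint in CODE_HINTS):
        return Target.SERVER
    return Target.LOCAL_PC


print(choose_target("Generate a chatbot with a title bar"))
print(choose_target("What does this error message mean?"))
```

A real switch would likely weigh more than keywords, for example prompt length, model availability, or latency targets; this sketch only shows the routing shape.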

Tools like oneAPI abstract hardware differences so that the same solution can run on varying hardware. Intel Copilot also utilizes AI frameworks such as PyTorch, Hugging Face, and DeepSpeed. Through AI Tools and Framework optimizations like oneAPI, developers don’t have to worry about lower-level hardware differences and can develop solutions more seamlessly across platforms.

Intel is committed to the open-source ecosystem, especially in AI. Intel remains one of the top three contributors to PyTorch, a leading AI framework, and has been working with Hugging Face, a leader for large language models, to develop Intel® ITREX as an extension for transformers. Intel® ITREX even achieved the number one spot on the Hugging Face leaderboard for 7B parameter LLMs. Open-source models are key to developing responsible AI, as developers can see the model, the dataset, and the training software. To further the development of responsible AI, Intel is creating new standards for open AI as a founder of the MLCommons* AI Safety (AIS) initiative, alongside partners like Google* and Stanford University*. Security is another important principle for AI, and Intel’s confidential computing uses Intel® Trust Domain Extensions (Intel® TDX) technology to isolate and secure data.

As part of the Unified Acceleration (UXL) Foundation, Intel joins other industry leaders in building an accelerator software ecosystem that unifies heterogeneous compute around open standards. Tools like oneAPI offer open and unified multi-vendor programming models that deliver a common developer experience across architectures for faster AI application performance and greater innovation.

Generative AI Tools with Accenture*

Accenture is using generative AI to scale the use of AI across all the brands it works with. For example, Accenture has partnered with McDonald’s* to improve in-store layout and supply chain with Edge, Computer Vision, and Gen AI solutions, and with Telco* to achieve a 30% improvement in customer service and a 60% improvement in customer satisfaction score, with a strong digital foundation thanks to Intel. Accenture utilizes Intel® Xeon® CPUs for general-purpose AI, which allows Accenture to offer a cost-competitive solution for inference workloads such as a sub-10B parameter model. Utilizing Intel also provides an easy and fast way to scale, for example from small-workload inference to LLMs with Intel® Gaudi® 2. Accenture also developed a generative AI sandbox on the Intel® Developer Cloud, allowing clients to experiment with their models rapidly. For example, a large utility company used the sandbox to begin experimenting with Gen AI for asset management, using scanned asset documentation that combined images and natural language in a custom multimodal model built in less than a month. Intel® Developer Cloud offers a platform that allows developers to build and scale rapidly at a low cost on a variety of Intel hardware and Intel-optimized software stacks to optimize AI workloads further.
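One reason model size drives hardware choice is the memory needed just to hold the weights. The sketch below is a common rule-of-thumb estimate, not an Intel or Accenture sizing guideline: parameter count multiplied by bytes per parameter. It ignores activations, the KV cache, and runtime overhead, and the 7B and 70B sizes are picked only because models of those sizes are mentioned in this article.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "bf16") -> float:
    """Memory for model weights only, in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for label, params in (("7B", 7e9), ("70B", 70e9)):
    print(f"{label}: {weight_memory_gb(params, 'bf16'):.0f} GB of weights in bf16")
```

Under this estimate a 7B-parameter model needs about 14 GB for bf16 weights, while a 70B-parameter model needs about 140 GB, which helps explain why smaller models can be practical on general-purpose CPUs and larger ones are often moved to accelerators.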

Through many of these examples with various partners, we can see how AI is revolutionizing the industry with enhanced customer experiences, new use cases being invented, and productivity being improved. Compute ubiquity and flexibility that increase performance, openness that increases ease of deployment, and responsibility through development, data security, and deployment are the three foundational principles for AI development at Intel and our commitment to bringing AI everywhere.

We encourage you to check out Intel’s AI Tools and Framework optimizations and learn about the unified, open, standards-based oneAPI programming model that forms the foundation of Intel’s AI Software Portfolio. Also, check out the Intel Developer Cloud to try out the latest AI hardware and optimized software to help develop and deploy your next innovative AI projects!