AI: While Model Performance Is Crucial, Speed and Efficiency Are Equally Essential
– Review of Cloud Research Team Leader’s Presentation at the Intel AI Summit Seoul 2025

Cloud Research Team Leader Min-sung Jang of Samsung SDS Research gave a presentation at the Intel AI Summit Seoul 2025. His talk focused on the team’s research priorities and featured a detailed case study examining the inference performance of Large Language Models (LLMs) using Intel’s AI accelerator, Gaudi 3. The case study offers quantitative evidence supporting the potential of AI accelerators to serve as viable alternatives to traditional GPUs in LLM inference tasks, emphasizing their practical applications.
👉 Click here for more information on Intel AI Summit Seoul 2025
Interest in the Latest Trends in AI Technology and Industry-Specific Use Cases
There was some concern about rain, but only a light drizzle fell, and the venue at COEX in Samseong-dong welcomed nearly 1,000 participants, far exceeding the expected 700. The excitement was palpable even before the event officially began.


Under the theme "AX: Innovation through AI, Designing the Future," the event served as a platform to share the latest trends in AI technology and industry-specific use cases, as well as to discuss the future. Even before the official opening at 10 a.m., the booths were packed with visitors, reflecting the industry's thirst for AI. Every corner of the venue buzzed with energy and enthusiasm.
The Atmosphere of the Event Reaches New Heights
Alongside Samsung SDS, experts from major IT companies such as Lenovo, Naver Cloud, SK Hynix, Dell, Microsoft, Supermicro, Cisco, HPE, LG Innotek, and LG Electronics participated, joined by academic and public sector representatives from KAIST, the Ministry of SMEs and Startups, and the Korea Institute for Startup and Entrepreneurship. Together they engaged in in-depth discussions on AI technology trends and collaborative strategies across industries.


The event officially kicked off with a welcoming speech by Tae-won Bae, CEO of Intel Korea, followed by greetings from Hans Chuang, General Manager of Intel's Sales, Marketing and Communications Group (SMG) for Asia Pacific and Japan, and a keynote speech by Lynn Comp, head of Global Sales & GTM for Intel.
Samsung SDS’s Research Direction for AI Acceleration and Efficiency
The Cloud Research Team at Samsung SDS Research conducts research aimed at enhancing system performance, improving scalability to handle growing workloads, and increasing efficiency to reduce energy consumption. During the presentation, Research Team Leader Min-sung Jang highlighted two key areas of this work.
He began by introducing the Cloud Research Team's major research initiatives: APEX, an AI platform designed to optimize the management and orchestration of AI workloads and GPU resources, and FireQ, a quantization technology that enables large language models (LLMs) to run faster and more efficiently with fewer resources. FireQ drew particular attention from attendees, as its paper and code were recently open-sourced.
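For readers unfamiliar with quantization: the general idea is to store model weights at lower numeric precision so they occupy less memory and move through the hardware faster. The Python sketch below illustrates plain symmetric per-channel INT8 weight quantization; it is a minimal, generic example of the concept, not FireQ's actual algorithm, and the 4096x4096 layer shape is just an assumed, typical LLM dimension.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-output-channel INT8 quantization.

    A generic illustration of weight quantization -- not FireQ's scheme.
    """
    # One scale per output row, mapping the row's max |w| onto 127.
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    scale = np.where(scale == 0.0, 1.0, scale)        # guard against all-zero rows
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    # Recover an approximate FP32 weight for computation.
    return q.astype(np.float32) * scale

w = np.random.randn(4096, 4096).astype(np.float32)   # assumed, typical LLM layer shape
q, scale = quantize_int8(w)
err = np.abs(w - dequantize(q, scale)).mean()
print(f"INT8: {q.nbytes / 1e6:.1f} MB vs FP32: {w.nbytes / 1e6:.1f} MB, mean abs error: {err:.5f}")
```

The roughly 4x memory reduction is what lets a quantized LLM serve more load on the same hardware; production schemes such as FireQ go well beyond this toy example.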
He then presented an analysis of the inference performance of Intel's new AI accelerator, Gaudi 3, breaking it down into GEMM operations* and communication stages. By validating GEMM performance across operations of various shapes, the team identified the characteristics and limitations of the accelerator's computation unit (MME**), enabling a detailed analysis of inference performance. The case provided a quantitative understanding of the potential for dedicated AI accelerators to replace general-purpose GPUs in LLM inference tasks; a simple shape-sweep sketch follows the definitions below.
* GEMM (General Matrix-Matrix Multiplication): A standard operation designed to perform the large-scale matrix multiplications that form the core computations in AI models efficiently and quickly.
** MME (Matrix Multiplication Engine): A hardware computation unit in AI accelerators dedicated to performing matrix multiplication operations.
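To make the shape-dependence concrete: an LLM's inference time is dominated by GEMMs whose dimensions (M, K, N) change with batch size, sequence length, and layer width, and a matrix engine such as the MME can reach very different utilization depending on those dimensions. The sketch below uses NumPy as a stand-in for any accelerator backend and sweeps a few illustrative shapes; these shapes are assumptions for demonstration, not the ones measured in the talk.

```python
import time
import numpy as np

def bench_gemm(m: int, k: int, n: int, iters: int = 10) -> float:
    """Time C = A @ B for an (m, k) x (k, n) GEMM and return achieved GFLOP/s."""
    a = np.random.randn(m, k).astype(np.float32)
    b = np.random.randn(k, n).astype(np.float32)
    a @ b                                        # warm-up run
    start = time.perf_counter()
    for _ in range(iters):
        a @ b
    elapsed = (time.perf_counter() - start) / iters
    return (2.0 * m * k * n) / elapsed / 1e9     # a GEMM costs ~2*M*K*N FLOPs

# Illustrative shapes (assumed): decode-phase GEMMs are "skinny" (tiny M, one
# token per request), while prefill-phase GEMMs are "fat" (large M), and the
# two stress a matrix engine very differently.
for m, k, n in [(1, 4096, 4096), (32, 4096, 4096), (2048, 4096, 4096)]:
    print(f"M={m:5d} K={k} N={n}: {bench_gemm(m, k, n):8.1f} GFLOP/s")
```

Running a sweep like this on actual accelerator hardware, rather than NumPy, is what reveals which operation shapes a unit like the MME handles well and where its limitations lie.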


Reflections on the Event
The event offered more than just information; it was a valuable opportunity to gain new insights from innovative AI cases presented by both domestic and international companies. The biggest takeaway was the chance to network with industry-leading AI experts and to discuss potential future collaborations.
The Cloud Research Team at Samsung SDS Research took this opportunity to share its research direction and practical analysis cases with a broader audience. Committed to advancing the AI community, the team aims to continue contributing in the years to come.