{"id":3837,"date":"2026-05-11T20:00:00","date_gmt":"2026-05-11T18:00:00","guid":{"rendered":"http:\/\/stocks-future.com\/?guid=b147b0df1c9de2c81c233d8170071f15"},"modified":"2026-05-11T20:00:00","modified_gmt":"2026-05-11T18:00:00","slug":"friendliai-expands-to-san-francisco-to-scale-frontier-ai-inference-for-open-weight-and-custom-models","status":"publish","type":"post","link":"https:\/\/stocks-future.com\/?p=3837","title":{"rendered":"FriendliAI Expands to San Francisco to Scale Frontier AI Inference for Open-Weight and Custom Models"},"content":{"rendered":"<p>\n<i>New 7,000-square-foot SoMa office anchors FriendliAI\u2019s global push as AI agents drive a generational surge in token consumption<\/i><\/p><br\/><a href=\"https:\/\/mms.businesswire.com\/media\/20260511481484\/en\/2800153\/5\/friendliai-logo-primary-rgb.jpg\"><img src=\"https:\/\/mms.businesswire.com\/media\/20260511481484\/en\/2800153\/22\/friendliai-logo-primary-rgb.jpg\" \/><\/a><br\/><a href=\"https:\/\/mms.businesswire.com\/media\/20260511481484\/en\/2800153\/5\/friendliai-logo-primary-rgb.jpg\"><img src=\"https:\/\/mms.businesswire.com\/media\/20260511481484\/en\/2800153\/21\/friendliai-logo-primary-rgb.jpg\" \/><\/a><p>SAN FRANCISCO--(BUSINESS WIRE)--<a  href=\"https:\/\/cts.businesswire.com\/ct\/CT?id=smartlink&amp;url=http%3A%2F%2Fwww.friendli.ai&amp;esheet=54532463&amp;newsitemid=20260511481484&amp;lan=en-US&amp;anchor=FriendliAI&amp;index=1&amp;md5=d510f8b59d2921caf8e6a8eb25cfb90b\" rel=\"nofollow\" shape=\"rect\">FriendliAI<\/a>, The Frontier AI Inference Cloud, today announced the opening of its new San Francisco office at 20 Hawthorne Street, occupying 7,000 square feet in the historic Crown Point Press building, around the corner from the San Francisco Museum of Modern Art. 
The expansion places FriendliAI at the heart of the Bay Area AI ecosystem and closer to the customers, partners, and developers building the next generation of AI applications.<\/p><p>\nThe expansion lands at an inflection point for AI inference. Two forces are driving the shift: AI agents \u2014 which plan, reason across many steps, and call tools on every turn \u2014 require five to thirty times more tokens per task than chatbots, and that consumption compounds as agents move from pilots into always-on production workflows. Meanwhile, the latest open-weight models including Z.ai\u2019s <a  href=\"https:\/\/cts.businesswire.com\/ct\/CT?id=smartlink&amp;url=https%3A%2F%2Ffriendli.ai%2Fmodels%2Fzai-org%2FGLM-5.1&amp;esheet=54532463&amp;newsitemid=20260511481484&amp;lan=en-US&amp;anchor=GLM-5.1&amp;index=2&amp;md5=69467f5520c7c81bf45872790c34d704\" rel=\"nofollow\" shape=\"rect\">GLM-5.1<\/a>, Moonshot AI\u2019s <a  href=\"https:\/\/cts.businesswire.com\/ct\/CT?id=smartlink&amp;url=https%3A%2F%2Ffriendli.ai%2Fmodels%2Fmoonshotai%2FKimi-K2.6&amp;esheet=54532463&amp;newsitemid=20260511481484&amp;lan=en-US&amp;anchor=Kimi+K2.6&amp;index=3&amp;md5=b1b6c931b818bb305de4359c31bdd356\" rel=\"nofollow\" shape=\"rect\">Kimi K2.6<\/a>, <a  href=\"https:\/\/cts.businesswire.com\/ct\/CT?id=smartlink&amp;url=https%3A%2F%2Ffriendli.ai%2Fmodels%2Fdeepseek-ai%2FDeepSeek-V4-Flash&amp;esheet=54532463&amp;newsitemid=20260511481484&amp;lan=en-US&amp;anchor=DeepSeek+V4&amp;index=4&amp;md5=04b3f6795c57942423592f99a076ac67\" rel=\"nofollow\" shape=\"rect\">DeepSeek V4<\/a>, and <a  href=\"https:\/\/cts.businesswire.com\/ct\/CT?id=smartlink&amp;url=https%3A%2F%2Ffriendli.ai%2Fmodels%2Fnvidia%2FNVIDIA-Nemotron-3-Super-120B-A12B-NVFP4&amp;esheet=54532463&amp;newsitemid=20260511481484&amp;lan=en-US&amp;anchor=NVIDIA+Nemotron+3&amp;index=5&amp;md5=1b62b6ad49f85bef6e7a3f485754bb5c\" rel=\"nofollow\" shape=\"rect\">NVIDIA Nemotron 3<\/a> now match or exceed leading closed models like 
Anthropic\u2019s Claude Opus at a fraction of the cost, and custom fine-tunes align even more tightly with enterprise use cases. Production-grade inference infrastructure has become the bottleneck \u2014 and the prize.<\/p><p>\n\u201cSan Francisco is the epicenter of AI innovation, and a deeper presence here lets us partner with the customers and developers shaping what comes next,\u201d said Byung-Gon Chun, CEO of FriendliAI. \u201cThe industry is no longer asking whether to build with AI \u2014 it\u2019s asking how to run AI in production, profitably, at scale. FriendliAI, The Frontier AI Inference Cloud, was built for exactly that.\u201d<\/p><p>\n\u201cInference is where AI economics are won or lost,\u201d said Brian Yoo, Chief Business Officer at FriendliAI. \u201cEvery percentage point of GPU efficiency translates directly to margin, and every millisecond of latency translates to user experience. Putting senior commercial and engineering leadership on the ground in San Francisco lets us move at the speed our customers need as they scale.\u201d<\/p><p>\nFriendliAI was founded by Professor Byung-Gon Chun and members of his research team at Seoul National University, where they pioneered continuous batching \u2014 the inference optimization technique that is now an industry standard. Today FriendliAI runs state-of-the-art open-weight and custom models at production scale with industry-leading throughput, latency, and reliability. Independent benchmarks from Artificial Analysis and OpenRouter rank FriendliAI as the top inference provider for models such as GLM-5.1 and Gemma 4 across output speed, latency, tool calling, and structured outputs. 
The company partners with model creators on launch \u2014 most recently as a Day 0 partner for NVIDIA Nemotron 3 and Z.ai\u2019s GLM-5.1 \u2014 and with cloud providers including AWS, OCI, and Samsung Cloud Platform on infrastructure to scale globally.<\/p><p>\nCustomers including Twelve Labs and LG are already scaling with FriendliAI in production, and that momentum is translating into rapid business growth. FriendliAI is on a trajectory to grow revenue tenfold this year, with a goal of growing another tenfold the year after, as AI-native and AI-augmented SaaS companies migrate production workloads to its platform. The San Francisco expansion is built to support the trajectory: FriendliAI plans to significantly grow its U.S. team across go-to-market, partnerships, and engineering functions over the coming year.<\/p><p>\nThe bright, loft-style space is also purpose-built as a hub for the AI builder community, hosting developer meetups, hackathons, and executive briefings on the practical realities of deploying inference at scale \u2014 from open-weight model deployments and GPU efficiency to multimodal and agentic workloads.<\/p><p>\nLearn more at <a  href=\"https:\/\/cts.businesswire.com\/ct\/CT?id=smartlink&amp;url=http%3A%2F%2Fwww.friendli.ai&amp;esheet=54532463&amp;newsitemid=20260511481484&amp;lan=en-US&amp;anchor=www.friendli.ai&amp;index=6&amp;md5=74696d2d10118d04981d82f4e44f480f\" rel=\"nofollow\" shape=\"rect\">www.friendli.ai<\/a>, or explore open roles in San Francisco at <a  href=\"https:\/\/cts.businesswire.com\/ct\/CT?id=smartlink&amp;url=https%3A%2F%2Ffriendli.ai%2Fcareers&amp;esheet=54532463&amp;newsitemid=20260511481484&amp;lan=en-US&amp;anchor=friendli.ai%2Fcareers&amp;index=7&amp;md5=3f166f1f7929bcd02931c96a57044e0e\" rel=\"nofollow\" shape=\"rect\">friendli.ai\/careers<\/a>.<\/p><p>\n<b>About FriendliAI<\/b><\/p><p>\nFriendliAI is The Frontier AI Inference Cloud. 
Built by the researchers who invented continuous batching, an inference optimization technique that is now an industry standard, FriendliAI efficiently runs state-of-the-art open-weight and custom models, enabling model ownership and advanced performance tuning. By optimizing every layer of the inference stack \u2014 from GPU kernels to serving infrastructure \u2014 FriendliAI delivers industry-leading throughput, latency, and reliability for engineers deploying frontier AI in production.<\/p><br\/> <b>Contacts<\/b> <br\/><p>\n<a  href=\"mailto:press@friendli.ai\" rel=\"nofollow\" shape=\"rect\">press@friendli.ai<\/a><\/p>","protected":false},"excerpt":{"rendered":"<p>New 7,000-square-foot SoMa office anchors FriendliAI\u2019s global push as AI agents drive a generational surge in token consumption SAN FRANCISCO&#8211;(BUSINESS WIRE)&#8211;FriendliAI, The Frontier AI Inference Cloud, today announced the opening of its new San Fran&#8230;<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-3837","post","type-post","status-publish","format-standard","hentry","category-infos-businesswire"],"_links":{"self":[{"href":"https:\/\/stocks-future.com\/index.php?rest_route=\/wp\/v2\/posts\/3837","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/stocks-future.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/stocks-future.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/stocks-future.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/stocks-future.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3837"}],"version-history":[{"count":1,"href":"https:\/\/stocks-future.com\/index.php?rest_route=\/wp\/v2\/posts\/3837\/revisions"}],"predecessor-version":[{"id":3838,"href":"https:
\/\/stocks-future.com\/index.php?rest_route=\/wp\/v2\/posts\/3837\/revisions\/3838"}],"wp:attachment":[{"href":"https:\/\/stocks-future.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3837"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/stocks-future.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=3837"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/stocks-future.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3837"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}