We’re excited to announce today that Aembit can now provide policy-based, identity-driven access from your workloads (applications and scripts) to the most commonly used large language models (LLMs).
While we use OpenAI, Claude, and Gemini as examples below, Aembit’s approach to workload IAM applies broadly to other LLMs offered as a service or self-hosted on your cloud instances.
Business Context for Greater LLM Identity and Access Security
In November 2022, OpenAI introduced ChatGPT, and the digital world hasn’t been the same since. Companies like Google, Microsoft, and Anthropic have invested billions of dollars, and AI has made huge strides forward. Most people have found ways to leverage generative AI. Students are using it to help write essays. Developers are using it to write code. Marketers are using it to do research.
Organizations have taken various approaches to securing access to LLMs. In many cases, user access to LLMs have been completely blocked in fear that proprietary code and/or credentials and API keys will be manually uploaded. Some organizations are leaving it ungoverned and trusting their employees will safely manage AI on their own.
At the same time, governing use of AI is moving well beyond a user simply accessing a chatbot. As LLMs are advancing, enterprises are building applications and services that LLMs and SaaS-based generative AI services leverage to bring real value to both their own organization and their customers.
Ungoverned Access: LLMjacking and Other Risks
Securing access to LLMs isn’t just necessary based on theoretical threats. On May 6, container security company Sysdig posted a blog outlining a new attack, known as LLMjacking, that leveraged stolen cloud credentials to target 10 cloud-hosted large language model (LLM) services. Attackers acquired credentials from a widely targeted system exploiting a vulnerability in Laravel (CVE-2021-3129), aiming to resell access to the LLMs with the account owners paying for that access.
Beyond these brazen attacks, however, there are even more subtle risks at play as your organization adopts AI:
Where are my developers adopting or testing the use of AI?
Are they appropriately managing credentials, which in turn provide access to sensitive data?
How many people are manually administering, storing, and rotating those credentials?
These are, in fact, risks that your organization deals with to secure almost every piece of sensitive data in your company, but with emerging AI services it has come front-and-center.
Workload Identity and Access Management for AI
In light of this evolution – and with exploited vulnerabilities already a reality – enterprises require a robust, policy-based way to help secure access for non-human workloads that access these services
The Aembit Workload Identity and Access Management Platform now offers support available for OpenAI, Gemini (in beta), and Claude, and we have built an extensible platform that can easily cover more services based on authentication methods used.
Aembit brings the concepts of identity and access management to your development of AI. While other approaches are thinking about how users access AI tools and the underlying data they can see, Workload IAM is solving for how applications, scripts, and services can access LLMs.
With Aembit:
- Organizations get a policy-based way of securing access to LLMs.
- They eliminate the need to scatter credentials throughout workloads, reducing risk of breaches and leaks.
- Developers no longer have to code authentication into their applications.
- Organizations gain better audit and visibility into ongoing LLM workload access.
Workload Discovery of LLMs
While developers and DevOps organizations have a sense of where LLMs are being used, distributed application teams within lines of business may be spinning up servers and applications. In these cases, the discovery of services and applications accessing LLMs may be done in parallel to ensure coverage regardless of where the application lives. Getting an expansive view of LLM usage is the first step to enabling secure access.
To learn more about Workload Discovery, see https://aembit.io/blog/introducing-aembit-preview-for-workload-discovery/
How to Enable Secure Access to LLMs with Aembit
Enabling secure access to LLMs is quick and easy. To configure access to the many models from OpenAI, including ChatGPT, follow the steps outlined below:
1) Generate a project API key in the OpenAI account portal. User API keys are also available but will be deprecated soon.
2) Create a ‘credential provider’ using that project API key.
3) Create a ‘server workload’ using the host api.openai.com.
4) Create a ‘policy’ that ties these all together.
5) You may optionally configure conditional access to further protect the access between client and server workloads.
Your developer can now make calls to OpenAI without worrying about authentication. Here is an example using the GPT-3.5 model:
To learn more about OpenAI policies, visit https://docs.aembit.io/server-workloads/openai
Aembit has also qualified support for Google Gemini (formerly known as Bard) and Anthropic Claude (see https://docs.aembit.io/server-workloads/claude), along with many other API-enabled AI platforms.
No Code Auth to LLMs
Aembit enables authentication to APIs without code changes, using what we call no-code auth. Developers not needing credentials enhances security, enables flexibility in code, and highlights what is possible.
For example, our solution allows developers to further enhance how they are leveraging LLMs. Organizations may write an abstraction layer with normalized calls like ‘fetch_list_llm’ that can request messages from one or many LLMs at the same time.
Aembit takes this high-level call and injects the authentication for each of the LLMs without requiring developers to make changes to the code. A side benefit is this method also allows operations to better track who is making calls to what, allowing LLM billing to be better directed to the appropriate internal teams.
Ready to Learn More?
We invite DevOps teams, security professionals, and software developers to give Aembit a try. We provide production-grade service for up to 10 workloads for free, and we’re happy to help you get set up and running.