The Fraunhofer-Gesellschaft, one of Germany’s largest and most renowned research organizations, has published the paper “FhGenie: A Custom, Confidentiality-preserving Chat AI for Corporate and Scientific Use”, in which the authors describe a customized chat AI named FhGenie.
The paper describes the motivation and requirements leading to the design, as well as the solution’s architecture.
1. Goals and Requirements
The goal of leveraging generative AI technology whilst guaranteeing confidentiality was reflected in the requirements list compiled from interviews with current and prospective users of the technology. The list includes the use of state-of-the-art AI models, confidentiality, compliance with European regulation, data confinement to Europe, ease of use, focus on natural language (over e.g. coding), responsibility, cost-effectiveness, and acceptable latencies.
2. Architecture
FhGenie is hosted primarily on Microsoft Azure, within Fraunhofer’s European subscriptions, to leverage Azure’s security and compliance features. User authentication relies on Azure Active Directory, integrated with Fraunhofer’s identity and access management system. Users interact through a web application that formats prompts for, and handles responses from, the AI models in Azure OpenAI Services. The UI lets users select a language, download and upload the conversation state (for privacy), and adjust the temperature, which trades more deterministic answers against more varied output with a higher risk of hallucination. Load balancing across EU endpoints was added to provide sufficient bandwidth and low latency for GPT-4 while meeting cost targets.
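To make the temperature setting more concrete, here is a minimal sketch of how temperature is commonly applied when a language model samples its next token: the model's raw scores (logits) are divided by the temperature before being turned into probabilities. The function name and the example logits are illustrative assumptions, not taken from the FhGenie paper, and hosted APIs may handle edge cases such as a temperature of 0 differently.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Turn raw scores into probabilities, scaled by a sampling temperature.

    Hypothetical helper for illustration only. This sketch requires a
    strictly positive temperature.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate tokens.
example_logits = [2.0, 1.0, 0.5]
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(example_logits, t)
    print(t, [round(p, 3) for p in probs])
```

A low temperature concentrates almost all probability on the highest-scoring token, so answers become more repeatable. A high temperature flattens the distribution, so less likely tokens are chosen more often.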
An API also allows Fraunhofer software projects, authenticated via security tokens, to interact with the models. The Responsible AI mechanisms in Azure OpenAI filter potentially harmful interactions and record them in an abuse log. These mechanisms can be customized, but doing so requires careful consideration of how it shifts responsibilities between Microsoft and Fraunhofer. After an assessment, the default Azure configuration was deemed adequate for internal use. Non-technical measures such as guidelines were additionally introduced to encourage responsible use, given the limits of technical controls in judging user behavior.
3. Development and Deployment
FhGenie’s codebase uses NodeJS, Python, and the Flask framework. It consists of a frontend and a backend; the backend is instantiated twice, once for the frontend and once for API access. The backend encapsulates the AI models, which are spread across multiple Azure regions in the EU for redundancy. It also handles load balancing and prompt engineering, which enriches user questions with context and instructions before sending them to the AI models. The authors found that overly specific instructions had adverse effects on their general-purpose chatbot. The load balancer detects and mitigates AI model throughput limits. Resources shared with the frontend can affect API users, so the team collaborates with them on solutions.
The frontend implementation initially lacked robustness, but the team has since added features and fixed vulnerabilities. Deployment uses Azure Bicep for infrastructure as code, though some steps are still manual. For runtime management, the team uses Azure mechanisms for budgeting and for monitoring usage, cost, and health metrics.
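The paper does not spell out how the load balancer works, so the following is only a rough sketch of one way to detect and mitigate throughput limits: rotate across regional endpoints and temporarily skip any endpoint that reports throttling. Everything here is an assumption made for illustration, including the names `EndpointBalancer` and `ThrottledError`, the cooldown value, and the injected `send` function that stands in for the actual model call.

```python
import time
from typing import Callable, Dict, List


class ThrottledError(Exception):
    """Raised by the (hypothetical) send function when an endpoint is rate limited."""


class EndpointBalancer:
    """Round-robin over endpoints, skipping throttled ones for a cooldown period."""

    def __init__(self, endpoints: List[str], cooldown_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._endpoints = list(endpoints)
        self._cooldown_s = cooldown_s
        self._clock = clock  # injectable so the behavior can be tested
        self._blocked_until: Dict[str, float] = {}
        self._next = 0

    def _candidates(self) -> List[str]:
        # Rotate the starting point on every call to spread the load.
        n = len(self._endpoints)
        start = self._next
        self._next = (self._next + 1) % n
        rotated = [self._endpoints[(start + i) % n] for i in range(n)]
        now = self._clock()
        return [e for e in rotated if self._blocked_until.get(e, 0.0) <= now]

    def call(self, send: Callable[[str, str], str], prompt: str) -> str:
        # Try each available endpoint once; put throttled ones on cooldown.
        for endpoint in self._candidates():
            try:
                return send(endpoint, prompt)
            except ThrottledError:
                self._blocked_until[endpoint] = self._clock() + self._cooldown_s
        raise RuntimeError("all endpoints are currently throttled")
```

In a real deployment, `send` would wrap the HTTP call to a model endpoint and raise `ThrottledError` when the service signals a rate limit, for example with an HTTP 429 response. Injecting it keeps the balancing logic independent of any particular client library.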
4. Reflections
Fraunhofer’s paper offers a rare and valuable opportunity to learn from a professional organization’s journey toward a proprietary chat AI. The article not only lays out the many good reasons to embark on this journey but also gives a sense of the challenges along the way. Developing and deploying FhGenie is far from straightforward. Significant resources had to be devoted to its development, and more will be needed to maintain, update, and further develop the solution. With over 30,000 employees in 76 institutes and research units and a strong central ICT department, Fraunhofer has the resources to master this task.
If your organization is looking to leverage generative AI as well but lacks the resources or know-how to develop a solution similar to FhGenie, AIdoes.eu can help. It requires no installation effort and no infrastructure of your own, and it spares you the need to monitor a rapidly changing generative AI market. It operates safely and in compliance with European regulations, and it offers access to state-of-the-art models. Contact us if you want to learn more about AIdoes.eu.