Five practical ways to deploy Generative AI

Summary:

The wildly popular GPT-4, multimodal generative AI model succeeds ChatGPT, a proprietary instruction-following model has taken the world by storm. This brand-new computing model has the potential to unleash existing business models by radically transforming the way companies capture, create, and deliver value to customers.

However, building, owning and maintaining is a costly affair. Here’s brief run down on the complexities of this model .

A model with over 170 million parameters trained on millions of words from the web
Intensive carbon footprint just to train the model akin to driving a car to the moon back, and a reported daily CO2 emission of 23.04 kg
Costs over $4 million to train and does not even include AI human and technology resources

Bottomline – the cost to build, maintain and deploy can skyrocket burdening enterprises with owning its carbon footprint.

Background:

Before jumping into the adoption options below, it is critical to keep in mind that the AI solution stack varies according to ‘how it will be used’. This implies one must have a deep understanding of the AI use case being built or deployed.

The dependencies of which include:

Understanding of AI use case including its scope, context, purpose and value
Mapping AI use case to infrastructure, software and services

Let’s dive into five practical options based on the factors above:

Number	option	example	risk characteristics
1	Usability at the individual level: Most commonly, subscription-based plans are available for use on foundation models.	As simple as $20 monthly subscription to ChatGPT or GPT-4	fully open low risk control high auditability
2	For light-weight productivity tools and similar applications: Through the use of open-source APIs available for businesses that want to develop applications on top of its foundational models.	OpenAI APIs are leveraged to develop applications typically for marketing, productivity gains, and for creating assistants aligned with specific business needs.	gated-to-public low risk control high auditability
3	Specialized vertically aligned AI engines: Through the use of open-source API on top of its foundation models. This option typically does not differentiate or support personalization. If that is not a concern, foundation model like Open AI or Bard AI should support business needs.	Answering complex questions, pulling from vast amounts of legal or financial documentation, and drafting and reviewing annual reports, earning reports, etc.	gated-to-public low risk control high auditability
4	Use of off-the-shelf open source large language model (LLM): A cheap-to-build LLM model that supports AI democratization for enterprises wanting to build their own generative AI language models.	Databricks introduced Dolly, affordable model, that exhibits an instruction-following capabilities similar to ChatGPT. Trained on 6 billion parameters as opposed to 175 billion parameters in GPT-4. The LLM is cloned from the Alpaca open model built by Stanford.	open-model high risk control high auditability
5	As a service Foundation AI model: Provides ability differentiate on specialized use cases or to gain competitive advantage. Using services, companies have an option to build (personalized database) on high-quality pre-trained models specifically built and trained on specialized industry use cases, where developers may further customize or fine-tune.	NVDIA has over 50 specific use cases to explore across NLP, Speech AI, Computer Vision, Healthcare, Cybersecurity, Art and Creative workflow, etc. Scalable pretrained AI models can be applied across industries. Further customization and fine-tuning can lead to infinite possibilities for use cases.	closed-model medium-high risk control mid-high auditability

Additional Notes on Option 5:

NVIDIA’s as a service solutions foundation model is a family of cloud services with which enterprises will be able to build their own LLMs and run them at scale, calling them from enterprise applications.

In NVIDIA’s recent developer’s conference, NVIDIA outlined 3 services launched on limited access. Each include a pretrained model, data processing frameworks (providing raw data for differentiations), ability to couple personalization databases, inference engines and APIs to access the service.

Examples of these include:

NeMo for generating text
Picasso for visuals – So to elaborate, use of in the recent Adobe Firefly (Adobe debuting its beta-version to generate images & videos from text) and soon to be integrated into Abode Photoshop, Illustrator and Express, and NVIDIA’s own Picasso/Imagery systems for generating images, videos and 3D apps
BioNeMo for molecular structures

Closing Remarks

In order to differentiate on specialized use cases or to gain competitive advantage, organizations have an option to:

build on open model like Dolly from affordability standpoint (option 4)
secure high-quality pre-trained models specifically built and trained on specialized industry use cases, where developers may further customize or fine-tune (option 5)

Every enterprise building open-source foundations or specialized pre-trained models, should prioritize the importance of using ethical data that is transparent, explainable, and obtained with consent. The model should be evaluated for risks associated with bias, privacy, security and reliability of its outcomes.