
Large Language Models (LLMs)

In Context

∙ The ability of Generative AI models to “converse” with humans and predict the next word or sentence is due to something known as the Large Language Model, or LLM.


∙ It is to be noted that while not all generative AI tools are built on LLMs, all LLMs are a form of Generative AI, which is itself a broad and ever-expanding category of AI.

∙ LLMs are large general-purpose language models that can be pre-trained and then fine-tuned for specific purposes.
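The "pre-train, then fine-tune" idea above can be illustrated with a toy model. The sketch below fits a one-parameter linear model on general data, then continues training from that learned weight on task-specific data; all datasets and numbers are invented for illustration, and real LLMs have billions of parameters rather than one.

```python
import random

# Minimal sketch of "pre-train then fine-tune" on a one-parameter
# linear model y = w * x. All data here is invented for illustration.
random.seed(0)

def train(w, data, steps, lr=0.1):
    for _ in range(steps):
        # Gradient of the mean squared error (w*x - y)^2 with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

# "Pre-training": general-purpose data roughly follows y = 2x.
general = [(x, 2 * x) for x in (random.uniform(-1, 1) for _ in range(100))]
w = train(0.0, general, steps=200)

# "Fine-tuning": task-specific data follows y = 3x; training resumes
# from the pre-trained weight instead of starting from scratch.
task = [(x, 3 * x) for x in (random.uniform(-1, 1) for _ in range(20))]
w = train(w, task, steps=100)
print(round(w, 2))  # close to 3.0 after fine-tuning
```

Starting from the pre-trained weight rather than zero is the whole point: the fine-tuning stage only needs a small, specific dataset to adapt the model.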

∙ An LLM is like a super smart computer program that can comprehend and create human-like text.

Meaning of LLMs

∙ Firstly, the ‘Large’ refers to two things: the enormous size of the training data, and the parameter count.

∙ In Machine Learning, parameters are the values a model learns during training — essentially the memories and knowledge the machine acquired. They are distinct from hyperparameters, which are settings (such as the learning rate) chosen before training begins.
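To make "parameter count" concrete, the sketch below counts the learnable parameters (weights plus biases) of a tiny feed-forward network; the layer sizes are made up for illustration, and real LLMs apply the same kind of counting across billions of parameters.

```python
# Rough sketch: counting learnable parameters of a tiny feed-forward
# network. The layer sizes are invented purely for illustration.
layer_sizes = [512, 2048, 512]  # input -> hidden -> output

total_params = 0
for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
    weights = fan_in * fan_out   # one weight per input/output connection
    biases = fan_out             # one bias per output unit
    total_params += weights + biases

print(total_params)  # 2,099,712 learnable parameters for this toy network
```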

How many types of LLMs are there?

∙ There are various types; the type depends on the specific tasks a model is meant to perform.

∙ On the basis of architecture, three types are commonly described: autoregressive, transformer-based, and encoder-decoder. (These categories overlap; most modern LLMs, including autoregressive ones, are built on the transformer architecture.)

∙ Autoregressive: models that predict the next word in a sequence based on the previous words. GPT-3 is an example of an autoregressive model.

∙ Transformer-based: LaMDA and Gemini (formerly Bard) are transformer-based, meaning they use a specific type of neural network architecture for language processing.

∙ Encoder-decoder: Models that encode input text into a representation and then decode it into another language or format.

∙ Open-source: LLaMA 2, BLOOM, Google BERT, Falcon 180B, and OPT-175B are some open-source LLMs.

∙ Closed-source: Claude 2, Bard, and GPT-4 are some closed-source LLMs.

How do LLMs work?

∙ LLMs work on the principle of “deep learning”, which involves training artificial neural networks: mathematical models loosely inspired by the structure and function of the human brain.

∙ For LLMs, this neural network learns to predict the probability of a word or sequence of words given the previous words in a sentence.

∙ Once trained, an LLM can predict the most likely next word or sequence of words based on the input it receives, known as a prompt.
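The predict-the-next-word idea described above can be sketched with a toy bigram model: it simply counts which word follows which in a tiny corpus, then predicts the most frequent continuation. Real LLMs learn these probabilities with neural networks over vast corpora, but the core task is the same; the corpus here is invented for illustration.

```python
from collections import Counter, defaultdict

# Toy sketch of next-word prediction: a bigram model that counts which
# word follows which in a tiny, made-up corpus.
corpus = "the cat sat on the mat the cat ate the fish".split()

following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    # Return the continuation seen most often after `word` in training.
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" most often in this corpus
```

Given the prompt "the", the model answers "cat" because that pairing occurred most often in its training data — a miniature version of how an LLM responds to a prompt with its most probable continuation.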

Applications & Advantages of LLMs

∙ These models are trained to solve common language tasks such as text classification, question answering, text generation, document summarisation, and aiding in marketing strategies.

∙ They can continuously improve their performance as they are provided with more data.
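Text classification, the first task listed above, can be illustrated with a deliberately simple sketch: scoring a sentence against small word lists. The word lists and labels here are invented for illustration; an LLM learns such associations from data rather than from hand-written lists.

```python
# Toy sketch of text classification (sentiment). The word lists are
# invented purely for illustration; real models learn these from data.
POSITIVE = {"great", "love", "excellent"}
NEGATIVE = {"bad", "terrible", "hate"}

def classify(text):
    words = set(text.lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    return "positive" if score >= 0 else "negative"

print(classify("I love this, it is excellent"))  # positive
```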
