How does a typical chatbot work?
Ever wondered how these chatbots work? If you are a programmer, it's essential to understand how they work, and if you aren't, that's still fine; you can learn something new. Let me walk you through it.
Nowadays chatbots like ChatGPT, Perplexity, Google's Gemini and Blackbox AI are very popular. These chatbots are making everything easier, and as AI rapidly evolves it is also changing the way we work, so it's essential to understand how this technology works. I will explain it from a programmer's point of view.
Right now these chatbots are leading the market: in the number 1 spot is ChatGPT from OpenAI, which covers roughly 80% of the market, followed by Google's Gemini, then Perplexity and Blackbox AI. I was reading an article and learned that Meta AI is also gaining popularity, which is good for the future as long as these tools are used for good things.
So basically these chatbots are based on Large Language Models (LLMs), and LLMs are built using transformer neural networks (a type of neural network designed to process sequential data).
1- Architecture:
As I mentioned above, these chatbots are based on LLMs, and LLMs are built using transformer neural networks (TNNs), so the core components are:
— TOKENIZER → Converts text into numerical tokens (IDs) that the model can understand.
— TRANSFORMER ENCODER → Reads the input tokens and converts them into contextual numerical representations.
— TRANSFORMER DECODER → Generates output tokens auto-regressively based on the input representation.
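To make the tokenizer's job concrete, here is a minimal sketch of a word-level tokenizer. Real chatbots use subword tokenizers (such as byte-pair encoding) with vocabularies of tens of thousands of tokens; the `ToyTokenizer` class and its tiny corpus below are purely illustrative.

```python
# Toy word-level tokenizer: maps each word to a unique integer ID.
class ToyTokenizer:
    def __init__(self, corpus):
        # Build the vocabulary: one ID per unique word in the corpus.
        words = sorted(set(corpus.split()))
        self.vocab = {w: i for i, w in enumerate(words)}
        self.inverse = {i: w for w, i in self.vocab.items()}

    def encode(self, text):
        # Text in, numerical IDs out — what the model actually consumes.
        return [self.vocab[w] for w in text.split()]

    def decode(self, ids):
        # IDs in, text out — used on the model's generated output.
        return " ".join(self.inverse[i] for i in ids)

tok = ToyTokenizer("how does a chatbot work")
ids = tok.encode("how does a chatbot work")
```

The key property is that `decode(encode(text))` round-trips back to the original text, so the model can work entirely in numbers.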
2- Training Process:
— DATA COLLECTION → Gather large amounts of text from the internet: books, articles and websites.
— PREPROCESSING → Clean the data, convert it into plain text and split it into tokens.
— TOKENIZATION → Assign a unique ID to each token.
— MASKING → Mask some tokens and train the model to predict them.
— TRAINING → Feed the masked sequences to the model and backpropagate the loss to update its weights.
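The masking step above can be sketched in a few lines. This is a simplified illustration of masked-language-model training data: the `MASK_ID` sentinel and the 15% masking rate are assumptions for the example, not fixed values from the article.

```python
import random

MASK_ID = -1  # illustrative sentinel; real models reserve a [MASK] token ID

def mask_tokens(token_ids, rate=0.15, seed=0):
    """Randomly hide some token IDs and record what the model must predict."""
    rng = random.Random(seed)
    masked, targets = [], []
    for tid in token_ids:
        if rng.random() < rate:
            masked.append(MASK_ID)  # hidden: the model must reconstruct this
            targets.append(tid)     # ground truth used to compute the loss
        else:
            masked.append(tid)
            targets.append(None)    # no loss at unmasked positions
    return masked, targets

masked, targets = mask_tokens(list(range(20)))
```

During training, the model's predictions at the masked positions are compared against `targets`, and the resulting loss is backpropagated to update the weights.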
Up to this point, the process teaches the model general language representations.
3- Fine Tuning:
— COLLECT TASK-SPECIFIC DATA → This includes dialogue datasets, Q&A pairs, etc.
— FINE-TUNE ON TASK DATA → Train the model further so it specializes on the task data.
— REWARD MODELING → Human feedback is used to train a reward model that scores the quality of responses.
— ITERATIVE REFINEMENT → The reward model is used to improve the quality of the model's outputs.
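One simple way a reward model can steer outputs is "best-of-n" sampling: generate several candidate responses, score each with the reward model, and keep the highest-scoring one. The sketch below is purely illustrative; `generate` and `reward` are placeholder stand-ins for a real LLM and a real trained reward model, and the length-based scoring is a toy heuristic.

```python
def generate(prompt, n):
    # Placeholder: a real LLM would sample n different responses.
    return [f"{prompt} (candidate {i})" for i in range(n)]

def reward(response):
    # Placeholder: a real reward model scores human-preferred qualities
    # (helpfulness, accuracy, tone). Here: a toy length heuristic.
    return len(response)

def best_of_n(prompt, n=4):
    # Generate candidates, score each, return the highest-scoring one.
    candidates = generate(prompt, n)
    return max(candidates, key=reward)
```

Full RLHF goes further and uses the reward scores to update the model's weights directly, but the pick-the-best-response idea is the same.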
THIS ALLOWS THE CHATBOT TO ENGAGE IN CONTEXTUAL CONVERSATIONS!
4- Inference:
— TOKENIZE INPUT → The user's input is converted into tokens.
— PASS THROUGH ENCODER → The encoder reads the input and generates a representation.
— AUTOREGRESSIVE DECODING→ Decoder generates output tokens one-by-one, feeding each token back in to predict the next.
— DETOKENIZE OUTPUT → Convert the generated tokens back into readable text.
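The autoregressive decoding step above boils down to a simple loop: predict one token, append it to the sequence, and feed the longer sequence back in to predict the next. Here is a minimal sketch; `next_token` is a stand-in for the real trained model, and the "+1 mod 10" rule and `eos_id` value are toy assumptions for illustration.

```python
def next_token(token_ids):
    # Placeholder: a real model returns the most likely next token ID
    # given everything generated so far. Toy rule: previous ID plus one.
    return (token_ids[-1] + 1) % 10

def decode(prompt_ids, max_new_tokens=5, eos_id=9):
    """Generate tokens one at a time, feeding each back in for the next."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        nxt = next_token(ids)
        ids.append(nxt)
        if nxt == eos_id:  # stop when the end-of-sequence token appears
            break
    return ids
```

For example, `decode([3])` grows the sequence one token per step, stopping after `max_new_tokens` new tokens or at the end-of-sequence token, whichever comes first. The output IDs would then go through detokenization to become readable text.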